Tag Archives: lincRNA

It ain’t the bricks it’s the plan — take II

A recent review in Neuron (vol. 88 pp. 861 – 877 ’15) gives a possible new explanation of how our brains came to be so different from those of apes (if not our behavior of late).

You’ve all heard that our proteins are only 2% different from those of the chimp, so we are 98% chimpanzee. The facts are correct, the interpretation wrong. We are far more than the protein ‘bricks’ that make us up, and two current papers in Cell [ vol. 163 pp. 24 – 26, 66 – 83 ’15 ] essentially prove this.

This is like saying Monticello and Independence Hall are just the same because they’re both made out of bricks. One could chemically identify Monticello bricks as coming from the Virginia piedmont, and Independence Hall bricks coming from the red clay of New Jersey, but the real difference between the buildings is the plan.

It’s not the proteins, but where, when and how much of each is made. The control for this (the plan, if you will) lies outside the genes for the proteins themselves, in the rest of the genome (remember that only 2% of the genome codes for the amino acids making up our 20,000 or so proteins). The control elements have as much right to be called genes as the parts of the genome coding for amino acids. Granted, it’s easier to study genes coding for proteins, because we’ve identified them and know so much about them. It’s like the drunk looking for his keys under the lamppost because that’s where the light is.

All the molecular biology you need to understand what follows is in the following post — https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure.

The Neuron paper is detailed and fascinating to a neurologist, but toward the end it begins to fry far bigger fish.

Until about 10 years ago, molecular biology was incredibly protein-centric. Consider the following terms — nonsense codon, noncoding DNA, junk DNA. All are pejorative and arose from the view that all the genome does is code for protein. Nonsense codon means one of the 3 termination codons, which tells the ribosome to stop making protein. Noncoding DNA means not coding for protein (with the implication that DNA not coding for protein isn’t coding for anything).

Well all that has changed. The ENCODE Consortium showed that well over half (and probably all) our DNA is transcribed into RNA — for details see https://en.wikipedia.org/wiki/ENCODE. This takes energy, and it is doubtful (to me at least) that organisms would waste this much energy if the products were not doing something useful.

I’ve discussed microRNAs elsewhere — for details please see — https://luysii.wordpress.com/2010/07/14/junk-dna-that-isnt-and-why-chemistry-isnt-enough/. They don’t code for protein either, but control how much of a given protein is made.

The Neuron paper concerns lncRNAs (long noncoding RNAs). They don’t code for protein either, and each contains over 200 nucleotides. There are a lot of them (10,000 – 50,000 are known to be expressed in man). Amazingly, 40% of them are expressed in the brain, and not just in adult life, but during embryonic development. Expression of some of them is restricted to specific brain areas. It is easier for an embryologist to tell what type a cell is during brain cortical development by looking at the lncRNAs it expresses than by looking at the proteins it is making. The paper contains multiple examples of lncRNAs controlling when and where a protein is made in the brain.
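
For the programmers in the audience, here is a minimal sketch of the working definition just given: longer than 200 nucleotides and not coding for protein. The 200 nucleotide cutoff comes from the text above. The rule "no open reading frame of 100 codons or more" is my own stand-in for "doesn't code for protein" and is an assumption, as are the function names. Real annotation pipelines use far more evidence than this.

```python
# The three termination ("nonsense") codons that tell the ribosome to stop.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def longest_orf_codons(rna: str) -> int:
    """Longest AUG-to-stop open reading frame, in codons (stop not counted).

    Only the given strand is scanned, in all three reading frames.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input too
    best = 0
    for frame in range(3):
        in_orf = False
        current = 0
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if not in_orf:
                if codon == "AUG":  # start codon opens a reading frame
                    in_orf = True
                    current = 1
            elif codon in STOP_CODONS:  # stop codon closes it
                best = max(best, current)
                in_orf = False
            else:
                current += 1
    return best


def looks_like_lncrna(rna: str, min_length: int = 200,
                      max_orf_codons: int = 100) -> bool:
    """Crude test: over min_length nucleotides and no long reading frame.

    max_orf_codons is an assumed cutoff, not something from the review.
    """
    return len(rna) > min_length and longest_orf_codons(rna) < max_orf_codons
```

A 240 nucleotide transcript with no start codon passes the test, while one carrying a 121 codon reading frame does not, however long it is.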

lncRNAs can contain multiple domains, each of which has a different affinity for a particular RNA (such as the mRNA for a protein), or DNA, or protein. In the nucleus they influence the DNA binding sites of transcription factors, RNA polymerase II, and the polycomb repressor complex. The review goes on with many specific examples of lncRNA function: synaptic plasticity, neurite extension.

Getting back to proteins, the vast majority are nearly the same in all mammals (this is where the 2% chimpanzee argument comes from). Here is where it gets interesting. Roughly 1/3 of lncRNAs found in man are primate specific. This includes hundreds of lncRNAs found only in man. The paper gives evidence that hundreds of them have undergone positive selection in humans.

So the paper provides yet another mechanism (with far more detail than I’ve been able to provide here) for why our brains are so much larger than, and different in so many ways from, the brain of our nearest evolutionary relative, the chimpanzee. This is the largest molecular biological difference found so far between the human brain and every other brain. Fascinating stuff. Stay tuned. I think this is a watershed paper.

Why drug discovery is so hard: Reason #21 — RNA sequences won’t help you determine function

We are just beginning to understand all the things RNA does in the cell, even though its importance has been obvious to all for half a century (think messenger RNA, which goes back that far). This means that RNA is likely to be a target of useful drugs. Posts #4, #11 and #20 concern some of the more newly discovered effects of RNA in the cell.

While we’re still discovering proteins with no obvious resemblance in their amino acid sequence to known proteins, most newly found proteins do resemble something we’ve seen before. So if we see a kinase-like domain, or a group of 7 rather hydrophobic sequences, we have a leg up on what that protein is actually doing.

A similar attack (comparing sequences to RNAs of known function) should help us figure out what some of the RNAs in the cell not coding for protein are actually doing. If you see a mistyke in this sentence, you still probably know what I meant (i.e. how that word is meant to function in the sentence). That’s the hope underlying the technique anyway.
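
To make the idea concrete, here is a rough sketch of that kind of attack: score an unknown RNA against RNAs of known function by how many short subsequences (k-mers) they share, and take the best hit. This is a deliberately crude stand-in for real alignment tools, and the function names and the choice of 4-mers are my assumptions, not anything from the papers discussed here.

```python
def kmers(seq: str, k: int = 4) -> set:
    """Set of all overlapping k-letter subsequences of seq."""
    seq = seq.upper().replace("T", "U")  # treat DNA and RNA spellings alike
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_similarity(a: str, b: str, k: int = 4) -> float:
    """Jaccard index of the two k-mer sets: 1.0 = same k-mers, 0.0 = none shared."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0  # a sequence shorter than k has no k-mers to compare
    return len(ka & kb) / len(ka | kb)


def best_match(query: str, known: dict, k: int = 4) -> tuple:
    """Return (name, score) of the known RNA most similar to the query.

    `known` maps a hypothetical RNA name to its sequence.
    """
    name = max(known, key=lambda n: kmer_similarity(query, known[n], k))
    return name, kmer_similarity(query, known[name], k)
```

The guess at function is then "whatever the best match does", which is only as good as the assumption that similar sequence means similar function.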

Recent work in the zebrafish [ Cell vol. 147 pp. 1537 – 1550 ’11 ] shows that this approach isn’t very likely to work in the RNA world. For some background on large intervening noncoding RNAs (lincRNAs, aka lncRNAs) see https://luysii.wordpress.com/2011/03/02/we-dont-know-all-the-players-which-is-why-finding-good-drugs-is-so-hard/. The zebrafish has become a plaything of embryologists (because it is transparent, and because, being a fish, it is a vertebrate).

At any rate, the work found some 550 distinct lincRNAs in the zebrafish. But only 29 had detectable sequence similarity with lincRNAs in mammals (which are just as numerous). Even though chromosomes have been scrambled many times over geologic time, many genes near each other in the zebrafish are near each other in humans as well (the term for this is synteny). This means one can look at DNA to see where the lincRNA is binding in two organisms, and infer that the two are doing something similar physiologically if they are binding to a syntenic site.
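
Here is a toy sketch of a synteny check. I'm assuming that a "syntenic site" can be approximated as "flanked by the same pair of orthologous protein-coding genes in both species", and that a table of orthologs is already in hand. The data layout and all names are hypothetical, and this is not the method used in the Cell paper.

```python
from typing import Dict, List, Optional, Tuple

# A chromosome is an ordered list of (name, kind) pairs,
# where kind is "coding" or "lincRNA".
Chromosome = List[Tuple[str, str]]


def flanking_coding_genes(chrom: Chromosome,
                          linc: str) -> Tuple[Optional[str], Optional[str]]:
    """Nearest protein-coding gene on each side of the named lincRNA."""
    idx = [name for name, _ in chrom].index(linc)
    left = next((n for n, kind in reversed(chrom[:idx]) if kind == "coding"), None)
    right = next((n for n, kind in chrom[idx + 1:] if kind == "coding"), None)
    return left, right


def are_syntenic(fish_chrom: Chromosome, fish_linc: str,
                 human_chrom: Chromosome, human_linc: str,
                 orthologs: Dict[str, str]) -> bool:
    """True if both lincRNAs sit between the same pair of orthologous genes.

    `orthologs` maps fish gene names to human gene names. Sets are compared,
    so an inverted gene order still counts as the same neighborhood.
    """
    fish_left, fish_right = flanking_coding_genes(fish_chrom, fish_linc)
    human_left, human_right = flanking_coding_genes(human_chrom, human_linc)
    if None in (fish_left, fish_right, human_left, human_right):
        return False  # lincRNA at a chromosome end: no pair to compare
    mapped = {orthologs.get(fish_left), orthologs.get(fish_right)}
    return mapped == {human_left, human_right}
```

Note that nothing in this check looks at the sequence of the lincRNA itself, which is the whole point: two lincRNAs can pass it while sharing almost no sequence.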

So they did this and found some lincRNAs with almost no sequence similarity to each other binding to identical syntenic sites in man and zebrafish. Next they used antisense reagents targeting the small regions of the lincRNAs conserved between us and fish, and produced developmental defects (in the fish). Amazingly, despite very little sequence similarity, human orthologs (determined by synteny) could prevent the embryological defects.

So in this case at least, and probably more generally, we’re not going to be able to look at the sequence of lincRNAs (or the many other types of non-messenger RNAs present in the cell) and infer what they are doing. This will make drug discovery in this area even harder.