Tag Archives: 3′ Untranslated Region

Forgotten but not gone — take III

It’s pretty clear that life originated in the RNA world.  Consumed by thinking of proteins, enzymes, DNA etc., we tend to forget that there is a lot of RNA out there doing things we didn’t suspect.  Here are two more examples, one of which may explain why even genes coding for proteins are relatively free of codons translated into amino acids.  The champ of course is dystrophin, discussed in the last post (https://luysii.wordpress.com/2019/05/05/duchenne-muscular-dystrophy-a-novel-genetic-treatment/).  The gene is a monster with 2,220,233 nucleotides coding for just 3,685 amino acids, meaning that less than 1/200th of the gene is actually coding for protein. The work below should make us think about just what else the other 199/200ths of dystrophin might be doing.
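The 1/200 figure is easy to check. Here is a minimal sketch of the arithmetic using the two numbers quoted above; the function and constant names are mine, and the stop codon is ignored for simplicity.

```python
# Figures quoted in the post for the dystrophin gene.
GENE_NUCLEOTIDES = 2_220_233
AMINO_ACIDS = 3_685


def coding_fraction(gene_nt: int, amino_acids: int) -> float:
    """Fraction of the gene's nucleotides that specify amino acids.

    Each amino acid is specified by one 3-nucleotide codon.
    """
    return (3 * amino_acids) / gene_nt


fraction = coding_fraction(GENE_NUCLEOTIDES, AMINO_ACIDS)
print(f"coding nucleotides: {3 * AMINO_ACIDS:,}")
print(f"coding fraction:    {fraction:.5f}")
print(f"below 1/200:        {fraction < 1 / 200}")
```

So 11,055 coding nucleotides out of 2,220,233 comes to just under half a percent.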

Unsuspected use of RNA #1.   [ Neuron vol. 102 pp. 507 – 509, 553 – 563 ’19 ]  The Tumor protein p53 inducible nuclear protein 2 (Tp53inp2) gene codes for a low complexity protein of 222 amino acids, all in one exon.  However the 3′ untranslated region (3’UTR) of the RNA for it is nearly 5 times longer (3,121 nucleotides vs. 666 amino acid coding nucleotides).  The protein is made from the mRNA in some cells, but not in sympathetic neurons, even though the mRNA for Tp53inp2 is the most abundant RNA in the axons of these neurons.

Why do animals lick their wounds?  Because their saliva contains nerve growth factor (NGF) among other things.  NGF is crucial for the growth of sympathetic neuron axons, and their very survival in embryonic life.  It is a protein, which binds to a receptor for it (TrkA) on the axon membrane.  The receptor/NGF complex is then internalized and transported back to the nucleus turning on the genes necessary for axon growth and cell survival.

Even though the mRNA for Tp53inp2 is NOT translated into protein in the axon, it is crucial for the internalization of TrkA/NGF.

People have studied proteins whose function it is to bind RNA for years.  They are called RBPs (RNA Binding Proteins), and our genome has 750 of them.  200 RBPs are associated with genetic disease.  This work turns everything on its head.  Here is an RNA whose function it is to bind a protein (e.g. TrkA).

How many more mRNAs have nonCoding (for protein) parts with other functions?

Unsuspected use of RNA #2. Circular RNAs had been missed for years (although known since 1976).  The classic sequencing methods isolate only RNAs with characteristic tails (such as polyAdenine).  Circular RNAs don’t have any.  They are formed by back splicing of the 3′ end of exon N to the 5′ end of exon N (or of an exon upstream of it).  Fortunately this is only 1% as efficient as the normal way.

So what?  Circular RNAs are crucial in the innate immune response to microbial invaders.  Long double stranded RNA doesn’t belong in the cytoplasm.  When some organism brings it there, it binds to Protein Kinase R (PKR), activating it so that it phosphorylates eukaryotic initiation factor 2 (eIF2), bringing protein synthesis to a screeching halt.

This means that the cell needs a mechanism to keep PKR quiet.  This is where circular RNAs come in   [ Cell vol. 177 pp. 797 – 799, 865 – 880 ’19 ].  If the nucleotides in the circle can reach across the circle and base pair with each other forming a duplex of any length, it will bind to PKR inhibiting it.  Most circular RNAs are expressed at only a handful of copies/cell, the cell containing just 10,000 of them.

The work found that overexpression of a single circular RNA able to form duplexes (dsRNA) inhibits PKR.  Overexpression of linear RNA of the same sequence does not, nor does overexpression of circular RNA which can’t form dsRNA.
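To make "reaching across the circle" concrete, here is a toy sketch that looks for the longest stretch of a circular RNA able to base pair with another stretch of the same circle. It is only an illustration: the sequences are made up, only Watson-Crick pairs are counted, and nothing about folding energetics or PKR binding is modeled.

```python
# Watson-Crick pairs only; G-U wobble pairs are ignored in this sketch.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_duplex(circle: str) -> int:
    """Length of the longest perfect intramolecular duplex in a circular RNA.

    Base i+k is paired with base j-k (indices wrap around the circle), and
    the stem is grown until a mismatch or until the two strands would overlap.
    """
    n = len(circle)
    best = 0
    for i in range(n):
        for j in range(n):
            strand_a, strand_b = set(), set()
            k = 0
            while 2 * (k + 1) <= n:
                a = (i + k) % n
                b = (j - k) % n
                if (circle[a], circle[b]) not in PAIRS:
                    break
                # A base cannot sit on both strands of the same stem.
                if a in strand_b or b in strand_a:
                    break
                strand_a.add(a)
                strand_b.add(b)
                k += 1
            best = max(best, k)
    return best


print(longest_duplex("GGGGAAAACCCCAAAA"))  # GGGG pairs with CCCC -> 4
print(longest_duplex("AAAAAAAA"))          # nothing to pair with -> 0
```

The brute-force search is cubic in the length of the circle, which is fine for a toy but not for real transcripts.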

So when an invader with dsDNA or dsRNA gets into the cell, RNase L, a cytoplasmic endonuclease, is activated, cleaving circular RNA, and uninhibiting PKR.

So it’s back to the drawing board for mRNA and those parts (introns, 3’UTRs) we didn’t think were doing anything.  Perhaps that’s why there are so many of them, and why they take up more room in mRNA and genes than the ones coding for amino acids.  Also it’s time to look at RNAs as protein binders and modifiers, rather than the other way around as we have been doing.

Here’s a link to an earlier member of the series — https://luysii.wordpress.com/2019/04/15/forgotten-but-not-gone-take-ii/xa


Cultural appropriation, neuroscience division

If Deng Xiaoping can have Socialism with Chinese Characteristics, I can have a Chinese saying with neuroscientific characteristics — “The axon and the dendrite are long and the nucleus is far away” mimicking “The mountains are high and the Emperor is far away”. The professionally offended will react to the latest offense du jour — cultural appropriation  — of course.  But I’m entitled and I spoke to my Chinese daughter in law, and people over there found it flattering and admiring of Chinese culture that the girl in Utah wore a Chinese cheongsam dress to her prom.

Back to the quote.  “The axon and the dendrite are long and the nucleus is far away”.  Well, neuronal ends are far away from the cell body.  The best example is the axons from the sacral spinal cord, which in an NBA player can be a yard long.  But forget that, let’s talk about the ends of dendrites, which are much closer to the cell body than that.

Presumably neurons have different types of dendrites so they can respond to different types of inputs. Why should dendrites respond identically if their inputs are different? They don’t.  A dendrite responding to acetylcholine will express neurotransmitter receptors distinct from those of another dendrite on the same neuron responding to dopamine.  The protein cohorts of axons and dendrites are different.  How does this come about?  Because the untranslated part of mRNA on the 3′ end (3’UTR) contains a sequence called a zipcode which binds to specific proteins which then move the mRNA to a specific location in the neuron (axon or dendrite).  Presumably all dendrites initially had the same complement of mRNA.

So depending on what’s happening at a particular dendrite on a neuron, more or less of a given protein is made.   This is way too abstract.  Suppose you want to strengthen a synapse.  You’d make more of a neurotransmitter receptor or an ion channel for whatever transmitter that dendrite is getting.

It is well established that axons and dendrites store mRNAs and make proteins from them far from the nucleus (aka the emperor).  If you think about it, just how a receptor for dopamine gets to a dendrite receiving dopamine and not to a dendrite (on the same neuron) getting glutamic acid as a transmitter, is far from clear.  There are zipcodes distinguishing axons from dendrites, but I’m unaware that there are zipcodes for dopamine dendrites distinct from other types of dendrites.

If that weren’t enough consider [ Neuron vol. 98 pp. 495 – 511 ’18 ].  Even for an mRNA coding for the same protein (presumably transcribed from just one gene), there can be more than one type of 3’UTR (and this in the same cell).  Note also that 3’UTRs are longer in neurons than in other tissues.

So the authors looked at the mRNAs in dendrites.  They did this by choosing a tissue (the hippocampus) where rows of cell bodies are well separated from their dendrites.  They found that for a given dendritic mRNA there was more than one 3’UTR, and that the mRNAs with longer 3’UTRs had longer half-lives.  Even more exquisitely, neuronal activity altered the proportion of the different 3’UTR isoforms. The phenomenon is quite general: over 50% of all genes and over 70% of genes enriched in neurons showed multiple 3′ UTRs.
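A difference in half-life is enough by itself to shift the isoform mix over time. Here is a minimal sketch, assuming simple exponential decay with no new mRNA arriving; the two half-lives are hypothetical numbers chosen for illustration, not values from the paper.

```python
# Hypothetical half-lives (hours), chosen only to illustrate the effect.
SHORT_UTR_HALF_LIFE_H = 2.0
LONG_UTR_HALF_LIFE_H = 8.0


def remaining(initial: float, half_life_h: float, elapsed_h: float) -> float:
    """Amount of mRNA left after elapsed_h hours of exponential decay."""
    return initial * 0.5 ** (elapsed_h / half_life_h)


def long_isoform_share(elapsed_h: float, start_each: float = 100.0) -> float:
    """Fraction of surviving mRNA carrying the long 3'UTR.

    Both isoforms start at the same abundance and nothing is resupplied.
    """
    short = remaining(start_each, SHORT_UTR_HALF_LIFE_H, elapsed_h)
    long_ = remaining(start_each, LONG_UTR_HALF_LIFE_H, elapsed_h)
    return long_ / (short + long_)


for hours in (0, 4, 8):
    print(hours, round(long_isoform_share(hours), 3))
```

With these made-up numbers the long isoform goes from half of the pool to nearly 90% of what survives after 8 hours.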

So there is a whole control system built into the dendritic system, and it varies with what is happening locally.

The emperor emits directives (mRNAs) but what happens locally is anyone’s guess.

The old year goes out with a bang

A huge amount of cellular genomics will have to be redone if the following paper is replicated. Remember “Extraordinary claims require extraordinary evidence.” Carl Sagan.

What’s all the shouting about? Normally when you think about messenger RNA (mRNA) as it exists in the cytoplasm after the initial transcript is significantly massaged in the nucleus, you think about the part that codes for amino acids. This ‘coding region’ is the part that is translated into amino acids by the ribosome. But mRNA is invariably larger having nucleotides at each end (3′ and 5′) which have other uses. These are called the 3′ Untranslated Region (3′ UTR) and 5′ Untranslated Region (5′ UTR).

So if you do single cell RNA sequencing (which we can do now) it shouldn’t matter what nucleotide sequence you search for (5′ UTR, 3′ UTR or the coding region) as all mRNA contains one of each.

Not so says this paper [ Neuron vol. 88 pp. 1149 – 1156 ’15 ].

Given the mRNA for a given protein in a single cell, using a probe for the 3’UTR and a probe for the coding sequence should give you the same abundance for both. That’s not what they found at all for single neurons from the brain. In some cases there was much more RNA coding for the 3’UTR than for the coding segment of a given mRNA for a protein. In others there was much less. Even more impressive is that the 3’UTR/(3’UTR + coding) ratio for a given protein varies between different parts of the brain. Obviously this ratio should be .5 given what we knew about mRNA in the past. The ratio has to be between 0 and 1.
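The ratio itself is trivial to compute from the two probe signals. Here is a sketch with made-up signal values; the function name and the numbers are mine, not the paper's.

```python
def utr_ratio(utr_signal: float, coding_signal: float) -> float:
    """3'UTR / (3'UTR + coding), always between 0 and 1.

    .5 means equal signal from both probes; values below .5 mean less
    3'UTR signal than coding signal, values above .5 mean more.
    """
    total = utr_signal + coding_signal
    if total <= 0:
        raise ValueError("need some signal from at least one probe")
    return utr_signal / total


# Made-up probe signals for three hypothetical genes.
for utr, coding in [(50, 50), (20, 80), (80, 20)]:
    print(utr, coding, utr_ratio(utr, coding))
```

Dividing by the sum rather than by the coding signal alone keeps the number bounded, which is what makes it convenient to plot across thousands of genes.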

Well, they looked at a lot of proteins. They did find around 1,400 genes with a ratio of .5 (as expected), but they found 700 showing a ratio of .2 (lots more coding sequence than 3’UTR), and 1,100 showing a ratio of .8 (lots more 3’UTR than coding sequence). Overall plotting the ratio vs. number of genes with that ratio gives something looking like a bell curve (Gaussian distribution).

It’s long been known that mRNA levels don’t exactly correlate with the levels of proteins made from them. If there are lots of 3’UTRs around, the authors found that there was relatively little protein made from the gene.

A variety of brain atlases have published mRNA abundances for various regions of the brain. If they just used one probe (as they probably did) this is clearly not enough.

The 3’UTRs may be acting as ceRNAs (competitive endogenous RNAs). These have been known for years — I’ve included a post of 3 years ago on the subject (at the end).

So this work (if replicated) throws everything we thought we knew about mRNA into a cocked hat. It’s why I love science, there’s always something really new to think about. Happy New Year !!!

Chemiotics II
Lotsa stuff, basically scientific — molecular biology, organic chemistry, medicine (neurology), math — and music
Why drug discovery is so hard: reason #20 — competitive endogenous RNAs

The chemist will appreciate le Chatelier’s principle in action in what follows. We are far from knowing all the players controlling cellular behavior. So how in the world will we find drugs to change cellular behavior when we don’t know all the things affecting it? The latest previously unknown cellular players to enter the lists are competitive endogenous RNAs (ceRNAs). For details see Cell vol. 147 pp. 344 – 357, 382 – 395 ’11. The background the pure chemist needs for what follows can all be found in the category “Molecular Biology Survival Guide”.

Recall that microRNAs are short (20 something) polynucleotides which bind to the 3′ untranslated region (3′ UTR) of mRNA, and either (1) inhibit its translation into protein or (2) cause its degradation. In each case, less of the corresponding protein is made. The microRNA and the appropriate sequence in the 3′ UTR of the mRNA form an RNA-RNA double helix (G on one strand binding to C on the other, etc.). Visualizing such helices is duck soup for a chemist.
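For the non-chemist, the pairing rule can be turned into a simple site search. The sketch below looks in a 3′ UTR for stretches complementary to part of a microRNA. Both sequences are made up, and limiting the match to nucleotides 2 through 8 of the microRNA (the so-called seed) is a common simplification that I am assuming here; real target prediction is considerably messier.

```python
# Watson-Crick complement for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Sequence that would pair with `rna` in an antiparallel double helix."""
    return rna.translate(COMPLEMENT)[::-1]


def find_seed_sites(utr: str, mirna: str) -> list[int]:
    """0-based positions in the 3'UTR that perfectly pair with the seed.

    The seed is taken as microRNA nucleotides 2-8 (slice [1:8]), which is
    an assumption of this sketch.
    """
    target = reverse_complement(mirna[1:8])
    positions = []
    start = utr.find(target)
    while start != -1:
        positions.append(start)
        start = utr.find(target, start + 1)
    return positions


# Made-up sequences, written 5' to 3'.
MIRNA = "ACGGUACCUUAGCAUGGCAUCA"
UTR = "AAUUGGUACCGCUCUAAGGUACCGUU"

print(find_seed_sites(UTR, MIRNA))  # [4, 17]
```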

Molecular biology is full of such semantic cherry bombs as nonCoding DNA (which meant DNA which didn’t code for protein), a subset of Junk DNA. Another is the pseudogene: these are genes that look like they should code for protein, except that they don’t because of lack of an initiation codon or a premature termination codon. Except for these differences, they have the nucleotide sequence to code for a known protein. It is estimated that the human genome contains as many pseudogenes (20,000) as it contains true protein coding genes [ Genome Res. vol. 12 pp. 272 – 280 ’02 ]. We now know that well over half the genome is transcribed into mRNA, including the pseudogenes.

PTEN (you don’t want to know what it stands for) is a 403 amino acid protein which is one of the most commonly mutated proteins in human cancer. Our genome also contains a pseudogene for it (called PTENP). Interestingly deletion of PTENP (not PTEN) is found in some cancers. However PTENP deletion is associated with decreased amounts of the PTEN protein itself, something you don’t want as PTEN is a tumor suppressor. How PTEN accomplishes this appears to be fairly well known, but is irrelevant here.

Why should loss of PTENP decrease PTEN itself? The reason is that the mRNA made from PTENP, even though it has a premature termination codon and can’t be made into protein, is just as long, so it also contains the 3’UTR of PTEN. This means PTENP is sopping up microRNAs which would otherwise decrease the level of PTEN. Think of PTENP mRNA as a sponge.

Subtle, isn’t it? But there’s far more. At least PTENP mRNA closely resembles the PTEN mRNA. However other mRNAs coding for completely different proteins also have binding sites in their 3’UTR for the microRNA which binds to the 3’UTR of PTEN, resulting in its destruction. So transcription of a completely different gene (the example of ZEB2 is given) can control the abundance of another protein. Essentially its mRNA is acting as a sponge, sopping up the killer microRNA.

It gets worse. Most microRNAs have binding sites on the mRNAs of many different proteins, and PTEN itself has a 3’UTR which binds to 10 different microRNAs.

So here is a completely unexpected mechanism of control of protein levels in the cell. The general term for this is competitive endogenous RNA (ceRNA). Two years ago the number of human microRNAs was thought to be around 1,000. Unlike protein coding genes, it’s far from obvious how to find them by looking at the sequence of our genome, so there may be quite a few more.
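The sponge idea can be put into a toy calculation. In the sketch below a fixed pool of one microRNA spreads itself evenly over every binding site in the cell, so removing a sponge transcript leaves more microRNA for PTEN. All the numbers are hypothetical, and equal affinity at every site is an assumption of the sketch, not something the papers claim.

```python
def mirna_on_target(mirna_pool: float, sites: dict[str, int], target: str) -> float:
    """microRNA molecules landing on `target`, assuming every site binds equally well."""
    total_sites = sum(sites.values())
    return mirna_pool * sites[target] / total_sites


MIRNA_POOL = 400.0  # hypothetical number of microRNA molecules

# Hypothetical numbers of binding sites offered by each transcript population.
with_sponge = {"PTEN": 100, "PTENP": 100, "ZEB2": 200}
without_ptenp = {"PTEN": 100, "ZEB2": 200}

print(mirna_on_target(MIRNA_POOL, with_sponge, "PTEN"))    # 100.0
print(mirna_on_target(MIRNA_POOL, without_ptenp, "PTEN"))  # about 133.3
```

More microRNA on PTEN mRNA means less PTEN protein, which is the direction seen when PTENP is deleted.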

So most microRNAs bind the 3’UTR of more than one protein (the average number is unclear at this point), and most proteins have binding sites for microRNAs in their 3’UTR (again the average number is unclear). What a mess. What subtlety. What an opportunity for the regulation of cellular function. Who is going to be smart enough to figure out a drug which will change this in a way that we want? Absence of evidence of a regulatory mechanism is not evidence of its absence. A little humility is in order.

Are you as smart as the (inanimate) blind watchmaker

Here’s a problem the cell has solved. Can you? Figure out a way to send a protein to two different membranes in the cell (the membrane enclosing it { aka the plasma membrane }, and the endoplasmic reticulum) in the proportions you wish.

The proteins must have exactly the same sequence and content of amino acids, ruling out alternative splicing of exons in the mRNA (if this is Greek to you have a look at the following post — https://luysii.wordpress.com/2012/01/09/molecular-biology-survival-guide-for-chemists-v-the-ribosome/ and the others collected under — https://luysii.wordpress.com/category/molecular-biology-survival-guide/).

The following article tells you how the cell does it. Recall that not all of the messenger RNA (mRNA) is translated into protein. The ribosome latches on to the 5′ end of the mRNA,  subsequently moving toward the 3′ end until it finds the initiator codon (AUG which codes for methionine). This means that there is a 5′ untranslated region (5′ UTR). It then continues moving 3′ ward stitching amino acids together.  Similarly after the ribosome reaches the last codon (one of 3 stop codons) there is a 3′ untranslated region (3′ UTR) of the mRNA. The 3′ UTR isn’t left alone but is cleaved and a polyAdenine tail added to it. The 3′ UTR is where most microRNAs bind controlling mRNA stability (hence the amount of protein produced from a given mRNA).
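The three-part layout just described can be sketched in code. The function below takes the simplified rule at face value (first AUG, then read codons until a stop) and cuts a made-up mRNA into its 5′ UTR, coding region and 3′ UTR. The polyAdenine tail and anything else about real start-site selection are left out.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna: str) -> tuple[str, str, str]:
    """Return (5' UTR, coding region including the stop codon, 3' UTR)."""
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no AUG start codon found")
    # Step through the reading frame one codon at a time.
    for pos in range(start, len(mrna) - 2, 3):
        if mrna[pos:pos + 3] in STOP_CODONS:
            end = pos + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")


# Made-up message, written 5' to 3'.
five_utr, coding, three_utr = split_mrna("GGCACCAUGGCUUGGUAACUCUUAAUAAAGG")
print(five_utr)   # GGCACC
print(coding)     # AUGGCUUGGUAA
print(three_utr)  # CUCUUAAUAAAGG
```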

The trick used by the cell is described in [ Nature vol. 522 pp. 363 – 367 ’15 ]. The 3’UTR is alternatively processed producing a variety of short and long 3’UTRs. One such protein where this happens is CD47 — which is found on the surface of most cells where it stops the cell from being eaten by scavenger cells such as macrophages. The long 3′ UTR of CD47 allows efficient cell surface expression, while the short 3′ UTR localizes it to the endoplasmic reticulum.

How could this possibly work? Once the protein is translated by the ribosome, it leaves the ribosome and the mRNA doesn’t it? Not quite.

They say that the long 3′ UTR of CD47 acts as a scaffold to recruit a protein complex, which contains HuR (aka ELAVL1), an RNA binding protein, and SET, to the site of translation. This allows interaction of SET with the newly translated cytoplasmic domains of CD47, resulting in subsequent translocation of CD47 to the plasma membrane via activated RAC1.

The short 3′ UTR of CD47 doesn’t have the sequence binding HuR and SET, so CD47 doesn’t get to the plasma membrane, rather to the endoplasmic reticulum.
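Reduced to its logic, the trick looks like the sketch below: two messages with the identical coding region, one of which carries a 3′ UTR long enough to include the binding sequence. The motif and all sequences are placeholders I made up; the real HuR binding site is not given in this post.

```python
from dataclasses import dataclass

# Placeholder standing in for the HuR/SET-recruiting sequence; NOT the real site.
HYPOTHETICAL_HUR_SITE = "UUUAUUUAUUU"


@dataclass(frozen=True)
class Isoform:
    coding_region: str
    utr3: str


def destination(isoform: Isoform) -> str:
    """Where the protein ends up, according to the scaffold rule described above."""
    if HYPOTHETICAL_HUR_SITE in isoform.utr3:
        return "plasma membrane"
    return "endoplasmic reticulum"


CODING = "AUGGCUUGGUAA"  # made-up coding region, shared by both isoforms
LONG_UTR = "CUCUGACC" + HYPOTHETICAL_HUR_SITE + "GGCAAUAAA"
SHORT_UTR = LONG_UTR[:8]  # cut at an earlier site, before the motif

long_iso = Isoform(CODING, LONG_UTR)
short_iso = Isoform(CODING, SHORT_UTR)

print(destination(long_iso))   # plasma membrane
print(destination(short_iso))  # endoplasmic reticulum
```

The proportions going to each membrane then depend only on how often the cell makes the long versus the short 3′ UTR.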

The mechanism may be quite general as HuR binds to thousands of mRNAs. The paper gives two more examples of proteins where this happens.

It’s also worth noting that all this exquisite control does NOT involve covalent bond formation and breakage (i.e. not what we consider classic chemical reactions). Instead it’s the dance of one large molecular object binding to another in other ways. The classic chemist isn’t smiling. The physical chemist is.