Reason #19 why drug discovery is so hard: we are far from knowing all the players in the cell. (For the first 18 see https://luysii.wordpress.com/2011/11/21/a-new-category/). Here’s a shocker showing how little we know about proteins. You’d think that, by now, we’d know just about everything about them: how they are made from a gene (including the splice variants a single gene can produce) and how they are destroyed. But we don’t.
[ Cell vol. 147 pp. 789 – 802 ’11 ] is an incredible paper. It shows that, for 5000 protein coding genes in mouse embryonic stem cells, translation of the mRNA begins at 13,454 initiation sites. 65% of the mRNAs had more than one site where translation begins (a start site), and 16% had more than 4 start sites. All the background a pure chemist needs to understand all this is in the Category “Molecular Biology Survival Guide for Chemists”.
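To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above, and it assumes (my simplification, not the paper's) one mRNA per gene so that the percentages can be turned into gene counts.

```python
# Figures quoted from the Cell paper in the paragraph above.
genes = 5000
initiation_sites = 13454

# Average number of start sites per gene.
mean_sites_per_gene = initiation_sites / genes

# Assumption: one mRNA per gene, so percentages of mRNAs map onto gene counts.
multi_start = 65 * genes // 100      # mRNAs with more than one start site
more_than_four = 16 * genes // 100   # mRNAs with more than 4 start sites

print(f"{mean_sites_per_gene:.2f} start sites per gene on average")
print(f"{multi_start} mRNAs with more than one start site")
print(f"{more_than_four} mRNAs with more than 4 start sites")
```

So on average each gene has between 2 and 3 places where translation can begin.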
The start sites could be within the coding section of the gene, giving amino truncated products, or upstream of (5′ to) the coding section, giving proteins with an amino terminal extension. A recent paper [ Proc. Natl. Acad. Sci. vol. 109 pp. 197 – 202 ’12 ] gives an example of just how important an amino truncated protein can be. Checkpoint kinase 1 (Chk1) is a crucial regulator of the cell cycle, preventing mitosis from occurring in cells with damaged DNA. An amino terminally truncated variant of Chk1 (due to alternative splicing, not different initiation) binds Chk1 and represses its activity, letting the cell cycle proceed. DNA damage results (by a complicated mechanism) in phosphorylation of Chk1, relieving the inhibition by the amino truncated variant, and allowing Chk1 to stop the cell cycle.
The authors of the Cell paper also found a class of short RNAs coding for multiple small proteins (they call them sprcRNAs, for short polycistronic ribosome associated coding RNAs). These short proteins (or peptides if you wish; when a peptide is long enough to be called a protein is a matter of taste) weren’t known before.
So now we have a whole bunch of new proteins in the cell, most related to known ones. Could the drugs we have be acting on the new ones rather than on what we thought was their actual target?
The way this was found is almost as interesting as what they found. It involves a technique called ribosome profiling. For background on the ribosome see https://luysii.wordpress.com/2012/01/09/molecular-biology-survival-guide-for-chemists-v-the-ribosome/.
The ribosome is large, a roughly spherical blob 250 – 300 Angstroms in diameter, with the active site of protein synthesis nearly in the center of the particle. The stretch of messenger RNA sitting inside an active ribosome is protected from enzymes which can destroy it (nucleases). So chop up all the unprotected RNA in the cell with a nuclease, disassemble the ribosomes, then use reverse transcriptase to make DNA copies of the messenger RNA fragments that are left, and sequence all of them (using Illumina deep sequencing).
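Once the protected fragments (footprints) are sequenced and aligned back to a transcript, the bookkeeping is simple counting. Here is a minimal sketch, not the authors' pipeline. The function name, the 26 to 34 nucleotide size window, and the 12 nucleotide offset from the 5′ end of a footprint to the ribosome's P site are all assumptions I've chosen for illustration.

```python
from collections import Counter

def footprint_density(reads, p_site_offset=12, min_len=26, max_len=34):
    """Count ribosome footprints per transcript position.

    reads: iterable of (start, length) tuples, where start is the 0-based
           position of the 5' end of an aligned footprint on the transcript.
    p_site_offset, min_len, max_len: illustrative assumptions, not values
           taken from the paper.
    """
    density = Counter()
    for start, length in reads:
        # Discard fragments too short or too long to be ribosome-protected.
        if min_len <= length <= max_len:
            density[start + p_site_offset] += 1
    return density

# Made-up alignments: three footprints at one spot, one nearby, one junk read.
reads = [(0, 30), (0, 29), (3, 30), (40, 18), (0, 31)]
density = footprint_density(reads)
print(sorted(density.items()))  # [(12, 3), (15, 1)]
```

Positions with many footprints are positions where ribosomes spend a lot of time, or where a lot of them are sitting.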
By using inhibitors of either translation initiation (harringtonine) or progression (elongation), it is possible to find translation start sites, along with their distribution. An initiation inhibitor leaves ribosomes stuck where they start, so the protected fragments pile up at the start sites. You can also find out just how fast ribosomes are translating mRNA (about 6 amino acids/second in this system).
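To make both ideas concrete, here is a toy sketch. It is emphatically not the method used in the paper. The first function calls a start site wherever footprints from inhibitor-treated cells form a local peak sitting on a plausible start codon (the codon list and read threshold are my assumptions). The second fits a straight line to how far the ribosome-free zone has advanced from the start site at several times after initiation is blocked; the slope is the elongation rate. All the data are made up, and the run-off numbers were chosen to give the roughly 6 codons/second quoted above.

```python
def call_start_sites(density, sequence, min_reads=10,
                     codons=("ATG", "CTG", "GTG", "TTG")):
    """Return positions where footprints peak on an assumed start codon.

    density: dict mapping transcript position -> footprint count.
    sequence: transcript sequence (DNA alphabet), 0-based.
    min_reads and codons are illustrative assumptions.
    """
    sites = []
    for pos in sorted(density):
        count = density[pos]
        is_peak = (count >= min_reads
                   and count >= density.get(pos - 1, 0)
                   and count >= density.get(pos + 1, 0))
        if is_peak and sequence[pos:pos + 3] in codons:
            sites.append(pos)
    return sites

def elongation_rate(times_s, depletion_codons):
    """Least-squares slope of run-off distance vs time (codons/second)."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_d = sum(depletion_codons) / n
    num = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(times_s, depletion_codons))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den

# Hypothetical transcript and footprint counts.
sequence = "GGGATGAAACTGCCCTAA"
density = {3: 40, 4: 5, 9: 15, 13: 30}
sites = call_start_sites(density, sequence)
print(sites)  # [3, 9]: an ATG and a CTG; position 13 is a peak but not on a listed codon

# Hypothetical run-off data: seconds after the block, codons cleared of ribosomes.
rate = elongation_rate([90, 120, 150], [360, 540, 720])
print(rate)  # 6.0
```

A real analysis has to cope with noise, overlapping reading frames, and the lag before the inhibitor takes effect, which is why the fitted line need not pass through zero.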