The death of the synonymous codon

I don’t think anyone can fully appreciate molecular biology without a serious knowledge of organic chemistry. That said, I’m not sure just how much molecular biology chemists of any stripe actually know. So I’ve made a new category, “Molecular Biology Survival Guide for Chemists”, which will contain all the background required to understand this and future posts on the subject. Currently it contains three longish posts, called I, II and III. I’ll indicate where background is to be found by the Roman numeral. Just go to the category subheading on the left side, click it, and you’ll see all three.

Synonymous and nonsynonymous codons are discussed in III.  Until recently, it was thought that one synonymous codon (for a given amino acid) acted pretty much the same as another.  Certainly this is true for the amino acids they code for.  But they code for more than that.  Here are two examples.

Example #1. Our genes occur in pieces. Dystrophin is the protein mutated in the commonest form of muscular dystrophy. The gene for it is 2,220,233 nucleotides long, but dystrophin contains ‘only’ 3685 amino acids, not the 740,000+ amino acids the gene could specify at three nucleotides per amino acid. What happens? The whole gene is transcribed into an RNA of this enormous length, then 78 distinct segments of RNA (called introns) are removed by a gigantic multimegadalton machine called the spliceosome, and the 79 segments actually coding for amino acids (these are the exons) are linked together and the RNA sent on its way.
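The arithmetic behind those numbers is easy to check. Here is a minimal sketch in Python, using only the two figures quoted above; the variable names are mine, and the calculation ignores the stop codon and the untranslated ends of the mRNA.

```python
# Figures quoted in the post
GENE_LENGTH_NT = 2_220_233   # nucleotides in the dystrophin gene
PROTEIN_LENGTH_AA = 3685     # amino acids in dystrophin
NT_PER_CODON = 3             # three nucleotides specify one amino acid

# Amino acids the gene could specify if every nucleotide were coding
max_codons = GENE_LENGTH_NT // NT_PER_CODON

# Nucleotides actually needed to code for the protein
coding_nt = PROTEIN_LENGTH_AA * NT_PER_CODON

# Fraction of the gene that ends up coding for amino acids
coding_fraction = coding_nt / GENE_LENGTH_NT

print(max_codons)                 # 740077
print(coding_nt)                  # 11055
print(f"{coding_fraction:.2%}")   # 0.50%
```

So roughly half of one percent of the gene codes for amino acids; nearly all the rest is intron.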

All this was unknown in the 70s and early 80s when I was running a muscular dystrophy clinic and taking care of these kids. Looking back, it’s miraculous that more of us don’t have muscular dystrophy; there is so much that can go wrong with a gene this size, let alone with transcribing and correctly splicing it to produce a functional protein.

One final complication: alternate splicing. The spliceosome removes introns and splices the exons together. But sometimes exons are skipped, or one of several exons is used at a particular point in a protein. So one gene can make more than one protein. The record holder is something called the Dscam gene in the fruit fly, which can make over 38,000 different proteins by alternate splicing.
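Where does a number like 38,000 come from? It is just multiplication: if a gene has several positions where exactly one exon is chosen from a set of alternatives, the choices multiply. The sketch below uses the commonly cited counts for Dscam (12, 48, 33 and 2 alternatives at exons 4, 6, 9 and 17). Those counts are not from this post, so treat them as an assumption brought in for illustration.

```python
from math import prod

# Assumption: commonly cited numbers of mutually exclusive alternatives
# at the four variable exon positions of Drosophila Dscam.
DSCAM_ALTERNATIVES = {
    "exon 4": 12,
    "exon 6": 48,
    "exon 9": 33,
    "exon 17": 2,
}

# One alternative is picked at each position, so the counts multiply.
isoforms = prod(DSCAM_ALTERNATIVES.values())

print(isoforms)  # 38016
```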

Alternate splicing is not rare.  [ Proc. Natl. Acad. Sci. vol. 102 pp. 12813 – 12818 ’05 ] contains 7 references which variously estimate the amount of alternative splicing of mammalian genes from 22 to 74%.  What controls alternate splicing?  Sequences in the gene for the protein itself.  These sequences can be in either a given intron or a given exon and they can either enhance splicing or inhibit it.  They are called ESS (for exonic splicing suppressor) or  ESE (for exonic splicing enhancer).  ISS and ISE have similar meanings where I stands for intron.

All very nice, but ESSs and ESEs are found in exons, and mutations of them will change alternate splicing (something a functioning cell has a great interest in). It’s easy to see how changing one nucleotide in an ESS or an ESE could render it more or less effective while leaving the amino acid sequence of the underlying protein unchanged (i.e. a case where synonymous codons don’t all act exactly alike). In short, the ‘neutral mutation rate’ may in fact not be neutral at all (if the mutation is in an ESE or an ESS). Or possibly switching one amino acid for another has nothing whatever to do with the protein and everything to do with controlling alternate splicing.
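To make this concrete, here is a toy sketch in Python. The standard genetic code table is real; the short 'exon' and the GAAGAA 'enhancer' motif are invented for illustration and are not taken from any real gene or from the papers cited here. One synonymous change (GAA to GAG, both coding for glutamate) leaves the protein untouched but destroys the motif.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a DNA coding sequence codon by codon, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical enhancer motif and hypothetical exon fragment (illustration only)
TOY_ESE = "GAAGAA"
wild_type = "ATGGAAGAAGACTTT"   # ATG GAA GAA GAC TTT
mutant = "ATGGAGGAAGACTTT"      # ATG GAG GAA GAC TTT (one synonymous change)

print(translate(wild_type), TOY_ESE in wild_type)  # MEEDF True
print(translate(mutant), TOY_ESE in mutant)        # MEEDF False
```

Same protein, different splicing signal. A real ESE is recognized by splicing proteins with fuzzier sequence preferences than an exact string match, so this only illustrates the logic.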

Here is one particularly horrible example (again from the muscular dystrophy clinic). There is nothing worse than watching an infant waste away and die. That’s what Werdnig-Hoffmann disease is like, and I saw one or two cases during my years at the clinic. It is also called infantile spinal muscular atrophy. We all have two genes for the same crucial protein (called, unimaginatively, SMN). Kids who have the disease have mutations in one of the two genes (called SMN1). Why isn’t the other gene protective? It codes for the same sequence of amino acids (but using different synonymous codons). What goes wrong?

[ Proc. Natl. Acad. Sci. vol. 97 pp. 9618 – 9623 ’00 ] Why is SMN2 (the centromeric copy, i.e. the copy closest to the middle of the chromosome, which is normal in most patients) not protective? It has a single translationally silent nucleotide difference from SMN1 in exon 7. This disrupts an exonic splicing enhancer (ESE) and causes exon 7 skipping, leading to abundant production of a shorter isoform (SMN2delta7). Thus even though both genes code for the same protein, only SMN1 actually makes the full protein.

Intellectually fascinating, but truly ghastly to (ineffectually) watch. 

Example #2: MicroRNAs (for background see II). These were thought to bind to messenger RNAs at the parts not coding for amino acids (II again). They control the stability of a given mRNA, usually targeting it for destruction when they bind, and so indirectly control the levels of the protein the mRNA codes for. Well, guess what? They can also bind to the parts of the mRNA coding for amino acids. This happens in Nanog, Oct4 and Sox2, three genes crucially important in a very hot topic today: stem cells. Silent (i.e. not changing the amino acid coded for) mutations at the predicted targets abolish microRNA activity and alter the regulation of the corresponding gene. Not only that, but there are multiple microRNA targets in the coding sequences of each of the 3 genes.

Head swimming yet? The example in the next post is even more subtle, and it leads to a philosophic discussion of just how far the reduction of cellular events to chemistry can take us.
