Tag Archives: Dystrophin

A synonymous codon that isn’t

Molecular biology is simply too elegant and beautiful to be left to the molecular biologists.  So I’m going to present the intriguing result of a recent paper as I would take notes on it for myself, and then unpack it explaining the various terms contained as I go along.

If you’re really adventurous, start reading a series of 5 posts I wrote starting with https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and follow the links.

It should explain everything in the paper below.

The paper itself is Nature vol. 602 pp. 335 – 342 ’22 — https://www.nature.com/articles/s41586-022-04451-4.pdf.

The unvarnished result: Just mutating glutamine to lysine at position 61 of the KRAS oncogene (Q61K) isn’t enough to make KRAS resistant to an anticancer drug that attacks it (Osimertinib). One of the synonymous codons for glycine at position 60 must be switched to another.

OK:  let’s unpack this starting with synonymous codon.

The DNA making up our genome is a string of elements (nucleotides also known as bases) strung together.  Similarly, our proteins are strings of elements (amino acids).  The order is crucial; just as it is with the 26 letters making up words. Consider the two words united and untied.

Bases come in 4 varieties (A, T, G and C). Amino acids come in twenty varieties, of which three are glycine (G), glutamine (Q) and lysine (K). The one letter abbreviations don’t make much sense but that’s the way it is.

Since the order of both bases and amino acids is important, it’s clear that A T and T A are different. 2 bases can only code for 16 amino acids (4 x 4). Go up to 3 bases and you can code for 64 (4 x 4 x 4), which is overkill for 20 amino acids. A sequence of 3 bases is called a codon. All 64 codons code for an amino acid (except for three of them, about which much more later). This means that there must be several codons coding for the same amino acid: these are the synonymous codons.

The number of codons for a given amino acid ranges from 1 (methionine M) to 6 (Leucine L).  Here are the 4 synonymous codons for glycine — GGA, GGC, GGG and GGT.  Note how similar they are.
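For readers who like to check such counts, here is a minimal Python sketch that builds the standard genetic code and counts synonymous codons. The 64-letter string is the usual compact layout of the standard code with the bases taken in the order T, C, A, G; the names CODON_TABLE and codons_for are just labels chosen for this illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons enumerated in TCAG order.
# '*' marks the three stop (termination) codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def codons_for(amino_acid):
    """Return every codon that codes for the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)

print(codons_for("G"))        # the glycine codons
print(len(codons_for("M")))   # methionine
print(len(codons_for("L")))   # leucine
print(len(codons_for("*")))   # stop codons
```

Run as written, it lists the four glycine codons and then the counts 1, 6 and 3 for methionine, leucine and the stop codons.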

Now the human genome has 3,200,000,000 bases strung together divided into 46 pieces (the chromosomes).  If placed end to end (Dorothy Parker fashion) they would be 3 feet 3 inches (1 meter) long.  All this is in a cell so small it is invisible to the naked eye.   If this is too much to get your head around, you might enjoy the following series of 6 posts — start here and follow the links https://luysii.wordpress.com/2010/03/22/the-cell-nucleus-and-its-dna-on-a-human-scale-i/

Any 3 bases linked together code for an amino acid, but there are many different ways to ‘read’ the genome. Among the many proteins our genome codes for are the transcription factors (1,639 of them as of 2018) which bind to stretches of 10 or more bases, to activate certain genes.   That’s 4^10 possibilities (over a million) allowing a unique binding site for the 1,639.  So transcription factors read the genome in groups of 10 or so not 3.
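The counting argument is easy to check. A small sketch in Python (the function name n_sequences is made up for this illustration; 1,639 is the transcription factor count quoted above):

```python
def n_sequences(length):
    """Number of distinct sequences of the given length built from 4 bases."""
    return 4 ** length

for length in (2, 3, 10):
    print(length, n_sequences(length))

# Possible 10-base sites available per transcription factor
TRANSCRIPTION_FACTORS = 1639
print(n_sequences(10) / TRANSCRIPTION_FACTORS)
```

So there are several hundred possible 10-base sites for every transcription factor, plenty of room for each to have its own.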

There is yet another way to read the genome, and this has to do with the fact that the genes coding for proteins are much longer (have more bases) than 3 times the number of amino acids they code for. The classic example is dystrophin, a gene mutated in Duchenne muscular dystrophy. It’s a monster protein with 3,685 amino acids, so it needs 3,685 * 3 = 11,055 bases in a row to code for them at 3 bases/amino acid. The dystrophin gene, however, stretches for 2,220,233 bases. So the protein coding parts of the gene (the exons) come in 79 different pieces separated by parts that don’t code for amino acids (the introns).
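The arithmetic, as a minimal Python sketch, taking the gene length as 2,220,233 bases. It ignores the stop codon and the untranslated ends of the mRNA, so the coding fraction is approximate.

```python
PROTEIN_LENGTH = 3685        # amino acids in dystrophin
GENE_LENGTH = 2_220_233      # bases in the dystrophin gene
EXONS = 79

coding_bases = PROTEIN_LENGTH * 3           # 3 bases per amino acid
coding_fraction = coding_bases / GENE_LENGTH
introns = EXONS - 1                         # one intron between each pair of neighboring exons

print(coding_bases)
print(f"{coding_fraction:.2%} of the gene codes for protein")
print(f"roughly 1 base in {GENE_LENGTH / coding_bases:.0f}")
print(introns, "introns to splice out")
```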

I’m skipping a lot here, but the introns must be spliced out of a copy of the gene (the mRNA). So the genome is read by yet another machine, the spliceosome, which removes introns from newly formed copies of the gene. The spliceosome is a huge molecular machine containing 5 RNAs (called small nuclear RNAs, aka snRNAs), along with 50 or so proteins, with a total molecular mass of around 2,500,000 Daltons (a carbon atom is 12 Daltons). Most genes for proteins have introns and exons, and most of the proteins exist in multiple forms due to alternative splicing. The spliceosome reads the mRNA in 6 – 8 base chunks looking for sites (splicing sites) to bind and begin splicing out introns. Yet another way to ‘read’ a sequence of bases. Exon sequences which promote or repress the use of alternative splicing sites are known (these are called ESEs = exonic splicing enhancers, and ESSs = exonic splicing silencers).

And now, at very long last, we get to the four synonymous codons of glycine which aren’t functionally synonymous at all.  This isn’t trivial: they determine the base sequence a mutated gene must have to produce cancer.

Here’s the unvarnished result once again — Just mutating glutamine to lysine at position 61 of the KRAS oncogene (Q61K) isn’t enough to make KRAS resistant to an anticancer drug that attacks it (Osimertinib).  One of the synonymous codons for glycine at position 60 must be switched to another.

What is KRAS?  A protein which gets its name from a virus causing cancer in rats.  Kirsten RAt Sarcoma virus.  KRAS, when active, relays signals from outside the cell to the nucleus to make the cell proliferate.  The protein exists in active and inactive forms.  Humans have KRAS, and 3 similar proteins.  Mutations causing  members of the protein family to remain in constantly active form are found in 1/3 of all cancers.  In the case of KRAS some activating mutations occur at positions 60 and 61 of the 189 amino acid protein.  That’s all it takes.

The codon for glutamine at position 61 in KRAS is CAA.  To change it to the codon for lysine requires a change of just one base e.g. from CAA (glutamine) to AAA (lysine) and now you have  a KRAS which is always active producing cancer.

Recall that glycine has 4 codons (GGA, GGC, GGG and GGT). The one found in unmutated KRAS is GGT. This codon is never found in the KRAS Q61K mutant seen in tumors. Why? Because GGTAAA forms a splice site, which the splicing machine uses to splice the message differently. The altered message contains one of the 3 codons mentioned above not coding for an amino acid. They are called termination codons or stop codons, and tell the machinery making protein from mRNA (the ribosome) to quit. This means that the full mutated KRAS with its 189 amino acids is never made. So tumor producing KRAS has GGGAAA or GGAAAA or GGCAAA at positions 60 and 61 and never GGTAAA.
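The logic can be written out as a toy Python sketch. It does nothing more than a literal match against the six bases GGTAAA named above; real splice site recognition is far more involved, and the function name creates_splice_site is invented for this illustration.

```python
GLYCINE_CODONS = ["GGA", "GGC", "GGG", "GGT"]   # the choices at position 60
LYSINE_CODON = "AAA"                            # position 61 after Q61K (CAA -> AAA)
SPLICE_MOTIF = "GGTAAA"                         # the six bases said to act as a splice site

def creates_splice_site(codon_60, codon_61=LYSINE_CODON):
    """True if codons 60 and 61 together spell out the splice motif."""
    return codon_60 + codon_61 == SPLICE_MOTIF

for codon in GLYCINE_CODONS:
    six_bases = codon + LYSINE_CODON
    if creates_splice_site(codon):
        print(six_bases, "new splice site, no full-length KRAS")
    else:
        print(six_bases, "full-length mutant KRAS")
```

Only the GGT codon, the one normal KRAS happens to use, trips the match. With the unmutated CAA at position 61 nothing matches at all, which is why GGT is harmless in normal KRAS.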

So the 4 synonymous glycine codons have very nonsynonymous effects. Now you know. Elegant isn’t it?

 

 

Forgotten but not gone — take III

It’s pretty clear that life originated in the RNA world. Consumed by thinking of proteins, enzymes, DNA etc. we tend to forget that there is a lot of RNA out there doing things we didn’t suspect. Here are two more examples, one of which may explain why even genes coding for proteins are relatively free of codons translated into amino acids. The champ of course is dystrophin, discussed in the last post (https://luysii.wordpress.com/2019/05/05/duchenne-muscular-dystrophy-a-novel-genetic-treatment/). The gene is a monster with 2,220,233 nucleotides coding for just 3,685 amino acids, meaning that less than 1/200th of the gene is actually coding for protein. The work below should make us think about just what else the other 199/200ths of dystrophin might be doing.

Unsuspected use of RNA #1. [ Neuron vol. 102 pp. 507 – 509, 553 – 563 ’19 ] The Tumor protein p53 inducible nuclear protein 2 (Tp53inp2) gene codes for a low complexity protein of 222 amino acids, all in one exon. However the 3′ untranslated region (3′UTR) of the RNA for it is nearly 5 times longer (3,121 nucleotides) than the 666 nucleotides coding for amino acids. The protein is made from the mRNA in some cells, but not in sympathetic neurons, even though the mRNA for Tp53inp2 is the most abundant RNA in the axons of these neurons.

Why do animals lick their wounds?  Because their saliva contains nerve growth factor (NGF) among other things.  NGF is crucial for the growth of sympathetic neuron axons, and their very survival in embryonic life.  It is a protein, which binds to a receptor for it (TrkA) on the axon membrane.  The receptor/NGF complex is then internalized and transported back to the nucleus turning on the genes necessary for axon growth and cell survival.

Even though the mRNA for Tp53inp2 is NOT translated into protein in the axon, it is crucial for the internalization of TrkA/NGF.

People have studied proteins whose function it is to bind RNA for years. They are called RBPs (RNA Binding Proteins), and our genome has 750 of them. 200 RBPs are associated with genetic disease. This work turns everything on its head. Here is an RNA whose function it is to bind a protein (TrkA).

How many more mRNAs have nonCoding (for protein) parts with other functions?

Unsuspected use of RNA #2. Circular RNAs had been missed for years (although known since 1976). The classic sequencing methods isolate only RNAs with characteristic tails (such as polyAdenine). Circular RNAs don’t have any. They are formed by back splicing of the 3′ end of exon N to the 5′ end of exon N. Fortunately this is only 1% as efficient as the normal way.

So what? Circular RNAs are crucial in the innate immune response to microbial invaders. Double stranded DNA belongs inside the nucleus. When some organism brings it into the cytoplasm, it binds to Protein Kinase R (PKR), activating it so it phosphorylates eukaryotic initiation factor 2 (eIF2), bringing protein synthesis to a screeching halt.

This means that the cell needs a mechanism to keep PKR quiet.  This is where circular RNAs come in   [ Cell vol. 177 pp. 797 – 799, 865 – 880 ’19 ].  If the nucleotides in the circle can reach across the circle and base pair with each other forming a duplex of any length, it will bind to PKR inhibiting it.  Most circular RNAs are expressed at only a handful of copies/cell, the cell containing just 10,000 of them.

The work found that overexpression of a single circular RNA able to form duplexes (dsRNA) inhibits PKR. Overexpression of linear RNA of the same sequence does not, nor does overexpression of circular RNA which can’t form dsRNA.

So when an invader with dsDNA or dsRNA gets into the cell, RNase L, a cytoplasmic endonuclease, is activated, cleaving circular RNA and uninhibiting PKR.

So it’s back to the drawing board for mRNA and those parts (introns, 3’UTRs) we didn’t think were doing anything.  Perhaps that’s why there are so many of them, and why they take up more room in mRNA and genes than the ones coding for amino acids.  Also it’s time to look at RNAs as protein binders and modifiers, rather than the other way around as we have been doing.

Here’s a link to an earlier member of the series: https://luysii.wordpress.com/2019/04/15/forgotten-but-not-gone-take-ii/

Duchenne muscular dystrophy — a novel genetic treatment

Could the innumerable genetic defects underlying Duchenne muscular dystrophy all be treated the same way?  Possibly.  Paradoxically, the treatment involves actually making the gene  even worse.

Understanding how and why this might work involves a very deep dive into molecular biology.  You might start by looking at the series of five background articles I wrote — start at https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and follow the links.

I have a personal interest in Duchenne muscular dystrophy because I ran a muscular dystrophy clinic from ’72 to ’87, watching young boys and adolescents die from it. The major advance during that time was NOT medical or anything I did, but lighter braces, so the boys could stay ambulatory longer. Things have improved since: survival has increased by a decade, so they now die in their late 20s.

So let’s start. Duchenne muscular dystrophy is caused by a mutation in the gene coding for dystrophin, a large (3,685 amino acids) protein which ties the contractile apparatus of the muscle cell (actin and myosin) to the cell membrane. It isn’t the largest protein we have (titin, another muscle protein with 34,350 amino acids, is), but the gene for dystrophin is the largest we have, weighing in at 2,220,233 nucleotides. This is why Duchenne is one of the most common diseases due to a defect in a single gene: the gene is so large that lots of things can (and do) go wrong with it.

The gene comes in 79 pieces (exons) which account for under 1/200 of the nucleotides of the gene. The rest must be spliced out and discarded. Have a look at http://www.dmd.nl to see what can go wrong: the commonest is deletion of parts of the gene (60 – 70% of cases), followed by duplication of other parts (10% of cases), with the rest being mutations that change one amino acid to another.

Duchenne isn’t like cystic fibrosis where some 600 different mutations in the causative CFTR gene were known by 2003 but with 90% of cases due to just one.  So any genetic treatment for that young boy sitting in front of you had better be personalized to his particular mutation.

Or should it?

Possibly not. We’ll need to discuss 3 things first:

1. Nonsense Mediated Decay (NMD)

2. Nonsense Induced Transcriptional Compensation (NITC)

3. The MDX mouse model of Duchenne muscular dystrophy

Nonsense mediated decay. Nonsense is a poor term, because the 3 nonsense codons (out of 64 possible) tell the ribosome to stop translating mRNA into protein and drop off the mRNA. That isn’t nonsense. I prefer stop codon, or termination codon.

An incredibly clever piece of business tells the ribosome (which is after all an inanimate object) when a stop codon occurs too early in the mRNA, with a bunch of codons still to come that are needed to make up the whole protein.

Let’s go back to dystrophin and its 79 exons, and the fact that 99.5% of the gene is made of introns which are spliced out. Remember the mRNA starts at the 5′ end and ends at the 3′ end. The ribosome reads and translates it from 5′ to 3′. When an intron is spliced out, a complex of several proteins is placed on the mRNA some 20 – 24 nucleotides 5′ to the splice site (this happens in the nucleus way before the mRNA gets near a ribosome in the cytoplasm). The complex is called the Exon Junction Complex (EJC). The ribosome then happily munches along the mRNA from 5′ to 3′ knocking off the EJCs as it moves, until it hits a termination codon and drops off.

Over 95% of genes do not have introns after the termination codon. What happens if a termination codon does have an intron after it? Well then it is called a premature termination codon (PTC) and there is usually an EJC 3′ (downstream) to it. If a termination codon is present at least 50 – 55 nucleotides 5′ (upstream) to an EJC then NMD occurs.

Whenever any termination codon is reached, release protein factors (eRF1, eRF3, SMG1) bind to the mRNA. If there is an EJC around (which there shouldn’t be) the interaction between the two complexes triggers phosphorylation of one of the EJC proteins, triggering NMD.

So that’s how NMD happens, when there is a PTC.  Clever no?
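The rule can be turned into a toy Python sketch. Everything here is a simplification made up for illustration: positions are counted along the spliced mRNA, the EJC is placed exactly 24 nucleotides 5′ of each exon junction (the top of the 20 – 24 range), a gap of 50 nucleotides is used as the cutoff (the bottom of the 50 – 55 range), and the names ejc_positions and triggers_nmd are invented.

```python
EJC_OFFSET = 24   # assumed: EJC sits 24 nt 5' of each exon-exon junction
MIN_GAP = 50      # assumed: a stop codon this far (or more) 5' of an EJC triggers NMD

def ejc_positions(exon_lengths):
    """Positions of the EJCs along the spliced mRNA, one per exon-exon junction."""
    positions = []
    junction = 0
    for length in exon_lengths[:-1]:   # there is no junction after the last exon
        junction += length
        positions.append(junction - EJC_OFFSET)
    return positions

def triggers_nmd(stop_codon_start, exon_lengths):
    """True if some EJC lies at least MIN_GAP nt 3' of the end of the stop codon."""
    stop_end = stop_codon_start + 3
    return any(ejc - stop_end >= MIN_GAP for ejc in ejc_positions(exon_lengths))

exons = [300, 300, 300]            # a made-up three exon message
print(triggers_nmd(800, exons))    # normal stop codon in the last exon
print(triggers_nmd(100, exons))    # premature stop codon in the first exon
print(triggers_nmd(560, exons))    # stop codon too close to the last EJC to count
```

In this made-up message only the stop codon at position 100 leaves an EJC far enough downstream for the ribosome to miss, so only it would be flagged for decay.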

Nonsense Induced Transcriptional Compensation (NITC).  I realize that this is a lot to throw at you, but a treatment for Duchenne is worth the effort (not to mention other genetic diseases in which the mechanism to be described also applies).

NITC is something I never heard about until two papers appeared in the 13 April Nature (vol. 568 pp. 179 – 180 (editorial), 193 – 197, 259 – 263). Ever since we could knock out a gene by placing a PTC early in it (near the 5′ end) we’ve been surprised by some of the results, e.g. knocking out some genes thought to be crucial had little or no effect. Other technologies which didn’t affect the gene, but which decreased the expression of the mRNA (such as RNA interference, aka Post-Transcriptional gene silencing, PTGS) did have big phenotypic effects.

This turns out to be due to NITC, which in turn is due to increased transcription of genes which are ancestrally related to the mutant gene. Hard to believe.

Time to go back to NMD.  It doesn’t break mRNA down nucleotide by nucleotide, but fragments it.  These fragments get into the nucleus, and bind to complementary genomic sequences of the gene containing the PTC, and also to genes ancestrally related to the mutant gene (so they’ll have similar nucleotide sequences). Then epigenetics takes over because the fragments recruit the COMPASS complex which catalyzes the formation of H3K4Me3 which is part of the histone code which helps turn on transcription of the gene.  The sequence similarity of ancestrally related genes, allows them and only them to be turned on by NITC.  Even cleverer than finding a PTC by the ribosome.

Something so incredible needs evidence. Well, heterozygous zebrafish can be made to have one normal gene and one with a PTC. What do you think happens? The normal gene is upregulated (i.e. more is made). Pretty good.

Finally the mdx mouse. I’ve been reading about it for years. It has a PTC in exon 23 of the dystrophin gene, resulting in a protein only 27% as long as it should be. All sorts of therapeutic maneuvers have been tried on it. Now any drug development chemist will tell you that animal models are lousy, but they’re all we’ve got.

The remarkable thing about mdx mice is that they don’t get weak. They do have muscle pathology. All the verbiage above probably explains why.

So to treat ALL forms of Duchenne, put a premature termination codon (PTC) in exon #23 of the human gene. It should work as there are 4 dystrophin related proteins scattered around the genome: utrophin, dystrophin related protein 2 (DRP2), alpha dystrobrevin, and beta dystrobrevin.

There is an even better way to look for a place to put a PTC in the dystrophin gene.  Our genomes are filled with errors — for details see — https://luysii.wordpress.com/2018/05/01/how-badly-are-thy-genomes-oh-humanity-take-ii/.

There are lots of very normal people around with supposedly lethal mutations (including PTCs) in their genomes. Probably scattered about various labs are at least 1,000,000 exome sequences in presumably normal people. I’m not sure how much clinical information about them is available (other than that they are normal). Hopefully their sex is. Look at the dystrophin gene of normal males (females can be perfectly healthy carrying a mutant dystrophin gene as it is found on the X chromosome and they have 2) and see if PTCs are to be found. You can’t have a better animal model than that.

At over 1,000 words this is the longest post I’ve written, and hopefully the most useful.

Man’s best friend

I usually pay little attention to animal models of neurologic disease. After all, our brain is what separates us from animals (recent human behavior excepted). Neuromuscular disease is different because our peripheral nerves and muscles work the same way as animals. An astounding paper from Harvard and Brazil, gives us an entirely new angle to treat muscular dystrophy, particularly the Duchenne form. I ran a muscular dystrophy clinic for 15 years in the 70s and 80s and haplessly watched young boys deteriorate and die from Duchenne. The major therapeutic advance during that time was — hold your breath — lighter weight braces, allowing the boys to stay out of wheelchairs a bit longer.

Some background for those who don’t know: the molecular defect in Duchenne was found in ’87. Interestingly Kunkel, one of the authors on the original paper [ Cell vol. 51 pp. 919 – 928 ’87 ] is an author on the present one [ Cell vol. 163 pp. 1204 – 1213 ’15 ]. Duchenne dystrophy affects only males, as the gene for the protein (dystrophin) is found on the X chromosome, so women with a normal X and a mutant X escape. To show how pathetic things were back then, we tried to find out if a sister of a patient was a carrier. How did we do it? By measuring an enzyme released by damaged muscle (CPK) on several occasions. Carriers often showed an elevation.

The mutated protein is called dystrophin. It hooks the contractile apparatus of a muscle cell to the membrane. Failure of this makes muscle cells more fragile when they contract, resulting in eventual loss. From a molecular biological point of view the protein is fascinating. The gene is one of the largest known, stretching over 2,220,233 positions (nucleotides) on the X chromosome and containing 79 exons. Figuring a transcription rate of 100 nucleotides a second, it takes 6 hours to make the messenger RNA (mRNA) for it. The protein has 3,685 amino acids and figuring a translation rate of 3 – 6 amino acids/second it takes 10 to 20 minutes for the ribosome to make it. Given that it takes only 3 nucleotides to code for an amino acid, the protein coding part of the gene takes up only 0.5% of the gene. Correctly splicing out the introns is a huge task, which we all perform well. The size and complexity of the gene explain why mutations are so common, making Duchenne the most common form of hereditary muscular dystrophy (most are).
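These back of the envelope figures are easy to reproduce. A minimal Python sketch using only the numbers quoted above (the variable names are mine):

```python
GENE_LENGTH = 2_220_233        # nucleotides to transcribe
PROTEIN_LENGTH = 3685          # amino acids to translate
TRANSCRIPTION_RATE = 100       # nucleotides per second
TRANSLATION_RATES = (3, 6)     # amino acids per second, slow and fast ends of the range

transcription_hours = GENE_LENGTH / TRANSCRIPTION_RATE / 3600
translation_minutes = [PROTEIN_LENGTH / rate / 60 for rate in TRANSLATION_RATES]
coding_percent = 100 * PROTEIN_LENGTH * 3 / GENE_LENGTH

print(f"transcription: {transcription_hours:.1f} hours")
print(f"translation: {translation_minutes[1]:.0f} to {translation_minutes[0]:.0f} minutes")
print(f"protein coding part: {coding_percent:.2f}% of the gene")
```

The 10 minute figure corresponds to the fast end of the translation range (6 amino acids/second); at 3 amino acids/second it is about twice that.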

There are currently all sorts of efforts underway to correct the mutation, particularly in a milder form called Becker dystrophy. Derek has covered them and they constitute a logical direct attack on the pathology.

What is so remarkable about the current Cell paper is that it gives us an entirely new and different way to attack Duchenne (and possibly all forms of muscular dystrophy). It involves a colony of dogs in Brazil. They have GRMD (Golden Retriever Muscular Dystrophy) with a mutation in one of the many splice sites in dystrophin (it has 79 exons in man) leading to a premature stop codon and no functional dystrophin in the dogs’ muscles. The animals weaken and become non ambulatory with a shortened lifespan. However, a few of the dogs in the colony seemed pretty normal. So they went to work. The obvious explanation was that the gene was in some way repaired so the animals had normal amounts of dystrophin. Not so: even though ambulatory, the animals’ muscles had no dystrophin. So the whole genome was sequenced. What they found was that a mutation upstream of the gene for a protein called Jagged1 led to increased transcription of the gene and increased levels of the protein.

Jagged1 is a protein ligand for the Notch system of receptors. The Notch system is important in muscle regeneration. The myoblasts of the animals had more proliferative capacity. The Notch system is far too complicated to go into here — https://en.wikipedia.org/wiki/Notch_signaling_pathway, but expect to see a lot more research money pumped into it.

What I find so fabulous about this paper, is that it gives us an entirely new way of thinking about Duchenne, totally unrelated to the genetic defect, which had been our focus up to now. It also rubs our noses in how little we understand about our molecular biology and cell physiology. If we really understood things, we’d have been focused on Notch years ago. Yet another reason drug discovery is so hard. We are trying to alter a system we only dimly understand.

How little we know

Who would have thought that a random mutagenesis experiment, throwing Ethyl Nitroso Urea (ENU) at unsuspecting mice to identify novel immune regulatory genes, would point to a possible treatment for muscular dystrophy? When the experimenters looked at the mutated offspring, they found that the muscles appeared unusually red.

What happened?

You need to know a bit more about muscles. On a very simplistic level there are only two types of muscle fibers, red and white. Carnivores eating chicken know about dark meat and white meat. The dark meat is composed of red fibers, which have that appearance because of large numbers of mitochondria (which are full of iron) giving them the same red appearance as blood (which is also full of iron). In both cases the iron is bound by porphyrin rings. As one might expect, these muscles consume a lot of energy, being postural for the most part. The white meat made of white fibers has muscle which can contract very quickly and strongly, for flight and fight. They don’t have nearly the endurance of red muscle, because they can’t produce energy for the long term.

Humans have the two types of muscle fibers mixed up in each of our muscles.

The ENU had produced a mutation in something called fnip1 (Folliculin INteracting Protein 1). What’s folliculin? It prevents a gene transcription factor (TFE3) from getting into the nucleus. Folliculin prevents an embryonic stem cell from differentiating. It is mutated in the Birt Hogg Dube syndrome which is characterized by many benign hair follicle tumors. What in the world does this have to do with muscular dystrophy? It’s not something someone would start investigating looking for a cure is it? Knock out both copies of folliculin and the embryo dies in utero.

It gets deeper.

What does Fnip1 do to folliculin? It and its cousin fnip2 form complexes with folliculin. The complex binds an enzyme called AMPK (which is turned on by energy depletion in the cell). AMPK phosphorylates both fnip1 and folliculin. Folliculin binds and inhibits AMPK.

So animals lacking fnip1 have a more activated AMPK. So what? Well AMPK activates a transcriptional coactivator called PGC1alpha (you don’t want to know what the acronym stands for). This ultimately results in production of more mitochondria (recall that AMPK is an energy sensor, and one of the main functions of mitochondria is to produce energy, lots of it).

This ultimately means more red muscle fibers. There is a mouse model of Duchenne dystrophy called the mdx mouse (which has a premature termination codon in the dystrophin gene, resulting in a protein only 27% as long as it should be). That still leaves a lot, as normal dystrophin contains 3,685 amino acids. Knocking out fnip1 in the mdx mice improved muscle function. Impressive!!

I’m quite interested in this sort of work, as I ran a muscular dystrophy clinic from ’72 to ’87 and watched a lot of kids die. The major advance during that time wasn’t anything medical. It came from engineering — lighter braces using newer materials allowed the kids to stay out of wheelchairs longer.

You can read all about it in [ Proc. Natl. Acad. Sci. vol. 112 pp. 424 – 429 ’15 ]. Clearly we know a lot (AMPK, dystrophin, PGC1alpha, fnip1, fnip2, folliculin, TFE3), but what we didn’t know was how in the world they function together in the cell. We’re sure to learn a lot more, but this whole affair was uncovered when looking for something else (immune regulators) using the bluntest instrument possible (throw a mutagen at an animal and see what happens). No one applying for a muscular dystrophy grant would dare to offer the original work as a rationale, yet here we are.

So directed research isn’t always the way to go. Although we know a lot, we still know very little.

I sincerely hope it works, but I’m very doubtful

A fascinating series of papers offers hope (in the form of a small molecule) for the truly horrible Werdnig Hoffmann disease, which basically kills infants by destroying neurons in their spinal cord. For why this is especially poignant for me, see the end of the post.

First some background:

Our genes occur in pieces. Dystrophin is the protein mutated in the commonest form of muscular dystrophy. The gene for it is 2,220,233 nucleotides long but dystrophin contains ‘only’ 3,685 amino acids, not the 740,000+ amino acids the gene could specify. What happens? The whole gene is transcribed into an RNA of this enormous length, then 78 distinct segments of RNA (called introns) are removed by a gigantic multimegadalton machine called the spliceosome, and the 79 segments actually coding for amino acids (these are the exons) are linked together and the RNA sent on its way.

All this was unknown in the 70s and early 80s when I was running a muscular dystrophy clinic and taking care of these kids. Looking back, it’s miraculous that more of us don’t have muscular dystrophy; there is so much that can go wrong with a gene this size, let alone transcribing and correctly splicing it to produce a functional protein.

One final complication — alternate splicing. The spliceosome removes introns and splices the exons together. But sometimes exons are skipped or one of several exons is used at a particular point in a protein. So one gene can make more than one protein. The record holder is something called the Dscam gene in the fruitfly which can make over 38,000 different proteins by alternate splicing.
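Where does a number like 38,000 come from? Alternate exon choices multiply. A short Python sketch, with the caveat that the per-cluster counts below are the ones usually quoted in the Dscam literature and are not taken from this post:

```python
from math import prod

# Number of mutually exclusive alternatives in each variable exon cluster of Dscam.
# Assumption: counts as commonly cited in the literature, not given in this post.
DSCAM_ALTERNATIVE_EXONS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# One alternative is picked from each cluster, so the choices multiply.
isoforms = prod(DSCAM_ALTERNATIVE_EXONS.values())
print(isoforms)
```

Four clusters with a modest number of choices each are enough to get past 38,000 possible proteins from a single gene.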

There is nothing worse than watching an infant waste away and die. That’s what Werdnig Hoffmann disease is like, and I saw one or two cases during my years at the clinic. It is also called infantile spinal muscular atrophy. We all have two genes for the same crucial protein (called unimaginatively SMN). Kids who have the disease have mutations in one of the two genes (called SMN1). Why isn’t the other gene protective? It codes for the same sequence of amino acids (but using different synonymous codons). What goes wrong?

[ Proc. Natl. Acad. Sci. vol. 97 pp. 9618 – 9623 ’00 ] Why is SMN2 (the centromeric copy, i.e. the copy closest to the middle of the chromosome, which is normal in most patients) not protective? It has a single translationally silent nucleotide difference from SMN1 in exon 7 (i.e. the difference doesn’t change the amino acid coded for). This disrupts an exonic splicing enhancer and causes exon 7 skipping, leading to abundant production of a shorter isoform (SMN2delta7). Thus even though both genes code for the same protein, only SMN1 actually makes the full protein.

Intellectually fascinating but ghastly to watch.

This brings us to the current papers [ Science vol. 345 pp. 624 – 625, 688 – 693 ’14 ].

More background. The molecular machine which removes the introns is called the spliceosome. It’s huge, containing 5 RNAs (called small nuclear RNAs, aka snRNAs), along with 50 or so proteins, with a total molecular mass of around 2,500,000 Daltons. Think about it chemists. Design 50 proteins and 5 RNAs with probably 200,000+ atoms so they all come together forming a machine to operate on other monster molecules, such as the mRNA for dystrophin alluded to earlier. Hard for me to believe this arose by chance, but current opinion has it that way.

Splicing out introns is a tricky process which is still being worked on. Mistakes are easy to make, and different tissues will splice the same pre-mRNA in different ways. All this happens in the nucleus before the mRNA is shipped outside where the ribosome can get at it.

The papers describe a small molecule which acts on the spliceosome to increase the inclusion of SMN2 exon 7. It does appear to work in patient cells and mouse models of the disease, even reversing weakness.

Why am I skeptical? Because the mRNA for just about every protein we make is spliced (except histones), and any molecule altering the splicing machinery seems almost certain to produce effects on many genes, not just SMN2. If it really works, these guys should get a Nobel.

Why does the paper grip me so? I watched the beautiful infant daughter of a cop and a nurse die of it 30 – 40 years ago. Even with all the degrees, all the training I was no better for the baby than my immigrant grandmother dispensing emotional chicken soup from her dry goods store (she only had a 4th grade education). Fortunately, the couple took the 25% risk of another child with WH and produced a healthy infant a few years later.

A second reason: a beautiful baby granddaughter came into our world 24 hours ago.

Poets and religious types may intuit how miraculous our existence is, but the study of molecular biology proves it (to me at least).