Category Archives: Molecular Biology

Life at 250 atmospheres pressure (1.8 tons/square inch)

Tube worms (actually a form of mollusc) live in the depths of the ocean floor where there is almost no light, and very little oxygen. Just as plants use light energy to remove electrons from water to form oxygen and fix carbon, passing the stolen electrons back to oxygen by taking them through intermediary metabolism, symbiotic bacteria living in the worms remove electrons from hydrogen sulfide (H2S) formed by the hydrothermal vents on the seafloor. How did the tube worms get this far down? By riding decaying wood down there. [Proc. Natl. Acad. Sci. vol. 114 pp. E3652 – E3658 ’17 ] This is the wooden-steps hypothesis [Distel DL, et al. (2000) Nature 403:725–726] which states that the large chemosynthetic mussels (ship worms) found at deep-sea hydrothermal vents descend from much smaller species associated with sunken wood and other organic deposits, and that the endosymbionts of these progenitors made use of hydrogen sulfide from biogenic sources (e.g., decaying wood) rather than from vent fluids.

At 2500 meters down the water pressure is roughly 3750 pounds per square inch. One can only imagine the changes in the amino acid sequences of their proteins required so they aren’t denatured or aggregated by such pressure.
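For anyone who wants to check the pressure figures, here is a minimal sketch of the hydrostatic calculation. The seawater density and the unit conversions are textbook approximations assumed for illustration, not numbers from the papers cited. It lands near 250 atmospheres and a little under the round figure of 3750 psi, which corresponds to taking 15 psi per atmosphere.

```python
# Rough hydrostatic pressure at depth. All constants are assumed
# textbook approximations, not values taken from the cited papers.
SEAWATER_DENSITY = 1025.0   # kg/m^3, assumed average for seawater
GRAVITY = 9.81              # m/s^2
PA_PER_ATM = 101_325.0      # pascals per standard atmosphere
PA_PER_PSI = 6_894.757      # pascals per pound per square inch
LB_PER_SHORT_TON = 2000.0   # pounds per US (short) ton


def pressure_at_depth(depth_m):
    """Return (atmospheres, psi, short tons per square inch) at depth_m.

    Includes the 1 atmosphere of air pressing on the sea surface.
    """
    pascals = SEAWATER_DENSITY * GRAVITY * depth_m + PA_PER_ATM
    psi = pascals / PA_PER_PSI
    return pascals / PA_PER_ATM, psi, psi / LB_PER_SHORT_TON


atm, psi, tons = pressure_at_depth(2500)
print(f"{atm:.0f} atm, {psi:.0f} psi, {tons:.2f} tons/sq in")
```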

The idea that life on planetary moons with subsurface oceans (Ganymede, Europa, Titan, Enceladus) could exist is no longer as fantastic as it initially seemed.

If it is found, the implications for our conception of our place in the natural world are enormous.

Why wasn’t this mentioned in Genesis or any known creation myth? Assume for the moment that there actually is a creator who made itself known to our ancestors. If it tried to give Abraham, Gautama Buddha, Mohammed, etc. knowledge of these things, it wouldn’t have been believed. Planets? Planets with moons? Please. A few miracles here and there would be all that would be needed.

Remember entropy?

Organic chemists have a far better intuitive feel for entropy than most chemists. Condensations such as the Diels Alder reaction decrease it, as does ring closure. However, when you get to small ligands binding proteins, everything seems to be about enthalpy. Although binding energy is always talked about, mentally it appears to be enthalpy (H) rather than Gibbs free energy (F).

A recent fascinating editorial and paper [ Proc. Natl. Acad. Sci. vol. 114 pp. 4278 – 4280, 4424 – 4429 ’17 ] shows how evolution has used entropy to determine when a protein (CzrA) binds to DNA and when it doesn’t. As usual, advances in technology permit us to see this (e.g. multidimensional heteronuclear nuclear magnetic resonance). This allows us to determine the motion of side chains (methyl groups), backbones etc. etc. When CzrA binds to DNA, methyl side chains on the protein move more, increasing entropy (deltaS), and as we all know the Gibbs free energy of reaction (deltaF) isn’t just enthalpy (deltaH) but deltaH – TdeltaS, so an increase in deltaS pushes deltaF lower, meaning the reaction proceeds in that direction.

Binding of zinc redistributes these side chain motions so that entropy decreases, and the protein moves off DNA. The authors call this dynamics driven allostery. The fascinating thing is that this may happen without any conformational change of CzrA.
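The sign argument can be made concrete with a small sketch of deltaF = deltaH – TdeltaS. Every number below is invented for illustration (none comes from the CzrA paper). The only point is that with enthalpy held fixed, flipping the sign of the entropy change can flip the sign of the free energy change, and so the direction of binding.

```python
def delta_f(delta_h, temperature_k, delta_s):
    """Free energy change deltaF = deltaH - T * deltaS.

    Units: kJ/mol for delta_h, kelvin for temperature_k,
    kJ/(mol*K) for delta_s.
    """
    return delta_h - temperature_k * delta_s


T = 298.0         # kelvin, assumed room temperature
DELTA_H = -20.0   # kJ/mol, hypothetical and deliberately held fixed

# Hypothetical entropy changes on DNA binding, in kJ/(mol*K).
ds_without_zinc = 0.05    # side chains move more on binding: entropy rises
ds_with_zinc = -0.10      # zinc redistributes the motion: entropy falls

df_without_zinc = delta_f(DELTA_H, T, ds_without_zinc)
df_with_zinc = delta_f(DELTA_H, T, ds_with_zinc)

# Negative deltaF favors binding; positive deltaF favors release.
print(f"without zinc: {df_without_zinc:+.1f} kJ/mol")
print(f"with zinc:    {df_with_zinc:+.1f} kJ/mol")
```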

I’m not sure that molecular dynamics simulations are good enough to pick this up. Fortunately newer NMR techniques can measure it. Just another complication for the hapless drug chemist thinking about protein ligand interactions.

The incredible combinatorial complexity of cellular biochemistry

K8, K14, K20, T92, P125, S129, S137, Y176, T195, K276, T305, T308, T312, P313, T315, T326, S378, T450, S473, S477, S479. No, this is not some game of cosmic bingo. They represent amino acid positions in Protein Kinase B (AKT).

In the 1 letter amino acid code K is lysine, T threonine, S serine, P proline, Y tyrosine.

All 21 amino acids are modified (or not), one of them in 3 ways. This gives 4 * 2^20 = 4,194,304 possible combinations of post-translational modifications. Will we study all of them? It’s pretty easy to substitute alanine for serine or threonine making an unmodifiable position, or to substitute aspartic acid for threonine or serine making a phosphorylation mimic which is pretty close to phosphoserine or phosphothreonine, creating even more possibilities for study.
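The count is easy to reproduce. This sketch simply restates the assumption in the text: 20 of the listed sites are treated as two-state (modified or not) and one site as four-state (unmodified plus 3 kinds of modification).

```python
# The 21 modified positions in AKT listed above.
positions = ("K8 K14 K20 T92 P125 S129 S137 Y176 T195 K276 T305 T308 T312 "
             "P313 T315 T326 S378 T450 S473 S477 S479").split()

# Assumption from the text: one site has 4 states, the rest have 2 each.
two_state_sites = len(positions) - 1
combinations = 4 * 2 ** two_state_sites

print(len(positions), combinations)
```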

Most of the serines, threonines, tyrosines listed are phosphorylated, but two of the threonines are N-acetyl glucosylated. The two prolines are hydroxylated in the ring. The lysines can be methylated, acetylated, ubiquitinated, sumoylated. I did take the trouble to count the number of serines in the complete amino acid sequence and there are 24, of which only 6 are phosphorylated — so the phosphorylation pattern is likely to be specific and selected for. Too lazy to do the same for lysine, threonine, tyrosine and proline. Here’s a link to the full sequence if you want to do it — http://www.uniprot.org/uniprot/P31749
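Counting by hand is tedious, so here is a small sketch that tallies the listed sites by residue and offers a helper for counting residues in any sequence you paste in. The helper name `count_residues` and the toy sequence are made up for illustration; the full AKT sequence is not reproduced here.

```python
from collections import Counter

# The 21 modified positions listed earlier in the post.
sites = ("K8 K14 K20 T92 P125 S129 S137 Y176 T195 K276 T305 T308 T312 "
         "P313 T315 T326 S378 T450 S473 S477 S479").split()

# Tally modified sites by their one-letter residue code.
modified_by_residue = Counter(site[0] for site in sites)
print(modified_by_residue)


def count_residues(sequence):
    """Count each one-letter amino acid code in a protein sequence string."""
    return Counter(sequence.upper())


# Hypothetical toy sequence, NOT the AKT sequence; paste the real one
# from the UniProt entry to repeat the serine count for other residues.
toy_counts = count_residues("MKTSSKYT")
print(toy_counts["S"], toy_counts["K"])
```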

The phosphorylations at each serine/threonine/tyrosine are carried out by not more than one of the following 8 kinases (CK2, IKKepsilon, ACK1, TBK1, PDK1, GSK3alpha, mTORC2 and CDK2).

AKT contains some 481 amino acids, divided (by humans for the purposes of comprehension) into 4 regions: Pleckstrin Homology (#1 – #108), linker (#108 – #152), catalytic, i.e. kinase (#152 – #409), and regulatory (#409 – #481).

This is from an excellent review of the functions of AKT in Cell vol. 169 pp. 381 – 405 ’17. All of the above takes up only the first two pages of the review, before the functionality of AKT is even discussed.

This raises the larger issue of the possibility of human minds comprehending cellular biochemistry.

This is just one protein, although a very important one. Do you think we’ll ever be able to conduct enough experiments to figure out what each modification (alone or in combination) does to the many functions of AKT (and there are many)?

Now design a drug to affect one of the actions of AKT (particularly since AKT is the cellular homolog of a viral oncogene). Quite a homework assignment.

Progress has been slow but not for want of trying

Progress in the sense of therapy for Alzheimer’s disease and Glioblastoma multiforme is essentially nonexistent, and we could use better therapy for Parkinsonism. This doesn’t mean that researchers have given up. Far from it. Three papers all in last week’s issue of PNAS came up with new understanding and possibly new therapeutic approaches for all three.

You’ll need some serious molecular biological and cell physiological chops to get through the following.

1. Glioblastoma multiforme — patients aren’t living much longer than they were when I started practice 45 years ago (about 2 years — although of course there are exceptions).

The human ZBTB family of genes consists of 49 members coding for transcription factors. BCL6 is also known as ZBTB27 and is a master regulator of lymph node germinal responses. To execute its transcriptional activity, BCL6 requires homodimerization and formation of a complex with a variety of cofactors including BCL6 corepressor (BCoR), nuclear receptor corepressor 1 (NCoR) and Silencing Mediator of Retinoic acid and Thyroid hormone receptor (SMRT). BCL6 inhibitors block the interaction between BCL6 and its friends, selectively killing BCL6 addicted cancer cells.

The present paper [ Proc. Natl. Acad. Sci. vol. 114 pp. 3981 – 3986 ’17 ] shows that BCL6 is required for glioblastoma cell viability. One transcriptional target of BCL6 is AXL, a tyrosine kinase. Depletion of AXL also decreases proliferation of glioblastoma cells in vitro and in vivo (in a mouse model of course).

So here are two new lines of attack on a very bad disease.

2. Alzheimer’s disease — the best we can do is slow it down, certainly not improve mental function and not keep mental function from getting worse. ErbB2 is a member of the Epidermal Growth Factor Receptor (EGFR) family. It is tightly associated with neuritic plaques in Alzheimer’s. Ras GTPase activation mediates EGF induced stimulation of gamma secretase to increase the nuclear function of the amyloid precursor protein (APP) intracellular domain (AICD). ErbB2 suppresses the autophagic destruction of AICD, physically dissociating Beclin1 from the VPS34/VPS15 complex independently of its kinase activity.

So the following paper [ Proc. Natl. Acad. Sci. vol. 114 pp. E3129 – E3138 ’17 ] used a compound downregulating ErbB2 function (CL-387,785) in mouse models of Alzheimer’s (which have notoriously NOT led to useful therapy). Levels of AICD declined along with beta amyloid, and the animals appeared smarter (but how smart can a mouse be?).

3. Parkinson’s disease — here we really thought we had a cure back in 1972 when L-DOPA was first released for use in the USA. Some patients looked so good that it was impossible to tell if they had the disease. Unfortunately, the basic problem (death of dopaminergic neurons) continued despite L-DOPA pills supplying what they no longer could.

Nurr1 is a protein which causes the development of dopamine neurons in the embryo. Expression of Nurr1 continues throughout life. Nurr1 appears to be a constitutively active nuclear hormone receptor. Why? Because the place where ligands (such as thyroid hormone, steroid hormones) bind to the protein is closed. A few mutations in the Nurr1 gene have been associated with familial parkinsonism.

Nurr1 functions by forming a heterodimer with the Retinoid X Receptor alpha (RXRalpha), another nuclear hormone receptor, but one which does have an open binding pocket. A compound called BRF110 was shown by the following paper [ Proc. Natl. Acad. Sci. vol. 114 pp. 3795 – 3797, 3999 – 4004 ’17 ] to bind to the ligand pocket of RXRalpha, increasing its activity. The net effect is to enhance expression of dopamine neuron specific genes.

More to the point, MPP+ is a toxin pretty selective for dopamine neurons (it kills them). BRF110 helps survival against MPP+ (but only if given before toxin administration). This wouldn’t be so bad because something is causing dopamine neurons to die (perhaps it’s a toxin), so BRF110 may fight the decline in dopamine neuron numbers, rather than treating the symptoms of dopamine deficiency.

So there you have it: 3 possible new approaches to therapy for 3 bad diseases, all in one week’s issue of PNAS. Not easy reading, perhaps, but this is where therapy is going to come from (hopefully soon).

An obvious idea we’ve all missed

In 3+ decades as a clinical neurologist I saw several hundred unfortunate people with primary brain tumors. Not one of them was made of proliferating neurons. Not a single one. Most were tumors derived from glial cells (gliomas, glioblastomas, astrocytomas, oligodendrogliomas) which make up half the cells in the brain. Some came from the coverings of the brain (meningiomas), or the ventricular lining (ependymomas).

A recent paper in Nature (vol. 543 pp.681 – 686 ’17) decided that it might be worthwhile to figure out why some organs rarely if ever develop cancer (brain, heart, skeletal muscle). Obvious isn’t it? But no one did it until now.

Most of these tissues are terminally differentiated (unlike skin, lung, breast and gut) and don’t undergo cellular division. This means that they don’t have to copy their DNA over and over to replenish old and dying cells, and so they are much less likely to develop mutations.

They also use oxidative phosphorylation (a mitochondrial function) rather than glycolysis to generate energy. So they looked for genes that were upregulated in terminally differentiated muscle (not brain) cells relative to proliferating muscle cell precursors. Not a complicated idea to test once you think of it (but you and I didn’t). They found 5 such, and tested them for their ability to suppress tumor growth. One such (LACTB) decreased the growth rate of a variety of tumor cells in vitro and in vivo (e.g.– when transplanted into immunodeficient animals). Amazingly it seems to have no effect on normal cells.

To show how little we understand the goings on inside our cells, why don’t you try to guess what LACTB does, given your (and our) knowledge of cellular biochemistry and physiology.

LACTB changes mitochondrial lipid metabolism, by reducing the rate of decarboxylation of mitochondrial phosphatidyl serine — say what?

Even when you know what LACTB is doing you’d be hard pressed to figure out how this effect slows cancer cell growth (and possibly prevents it from occurring at all).

So given our knowledge we’d have never found LACTB and having found it we still don’t know how it works.

Why antioxidants may be bad for you

Antioxidants (vitamin E, beta carotene, vitamin C etc. etc. ) were very big a while ago. They were held to prevent all sorts of bad things (heart attack, stroke). However one pretty good study done years and years ago (see the bottom) showed that they increased the risk of lung cancer in 29,000 Finnish male smokers by 18%. People still take them however.

Now we are beginning to find out the good things that oxidation does for you. One oxidation product is 8-oxo-guanine (https://en.wikipedia.org/wiki/8-Oxoguanine), and it is estimated that it occurs 100,000 times a day in every cell in our body. This isn’t very often as we have .24 x 3,200,000,000 = 768,000,000 guanines in our genome.
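To put "not very often" in numbers, here is a minimal sketch using only the figures quoted above (a 3.2 billion nucleotide genome, 24% guanine, 100,000 oxidations per cell per day). It works out to roughly one guanine in several thousand being oxidized on any given day.

```python
GENOME_BASES = 3_200_000_000   # nucleotides, the figure used in the text
GUANINE_FRACTION = 0.24        # fraction of bases that are guanine, per the text
OXIDATIONS_PER_DAY = 100_000   # estimate quoted in the text, per cell

guanines = round(GENOME_BASES * GUANINE_FRACTION)
fraction_hit = OXIDATIONS_PER_DAY / guanines   # share of guanines oxidized per day
one_in = round(1 / fraction_hit)

print(f"{guanines:,} guanines; about 1 in {one_in:,} oxidized per day")
```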

One good thing 8-oxo-guanine may do for you is turn on gene transcription [ Proc. Natl. Acad. Sci. vol. 114 pp. 2788 – 2790, 2604 – 2609 ’17 ]. This occurs when the guanine occurs in an elegant DNA structure called a G-quartet (G quadruplex) — https://en.wikipedia.org/wiki/G-quadruplex. Oxidation recruits an enzyme to remove it (8-oxo-guanine glycosylase — aka OGG1 ) generating a DNA lesion — a sugar in the backbone without a base attached. This causes the binding of Apurinic/Apyrimidinic Endonuclease 1 (APE1) which recruits other things to repair the DNA.

As you know, DNA in our cells is compacted 100,000 fold to fit its 1 meter length into a nucleus .00001 meters in size. Compaction involves wrapping the helix around nucleosomes and then binding the nucleosomes together.

It’s pretty hard for RNA polymerase to even get to a gene to transcribe it into mRNA, and DNA lesions cause opening up of this compaction so repair enzymes can actually get to the double helix.

One such gene is Vascular Endothelial Growth Factor (VEGF), a gene induced by low oxygen (hypoxia). The promoter of VEGF has a potential G quadruplex sequence. If the authors put 8-oxo-guanine at 5 different positions in the G quartet, transcription of the VEGF gene was increased 2 – 3 times over the next few days. If OGG1 levels were decreased this didn’t happen — showing that guanine oxidation with the subsequent formation of a DNA lesion is required for increased transcription of VEGF.

Aside from being another mechanism for gene activation under oxidative stress, 8-oxo-guanine may actually be another epigenetic DNA modification, like 5 methyl cytosine.

So this may explain the result immediately below.

[ New England J. Med. vol. 330 pp. 1029 – 1035 ’94 ] The Alpha-Tocopherol, Beta-Carotene Trial (ATBC trial), a randomized, double blind, placebo controlled trial of daily supplementation with alpha-tocopherol (a form of vitamin E), beta carotene or both to see if it reduced the incidence of lung cancer, was done in 29,000 Finnish male smokers ages 50 – 69 (when most of the damage had been done). They received either alpha tocopherol 50 mg/day, beta carotene 20 mg/day or both. There was a high incidence of lung cancer (876/29000) during the 5 – 8 year period of followup. Alpha tocopherol didn’t decrease the incidence of lung cancer, and there was a higher incidence among the men receiving beta carotene (by 18%). Alpha tocopherol had no benefit on mortality (although there were more deaths from hemorrhagic stroke among the men receiving the supplement). Total mortality was 8% higher among the participants on beta carotene (more deaths from lung cancer and ischemic heart disease). It is unlikely that the dose was too low, since it was much higher than the estimated intake thought to be protective in the uncontrolled dietary studies. The trial organizers were so baffled by the results that they even wondered whether the beta-carotene pills used in the study had become contaminated with some known carcinogen during the manufacturing process. However, tests have ruled out that possibility.

Needless to say investigators in other beta carotene clinical trials (the Women’s Health Study, the Carotene and Retinoid Efficacy Trial) are upset. [ Science vol. 264 pp. 501 – 502 ’94 ] “In our heart of hearts, we don’t believe [ beta carotene is ] toxic” says one researcher. Touching isn’t it. Such faith in a secular age, particularly where other people’s lives are at stake. I love it when ecology, natural vitamins and pseudoscience take it in the ear.

Back to the drawing board on knockouts and knockdowns

Nothing could be simpler than the distinction between the initial product of genes that code for proteins (mRNA) and genes that don’t (long non-Coding RNAs — aka lncRNA, lincRNA). Not anymore according to an exceedingly clever and well thought out piece of work.

[ Cell vol. 168 pp. 753 – 755, 843 – 855 ’17 ] We know that ultraviolet light damages DNA primarily by forming pyrimidine dimers. Naturally transcription of DNA won’t be as accurate, so the cell has ways to shut it down. Ultraviolet exposure results in an unusual type of restriction of transcription along with slower elongation, with the result that only the promoter proximal 20 – 25 kiloBases of a protein coding gene are efficiently transcribed into mRNA.

In addition, after ultraviolet damage there is a global switch in pre-mRNA processing resulting in a preference for the production of transcripts containing alternative last exons not normally included in the dominant mRNA isoform. Some 84 genes are processed this way.

ASCC3 is the strongest regulator of transcription following UV damage, acting to repress it. It is a DEAD/DEAH box DNA helicase component. The ASCC3 protein interacts with RNA polymerase II (Pol II) and becomes highly ubiquitinated and phosphorylated on UV irradiation. It isn’t required to establish transcriptional repression, just to maintain it. Disruption of the UV specific form (the short isoform containing the alternative last exon) has the opposite effect, allowing transcriptional recovery after UV damage.

This explains why the human genes remaining expressed (or actually induced) after UV irradiation are invariably ‘very short’ (whatever that means).

The short and long isoforms constitute an autonomous regulatory module, and are related functionally, so the effect of deleting one can at least be partially compensated for by deleting the other.

The 3,100 nucleotide long ‘short’ isoform, codes for a protein, but the protein itself didn’t have the effect of the short form mRNA (see if you can figure out, without reading further how the authors proved this). The mRNA produced from the short isoform is found almost exclusively in the nucleus. The authors put in a stop codon immediately downstream of the start codon which ablated protein production but not transcription into the appropriate mRNA, but there was still rescue of the transcriptional recovery phenotype. So the functional form of the short RNA isoform is mediated by a nonCoding RNA encoded in the ASCC3 protein coding gene. The short ASCC3 isoform has an open reading frame of 333 nucleotides, but functionally it is a lncRNA (of 3.5 kiloBases).

So protein genes can produce functional lncRNAs. How many of them actually do this is unknown. When you knock down a gene, how much of the effect is due to less protein and how much is due to the (putative) lncRNA which also might be produced by the gene? That’s why it’s back to the drawing board for knockout mice (or even mRNA knockdown using shRNA etc. etc.)

The current definition of lncRNA is absence of protein coding potential in a gene.

Why have the same gene code for two different things — there may be a regulatory advantage — controlling the function of the protein. lncRNAs have the unique ability to act in close spatial proximity to their transcription loci.

Stay tuned. It’s just fascinating what we still don’t know.

The humble snow flea teaches us some protein chemistry

Who would have thought that the humble snow flea (that we used to cross country ski over in Montana) would teach us a great deal about protein chemistry turning over some beloved shibboleths in the process.

The flea contains an antifreeze protein, which stops ice crystals from forming inside the cells of the flea in the cold environment in which it lives. The protein contains 81 amino acids, is 45% glycine and contains six type II polyProline helices each 8 amino acids long (https://en.wikipedia.org/wiki/Polyproline_helix). None of the 6 polyProline helices contain proline despite the name, but all contain from 2 to 6 glycines. Also to be noted is (1) the absence of a hydrophobic core (2) the absence of alpha helices (3) the absence of beta turns (4) the protein has low sequence complexity.

Nonetheless it quickly folds into a stable structure — meaning that a hydrophobic core, alpha helices and beta turns, the things missing in (1), (2), and (3), are not necessary for a stable protein structure. (4) means that low sequence complexity in a protein sequence does not invariably imply an intrinsically disordered protein.

You can read all about it in Proc. Natl. Acad. Sci. vol. 114 pp. 2241 – 2246 ’17.

Time for some humility in what we thought we knew about proteins, protein folding, protein structural stability.

Memories are made of this?

Back in the day when information was fed into computers on punch cards, the data was the holes in the paper not the paper itself. A far out (but similar) theory of how memories are stored in the brain just got a lot more support [ Neuron vol. 93 pp. 6 -8, 132 – 146 ’17 ].

The theory says that memories are stored in the proteins and sugar polymers surrounding neurons rather than the neurons themselves. These go by the name of extracellular matrix, and memories are the holes drilled in it which allow synapses to form.

Here’s some stuff I wrote about the idea when I first ran across it two years ago.

——

An article in Science (vol. 343 pp. 670 – 675 ’14) on some fairly obscure neurophysiology at the end throws out (almost as an afterthought) an interesting idea of just how chemically and where memories are stored in the brain. I find the idea plausible and extremely surprising.

You won’t find the background material to understand everything that follows in this blog. Hopefully you already know some of it. The subject is simply too vast, but plug away. Here are a few theories from the past half century, seriously flawed in my opinion, of how and where memory is stored in the brain.

#1 Reverberating circuits. The early computers had memories made of something called delay lines (http://en.wikipedia.org/wiki/Delay_line_memory) where the same impulse would constantly ricochet around a circuit. The idea was used to explain memory as neuron #1 exciting neuron #2 which excited neuron . … which excited neuron #n which excited #1 again. Plausible in that the nerve impulse is basically electrical. Very implausible, because you can practically shut the whole brain down using general anesthesia without erasing memory. However, RAM memory in the computers of the 70s used the localized buildup of charge to store bits and bytes. Since charge would leak away from where it was stored, it had to be refreshed constantly –e.g. at least 12 times a second, or it would be lost. Yet another reason data should always be frequently backed up.

#2 CaMKII — more plausible. There’s lots of it in brain (2% of all proteins in an area of the brain called the hippocampus — an area known to be important in memory). It’s an enzyme which can add phosphate groups to other proteins. To first start doing so calcium levels inside the neuron must rise. The enzyme is complicated, being comprised of 12 identical subunits. Interestingly, CaMKII can add phosphates to itself (phosphorylate itself) — 2 or 3 for each of the 12 subunits. Once a few phosphates have been added, the enzyme no longer needs calcium to phosphorylate itself, so it becomes essentially a molecular switch existing in two states. One problem is that there are other enzymes which remove the phosphate, and reset the switch (actually there must be). Also proteins are inevitably broken down and new ones made, so it’s hard to see the switch persisting for a lifetime (or even a day).

#3 Synaptic membrane proteins. This is where electrical nerve impulses begin. Synapses contain lots of different proteins in their membranes. They can be chemically modified to make the neuron more or less likely to fire to a given stimulus. Recent work has shown that their number and composition can be changed by experience. The problem is that after a while the synaptic membrane has begun to resemble Grand Central Station — lots of proteins coming and going, but always a number present. It’s hard (for me) to see how memory can be maintained for long periods with such flux continually occurring.

This brings us to the Science paper. We know that about 80% of the neurons in the brain are excitatory — in that when excitatory neuron #1 talks to neuron #2, neuron #2 is more likely to fire an impulse. The remaining 20% are inhibitory. Obviously both are important. While there are lots of other neurotransmitters and neuromodulators in the brain (with probably even more we don’t know about — who would have put carbon monoxide on the list 20 years ago), the major inhibitory neurotransmitter of our brains is something called GABA. At least in adult brains this is true, but in the developing brain it’s excitatory.

So the authors of the paper worked on why this should be. GABA opens channels in the brain to the chloride ion. When it flows into a neuron, the neuron is less likely to fire (in the adult). This work shows that this effect depends on the negative ions (proteins mostly) inside the cell and outside the cell (the extracellular matrix). It’s the balance of the two sets of ions on either side of the largely impermeable neuronal membrane that determines whether GABA is excitatory or inhibitory (chloride flows in either event), and just how excitatory or inhibitory it is. The response is graded.

For the chemists: the negative ions outside the neurons are sulfated proteoglycans. These are much more stable than the proteins inside the neuron or on its membranes. Even better, it has been shown that the concentration of chloride varies locally throughout the neuron. The big negative ions (e.g. proteins) inside the neuron move about but slowly, and their concentration varies from point to point.

Here’s what the authors say (in passing) “the variance in extracellular sulfated proteoglycans composes a potential locus of analog information storage” — translation — that’s where memories might be hiding. Fascinating stuff. A lot of work needs to be done on how fast the extracellular matrix in the brain turns over, and what are the local variations in the concentration of its components, and whether sulfate is added or removed from them and if so by what and how quickly.

—-

So how does the new work support this idea? It involves a structure that I’ve never talked about — the lysosome (for more info see https://en.wikipedia.org/wiki/Lysosome). It’s basically a bag of at least 40 digestive and synthetic enzymes inside the cell, which chops anything brought to it (e.g. bacteria). Mutations in the enzymes cause all sorts of (fortunately rare) neurologic diseases — mucopolysaccharidoses, lipid storage diseases (Gaucher’s, Farber’s) the list goes on and on.

So I’ve always thought of the structure as a Pandora’s box best kept closed. I always thought of lysosomes as confined to the cell body, but they’re also found in dendrites according to this paper. Even more interesting, a rather unphysiologic treatment of neurons in culture (depolarization by high potassium) causes the lysosomes to migrate to the neuronal membrane and release their contents outside. One enzyme released is cathepsin B, a proteolytic enzyme which chops up the TIMP1 outside the cell. So what? TIMP1 is an endogenous inhibitor of Matrix MetalloProteinases (MMPs) which break down the extracellular matrix. So what?

Are neurons ever depolarized by natural events? Just by synaptic transmission, action potentials and spontaneously. So here we have a way that neuronal activity can cause holes in the extracellular matrix, the holes in the punch cards if you will.

Speculation? Of course. But that’s the fun of reading this stuff. As Mark Twain said “There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.”

Tidings of great joy

One of the hardest things I had to do as a doc was watch an infant girl waste away and die of infantile spinal muscular atrophy (Werdnig Hoffmann disease) over the course of a year. Something I never thought would happen (a useful treatment) may be at hand. The actual papers are not available yet, but two placebo controlled trials with a significant number of patients (84, 121) in each were stopped early because trial monitors (not in any way involved with the patients) found the treated group was doing much, much better than the placebo. A news report of the trials is available [ Science vol. 354 pp. 1359 – 1360 ’16 (16 December) ].

The drug, a modified RNA molecule, (details not given) binds to another RNA which codes for the missing protein. In what follows a heavy dose of molecular biology will be administered to the reader. Hang in there, this is incredibly rational therapy based on serious molecular biological knowledge. Although daunting, other therapies of this sort for other neurologic diseases (Huntington’s Chorea, FrontoTemporal Dementia) are currently under study.

If you want to start at ground zero, I’ve written a series https://luysii.wordpress.com/category/molecular-biology-survival-guide/ which should tell you enough to get started. Start here — https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/
and follow the links to the next two.

Here we go, if you don’t want to plow through all three.

Our genes occur in pieces. Dystrophin is the protein mutated in the commonest form of muscular dystrophy. The gene for it is 2,220,233 nucleotides long but the dystrophin contains ‘only’ 3685 amino acids, not the 740,000+ amino acids the gene could specify. What happens? The whole gene is transcribed into an RNA of this enormous length, then 78 distinct segments of RNA (called introns) are removed by a gigantic multimegadalton machine called the spliceosome, and the 79 segments actually coding for amino acids (these are the exons) are linked together and the RNA sent on its way.
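The arithmetic behind those numbers is simple enough to check. This is a rough sketch using only the figures quoted above; it ignores the stop codon and untranslated regions, so treat the coding fraction as approximate.

```python
GENE_LENGTH_NT = 2_220_233    # dystrophin gene length quoted above
PROTEIN_LENGTH_AA = 3685      # dystrophin protein length quoted above
EXONS = 79                    # exon count quoted above

# Three nucleotides per codon, so this is the most amino acids the
# gene could specify if every nucleotide were coding.
max_amino_acids = GENE_LENGTH_NT // 3

# Nucleotides actually needed to code the protein (stop codon ignored).
coding_nt = PROTEIN_LENGTH_AA * 3
coding_fraction = coding_nt / GENE_LENGTH_NT

# N exons in a row are separated by N - 1 introns.
introns = EXONS - 1

print(f"{max_amino_acids:,} possible amino acids, "
      f"{coding_fraction:.2%} of the gene codes for protein, "
      f"{introns} introns")
```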

All this was unknown in the 70s and early 80s when I was running a muscular dystrophy clinic and taking care of these kids. Looking back, it’s miraculous that more of us don’t have muscular dystrophy; there is so much that can go wrong with a gene this size, let alone transcribing and correctly splicing it to produce a functional protein.

One final complication — alternate splicing. The spliceosome removes introns and splices the exons together. But sometimes exons are skipped or one of several exons is used at a particular point in a protein. So one gene can make more than one protein. The record holder is something called the Dscam gene in the fruitfly which can make over 38,000 different proteins by alternate splicing.
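Where does a number like 38,000 come from? It is a product of independent choices. The exon cluster sizes below are the commonly quoted figures for fruitfly Dscam and are brought in here as background; they are not taken from anything cited in this post.

```python
from math import prod

# Commonly quoted numbers of mutually exclusive alternatives at each
# variable exon cluster of Drosophila Dscam (background figures,
# not from the papers cited in this post).
alternatives_per_cluster = {
    "exon 4": 12,
    "exon 6": 48,
    "exon 9": 33,
    "exon 17": 2,
}

# One alternative is chosen per cluster, so the choices multiply.
isoforms = prod(alternatives_per_cluster.values())
print(f"{isoforms:,} possible isoforms")
```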

There is nothing worse than watching an infant waste away and die. That’s what Werdnig Hoffmann disease is like, and I saw one or two cases during my years at the clinic. It is also called infantile spinal muscular atrophy. We all have two genes for the same crucial protein (called unimaginatively SMN). Kids who have the disease have mutations in one of the two genes (called SMN1). Why isn’t the other gene protective? It codes for the same sequence of amino acids (but using different synonymous codons). What goes wrong?

[ Proc. Natl. Acad. Sci. vol. 97 pp. 9618 – 9623 ’00 ] Why is SMN2 (the centromeric copy (e.g. the copy closest to the middle of the chromosome) which is normal in most patients) not protective? It has a single translationally silent nucleotide difference from SMN1 in exon 7 (e.g. the difference doesn’t change amino acid coded for). This disrupts an exonic splicing enhancer and causes exon 7 skipping leading to abundant production of a shorter isoform (SMN2delta7). Thus even though both genes code for the same protein, only SMN1 actually makes the full protein.

More background. The molecular machine which removes the introns is called the spliceosome. It’s huge, containing 5 RNAs (called small nuclear RNAs, aka snRNAs), along with 50 or so proteins with a total molecular mass of around 2,500,000 Daltons. Think about it chemists. Design 50 proteins and 5 RNAs with probably 200,000+ atoms so they all come together forming a machine to operate on other monster molecules — such as the mRNA for Dystrophin alluded to earlier. Hard for me to believe this arose by chance, but current opinion has it that way.

Splicing out introns is a tricky process which is still being worked on. Mistakes are easy to make, and different tissues will splice the same pre-mRNA in different ways. All this happens in the nucleus before the mRNA is shipped outside where the ribosome can get at it.

The papers [ Science vol. 345 pp. 624 – 625, 688 – 693 ’14 ] describe a small molecule which acts on the spliceosome to increase the inclusion of SMN2 exon 7. It does appear to work in patient cells and mouse models of the disease, even reversing weakness.

I was extremely skeptical when I read the papers two years ago. Why? Because just about every protein we make is spliced (except histones), and any molecule altering the splicing machinery seems almost certain to produce effects on many genes, not just SMN2. If it really works, these guys should get a Nobel.

Well, I shouldn’t have been so skeptical. I can’t say much more about the chemistry of the drug (nusinersen) until the papers come out.

Fortunately, the couple (a cop and a nurse) took the 25% risk of another child with the same thing and produced a healthy infant a few years later.