Tag Archives: Stop Codon

Duchenne muscular dystrophy — a novel genetic treatment

Could the innumerable genetic defects underlying Duchenne muscular dystrophy all be treated the same way?  Possibly.  Paradoxically, the treatment involves actually making the gene  even worse.

Understanding how and why this might work involves a very deep dive into molecular biology.  You might start by looking at the series of five background articles I wrote — start at https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and follow the links.

I have a personal interest in Duchenne muscular dystrophy because I ran such a clinic from ’72 to ’87, watching young boys and adolescents die from it.  The major advance during that time was NOT medical or anything I did, but lighter braces, so the boys could stay ambulatory longer.  Things have improved since: survival has lengthened by a decade, so they now die in their late 20s.

So let’s start.  Duchenne muscular dystrophy is caused by a mutation in the gene coding for dystrophin, a large (3,685 amino acids) protein which ties the contractile apparatus of the muscle cell (actin and myosin) to the cell membrane.  It isn’t the largest protein we have (titin, another muscle protein with 34,350 amino acids, is), but the gene for dystrophin is the largest we have, weighing in at 2,220,233 nucleotides.  This is why Duchenne is one of the most common diseases due to a defect in a single gene: the gene is so large that lots of things can (and do) go wrong with it.

The gene comes in 79 pieces (exons) which account for under 1/200 of the nucleotides of the gene.  The rest must be spliced out and discarded.  Have a look at http://www.dmd.nl to see what can go wrong.  The commonest is deletion of parts of the gene (60 – 70% of cases), followed by duplication of other parts (10% of cases), with the rest being mutations that change one amino acid to another.
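The "under 1/200" figure can be checked on the back of an envelope from the two numbers quoted above (3,685 amino acids, 2,220,233 nucleotides).  The sketch below is only that: it assumes three nucleotides per amino acid plus one stop codon, and it ignores the untranslated parts of the exons, so it is a lower bound on how much of the gene the exons take up.  The function name is made up for illustration.

```python
# Back-of-the-envelope check using only the figures quoted in the post.
PROTEIN_LENGTH_AA = 3_685     # dystrophin length in amino acids
GENE_LENGTH_NT = 2_220_233    # dystrophin gene length in nucleotides
NT_PER_CODON = 3


def coding_fraction(protein_aa: int, gene_nt: int) -> float:
    """Fraction of the gene needed to encode the protein plus one stop codon."""
    coding_nt = (protein_aa + 1) * NT_PER_CODON
    return coding_nt / gene_nt


fraction = coding_fraction(PROTEIN_LENGTH_AA, GENE_LENGTH_NT)
print(f"coding nucleotides: {(PROTEIN_LENGTH_AA + 1) * NT_PER_CODON:,}")
print(f"fraction of gene:   {fraction:.4%}")
print(f"under 1/200?        {fraction < 1 / 200}")
```

So the protein-coding part alone comes to roughly 11,000 nucleotides, just under half a percent of the gene.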

Duchenne isn’t like cystic fibrosis where some 600 different mutations in the causative CFTR gene were known by 2003 but with 90% of cases due to just one.  So any genetic treatment for that young boy sitting in front of you had better be personalized to his particular mutation.

Or should it?

Possibly not.  We’ll need to discuss 3 things first:

1. Nonsense Mediated Decay (NMD)

2. Nonsense Induced Transcriptional Compensation (NITC)

3. The MDX mouse model of Duchenne muscular dystrophy

Nonsense mediated decay.  Nonsense is a poor term, because the 3 nonsense codons (out of 64 possible) tell the ribosome to stop translating mRNA into protein and drop off the mRNA.  That isn’t nonsense.  I prefer stop codon, or termination codon.

An incredibly clever piece of business tells the ribosome (which is after all an inanimate object) when a stop codon occurs too early in the mRNA, with a bunch of codons after it still needed to make up the whole protein.

Let’s go back to dystrophin and its 79 exons, and the fact that 99.5% of the gene is made of introns which are spliced out.  Remember the mRNA starts at the 5′ end and ends at the 3′ end.  The ribosome reads and translates it from 5′ to 3′.  When an intron is spliced out, a complex of several proteins is placed on the mRNA some 20 – 24 nucleotides 5′ to the splice site (this happens in the nucleus, well before the mRNA gets near a ribosome in the cytoplasm).  The complex is called the Exon Junction Complex (EJC).  The ribosome then happily munches along the mRNA from 5′ to 3′, knocking off the EJCs as it moves, until it hits a termination codon and drops off.

Over 95% of genes do not have introns after the termination codon.  What happens if a gene does?  Well, then the termination codon is called a premature termination codon (PTC), and there is usually an EJC 3′ (downstream) to it.  If a termination codon lies 50 – 55 nucleotides or more 5′ (upstream) of an EJC, then NMD occurs.

Whenever any termination codon is reached, release protein factors (eRF1, eRF3, SMG1) bind to the mRNA.  If there is an EJC around (which there shouldn’t be), the interaction between the two complexes triggers phosphorylation of one of the EJC proteins, triggering NMD.

So that’s how NMD happens, when there is a PTC.  Clever no?
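The positional rule just described can be put in a few lines of code.  This is a toy sketch, not a real NMD predictor: the exon lengths and stop positions are invented, the function names are hypothetical, and the distance is measured from the stop codon to the exon-exon junction (where the rule is usually stated) using the lower end of the 50 – 55 nucleotide range.

```python
from itertools import accumulate
from typing import List

NMD_MIN_DISTANCE_NT = 50  # lower end of the 50 - 55 nt range quoted in the post


def junction_positions(exon_lengths: List[int]) -> List[int]:
    """Positions, in spliced-mRNA coordinates, of each exon-exon junction."""
    # A junction follows every exon except the last one.
    return list(accumulate(exon_lengths))[:-1]


def triggers_nmd(stop_end: int, exon_lengths: List[int]) -> bool:
    """True if some junction lies at least NMD_MIN_DISTANCE_NT downstream of the stop."""
    return any(
        junction - stop_end >= NMD_MIN_DISTANCE_NT
        for junction in junction_positions(exon_lengths)
    )


# Hypothetical toy transcript: three exons of 300, 200 and 400 nucleotides.
exons = [300, 200, 400]
normal_stop = 650     # stop codon in the last exon: no junction downstream
premature_stop = 120  # stop codon in the first exon: junctions at 300 and 500 downstream

print(triggers_nmd(normal_stop, exons))     # False
print(triggers_nmd(premature_stop, exons))  # True
```

A stop codon in the last exon never has a junction (and so never an EJC) downstream of it, which is why the normal stop is left alone.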

Nonsense Induced Transcriptional Compensation (NITC).  I realize that this is a lot to throw at you, but a treatment for Duchenne is worth the effort (not to mention other genetic diseases in which the mechanism to be described also applies).

NITC is something I never heard about until two papers appeared in the 13 April Nature (vol. 568 pp. 179 – 180 (editorial), 193 – 197, 259 – 263).  Ever since we could knock out a gene by placing a PTC early in it (near the 5′ end), we’ve been surprised by some of the results, e.g. knocking out some genes thought to be crucial had little or no effect.  Other technologies which didn’t affect the gene, but which decreased the expression of the mRNA (such as RNA interference, aka Post-Transcriptional Gene Silencing, PTGS), did have big phenotypic effects.

This turns out to be due to NITC, which in turn is due to increased transcription of genes which are ancestrally related to the mutant gene.  Hard to believe.

Time to go back to NMD.  It doesn’t break mRNA down nucleotide by nucleotide, but fragments it.  These fragments get into the nucleus, and bind to complementary genomic sequences of the PTC gene, and also to genes ancestrally related to the mutant gene (so they’ll have similar nucleotide sequences).  Then epigenetics takes over, because the fragments recruit the COMPASS complex, which catalyzes the formation of H3K4Me3, part of the histone code which helps turn on transcription of the gene.  The sequence similarity of ancestrally related genes allows them and only them to be turned on by NITC.  Even cleverer than finding a PTC by the ribosome.

Something so incredible needs evidence.  Well, heterozygous zebrafish can be made to have one normal gene and one with a PTC.  What do you think happens?  The normal gene is upregulated (i.e. more of it is made).  Pretty good.

Finally the mdx mouse.  I’ve been reading about it for years.  It has a PTC in exon 23 of the dystrophin gene, resulting in a protein only 27% as long as it should be.  All sorts of therapeutic maneuvers have been tried on it.  Now any drug development chemist will tell you that animal models are lousy, but they’re all we’ve got.

The remarkable thing about mdx mice is that they don’t get weak, even though they do have muscle pathology.  All the verbiage above probably explains why.

So to treat ALL forms of Duchenne, put a premature termination codon (PTC) in exon 23 of the human gene.  It should work, as there are 4 dystrophin related proteins scattered around the genome: utrophin, dystrophin related protein 2 (DRP2), alpha dystrobrevin, and beta dystrobrevin.

There is an even better way to look for a place to put a PTC in the dystrophin gene.  Our genomes are filled with errors — for details see — https://luysii.wordpress.com/2018/05/01/how-badly-are-thy-genomes-oh-humanity-take-ii/.

There are lots of very normal people around with supposedly lethal mutations (including PTCs) in their genomes.  Probably scattered about various labs are at least 1,000,000 exome sequences in presumably normal people.  I’m not sure how much clinical information about them is available (other than that they are normal).  Hopefully their sex is.  Look at the dystrophin gene of normal males (females can be perfectly healthy carrying a mutant dystrophin gene, as it is found on the X chromosome and they have 2) and see if PTCs are to be found.  You can’t have a better animal model than that.

At over 1,000 words this is the longest post I’ve written, and hopefully the most useful.

A science fiction story (for the cognoscenti) — answer to the puzzle and a bit more

Comrade Chen we have a serious problem.

Don’t tell me one of our bugs escaped confinement.

Worse.  One of theirs did.  And it’s affecting the PLA (People’s Liberation Army).  Some are turning into pacifists.

It doesn’t kill them?

No. But for our purposes it might as well.

It’s a typical adeno-associated virus (AAV) like we use.

Well, what does the genome look like?

We’ve sequenced it and among other things, it codes for a protein which enters the brain and alters behavior.

What?

Well, the enemy has some excellent biologists, one of whom works on Wolbachia.

What’s that?

It’s a rickettsial organism which changes the sex life of some insects.

I don’t believe that.

Do you have a cat?

Yes.

Well, many cats contain another organism (Toxoplasma gondii).

So what.

Rats infected by the organism become less afraid of cats.

Another example please.

A fungus infecting carpenter ants causes the ant to leave its colony, climb a tree, chomp down on the underside of a leaf and die, freeing fungal spores to fall on the ground where they can reinfect new ants.

Well what is the genome of the virus?

It has some very unusual sequences, and one which proves that the Wolbachia biologist on the other side has a very large ego.

How so.

Well in addition to the brain infecting protein, there is a very unusual triplet of peptides all in a row.

Methionine Alanine Aspartic Acid Glutamic acid, then a stop codon, then Isoleucine Asparagine, then a stop codon, then Threonine Alanine Isoleucine Tryptophan Alanine Asparagine.  We think that the first two in some way cause readthrough of the stop codons so the protein following the short peptides is made.

Where does the big ego come in?

Sir, proteins can have hundreds and hundreds of amino acids.  People got tired of writing their full names out, so each of the 20 amino acids was given a single letter to stand for it.

M – Methionine

A – Alanine

D – Aspartic acid

What does D have to do with Aspartic acid?

Nothing sir, look on the letters as Chinese characters.

E – Glutamic acid

I – isoleucine

What about the stop codon between Glutamic acid and Isoleucine?

Just regard it as a space.

N – Asparagine

Nooo!!!  I’m beginning to get the picture.

Yes sir, it stands for MADE IN TAIWAN

—-

A few years later

Well the Taiwanese biologist outsmarted himself (or herself).   The Taiwanese soldiers wouldn’t fight either as the virus spread.  Most conflicts between nation states pretty much ended (Russia/Ukraine, North Korea/South Korea) etc. etc.  The Taiwanese biologist was nominated for the Nobel Peace Prize, and did receive it in absentia, as every military type in the world was looking for him (or  her), so he (or she) went into hiding, and is believed to be living in an Ashram near Boulder, Colorado.

Unfortunately, the idea of using viruses to change human behavior spread past nation states, and private groups with their own agendas began using it.

The ‘new soviet man’ of the previous century looked rather benign compared to what subsequently happened.

The next story for the scientific cognoscenti will describe the events leading up to the impeachment trial of President Jon Tester in 2028.

 


A research idea yours for the taking

Why would the gene for a protein contain a part which could form amyloid (the major component of the senile plaque of Alzheimer’s disease) and another part to prevent its formation?  Therein lies a research idea, requiring no grant money, and free for you to pursue since I’ll be 80 this month and have no academic affiliation.

Bri2 (aka Integral TransMembrane protein 2B — ITM2B) is such a protein.  It is described in [ Proc. Natl. Acad. Sci. vol. 115 pp. E2752 – E2761 ’18 ] http://www.pnas.org/content/pnas/115/12/E2752.full.pdf.

As a former neurologist I was interested in the paper because two different mutations in the stop codon for Bri2 cause 2 familial forms of Alzheimer’s disease: Familial British Dementia (FBD) and Familial Danish Dementia (FDD).  So the mutated protein is longer at the carboxy terminal end, and it is the extra amino acids which form the amyloid.

Lots of our proteins form amyloid when mutated; mutations in transthyretin, for example, cause familial amyloidotic polyneuropathy.  Amylin (Islet Amyloid Polypeptide, IAPP) is one of the most proficient amyloid formers.  Yet amylin is a protein found in the beta cell of the pancreas which releases insulin (actually in the same secretory granule containing insulin).

This is where Bri2 is thought to come in. It is also found in the pancreas.   Bri2 contains a 100 amino acid motif called BRICHOS  in its 266 amino acids which acts as a chaperone to prevent IAPP from forming amyloid (as it does in the pancreas of 90% of type II diabetics).

Even more interesting is the fact that the BRICHOS domain is found in 300 human genes, grouped into 12 distinct protein families.

Do these proteins also have segments which can form amyloid?  Are they, like the amyloid in Bri2, in segments of the gene which can only be expressed if a stop codon is read through?  Nothing in the cell is perfect, and how often readthrough occurs at stop codons isn’t known completely, but work is being done: Nucleic Acids Res. 2014 Aug 18; 42(14): 8928–8938.

I find it remarkable that the cause and the cure of a disease is found in the same protein.

Here’s the research proposal for you.  Look at the other 300 human genes containing the BRICHOS motif (itself just a beta sheet with alpha helices on either side) and see how many have sequences which can form amyloid.  There should be programs which predict the likelihood of an amino acid sequence forming amyloid.

It’s very hard to avoid teleology when thinking about cellular biochemistry and physiology.  It’s back to Aristotle where everything has a purpose and a design.  Clearly BRICHOS is being used for something or evolution/nature/natural selection/the creator would have long ago gotten rid of it.  Things that aren’t used tend to disappear in evolutionary time (witness the blind fish living in caves in Mexico that have essentially lost their eyes).  The BRICHOS domain clearly hasn’t disappeared, being present in over 1% of our proteins.

Suppose that many of the BRICHOS containing proteins have potential amyloid segments.  That would imply (to me at least) that the amyloid isn’t just junk that causes disease, but something with a cellular function. Finding out just what the function is would occupy several research groups for a long time.   This is also where you come in.  It may not pan out, but pathbreaking research is always a gamble when it isn’t stamp collecting.

 

The chemical ingenuity of the cell

If you know a bit of molecular biology, you know that messenger RNA (mRNA) has a tail of consecutive adenines added at its 3′ end.  If you don’t know that much, all the background you need can be found in https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ (just follow the links).

The adenines are not coded in the genome. Why? I’ve always thought of it as something preventing the mRNA from being broken down before the ribosome translates it into protein. Gradually the adenines are nibbled off by cytoplasmic nucleases. The literature seems to agree — from my notes on various sources

Most mRNAs in mammalian cells are quite stable and have a half life measured in hours, but others turn over within 10 to 30 minutes.  The 5′ cap structure in mRNA prevents attack by 5′ exonucleases and the polyadenine (polyA) tail prohibits the action of 3′ exonucleases.  The absence of a polyA tail is associated with rapid degradation of mRNA.  Histone mRNAs lack a polyA tail but have near their 3′ terminus a sequence which can form a stem loop structure; this appears to confer resistance to exonucleolytic attack.

polyA — the polyAdenine tail found on most mRNAs must be removed before mRNA degradation can occur. Anything longer than 10 adenines in a row seems to protect mRNA. The polyA tail is homogenous in length in most species ( 70 – 90 in yeast, 220 – 250 nucleotides in mammalian cells). PolyA shortening can be separated into two phases, the first being the shortening of the tail down to 12 – 25 residues, and the second terminal deadenylation being the removal of some or all of them.

Molecular Biology of the Cell 4th Edition p. 449: Once a critical threshold of tail shortening has been reached (about 30 As), the 5′ cap is removed (decapping) and the RNA is rapidly degraded.  The proteins that carry out tail shortening compete directly with the machinery that catalyzes translation; therefore any factors increasing translation initiation efficiency increase mRNA stability.  Many RNAs carry, in their 3′ UTR sequences, binding sites for specific proteins that increase or decrease the rate of polyA shortening.

But why polyAdenine?  Why not polyCytosine or polyGuanine or polyUridine?  Here’s where the chemical ingenuity comes in.  Of the 64 possible codons, only 3 tell the ribosome to stop.  These are variously called termination codons, stop codons, and (idiotically) nonsense codons.  They aren’t nonsense at all, and are functionally vital for the following reason.  Stop codons cause the ribosome to separate into two parts, releasing the mRNA and the protein.  Suppose a given mRNA doesn’t have a stop codon?  Then the ribosome and the mRNA remain stuck together, and future protein synthesis by that particular ribosome becomes impossible.  Not good.

This is probably why the codons for stop are so similar (UAA, UAG and UGA): mutating the G in UAG or UGA to an A gives UAA, and mutating either A in UAA to a G gives another stop codon.  So the coding chosen for stop codons is somewhat resistant to mutation, because mRNAs without stop codons are disastrous for the reasons shown above.
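The claim about stop codons mutating into each other is easy to check by brute force.  The sketch below lists, for each of the three stop codons, which of its 9 possible single-base changes give another stop codon; the helper name is made up for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
BASES = "ACGU"


def single_base_neighbors(codon: str) -> set:
    """All codons that differ from `codon` at exactly one position."""
    return {
        codon[:i] + base + codon[i + 1:]
        for i in range(3)
        for base in BASES
        if base != codon[i]
    }


for stop in sorted(STOP_CODONS):
    still_stop = sorted(single_base_neighbors(stop) & STOP_CODONS)
    print(f"{stop}: {len(still_stop)} of 9 single-base changes give another stop: {still_stop}")
```

UAA has two such neighbors (UAG and UGA), while UAG and UGA each have one (UAA); getting from UAG to UGA takes two changes.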

Well, randomness happens, and suppose that the termination codon has been mutated to a codon for an amino acid.  These are called nonStop RNAs, which code for nonStop proteins.  So the poor ribosome then translates the mRNA right to its 3′ end.  Well, what does AAA translate into?  Lysine.  Lysine is quite basic and quickly becomes protonated on its epsilon amino group (even within the confines of the ribosome).  The exit tunnel of the ribosome is strongly negatively charged, and so coulomb interaction grinds things to a halt.  What other basic amino acids are there?  There’s arginine, and perhaps histidine, but none of their codons is CCC or GGG or UUU.
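To make the "why polyA" argument concrete, here is a small sketch of what a ribosome would make from each of the four possible homopolymer tails.  It uses only the four relevant entries of the standard genetic code; the function name and the 30 nucleotide tail are just for illustration.

```python
# Only the four homopolymer codons are needed here (standard genetic code).
HOMOPOLYMER_CODONS = {
    "AAA": "lysine",
    "CCC": "proline",
    "GGG": "glycine",
    "UUU": "phenylalanine",
}
BASIC_AMINO_ACIDS = {"lysine", "arginine", "histidine"}


def tail_translation(base: str, tail_length_nt: int) -> list:
    """Amino acids made by reading a homopolymer tail of `base`, codon by codon."""
    return [HOMOPOLYMER_CODONS[base * 3]] * (tail_length_nt // 3)


for base in "ACGU":
    amino_acid = HOMOPOLYMER_CODONS[base * 3]
    print(f"poly{base} tail -> poly-{amino_acid}, basic: {amino_acid in BASIC_AMINO_ACIDS}")

print(tail_translation("A", 30))  # ten lysines from a 30 nt stretch of polyA
```

Only the polyA tail gives a run of positively charged residues, which is what the negatively charged exit tunnel can grab.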

The Ribosomal Quality Control system (RQC) then springs into action.  I didn’t realize this until reading the following paper this year.  Did you?  Amazing cleverness on the part of the cell.

[ Nature vol. 531 pp. 191 – 195 ’16 ] Translation of an mRNA lacking a stop codon (nonStop mRNA) in eukaryotes results in a polyLysine protein (AAA codes for lysine).  The positively charged lysines cause stalling in the negatively charged ribosomal exit tunnel.  The Ribosomal Quality Control complex (RQC complex) recognizes nonStop proteins and mediates their ubiquitination and proteasomal degradation.

The eukaryotic RQC comprises Listerin (Ltn1), an E3 ubiquitin ligase, Rqc1, Rqc2 and the AAA+ protein CDC48.  On dissociation of the stalled ribosome, Rqc2 binds to the peptidyl tRNA of the 60S subunit and recruits Ltn1, which curves around the 60S ribosome, positioning its ligase domain near the nascent chain exit.  Rqc2 is a nucleotide binding protein that recruits tRNA^Ala and tRNA^Thr to the 60S peptidyl tRNA complex.  This results in the addition of a Carboxy terminal Ala/Thr sequence (a CAT tail) to the stalled nascent chain.

Mutation of Listerin causes neurodegeneration in mice.