
The death of the synonymous codon – V

The coding capacity of our genome continues to amaze. The redundancy of the genetic code has been put to yet another use. Depending on how much you know, skip the following four links and read on. Otherwise all the background you need to understand the following is in them.





There really is no way around the redundancy producing synonymous codons. If you want to code for 20 different amino acids with only four choices at each position, two positions (4^2 = 16) won’t do. You need three positions, which gives you 4^3 = 64 possibilities (61 after the three stop codons are taken into account) and the redundancy that comes along with it. The previous links show how the redundant codons for some amino acids aren’t redundant at all but are used to code for the speed of translation, or for exonic splicing enhancers and inhibitors. Different codons for the same amino acid can produce wildly different effects while leaving the amino acid sequence of a given protein unchanged.
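The arithmetic above can be checked in a few lines. This is only a sketch of the counting argument; the names `BASES`, `STOP_CODONS` and `count_words` are made up for illustration.

```python
from itertools import product

BASES = "UCAG"                       # the four RNA nucleotides
STOP_CODONS = {"UAA", "UAG", "UGA"}  # the three stop codons of the standard code


def count_words(length):
    """Number of distinct 'words' of the given length over a 4-letter alphabet."""
    return len(BASES) ** length


# Enumerate every possible triplet and drop the stop codons.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(count_words(2))     # 16 -> too few for 20 amino acids
print(len(codons))        # 64
print(len(sense_codons))  # 61 codons left to cover 20 amino acids
```

Sixty-one codons for twenty amino acids is where the synonymous codons come from.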

The latest example (https://www.pnas.org/content/117/40/24936, Proc. Natl. Acad. Sci. vol. 117 pp. 24936 – 24946 ’20) is even more impressive, as it implies that our genome may be coding for way more proteins than we thought.

The work concerns Mitochondrial DNA Polymerase Gamma (POLG), which is a hotspot for mutations (with over 200 known), 4 of which cause fairly rare neurologic diseases.

Normally translation of mRNA into protein begins with something called an initiator codon (AUG), which codes for methionine. However, in the case of POLG, a CUG triplet (not AUG) located in the 5′ leader of POLG messenger RNA (mRNA) initiates translation almost as efficiently (∼60 to 70%) as an AUG in optimal context. This CUG directs translation of a conserved 260-triplet-long overlapping open reading frame (ORF) called POLGARF (POLG Alternative Reading Frame; surely they could have come up with something more euphonious).

Not only that, but the reading frame is shifted down one (-1), meaning that the protein looks nothing like POLG, with a completely different amino acid composition. “We failed to find any significant similarity between POLGARF and other known or predicted proteins or any similarity with known structural motifs. It seems likely that POLGARF is an intrinsically disordered protein (IDP) with a remarkably high isoelectric point (pI =12.05 for a human protein).” They have no idea what POLGARF does.

Yet mammals make the protein. It gets more and more interesting because the CUG triplet is part of something called a MIR (Mammalian-wide Interspersed Repeat), which (based on comparative genomics with a lot of different animals) entered the POLG gene 135 million years ago.

Using the teleological reasoning typical of biology, POLGARF must be doing something useful, or it would have been mutated away, long ago.

The authors note that other mutations (even from one synonymous codon to another — hence the title of this post) could cause other diseases due to changes in POLGARF amino acid coding. So while different synonymous codons might code for the same amino acid in POLG, they probably code for something wildly different in POLGARF.
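To see how a change that is silent in one reading frame can be anything but silent in an overlapping one, here is a minimal sketch. The 12-nucleotide sequence is invented for illustration and is not taken from POLG; the second frame is read at an offset of 2, which is the same as shifting back by one (-1). The helper `translate` is likewise hypothetical. It uses the standard genetic code.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order (first base varies slowest); '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna, offset=0):
    """Translate complete codons of `rna`, starting at position `offset`."""
    return "".join(
        CODON_TABLE[rna[i:i + 3]] for i in range(offset, len(rna) - 2, 3)
    )


original = "CUGGAAGGCUCA"  # toy sequence (assumption, not real POLG)
mutant = "CUGGAGGGCUCA"    # GAA -> GAG: both code for glutamate (E) in frame 0

print(translate(original, 0), translate(mutant, 0))  # LEGS LEGS (synonymous)
print(translate(original, 2), translate(mutant, 2))  # GRL GGL (R -> G in the -1 frame)
```

The third position of a codon in the main frame is the first position of a codon in the -1 frame, so the "wobble" that makes a change synonymous in one protein lands squarely on an amino-acid-determining position in the other.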

So the same segment of the genome is coding for two different proteins.

Is this a freak of nature? Hardly. We have an estimated 368,000 or more mammalian-wide interspersed repeats in our genome (https://en.wikipedia.org/wiki/Mammalian-wide_interspersed_repeat).

Could they be turning on translation of other proteins that we hadn’t dreamed of? Algorithms looking for protein coding genes probably all look for AUG codons and then look for open reading frames following them.
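A toy version of that kind of scan shows what gets missed. This is a deliberately naive sketch, not a description of how any real gene-finding program works; the function `find_orfs`, its parameters and the 18-nucleotide sequence are all invented for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(rna, start_codons=("AUG",), min_codons=3):
    """Scan the three forward frames for start-codon ... stop-codon stretches.

    Returns a list of (start_index, end_index, start_codon) tuples, where
    end_index is just past the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon in start_codons:
                start = i  # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, rna[start:start + 3]))
                start = None  # close it and keep scanning
    return orfs


toy = "GCUGGCAGAAGCCUAAGG"  # made-up sequence containing no AUG at all

print(find_orfs(toy))                               # []
print(find_orfs(toy, start_codons=("AUG", "CUG")))  # [(1, 16, 'CUG')]
```

With AUG as the only permitted start, the scan finds nothing. Allow CUG as well and an open reading frame appears in the same stretch of sequence.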

As usual, Shakespeare got there first: “There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy.”

Certainly the paper of the year for intellectual interest and speculation.

18 at one blow said the molecular biologist

With apologies to the brothers Grimm, molecular biologists may have found a way to treat 18 genetic diseases at one blow [ Cell vol. 170 pp. 899 – 912 ’17 ]. They use adeno-associated virus (AAV) packing a modified enzyme and an RNA to remove repeat expansions from RNA. The paper gives a list of the 18, all but one of which are neurologic. They include such horrors as Huntington’s chorea, the most common form of familial ALS, 3 forms of spinocerebellar ataxia and 6 forms of spinocerebellar atrophy.

They use Cas9 from Streptococcus pyogenes, part of the CRISPR system (https://en.wikipedia.org/wiki/CRISPR) that bacteria use to defend themselves against viruses, with a single guide RNA. Even more interestingly, Cas9 is a nuclease, an enzyme that cuts nucleic acids (here it is targeted to RNA), but the Cas9 they used is catalytically dead. They think that just binding to the aggregated RNA containing the repeats is enough to break up the aggregate. This is the way antisense oligonucleotides are thought to work.

The problem with getting a bacterial enzyme into a human cell is avoided here by using a virus (AAV) to infect the cells. It did get rid of RNA aggregates in patients’ cells from 4 of the diseases (two myotonic dystrophies, and the familial ALS).

It is almost too fantastic to be true.

Why almost all of these repeat expansion diseases affect the nervous system is anyone’s guess. As you can imagine, theories abound. So all we have to do is figure out how to get the therapy into the brain (hardly a small task).