Frameshifting

It is a pleasure to get back to the science after the ugly real world intruded with

1. A president in early dementia: https://luysii.wordpress.com/2021/06/30/biden-is-in-early-dementia-the-evidence/

2. The latest in politically correct racism: https://luysii.wordpress.com/2021/07/03/hitler-would-have-loved-it/

but these things needed to be addressed.

I was very pessimistic about the chance of a vaccine for the pandemic based on my experience with AIDS/HIV-1.  Why? Because no vaccine for HIV-1 has been forthcoming despite 40 years of intense effort.  I am delighted to be wrong about pandemic vaccines.

But AIDS isn’t the kiss of death it was when I was in practice back in the 80s.  Why?  Because we know so much about what happens after the virus infects cells.  We attack all its weak points, from its genome to its reverse transcriptase.  So AIDS is now a chronic manageable disease.

So the more we know about SARS-CoV-2, the more ways we’ll find to attack it.   Which brings me to Science vol. 372 pp. 1306 – 1313 ’21.

The pandemic virus SARS-CoV-2 (like all coronaviruses) uses something called frameshifting.

Here is a brief tutorial

Her fox and dog ate our pet rat

H erf oxa ndd oga teo urp etr at

He rfo xan ddo gat eou rpe tra t

The last two lines make no sense at all, but (neglecting the spaces) they have identical letter sequences.

Here are similar sequences of nucleotides making up the genetic code as transcribed into RNA

ATG CAT TAG CCG TAA GCC GTA GGA

TGC ATT AGC CGT AAG CCG TAG GA.

GCA TTA GCC GTA AGC CGT AGG A..

Again, in our genome there are no spaces between the triplets. But all the triplets you see are meaningful in the sense that they each code for one of the twenty amino acids (except for TAA and TAG, which say stop). ATG codes for methionine (the purists will note that all the T’s should be U). I’m too lazy to look the rest up, but the ribosome doesn’t care, and will happily translate all 3 sequences into the sequential amino acids of a protein until it reaches a stop.

Both sets of sequences have undergone (reading) frame shifts. The examples are of +1 and +2 frameshifts.
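For anyone who would rather let a computer do the looking up, here is a minimal Python sketch of the tutorial above. It chops the example sequence into triplets starting at offsets 0, 1 and 2 and translates each triplet with the standard genetic code. The function names `codons` and `translate` are my own inventions for illustration, and `*` marks a stop codon.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def codons(seq, offset=0):
    """Split seq into complete triplets, starting `offset` letters in."""
    seq = seq.replace(" ", "").upper()
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

def translate(seq, offset=0):
    """One-letter amino acids for each triplet; '*' marks a stop codon."""
    return "".join(CODON_TABLE[c] for c in codons(seq, offset))

SEQ = "ATG CAT TAG CCG TAA GCC GTA GGA"

for offset in (0, 1, 2):
    print(offset, " ".join(codons(SEQ, offset)), translate(SEQ, offset))
# 0 ATG CAT TAG CCG TAA GCC GTA GGA MH*P*AVG
# 1 TGC ATT AGC CGT AAG CCG TAG CISRKP*
# 2 GCA TTA GCC GTA AGC CGT AGG ALAVSRR
```

The sketch translates straight through the stops so that every triplet is visible. A real ribosome would quit at the first `*`.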

SARS-CoV-2 uses a -1 frameshift.  This is necessary for the synthesis of nonstructural protein 12 (nsp12), which is crucially important to the virus as it is the viral RNA-dependent RNA polymerase.

To produce the frameshift, the virus actually throws a monkey wrench at the ribosome.  At the site of the future frameshift the viral genome forms a pseudoknot (https://en.wikipedia.org/wiki/Pseudoknot) which blocks the smooth movement of the ribosome along the viral genome.  The ribosome then backs up by 1 nucleotide (the -1 frameshift) and chugs on.

So PNAS vol. 118 e2023051118 ’21 threw the kitchen sink (i.e. every compound they could think of) at the virus to find one which stopped the frameshift, and they found one: merafloxacin, a fluoroquinolone.  There are all sorts of fluoroquinolones in use as antibiotics, so it’s time to try the others out.
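The backing up can be mimicked in a few lines of Python. This is only a toy sketch: the sequence is made up, and `slip_after` (the number of triplets read before the slip) is a hypothetical parameter of my own, not anything measured from the virus.

```python
def read_with_minus_one_shift(seq, slip_after):
    """Read seq in triplets, backing up one letter after `slip_after` triplets.

    Returns (triplets read before the slip, triplets read after it).
    """
    seq = seq.replace(" ", "").upper()
    slip_pos = 3 * slip_after              # where the ribosome stalls
    before = [seq[i:i + 3] for i in range(0, slip_pos, 3)]
    restart = slip_pos - 1                 # back up by one letter
    after = [seq[i:i + 3] for i in range(restart, len(seq) - 2, 3)]
    return before, after

# Toy sequence, chosen only so the shift is easy to see.
before, after = read_with_minus_one_shift("AAACCCGGGTTT", slip_after=2)
print(before)  # ['AAA', 'CCC']
print(after)   # ['CGG', 'GTT']
```

Without the slip the last two triplets would have been GGG and TTT. With it, the final C of the second triplet is read twice and everything downstream lands in a new frame.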

This is unlikely to be a general approach to coronavirus therapy, as the RNA sequence at the frameshift site is likely to be different in each coronavirus.

I don’t think frameshifting occurs in eukaryotic cells, but I’m not sure.  Does anyone out there know?

 

Frameshifting

hed oga tet hec atw hoa tet her atw hob ith erp aw

Say what?  It’s a simple sentence made of 3 letter words frameshifted by one:

the dog ate the cat who ate the rat who bit her paw

Codons are read as groups of three nucleotides, and frameshifting has always been thought to totally destroy the meaning of a protein, as an entirely different protein is made.

Not so, says PNAS vol. 117 pp. 5907 – 5912 ’20. Normally a frameshifted protein has only 7% sequence identity with the original.  This is about what one would expect given that there are 20 amino acids, and chance coincidence would argue for 5% (1 in 20).  But there are more ways for proteins to be similar than being identical.  One can classify our amino acids in several ways: charged vs. uncharged, aromatic vs. nonaromatic, hydrophilic vs. hydrophobic, etc.

The authors looked at 2,900 human proteins, frameshifted each original by +1, and compared the hydrophobicity profiles of the two.  Amazingly there was a correlation of 0.7 between the two, despite sequence identity of 7%.  Similarly, frameshifting didn’t disturb the chance of intrinsic disorder.  So frameshifting is embedded in the structure of the universal genetic code, and may have actually contributed to its shaping.  Frameshifting could be an evolutionary mechanism for generating proteins with similar attributes (hydrophobicity, intrinsic order vs. disorder, etc.) but with vastly different sequences.  Evolution, aka natural selection aka deus ex machina aka God, could muck about with the ready-made protein and find something new for it to do.   A remarkable concept.
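To make the comparison concrete, here is a rough Python sketch of the idea, not the authors’ actual procedure (which I haven’t reproduced). It translates a sequence in frame 0 and in a shifted frame, then reports the fraction of identical residues and the Pearson correlation of per-residue Kyte-Doolittle hydropathy values. My assumptions: no smoothing window, positions where either frame hits a stop are simply skipped, and the toy sequence `TOY` is made up for illustration.

```python
from itertools import product
from math import sqrt

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

def translate(seq, offset=0):
    """Translate every complete triplet from `offset`; '*' marks a stop."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(offset, len(seq) - 2, 3))

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def compare_frames(seq, shift=1):
    """Return (sequence identity, hydropathy correlation) for frame 0 vs. shifted."""
    a, b = translate(seq, 0), translate(seq, shift)
    # Assumption: drop positions where either frame reads a stop codon.
    pairs = [(x, y) for x, y in zip(a, b) if x != "*" and y != "*"]
    identity = sum(x == y for x, y in pairs) / len(pairs)
    r = pearson([KD[x] for x, _ in pairs], [KD[y] for _, y in pairs])
    return identity, r

TOY = "ATGGCTCTGATCGTTAAACCGGAATTTCATTGGAGC"  # made-up sequence
identity, r = compare_frames(TOY, shift=1)
print(f"identity = {identity:.2f}, hydropathy correlation = {r:.2f}")
```

A 12-residue toy says nothing about real proteins; the point is only the mechanics. Fed real coding sequences, the same function could be used to check figures like those quoted above.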

The gag-pol precursor p180 of the AIDS virus is derived from the gag-pol mRNA by translation involving ribosomal frameshifting within the gag-pol overlap region.  The overlap is 241 nucleotides with pol in the -1 phase with respect to gag (that’s an amazing 80 amino acids).  I was amazed at the efficiency of coding of two different proteins (one an enzyme and one structural), but perhaps they aren’t that different in terms of hydrophobicity (or something else).

I’d love to see the hydropathy profile of the overlap of the two proteins, but I don’t know how to get it.