hed oga tet hec atw hoa tet her atw hob ith erp aw

Say what?  It’s a simple sentence made of 3-letter words, frameshifted by one letter.

The dog ate the cat who ate the rat who bit her paw
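
For anyone who wants to play with the trick, here is a minimal Python sketch. The function name `regroup` and its parameters are made up for illustration: it strips the spaces, skips the first `shift` letters, and re-cuts what is left into three-letter "words", much as a ribosome that slips by one nucleotide keeps reading in triplets.

```python
def regroup(sentence, shift=1, size=3):
    """Drop spaces, skip `shift` letters, then re-cut into `size`-letter words."""
    letters = sentence.replace(" ", "").lower()[shift:]
    return " ".join(letters[i:i + size] for i in range(0, len(letters), size))


original = "the dog ate the cat who ate the rat who bit her paw"

print(regroup(original, shift=0))  # the in-frame reading
print(regroup(original, shift=1))  # the same letters read one position later
```

With `shift=1` the last "word" is left incomplete, just as a frameshifted reading frame rarely ends on a whole codon.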

The genetic code is read in codons, groups of three nucleotides, and frameshifting has always been thought to totally destroy the meaning of a protein, since an entirely different amino acid sequence is made.

Not so, says PNAS vol. 117 pp. 5907 – 5912 ’20. Normally a frameshifted protein has only 7% sequence identity with the original.  This is about what one would expect: with 20 equally frequent amino acids, chance coincidence alone would give 5%.  But proteins can be similar in ways other than being identical.  One can classify amino acids in several ways: charged vs. uncharged, aromatic vs. nonaromatic, hydrophilic vs. hydrophobic, etc.
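
The 5% figure is just 1 in 20. As a rough sketch (not the paper's method, and with hypothetical function names and toy peptides), percent identity and the chance expectation can be computed like this. The chance calculation assumes each position is drawn independently from the given amino acid frequencies, which real proteins only approximate.

```python
def percent_identity(a, b):
    """Percentage of aligned positions (no gaps) where two sequences agree."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / n


def chance_identity(freqs):
    """Expected percent identity for two random sequences with these frequencies."""
    return 100.0 * sum(p * p for p in freqs)


uniform = [1 / 20] * 20  # assumption: all 20 amino acids equally common
print(round(chance_identity(uniform), 2))

# Two made-up peptides, purely for illustration
print(percent_identity("MKTAY", "MKSAF"))
```

Unequal amino acid frequencies push the chance value above 5%, which may be part of why the observed figure sits a little higher.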

The authors looked at 2,900 human proteins, frameshifted each one by +1, and compared the hydrophobicity profiles of the original and the frameshifted version.  Amazingly, there was a correlation of 0.7 between the two, despite sequence identity of only 7%.  Similarly, frameshifting didn’t disturb the propensity for intrinsic disorder.  So tolerance of frameshifting appears to be embedded in the structure of the universal genetic code, and may have actually contributed to its shaping.  Frameshifting could be an evolutionary mechanism for generating proteins with similar attributes (hydrophobicity, intrinsic order vs. disorder, etc.) but with vastly different sequences.  Evolution, aka natural selection aka deus ex machina aka God, could then muck about with the ready-made protein and find something new for it to do.   A remarkable concept.
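
To make the comparison concrete, here is a sketch of how one might repeat it on a single coding sequence. It is my own simplification, not the authors' pipeline: it uses the Kyte-Doolittle hydropathy scale (the paper may well use other scales and some smoothing), pairs residue i of the original with residue i of the +1 frame, and simply skips positions where either frame hits a stop codon. `TOY_DNA` is an arbitrary made-up sequence, and all function names are hypothetical.

```python
import math

# Standard genetic code, built from the usual TCAG ordering
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Kyte-Doolittle hydropathy values (positive = hydrophobic)
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def translate(dna, offset=0):
    """Translate every complete codon starting at `offset`; '*' marks a stop."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(offset, len(dna) - 2, 3))


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equal-length lists with at least 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def frameshift_correlation(dna, shift=1):
    """Correlate per-residue hydropathy of frame 0 with the shifted frame."""
    ref, alt = translate(dna, 0), translate(dna, shift)
    # Assumption: positions with a stop codon in either frame are dropped
    pairs = [(KD[a], KD[b]) for a, b in zip(ref, alt) if a != "*" and b != "*"]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


# Arbitrary toy sequence, written codon by codon for readability
TOY_DNA = ("ATG GCT CTG TGG ATG CGT CTG CTG CCA CTG CTG GCA "
           "CTG CTG GCT CTG TGG GGT CCA GAT CCA GCT GCA").replace(" ", "")

print(translate(TOY_DNA, 0))
print(translate(TOY_DNA, 1))
print(round(frameshift_correlation(TOY_DNA), 2))
```

One short toy sequence proves nothing either way. The paper's claim rests on thousands of real proteins, so a fair test would feed real coding sequences through something like `frameshift_correlation` and look at the distribution.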

The gag-pol precursor p180 of the AIDS virus is derived from the gag-pol mRNA by translation involving ribosomal frameshifting within the gag-pol overlap region.  The overlap is 241 nucleotides, with pol in the -1 phase with respect to gag (that’s an amazing 80 amino acids).  I was amazed at the efficiency of coding two different proteins (one an enzyme and one structural) in the same stretch of RNA, but perhaps they aren’t that different in terms of hydrophobicity (or something else).

I’d love to see the hydropathy profile of the overlap of the two proteins, but I don’t know how to get it.
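
One possible route, offered as a sketch rather than a recipe: once the two peptide sequences encoded by the overlap are in hand (they would have to come from a sequence database; the strings below are placeholders, not real gag or pol sequences), a classic sliding-window hydropathy profile takes only a few lines. The Kyte-Doolittle scale and the 9-residue window are my assumptions, and `windowed_hydropathy` is a hypothetical helper name.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic)
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}


def windowed_hydropathy(protein, window=9):
    """Mean hydropathy of each run of `window` consecutive residues."""
    scores = [KD[aa] for aa in protein.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the sequence length")
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Placeholder peptides standing in for the two readings of an overlap
frame_a = "MALWMRLLPLLALLALWGPDPAA"
frame_b = "WLCGCVCCHCWHCWLCGVQIQL"

for name, seq in (("frame A", frame_a), ("frame B", frame_b)):
    profile = windowed_hydropathy(seq, window=9)
    print(name, [round(v, 2) for v in profile])
```

Plotting the two lists against window position on the same axes would give the side-by-side picture of the overlap.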

