Tag Archives: ACE2

The pandemic virus as evolution professor

Like it or not, the pandemic virus (SARS-CoV-2) is giving us all lessons in evolution and natural selection. The latest is one of the clearest examples of natural selection you are likely to see.  It is very clear cut, but to leave almost no one behind, I’m going to put in a lot of background material which will bore the cognoscenti — they can skip all this and go to the meat of the issue after the ****

The genetic code is read in groups of 3.  Imagine a language in which all words must be 3 letters long. 

The dog ate the fat cat who bit the toe off one mad rat.   Call this the reading frame, in which the words all make sense to you

Any combination of 3 letters means something to the machinery inside the cell responsible for reading the code, so deleting the f in fat 

gives us 

The dog ate the atc atw hob itt het oeo ffo nem adr at.   So this is a shift of 1 from the reading frame.  While it may not make sense to you, it makes sense to the cellular machinery. 

Now let’s delete 2 letters (in a row)

The dog ate the fat cat who bit the tof fon ema dra t.  

Not much sense after the deletion is there?  Or at least a completely different message.  This is a shift of 2 from the reading frame.

Now 3 letters (in a row)

The dog ate the fat cat who bit the toe off one mad rat.  

This gives 

The dog ate the fat cat who bit the tff one mad rat.  

Which has a funny looking word (tff), but leaves the rest of the 3 letter words intact (one mad rat).  This is called an in frame deletion. It basically lops out a single 3 letter word.  

Lopping out 4, 5, 6, .. letters will just give you one of the 3 patterns (frame shift of 1, frame shift of 2 or no frameshift at all) shown above (but nothing new)
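The three cases above can be reproduced mechanically. What follows is only a sketch: the function name `delete_and_reframe` and the letter positions are my own choices for illustration, not anything the cell uses. It strips the spaces, removes a run of letters, and regroups what is left into 3 letter words.

```python
SENTENCE = "The dog ate the fat cat who bit the toe off one mad rat"

def delete_and_reframe(sentence: str, start: int, n: int) -> str:
    """Delete n letters beginning at letter position `start` (spaces not counted),
    then regroup the remaining letters into 3 letter words."""
    letters = sentence.replace(" ", "")
    shortened = letters[:start] + letters[start + n:]
    return " ".join(shortened[i:i + 3] for i in range(0, len(shortened), 3))

# Position 12 is the f in "fat"; position 28 is the o in "toe".
print(delete_and_reframe(SENTENCE, 12, 1))  # frame shift of 1
print(delete_and_reframe(SENTENCE, 28, 2))  # frame shift of 2
print(delete_and_reframe(SENTENCE, 28, 3))  # in frame deletion
```

Only the last call leaves "one mad rat" readable, which is the whole point of an in frame deletion.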


Now the business end of the pandemic virus is the spike protein, and this is where the mutations everyone is worried about occur.  The spike protein binds to another protein (ACE2) on the surface of human cells and then the virus enters, causing havoc.  All the vaccines we have are against the spike protein. 

The spike protein is big (1,273 3 letter words long).  

Mutations occur randomly.  We now have something called GISAID (Global Initiative on Sharing All Influenza Data) which has well over 100,000 genome sequences of the virus.  

Other things being equal, we should see as many deletions of 1, 4 (3 + 1), 7 (2*3 + 1), 10 ... letters as deletions of 2, 5 (3 + 2), 8 (2*3 + 2), 11 ... letters, and as many again of 3, 6, 9, 12 ... letters.

The set 1, 4, 7, 10 ... represents a shift of 1 from the original reading frame, the set 2, 5, 8, 11 ... represents a frame shift of 2, and the set 3, 6, 9, 12 ... represents deletions producing no frameshift at all.

Since thousands on thousands of experiments show that mutations occur randomly, 1/3 of all deletion mutations should show a frameshift of 1, 1/3 of all deletion mutations should have a frame shift of 2, and 1/3 of all deletion mutations should produce no frameshift at all. 
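The bookkeeping behind the three sets is just the remainder after dividing the deletion length by 3. Here is a small sketch of that. The function name `frame_effect` is made up for illustration, and treating every length from 1 to 12 as equally likely is an assumption of the sketch, not a measured fact.

```python
from collections import Counter

def frame_effect(deletion_length: int) -> str:
    """Classify a deletion by what it does to the reading frame."""
    remainder = deletion_length % 3
    if remainder == 0:
        return "in frame"
    return f"frameshift of {remainder}"

# Assumption for illustration: each length from 1 to 12 occurs equally often.
counts = Counter(frame_effect(n) for n in range(1, 13))
for effect, count in sorted(counts.items()):
    print(effect, count)
```

Under that assumption each of the three classes gets exactly one third of the deletions.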

Well the authors of Science vol. 371 pp. 1139 – 1142 ’21  looked at 146,795 viral sequences and found 1,108 deletions in the gene for the spike protein.

They did not find each of the 3 types of deletions occurring to the same extent (1/3 of the time).  Among all deletions, 93% were in frame.  

Why? Because out of frame deletions change everything that comes after them. 


The dog ate the atc atw hob itt het oeo ffo nem adr at.  

This means that a functional spike protein won’t be formed, and the virus won’t infect our  cells, and it certainly won’t be found in GISAID.  

Ladies and Gentlemen you have just witnessed natural selection in action. 

Actually it’s even more complicated and even more impressive than that.  The in frame deletions occurred in one of four areas, which happen to be where antibodies to the spike protein bind.  So the out of frame deletions were selected against, and the in frame deletions were selected for. 

The blind watchmaker in action.

Here is another way to see how improbable it is that random chance should pick one of 3 equally probable possibilities 93% of the time.  Imagine that you are throwing dice.  You throw a single die 100 times, and 93 times you get one of two numbers (say 3 or 6).  You know the die is loaded.  The load is natural selection in the case of genome deletions. 
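For those who want a number rather than a feeling, here is a rough sketch using the figures quoted above (1,108 deletions, 93% in frame). It assumes each deletion is an independent draw with a 1/3 chance of being in frame, and it reconstructs the in frame count by rounding 93% of 1,108, since the exact count is not given here. The function name is mine.

```python
import math

def log10_tail_probability(n: int, k_min: int) -> float:
    """log10 of P(X >= k_min) for X ~ Binomial(n, 1/3), using exact integers."""
    # C(n,k) * (1/3)^k * (2/3)^(n-k) equals C(n,k) * 2^(n-k) / 3^n
    numerator = sum(math.comb(n, k) * 2 ** (n - k) for k in range(k_min, n + 1))
    return math.log10(numerator) - n * math.log10(3)

n_deletions = 1108
# Assumption: the in frame count is rebuilt from the quoted 93%.
n_in_frame = round(0.93 * n_deletions)

log10_p = log10_tail_probability(n_deletions, n_in_frame)
print(f"P(at least {n_in_frame} of {n_deletions} in frame by chance) = 10^{log10_p:.1f}")
```

The exponent that comes out is hugely negative, which is the loaded die in numerical form.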






The uses and abuses of molarity — II

Just as the last post showed why a 1 Molar solution of a protein makes no sense at all, it is reasonable to ask what the highest concentration of a single protein in the cellular environment could be. Strangely, it was very hard for me to find an estimate of the percentage of protein mass inside a eukaryotic cell. There is one for the red blood cell, which is essentially a bag of hemoglobin. The amount is 33 grams/deciliter or 330 grams/liter. Hemoglobin (which is a tetramer) has a molecular mass of 64,000 Daltons.  So that’s 330/64000 = 5.2 x 10^-3 Molar, or about 5 milliMolar.   So all proteins in our cells have a maximum concentration at most in the milliMolar range.
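The hemoglobin arithmetic is simple enough to check in a few lines. This is only the division spelled out with the two figures given above; the variable names are mine. It works out to roughly 5 x 10^-3 moles/liter, which is in the milliMolar range.

```python
HEMOGLOBIN_GRAMS_PER_LITER = 330.0    # 33 grams/deciliter in the red cell
HEMOGLOBIN_GRAMS_PER_MOLE = 64_000.0  # molecular mass of the tetramer in Daltons

molar = HEMOGLOBIN_GRAMS_PER_LITER / HEMOGLOBIN_GRAMS_PER_MOLE
millimolar = molar * 1000

print(f"{molar:.2e} Molar = {millimolar:.1f} milliMolar")
```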

Before moving on, how do you think the red blood cell gets its energy?  Amazingly it is by anaerobic glycolysis, not using the oxygen carried by hemoglobin at all.  Why? If it used oxidative phosphorylation which runs on oxygen, it would burn up.  That’s why red cells do not contain mitochondria. 

On to Kd, the dissociation constant.  At least 475 FDA approved drugs target G Protein Coupled Receptors (GPCRs), and our genome codes for some 826 of them.  Almost 500 of these are smell receptors, and of the 300 or so not involved with smell, 1/3 are orphans (as of 2019) with no known ligand.  There are GPCRs for all neurotransmitters, which is why neurologists and psychiatrists are very interested in them. 

The Kd is defined as [ free ligand ][ free receptor ]/ [ ligand bound to receptor ]  where all the  [  ]’s are concentrations in Moles/liter (i.e. Molar concentrations). 
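Here is the definition as a tiny calculation, for the case where it does make sense (everything free in solution). The concentrations are invented purely for illustration, and the function name is mine. One thing the definition gives you for free: when the free ligand concentration equals Kd, free receptor and bound receptor must be equal, so half the receptors are occupied.

```python
import math

def dissociation_constant(free_ligand: float, free_receptor: float, bound: float) -> float:
    """Kd = [free ligand][free receptor] / [ligand bound to receptor], all in Moles/liter."""
    return free_ligand * free_receptor / bound

# Hypothetical concentrations in Moles/liter, chosen only for illustration.
free_ligand = 1e-9
free_receptor = 2e-9
bound = 2e-9

kd = dissociation_constant(free_ligand, free_receptor, bound)
print(f"Kd = {kd:.1e} Molar")

# Free receptor equals bound receptor here, so the free ligand sits exactly at Kd.
half_occupied = math.isclose(kd, free_ligand)
print("half the receptors occupied:", half_occupied)
```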

There’s the rub.  Kd makes sense when ligand and receptor are swimming around in solution, but GPCRs never do this.  The working GPCR is embedded in our cell membranes, which topologists tell us are 2 dimensional manifolds embedded in 3 dimensional space.  What does concentration mean in a situation like this?  Think of the entropy involved in getting all the GPCRs to lie in a single plane.  Obviously not so simple.  

People get around this by using radioactive ligands, embedding GPCRs in membranes, and measuring the time for ligands to bind and unbind (i.e. kinetics), but this is miles away from the physiologic situation. For details please see

2019 Apr 5; 485: 9–19.

The same is true for other proteins of interest: ion channels for the neurologist, hormone receptors for the endocrinologist, angiotensin converting enzyme 2 (ACE2) for the pandemic virus.  

I think that all Kd’s of membrane embedded receptors do is give you an ordinal ordering (e.g. receptor A binds ligand B tighter than ligand C) but not a quantitative one.

Next up, how a Nobel prizewinner totally misunderstood the nature and applicability of molarity and studies on a two dimensional gas (complete with Pressure * Area = n * Gas Constant * Temperature).