
The chemical ingenuity of the AIDS virus

Pop quiz:  You are a virus with under 10,000 nucleotides in your genome.  To make the capsid enclosing your genome, you need to make 250 hexamers of a particular protein.  How do you do it?


Give up?


You grab a cellular metabolite with a mass under 1,000 Daltons to bind the 6 monomers together.  The metabolite occurs at fairly substantial concentrations (for a metabolite) of 10 – 40 microMolar.

What is the metabolite?

Give up?


It has nearly perfect 6 fold symmetry.


Still give up?

[ Nature vol. 560 pp. 509 – 512 ’18 ]  https://www.nature.com/articles/s41586-018-0396-4 says that it’s inositol hexakisphosphate (IP6)  — nomenclature explained at the end. http://www.refinebiochem.com/pages/InositolHexaphosphate.html

Although IP6 looks like a sugar (with 6 CHOH groups forming a 6 membered ring), it is not a typical one because it is not an acetal (no oxygen in the ring).  All 6 hydroxyls of IP6 are phosphorylated.  They bind to two lysines on a short (21 amino acids) alpha helix found in the protein (Gag which has 500 amino acids).  That’s how IP6 binds the 6 Gag proteins together. The paper has great pictures.

It is likely that IP6 is used by other cellular proteins to form hexamers (but the paper doesn’t discuss this).

IP6 is quite symmetric, and 5 of the 6 phosphorylated hydroxyls can be equatorial, so this is likely the energetically favored conformation, given the bulk (and mass) of the phosphate group.
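To get a feel for the numbers in the quiz, here is a rough back-of-the-envelope sketch in Python. The hexamer count, Gag length and IP6 concentrations come from the text above. The cell volume (1 picoliter) and the one-IP6-per-hexamer ratio are my own assumptions for illustration, not figures from the paper.

```python
# Rough arithmetic for the capsid puzzle. Values marked ASSUMPTION are not
# from the post or the paper; they are only there to make the sum concrete.
AVOGADRO = 6.022e23                  # molecules per mole

hexamers = 250                       # from the post
gag_copies = hexamers * 6            # 6 monomers per hexamer
gag_length_aa = 500                  # from the post
gag_coding_nt = gag_length_aa * 3    # 3 nucleotides per amino acid

# ASSUMPTION: one IP6 molecule per hexamer.
ip6_needed = hexamers * 1

# ASSUMPTION: a cell volume of roughly 1 picoliter (1e-12 liters).
ASSUMED_CELL_VOLUME_L = 1e-12


def molecules_per_cell(conc_micromolar, volume_l=ASSUMED_CELL_VOLUME_L):
    """Convert a micromolar concentration into a molecule count per cell."""
    return conc_micromolar * 1e-6 * AVOGADRO * volume_l


ip6_low = molecules_per_cell(10)     # lower end of the 10 - 40 microMolar range
ip6_high = molecules_per_cell(40)    # upper end

print(f"Gag copies needed: {gag_copies}")
print(f"Nucleotides to encode Gag: {gag_coding_nt} (genome is under 10,000)")
print(f"IP6 molecules per cell: {ip6_low:.3g} to {ip6_high:.3g}")
print(f"IP6 needed under the one-per-hexamer assumption: {ip6_needed}")
```

Under these assumptions the cell holds millions of IP6 molecules while the virus needs only a few hundred, so the supply is not the limiting factor.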

I think that the AIDS virus definitely has more chemical smarts than we do.  Humility is definitely in order.

Nomenclature note:  We’re all used to ATP (Adenosine TriPhosphate) and ADP (Adenosine DiPhosphate) — here all 3 or 2 phosphates form a chain.  Each of the 6 hydroxyls of inositol can be singly phosphorylated, leading to inositol bis, tris, tetrakis, pentakis, hexakis phosphates.  Phosphate chains can form on them as well, so IP7 and IP8 are known (heptakis?, Octakis??)


Scary stuff

While you were in your mother’s womb, endogenous viruses were moving around the genome in your developing brain according to [ Neuron vol. 85 pp. 49 – 59 ’15 ].

The evidence is pretty good. For a while half our genome was called ‘junk’ by those who thought they had molecular biology pretty well figured out. For instance 17% of our 3.2 gigaBase DNA genome is made of LINE1 elements. These are ‘up to’ 6 kiloBases long. Most are defective in the sense that they stay where they are in the genome. However some can be transcribed into RNA and the RNA translated into proteins, among which are a reverse transcriptase (just like the AIDS virus) and an integrase. The reverse transcriptase makes a DNA copy of the RNA, and the integrase puts it back into the genome in a different place.

Most LINE1 DNA transcribed into RNA has a ‘tail’ of polyAdenine (polyA) tacked onto the 3′ end. The number of A’s tacked on isn’t coded in the genome, so it’s variable. This allows the active LINE1’s (under 1/1,000 of the total) to be recognized when they move to a new place in the genome.
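The LINE1 figures above are easier to appreciate with the arithmetic written out. This is a minimal Python sketch using only the numbers quoted here (17%, 3.2 gigaBases, 6 kiloBases, 1/1,000). Since most elements are shorter than 6 kiloBases, the copy count it produces is only a lower bound, and applying the 1/1,000 figure to that bound is purely illustrative.

```python
# Arithmetic on the LINE1 numbers quoted in the post (integer math throughout).
GENOME_BP = 3_200_000_000      # 3.2 gigaBases
LINE1_PERCENT = 17             # percent of the genome made of LINE1
LINE1_MAX_BP = 6_000           # elements are 'up to' 6 kiloBases long

line1_bp = GENOME_BP * LINE1_PERCENT // 100

# If every element were full length we would get the FEWEST possible copies,
# so this is a lower bound on the number of LINE1 elements.
min_copies = line1_bp / LINE1_MAX_BP


def active_upper_bound(total_copies):
    """'Under 1/1,000 of the total' are active, per the post."""
    return total_copies / 1000


print(f"DNA made of LINE1: {line1_bp:,} bases")
print(f"At least {min_copies:,.0f} LINE1 copies")
# Illustration only: the true copy number is not given in the post.
print(f"Active elements if the total equaled that bound: "
      f"under {active_upper_bound(min_copies):.0f}")
```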

It’s unbelievable how far we’ve come since the Human Genome Project, which took over a decade and over a billion dollars to sequence a single human genome (still being completed, by the way, by filling in gaps etc. [ Nature vol. 517 pp. 608 – 611 ’15 ] using a haploid human tumor called a hydatidiform mole). The Neuron paper sequenced the DNA of 16 single neurons. They found LINE1 movement in 4.

Once a LINE1 element has moved (something very improbable) it stays put, and all cells descended from the cell in which it moved carry the LINE1 element in the new position.

They found multiple lineages and sublineages of cells marked by different LINE1 retrotransposition events and subsequent mutation of polyA microsatellites within L1. One clone contained thousands of cells limited to the left middle frontal gyrus, while a second clone contained millions of cells distributed over the whole left hemisphere (did they do whole genome sequencing on millions of cells?).

There is one fly in the ointment. All 16 neurons were from the same ‘neurologically normal’ individual.

Mosaicism is a term used to mean that different cells in a given individual have different genomes. This is certainly true in everyone’s immune system, but we’re talking brain here.

Is there other evidence for mosaicism in the brain? Yes. Here it is:

[ Science vol. 345 pp. 1438 – 1439 ’14 ] 8/158 kids with brain malformations with no genetic cause (as found by previous techniques) had disease-causing mutations in only a fraction of their cells (hopefully not brain cells obtained by biopsy). Some mosaicism is obvious: the cafe au lait spots of McCune Albright syndrome, for example. DNA sequencing takes the average of multiple reads (of the DNA from multiple cells?). Mutations found in only a few reads are interpreted as part of the machine’s inherent error rate. The trick was to use sequencing of candidate gene regions to a depth of 300 (rather than the usual 50 – 60).
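Why does depth matter so much? A simple binomial model gives the flavor. The sketch below is my own illustration, not the method used in the Science paper. It assumes each read independently shows the mutation with a fixed probability, and it picks a hypothetical mosaic fraction (5% of reads) and a hypothetical calling threshold (at least 5 supporting reads) just to make the comparison concrete.

```python
from math import comb


def prob_at_least(k, depth, p):
    """P(X >= k) where X ~ Binomial(depth, p): chance of k or more variant reads."""
    below = sum(comb(depth, i) * p**i * (1 - p)**(depth - i) for i in range(k))
    return 1.0 - below


# ASSUMPTIONS for illustration only (neither number is from the paper):
VARIANT_READ_FRACTION = 0.05   # mutation shows up in 5% of reads
MIN_SUPPORTING_READS = 5       # reads required before we believe the variant

p_depth_50 = prob_at_least(MIN_SUPPORTING_READS, 50, VARIANT_READ_FRACTION)
p_depth_300 = prob_at_least(MIN_SUPPORTING_READS, 300, VARIANT_READ_FRACTION)

print(f"Chance of detection at depth 50:  {p_depth_50:.3f}")
print(f"Chance of detection at depth 300: {p_depth_300:.3f}")
```

With these made-up parameters the variant is usually missed at depth 50 and almost always caught at depth 300, which is the point of sequencing deeper.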

It is possible that some genetically ‘normal’ parents who have abnormal kids are mosaics for the genetic abnormality.

[ Science vol. 342 pp. 564 – 565, 632 – 637 ’13 ] Our genomes aren’t perfect. Each human genome contains 120 variants that inactivate protein genes, with 20 of the 120 genes inactivated in both copies.

The blood of ‘many’ individuals becomes increasingly clonal with age, and the expanded clones often contain large deletions and duplications, a risk factor for cancer.

Some cases of hemimegalencephaly are due to somatic mutations in AKT3.

30% of skin fibroblasts ‘may’ have somatic copy number variations in their genomes.

The genomes of 110 individual neurons from the frontal cortex of 3 people were sequenced. 45/110 of the neurons had copy number variations (CNVs), ranging in size from 3 megaBases to a whole chromosome. 15% of the neurons accounted for 73% of the CNVs. However, 59% of neurons showed no CNVs, while 25% showed only 1 or 2.
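The percentages in that last paragraph hang together, which a few lines of Python can confirm. This only rearranges the numbers quoted above and adds no new data.

```python
# Consistency check on the neuron CNV figures quoted in the post.
total_neurons = 110
neurons_with_cnv = 45

pct_with_cnv = 100 * neurons_with_cnv / total_neurons   # about 41%
pct_without_cnv = 100 - pct_with_cnv                    # about 59%, as quoted

pct_none_quoted = 59        # 'showed no CNVs'
pct_one_or_two_quoted = 25  # 'showed only 1 or 2'

# Whatever is left must be neurons carrying more than 2 CNVs.
pct_more_than_two = 100 - pct_none_quoted - pct_one_or_two_quoted

print(f"Neurons with CNVs: {pct_with_cnv:.1f}%")
print(f"Neurons without CNVs: {pct_without_cnv:.1f}%")
print(f"Neurons with more than 2 CNVs: {pct_more_than_two}%")
```

The leftover group of heavily affected neurons comes out close to the 15% said to account for 73% of the CNVs.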