Tag Archives: HIV1

The chemical ingenuity of the AIDS virus

Pop quiz:  You are a virus with under 10,000 nucleotides in your genome.  To make the capsid enclosing your genome, you need to make 250 hexamers of a particular protein.  How do you do it?


Give up?


You grab a cellular metabolite with a mass under 1,000 Daltons to bind the 6 monomers together.  The metabolite occurs at fairly substantial concentrations (for a metabolite) of 10 – 40 microMolar.

What is the metabolite?

Give up?


It has nearly perfect 6 fold symmetry.


Still give up?

[ Nature vol. 560 pp. 509 – 512 ’18 ]  https://www.nature.com/articles/s41586-018-0396-4 says that it’s inositol hexakisphosphate (IP6)  — nomenclature explained at the end. http://www.refinebiochem.com/pages/InositolHexaphosphate.html

Although IP6 looks like a sugar (with 6 CHOH groups forming a 6 membered ring), it is not a typical one because it is not an acetal (no oxygen in the ring).  All 6 hydroxyls of IP6 are phosphorylated.  They bind to two lysines on a short (21 amino acids) alpha helix found in the protein (Gag which has 500 amino acids).  That’s how IP6 binds the 6 Gag proteins together. The paper has great pictures.

It is likely that IP6 is used by other cellular proteins to form hexamers (but the paper doesn’t discuss this).

IP6 is quite symmetric, and 5 of the 6 phosphorylated hydroxyls can be equatorial, so this is likely the energetically favored conformation, given the bulk (and mass) of the phosphate group.

I think that the AIDS virus definitely has more chemical smarts than we do.  Humility is definitely in order.

Nomenclature note:  We’re all used to ATP (Adenosine TriPhosphate) and ADP (Adenosine DiPhosphate) — here all 3 or 2 phosphates form a chain.  Each of the 6 hydroxyls of inositol can be singly phosphorylated, leading to inositol bis, tris, tetrakis, pentakis, hexakis phosphates.  Phosphate chains can form on them as well, so IP7 and IP8 are known (heptakis?, Octakis??)


Bad news on the AIDS front

Bad news for those hoping for an AIDS cure. As you know, the active virus (HIV1) has a genome made of RNA. However, thanks to an enzyme it possesses called reverse transcriptase (which has led to Nobel prizes), it copies itself into DNA and integrates into the genome of lymphocytes. There it sits presumably doing nothing, but it’s always capable of activating and producing more infectious virus.

We seem to have fought the virus to a draw, using a cocktail of drugs which attack different aspects: HAART (Highly Active Antiretroviral Therapy). Success is usually considered being unable to detect viral RNA in the blood (see later). However, blood cells are short-lived. What about the longer-living lymphocytes found in the lymph nodes and spleen?

That’s what was studied in a current paper [ Nature vol. 530 pp. 5` – 45 ’16 ] but in only 3 people. All had no detectable virus in the blood (under 48 copies/milliLiter — an incredibly tiny amount — see later). What they did was to biopsy lymph nodes in the groin on study entry and at 3 and 6 months.

Then they sequenced the genomes of the lymphocytes from the nodes, to study the HIV1 DNA integrated into the genome. They found that the genome changed with time. This is very bad. Why?

Because it implies that, even though you can’t detect the virus in the blood, the virus was not staying latent in the lymph nodes, but coming out of the lymphocytes and forming infectious virus which then mutated. Subsequently the mutated virus integrated into the genome of another lymphocyte. So even with what we consider excellent control, the virus is not purely latent. Drug resistance could arise from mutations (although they didn’t see it in this study).

Clearly, more people need to be studied this way (but serial biopsies? It will probably be done in prisoners, if such things are still done).

It’s worthwhile thinking about how incredibly selective and accurate our methods of analysis are. 48 copies of the viral RNA per milliLiter of blood is the lower limit of detection. Remember that water has a molecular weight of 18, so a liter of distilled water contains 1000 grams / 18 grams per mole = 55.5 moles, making it 55.5 Molar. A mole has 6 x 10^23 molecules. A milliLiter is 10^-3 liters. So 1 milliLiter of distilled water has 55 * 6 * 10^23 * 10^-3 = 3 * 10^22 molecules of water in it, so the assay is finding 48 or more molecules of HIV1 RNA in the water haystack. Even figuring that the concentration of water in blood is 1/10 that of distilled water, this is still impressive.
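For anyone who wants to check the haystack arithmetic, here is a minimal Python sketch of it. The function names are my own inventions for illustration, the constants are the rounded textbook values, and the 1/10 factor for blood is just the deliberately pessimistic guess from the paragraph above, not a measured number.

```python
# Back-of-envelope check of the "needle in a water haystack" estimate.
# All names here are hypothetical helpers, not from the paper.

AVOGADRO = 6.022e23        # molecules per mole
WATER_MOLAR_MASS = 18.0    # grams per mole
WATER_DENSITY = 1000.0     # grams per liter (distilled water)
LITERS_PER_ML = 1e-3


def water_molecules_per_ml(water_fraction=1.0):
    """Molecules of water in 1 mL, scaled by an assumed water fraction."""
    molarity = WATER_DENSITY / WATER_MOLAR_MASS   # about 55.5 mol/L
    return molarity * AVOGADRO * LITERS_PER_ML * water_fraction


def rna_to_water_ratio(copies_per_ml=48, water_fraction=1.0):
    """Ratio of viral RNA copies to water molecules in 1 mL."""
    return copies_per_ml / water_molecules_per_ml(water_fraction)


if __name__ == "__main__":
    print(f"water molecules per mL:   {water_molecules_per_ml():.2e}")
    print(f"RNA : water (pure water): {rna_to_water_ratio():.2e}")
    # Assumption from the text: blood treated as 1/10 the water of distilled water
    print(f"RNA : water (1/10 water): {rna_to_water_ratio(water_fraction=0.1):.2e}")
```

Either way the assay is picking out roughly one RNA molecule per 10^20 to 10^21 water molecules.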

Why Drug Discovery Is So Hard – Reason #22b — Drugs aren’t always doing the things we think they are

One of the things the AIDS virus does to make ‘curing’ AIDS so difficult is hiding. It integrates a DNA copy of its RNA genome into the genome of immune cells (and God knows what else) where it just sits quietly. Activation of the immune cell to fight infection often leads to emergence and production of more virus. One promising mode of therapy is preventing the DNA copy from entering our genome in the first place. The AIDS virus (aka HIV1) produces a protein called Integrase which does that. This has led to the development of integrase inhibitors.

[ Proc. Natl. Acad. Sci. vol. 110 pp. 8327 – 8328, 8690 – 8695 ’13 ] The HIV1 integrase is targeted to sites in chromatin by the host protein LEDGF (Lens Epithelium Derived Growth Factor, aka p75). This work shows that the integrase inhibitors blocking the interaction of LEDGF/p75 (a transcriptional coactivator) with the integrase cause something else: they cause AIDS viruses under construction within the cell to assemble into a noninfectious structure. This happens long after integration and expression of viral RNA and protein. It is thought that the integrase inhibitors inappropriately stabilize integrase dimers in the viral assembly process.

Who knew? They weren’t designed to do that.

For two more examples along these lines please see



We wouldn’t exist if retroviruses weren’t moving around in our genome.

Time for some of the excellent molecular biology I’ve put off writing about while I plow through the new Clayden.  I reached the halfway point today (p. 590), exactly 2 months and 2 weeks after it arrived.  The chemist might need some brushing up on DNA and messenger RNA before pushing on.  Pretty much all the background needed is found in https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and https://luysii.wordpress.com/2010/07/11/molecular-biology-survival-guide-for-chemists-ii-what-dna-is-transcribed-into/.

Everyone has heard of the AIDS virus.  It has so far been impossible to cure because it hides in our DNA doing next to nothing.  Tickle it in a variety of unknown ways, and its DNA is transcribed into messenger RNA (mRNA), the virus is assembled and goes on to wreak havoc with our immune system.  How does the AIDS virus get into our DNA in the first place?  Its genome is made of RNA, not DNA.  It has an enzyme (reverse transcriptase) which transcribes its RNA into DNA, and another enzyme (the integrase, which is actually a complex of proteins) which patches the DNA copy (called cDNA) into our genome.  That’s why we can’t get rid of it.  That’s also why it’s called a retrovirus: because of the retrograde transcription of its RNA into cDNA.

Well, sorry to say, but at least 10% of our DNA is made of retrovirus remnants.  The vast majority of them have been crippled by mutation so their reverse transcriptases  don’t work any more, or there is something wrong with their integrase, etc. etc.  Some of them do make RNA copies of themselves however, but the copies are mutated enough that infectious virus doesn’t form.  But the RNA copies can be reverse transcribed  into cDNA and reinserted back into our DNA, and in a new site to boot.  This is why they are called retrotransposons.

The whole bunch of retroviruses, retrotransposons, and other repetitive elements of DNA have been called ‘junk’ by eminent authority.  Another epithet for them is the selfish gene — which exists only to reproduce itself.  Humans are said to be machines for reproducing human DNA.

Enter  [ Cell vol. 150 pp. 7 – 9, 29 – 38 ’12 ].  Now it’s time for some very human biology.  The fetus represents an immunologically different graft to the mother.  Half its antigens are tolerated because they are maternal, the paternal half are not likely to be.  Allogeneic means a transplant from a different member of the same species, so the fetus is regarded as semiallogeneic.

So why doesn’t our immune system attack the placenta surrounding the fetus, which expresses the paternal proteins?  There’s probably a lot more to it, but a class of immune cells called regulatory T cells (Tregs) shuts down the immune response wherever they are found, and the placenta has lots of them.

Different cells express different proteins, and Tregs are no exception. A transcription factor is something that binds to the DNA in front of a gene, turning on transcription of the gene, ultimately increasing production of the protein the gene codes for. Specificity is obtained by the transcription factor binding to particular sequences of DNA, which are found only in front of a subset of genes.

The transcription factor which turns on genes necessary to turn an immune cell into a Treg is called Foxp3.  Foxp3 is a protein, and to have lots of it around, the gene for it must be turned on so its mRNA can be made.  Guess what?  This means that other transcription factors must bind in front of the Foxp3 gene.
Here’s Jonathan Swift on the subject:
“So nat’ralists observe, a flea
Hath smaller fleas that on him prey,
And these have smaller fleas that bite ’em,
And so proceed ad infinitum.”

An important protein like Foxp3 is highly controlled.  There are 3 distinct regions in front of the gene where other transcription factors and repressors of transcription bind.  They are called conserved nonCoding sequences (CNSs), an oxymoron, because they are clearly coding for something quite important. The 3 sequences are called CNS1, CNS2 and CNS3.  Technology has progressed to the point where we can remove just about any DNA sequence from the mouse genome we wish (the resultant mice are called knockout mice).

Anyway, if you knock out CNS1, the mice resorb semiallogeneic fetuses (where the father and the mother aren’t genetically related), but not syngeneic fetuses (where the genomes of the father and the mother are pretty much the same due to inbreeding).  It’s possible to trace Foxp3 far back in evolution.  Only animals with placentas (eutherians) have CNS1 in addition to CNS2 and CNS3. Marsupials, which don’t have placentas, just have CNS2 and CNS3.

So where do retrotransposons come in?  The structure of CNS1 shows that it is a retrotransposon which moved in front of the Foxp3 gene.  It mutated enough for a new and different set of transcription factors to bind to it and turn on Foxp3 expression in the placenta allowing survival of the fetus.  Some Junk DNA indeed !