Category Archives: Chemistry (relatively pure)

The science behind Cassava Sciences (SAVA)

I certainly hope Cassava Sciences’ new drug Simufilam for Alzheimer’s disease works, for several reasons:

1. It represents a new approach to Alzheimer’s not involving getting rid of the plaque, an approach which has failed miserably

2. The disease is terrible and I’ve watched it destroy patients, family members and friends

3. I’ve known one of the principals (Lindsay Burns) of Cassava since she was a teenager and success couldn’t happen to a nicer person. For details please see https://luysii.wordpress.com/2021/02/02/montana-girl-does-good-real-good/.

Unfortunately, even if Simufilam works, I doubt that it will be widely used because of the side effects (unknown at present) it is very likely to cause.  I certainly hope I’m wrong.

Here is the science behind the drug.  We’ll start with the protein the drug is supposed to affect — filamin A, a very large protein (2,603 amino acids to be exact).  I’ve known about it for years because it crosslinks actin in muscle, and I read everything I could about it, starting back in the day when I ran a muscular dystrophy clinic in Montana.  

Filamin binds actin by its amino terminal domain.  It forms a dimerization domain at its carboxy terminal end.  In between are 23 repeats of 96 amino acids which resemble immunoglobulin — forming a rod 800 Angstroms long.  The dimer forms a V with the actin binding domain at the two tips of the V, making it clear how it could link actin filaments together. 

Immunoglobulins are good at binding things and Lindsay knows of 90 different proteins filamin A binds to.  This is an enormous potential source of trouble.  

As one might imagine, filamin A could have a lot of conformations in addition to the V; see the pictures shown in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2099194/.

One such altered (from the V) conformation binds to the alpha7 nicotinic cholinergic receptor on the surface of neurons and Toll-Like Receptor 4 (TLR4) inside the cell.

Abeta42, the toxic peptide, has been known for years to bind tightly to the alpha7 nicotinic receptor — they say in the femtoMolar (10^-15 Molar) range, although I have my doubts as to whether such tiny concentration values are meaningful.  Let’s just say the binding is tight. 

The altered conformation of filamin A makes the binding of Abeta to alpha7 even tighter. 

In some way, the tight binding causes signaling inside the cell (mechanism unspecified) to hyperphosphorylate the tau protein, which is more directly correlated with dementia in Alzheimer’s disease than the number of senile plaques. 

So what does Simufilam actually do?  It changes the ‘altered’ conformation of filamin A back to normal, decreasing Abeta signaling inside the cell.  

How do they know the conformation of filamin A has changed?  They haven’t done cryoEM or Xray crystallography on the protein.  The only evidence for a change in conformation is a change in the electrophoretic mobility (which is pretty good evidence, but I’d like to know what conformation is changed to what).

Notice just how radical this proposed mechanism of action actually is.  The nicotinic cholinergic receptor is an ion channel, yet somehow the effect of Simufilam is on how the channel binds to another protein, rather than how it conducts ions. 

However they have obtained some decent results with the drug in a very carefully done (though small — 13 patients) study in J. Prev Alz. Dis. 2020 (http://dx.doi.org/10.14283/ipad2020.6) and the FDA this year has given the company the go ahead for a larger phase III trial.

Addendum 26 March: The above link didn’t work.  This one should — it’s from Lindsay herself

https://link.springer.com/article/10.14283/jpad.2020.6

Why, despite rooting for the company and Lindsay, am I doubtful that the drug will find wide use?  We are altering the conformation of a protein which interacts with at least 90 other proteins (Lindsay Burns, Personal Communication).  It seems inconceivable that there won’t be other effects in the neuron (or elsewhere in the body) due to changes in the interaction with the other 89 proteins filamin A interacts with.  Some of them are likely to be toxic. 

To understand anything in the cell you need to understand nearly everything in the cell

Understanding how variants in one protein can either increase or decrease the risk of Parkinson’s disease requires understanding of the following: the lysosome, TMEM175, Protein kinase B, protein moonlighting, ion channel lysoK_GF, dopamine neurons among other things. So get ready for a deep dive into molecular and cellular biology.

It is now 50 years and 6 months since L-DOPA was released in the USA for Parkinson’s disease, and I was tasked as a resident by the chief with running the first L-DOPA clinic at the University of Colorado.  We are still learning about the disease as the following paper Nature vol. 591 pp. 431 – 437 ’21 will show. 

The paper describes a potassium conducting ion channel in the lysosomal membrane called LysoK_GF.  The channel is made from two proteins, TMEM175 and protein kinase B (also known as AKT).

TMEM175 is an ion channel conducting potassium.  It is unlike any of the 80 or so known potassium channels.  It contains two repeats of 6 transmembrane helices (rather than 4) and no pore loop containing the GYG potassium channel signature sequence. Lysosomes lacking it aren’t as acidic as they should be (enzymes inside the lysosome work best at acid pH).  Why loss of a potassium channel should affect lysosomal pH is a mystery (to me at least).

Genome Wide Association Studies (GWAS) have pointed to the genomic region containing TMEM175 as having risk factors for Parkinsonism.  Some variants in TMEM175 are associated with increased risk of the disease and others are associated with decreased risk — something fascinating as knowledge here should certainly tell us something about Parkinsonism.  

The other protein making up LysoK_GF is protein kinase B (also known as AKT). It is found inside the cell, sometimes associated with membranes, sometimes free in the cytoplasm. It is big, containing 481 amino acids. Control of its activity is important, and Cell vol. 169 pp. 381 – 405 ’17 lists 21 separate amino acids which can be modified by such things as acetylation, phosphorylation, sumoylation, N-acetyl glucosamine, proline hydroxylation.  Well 2^21 is 2,097,152, so this should keep cell biologists busy for some time. Not only that, some 100 different proteins which AKT phosphorylates were known as of 2017.  

TMEM175 is opened by conformational changes in AKT.  Normally the enzyme is inactive because the pleckstrin homology domain binds to the catalytic domain inhibiting enzyme activity as the substrate can’t get in.

Remarkably you can make a catalytically dead AKT, and it still works as a controller of TMEM175 activity — this is an example of a moonlighting molecule — for more please see — https://luysii.wordpress.com/2021/01/11/moonlighting-molecules/.

Normally the activity and conformation of AKT is controlled by the metabolic state of the cell (with 21 different molecular knob sites on the protein this shouldn’t be hard).  So the fact that AKT conformation controls TMEM175 conductivity which controls lysosome activity gives the metabolic state of the cell a way to control lysosomal function.  

Notice how to understand anything in the cell you must ask ‘what’s it for’, thinking that is inherently teleological. 

Now on to the two risk factors for Parkinsonism in TMEM175.  The methionine –> threonine mutation at amino acid #393 reduces the lysoK_GF current and is associated with an increased risk of parkinsonism, while the glutamine –> proline mutation at amino acid position #65 gives a channel which remains functional under conditions of nutrient starvation. 

The authors cultured dopamine neurons and found out that the full blooded channel LysoK_GF (TMEM175 + AKT) protected neurons against a variety of insults (MPTP — a known dopamine neuron toxin, hydrogen peroxide, nutrient starvation). 

TMEM175 knockout neurons accumulate more alpha-synuclein — the main constituent of the Lewy body of Parkinsonism.

So it’s all one glorious tangle, but it isn’t just molecular biological navel gazing, because it is getting close to one cause (and hopefully a treatment) of Parkinson’s disease.  

Answer to Friday’s homework problem

2 days ago you were tasked with the following homework problem: Design a protein to capture cholesterol and triglycerides and insert them between the two leaflets of the standard biological membrane similar but not identical to the plasma membrane.

Why not just tell you Nature/God/Evolution’s solution to the problem?  Because unless you’ve thought about how you’d do it, you won’t appreciate the elegance (and beauty to a chemist) of the solution. 

Lipid droplets are how your cells store cholesterol and triglycerides (neutral fats).  Cholesterol and most fats are made in the lumen of the endoplasmic reticulum.  Then they move through the homework protein and accumulate between the two leaflets of the endoplasmic reticulum membrane, growing into lens-like structures with diameters of 400 to 600 Angstroms before they leave to enter the cytoplasm.  

Well clearly to get them between the sheets so to speak a hole must be formed in the membrane leaflet closest to the lumen, and the hole must have open sides so the cholesterol and triglyceride can escape.  

The protein must also catch the lipids in the lumen.  This is accomplished by an 8 stranded beta sandwich.  The protein must also cross the endoplasmic reticulum membrane so the lipids it’s caught can escape out the sides.  

Like a lot of pores in the membrane (such as ion channels), several copies of the protein must come together to form the hole.  In this case the protein contains two transmembrane alpha helices.  It’s hard to count just how many monomers make up the pore, but my guess is 11 or so. 

Here’s a picture

 

The transmembrane (TM) alpha helices are in purple, the beta sandwiches are in blue-green.

8 nm is 8 nanoMeters or 80 Angstroms.  The hole looks to be around 30 Angstroms across — plenty of room to allow cholesterol and triglycerides to enter.  When you look at the top view you see that there is plenty of room between the alpha helices within the membrane for the lipids to escape out the side.  

Here’s the reference https://www.pnas.org/content/pnas/118/10/e2017205118.full.pdf

and the citation Proc. Natl. Acad. Sci. vol. 118 pp. e2017205118 ’21.  It’s a beautiful paper

The protein itself is called seipin, and mutations cause a variety of lipodystrophies, some of which have mental retardation.  The paper has some nice molecular dynamics simulations of seipin in action (if you believe that sort of thing). 

Were you smart enough to figure all this out on your own?  Nature/God/Evolution was.  I wasn’t.

Homework assignment

Design a protein to capture cholesterol and triglycerides and insert them between the two leaflets of the standard biological membrane similar but not identical to the plasma membrane. Answer Sunday night 14 March ’21 

I don’t think we fully grasp the chemical ingenuity of Nature when we discover one of its solutions.   Thinking on your homework assignment will give you a chance to appreciate  just how  chemically clever Nature/Evolution/God actually is. 

TDP43 and the anisosome

Neurologists have been interested in TDP43 (Tar Dna binding Protein of 43 kiloDaltons) for a long time. Mutants cause some cases of ALS (Amyotrophic Lateral Sclerosis — Lou Gehrig disease) and FTD (FrontoTemporal Dementia).  Some 50 different mutations in the protein have been found in cases of these two diseases.  Intracellular inclusions containing TDP are found in > 90% of sporadic ALS (no mutations) and 45% of FTD.

TDP43 contains 414 amino acids (as you might expect for a protein with a 43 kiloDalton mass).  There is an amino terminal ubiquitinlike fold, two RNA Recognition Motifs (RRMs) followed by a glycine rich low complexity sequence prion-like domain at the other (carboxy) end.  The disease causing mutations are found in the low complexity sequence. 
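The link between 414 amino acids and a mass of about 43 kiloDaltons can be checked with a back-of-the-envelope sketch. The figure of roughly 110 daltons per amino acid residue is an assumption (a common rule of thumb, not a number from the paper), so treat the result as a ballpark only.

```python
# Rough protein mass from residue count.
# ASSUMPTION: ~110 daltons per amino acid residue (rule of thumb,
# not taken from the post or the paper).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimated_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kiloDaltons from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


tdp43_kda = estimated_mass_kda(414)
print(f"TDP43: roughly {tdp43_kda:.1f} kiloDaltons")
```

The estimate lands in the mid 40s of kiloDaltons, close to the nominal 43.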

A  phase separated structure (the anisosome) never seen before involves  mutant TDP43 [ Science vol. 371 pp. 585, abb4309 pp. 1 –> 15 ’21 ].  It is a phase separated mass with liquid spherical shells and liquid cores.  The shells showed birefringence — evidence of a liquid crystal.  The cores show the HSP70 chaperone bound to TDP43 (which wasn’t binding RNA).

ATP is required to maintain the chaperone activity of HSP70. When ATP levels are reduced, the anisosome is converted into the protein aggregates seen in ALS and FTD.  So the anisosome is a protective mechanism. 

Biology is clearly leading chemistry around by the nose.  No chemist would ever have predicted something like this, or received a grant to mix all this stuff in a test tube not even thinking about stoichiometry and see what happened.  For more details on phase separation please see an old post — https://luysii.wordpress.com/2020/12/20/neuroscience-can-no-longer-ignore-phase-separation/

Here’s some stuff from that post to whet your appetite

Advances in cellular biology have largely come from chemistry.  Think DNA and protein structure, enzyme analysis.  However, cell biology is now beginning to return the favor and instruct chemistry by giving it new objects to study. Think phase transitions in the cell, liquid liquid phase separation, liquid droplets, and many other names (the field is in flux) as chemists begin to explore them.  Unlike most chemical objects, they are big, or they wouldn’t have been visible microscopically, so they contain many, many more molecules than chemists are used to dealing with.

These objects do not have any sort of definite stoichiometry and are made of RNA and the proteins which bind them (and sometimes DNA).  They go by any number of names (processing bodies, stress granules, nuclear speckles, Cajal bodies, Promyelocytic leukemia bodies, germline P granules).  Recent work has shown that DNA may be compacted similarly using the linker histone [ PNAS vol.  115 pp.11964 – 11969 ’18 ].

The objects are defined essentially by looking at them.  By golly they look like liquid drops, and they fuse and separate just like drops of water.  Once this is done they are analyzed chemically to see what’s in them.  I don’t think theory can predict them now, and they were never predicted a priori as far as I know.

No chemist in their right mind would have made them to study.  For one thing they contain tens to hundreds of different molecules.  Imagine trying to get a grant to see what would happen if you threw that many different RNAs and proteins together in varying concentrations.  Physicists have worked for years on phase transitions (but usually with a single molecule — think water).  So have chemists — think crystallization.

Proteins move in and out of these bodies in seconds.  Proteins found in them do have low complexity of amino acids (mostly made of only a few of the 20), and unlike enzymes, their sequences are intrinsically disordered, so forget the key and lock and induced fit concepts for enzymes.

Are they a new form of matter?  Is there any limit to how big they can be?  Are the pathologic precipitates of neurologic disease (neurofibrillary tangles, senile plaques, Lewy bodies) similar?  There certainly are plenty of distinct proteins in the senile plaque, but they don’t look like liquid droplets.

It’s a fascinating field to study.  Although made of organic molecules, there seems to be little for the organic chemist to say, since the interactions aren’t covalent.  Time for physical chemists and polymer chemists to step up to the plate.

 

Proteins (and amyloids) still have some tricks up their sleeves

We all know that amyloids are made of beta sheets stacked on top of each other. Not all of them, says Staph aureus according to PNAS e2014442118 ’21. In fact one protein it produces, Phenol Soluble Modulin alpha 3 (PSMα3), which is toxic to human immune cells, forms amyloid made of alpha helices.  PSMα3 forms cross-α amyloid fibrils that are composed entirely of amphipathic α-helices. The helices stack perpendicular to the fibril axis into mated “sheets”.

However other members of the family namely PSMα1 and PSMα4 adopt the classic amyloid ultrastable cross-β architecture and are likely to serve as a scaffold rendering the biofilm a more resistant barrier.

It gets worse.

Consider an antimicrobial peptide (AMP) called uperin 3.5, secreted on the skin of a frog which also forms amyloid fibrils made of alpha helices.  The amyloid is  essential for uperin 3.5’s  toxic activity against the Gram-positive bacterium Micrococcus luteus.

It gets even worse.  

When secreted onto the frog skin uperin 3.5 has a disordered structure. Uperin 3.5 requires bacterial membranes to form the toxic amyloid made of alpha helices.   When no membranes are around, uperin 3.5 still forms amyloid, but this time the amyloid is of the classic beta sheet.  So one protein can form two types of amyloid.  Go figure.

Uperin 3.5 is a classic example of a chameleon protein. 

The uses and abuses of molarity — III

2 dimensional gases were made years and years ago but no one talks about them any more. I found them fascinating as a neophyte chemistry student. Just take a typical fatty acid with a long hydrophobic tail (say stearic acid with 18 carbons) and place a small amount on water. The COOH groups hydrogen bond with the water, while the hydrocarbon tails lie on the surface of the water. Confine them to a small area, and the hydrophobic tails stick straight up away from the air water interface. Now constrict the area they are found in. The force per unit length on the wall forming the constriction (the two dimensional pressure P) is proportional to the number of molecules and the temperature, and inversely proportional to the area A — i.e. PA = nRT — the ideal gas law in two dimensions. So confined stearic acid on the surface of water is a two dimensional gas.
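To get a feel for the sizes involved, here is a minimal sketch of the two dimensional gas law written per molecule (P * A = N * k * T, with Boltzmann's constant k standing in for R because we count molecules rather than moles). The area of 1000 square Angstroms per molecule and the temperature of 298 K are hypothetical inputs chosen for illustration, not measurements.

```python
BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant, joules per kelvin
ANGSTROM2_TO_M2 = 1e-20           # one square Angstrom in square meters


def surface_pressure_n_per_m(area_per_molecule_m2: float,
                             temperature_k: float) -> float:
    """Two dimensional ideal gas: P = k * T / (area per molecule).

    The result is a force per unit length (newtons per meter),
    the 2D analogue of pressure.
    """
    return BOLTZMANN_J_PER_K * temperature_k / area_per_molecule_m2


# HYPOTHETICAL inputs: 1000 square Angstroms per molecule at 298 K.
p = surface_pressure_n_per_m(1000 * ANGSTROM2_TO_M2, 298.0)
print(f"surface pressure: {p * 1000:.2f} milliNewtons/meter")
```

Halving the area per molecule doubles the surface pressure, just as halving the volume doubles the pressure of an ordinary ideal gas.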

It would be nice if we could get a similar 2 dimensional arrangement of G Protein Coupled Receptors (GPCRs) — see the previous post — but we can’t (so far). 

Of course there is a darker side. The films are known as Langmuir Blodgett films.

Irving Langmuir won the Nobel prize in Chemistry for this (and other work). Blodgett who was instrumental in figuring out how to make the films got nothing. 

Why?  Probably because she was a woman — https://en.wikipedia.org/wiki/Katharine_Burr_Blodgett.  She was a Bryn Mawr graduate and the first woman to receive a PhD in physics from Cambridge. 

Moving along to another Bryn Mawr graduate; Candace Pert really discovered the opiate receptor at Johns Hopkins. She was screwed out of proper recognition by her PhD advisor, Solomon Snyder.  While he now has a department named after him at  Hopkins,  he will never receive the Nobel prize. 

The story of Rosalind Franklin and DNA is too well known to repeat.  So I’ll close with Lise Meitner who discovered nuclear fission and got nothing except a book from an old girl friend — https://www.amazon.com/Lise-Meitner-Ruth-Lewin-Sime/dp/0520208609.  The authoress notes in the preface that she was the female chemist that the department didn’t want.  Definitely a woman with an edge, which is why I was attracted to her. 

Now, as promised, here is the Nobelist who clearly doesn’t understand Molarity.

The chemist can be excused for not knowing what a nanodomain is. They are beloved by neuroscientists, and defined as the part of the neuron directly under an ion channel in the neuronal membrane. Ion flows in and out of ion channels are crucial to the workings of the nervous system. Tetrodotoxin, which blocks one of them, is 100 times more poisonous than cyanide. 25 milliGrams (roughly 1/3 of a baby aspirin) will kill you.

Nanodomains are quite small, and Proc. Natl. Acad. Sci. vol. 110 pp. 15794 – 15799 ’13 defines them as hemispheres having a radius of 10 nanoMeters from the channel (a nanoMeter is 10^-9 meter — I want to get everyone on board for what follows, I’m not trying to insult your intelligence). The paper talks about measuring concentrations of calcium ions in such a nanodomain. Previous work by a Nobelist (Neher) came up with 100 microMolar elevations of calcium in nanodomains when one of the channels was opened. Yes, evolution has produced ion channels permeable to calcium and not much else, sodium and not much else, potassium and not much else. For details read the papers of Roderick MacKinnon (another Nobelist). The mechanisms behind this selectivity are incredibly elegant — and I can tell you that no one figured out just what they were until we had the actual structures of channels in hand. As chemists you’re sure to get a kick out of them.

The neuroscientist (including Neher the Nobelist) cannot be excused for not understanding the concept of concentration and its limits.

How many ions are in a cc. of a 1 molar solution of calcium?  6 * 10^20 (Avogadro’s #/1000; a cc., or cubic centimeter, is 1/1000th of a liter). How many ions are in a cc. of a 10^-4 molar solution (100 microMolar)?  6 * 10^16. How many calcium ions are in a nanodomain at this concentration? A hemisphere of radius 10 nanoMeters has a volume of about 2 * 10^-18 cc., so about 5 * 10^17 nanodomains fit in a cc. That gives just (6 * 10^16)/(5 * 10^17), i.e. just over .1 ion/nanodomain. As Bishop Berkeley would say, this is the ghost of a departed ion.
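For anyone who wants to check the arithmetic, here is a minimal sketch of the same calculation. It assumes only the hemisphere of radius 10 nanoMeters and the 100 microMolar figure quoted above; the function names are mine.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def hemisphere_volume_liters(radius_nm: float) -> float:
    """Volume of a hemisphere, converted to liters.

    1 nanoMeter = 1e-8 decimeter, and 1 cubic decimeter = 1 liter.
    """
    radius_dm = radius_nm * 1e-8
    return (2.0 / 3.0) * math.pi * radius_dm ** 3


def ions_in_volume(molar: float, liters: float) -> float:
    """Average number of ions in a volume at a given molar concentration."""
    return molar * AVOGADRO * liters


nanodomain_liters = hemisphere_volume_liters(10.0)
nanodomains_per_cc = 1e-3 / nanodomain_liters  # a cc. is 1/1000th of a liter
ions = ions_in_volume(100e-6, nanodomain_liters)

print(f"nanodomains per cc: {nanodomains_per_cc:.2e}")
print(f"calcium ions per nanodomain at 100 microMolar: {ions:.2f}")
```

The sketch gives a bit under 5 * 10^17 nanodomains per cc. and a bit over a tenth of an ion per nanodomain.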

Does any chemist out there think that speaking of a 100 microMolar concentration in a volume this small is meaningful? I’d love to be shown how my calculation is wrong, if anyone would care to post a comment.

 

The uses and abuses of molarity — II

Just as the last post showed why a 1 Molar solution of a protein makes no sense at all, it is reasonable to ask what the highest concentration of a single protein in the cellular environment could be. Strangely, it was very hard for me to find an estimate of the percentage of protein mass inside a eukaryotic cell. There is one for the red blood cell, which is essentially a bag of hemoglobin. The amount is 33 grams/deciliter or 330 grams/liter. Hemoglobin (which is a tetramer) has a molecular mass of 64,000 Daltons.  So that’s 330/64000 = 5 x 10^-3 Molar, or about 5 milliMolar.   So all proteins in our cells have a maximum concentration at most in the milliMolar range.
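The hemoglobin estimate is a single division, sketched here with the two numbers quoted above (330 grams/liter and 64,000 grams/mole). It works out to about 5 milliMolar.

```python
def molarity(grams_per_liter: float, molar_mass_g_per_mol: float) -> float:
    """Molar concentration = (grams per liter) / (grams per mole)."""
    return grams_per_liter / molar_mass_g_per_mol


# Hemoglobin in the red blood cell, using the figures in the text.
hb_molar = molarity(330.0, 64000.0)
print(f"hemoglobin: {hb_molar * 1000:.1f} milliMolar")
```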

Before moving on, how do you think the red blood cell gets its energy?  Amazingly it is by anaerobic glycolysis, not using the oxygen carried by hemoglobin at all.  Why? If it used oxidative phosphorylation which runs on oxygen, it would burn up.  That’s why red cells do not contain mitochondria. 

On to Kd the dissociation constant.  At least 475 FDA approved drugs target G Protein Coupled Receptors (GPCRs), and our genome codes for some 826 of them.  Almost 500 of them code for smell receptors, and of the 300 or so not involved with smell 1/3 are orphans (as of 2019) with no known ligand.  There are GPCRs for all neurotransmitters which is why neurologists and psychiatrists are very interested in them. 

The Kd is defined as [ free ligand ][ free receptor ]/ [ ligand bound to receptor]  where all the  [  ]’s are concentrations in Moles/liter (e.g. Molar concentrations). 
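For the case where the definition does apply (ligand and receptor both free in solution), here is a minimal sketch of what Kd tells you. The fraction of receptor occupied follows directly from the definition: bound/(bound + free receptor) = [ligand]/([ligand] + Kd). The concentrations in the example are hypothetical.

```python
def dissociation_constant(free_ligand_molar: float,
                          free_receptor_molar: float,
                          complex_molar: float) -> float:
    """Kd = [free ligand][free receptor] / [ligand bound to receptor]."""
    return free_ligand_molar * free_receptor_molar / complex_molar


def fraction_bound(free_ligand_molar: float, kd_molar: float) -> float:
    """Fraction of receptor carrying ligand, from the Kd definition."""
    return free_ligand_molar / (free_ligand_molar + kd_molar)


# HYPOTHETICAL equilibrium: 1 nanoMolar each of free ligand,
# free receptor and complex.
kd = dissociation_constant(1e-9, 1e-9, 1e-9)
print(f"Kd = {kd:.1e} Molar")
print(f"fraction bound when [ligand] = Kd: {fraction_bound(kd, kd):.2f}")
```

When the free ligand concentration equals Kd, half the receptor is occupied, which is why Kd is read as a measure of how tight the binding is.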

There’s the rub.  Kd makes sense when ligand and receptor are swimming around in solution, but GPCRs never do this.  The working GPCR is embedded in our cell membrane, which topologists tell us is a 2 dimensional manifold embedded in 3 dimensional space.  What does concentration mean in a situation like this?  Think of the entropy involved in getting all the GPCRs to lie in a single plane.  Obviously not so simple.  

People get around this by using radioactive ligands, and embedding GPCRs in membranes and measuring the time for ligands to bind and unbind (e.g. kinetics), but this is miles away from the physiologic situations — for details please see

2019 Apr 5; 485: 9–19.
 
The same is true for other proteins of interest — ion channels for the neurologist, hormone receptors for the endocrinologist, angiotensin converting enzyme 2 (ACE2) for the pandemic virus.  
 
I think that all Kd’s of membrane embedded receptors do is give you an ordinal ordering (e.g. receptor A binds ligand B tighter than ligand C ) but not a quantitative one.
 
Next up, how a Nobel prizewinner totally misunderstood the nature and applicability of molarity and studies on a two dimensional gas (complete with Pressure * Area = n * Gas Constant * Temperature).
 
 

 

The uses and abuses of Molarity

Quick what does a one Molar solution of a protein look like?

Answer: It doesn’t. The average protein mass is 100 kiloDaltons — http://book.bionumbers.org/how-big-is-the-average-protein/. That’s 100,000 grams per mole (100 kilograms).

A mole of any chemical is Avogadro’s number of it — or 6.02 x 10^23.  The molar mass counts 1 gram for each hydrogen it contains, 12 for each carbon etc. etc. 

A 1 molar concentration of any chemical is its molecular mass in grams dissolved in 1 liter of water, which is 1,000 cubic centimeters (cc.).  The density of water is pretty much the same between 32 and 212 Fahrenheit (or 0 – 100 centigrade).  

What is the molar concentration of water, e.g. how many moles of water are in a liter of water?  The molecular mass of water is 18 so there are 1000/18 = 55.6 moles of water per liter of water.  

Well you can’t get 220 pounds of our 100 kiloDalton protein into 2.2 pounds (1 kiloGram) of water.  You could decorate each of the 6.02 x 10^23 protein molecules with 55 waters. 
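The numbers behind this are easy to check. Here is a minimal sketch using the figures from the text: a 100 kiloDalton protein, water at 18 grams/mole and 1,000 grams of water per liter.

```python
WATER_MOLAR_MASS_G = 18.0        # grams per mole, as in the text
WATER_GRAMS_PER_LITER = 1000.0   # 1 kiloGram of water per liter
PROTEIN_MOLAR_MASS_G = 100_000.0  # the 100 kiloDalton "average" protein

# Moles of water in a liter of water.
water_molar = WATER_GRAMS_PER_LITER / WATER_MOLAR_MASS_G

# Mass of protein needed for a 1 Molar solution, relative to the water.
protein_to_water_mass_ratio = PROTEIN_MOLAR_MASS_G / WATER_GRAMS_PER_LITER

print(f"water is {water_molar:.1f} Molar")
print(f"1 mole of protein weighs {protein_to_water_mass_ratio:.0f} times "
      f"the liter of water meant to dissolve it")
```

One mole of protein against 55.6 moles of water is where the 55 or so waters per protein molecule comes from.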

Why belabor the obvious?  Because numbers are infinitely divisible and it is possible to talk about concentrations given in moles which make no chemical sense. Why?  Because matter is not infinitely divisible.  Divisibility for chemists stops at the atom level. 

Now let’s do some biology.  Cell size is measured in microns or 10^-6 meters.   A liter is a cube 10 centimeters on a side, so it is 10^-3 cubic meters.  A cubic micron is 10^-18 cubic meters, so there are 10^15 cubic microns in a liter. 

Now let’s put 1 molecule in our cubic micron, and in each and every cubic micron in a liter of water.  What is its concentration in moles?  Our liter contains 10^15 molecules of our chemical, so its Molar concentration is 10^15/(6.02 * 10^23) or .16 x 10^-8 or 1.6 x 10^-9 or 1.6 nanoMolar.    So 1 cubic micron is the volume at which concentrations less than 1.6 nanoMolar make no sense. 
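The same reasoning works for any volume, so here is a minimal sketch that turns it into a function: the concentration that corresponds to exactly one molecule in a given volume. The function name is mine.

```python
AVOGADRO = 6.02e23               # molecules per mole, as in the text
CUBIC_MICRONS_PER_LITER = 1e15   # a liter is 10^15 cubic microns


def one_molecule_molarity(volume_cubic_microns: float) -> float:
    """Molar concentration equivalent to a single molecule in the volume.

    Below this concentration the volume holds, on average,
    less than one molecule.
    """
    volumes_per_liter = CUBIC_MICRONS_PER_LITER / volume_cubic_microns
    return volumes_per_liter / AVOGADRO


floor_1_cubic_micron = one_molecule_molarity(1.0)
print(f"1 cubic micron: {floor_1_cubic_micron * 1e9:.2f} nanoMolar")
```

The floor scales inversely with volume: ten times the volume, one tenth the minimum meaningful concentration.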

It should be noted that 1 cubic micron contains plenty of water molecules to dissolve our molecule.  The actual number:

55 x 6.02 x 10^23/10^15 = 331 x 10^8  = 3 x 10^10 of them.

Notice that the mass of the molecule makes no difference.  Molar means moles/liter and liter is just a volume.  The number of molecules is what is crucial. 

As the volume goes up 1 molecule/volume makes sense at lower and lower concentrations. 

At this point the physicist says ‘consider a spherical cow’.  The biologist doesn’t have to.  We have lymphocytes which are nearly spherical with diameters ranging from 6 to 14 microns. 

Call it 10 microns.  Then the volume of our lymphocyte is 4/3 * pi * 5^3 = 524 cubic microns (call it 1,000 cubic microns to make things easier).  Recall that a liter contains 10^15 cubic microns.  So a liter can contain at most 10^12 lymphocytes, or 10^12 of our molecules, so their concentration is 10^12/(6.02 * 10^23) or 1.6 x 10^-12 Molar, or 1.6 picoMolar.   Molar concentrations lower than 1.6 picoMolar make no chemical or biological sense in volumes of 1000 cubic microns. 
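Here is a minimal sketch of the lymphocyte numbers, treating the cell as a sphere 10 microns in diameter. It shows both the rounded volume of 1,000 cubic microns and the unrounded one; the rounding changes the answer by less than a factor of 2.

```python
import math

AVOGADRO = 6.02e23               # molecules per mole, as in the text
CUBIC_MICRONS_PER_LITER = 1e15


def sphere_volume(diameter_microns: float) -> float:
    """Volume of a sphere in cubic microns."""
    return (4.0 / 3.0) * math.pi * (diameter_microns / 2.0) ** 3


def one_molecule_molarity(volume_cubic_microns: float) -> float:
    """Molar concentration equivalent to one molecule per volume."""
    return CUBIC_MICRONS_PER_LITER / volume_cubic_microns / AVOGADRO


lymphocyte_volume = sphere_volume(10.0)
floor_rounded = one_molecule_molarity(1000.0)
floor_unrounded = one_molecule_molarity(lymphocyte_volume)

print(f"volume: {lymphocyte_volume:.0f} cubic microns")
print(f"one molecule per 1,000 cubic microns: {floor_rounded * 1e12:.1f} picoMolar")
print(f"one molecule per lymphocyte: {floor_unrounded * 1e12:.1f} picoMolar")
```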

Are there chemicals in the lymphocyte with concentrations that low?  Sure there are.  Each chromosome is a molecule, so in male lymphocytes there is exactly one X chromosome and one Y. 

Next up.  Is a dissociation constant (Kd) in the femtoMolar (10^-15 Molar) range biologically meaningful?   I’m not sure and am still thinking about it, but the answer has some relevance to Alzheimer’s disease. 

Hydrogen bonding — again, again

I’ve been thinking about hydrogen bonding ever since my senior thesis in 1959. Although its role in the protein alpha helix had been known since ’51 and in the DNA double helix since ’53, little did we realize at the time just how important it would be for the workings of the cell. So I was lucky Dr. Schleyer put me at an IR spectrometer and had me make a bunch of compounds, to look for hydrogen bonding of OH, NH and SH to the pi electrons of the benzene ring. I had to make a few of them, which involved getting a (CH2)n chain between the benzene ring and the hydrogen donor. Just imagine the benzene as the body of a scorpion and the (CH2) groups as the length of the tail.  The SH compounds were particularly nasty, and people would look at their shoes when I’d walk into the eating club. Naturally the college yearbook screwed things up and titled my thesis “Studies in Hydrogen Bombing”, to which my parents’ friends would say — he looks like such a nice young man, why was he doing that?

At any rate I’m going to talk about a recent paper [ Science vol. 371 pp. 160 – 164 ’21 ] on the nature of the bond in the F H F – anion.  It’s going to be pretty hard core stuff with relatively little explanatory material. You’ve either been previously exposed to this stuff or you haven’t.  So this post is for the cognoscenti.  Hold on, it’s going to be a wild ride.

In conventional hydrogen bonds, the donor (D) atom is separated from the Acceptor atom (A) by 2.7 Angstroms or more, and the hydrogen nucleus is found closer to D where the potential energy minimum is found.

So it looks like this: D – H . . . A

The D-H bond isn’t normal, but is stretched and weakened.  This means that it takes less energy to stretch it, so it absorbs infrared radiation at a lower frequency (longer wavelength) — red shift if you will. 

Such is what we were looking for and we found it comparing 

Benzene (CH2)n OH vibrations to those of butanol, pentanol, hexanol, etc. etc., and cyclohexane (CH2)n OH.

As the D – A distance shrinks there is ultimately a flat bottomed single well potential, where H becomes a confined particle (but still delocalized) between D and A.

The vibrations of protons in hydrogen bonds deviate markedly from the classic quantum harmonic oscillator beloved by physicists.  Here the energy levels on solving the classic H psi = E psi equation of quantum mechanics are evenly spaced (see Lancaster & Blundell “Quantum Field Theory” p. 20.)

However in real molecules, as you ascend the vibrational ladder, conventional hydrogen bonds show a decrease in the difference between energy levels (positive anharmonicity).  By contrast, when proton confinement dictates the potential shape in short hydrogen bonds (when D and A are close together, mimicking the particle in a box model in quantum mechanics) the spacing between states increases (negative anharmonicity).
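For readers who want the three cases side by side, here is a sketch using standard textbook model potentials. These are illustrations of the spacing patterns, not the potentials fitted in the paper; the anharmonicity constant x_e and the box width L are the usual textbook symbols and are assumptions here.

```latex
% Harmonic oscillator: evenly spaced levels
E_n = \hbar\omega\left(n + \tfrac{1}{2}\right),
\qquad E_{n+1} - E_n = \hbar\omega

% Anharmonic (Morse-type) oscillator with x_e > 0:
% spacing shrinks as n rises (positive anharmonicity)
E_n = \hbar\omega\left(n + \tfrac{1}{2}\right)
      - \hbar\omega\, x_e \left(n + \tfrac{1}{2}\right)^2,
\qquad E_{n+1} - E_n = \hbar\omega\left[1 - 2x_e(n+1)\right]

% Particle of mass m in a box of width L (n = 1, 2, 3, ...):
% spacing grows as n rises (negative anharmonicity)
E_n = \frac{n^2\pi^2\hbar^2}{2mL^2},
\qquad E_{n+1} - E_n = \frac{(2n+1)\,\pi^2\hbar^2}{2mL^2}
```

The conventional hydrogen bond behaves like the middle case, and the short, confined one like the last.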

The present work shows that in FHF- the proton motion is superharmonic — https://en.wikipedia.org/wiki/Subharmonic_function — which they don’t describe very well. 

When the F F distance gets below 2.4 Angstroms, covalent bonding starts to become a notable contributor to the short hydrogen bond, and the authors actually have evidence that there is overlap in FHF- between the 1s orbital of H and the 2pz orbitals of the donor and the acceptor atoms, yielding a stabilization of the resulting molecular orbital. 

Is that cool or what?  The bond sits right on the borderland between a covalent bond and a hydrogen bond, taking on aspects of both.