Category Archives: Chemistry (relatively pure)

4 diseases explained at one blow said the protein chemist — part 2 — TDP43

A brilliant paper [ Science vol. 377 eabn5582 pp. 1 –> 20 ’22 ] explains how changing a single amino acid (proline) to another  can cause 4 different diseases, depending on the particular protein it is found in (and which proline of many is changed).

There is so much in this paper that it will take several posts to go over it all.  The chemistry in the paper is particularly fine.  So it’s back to Biochemistry 101 and the alpha helix and the beta sheet.

A lot of the paper concerns TDP43, a protein familiar to neurologists because it is involved in FTD-ALS (FrontoTemporal Dementia — Amyotrophic Lateral Sclerosis) and ALS itself.

I actually saw a case early in training.  I had been taught that ALS patients remained cognitively intact until the end (certainly true in my experience; think of Stephen Hawking), so I was puzzled by an ALS patient who was mildly demented.  My education was deficient at that time and I’d never heard of FTD-ALS, so I wrote in the chart “we’re missing something here”.  These were calmer times in the medical malpractice world.

TDP43 is a protein with a lot of different parts in its 414 amino acids.  There are two regions which bind to RNA (RNA Recognition Motifs, or RRMs), and a glycine rich low complexity domain at the carboxy terminal end.

TDP43 proteins are found in the neuronal inclusions of ALS (interestingly, these weren’t recognized when I was in training).  The low complexity domain of TDP43 aggregates and forms fibers.  Some 50 different mutations have been found here in patients.

Just this year the cryoEM structures of TDP43 aggregates from two patients with FTD-ALS were described [ Nature vol. 601 pp. 29 – 30, 139 – 143 ’22 ].  It appears to be a typical amyloid structure with all 79 amino acids (from # 282 Glycine to # 360 Glutamine) in a single plane.  The actual paper is likely behind a paywall, but if you can get it, look at figure 2 p. 140, which has the structure.  Who would have ever thought that a protein could flatten out this much?

Both structures were from TDP-43 with none of the 24 mutations known to cause FTD-ALS.

But that’s far from the end of the story.  The same area of TDP43 can also form liquid droplets (perhaps the precursor of the fibers).  But that’s where the brilliant chemistry of [ Science vol. 377 eabn5582 pp. 1 –> 20 ’22 ] comes in.

That’s for next time.  After that, I should be finished with Needham and will have time to write about 6 or so of the interesting papers I’ve run across in the past 6 months.

We interrupt this program . . .

I’ll interrupt the series of posts on the brilliant article [ Science vol. 377 eabn5582 pp. 1 –> 20 ’22 ] to talk about working with the very frightening diazomethane 61 years ago.

I was able to convince Woodward to let me work on an idea of mine to show that carbenes were generated by photolysis of a diazo compound (this was suspected but not known at the time).

Here’s the idea

1. Condense acrylic acid with cyclopentadiene by a Diels-Alder reaction.  Because of steric effects the acid points below the ring.

2. Form the acyl chloride

3. React with diazomethane to form the diazocarbonyl (no change in the orientation of the carbonyl relative to the ring).

4. Photolyze — if  a carbene is formed, it’s in perfect position to form a cyclopropane on the other side of the ring which if formed would pretty much prove the point.

Diazomethane was known to be quite explosive, and I spent a lot of time tiptoeing around the lab when working with it.  Combine this with the worst lab technique in the world and I couldn’t get things to work.  Subsequently the idea was shown to be correct, and an enormous amount of work has been done on carbenes.

So why interrupt the flow of posts about the brilliant  [ Science vol. 377 eabn5582 pp. 1 –> 20 ’22 ] ?

Because Science vol. 377 pp. 649 – 654 ’22 reports a simple (and nonexplosive) way to form carbenes from aldehydes.  Here’s what they say

“Common aldehydes are readily converted (via stable α-acyloxy halide intermediates) to electronically diverse (donor or neutral) carbenes to facilitate >10 reaction classes. This strategy enables safe reactivity of nonstabilized carbenes from alkyl, aryl, and formyl aldehydes via zinc carbenoids. Earth-abundant metal salts [iron(II) chloride (FeCl2), cobalt(II) chloride (CoCl2), copper(I) chloride (CuCl)] are effective catalysts for these chemoselective carbene additions to σ and π bonds.”

How I wish I had had this back then.

4 diseases explained at one blow said the protein chemist — part 1

A brilliant paper [ Science vol. 377 eabn5582 pp. 1 –> 20 ’22 ] explains how changing a single amino acid (proline) to another  can cause 4 different diseases, depending on the particular protein it is found in (and which proline of many is changed).

There is so much in this paper that it will take several posts to go over it all.  The chemistry in the paper is particularly fine.  So it’s back to Biochemistry 101 and the alpha helix and the beta sheet.

Have a look at this

If you can tell me how to get a picture like this into a WordPress post please make a comment.

The important point is that hydrogen bonds between the amide hydrogen of one amino acid and the carbonyl group of another hold the alpha helix and the beta pleated sheet together.

Enter proline.  Proline when not embedded in a protein has a hydrogen on the nitrogen atom in the ring.  When proline is joined to another amino acid by a peptide bond in a protein, the hydrogen on the nitrogen is no longer present.  So the hydrogen bond helping to hold the two structures (alpha helix and beta sheet) together is no longer present at proline, and alpha helices and beta sheets containing proline are not as stable.  Prolines after the fourth amino acid of the alpha helix (i.e. after the first turn of the helix) produce a kink.  The proline can’t adopt the alpha helical configuration of the backbone and it can’t hydrogen bond.

But it’s even worse than that (and this observation may even be original).  Instead of a hydrogen bonding to the free electrons of the oxygen in the carbonyl group you have the two electrons on the nitrogen jammed up against them.  This costs energy and further destabilizes both structures.

Being a 5 membered ring which contains the alpha carbon of the amino acid, proline in proteins isn’t as flexible as other amino acids.

This is why proline is considered to be a helix breaker, and is used all the time in alpha helices spanning cellular membranes to cause kinks, giving them more flexibility.
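To make the hydrogen bond bookkeeping concrete, here is a minimal Python sketch.  It assumes the textbook pattern for an ideal alpha helix (the carbonyl of residue i accepts a hydrogen bond from the amide N-H of residue i+4) and simply counts a bond as lost whenever the would-be donor is proline.  The function name and the poly-alanine test sequences are made up for illustration; nothing here comes from the Science paper.

```python
def helix_hbonds(seq):
    """Backbone H-bonds of an idealized alpha helix.

    Assumes C=O of residue i pairs with N-H of residue i+4.
    Returns (kept, lost) as lists of 1-based (acceptor, donor) positions;
    a bond is 'lost' when the donor residue is proline (no amide hydrogen).
    """
    kept, lost = [], []
    for i in range(len(seq) - 4):
        donor = i + 4
        pair = (i + 1, donor + 1)  # convert to 1-based positions
        if seq[donor] == "P":
            lost.append(pair)
        else:
            kept.append(pair)
    return kept, lost


# Hypothetical 12-residue helix with a proline at position 7
kept, lost = helix_hbonds("AAAAAAPAAAAA")
print("lost:", lost)

# A proline in the first turn (position 2) is never a donor in this scheme
print("lost:", helix_hbonds("APAAAAAAAAAA")[1])
```

The second call shows why a proline in the first four residues is tolerated: those amide hydrogens have no partner inside the helix anyway.  The sketch says nothing about the kink or the lone pair repulsion, which are geometric and energetic effects.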

There is much more to come — liquid liquid phase separation, prion like domains, low complexity sequences, frontotemporal dementia with ALS, TDP43, amyloid, Charcot Marie Tooth disease and Alzheimer’s disease.

So, for the present stare at the link to the diagram above.

Apologies for another posting delay

Hopefully the post on the paper I’m so impressed with will be out in the next few days.  I’ve been clearing away the underbrush in Needham’s Visual Differential Geometry and Forms before the final push on the Einstein field equation and Riemannian geometry.

Apologies for the delay

Here’s a clue for you all to think about — what effects does proline have on (1) the alpha helix (2) the beta pleated sheet?

Amyloid Structure at Last ! 3 The Alzheimer mutations

I am republishing this post from last October, because the excellent paper I’m going to write about has similar thinking.

The chemistry explaining why these mutations are associated with Alzheimer’s disease is exquisite, and it points to the amyloid fibril as ‘the’ cause of Alzheimer’s disease.  Even so, billions have been spent in attempts to remove amyloid fibrils with no useful therapeutic result (and some harm).

Here’s the old post

The structure of the amyloid fibril formed by the aBeta42 peptide exactly shows why certain mutations are associated with hereditary Alzheimer’s disease.   Here is a picture

Scroll down to the picture above “Bonds that Tie”

If you need some refreshing on the general structure of amyloid, have a look at the first post in the series —

Recall that in amyloid fibrils the peptide backbone is flat as a flounder (well in a box 4.8 Angstroms high) with the amino acid side chains confined to this plane.  The backbone winds around in this plane like a snake.  The area in the leftmost loop is particularly crowded with bulky side chains of glutamic acid (single letter E) at position 22 and aspartic acid (single letter D) at position 23 crowding each other.  If that wasn’t enough, at the physiologic pH of 7 both acids are ionized, hence negatively charged.  Putting two negative charges next to each other costs energy and makes the sheet making up the fibril less stable.

The marvelous paper (the source for much of this) Cell vol. 184 pp. 4857 – 4873 ’21 notes that there are 3 types of amyloid (pathological, artificial, and functional) and that the pathological amyloids are the most stable.  Why this should be so will be the subject of a future post, but accept it as fact for now.

In 2007 there were 7 mutations associated with familial Alzheimer’s disease (10 years later there were 11). Here are 5 of them.

Glutamic Acid at 22 to Glycine (Arctic)

Glutamic Acid at 22 to Glutamine (Dutch)

Glutamic Acid at 22 to Lysine (Italian)

Aspartic Acid at 23 to Asparagine (Iowa)

Alanine at 21 to Glycine (Flemish)

All of them lower the energy of the amyloid fiber.

Here’s why

Glutamic Acid at 22 to Glycine (Arctic) — glycine is the smallest amino acid (side chain hydrogen) so this relieves crowding.  It also removes a negatively charged amino acid next to the aspartic acid.  Both lower the energy

Glutamic Acid at 22 to Glutamine (Dutch) — really no change in crowding, but it removes a negative charge next to the negatively charged Aspartic acid

Glutamic Acid at 22 to Lysine (Italian)– no change in crowding, but the lysine is positively charged at physiologic pH, so we have a positive charge next to the negatively charged Aspartic acid, lowering the energy

Aspartic Acid at 23 to Asparagine (Iowa) –really no change in crowding, but it removes a negative charge next to the negatively charged Glutamic acid next door

Alanine at 21 to Glycine (Flemish) — no change in charge, but a reduction in crowding as alanine has a methyl group and glycine a hydrogen.
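The five explanations above follow one rule of thumb, which can be sketched in a few lines of Python.  The side chain charges are the usual textbook values near pH 7.  The count of non-hydrogen side chain atoms is my own crude stand-in for ‘crowding’, not something taken from the Cell paper, and the function name is made up.

```python
# Net side chain charge near physiologic pH (textbook values)
CHARGE = {"E": -1, "D": -1, "K": +1, "Q": 0, "N": 0, "G": 0, "A": 0}

# Non-hydrogen atoms in the side chain: a crude proxy for bulk (assumption)
HEAVY_ATOMS = {"G": 0, "A": 1, "D": 4, "N": 4, "E": 5, "Q": 5, "K": 5}

# Wild type residues at the three positions discussed in the post
WILD_TYPE = {21: "A", 22: "E", 23: "D"}

MUTATIONS = {
    "Arctic": (22, "G"),
    "Dutch": (22, "Q"),
    "Italian": (22, "K"),
    "Iowa": (23, "N"),
    "Flemish": (21, "G"),
}


def describe(name):
    """Return (charge interaction between 22 and 23, change in side chain heavy atoms)."""
    position, new_residue = MUTATIONS[name]
    local = dict(WILD_TYPE)
    local[position] = new_residue
    product = CHARGE[local[22]] * CHARGE[local[23]]
    if product > 0:
        interaction = "repulsion"
    elif product < 0:
        interaction = "attraction"
    else:
        interaction = "none"
    bulk_change = HEAVY_ATOMS[new_residue] - HEAVY_ATOMS[WILD_TYPE[position]]
    return interaction, bulk_change


print("wild type: repulsion between E22 and D23")
for name in MUTATIONS:
    print(name, describe(name))
```

Each mutation either removes the repulsion between positions 22 and 23, turns it into an attraction, or reduces bulk.  The Flemish mutation leaves the charges alone and only removes a methyl group, matching the description above.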

As a chemist, I find this immensely satisfying.  The structure explains why the mutations in the 42 amino acid aBeta peptide are where they are, and the chemistry explains why the mutations are what they are.


Bye bye stoichiometry

I’m republishing this old post from 2018, to refresh my memory (and yours) about liquid liquid phase separation before writing a new post on one of the most interesting papers I’ve read in recent years.  The field has exploded since this was written.

Until recently, developments in physics basically followed earlier work by mathematicians.  Think relativity following Riemannian geometry by 40 years.  However in the past few decades, physicists have developed mathematical concepts before the mathematicians (think mirror symmetry, which came out of string theory).  You may skip the following paragraph, but here is what it meant to mathematics, from a description of a 400+ page book by Amherst College’s own David A. Cox:

Mirror symmetry began when theoretical physicists made some astonishing predictions about rational curves on quintic hypersurfaces in four-dimensional projective space. Understanding the mathematics behind these predictions has been a substantial challenge. This book is the first completely comprehensive monograph on mirror symmetry, covering the original observations by the physicists through the most recent progress made to date. Subjects discussed include toric varieties, Hodge theory, Kahler geometry, moduli of stable maps, Calabi-Yau manifolds, quantum cohomology, Gromov-Witten invariants, and the mirror theorem. This title features: numerous examples worked out in detail; an appendix on mathematical physics; an exposition of the algebraic theory of Gromov-Witten invariants and quantum cohomology; and, a proof of the mirror theorem for the quintic threefold.

Similarly, advances in cellular biology have come from chemistry.  Think DNA and protein structure, enzyme analysis.  However, cell biology is now beginning to return the favor and instruct chemistry by giving it new objects to study. Think phase transitions in the cell, liquid liquid phase separation, liquid droplets, and many other names (the field is in flux) as chemists begin to explore them.  Unlike most chemical objects, they are big, or they wouldn’t have been visible microscopically, so they contain many, many more molecules than chemists are used to dealing with.

These objects do not have any sort of definite stoichiometry and are made of RNA and the proteins which bind them (and sometimes DNA).  They go by any number of names (processing bodies, stress granules, nuclear speckles, Cajal bodies, Promyelocytic leukemia bodies, germline P granules).  Recent work has shown that DNA may be compacted similarly using the linker histone [ PNAS vol. 115 pp. 11964 – 11969 ’18 ].

The objects are defined essentially by looking at them.  By golly they look like liquid drops, and they fuse and separate just like drops of water.  Once this is done they are analyzed chemically to see what’s in them.  I don’t think theory can predict them now, and they were never predicted a priori as far as I know.

No chemist in their right mind would have made them to study.  For one thing they contain tens to hundreds of different molecules.  Imagine trying to get a grant to see what would happen if you threw that many different RNAs and proteins together in varying concentrations.  Physicists have worked for years on phase transitions (but usually with a single molecule — think water).  So have chemists — think crystallization.

Proteins move in and out of these bodies in seconds.  Proteins found in them do have low complexity of amino acids (mostly made of only a few of the 20), and unlike enzymes, their sequences are intrinsically disordered, so forget the key and lock and induced fit concepts for enzymes.

Are they a new form of matter?  Is there any limit to how big they can be?  Are the pathologic precipitates of neurologic disease (neurofibrillary tangles, senile plaques, Lewy bodies) similar?  There certainly are plenty of distinct proteins in the senile plaque, but they don’t look like liquid droplets.

It’s a fascinating field to study.  Although made of organic molecules, there seems to be little for the organic chemist to say, since the interactions aren’t covalent.  Time for physical chemists and polymer chemists to step up to the plate.

Posting delay

Sorry for the delay.  An 8 year old grandson consumes a lot of time and energy.  However, one of the best papers I’ve read in years will be the subject of the next post.  It gives a plausible mechanism for how a type of protein mutation in 4 different proteins causes 4 different neurologic diseases.  In the meantime, think about proline and what it does to protein structure.

The silence is deafening

3 weeks ago I published a post about a paper that I thought would be a real bombshell, in effect contradicting a paper in a prestigious journal, and strongly arguing from real data that the pandemic virus could have been made in a lab, quite possibly in Wuhan.

Absolutely nothing has happened.  No letters to PNAS (the source of the article) or to Cell (the source of the criticized study).  With a question of this magnitude and importance you’d think Nature or Science would weigh in about it.  The origin of the pandemic virus is certainly something they’ve covered extensively.

So I’m going to send this to all concerned and see if I get any feedback.

Here is the original post.

Evidence that the pandemic virus was made in a lab


Everyone knows that the Chinese have been less than forthcoming about the origin of the pandemic virus (SARS-CoV-2).  An article in the current Proceedings of the National Academy of Sciences argues that US data, which hasn’t been released, and some 290 pages of which have been redacted, could shed a good deal of light on the subject (without any help from China).  One of the authors is an economist, but the other has serious biochemical chops.

Basically a variety of US institutions (see the paper — it’s freely available) have been working with the lab at Wuhan for years modifying the virus, long before the pandemic.  The paper names the names etc. etc. and is quite detailed, but I want to explain the evidence that the virus could have been produced (by human modification) at the Wuhan lab.  It has to do with a site in a viral protein which says ‘cut here’.

Here is more background than many readers will need, but the virus has affected us all and I want to make it accessible to as many as possible.

Proteins are linear strings of amino acids, just as this post is a linear sequence of letters, spaces and punctuation.

We have fewer amino acids (20 to be exact) than letters  and to save space each one has a one letter abbreviation (A for alanine V for valine, etc. etc.).  The spike protein (the SARS-CoV-2 protein binding to the receptor  for it on our cells) is quite long (1,273 amino acids all in a row).

Our genome codes for 588  proteins (called proteases) whose job it is to cut up other proteins. Obviously, it would be a disaster if they worked indiscriminately.  So each cuts at a particular sequence of amino acids. Think of the protease as a key and the sequence as a lock.  One protease called furin cuts in the middle of an 8 amino acid sequence RRAR’SVAS (R stands for aRginine and S for Serine).  This is called the furin cleavage site (FCS)
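For readers who think in code, the lock and key idea can be sketched in Python as a string search.  The 8 amino acid site and the cut position (between RRAR and SVAS) come from the paragraph above.  The function name and the toy peptide are made up for illustration, and the toy is not the real spike sequence.  Real proteases are also far less literal than an exact string match, so treat this as a cartoon.

```python
FURIN_SITE = "RRARSVAS"  # the 8-residue furin cleavage site described in the text
CUT_AFTER = 4            # furin cuts between RRAR and SVAS


def cleave_at_furin_site(protein):
    """Split a one-letter amino acid string at the first furin site, if any.

    Returns a list with one fragment (no site found) or two fragments.
    """
    index = protein.find(FURIN_SITE)
    if index == -1:
        return [protein]  # no 'cut here' signal, so the protein is left intact
    cut = index + CUT_AFTER
    return [protein[:cut], protein[cut:]]


# Hypothetical toy peptide containing the site (NOT the real spike protein)
toy = "MKTAYPRRARSVASQLLG"
print(cleave_at_furin_site(toy))

# A peptide without the site comes back uncut
print(cleave_at_furin_site("MKTAYPQLLG"))
```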

A paper (The origins of SARS-CoV-2: A critical review. Cell 184, 4848–4856 (2021)) argued that the amino acid sequence of the FCS in SARS-CoV-2 is an unusual, nonstandard sequence for an FCS and that nobody in a laboratory would design such a novel FCS.  So, like many, I skimmed the paper and accepted its conclusions, as Cell is one of the premier molecular biology journals.

One final quote “The NIH has resisted the release of important evidence, such as the grant proposals and project reports of EHA, and has continued to redact materials released under FOIA, including a remarkable 290-page redaction in a recent FOIA release.”

Sounds like Watergate doesn’t it?


Watch this space

Brilliant structural work on the Arp2/3 complex with actin filaments and why it makes me depressed

The Arp2/3 complex of 7 proteins (Arp2, Arp3 and 5 others) forms side branches on existing actin filaments.  The following paper shows its beautiful structure along with movies.  Have a look — it’s open access.

Why should it make me depressed? Because I could spend the next week studying all the ins and outs of the structure and how it works without looking at anything else.  Similar cryoEM studies of other multiprotein machines are coming out which will take similar amounts of time.  Understanding how single enzymes work is much simpler, although similarly elegant — see Cozzarelli’s early work on topoisomerase.

So I’m depressed because I’ll never understand them to the depth I understand enzymes, DNA, RNA etc. etc.

Also the complexity and elegance of these machines brings back my old worries about how they could possibly have arisen simply by chance with selection acting on them.  So I plan to republish a series of old posts about the improbability of our existence, and the possibility of a creator, which was enough to get me thrown off Nature Chemistry as a blogger.

Enough whining.

Here is why the Arp2/3 complex is interesting.  Actin filaments are long (1,000 – 20,000 Angstroms) and thin (70 Angstroms).  If you want to move a cell forward by having them grow toward its leading edge, growing actin filaments would puncture the membrane like a bunch of needles, hence the need for side branches, making actin filaments a brush-like mesh which could push the membrane forward as it grows.

The Arp2/3 complex has a molecular mass of 225 kiloDaltons, or probably 2,250 amino acids or 16 thousand atoms.
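The back of the envelope arithmetic behind those numbers can be written out in Python.  Both divisors are my assumptions, chosen because they reproduce the figures above: a round 100 Daltons per amino acid (about 110 is the more commonly quoted average, which would give closer to 2,000 residues), and roughly 14 Daltons per non-hydrogen atom once each carbon, nitrogen or oxygen is lumped with its share of hydrogens.

```python
COMPLEX_MASS_DA = 225_000  # 225 kiloDaltons, from the text


def estimate_residues(mass_da, da_per_residue=100):
    """Rough residue count; 100 Da per residue is an assumed round figure."""
    return mass_da / da_per_residue


def estimate_heavy_atoms(mass_da, da_per_heavy_atom=14):
    """Rough non-hydrogen atom count; 14 Da per atom is an assumption."""
    return mass_da / da_per_heavy_atom


print(round(estimate_residues(COMPLEX_MASS_DA)))       # with 100 Da per residue
print(round(estimate_residues(COMPLEX_MASS_DA, 110)))  # with 110 Da per residue
print(round(estimate_heavy_atoms(COMPLEX_MASS_DA)))    # with 14 Da per heavy atom
```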

Arp2 stands for actin related protein 2, something quite similar to the normal actin monomer so it can sneak into the filament. So can Arp3.  The other 5 proteins grab actin monomers and start them polymerizing as a branch.

But even this isn’t enough, as Arp2/3 is intrinsically inactive and multiple classes of nucleation promoting factors (NPFs) are needed to stimulate it.  One such NPF family is the WASP proteins (for Wiskott Aldrich Syndrome Protein) mutations of which cause the syndrome characterized by hereditary thrombocytopenia, eczema and frequent infections.

The paper’s pictures do not include WASP, just the 7 proteins of the complex snuggling up to an actin filament.

In the complex the Arps are in a twisted conformation, in which they resemble actin monomers rather than filamentous actin subunits, which have a flattened conformation.  After activation Arp2 and Arp3 mimic the arrangement of two consecutive subunits along the short pitch helical axis of an actin filament, and each Arp transitions from a twisted (monomer-like) to a flattened (filament-like) conformation.

So look at the pictures and the movies and enjoy the elegance of the work of the Blind Watchmaker (if such a thing exists).

If the right hand don’t get you, the left hand will

Do you know the source of the title?  I found it surprising.  Answer at the end.

Some cancer cells have elevated levels of an enzyme called PHosphoGlycerate DeHydrogenase (PHGDH); others have decreased levels.  Many cancers contain both types of cells.  Neither is good news.

Those cancers  with low levels of PHGDH  have slower growth.  That’s good news isn’t it?  No.  Such cells are more likely to metastasize.

Those with high levels of PHGDH are less likely to metastasize.  That’s good news isn’t it?  No. such cells grow faster.

So cancers with both types of cells are more aggressive.

Here’s how it works [ Nature vol. 605 pp. 617 – 617, 747 – 753 ’22 ].

PHGDH is on the pathway for synthesis of serine, an amino acid required for protein synthesis (like all of them).  So low levels of the enzyme result in less protein synthesis and less tumor growth.

So how is this bad?  PHGDH binds to another enzyme, PFK (PhosphoFructoKinase), stabilizing it.  When PHGDH is low, PFK enzyme levels are low, so the substrate of PFK (fructose 6 phosphate) is diverted to making sialic acid, which modifies cell surface proteins, making the cells more likely to migrate.

So blocking sialic acid synthesis reverses the effects of low PHGDH on cancer migration and metastasis — but it does potentiate cell proliferation.

You just can’t win

Things like this may explain other paradoxical and unexpected effects of enzyme blockade.

16 Tons by Tennessee Ernie Ford