Category Archives: Chemistry (relatively pure)

Little Bo Peep meets cellular biology and biochemistry.

Flippase. Eat me signals. Dragging their tails behind them. Have cellular biologists and structural biochemists gone over to the dark side? It’s all quite innocuous, as the old nursery rhyme will show.

Little Bo Peep has lost her sheep
and doesn’t know where to find them
Leave them alone, and they’ll come home
wagging their tails behind them.

First, some cellular biochemistry. The lipid bilayer encasing all our cells is made of two leaflets, inner and outer. The composition of the two is different (unlike the soap bubble). On the inside we find phosphatidylethanolamine (PE) and phosphatidylserine (PS). The outer leaflet contains phosphatidylcholine (PC) and sphingomyelin (SM) and almost no PE or PS. This is clearly a low entropy situation compared to having all 4 randomly dispersed between the 2 leaflets.

What is the possible use of this (notice how teleology invariably creeps into cellular biology)? Chemistry is powerless to explain such things. Much as I love chemistry, such truths must be faced.

It takes energy to maintain this peculiar distribution. The enzyme moving PE and PS back inside the cell is the flippase. It requires energy in the form of ATP to operate. When a cell is dying ATP drops, and entropy takes its course moving PE and PS to the cell surface. Specialized cells (macrophages) exist to scoop up the dying or dead cells, without causing inflammation. They recognize PE and PS by a variety of receptors and munch up cells exposing them on the surface. So PE and PS are eat me signals which appear when there isn’t enough ATP around for flippase to use to haul PE and PS back inside. Clever no?

Now for some juicy chemistry (assuming that you consider transport of a molecule across a lipid bilayer actual chemistry: no covalent bonds to the transferred molecule are formed or removed, although they are to the transporter). Well, it certainly is physical chemistry, isn’t it?

Here are the structures of PE, PS, PC, SM

There are a few things to notice. Like just about every lipid found in our membranes, they are amphipathic: they have a very lipid soluble part (look at the long hydrocarbon chains hanging below them) and a very water soluble part, the head groups containing the phosphate.

This brings us to [ Proc. Natl. Acad. Sci. vol. 111 pp. E1334 - E1343 '14 ], which describes ATP8A2 (aka the flippase). Interestingly, the protein, with at least 10 alpha helices spanning the membrane and 3 cytoplasmic domains, closely resembles the classic sodium pump beloved of neurophysiologists everywhere, which pumps sodium ions out of neurons and pumps potassium ions inside, producing the equally beloved membrane potential of neurons.

Look at those structures again. While there are charges on PE, PS (on the phosphate group), these molecules are far larger than the sodium or the potassium ion (easily by a factor of 10). This has long been recognized and is called the ‘giant substrate problem’.

The paper solved the structure of ATP8A2 and used molecular dynamics simulations to try to understand how it works. What they found is that transmembrane alpha helices 1, 2, 4 and 6 (out of 10) form a water filled cavity, which dissolves the negatively charged phosphate of the head group. What happens to those long hydrocarbon tails? They are left outside the helices in the lipid core of the membrane. It is the charged head groups that are dragged through by the flippase, with the tails wagging along behind them, just like Little Bo Peep’s sheep.

There’s a lot more great chemistry in the paper, particularly how Isoleucine #364 directs the sequential formation and annihilation of the water filled cavities between alpha helices 1, 2, 4 and 6, and how a particular aspartic acid is phosphorylated (by ATP, explaining why the enzyme no longer works in energetically dying cells) changing conformation of all 10 transmembrane helices, so that only one half of the channel is open at a time (either to the inside or the outside).

Go read and enjoy. It’s sad that people who don’t know organic chemistry are cut off from appreciating such elegance. There is more to esthetics than esthetics.

The prions within us

Head for the hills. All of us have prions within us sayeth [ Cell vol. 156 pp. 1127 - 1129, 1193 - 1206, 1206 - 1222 '14 ]. They are part of the innate immune system and help us fight infection. But aren’t all sorts of horrible diseases (Bovine Spongiform Encephalopathy aka BSE, Jakob Creutzfeldt disease aka JC disease, Familial Fatal Insomnia etc. etc.) due to prions? Yes they are.

If you’re a bit shaky on just what a prion is see the previous post which should get you up to speed —

Initially there was an enormous amount of contention when Stanley Prusiner proposed that Jakob Creutzfeldt disease was due to a protein forming an unusual conformation, which made other copies of the same protein adopt it. It was heredity without DNA or RNA (although this was hotly contended at the time), but the evidence accumulating over the years has convinced pretty much everyone except Laura Manuelidis (about whom more later). It convinced the Nobel Prize committee at any rate.

JC disease is a rapidly progressive dementia which kills people within a year. Fortunately rare (attack rate 1 per million per year), it is due to a misfolded protein called PrP (unfortunately initially called ‘the’ prion protein although we now know of many more). Trust me, the few cases I saw over the years were horrible. Despite decades of study, we have no idea what PrP does, and mice totally lacking a functional PrP gene are normal. It is found on the surface of neurons. Bovine Spongiform Encephalopathy was a real scare for a time, because it was feared that you could get it from eating meat from a cow which had it. Fortunately there have been under 200 cases, and none recently.

If you cut your teeth on the immune system being made of antibodies and white cells and little else, you’re seriously out of date. The innate immune system is really the front line against infection by viruses and bacteria, long before antibodies against them can be made. There are all sorts of receptors inside and outside the cell for chemicals found in bacteria and viruses but not in us. Once the receptors have found something suspicious inside the cell, a large protein aggregate (the inflammasome) forms. It activates an enzyme called caspase1, which cleaves the precursor of a protein called interleukin 1 beta, which is then released from some immune cells (no one ever thought the immune system would be simple given all that it has to do). Interleukin 1 beta acts on all sorts of cells to cause inflammation.

There are different types of inflammasomes and the nomenclature of their components is maddening. Two of the sensors for bacterial products (AIM2, NLRP3) induce polymerization of an inflammasome adaptor protein called ASC, producing a platform for the rest of the inflammasome, which contains other proteins bound to it, along with caspase1, whose binding to the other proteins activates it. (Terrible sentence, but things really are that complicated.)

ASC, like most platform proteins (scaffold proteins), is made of many different modules. One module in particular is called pyrin (because one of the cardinal signs of inflammation is fever). Here’s where it gets really interesting — the human pyrin domain in ASC can replace the prion domain of the first yeast prion to be discovered (Sup35 aka [ PSI+ ] — see the above link if you don’t know what these are) and still have it function as a prion in yeast. Even more amazing, is the fact that the yeast prion domain can functionally replace ASC modules in our inflammasomes and have them work (read the references above if you don’t believe this — I agree that it’s paradigm destroying). Evidence for human prions just doesn’t get any better than this. Fortunately, our inflammasome prions are totally unrelated to PrP which can cause such havoc with the nervous system.

Historical note: Stanley Prusiner was a year behind me at Penn Med graduating in ’67. Even worse, he was a member of my med school fraternity (which was more a place to get a decent meal than a social organization). Although I doubtless ate lunch and dinner with him before marrying in my Junior year, I have absolutely no recollection of him. I do remember our class’s medical Nobel, Mike Brown. Had I gone to Yale med instead of Penn, Laura Manuelidis would have been my classmate. Small world.

A primer on prions

Actually Kurt Vonnegut came up with the basic idea behind prions in his 1963 novel “Cat’s Cradle”. Instead of proteins, it involved a form of water (Ice-9) which had never been seen before, but one which was solid at room temperature. Unfortunately, it also solidified all liquid water it came in contact with, effectively ending life on earth.

Now for some history.

The first Xray crystallographic structures of proteins were incredibly seductive intellectually, much as false color functional magnetic resonance (fMRI) images are today. It was hard not to think of them as the structure of the protein.

Nowadays we know that lots of proteins have at least one intrinsically disordered (translation: unstructured) segment of 30 amino acids or more. [ Nature vol. 471 pp. 151 - 153 '11 ] says 40% do, and also that 25% of all human proteins are likely to be disordered (translation: unstructured) from end to end, based on a bioinformatics program.

I’ve always been amazed that any protein has only a few shapes, purely on the basis of the chemistry — read this if you have the time — Clearly the proteins making us up do have a relatively limited number of shapes (or we’d all be dead).

The possible universe of proteins from which our proteins are selected is enormously large. In fact the whole earth doesn’t have enough mass (even if it were made entirely of hydrogen, carbon, nitrogen, oxygen and sulfur) to make just one copy of the 20^100 possible proteins of length 100. For the calculation please see — — if you have the time.
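The arithmetic behind that claim is easy to sketch. What follows is a rough estimate of my own, not the calculation from the linked post: it assumes an average residue weighs about 110 daltons (a common rule of thumb) and uses standard rounded values for the dalton and the mass of the earth.

```python
# Rough estimate: mass needed for one copy of every possible 100-residue protein.
# Assumption (not from the post): an average amino acid residue weighs ~110 daltons.

AMINO_ACIDS = 20            # size of the standard amino acid alphabet
LENGTH = 100                # residues per protein
AVG_RESIDUE_DA = 110.0      # assumed average residue mass, in daltons
DALTON_KG = 1.66054e-27     # one dalton in kilograms
EARTH_MASS_KG = 5.972e24    # mass of the earth in kilograms


def sequence_count(alphabet: int, length: int) -> int:
    """Number of distinct sequences of the given length."""
    return alphabet ** length


def total_mass_kg(alphabet: int, length: int) -> float:
    """Mass of one copy of every possible sequence, in kilograms."""
    one_protein_kg = length * AVG_RESIDUE_DA * DALTON_KG
    return sequence_count(alphabet, length) * one_protein_kg


n = sequence_count(AMINO_ACIDS, LENGTH)
mass = total_mass_kg(AMINO_ACIDS, LENGTH)
earths = mass / EARTH_MASS_KG

print(f"possible sequences: {float(n):.2e}")
print(f"total mass:         {mass:.2e} kg")
print(f"earths required:    {earths:.2e}")
```

Changing the assumed residue mass by a factor of two either way makes no practical difference: the shortfall is some eighty orders of magnitude.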

So, even though it is a meaningful question philosophically just how common proteins with a few shapes are in this universe, we’ll never be able to carry out the experiment. Popper would say it’s a scientifically meaningless question, because it can’t be experimentally decided. Bertrand Russell would not.

Again, if you have time, take a look at

Which, at long last, brings us to prions.

They were first discovered in yeast, and were extremely hard to figure out as they represented something in the cytoplasm which contained no DNA and yet which was heritable. The first prion was discovered nearly 50 years ago. It was called [ PSI+ ] and it produced a lot of new proteins in yeast containing it (which is how its effects were measured). Mating [ PSI+ ] with [ psi- ] (i.e. yeast cells without [ PSI+ ]) converted the [ psi- ] to [ PSI+ ]. It couldn’t be mapped to any known genetic element. Also [ PSI+ ] was lost at a higher rate than would be expected for a DNA mutation. The first clue that [ PSI+ ] was a protein was that it was lost faster when yeast were grown in the presence of protein denaturants (such as guanidine).

It turned out that [ PSI+ ] was an aggregated form of the Sup35 protein, which basically functioned to keep the ribosome from reading through the stop codon. If you need background on what was just said please see — and the subsequent 4 posts. With Sup35 tied up in aggregates, ribosomes read through stop codons more often, which is why [ PSI+ ] yeast produced longer proteins. Things began to get exciting when Sup35 was dissected so domains could be found which induced [ PSI+ ] formation. Amazingly these domains spontaneously formed visible fibers in vitro resembling amyloid in some respects (binding the dye Congo Red for one). Then they found that preformed fibers greatly accelerated fiber formation by unpolymerized Sup35. Beginning to sound a bit like Ice-9, doesn’t it? Yeasts have many other prions, but the best studied and most informative is the one formed from Sup35.

So that’s how prions were found (in yeast) and what they are: an aggregated form of a given protein in a slightly different shape, which can cause another molecule of the same protein to adopt the prion protein’s new shape. Amazingly, we have prions within us. But that’s the subject of the next post.

Was I the last to find out?

Quick ! Can you form a hydrogen bond from a carbon hybridized sp3 to an oxygen atom?

I didn’t think so, but you can. This, in spite of reading about proteins for over half a century. [ Proc. Natl. Acad. Sci. vol. 111 pp. E888 - E895 '14 ] describes such bonds (along with lots of references backing up the statements which follow) forming between the transmembrane segments of membrane proteins (estimated to be 30% of all our proteins).

Whether or not they contribute to membrane stability isn’t known. Consider the alpha carbon of an amino acid. It is adjacent to a carbonyl group of an amide (electron hungry, but less so than a pure carbonyl because of resonance) and the nitrogen atom of an amide (slightly more electronegative than carbon, and probably more electron hungry because it loses part of its lone pair to resonance).

They are usually found from the alpha carbon of glycine on one helix to the carbonyl of an adjacent transmembrane helix. Glycine zippers (e.g. the G X X X G motif) have long been known in transmembrane helices. Since glycine is the smallest amino acid, having them on the same side of the helix was thought to be a way to pack adjacent helices together.

What would you consider good evidence for such a bond? Spectroscopy of model compounds with deuterium for the alpha hydrogen would be one way (it’s been done). The best evidence would be a shortened distance between the hydrogen and the carbonyl and this has been found as well.

Humbling ! !

Short and Sweet

Yamanaka strikes again. Citrulline is deiminated arginine, replacing a C=N-H (the imine) by a carbonyl C=O. An enzyme called PAD4 does the job. Why is it important? Because one of its targets is the H1 histone which links nucleosomes together. Recall that the total length of DNA in each and every one of our cells is 3 METERS. By wrapping the double helix around nucleosomes, the DNA is shortened by one order of magnitude.

So what? Well, at physiologic pH the imine probably binds another proton making it positively charged, making it bind to the negatively charged DNA phosphate backbone. Removing the imine makes this less likely to happen, so the linker doesn’t bind the double helix as tightly.

Duck soup for the chemist, but apparently no one had thought to look at this before.

This opens up the DNA (aka chromatin decondensation) for protein transcription. Why is Yamanaka involved? Because PAD4 is induced during cellular reprogramming to induced pluripotent stem cells (iPSCs), activating the expression of key stem cell genes. Inhibition of PAD4 lowers the percentage of pluripotent stem cells, reducing reprogramming efficiency. The paper is Nature vol. 507 pp. 104 – 108 ’14.

While this may be nice for forming iPSCs, it should be noted that PAD4 is upregulated in a variety of tumors.

Curiouser and curiouser

Curious Wavefunction alluded to the first example of a protein which stands everything we thought we knew about them on its head. At the end of this post you’ll find another equally counterintuitive example.

We all know that proteins fold into a relatively dry core where hydrocarbon side chains and other hydrophobic elements hide out. This was one of Walter Kauzmann’s many contributions to chemistry and biology. He also wrote one of the first books on quantum chemistry, as did his PhD advisor Henry Eyring at Princeton (I was lucky enough to take PChem from him). The driving force for the formation of globular proteins according to him, was pretty much entropic, with hydrocarbon side chains solvating each other so water wouldn’t have to form an elaborate (hence structured) cage to do so.

Which brings us to the wonderfully named fish Pseudopleuronectes americanus, which lives in frigid polar waters. Arctic fish have evolved proteins to keep ice crystals from forming in their cells. It is a fascinating example of evolution solving a problem different ways, because by 1996 at least 4 different types of antifreeze proteins were known [ PNAS vol. 93 pp. 6835 - 6840 '96 ].

The new protein is a 3 kiloDalton alanine rich helix bundle 145 Angstroms long. Amazingly the helices surround a core of 400 water molecules (surround as in the water is on the inside of the protein, not the outside). The water molecules inside the protein are arranged as pentagons (not hexagons as they would be in ice), so they form a clathrate. The pentagonal arrangement of water was predicted on theoretical grounds 50 years ago by Scheraga ( J. Biol. Chem. vol. ?? pp. 2506 – 2508 1962 ).

The protein has an amino acid periodicity of 11 amino acids, which nicely comes out to 3 turns of the alpha helix. There is a threonine at position i, alanine at position i + 4 and alanine at position i + 8. All of these bind water: not surprising for threonine, but alanine is a hydrocarbon. The evolving fish clearly didn’t listen to protein chemists. However, most of the carbonyl groups of the protein backbone are involved in hydrogen bonding to water.
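To see why an 11 residue repeat lines up with 3 turns, and why positions i, i + 4 and i + 8 end up on the same face of the helix, here is a small sketch. It assumes a textbook ideal alpha helix (3.6 residues per turn, so 100 degrees of rotation per residue); real helices deviate a little, so treat the angles as approximate.

```python
# Where do residues i, i+4, i+8 and i+11 sit around an ideal alpha helix?
# Assumption: textbook geometry of 3.6 residues per turn (100 degrees per residue).

RESIDUES_PER_TURN = 3.6
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues


def turns(offset: int) -> float:
    """Number of helical turns between residue i and residue i + offset."""
    return offset / RESIDUES_PER_TURN


def angle(offset: int) -> float:
    """Angular position (degrees, 0 to 360) of residue i + offset relative to i."""
    return (offset * DEGREES_PER_RESIDUE) % 360.0


for offset in (0, 4, 8, 11):
    print(f"i+{offset:<2}: {turns(offset):.2f} turns, {angle(offset):.0f} degrees around the axis")
```

Under these assumptions the three water binding residues fall within an 80 degree arc, and the next repeat (i + 11) starts only 20 degrees away from residue i, so the pattern stays on one side of the helix as it repeats.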

Not to be outdone, a freeze tolerant beetle (Upis ceramboides; don’t you love these names) has an antifreeze molecule made mostly of sugar and lipid.

Well even if we don’t know what we thought we knew about proteins, at least we understand biologic membranes and the proteins that go through them. Don’t we?

Apparently not. [ Proc. Natl. Acad. Sci. vol. 111 pp. 2425 - 2430 '14 ] studied the alpha-hemolysin of staphylococci. We know that the membrane of our cells is made of a double layer of molecules with a charged head which binds water and a long (16 + carbons) hydrocarbon tail. So the hydrocarbon core is 30 Angstroms across, and the lipid head groups are about 40 Angstroms away from each other on either side of the membrane.

We also know how proteins fit into the membrane — one model is the G Protein Coupled Receptor (GPCR) for which we have at least 800 human genes, and which is the target for 30% of all drugs approved by the FDA [ Science vol. 335 pp. 1106 - 1110 '12 ]. These all have 7 alpha helices arranged like a stack of logs extending across the membrane. The amino acids here are usually hydrophobic. Another model is the beta barrel — used mostly by bacteria — these have beta strands arranged across the membrane (like the staves of a barrel — get it). I’m not sure what the record is for the number of strands, but one from the gonococcus has 16 of them. They surround a large pore.

Back to the alpha hemolysin of staphylococci. It’s designed to kill its target by forming a hole in the membrane. And so 7 of them get together to do so. However, instead of running back and forth across the 30 Angstroms of the anhydrous part of the membrane, the heptamers put their heads together forming the hole (like skydivers holding hands), with their hydrocarbon like parts sticking out into the membrane and the water filled hole in the center. How do they know? They studied truncated mutants of the hemolysin, which weren’t long enough to span the 30 Angstroms across the membrane, and they still formed holes. An entirely new (to me) protein arrangement.

Everything not expressly forbidden biochemically is happening somewhere

A fairly oblique introduction (from an earlier post)

Sherlock Holmes and the Green Fluorescent Protein

Gregory (Scotland Yard): “Is there any other point to which you would wish to draw my attention?”
Holmes: “To the curious incident of the dog in the night-time.”
Gregory: “The dog did nothing in the night-time.”
Holmes: “That was the curious incident.”

The chromophore of green fluorescent protein (GFP) is para-hydroxybenzylidene imidazolinone. It is formed by cyclization of a serine (#65) tyrosine (#66) glycine (#67) sequential tripeptide. It is found in the center of a beta barrel formed by the 238 amino acids of GFP.

What is so curious about this?

Simply put, why don’t things like this happen all the time? Perhaps nothing quite this fancy, but on a more plebeian level consider this: of the twenty amino acids, 2 are carboxylic acids, 2 are amides, 1 is an amine, 3 are alcohols and one is a thiol. One might expect esters, amides, thioesters and sulfides to be formed deep inside proteins. Why deep inside? On the surface of the protein, there is water at 55 molar around to hydrolyze them purely by the law of mass action (releasing about 10 kJ/Avogadro’s number per bond in the process). Some water is present in the X-ray crystallographic structure of proteins, but nothing this concentrated.
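In case the 55 molar figure looks arbitrary, it falls straight out of the density and molar mass of water. A quick sketch, assuming a density of about 1000 grams per liter (close enough near room temperature) and a molar mass of 18.015 g/mol:

```python
# Molar concentration of pure liquid water.
# Assumptions: density ~1000 g/L, molar mass 18.015 g/mol.

WATER_DENSITY_G_PER_L = 1000.0
WATER_MOLAR_MASS_G_PER_MOL = 18.015


def molarity(density_g_per_l: float, molar_mass_g_per_mol: float) -> float:
    """Moles of a pure liquid per liter of that liquid."""
    return density_g_per_l / molar_mass_g_per_mol


water_molar = molarity(WATER_DENSITY_G_PER_L, WATER_MOLAR_MASS_G_PER_MOL)
print(f"pure water is about {water_molar:.1f} M")
```

For comparison, most solutes inside a cell are present at millimolar concentrations or less, which is why mass action so strongly favors hydrolysis wherever water can reach.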

The presence of 55 M water bathing the protein surface leads to an even more curious incident, namely why proteins exist at all given that amide hydrolysis is exothermic (as well as entropically favorable). Perhaps this is why proteins contain so many alpha helices and beta sheets — as well as functioning as structural elements they may also serve to hide the amides from water by hydrogen bonding them to each other. Along this line, could this be why the hydrophilic side chains of proteins (arginine, lysine, the acids and the amides) are rather bulky? Perhaps they also function to sterically shield the adjacent amides. After all, why should lysine have 4 CH2 groups to separate the primary amino from the alpha carbon? Ditto for the 3 CH2 groups separating the guanidine group, and the 2 CH2 for glutamic acid.

We now have an example before us of an ester between threonine and glutamic acid within the same protein. For details see Proc. Natl. Acad. Sci. vol. 111 pp. 1229 – 1230, 1367 – 1372 ’14. It is put to use to stabilize long thin proteins subject to mechanical stress. All sorts of bacteria have little hairs (pili) allowing them to attach to our cells. The first examples were found in some nasty characters (Streptococcus pyogenes, Clostridium perfringens), possibly because they’re under intense study because the infections they cause are even nastier. Interestingly, the ester is buried deep in the protein where water can’t get at it so easily. This type of link on external proteins turns out to be fairly common in Gram positive organisms.

So everything not biochemically forbidden is probably happening somewhere.

Beating the Born Oppenheimer approximation with lasers

Organic chemists love to push electrons to describe reaction mechanisms. Chemists love potential energy surfaces; even protein chemists love them, although they can’t really calculate them. Both depend on the reality of the Born Oppenheimer approximation, which says that electrons move first and nuclei follow much more slowly. This makes sense, as even in hydrogen the nucleus is almost 2000 times as heavy as the electron.
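The mass ratios are easy to check. The sketch below uses rounded CODATA masses. The speed comparison is my own illustration and rests on a crude assumption (electron and nucleus carrying the same kinetic energy, so the speed ratio is the square root of the mass ratio); it is only meant to give a feel for the separation of timescales, not to stand in for a real Born Oppenheimer treatment.

```python
# How much heavier are nuclei than electrons, and roughly how much slower?
# Masses are rounded CODATA values in kilograms.
# The speed ratio assumes equal kinetic energy (an illustrative simplification).
import math

ELECTRON_KG = 9.1093837e-31
PROTON_KG = 1.67262192e-27    # nucleus of ordinary hydrogen
DEUTERON_KG = 3.3435838e-27   # nucleus of deuterium


def mass_ratio(nucleus_kg: float) -> float:
    """Nuclear mass divided by electron mass."""
    return nucleus_kg / ELECTRON_KG


def speed_ratio(nucleus_kg: float) -> float:
    """Electron speed over nuclear speed at equal kinetic energy: sqrt(M/m)."""
    return math.sqrt(mass_ratio(nucleus_kg))


proton_ratio = mass_ratio(PROTON_KG)
deuteron_ratio = mass_ratio(DEUTERON_KG)

print(f"proton / electron mass:   {proton_ratio:.0f}")
print(f"deuteron / electron mass: {deuteron_ratio:.0f}")
print(f"electron is ~{speed_ratio(PROTON_KG):.0f}x faster than a proton at equal kinetic energy")
print(f"electron is ~{speed_ratio(DEUTERON_KG):.0f}x faster than a deuteron at equal kinetic energy")
```

For deuterium the mass mismatch is roughly twice as large as for ordinary hydrogen.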

A recent paper [ Proc. Natl. Acad. Sci. vol. 111 pp. 912 - 917 '14 ] was able to use an extremely short laser burst (on the attosecond scale; an attosecond is 10^-18 seconds) to move nuclei around in the D2 molecule. The energy had to be in the ultraviolet range, unlike vibratory motion which is in the infrared range.

Interferences between electronic wave packets (evolving on attosecond timescales) controlled the population of different electronic states of the excited neutral molecule, which can be switched on attosecond timescales. They could control whether D2 ionized to D2+ or flew apart by the type of pulse.

They conclude with the following: “State-of-the-art quantum calculations, which have only recently become feasible, allowed us to interpret this very rich set of quantum dynamics, including both the nuclear motion and the coherently excited electronic state interferences. Thus, we succeed in both observing and rigorously modeling multiscale coherent quantum control in the time domain. The observed richness and complexity of the dynamics, even in this very simplest of molecules, is both remarkable and daunting.”

For more about how complicated even the simplest chemical reaction is please see —

We’ll never be able to really understand chemistry

We’ll never be able to really understand chemistry. That’s my take on the first Chemistry article of 2014 in PNAS (vol. 111 pp. 15 – 20). The article is actually titled “Is the simplest chemical reaction really so simple?” The first heading in the article reads “H + D2 Collision Looks Like Playing Molecular Pool on a Warped Table”

Oy Vey.

Chemistry doesn’t get simpler than that. We’re talking about a reaction with 3 protons, 2 neutrons and 3 electrons. After a bit of back-patting about how much computational chemistry studying this (and similar) reactions has taught us, they throw up figure #2 — a potential energy surface for the reaction, which is identical to what I studied in grad school 50+ years ago.

Theory works well for collisions of H with D2 when all 3 nuclei are linear (180 degrees). Experiments were carried out in the gas phase. No solvent. The D2 was in the lowest vibrational state, and one of the first few of the rotational levels (j = 0, 1, 2). They used D2 to tell products from reactants. Theory and experiment agree and the results look like our mental picture of the reaction, e.g. it looks like what happens with billiard balls on a good table.

So far so good. Then “We did one experiment too many”.

A rather amazing statement.

What they did was to examine the differential cross-sections of the HD products that were produced with more vibrational excitation (ν′ = 4). They find that the potential energy surface isn’t purely repulsive at higher vibrational levels (which occur in every reaction occurring in living tissue).

As they put it “Warped Billiard Table Changes as the Pool Balls Move and Is Weirder than First Imagined” I’ll let them continue — they’re funny and write well. “So far, we have been fortunate to consider only one potential energy surface (PES) as governing the motion of the nuclei. However, the H3 system is a Jahn–Teller system having a second PES connected to the first at a conical intersection seam whose minimum is located about 2.7 eV (64 kcal/mol) above the asymptotes of the ground state. Once this is energetically accessed, we can expect electronic nonadiabatic transitions to occur in which we must consider motion of the nuclei on both PESs at the same time. This hurts the classical mind (my highlighting). We will need to consider playing molecular pool on interconnected warped billiard tables whose shapes change with the motion of the pool balls. Well, no one ever promised us that chemical reactions would be simple, even for what is called the simplest chemical reaction of them all.”

No solvent, very little internal energy of the D2 and all this is going on. This will give drug chemists trying to dock ligands into a protein cavity nightmares.

Happy New Year, I guess.

The mating dance of a promiscuous protein

Calmodulin is a 147 amino acid protein that changes its shape when calcium ions are bound. It’s quite important, quite ancient and evolutionarily stable (human calmodulin differs at only 3/147 amino acids from the fruitfly’s).

When calcium is bound, the shape change puts hydrophobic (water hating) amino acid side chains on its surface. Normally such side chains are hidden in the oily interior of most proteins. The hydrophobic surface patch allows calmodulin to bind to over 300 cellular proteins. So calmodulin functions as a calcium dependent switch which changes the function of the proteins it binds to.

Naturally the 300 proteins have amino acid sequences (motifs) whose 3 dimensional shape in the native protein is capable of recognizing (e.g. binding to) calmodulin when it contains calcium.

What is quite surprising, is how different from each other in amino acid sequence these calmodulin recognition motifs actually are. Also these sequences are ‘often’ partially or largely disordered in the absence of calmodulin.

The structure of calcium bound calmodulin is also quite dynamic (translation: it flits about between a bunch of different three dimensional structures).

The authors of a recent study [ Proc. Natl. Acad. Sci. vol. 110 pp. 20545 - 20550 '13 ] claim that the binding of calmodulin to a target requires both to undergo mutually induced changes in shape. They call this induced fit. But it’s really a mating dance. Happy New Year and Good Luck to all.

