Man’s best friend

I usually pay little attention to animal models of neurologic disease. After all, our brain is what separates us from animals (recent human behavior excepted). Neuromuscular disease is different because our peripheral nerves and muscles work the same way as those of animals. An astounding paper from Harvard and Brazil gives us an entirely new angle to treat muscular dystrophy, particularly the Duchenne form. I ran a muscular dystrophy clinic for 15 years in the 70s and 80s and helplessly watched young boys deteriorate and die from Duchenne. The major therapeutic advance during that time was (hold your breath) lighter weight braces, allowing the boys to stay out of wheelchairs a bit longer.

Some background for those who don’t know: the molecular defect in Duchenne was found in ’87. Interestingly Kunkel, one of the authors on the original paper [ Cell vol. 51 pp. 919 – 928 ’87 ], is an author on the present one [ Cell vol. 163 pp. 1204 – 1213 ’15 ]. Duchenne dystrophy affects only males, as the gene for the protein (dystrophin) is found on the X chromosome, so women with a normal X and a mutant X escape. To show how pathetic things were back then, we tried to find out if a sister of a patient was a carrier. How did we do it? By measuring an enzyme released by damaged muscle (CPK) on several occasions. Carriers often showed an elevation.

The mutated protein is called dystrophin. It hooks the contractile apparatus of a muscle cell to the membrane. Failure of this makes muscle cells more fragile when they contract, resulting in eventual loss. From a molecular biological point of view the protein is fascinating. The gene is one of the largest known, stretching over 2,220,233 positions (nucleotides) on the X chromosome and containing 79 exons. Figuring a transcription rate of 100 nucleotides a second, it takes 6 hours to make the messenger RNA (mRNA) for it. The protein has 3,685 amino acids and figuring a translation rate of 3 – 6 amino acids/second it takes 10 – 20 minutes for the ribosome to make it. Given that it takes only 3 nucleotides to code for an amino acid, the protein coding part of the gene takes up only .5% of the gene. Correctly splicing out the introns is a huge task, which we all perform well. The size and complexity of the gene explain why mutations are so common, making Duchenne the most common form of hereditary muscular dystrophy (most are).
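For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the figures quoted above (gene length, 100 nucleotides/second, 3 – 6 amino acids/second). The function and constant names are mine, and the rates are the rough estimates from the text, not measurements.

```python
# Back-of-envelope check of the dystrophin numbers quoted in the text.
GENE_LENGTH_NT = 2_220_233      # nucleotides spanned by the gene
PROTEIN_LENGTH_AA = 3_685       # amino acids in dystrophin
TRANSCRIPTION_RATE_NT_S = 100   # assumed rate, nucleotides per second
TRANSLATION_RATES_AA_S = (3, 6) # assumed range, amino acids per second


def transcription_hours(gene_nt=GENE_LENGTH_NT, rate=TRANSCRIPTION_RATE_NT_S):
    """Hours needed to transcribe the whole gene at a constant rate."""
    return gene_nt / rate / 3600


def translation_minutes(rate_aa_s, protein_aa=PROTEIN_LENGTH_AA):
    """Minutes needed for one ribosome to make the protein."""
    return protein_aa / rate_aa_s / 60


def coding_fraction(gene_nt=GENE_LENGTH_NT, protein_aa=PROTEIN_LENGTH_AA):
    """Fraction of the gene that codes for protein (3 nucleotides per amino acid)."""
    return 3 * protein_aa / gene_nt


if __name__ == "__main__":
    print(f"transcription: {transcription_hours():.2f} hours")
    for rate in TRANSLATION_RATES_AA_S:
        print(f"translation at {rate} aa/s: {translation_minutes(rate):.1f} minutes")
    print(f"coding fraction: {coding_fraction():.3%}")
```

At the slower translation rate the ribosome needs about 20 minutes rather than 10, and the coding fraction comes out just under .5%.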

There are currently all sorts of efforts underway to correct the mutation, particularly in a milder form called Becker dystrophy. Derek has covered them and they constitute a logical direct attack on the pathology.

What is so remarkable about the current Cell paper is that it gives us an entirely new and different way to attack Duchenne (and possibly all forms of muscular dystrophy). It involves a colony of dogs in Brazil. They have GRMD (Golden Retriever Muscular Dystrophy) with a mutation in one of the many splice sites in dystrophin (it has 79 exons in man) leading to a premature stop codon and no functional dystrophin in the dogs’ muscles. The animals weaken and become non ambulatory with a shortened lifespan. However, a few of the dogs in the colony seemed pretty normal. So they went to work. The obvious explanation was that the gene was in some way repaired so the animals had normal amounts of dystrophin. Not so. Even though ambulatory, the animals’ muscles had no dystrophin. So the whole genome was sequenced. What they found was that a mutation at an upstream site of a protein called Jagged1 led to increased transcription of the gene and increased levels of the protein.

Jagged1 is a protein ligand for the Notch system of receptors. The Notch system is important in muscle regeneration. The myoblasts of the animals had more proliferative capacity. The Notch system is far too complicated to go into here, but expect to see a lot more research money pumped into it.

What I find so fabulous about this paper is that it gives us an entirely new way of thinking about Duchenne, totally unrelated to the genetic defect, which had been our focus up to now. It also rubs our noses in how little we understand about our molecular biology and cell physiology. If we really understood things, we’d have been focused on Notch years ago. Yet another reason drug discovery is so hard. We are trying to alter a system we only dimly understand.

From Banned in Boston to Banned in Berkeley in 55 years

When I arrived in Cambridge for grad school 55 years ago, there were a lot of sore shoulders in people who’d been patting themselves on the back for the blows struck for freedom of expression. Boston was still banning books, and the year before Grove Press had won a suit permitting them to publish Lady Chatterley’s Lover. Hilariously, the same self-congratulatory and self-righteous lot is now banning speech on a campus near you. The impetus is always the same, thought control by someone more moral (and now smarter) than you, and always for the noblest and purest of reasons. What happened to irony? Where is George Orwell when you really need him? Well, he’s right here:

“If liberty means anything at all, it means the right to tell people what they do not want to hear.”

Which brings me to some recent campus disturbances.

Smith, where only members of the media agreeing with the demonstrators were allowed in.

U. Mass: protesting for free tuition.

Princeton: wanting Woodrow Wilson out because of his opinions.

Which brings me to ‘charm school’.

After serving as an army doc for two years in ’68 – ’70, a time when we had 500,000 men in Vietnam, I left with little respect for its leadership. I was stateside at one of the Army’s premier hospitals, which was a plum assignment (because the army was very short of neurologists). This meant that 2 year docs who’d served their first year in Vietnam got their choice of assignment when returning stateside. So I saw plenty of them. NOT ONE thought we were winning over there, despite what the top brass said to the press and the president.

So who would have thought that 25 years later I’d be friendly with a retired Major General, George Baker? Never say never. He was a very intelligent man, an orthopedic surgeon, who’d been chief at Walter Reed and found retirement boring, so he practiced at my hospital. He told me about something he called charm school. It was where newly promoted Generals were sent for training. They were told to toe the straight and narrow sexually and in other matters, and that if a planeload of them went down, the army would have no trouble at all filling their shoes.

I’ve done some alumni interviews for some excellent candidates for Princeton, none of whom were accepted. It would be no problem at all to expel the protestors if physically disruptive or destructive, and replace them. They certainly should NOT be expelled for what they say or think, just how they act. The Princeton acceptance rate is under 10%.

Now that everyone is neatly characterized by racial status, it would be interesting to see the breakdown by race of the occupants of Nassau Hall, also their majors. I seriously doubt that the group most discriminated against in admissions (the Asians) took much part. I doubt that many science majors were involved.

Les fleurs du PTEN

Les fleurs du Mal is a volume of poetry by Baudelaire about the beauty of evil and depravity. I have the same esthetic appreciation for the horrible things a mutant of PTEN does. It’s awful, but incredibly elegant chemically.

Back in the day med students used to be told ‘know syphilis and you’ll know medicine’ because of its varied clinical manifestations. PTEN is like that for cellular and molecular biology.

PTEN (Phosphatase and TENsin homolog) is a gene mutated in many forms of cancer. So it was regarded as a tumor suppressor, keeping our cells on the straight and narrow. Naturally cancer cells ‘try’ (note the anthropomorphism) to neutralize it. PI3K is a universal tumor driver, integrating growth factor signaling with downstream circuitries of cell proliferation, metabolism and survival.

Inositol is a 6 membered ring (all carbons) with one OH group attached to each carbon, which are numbered 1 through 6. PI3K puts phosphate on the 3 position, PTEN takes it off. Since this is how PI3K signaling begins, cells lacking PTEN grow faster and migrate aberrantly (e.g. spread).

Enter Proc. Natl. Acad. Sci. vol. 112 pp. 13976 – 13981 ’15 which carefully studied a PTEN mutant found in an unfortunate man with aggressive prostate cancer. It just changed one of the 403 amino acids (#126) from alanine to glycine. Not a big deal you say, it’s just a change of CH3 (alanine) to H (glycine). #126 is near the active site of the enzyme. One might expect that the mutation inhibits PTEN’s phosphatase activity (i.e. its enzymatic activity). Not so: the mutation shifts the activity of the enzyme. Instead of removing phosphate from the 3 position of inositol, the phosphate at the 5 position is removed (leaving the 3 position alone). This shifts inositol phosphate levels in the cell with hyperactivation of PI3K signaling (which requires inositol phospholipids containing phosphate at the 3 position).

What happens is that inositol phosphates fit into the mutant active site with the 5 position near the catalytic amino acid (cysteine). Essentially the 6 membered ring rotates the 3 position away from cysteine and puts the 5 position there instead. This changes PTEN from a tumor suppressor (anti-oncogene) to an oncogene.

To a chemist this is elegant and beautiful (apologies Baudelaire).

PTEN has taught us a huge amount about the control of protein levels, pseudogenes, competitive endogenous RNA (ceRNA). You can read all about this in

That’s fairly grim, so here’s a link to one of the great comedians of years past — Jonathan Winters

It’s politically incorrect and sure to offend the humorless pompous prigs. Enjoy ! ! !

Are you sure you know everything your protein is up to?

Just because you know one function of a protein doesn’t mean you know them all. A recent excellent review of the (drumroll) executioner caspases [ Neuron vol. 88 pp. 461 – 474 ’15 ] brings this to mind. Caspases control a form of cell death called apoptosis, in which a cell goes gently into the good night without causing a fuss (particularly inflammation and alerting the immune system that something bad killed it). They are enzymes which chop up other proteins and cause the activation of other proteins which chop up DNA. They cause the inner leaflet of the plasma membrane to expose itself (particularly phosphatidyl serine which tells nearby scavenger cells to ‘eat me’).

The answer to the mathematical puzzle in the previous post will be found at the end of this one.

In addition to containing an excellent review of the various steps turning caspases on and off, the review talks about all the things activated caspases do in the nervous system without killing the neuron containing them. Among them are neurite outgrowth and regeneration of peripheral nerve axons after transection. Well that’s pathology, but one executioner caspase (caspase3) is involved in the millisecond to millisecond functioning of the nervous system — e.g. long term depression of neurons (LTD), something quite important to learning.

Of course, such potentially lethal activity must be under tight control, and there are 8 inhibitors of apoptosis (IAPs) of which 3 bind the executioners. We also have inhibitors of IAPs (SMAC, HTRA2) — wheels within wheels.

Are there any other examples where a protein discovered by one of its functions turns out to have others? Absolutely. One example is cytochrome c, which was found as it shuttles electrons to complex IV in the electron transport chain of mitochondria. Certainly a crucial function. However, when the mitochondrion stops functioning either because it is told to or something bad happens, cytochrome c is released from mitochondria into the cytoplasm where it then activates caspase3, one of the executioner caspases.

Here’s another. Enzymes which hook amino acids onto tRNA are called tRNA synthetases (aaRSs for some reason). However one of them (called EPRS), when phosphorylated due to interferon gamma activity, becomes part of a complex of proteins which silences specific genes involved in the inflammatory response (it stops their messenger RNAs from being translated).

Yet another tRNA synthetase, when released from the cell, triggers an inflammatory response.

Naturally molecular biologists have invented a fancy word for the process of evolving a completely different function for a molecule — exaptation (to contrast it with adaptation).

Note the word molecule — exaptation isn’t confined to proteins. [ Cell vol. 160 pp. 554 – 566 ’15 ] Discusses exaptation as something which happens to promoters and enhancers. This work looked at the promoters and enhancers active in the liver in 20 mammalian species — all the enhancers were rapidly evolving.


Answer to the mathematical puzzle of the previous post. R is the set of 4 straight lines bounding a square centered at (0,0)

Here’s why proving it has an inside and an outside isn’t enough to prove the Jordan Curve Theorem

No. The argument for R uses its geometry (the boundary is made of straight
line segments). The problem is that an embedding f: S^1 -> R^2 may be
convoluted, say something of the Hilbert curve sort.

An incorrect proof of the Jordan Curve Theorem – can you find what’s wrong with it?

Every closed curve in an infinite flat plane divides it into a bounded part and an unbounded part (inside and outside if you’re not particular). This is so screamingly obvious, that for a long time no one thought it needed proof. Bolzano changed all that about 200 years ago, but a proof was not forthcoming until Jordan gave a proof (thought by most to be defective) in 1887.

The proof is long and subtle. The one I’ve read uses the Brouwer fixed point theorem, which itself uses the fact that the fundamental group of a circle is infinite cyclic (and that’s just for openers). You begin to get the idea.

Imagine the 4 points (1,1), (1,-1), (-1,1) and (-1,-1), the vertices of a square centered at ( 0, 0 ). Now connect the vertices by straight lines (no diagonals) and you have the border of the square (call it R).

We’re already several pages into the proof, when the author makes the statement that R “splits R^2 (the plane) into two components.”

It seemed to me that this is exactly what the Jordan Curve theorem is trying to prove. I wrote the author saying ‘why not claim victory and go home?’

I got the following back

“It is obvious that the ‘interior’ of a rectangle R is path connected. It is
only a bit less obvious – but still very easy – to show that the ‘exterior’
of R is also connected. The rest of the claim is to show that every path
alpha from a point alpha(0)=P inside the rectangle R to a point alpha(1)=Q
out of it must cross the boundary of R. The set of numbers S={alpha(i) :
alpha(k) is in interior(R) for every k≤i} is not empty (0 is there), and it
is bounded from above by 1. So j=supS exists. Then, since the exterior and
the interior of R are open, j must be on the boundary of R. So, the interior
and the exterior are separate components of R^2 \ R. So, there are two of

Well the rectangle is topologically equivalent (homeomorphic) to a circle.

So why isn’t this enough?  It isn’t ! !

Answer to follow in the next post. Here’s the link — go to the end of the post —

Facilitated communication

Amidst the ads in the Sunday Magazine largely targeted to the 1% that the New York Times claims to hate is an article on facilitated communication. I had a clinical experience with it 30 years ago that you might be interested in.

As a neurologist I was asked to do an Independent Medical Evaluation (IME) on an unfortunate man who was electrocuted at work (he worked on high voltage transmission lines). He went into cardiac arrest and sustained severe brain damage. The issue was not fault, which the power company readily admitted, but whether in what appeared to be a vegetative state, with no visible response to verbal commands, he was in fact conscious but unable to respond. In the latter case the award to the family would have been substantially larger (for pain and suffering in addition to loss of consortium, etc. etc.). It was claimed that facilitated communication showed that he was able to write the answer to simple calculations given verbally, not visually.

Reviewing the chart before seeing the man showed that he and his wife were admirable individuals, adopting children that no one else wanted and raising them despite limited income. He was seen at the rehab facility, with attorneys for the insurer for the power company and his family present. It was apparent that the people caring for him were quite devoted, both to him and his wife and were very sincere, especially one of his young therapists.

The neurologic exam showed that although he did react to deep pain (sternal compression), he did not follow simple commands (e.g. blink). He appeared to be in a coma. Following the neurologic examination the young therapist then demonstrated how, when he held the man’s hand to which a pencil was attached, the man could actually perform calculations: add 2 and 2, produce a 4, etc. etc. Several such calculations were produced, all with the correct answer.

What do you think I did next?

No peeking. Think about it.

I took the first sheet of paper away, placed a clean sheet under the man’s hand and asked for a repeat (this time with the therapist’s eyes closed).

This produced a bunch of random lines, nothing more.  When the therapist opened his eyes and saw the results, he was visibly shaken and close to tears.

Was he faking the whole time? At any time? I seriously doubt it. A faker could have produced a reasonable number with his eyes shut. Try it. He didn’t.

“You can’t con an honest man” —

True, but you certainly can con yourself.

For another example, this time perpetrated by nurses, see how an 11 year old girl (Emily Rosa) put a definitive end to “Therapeutic Touch” and became the youngest co-author ever of an article in the Journal of the American Medical Association —

A new kid on the Alzheimer’s block

There’s a new kid on the Alzheimer’s block, and it may explain why the huge sums thrown at beta-secretase inhibitors by big pharma have been such an abject failure. First, a lot of technical background.

The APP (for amyloid precursor protein) contains anywhere from 563 to 770 amino acids in 5 distinct transcripts made by alternate splicing of the single gene. The 3 main forms contain 695, 751 and 770 amino acids. The 695 amino acid form is found only in brain and peripheral nerve where it predominates, while the transcripts containing 751 and 770 amino acids are found everywhere but predominate in other tissues. The A4 peptides (Abeta peptides) which are the major components of the Alzheimer senile plaque are derived from the carboxy terminal end of APP (beginning at amino acid #597) and contain only 39 – 43 amino acids. About 1/3 of the 39 – 43 amino acid amyloid beta peptide (A beta peptide) is found within the transmembrane segment of APP, the other two thirds being found just outside the membrane. So to get A beta peptides the APP must be cut (more than once) at its carboxy terminal end.

For Abetaxx (xx between 39 and 43) to be formed, cleavage must occur outside the membrane in which APP is embedded by beta secretase. This produces a soluble extracellular fragment, with the rest embedded in the membrane (this is called C99). Then gamma secretase (another enzyme) cleaves C99 within the membrane forming the Abeta peptides, which constitute much of the senile plaque of Alzheimer’s disease.

Alpha secretase (yet another enzyme) also cleaves the APP in its carboxy terminal extramembranous part, but does so closer to the membrane, so that part of the protein which would form the Abeta peptide is removed.

R. Schekman personal communication (2012): The Abeta peptide appears to be cleaved by gamma secretase from the fragment generated by beta secretase. However, this happens well inside the cell in the last station of the Golgi apparatus. Then Abeta is swept out of the cell by the secretory pathway. So all this happens INSIDE the cell, rather than at the neuron’s extracellular membrane (which is what I thought).

Remarkably it is very difficult (for me at least) to find out just at what amino acids of the amyloid precursor protein(s) the 3 secretases (alpha, beta, gamma) cleave.

[ Nature vol. 526 pp. 443 – 447 ’15 ] describes a totally new kid on the block, which (if replicated) should make us rethink everything we thought we knew about the amyloid precursor protein and the Abeta peptide. Another set of carboxy terminal fragments (CTFs) called CTFneta is formed from the amyloid precursor protein (APP). Formation is mediated (in part) by MT5-MMP, a matrix metalloprotease. (In grad school neta is how we pronounced the Greek letter eta, which looks like a script N). The authors call the enzymatic activity forming them neta-secretase (clearly not all the enzymes which do this have been identified at this point). At least the authors tell you where the neta secretases cleave APP695 (between amino acids #504 – #505). This is amino terminal to the beta and alpha sites (which are at higher amino acid numbers) and the gamma site (which is at a higher number still). Alpha and beta secretase then work on CTFneta to produce shorter peptides, called Aneta-alpha and Aneta-beta.
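To keep the fragments straight, here is a small Python sketch that lays the cut sites out along APP695 using only numbers given in this post: the neta cut between #504 and #505, and Abeta beginning at #597. I am assuming that the beta secretase cut falls immediately before #597 and that #597 is in APP695 numbering; the alpha and gamma sites aren't given above, so Aneta-alpha is left out. The names are mine.

```python
# Fragment lengths along APP695, from the residue numbers quoted in the text.
APP695_LENGTH = 695
NETA_CUT_AFTER = 504                 # neta-secretase cleaves between #504 and #505
ABETA_FIRST_RESIDUE = 597            # Abeta begins here (from the text)
BETA_CUT_AFTER = ABETA_FIRST_RESIDUE - 1  # assumption: beta cut is just before Abeta


def fragment_length(first_residue, last_residue):
    """Number of amino acids in a fragment, counting both end residues."""
    return last_residue - first_residue + 1


# Carboxy terminal fragment left in the membrane after the neta cut.
ctf_neta_length = fragment_length(NETA_CUT_AFTER + 1, APP695_LENGTH)

# Carboxy terminal fragment left after the beta cut (the one called C99).
c99_length = fragment_length(BETA_CUT_AFTER + 1, APP695_LENGTH)

# Peptide released when beta secretase cuts CTFneta (Aneta-beta).
aneta_beta_length = fragment_length(NETA_CUT_AFTER + 1, BETA_CUT_AFTER)

if __name__ == "__main__":
    print("CTFneta:", ctf_neta_length, "amino acids")
    print("C99:", c99_length, "amino acids")
    print("Aneta-beta:", aneta_beta_length, "amino acids")
```

Reassuringly, under these assumptions the fragment left by the beta cut comes out at 99 amino acids, which fits its name, C99.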

This isn’t idle chatter as Aneta-alpha, and Aneta-beta are found in the dystrophic neurites in an Alzheimer mouse model (human work is sure to follow). Inhibition of beta secretase activity results in accumulation of CTFneta and Aneta-alpha.

Aneta-alpha itself lowers long term potentiation (LTP) in hippocampal slices (LTP is considered by most to be the best molecular and physiological model we have of learning). As judged by intracellular calcium levels, hippocampal neuronal activity is also inhibited by Aneta-alpha.

What’s fascinating about all this is that the work possibly explains why the huge amount of money big pharma has spent on beta secretase inhibitors has been such a failure.

Maybe chemistry just isn’t that important in wiring the brain

Even the strongest chemical ego may not survive a current paper which states that the details of ligand receptor binding just aren’t that important in wiring the fetal brain.

The paper starts by noting that there isn’t enough information in our 3.2 gigaBase genome to specify each and every synapse. Each cubic milliMeter of cerebral cortex is stated to contain a billion of them [ Cell vol. 163 pp. 277 – 280 ’15 ].

If you have enough receptors and ligands and use them combinatorially, you actually can specify quite a few synapses. We have 70 different protocadherin gene products found on the neuronal surface. They can bind to each other and themselves. The fruitfly has the dscam genes which guide axons to their proper position. Because of alternative splicing some 38,016 dscam isoforms are possible.
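Where does 38,016 come from? A short Python sketch, using the commonly cited counts of alternative exons in the fruitfly Dscam1 gene. These counts are not in the text above; I'm supplying them as the standard textbook figures, and the dictionary name is mine. One alternative is chosen from each of four exon clusters, so the counts multiply.

```python
import math

# Number of mutually exclusive alternatives in each variable exon cluster
# of Drosophila Dscam1 (commonly cited figures, not taken from this post).
DSCAM1_ALTERNATIVE_EXONS = {
    "exon 4": 12,
    "exon 6": 48,
    "exon 9": 33,
    "exon 17": 2,
}


def isoform_count(exon_clusters):
    """Possible isoforms when one alternative is picked from every cluster."""
    return math.prod(exon_clusters.values())


if __name__ == "__main__":
    print(isoform_count(DSCAM1_ALTERNATIVE_EXONS))
```

This is the combinatorial point in miniature: four modest choices multiply out to tens of thousands of distinct surface proteins.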

It’s not too hard to think of these different proteins on the neuronal surface as barcodes, specifying which neuron will bind to which.

Not so, says [ Cell vol. 163 pp. 285 – 291 ’15 ]. What is important is that there are a lot of them, and that a neuron expressing one of them is unlikely to bump into another neuron carrying the same one. Neurons ‘like’ to form synapses, and will even form synapses with themselves (one process synapsing on another) if nothing else is around. These self synapses are called autapses. How likely is this? Well under each square millimeter of cortex in man there are some 100,000 neurons, and each neuron has multiple dendrites and axons. Self synapse formation is a real problem.

The paper says that the structure of all these protocadherins, dscams and similar surface molecules is irrelevant to what program they are carrying out — not synapsing on yourself. If a process bumps into another in the packed cortex with the same surface molecule, the ‘homophilic’ binding prevents self-synapse formation. So the chemical diversity is just the instantiation of the ‘don’t synapse with yourself’ rule — what’s important is that there is a lot of diversity. Just what this diversity is chemically is less important than there is a lot of it.

This is another example of “It’s not the bricks, it’s the plan” in another guise —

The next big drug target – II

In a post a week ago I argued that the next big drug target was the protein protein interface. The PNAS of 6 Oct had a paper indirectly confirming just that [ Proc. Natl. Acad. Sci. vol. 112 pp. E5486 – E5495 ’15 ]. What they did was fairly simple (intellectually) but a lot of work. They just analyzed the PanCancer compendium of somatic mutations from 4,742 tumors relative to all known 3 dimensional structures of human proteins in the Protein Data Bank. They looked for clustering of the mutations on the protein surface (or interior). As you all know, although proteins are a linear string of amino acids, they fold up like a hair ball, so widely separated amino acids in the sequence may be right next to each other in the 3 dimensional structure of the protein.

What’s so confirmatory of the previous post was that they found enrichment of mutations in the interfaces between a variety of oncoproteins and other proteins (including tumor suppressors). Most of the significant interfaces carried mutations in both interaction partners. Overall, they found 50 different proteins with clustering of mutations and/or enrichment of mutations at interaction interfaces. Here are the names of a few of the culprits for the cognoscenti: FBXW7-CCNE1, HRAS-RASA1, CUL4B-CAND1, OGT-HCFC1, PPP2R1A-PPP2R5C/PPP2R2A, DICER1-Mg2+, MAX-DNA, SRSF2-RNA, and others. The paper contains much more detail than this and discusses the significance of the protein pairs shown above. One example should suffice:

FBXW7-CCNE1. Cyclin E1 (CCNE1) is a critical cell cycle protein, which at abnormally high levels promotes premature cell division, genomic instability, and tumorigenesis. FBXW7 (F-box/WD repeat-containing protein 7) is a substrate recognition component of an E3 ubiquitin-protein ligase complex, mediating the ubiquitination and subsequent proteasomal degradation of CCNE1 and other cancer proteins like MYC and JUN. We found that all six recurrently mutated residues (found in at least three samples from our mutation dataset) of FBXW7 clustered together at the WD40 propeller domain of the protein product. Four of them, R465, R479, R505, and R689, interacted directly with the substrate CCNE1 through hydrogen bonds (Fig. 5A). Changes in these residues could perturb the interaction, causing insufficient ubiquitination/degradation of CCNE1 in tumor samples (as has been previously shown in model systems).

Here’s the post of a week ago

The next big drug target

So many of the molecular machines used in the cell are composed of many different proteins held together by nonCovalent interactions. The Mediator complex contains 25 – 30 proteins with a mass of 1.6 megaDaltons, RNA polymerase contains 12 subunits, the general transcription factors contain 25 proteins, our ribosome with a mass of 4.3 megaDaltons contains 47 in the large subunit and 33 in the small. The list goes on and on: proteasome, nucleosome, post-synaptic density.

The typical protein/protein interface has an area of 1,000 – 2,000 square Angstroms, or circles of diameter between about 36 and 50 Angstroms [ Proc. Natl. Acad. Sci. vol. 101 pp. 16437 – 16441 ’04 ]. Think of the largest classical organic molecule you’ve ever made (not any polymer like a protein, polynucleotide, or polysaccharide). It isn’t anywhere close to this.
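To get a feel for the size, here is a minimal Python sketch converting an interface area into the diameter of a circle of the same area. Treating the interface as a flat circle is of course a crude assumption (real interfaces are irregular and not planar), and the function name is mine.

```python
import math


def equivalent_circle_diameter(area_sq_angstroms):
    """Diameter (Angstroms) of a flat circle with the given area.

    From area = pi * (d / 2) ** 2, so d = 2 * sqrt(area / pi).
    """
    return 2 * math.sqrt(area_sq_angstroms / math.pi)


if __name__ == "__main__":
    for area in (1000, 2000):
        d = equivalent_circle_diameter(area)
        print(f"{area} square Angstroms -> diameter {d:.1f} Angstroms")
```

For comparison, a carbon-carbon single bond is about 1.5 Angstroms long, so these circles are a couple of dozen bond lengths across.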

Yet I’m convinced that drugs targeting these complexes will be useful. Classical organic chemistry will be useless in designing them. We’ll have to forget our beloved SN1, SN2, nonclassical carbonium ions etc. etc. We need some new sort of physical organic chemistry, one not concerned with reaction mechanism, but with van der Waals interactions and electrostatic interactions. At least stereochemistry will still be important.

The problem is much harder than designing enzyme inhibitors, or their allosteric modifiers, because the target is so large.

What follows are some notes on the protein protein interface I’ve taken over the years to get you started thinking. Good luck. Don’t expect any neat answers. There is a lot of contention concerning the nature of the binding occurring at the interface.

Many of the references aren’t particularly new. In my reading, I don’t try for the latest reference, but the newest idea that I’m unfamiliar with. I think they pretty much cover the territory as it stands now.

[ Proc. Natl. Acad. Sci. vol. 108 pp. 603 – 608 ’11 ] A very interesting article argues that worms and humans have about the same number of proteins (20,000) because if they had more, nonspecific protein protein interactions would cause disease. The achievable energy gap favoring specific over nonspecific binding decreases with protein number in a power law fashion (in their model). The optimization of binding interfaces favors networks in which a few proteins have many partners and most proteins have just a few — this is consistent with a scale free network topology.

[ Proc. Natl. Acad. Sci. vol. 101 pp. 16437 – 16441 ’04 ] The hot spot theory of protein protein interactions says that the binding energy between two proteins is governed in large part by just a few critical residues at the binding interface. In a typical interface of 1000 – 2000 square Angstroms, only 5% of the residues from each protein contribute more than 2 kiloCalories/mole to the binding interaction. (This is controversial — see later)

[ Proc. Natl. Acad. Sci. vol. 99 pp. 14116 – 14121 ’02 ] Specific replacement of amino acids in the interface by alanine (alanine scanning or alanine mutagenesis) and measuring the effect on the interaction has led to the idea that only a small set of ‘hot spot’ residues at the interface contribute to the binding free energy. A hot spot has been defined as a residue that when mutated to alanine leads to a significant drop in the binding constant (typically 10 fold or higher; a 10 fold drop works out to about 1.4 kiloCalories/mole at room temperature). This was well worked out for human growth hormone (HGH) and its receptor. Subsequently ‘many’ other studies have suggested that the presence of a few hot spots may be a general characteristic of most protein/protein interfaces.
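How many kiloCalories is a 10 fold drop in the binding constant? A rough Python sketch, assuming the standard thermodynamic relation (change in binding free energy = RT ln of the fold change) and a temperature of 25 C. The temperature is my assumption; the papers may have worked at a different one. The function names are mine.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)
ROOM_TEMPERATURE_K = 298.15      # 25 C, an assumed temperature


def kcal_per_mole_from_fold(fold_change, temperature_k=ROOM_TEMPERATURE_K):
    """Free energy change (kcal/mol) for a given fold change in binding constant."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)


def fold_from_kcal_per_mole(kcal, temperature_k=ROOM_TEMPERATURE_K):
    """Fold change in binding constant for a given free energy change (kcal/mol)."""
    return math.exp(kcal / (GAS_CONSTANT_KCAL * temperature_k))


if __name__ == "__main__":
    print(f"10 fold    -> {kcal_per_mole_from_fold(10):.2f} kcal/mol")
    print(f"2 kcal/mol -> {fold_from_kcal_per_mole(2):.0f} fold")
```

So a 10 fold drop is about 1.4 kiloCalories/mole at 25 C, and the 2 kiloCalories/mole threshold used in the hot spot work corresponds to roughly a 30 fold change.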

However there is extreme variation in the size, shape, amino acid character and solvent content of the protein/protein interface. It is not obvious from looking at structural contacts which residues are important for binding. Usually they are found at the center of the interface but sometimes the key residues can lie on the periphery. Peripheral residues serve as an O-ring to exclude solvent from the center. A lowered effective dielectric constant in a ‘dryer’ environment strengthens electrostatic and hydrogen bonding interactions. An interaction deleted by alanine mutagenesis in the periphery can be replaced by a water molecule in the periphery and hence cause less loss in stability (this calls the whole concept of alanine scanning into question).

Interestingly, there is no general correlation between ‘surface accessibility’ and the contribution of a residue to the binding energy.

Polar residues (Arg, Gln, His, Asp, and Asn) are conserved in interfaces. This implies that they are hot spots — implies ? don’t they know? haven’t they tested? However, many interaction hot spots involve hydrophobic or large aromatic residues (also hydrophobic). It is unclear whether buried polar interactions are energetically net stabilizing or merely facilitating specificity (how would you tell?).

Some residues without significant contacts in the interface apparently contribute substantially to the free energy of binding when assayed by alanine scanning mutagenesis, because of destabilization of the unbound protein.

This is a report of a free energy function (using packing interactions, hydrogen bonds and an implicit solvation model) which predicts 79% of all interface hot spots. They think that a description of polar interactions with Coulomb electrostatics with a linear distance dependent dielectric constant. ??? The latter ignores the orientation dependence of the hydrogen bond. Also the assumption that acidic or basic residues largely buried in the interface are charged may be wrong. The enthalpic gains of ionization are offset by the cost of desolvating polar groups, and the loss in side chain conformational entropy.

[ Proc. Natl. Acad. Sci. vol. 101 pp. 16437 – 16441 ’04 ] It is of interest to find out if hot spot theory applies to transient protein protein interactions (such as those involved in enzyme catalysis). This work looked for them in the process of protein substrate recognition for the Cdc25 phosphatase (which dephosphorylates the cyclin dependent kinases). Crystal structures of the catalytic domains of Cdc25A and Cdc25B have shown a shallow active site with no obvious features for mediating substrate recognition. This suggests a broad protein interface rather than a lock and key interaction. This is supported by the activity of the Cdc25 phosphatases toward Cdk/cyclin protein substrates, which is 6 orders of magnitude greater than that toward peptidic substrates containing the same primary sequence. The shallow active site also correlates with the lack of potent specific inhibitors of the Cdc25 phosphatases, despite extensive search. This work finds hot spot residues in the catalytic domain (not the catalytic site) of Cdc25B located 20 – 30 Angstroms away from the active site. They are involved in recognition of substrate. The residues are conserved across eukaryotes.

[ Proc. Natl. Acad. Sci. vol. 101 pp. 11287 – 11292 ’04 ] One can study the effects of mutating a single amino acid on two separate rates (the on rate and the off rate), the ratio of which is the equilibrium constant. Mutations changing the on rate concern the specificity of protein protein interaction. Mutations only changing the off rate do not affect the transition state of protein binding (don’t see why not). Mutations in bovine pancreatic trypsin inhibitor (BPTI) have been found at positions #15 and #17 which differentially affect on and off rates. K15A decreases the on rate by 200 fold and increases the off rate by 1000 fold. R17A doesn’t change the on rate but also increases the off rate by 1000 fold.
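Since the equilibrium (dissociation) constant is the off rate divided by the on rate, the rate changes quoted for the two BPTI mutants can be combined into an overall change in affinity. What follows is a rough sketch under my own assumptions: the fold changes are taken exactly as given in these notes, the temperature is 25 C, and the names in the code are made up for illustration.

```python
import math

# Gas constant in kcal/(mol*K); 25 C is an assumed temperature.
R_KCAL = 1.987204e-3
TEMP_K = 298.15


def kd_fold_change(on_rate_factor, off_rate_factor):
    """Fold change in Kd = k_off / k_on.

    Each argument is the multiplicative change in that rate caused by the
    mutation (e.g. 1/200 for a 200 fold slower on rate).
    """
    return off_rate_factor / on_rate_factor


# (on rate factor, off rate factor), as quoted in the notes above.
MUTANTS = {
    "K15A": (1 / 200, 1000),  # on rate down 200 fold, off rate up 1000 fold
    "R17A": (1.0, 1000),      # on rate unchanged, off rate up 1000 fold
}

if __name__ == "__main__":
    for name, (on_factor, off_factor) in MUTANTS.items():
        fold = kd_fold_change(on_factor, off_factor)
        ddg = R_KCAL * TEMP_K * math.log(fold)
        print(f"{name}: Kd weaker by about {fold:,.0f} fold, ddG about {ddg:.1f} kcal/mol")
```

On these numbers K15A weakens binding roughly 200,000 fold overall and R17A 1,000 fold, even though both mutations change the off rate by the same amount.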

The concept of anchor residue arose in the study of peptide binding to class I MHC molecules (Major HistoCompatibility complex). In this system the carboxy terminal side chain of the peptide gets buried in pocket F of the MHC binding groove. Sometimes, one also finds a second anchor residue and even a third one buried at other positions.

The authors attempt to apply the anchor residue concept to protein protein interactions. They studied 39 different protein/protein complexes. They found anchor residues, and somehow conclude that these are already in the ‘bound’ conformation in the free partner. The anchors interact with structurally constrained pockets matching the anchor residues. The presence of nativelike anchor side chains provides a readily attainable geometrical fit that jams the two interacting surfaces, allowing for the recognition and stabilization of a near-native intermediate. Subsequently an induced fit process occurs on the periphery of the binding pocket.

The analysis of ANY (really?) protein/protein complex at the atomic length scale shows that the interface, rather than being smooth and flat, includes side chains deeply protruding into well defined cavities on the other protein. In all complexes studied, the anchor is the side chain whose burial after complex formation yields the largest possible decrease in solvent accessible surface area (SASA). If the SASA is over 100 square Angstroms, then only one anchoring interaction is present. For amino acids with a lesser SASA, one anchor isn’t enough.

In all cases tested (39) latch side chains are found in conformations conducive to a relatively straightforward clamping of the anchored intermediate into a high affinity complex.

[ Proc. Natl. Acad. Sci. vol. 102 pp. 57 – 62 ’05 ] An analysis of the protein interface between a beta-lactamase and its inhibitor shows that the interface can be divided into clusters (by means of cluster analysis) using multiple mutant analysis and xray crystallography. Replacing an entire module of 5 interface residues with alanine (in one cluster) created a large cavity in the interface with no effect on the detailed structure of the remaining interface. They obtained similar results when they did this with another of the 5 clusters.

Mutating a single amino acid at a time has been done in the past, but the results of single mutations aren’t additive (i.e. they aren’t linear, which is no surprise). The sum of the loss in free energy of all of the single mutations within a cluster exceeds by 4 fold the loss in free energy generated when all of the residues of the cluster are mutated simultaneously. The energetic effect of many single mutations is larger than their net contribution due to a penalty paid by leaving the rest of the cluster behind.

“Binding seems to be a result of higher organization of the binding sites, and not just of surface complementarity.”

[ Proc. Natl. Acad. Sci. vol. 103 pp. 311 – 316 ’06 ] Two different ‘interactomes’ both show the same power law distribution of node sizes. However, when the two major S. cerevisiae protein/protein interaction experiments are compared with each other, only 150 of the THOUSANDS of interactions in each experiment are the same. A similar lack of agreement has been found for independent Y2H experiments in Drosophila.

This work says that desolvation of the interface is a major physical factor in protein/protein interactions. This model reproduces the scale free nature of the topology. The number of interactions made by a protein is correlated with the fraction of hydrophobic residues on its surface.

[ Proc. Natl. Acad. Sci. vol. 108 pp. 13528 – 13533 ’11 ] The drugs they are looking for disrupt specific protein protein interactions (PPIs). They use computational solvent mapping, which explores the protein surface using a variety of small probe molecules, along with a conformer generator to account for side chain flexibility. They studied unliganded proteins known to participate in PPIs. The surface cavities available at protein protein interfaces which can bind a small molecule inhibitor are rather different from those seen in traditional drug targets. The traditional targets have one or two disproportionately large pockets with an average volume of 260 cubic Angstroms; these account for the binding site for the endogenous ligand in over 90% of proteins. The average volume of pockets at protein protein interfaces is only 54 cubic Angstroms, the same as for all protein surface pockets. The interface contains 6 such small pockets (on average).

The binding sites of proteins generally include smaller regions called hotspots which are major contributors to the binding free energy. The results of experimental fragment screens confirm that the hot spots of proteins are characterized by their ability to bind a variety of small molecules, and that the number of different probe molecules observed to bind to a particular site predicts the importance of the site and predicts overall druggability.

This work shows that the druggable sites in PPIs have concave topology and both hydrophobic and polar functionality. So the hotspots bind organic molecules having some polar groups decorating largely hydrophobic scaffolds. Conformational flexibility at the binding site (by side chains?) allows the hotspots to expand to accommodate a ligand of druglike dimensions. This involves low energy side chain motions within 6 Angstroms of a hot spot.

So druggable sites at a PPI aren’t just sites complementary to particular organic functionality, but they have a ‘general tendency’ to bind a variety of different organic structures.

The most important finding is that the druggable sites are detectable from the structure of the unliganded protein, even when substantial conformational adaptation is needed for optimal ligand binding.

[ Science vol. 347 pp. 673 – 677 ’15 ] Mapping the sequence space of 4 key amino acids in the E. coli protein kinase PhoQ which drive the recognition of its substrate (PhoP). For histidine kinases, mutating just 3 or 4 interfacial amino acids to match those in another kinase is enough to reprogram them. The key residues are Ala284, Val285, Ser288, Thr289.

All 20^4 = 160,000 variants of PhoQ at these positions were made, of which 1,659 were functional (implying significant degeneracy of the interface). There were 16 single mutants, 100 double, 544 triple and 998 quadruple mutants which were functional (these sum to 1,658, so the 1,659 presumably includes the wild type). There was an enrichment of hydrophobic and small polar residues at each position. Most bulky and charged residues appeared at low frequencies. Some substitutions were permissible individually, but not in combination. The combinations ACLV, TISV and SILS, each involving residues found individually in functional mutants at high frequency, were quite impaired in competition against wildtype PhoQ, so the effects of individual substitutions are context dependent (epistatic). Of the 100 functional double mutants, only 23 represent cases where both single mutants are functional. There are double mutants where neither single mutant is functional. 79/1,658 functional variants can’t be reached from the wild-type combination (AVST) without passing through a nonfunctional intermediate. They talk about the Hamming distance between mutants.
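To see how sparse the functional variants are at each mutational distance from wild type, here is a small sketch that counts how many variants are possible at each Hamming distance from AVST and compares that with the functional counts quoted above. The function names are made up for illustration, and the per-distance breakdown is my own arithmetic rather than something reported in the paper.

```python
from math import comb

WILD_TYPE = "AVST"      # wild-type residues at positions 284, 285, 288, 289
N_AMINO_ACIDS = 20


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def variants_at_distance(k, n_positions=len(WILD_TYPE)):
    """Count of variants exactly k substitutions away from the wild type.

    Choose which k positions change, then pick one of the 19 other
    amino acids at each of them.
    """
    return comb(n_positions, k) * (N_AMINO_ACIDS - 1) ** k


# Functional mutant counts by number of substitutions, as quoted in the notes.
FUNCTIONAL = {1: 16, 2: 100, 3: 544, 4: 998}

if __name__ == "__main__":
    total = sum(variants_at_distance(k) for k in range(len(WILD_TYPE) + 1))
    print(f"total variants: {total:,}")
    for k, n_functional in FUNCTIONAL.items():
        possible = variants_at_distance(k)
        print(f"distance {k}: {n_functional:>4} of {possible:>7,} functional "
              f"({100 * n_functional / possible:.1f}%)")
    for combo in ("ACLV", "TISV", "SILS"):
        print(f"{combo} is {hamming(WILD_TYPE, combo)} substitutions from {WILD_TYPE}")
```

The fraction of functional variants falls steeply with distance, since the number of possible variants grows by a factor of 19 per added substitution while the functional counts grow much more slowly.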

Finally some blue sky stuff — implying that (as usual) Nature got there first

[ Science vol. 341 pp. 1116 – 1120 ’13 ] Small Open Reading Frames (smORFs) code for peptides of under 100 amino acids. This work has shown that peptides as short as 11 amino acids are translated and provide essential functions during insect development. This work shows two peptides of 28 and 29 amino acids regulating calcium transport in the Drosophila heart. The peptides are found in man.
They think that smORFs can’t be dismissed as irrelevant, and that function should be looked for.
[ Science vol. 1356 – 1358 ’15 ] The Drosophila polished-rice (Pri) sORF peptides (11 – 32 amino acids) trigger proteasome mediated processing, converting the Shavenbaby transcription repressor into a shorter activator.
They think that sORFs/smORFs mimic protein binding interfaces and control protein interactions that way.

Just before the battle, mother

This is not a scientific post. Tomorrow I’m going to play piano at a memorial service for the husband of a coworker of my cellist. Wish me luck. I hate playing in public although I love playing chamber music with friends. The fact that one of the pieces I’m going to play is something I wrote is no consolation, since last week my sister-in-law, an accomplished composer who’s had some of her stuff performed in Carnegie Hall, told me that playing something you wrote is no guarantee that you won’t screw up. Thanks a lot.

The distaste for performing in public goes back to grade school when I first started taking lessons. We had to play from memory when the teacher would parade all his pupils at a concert. I never actually screwed up, but was always afraid that I’d get in an endless loop while playing and be unable to get out. I’ve seen this happen once, and even watching it was excruciatingly painful. In fact that’s how I found that my mother was not omniscient. Every winter I was told “Close your galoshes and button up or you’ll get sick”. So two weeks before each recital I’d run around with open galoshes and a wide open jacket in the hopes that I’d get sick and miss the recital, but I never did.

So wish me luck.  The following fits my current mood

Just before the battle, mother,
I am thinking most of you,
While upon the field we’re watching
With the enemy in view.
Comrades brave are ’round me lying,
Filled with thoughts of home and God
For well they know that on the morrow,
Some will sleep beneath the sod.

Popular song during the American Civil War

