Tag Archives: protein folding

The humble snow flea teaches us some protein chemistry

Who would have thought that the humble snow flea (which we used to cross-country ski over in Montana) would teach us a great deal about protein chemistry, overturning some beloved shibboleths in the process?

The flea contains an antifreeze protein, which stops ice crystals from forming inside the cells of the flea in the cold environment in which it lives. The protein contains 81 amino acids, is 45% glycine and contains six type II polyProline helices, each 8 amino acids long (https://en.wikipedia.org/wiki/Polyproline_helix). None of the 6 polyProline helices contains proline despite the name, but all contain from 2 to 6 glycines. Also to be noted are (1) the absence of a hydrophobic core (2) the absence of alpha helices (3) the absence of beta turns (4) the low sequence complexity of the protein.

Nonetheless it quickly folds into a stable structure, meaning that the features missing in (1), (2), and (3) are not necessary for a stable protein structure. (4) means that low sequence complexity in a protein sequence does not invariably imply an intrinsically disordered protein.

You can read all about it in Proc. Natl. Acad. Sci. vol. 114 pp. 2241 – 2246 ’17.

Time for some humility in what we thought we knew about proteins, protein folding, and protein structural stability.


Aromatic rings are planar in proteins aren’t they? More trouble for computational chemistry

Every organic chemistry book worth its salt has a diagram of the heats of hydrogenation of benzene (the new Clayden has it on p. 158).  Adding H2 across a double bond releases energy, because saturated hydrocarbons have a lower energy than alkenes.  However the heat of hydrogenation of benzene is some 208 kiloJoules/mole, which is considerably less than 3 times the heat of hydrogenation of cyclohexene (3 * 120 kiloJoules/mole).  Then we’re off for a romp through the planarity of benzene allowing the p orbitals to overlap, the Hückel rules etc. etc.  It’s why benzene (and the aromatic bases of the nucleotides making up DNA and RNA) are flat: move one of the atoms out of the plane, and you decrease overlap, raise energy, etc. etc.
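For anyone who wants the textbook arithmetic spelled out, here is a minimal Python sketch. It uses only the two heats of hydrogenation quoted above; the function and variable names are made up for illustration.

```python
# Heats of hydrogenation quoted in the text, in kiloJoules/mole.
CYCLOHEXENE_KJ = 120.0  # one isolated C=C double bond
BENZENE_KJ = 208.0      # the whole aromatic ring


def aromatic_stabilization(benzene_kj=BENZENE_KJ,
                           cyclohexene_kj=CYCLOHEXENE_KJ,
                           n_double_bonds=3):
    """Heat expected for n isolated double bonds minus the measured heat."""
    expected_kj = n_double_bonds * cyclohexene_kj
    return expected_kj - benzene_kj


stabilization_kj = aromatic_stabilization()
print(f"benzene is about {stabilization_kj:.0f} kJ/mol more stable than "
      "three isolated double bonds would be")
```

On these numbers the hypothetical "cyclohexatriene" would release 360 kiloJoules/mole, so benzene comes out roughly 152 kiloJoules/mole more stable than three isolated double bonds.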

Except that there are 19 proteins where this isn’t the case for the 6 membered rings of phenylalanine or tyrosine.  A truly fascinating paper [ Proc. Natl. Acad. Sci. vol. 109 pp. 9414 – 9419  ’12 ] describes alpha-lytic protease (alphaLP from here on) in which phenylalanine #228 has a bent benzene ring.  Even more interestingly, this raises the energy of the protein and appears to be an integral part of the protein’s biological functioning.   It’s not an accident.

When you do Xray crystallography, 2 Angstrom resolution is usually enough to show you what’s going on (the C – C bond is 1.54 Angstroms).  However, ultrahigh resolution structures (resolutions under 1 Angstrom) have become available for some 100 proteins, allowing you to see if aromatic rings are truly flat.

Phenylalanine #228 of alphaLP is not flat at all, deviating by 6 degrees from planarity.  How much is this?  Well the benzene carbon-carbon bond length is 1.4 Angstroms, so it’s 1.4 + 2 * 1.4 * sine (30 degrees) = 1.4 + 2 * 1.4 * 1/2 = 2.8 Angstroms from carbon 1 to carbon 4.  How far does 6 degrees take carbon 4 out of the plane of carbons 1, 2 and 6?  It’s 2.8 * sine (6 degrees).  Since sine (6 degrees) is about .10, this means that carbon 4 is only about .28 Angstroms out of the plane.  High resolution indeed.
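The same back-of-the-envelope estimate as a short Python sketch. It assumes the simple picture used above (a regular hexagon with 1.4 Angstrom sides, and the carbon 1 to carbon 4 line tilted by the full deviation angle), which is my simplification and not necessarily how the paper defines the deviation. The function names are just for illustration.

```python
import math

CC_AROMATIC = 1.4  # aromatic carbon-carbon bond length, Angstroms


def para_distance(bond=CC_AROMATIC):
    """Carbon 1 to carbon 4 distance across a regular hexagon."""
    return bond + 2 * bond * math.sin(math.radians(30))


def out_of_plane(angle_deg, bond=CC_AROMATIC):
    """Displacement of carbon 4 if the C1-C4 line tilts by angle_deg."""
    return para_distance(bond) * math.sin(math.radians(angle_deg))


displacement = out_of_plane(6)
print(f"C1 to C4: {para_distance():.2f} Angstroms")
print(f"out of plane at 6 degrees: {displacement:.2f} Angstroms")
```

With the unrounded sine the displacement comes out nearer .29 than .28 Angstroms, which doesn't change the point: it is a small fraction of a bond length.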

Now it gets interesting, from both a chemical and biological point of view.  It turns out that, purely on an energetic basis, the unfolded form of alphaLP is 4 kiloCalories/mole lower in energy than the folded (native) form, so the native form is metastable.  However, it is kinetically stable, with a half-life for unfolding of 1.2 years, a classic example of a kinetically stable, thermodynamically unstable chemical entity.

It gets more interesting (and confusing to me) because folding is said to have a half-life of 1,800 years (4 kiloCalories/mole shouldn’t make that much difference, should it?).  Does anyone out there know why the folding and unfolding barriers should be so different?  So how does the protein get into the native configuration?  By a covalently attached folding catalyst (called the pro region), which is removed when the native state is reached.  Kinetic stability seems to exact a toll on the difficulty of folding, one which selection is willing to pay.
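A rough check on that question, as a Python sketch. It assumes simple two-state kinetics (native and unfolded forms only, so the ratio of the two rate constants equals the equilibrium constant) and a temperature of 25 degrees C. Both are my assumptions, not numbers from the paper, and the function names are invented for the sketch.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kiloCalories/(mole * Kelvin)


def rate_ratio(delta_g_kcal, temp_k=298.15):
    """Two-state ratio k_unfold / k_fold for a free energy gap delta_g_kcal."""
    return math.exp(delta_g_kcal / (R_KCAL * temp_k))


def implied_gap(fold_half_life, unfold_half_life, temp_k=298.15):
    """Free energy gap (kiloCalories/mole) implied by two half-lives."""
    return R_KCAL * temp_k * math.log(fold_half_life / unfold_half_life)


ratio = rate_ratio(4.0)                      # from the 4 kiloCalories/mole gap
predicted_fold_half_life = 1.2 * ratio       # years, from the 1.2 year unfolding
gap_from_half_lives = implied_gap(1800, 1.2) # from the two quoted half-lives

print(f"rate ratio for 4 kcal/mol: about {ratio:.0f}")
print(f"predicted folding half-life: about {predicted_fold_half_life:.0f} years")
print(f"gap implied by 1,800 vs 1.2 years: {gap_from_half_lives:.1f} kcal/mol")
```

Under those assumptions 4 kiloCalories/mole is worth a factor of several hundred in rate, so a folding half-life on the order of a thousand years is about what you'd expect, and the quoted 1,800 years corresponds to a gap a bit above 4 kiloCalories/mole.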

Now it’s time to look at the environment of phenylalanine #228.  The ring is being pushed out of shape by threonine #181 below and tryptophan #199 above.  So the authors did the obvious, replacing Thr181 by glycine and then alanine and watching what happened.  The mutants unfolded faster, so the distortion is in some way raising the energy of the transition state for unfolding, and thus is functionally important in the kinetic stability of the protein.  The authors are silent as to the actual structure of the transition state for unfolding, but rates are rates and their conclusions seem sound.  As they say in computer land, that’s not a bug, that’s a feature.  Why would you want alphaLP to be kinetically rather than thermodynamically stable?  The authors think that kinetic stability makes alphaLP better able to survive in harsh environments.  Perhaps.

Well, how common is this?  There are some 100 protein structures now available at ultrahigh resolution.  19 of them have aromatic side chains that are nonplanar by 6 degrees or more (see figure 5 p. 9418).  Who’d a thunk it.  One wonders how many structures were thrown out because everyone knew that aromatic rings are planar.

What does this mean for the computational chemist?  The low energy form may not actually be the important one.  What we’ve assumed about side chains may not be true.  It makes the protein folding problem even more complicated.

They don’t discuss tryptophan planarity.  Clearly more ultrahigh resolution studies of proteins are needed.  Think of the decades spent studying proteins, and here’s something brand new.  Reading the scientific literature is like reading a Russian novel with thousands of new characters popping up and doing  the unexpected.

Where has all the chemistry gone?

Devoted readers of this blog (assuming there are any) must be wondering where all the chemistry has gone.  Willock’s book convinced me of the importance of group theory in understanding what solutions we have of the Schrodinger equation.  Fortunately (or unfortunately) I have the mathematical background to understand group characters and group representations, but I found Willock’s presentation of just the mathematical results unsatisfying.

So I’m delving into a few math books on the subject. One is “Representations and Characters of Groups” by James and Liebeck (which provides an application to molecular vibration in the last chapter starting on p. 367).  It’s clear, and for the person studying this on their own, does have solutions to all the problems. Another is “Elements of Molecular Symmetry” by Ohrn, which I liked quite a bit.  But unfortunately I got stymied by the notation M(g)alpha(g) on p. 28. In particular, it’s not clear to me if the A’s in equations (4.12) and (4.13) are the same thing.

I’m also concurrently reading two books on Computational Chemistry, but the stuff in there is pretty cut and dried and I doubt that anyone would be interested in comments as I read them.  One is “Essential Computational Chemistry” by Cramer (2nd edition).  The other is “Computational Organic Chemistry” by Bachrach.  The subject is a festival of acronyms (and I thought the army was bad) and Cramer has a list of a mere 284 of them starting on p. 549. On p. 24 of Bachrach there appears the following: “It is at this point that the form of the functionals begins to cause the eyes to glaze over and the acronyms appear to be random samplings from an alphabet soup.”  I was pleased to see that Cramer thinks 40 pages or so of Tom Lowry and Cathy Richardson’s book are still worth reading on molecular orbital theory, even though it was 24 years old at the time Cramer referred to it.  They’re old friends from grad school.  I’m also pleased to see that Bachrach’s book contains interviews with Paul Schleyer (my undergraduate mentor).  He wasn’t doing anything remotely approaching computational chemistry in the late 50s (who could?).  Also there’s an interview with Ken Houk, who was already impressive as an undergraduate in the early 60s.

Maybe no one knows how all of the above applies to transition metal organic chemistry, which has clearly revolutionized synthetic organic chemistry since the 60’s, but it’s time to know one way or the other before tackling books like Hartwig.

Another (personal) reason for studying computational chemistry is so I can understand if the protein folding people are blowing smoke or not.  Also it appears to be important in drug discovery, or at least is supporting Ashutosh in his path through life.  I hope to be able to talk intelligently to him about the programs he’s using.

So stay tuned.