There’s an old joke about a drunk who lost his keys coming home from the bar one night. Daybreak found him still crawling around searching under a lamppost. A passerby asked him why he didn’t look elsewhere. The drunk’s answer: “That’s where the light is.”
Could biochemists be doing something similar? Have a look at [ Science vol. 325 pp. 1230 – 1234 ’09 (4 September 2009) ], where a new type of bond between amino acids was found: a sulfilimine bond ( – S = N – ) which crosslinks hydroxylysine #211 and methionine #93 of adjoining protomers of type IV collagen. By way of background, collagens account for 25% by mass of all mammalian proteins (Molecular Biology of the Cell 4th Ed. p. 1096). Humans have 25 genes for collagen. Since each collagen contains 3 protein chains, each of which could in principle come from any of the 25 genes, we could have 25 × 25 × 25 = 15,625 different collagens, but so far only 20 have been found. Type IV collagen is particularly important medically since it is the main type of collagen found in the basement membranes which surround capillaries and muscle cells. Basement membrane thickening occurs early in diabetes (both human and experimental) and is thought to be one source of the vascular problems that diabetics get. Whether the thickening causes glucose intolerance or is caused by it isn’t known.
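The count of possible collagens is simple enough to check in a few lines of Python. This is only a sketch of the counting argument: the 15,625 figure treats the three chains as an ordered triple, and the second number is my own addition, showing what you get if the order of the chains is assumed not to matter. Neither says anything about which combinations can actually assemble.

```python
from math import comb

N_GENES = 25   # collagen genes, as quoted in the text
CHAINS = 3     # chains per collagen molecule

# Ordered triples: the position of each chain in the trimer matters.
ordered = N_GENES ** CHAINS

# Unordered triples with repeats allowed (an assumption, not from the text):
# multisets of size 3 drawn from 25 gene products.
unordered = comb(N_GENES + CHAINS - 1, CHAINS)

print(ordered, unordered)
```

Either way the number of conceivable collagens is far larger than the 20 found so far.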
The sulfilimine bond had never previously been found in biomolecules. The crosslink between two protein chains in type IV collagen had been known to exist for 20 years, but no one could figure out what it was. The authors used Fourier transform ion cyclotron resonance (FTICR) mass spectrometry, which can achieve a mass accuracy of 2 parts per million (impressive). All sorts of spectroscopy were used as well, with NMR playing a big role. A real tour de force.
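To get a feel for what 2 parts per million means, here is a minimal Python sketch that converts a ppm accuracy into an absolute mass error. The 5000 Da peptide mass is a made-up round number for illustration, not a value from the paper, and the function name is mine.

```python
def ppm_window(mass_da: float, ppm: float = 2.0) -> float:
    """Return the absolute mass error in daltons for a given ppm accuracy."""
    return mass_da * ppm / 1e6

# Hypothetical peptide mass, chosen only to make the arithmetic easy.
example_mass = 5000.0
error_da = ppm_window(example_mass)

print(f"{example_mass} Da at 2 ppm -> +/- {error_da} Da")
```

At that accuracy the error on a 5000 Da fragment is about a hundredth of a dalton, which is a small fraction of the mass of a single hydrogen atom.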
One chemical question for the cognoscenti. The structure given in figure 2 p. 1232 shows sulfur attached by a double bond to the nitrogen of hydroxylysine, but it is also attached by single bonds to a methyl group and to the rest of the methionine amino acid. No charge is shown on the sulfur atom (or anywhere for that matter). Structures for S-adenosyl methionine (where sulfur is bound by 3 covalent bonds to carbon) always put a positive charge on the sulfur. Shouldn’t sulfilimine show a charge of +2 on the sulfur atom?
How many other weirdnesses are out there? In this case biochemists were forced to look for the bond because they knew it was there. Are protein biochemists basically looking under the lamppost?
Another example of this might be found in a post of mine on the Skeptical Chymist. Stuart Cantrill tells me that since I wrote it, it’s mine. So here it is (slightly edited). Recall that a Nobel was awarded for making the green fluorescent protein available to cellular biology.
Chemiotics: Sherlock Holmes and the Green Fluorescent Protein
Gregory (Scotland Yard): “Is there any other point to which you would wish to draw my attention?”
Holmes: “To the curious incident of the dog in the night-time.”
Gregory: “The dog did nothing in the night-time.”
Holmes: “That was the curious incident.”
The chromophore of green fluorescent protein (GFP) is para-hydroxybenzylidene imidazolinone. It is formed by cyclization of a serine (#65) tyrosine (#66) glycine (#67) sequential tripeptide. It is found in the center of a beta barrel formed by the 238 amino acids of GFP.
What is so curious about this?
Simply put, why don’t things like this happen all the time? Perhaps nothing quite this fancy, but on a more plebeian level consider this: of the twenty amino acids, 2 are carboxylic acids, 2 are amides, 1 is an amine, 3 are alcohols and 1 is a thiol. One might expect esters, amides, thioesters and sulfides to be formed deep inside proteins. Why deep inside? On the surface of the protein, water at 55 molar is on hand to hydrolyze them purely by the law of mass action (releasing about 10 kJ/Avogadro’s number per bond in the process). Some water is present in the X-ray crystallographic structure of proteins, but nothing this concentrated.
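The 55 molar figure for water is just its density divided by its molar mass. A minimal Python sketch, assuming a density of about 997 g/L (roughly room temperature) and a molar mass of 18.015 g/mol:

```python
# Approximate values; the density is an assumption for water near 25 °C.
WATER_DENSITY_G_PER_L = 997.0
WATER_MOLAR_MASS_G_PER_MOL = 18.015

# Moles of water in one liter of pure water.
water_molarity = WATER_DENSITY_G_PER_L / WATER_MOLAR_MASS_G_PER_MOL

print(f"{water_molarity:.1f} mol/L")
```

Nothing dissolved in a cell comes anywhere near that concentration, which is why the mass action argument has teeth.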
The presence of 55 M water bathing the protein surface leads to an even more curious incident, namely why proteins exist at all given that amide hydrolysis is exothermic (as well as entropically favorable). Perhaps this is why proteins contain so many alpha helices and beta sheets — as well as functioning as structural elements they may also serve to hide the amides from water by hydrogen bonding them to each other. Along this line, could this be why the hydrophilic side chains of proteins (arginine, lysine, the acids and the amides) are rather bulky? Perhaps they also function to sterically shield the adjacent amides. After all, why should lysine have 4 methylene groups rather than just one or two?
Now the serine-tyrosine-glycine tripeptide should occur by chance once in every 8000 tripeptides (20 × 20 × 20, if all twenty amino acids were equally common). The SwissProt database of proteins contains 144,041,553 amino acids in 399,749 proteins as of 14 October 2008. Does this tripeptide occur roughly 18,000 times in the database (144,041,553/8000), as chance predicts? If it doesn’t, is negative selection preventing it? If the tripeptide does occur this often, have we missed other chromophores? Are there other tripeptides missing from SwissProt? If there are, does this tell us how to build other chromophores? Or does it tell us something important about protein structure?
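For anyone who wants to check the arithmetic behind the expected count, here is a minimal Python sketch. It assumes all twenty amino acids are equally frequent and independent of their neighbors (real proteins are neither), and that every protein in the database is at least three residues long. The variable names are mine.

```python
TOTAL_RESIDUES = 144_041_553   # SwissProt, 14 October 2008, as quoted
TOTAL_PROTEINS = 399_749
ALPHABET = 20                  # standard amino acids
K = 3                          # tripeptide length

# Number of distinct tripeptides under a uniform model.
n_tripeptides = ALPHABET ** K

# Naive estimate: one chance per residue.
naive_expected = TOTAL_RESIDUES / n_tripeptides

# Slightly better: a protein of length n has only n - (K - 1) windows.
windows = TOTAL_RESIDUES - (K - 1) * TOTAL_PROTEINS
window_expected = windows / n_tripeptides

print(round(naive_expected), round(window_expected))
```

Dividing the total residue count by 8000 gives about 18,005; counting only complete three-residue windows gives about 17,905. Since serine, tyrosine and glycine are not each present at exactly 5%, a serious estimate would use the observed frequency of each.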
I don’t have the skills to properly interrogate SwissProt or the Protein Data Bank, but I imagine that some of the readership does. Go to it. These are curious incidents indeed.
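As a starting point for anyone taking up the challenge, here is a minimal Python sketch of the counting step. It assumes the sequences are available as FASTA text (header lines starting with ">", sequence on the following lines); the two records below are made up purely for illustration, and the function names are mine, not part of any database tool.

```python
from typing import Iterable, Iterator, List


def read_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield one sequence string per FASTA record."""
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq with a sliding window (overlaps allowed)."""
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


# Made-up toy records, not real SwissProt entries.
toy_fasta = """>toy1 hypothetical
MSKGEEL
FSYGVQC
>toy2 hypothetical
AAASYGSYG
""".splitlines()

total_syg = sum(count_motif(seq, "SYG") for seq in read_fasta(toy_fasta))
print(total_syg)
```

On a real download one would pass an open file handle to read_fasta instead of the toy lines, and loop over all 8000 tripeptides to see which are over- or under-represented.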
Comments
My face is red. Spoke to the QM prof this PM about something else. He saw no problem with uncharged S in the collagen IV link. It’s just like dimethyl sulfoxide (which had a huge play in the 80s as a cure/treatment for all sorts of diseases). The difference between the – S = N – bond and the 3 bonds in S-adenosyl methionine is that nitrogen can donate electrons to sulfur, but saturated carbon cannot. Ouch!
S: neutral with valence 2, 4 and 6 🙂 Think of SF4 and SF6 and of course, DMSO.
As for your very interesting question about the absence of certain kinds of linkages in proteins, evolution can be a tricky beast. I don’t doubt that those kinds of linkages were probably formed in early proto-biomolecules. But evolution seeks a balance between too much stability and too little. Esters would not work since they would be too unstable.
Amides probably strike the golden mean; kinetically, at neutral pH, they are not so unstable as to cause serious structural instability, but they are unstable enough to be formed and hydrolyzed easily in the normal life cycle of proteins. What is the rate of hydrolysis of a simple unactivated peptide in neutral water? The half-life of the tetrapeptide Phe-Phe-Phe-Gly was found to be 7 years, which sounds stable enough for most purposes (JACS, Vol. 110, 1988, pg. 7529-7534).
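To put the 7 year half-life on a more familiar footing, here is a minimal Python sketch that converts it to a rate constant and to the fraction of peptide surviving after a given time. It assumes simple first-order kinetics, which is my assumption for illustration and not something taken from the paper; the names are mine as well.

```python
import math

HALF_LIFE_YEARS = 7.0                    # Phe-Phe-Phe-Gly, as quoted above
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# First-order rate constant: k = ln(2) / t_half.
k_per_second = math.log(2) / (HALF_LIFE_YEARS * SECONDS_PER_YEAR)


def fraction_remaining(t_years: float) -> float:
    """Fraction of intact peptide after t_years, assuming first-order decay."""
    return 0.5 ** (t_years / HALF_LIFE_YEARS)


print(f"k = {k_per_second:.2e} per second")
print(f"intact after 1 day: {fraction_remaining(1 / 365.25):.6f}")
```

The rate constant comes out at about 3 × 10^-9 per second, so over the hours-to-days lifetime of a typical protein essentially none of the peptide bonds would hydrolyze unaided.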
It’s like the classic question of why nature chose phosphates and not sulfates or arsenates. This was answered in a fascinating and classic Science paper by the late Frank Westheimer. If you haven’t seen this before, drop everything and read it now! It’s one of my favorite chemistry papers.
Other examples of side chains interacting with each other are the homemade cofactors. At least 5 examples are known, all involving tyrosine. In one case (galactose oxidase) the SH of a cysteine elsewhere in the protein forms a thioether with the carbon ortho to the OH of tyrosine. Copper is chelated, and alcohols are oxidized to aldehydes by this. Radicals are involved. Other examples are given in PNAS 98 12863 – 12865 ’01.