Category Archives: Chemistry (relatively pure)

Did these guys just repeal the second law of thermodynamics and solve the global warming problem?

Did these guys just repeal the second law of thermodynamics and solve the global warming problem to boot? [ Science vol. 355 pp. 1023 – 1024, 1062 -1066 ’17 ] Heady stuff. But they can put a sheet of metamaterial over water during the day in Arizona and cool it by 8 degrees Centigrade in two hours!

How did they do it? Time for a little atmospheric physics. There is nothing in the Earth’s atmosphere which absorbs light of wavelength between 8 and 13 microns (this is called the atmospheric window). So anything radiating energy in this range sends it out into space. This is called radiative cooling. It doesn’t work during the day because most materials absorb sunlight in the visible and near infrared range (.7 -2.5 microns) heating them up. Solar power density overwhelms the room temperature radiation spectrum shorter than 4 microns. So for daytime cooling you need a material reflecting all the light shorter than 4 microns, while being fully emissive for longer wavelengths.
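A quick way to see why the 8 to 13 micron window is so convenient is Wien's displacement law (standard blackbody physics, not something taken from the Science papers): the wavelength of peak emission is roughly 2898 micron·kelvin divided by the absolute temperature. The sketch below assumes a room temperature of 300 K and a solar surface temperature of about 5800 K; both are my own round numbers.

```python
# Wien's displacement law: wavelength of peak blackbody emission.
# The temperatures below are assumed round numbers, not values from the paper.
WIEN_B_MICRON_K = 2897.8  # Wien displacement constant in micron * kelvin

ATMOSPHERIC_WINDOW = (8.0, 13.0)  # microns, as quoted in the text


def peak_wavelength_microns(temperature_k: float) -> float:
    """Return the wavelength (in microns) at which a blackbody emits most strongly."""
    return WIEN_B_MICRON_K / temperature_k


def in_window(wavelength: float, window=ATMOSPHERIC_WINDOW) -> bool:
    """True if the wavelength falls inside the atmospheric window."""
    low, high = window
    return low <= wavelength <= high


room_peak = peak_wavelength_microns(300.0)   # assumed room temperature
sun_peak = peak_wavelength_microns(5800.0)   # assumed solar surface temperature

print(f"room temperature peak: {room_peak:.2f} microns, in window: {in_window(room_peak)}")
print(f"sunlight peak: {sun_peak:.2f} microns, in window: {in_window(sun_peak)}")
```

Under those assumptions a room temperature object radiates most strongly right inside the window, while sunlight peaks far below the 4 micron cutoff, which is why one material can reflect the one and emit the other.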

This work describes a metamaterial (https://en.wikipedia.org/wiki/Metamaterial) in which small (average diameter 4 microns) spheres of SiO2 (glass) are randomly dispersed in a polymer matrix transparent to visible and infrared light. The matrix is 50 microns thick. The whole shebang is backed by a very thin (.2 micron) silver mirror. So light easily passes through the film and is then bounced back by the mirror without being absorbed.

Chemists have already studied the Carnot cycle, which gives the maximum efficiency of a heat engine. This is always proportional to the temperature difference between the hot and cold phases of the cycle (divided by the temperature of the hot phase). That’s why the biggest thing about a nuclear power plant is the cooling tower (and almost as important). Well, few things are colder than the cosmic microwave background (2.7 degrees Centigrade above absolute zero).
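For anyone who wants the Carnot limit as a number, here is a minimal sketch of the textbook formula, efficiency = 1 - T_cold / T_hot, with temperatures in kelvin. The 300 K hot side is my own assumption for illustration, and the film is not literally a heat engine, so treat this only as a feel for how cold a sink deep space is.

```python
def carnot_efficiency(t_hot_k: float, t_cold_k: float) -> float:
    """Maximum efficiency of a heat engine running between two temperatures (kelvin)."""
    if t_cold_k <= 0 or t_hot_k <= t_cold_k:
        raise ValueError("need 0 < t_cold_k < t_hot_k")
    return 1.0 - t_cold_k / t_hot_k


# Assumed 300 K object radiating to the 2.7 K cosmic microwave background.
to_space = carnot_efficiency(300.0, 2.7)
# The same object dumping heat into 290 K air (assumed) for comparison.
to_air = carnot_efficiency(300.0, 290.0)

print(f"ideal limit with space as the sink: {to_space:.3f}")
print(f"ideal limit with cool air as the sink: {to_air:.3f}")
```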

So while the entropy of the universe increases as the heat goes somewhere, locally it looks like the second law of thermodynamics is being violated. No work is done (as far as I can tell) yet the objects spontaneously cool.

Perhaps the physics mavens out there can help. I seem to remember Feynman and Wheeler once saying something to the effect that radiation is impossible without something around to absorb it. If I haven’t totally garbled the physics, it almost sounds like emitter and absorber are entangled.

Anyway beaming heat out into space through the atmospheric window sounds like a good way to combat global warming.

No wonder DARPA supported this research.

The humble snow flea teaches us some protein chemistry

Who would have thought that the humble snow flea (that we used to cross country ski over in Montana) would teach us a great deal about protein chemistry, overturning some beloved shibboleths in the process?

The flea contains an antifreeze protein, which stops ice crystals from forming inside the cells of the flea in the cold environment in which it lives. The protein contains 81 amino acids, is 45% glycine and contains six  type II polyProline helices each 8 amino acids long (https://en.wikipedia.org/wiki/Polyproline_helix). None of the 6 polyProline helices contain proline despite the name, but all contain from 2 to 6 glycines. Also to be noted is (1) the absence of a hydrophobic core (2) the absence of alpha helices (3) the absence of beta turns (4) the protein has low sequence complexity.

Nonetheless it quickly folds into a stable structure, meaning that (1), (2), and (3) are not necessary for a stable protein structure. (4) means that low sequence complexity in a protein sequence does not invariably imply an intrinsically disordered protein.

You can read all about it in Proc. Natl. Acad. Sci. vol. 114 pp. 2241 – 2246 ’17.

Time for some humility in what we thought we knew about proteins, protein folding, protein structural stability.

Ring currents ride again

One of the most impressive pieces of evidence (to me at least) that we really understand what electrons are doing in organic molecules are the ring currents. Recall that the pi electrons in benzene are delocalized above and below the planar ring determined by the 6 carbon atoms.

How do we know this? When a magnetic field is applied, the electrons in the ring cloud circulate to oppose the field. So what? Well, if you can place a C – H bond above the ring, the induced current will shield it. Such molecules are known, and the new edition of Clayden (p. 278) shows the NMR spectrum of [ 7 ] paracyclophane, which is benzene with 7 CH2’s linked to the 1 and 4 positions of benzene, so that the hydrogens of the 4th CH2 are directly over the ring (7 CH2’s aren’t long enough for it to be anywhere else). Similarly, [ 18 ] Annulene has 6 hydrogens inside the aromatic ring, and these hydrogens are even more shielded. Interestingly, building larger and larger annulenes has shown that aromaticity decreases with increasing size, vanishing for systems with more than 30 pi electrons (diameter 13 Angstroms), probably because planarity of the carbons becomes less and less possible, breaking up the cloud.

This brings us to Nature vol. 541 pp. 200 – 203 ’17 which describes a remarkable molecule with 6 porphyrins in a ring hooked together by diyne linkers. The diameter of the circle is 24 Angstroms. Benzene and [ 18 ] Annulene have all the carbons in a plane, but the picture of the molecule given in the paper does not. Each of the porphyrins is planar of course, but each plane is tangent to the circle of porphyrins.

Also discussed is the fact that ‘anti-aromatic’ ring currents exist, in which the electrons circulate to enhance rather than diminish the imposed magnetic field. The molecule can be switched between the aromatic and anti-aromatic states by its oxidation level. When it has 78 electrons, ( 19 * 4 ) + 2, in the ring (with a charge of + 6) it is aromatic. When it has 80 electrons with a + 4 charge it is anti-aromatic: further confirmation of the Hückel rule (as if it were needed).
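The electron counting is easy to check. Here is a small sketch of the Hückel bookkeeping only: 4n + 2 pi electrons counts as aromatic, 4n as anti-aromatic. The function name is my own, and the rule of course says nothing about planarity or anything else that real molecules care about.

```python
def huckel_class(pi_electrons: int) -> str:
    """Classify a cyclic conjugated system purely by its pi electron count."""
    if pi_electrons <= 0:
        raise ValueError("electron count must be positive")
    if pi_electrons % 4 == 2:
        return "aromatic"        # 4n + 2
    if pi_electrons % 4 == 0:
        return "anti-aromatic"   # 4n
    return "neither"             # odd counts (radicals) are outside the simple rule


for count in (6, 18, 78, 80):
    n, remainder = divmod(count, 4)
    print(f"{count} electrons = 4 * {n} + {remainder}: {huckel_class(count)}")
```

Benzene (6) and [ 18 ] Annulene are included as sanity checks alongside the 78 and 80 electron states of the porphyrin ring.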

On a historical note reference #27 is to a paper of Marty Gouterman in 1961, who was teaching grad students in chemistry in the spring of 1961. He was an excellent teacher. Here he is at the University of Washington — http://faculty.washington.edu/goutermn/

Memories are made of this ?

Back in the day when information was fed into computers on punch cards, the data was the holes in the paper not the paper itself. A far out (but similar) theory of how memories are stored in the brain just got a lot more support [ Neuron vol. 93 pp. 6 -8, 132 – 146 ’17 ].

The theory says that memories are stored in the proteins and sugar polymers surrounding neurons rather than the neurons themselves. These go by the name of extracellular matrix, and memories are the holes drilled in it which allow synapses to form.

Here’s some stuff I wrote about the idea when I first ran across it two years ago.

——

An article in Science (vol. 343 pp. 670 – 675 ’14) on some fairly obscure neurophysiology throws out at the end (almost as an afterthought) an interesting idea of just how (chemically) and where memories are stored in the brain. I find the idea plausible and extremely surprising.

You won’t find the background material to understand everything that follows in this blog. Hopefully you already know some of it. The subject is simply too vast, but plug away. Here are a few theories from the past half century (seriously flawed, in my opinion) of how and where memory is stored in the brain.

#1 Reverberating circuits. The early computers had memories made of something called delay lines (http://en.wikipedia.org/wiki/Delay_line_memory) where the same impulse would constantly ricochet around a circuit. The idea was used to explain memory as neuron #1 exciting neuron #2 which excited … which excited neuron #n which excited #1 again. Plausible in that the nerve impulse is basically electrical. Very implausible, because you can practically shut the whole brain down using general anesthesia without erasing memory. However, RAM in the computers of the 70s used the localized buildup of charge to store bits and bytes. Since charge would leak away from where it was stored, it had to be refreshed constantly (e.g. at least 12 times a second) or it would be lost. Yet another reason data should always be frequently backed up.

#2 CaMKII: more plausible. There’s lots of it in brain (2% of all proteins in an area of the brain called the hippocampus, an area known to be important in memory). It’s an enzyme which can add phosphate groups to other proteins. For it to start doing so, calcium levels inside the neuron must rise. The enzyme is complicated, being comprised of 12 identical subunits. Interestingly, CaMKII can add phosphates to itself (phosphorylate itself), 2 or 3 for each of the 12 subunits. Once a few phosphates have been added, the enzyme no longer needs calcium to phosphorylate itself, so it becomes essentially a molecular switch existing in two states. One problem is that there are other enzymes which remove the phosphate and reset the switch (actually there must be). Also, proteins are inevitably broken down and new ones made, so it’s hard to see the switch persisting for a lifetime (or even a day).

#3 Synaptic membrane proteins. This is where electrical nerve impulses begin. Synapses contain lots of different proteins in their membranes. They can be chemically modified to make the neuron more or less likely to fire to a given stimulus. Recent work has shown that their number and composition can be changed by experience. The problem is that after a while the synaptic membrane has begun to resemble Grand Central Station — lots of proteins coming and going, but always a number present. It’s hard (for me) to see how memory can be maintained for long periods with such flux continually occurring.

This brings us to the Science paper. We know that about 80% of the neurons in the brain are excitatory, in that when excitatory neuron #1 talks to neuron #2, neuron #2 is more likely to fire an impulse. The remaining 20% are inhibitory. Obviously both are important. While there are lots of other neurotransmitters and neuromodulators in the brain (with probably even more we don’t know about; who would have put carbon monoxide on the list 20 years ago?), the major inhibitory neurotransmitter of our brains is something called GABA. At least in adult brains this is true, but in the developing brain it’s excitatory.

So the authors of the paper worked on why this should be. GABA opens channels in the brain to the chloride ion. When it flows into a neuron, the neuron is less likely to fire (in the adult). This work shows that this effect depends on the negative ions (proteins mostly) inside the cell and outside the cell (the extracellular matrix). It’s the balance of the two sets of ions on either side of the largely impermeable neuronal membrane that determines whether GABA is excitatory or inhibitory (chloride flows in either event), and just how excitatory or inhibitory it is. The response is graded.

For the chemists: the negative ions outside the neurons are sulfated proteoglycans. These are much more stable than the proteins inside the neuron or on its membranes. Even better, it has been shown that the concentration of chloride varies locally throughout the neuron. The big negative ions (e.g. proteins) inside the neuron move about but slowly, and their concentration varies from point to point.

Here’s what the authors say (in passing) “the variance in extracellular sulfated proteoglycans composes a potential locus of analog information storage” — translation — that’s where memories might be hiding. Fascinating stuff. A lot of work needs to be done on how fast the extracellular matrix in the brain turns over, and what are the local variations in the concentration of its components, and whether sulfate is added or removed from them and if so by what and how quickly.

—-

So how does the new work support this idea? It involves a structure that I’ve never talked about: the lysosome (for more info see https://en.wikipedia.org/wiki/Lysosome). It’s basically a bag of at least 40 digestive and synthetic enzymes inside the cell, which chop up anything brought to it (e.g. bacteria). Mutations in the enzymes cause all sorts of (fortunately rare) neurologic diseases: mucopolysaccharidoses, lipid storage diseases (Gaucher’s, Farber’s); the list goes on and on.

So I’ve always thought of the structure as a Pandora’s box best kept closed. I always thought of lysosomes as confined to the cell body, but they’re also found in dendrites according to this paper. Even more interesting, a rather unphysiologic treatment of neurons in culture (depolarization by high potassium) causes the lysosomes to migrate to the neuronal membrane and release their contents outside. One enzyme released is cathepsin B, a proteolytic enzyme which chops up the TIMP1 outside the cell. So what? TIMP1 is an endogenous inhibitor of Matrix MetalloProteinases (MMPs) which break down the extracellular matrix. So what?

Are neurons ever depolarized by natural events? Just by synaptic transmission, action potentials and spontaneously. So here we have a way that neuronal activity can cause holes in the extracellular matrix, the holes in the punch cards if you will.

Speculation? Of course. But that’s the fun of reading this stuff. As Mark Twain said, “There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.”

Tidings of great joy

One of the hardest things I had to do as a doc was watch an infant girl waste away and die of infantile spinal muscular atrophy (Werdnig Hoffmann disease) over the course of a year. Something I never thought would happen (a useful treatment) may be at hand. The actual papers are not available yet, but two placebo controlled trials with a significant number of patients (84, 121) in each were stopped early because trial monitors (not in any way involved with the patients) found the treated group was doing much, much better than the placebo. A news report of the trials is available [ Science vol. 354 pp. 1359 – 1360 ’16 (16 December) ].

The drug, a modified RNA molecule, (details not given) binds to another RNA which codes for the missing protein. In what follows a heavy dose of molecular biology will be administered to the reader. Hang in there, this is incredibly rational therapy based on serious molecular biological knowledge. Although daunting, other therapies of this sort for other neurologic diseases (Huntington’s Chorea, FrontoTemporal Dementia) are currently under study.

If you want to start at ground zero, I’ve written a series https://luysii.wordpress.com/category/molecular-biology-survival-guide/ which should tell you enough to get started. Start here — https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/
and follow the links to the next two.

Here we go, if you don’t want to plow through all three.

Our genes occur in pieces. Dystrophin is the protein mutated in the commonest form of muscular dystrophy. The gene for it is 2,220,233 nucleotides long but the dystrophin contains ‘only’ 3685 amino acids, not the 740,000+ amino acids the gene could specify. What happens? The whole gene is transcribed into an RNA of this enormous length, then 78 distinct segments of RNA (called introns) are removed by a gigantic multimegadalton machine called the spliceosome, and the 79 segments actually coding for amino acids (these are the exons) are linked together and the RNA sent on its way.
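The bookkeeping here is worth a few lines of arithmetic. This sketch uses only the numbers quoted above and assumes three nucleotides per amino acid, ignoring the stop codon and the untranslated ends of the mRNA.

```python
GENE_LENGTH_NT = 2_220_233     # dystrophin gene length quoted in the text
PROTEIN_LENGTH_AA = 3685       # dystrophin protein length quoted in the text
NT_PER_CODON = 3
EXONS = 79                     # exon count quoted in the text

# Amino acids the gene could specify if every nucleotide were coding.
max_amino_acids = GENE_LENGTH_NT // NT_PER_CODON

# Nucleotides actually needed to code the protein (stop codon ignored).
coding_nt = PROTEIN_LENGTH_AA * NT_PER_CODON
coding_fraction = coding_nt / GENE_LENGTH_NT

# n exons in a row are separated by n - 1 introns.
introns = EXONS - 1

print(f"amino acids if fully coding: {max_amino_acids:,}")
print(f"coding nucleotides: {coding_nt:,} ({coding_fraction:.2%} of the gene)")
print(f"introns between {EXONS} exons: {introns}")
```

So only about half a percent of the gene ends up specifying amino acids; everything else has to be transcribed and then cut out correctly.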

All this was unknown in the 70s and early 80s when I was running a muscular dystrophy clinic and taking care of these kids. Looking back, it’s miraculous that more of us don’t have muscular dystrophy; there is so much that can go wrong with a gene this size, let alone transcribing and correctly splicing it to produce a functional protein.

One final complication — alternate splicing. The spliceosome removes introns and splices the exons together. But sometimes exons are skipped or one of several exons is used at a particular point in a protein. So one gene can make more than one protein. The record holder is something called the Dscam gene in the fruitfly which can make over 38,000 different proteins by alternate splicing.

There is nothing worse than watching an infant waste away and die. That’s what Werdnig Hoffmann disease is like, and I saw one or two cases during my years at the clinic. It is also called infantile spinal muscular atrophy. We all have two genes for the same crucial protein (called unimaginatively SMN). Kids who have the disease have mutations in one of the two genes (called SMN1). Why isn’t the other gene protective? It codes for the same sequence of amino acids (but using different synonymous codons). What goes wrong?

[ Proc. Natl. Acad. Sci. vol. 97 pp. 9618 – 9623 ’00 ] Why is SMN2 (the centromeric copy, i.e. the copy closest to the middle of the chromosome, which is normal in most patients) not protective? It has a single translationally silent nucleotide difference from SMN1 in exon 7 (i.e. the difference doesn’t change the amino acid coded for). This disrupts an exonic splicing enhancer and causes exon 7 skipping, leading to abundant production of a shorter isoform (SMN2delta7). Thus even though both genes code for the same protein, only SMN1 actually makes the full protein.

More background. The molecular machine which removes the introns is called the spliceosome. It’s huge, containing 5 RNAs (called small nuclear RNAs, aka snRNAs), along with 50 or so proteins with a total molecular mass again of around 2,500,000 Daltons. Think about it chemists. Design 50 proteins and 5 RNAs with probably 200,000+ atoms so they all come together forming a machine to operate on other monster molecules, such as the mRNA for Dystrophin alluded to earlier. Hard for me to believe this arose by chance, but current opinion has it that way.

Splicing out introns is a tricky process which is still being worked on. Mistakes are easy to make, and different tissues will splice the same pre-mRNA in different ways. All this happens in the nucleus before the mRNA is shipped outside where the ribosome can get at it.

The papers [ Science vol. 345 pp. 624 – 625, 688 – 693 ’14 ] describe a small molecule which acts on the spliceosome to increase the inclusion of SMN2 exon 7. It does appear to work in patient cells and mouse models of the disease, even reversing weakness.

I was extremely skeptical when I read the papers two years ago. Why? Because just about every protein we make is spliced (except histones), and any molecule altering the splicing machinery seems almost certain to produce effects on many genes, not just SMN2. If it really works, these guys should get a Nobel.

Well, I shouldn’t have been so skeptical. I can’t say much more about the chemistry of the drug (nusinersen) until the papers come out.

Fortunately, the couple (a cop and a nurse) took the 25% risk of another child with the same thing and produced a healthy infant a few years later.

A new way to study protein dynamics

“Fields of 1,000,000 Volts/centiMeter are dangerously large from a laboratory point of view”: true enough, but that’s only about TEN times the potential difference/distance ratio found across the plasma membrane of all our cells. Here’s why, after a bit of background.

We wouldn’t exist without the membranes enclosing our cells which are largely hydrocarbon. Chemists know that fatty acids have one end (the carboxyl group) which dissolves in water while the rest is pure hydrocarbon. The classic is stearic acid: 18 carbons in a straight chain with a carboxyl group at one end. 3 molecules of stearic acid are esterified to glycerol in beef tallow (forming a triglyceride). The pioneers hydrolyzed it to make soap. Saturated fatty acids of 18 carbons or more are solid at body temperature (soap certainly is), but cellular membranes are fairly fluid, and proteins embedded in them move around pretty quickly. Why? Because most fatty acids found in biologic membranes over 16 carbons have double bonds in them. Guess whether they are cis or trans. Hint: the isomer used packs less well into crystals. You’ve got it: all the double bonds found in oleic (18 carbons, 1 double bond) and arachidonic (20 carbons, 4 double bonds) acids are cis, and this keeps membranes fluid. The cis double bond essentially puts a 60 degree kink in the hydrocarbon chain, making it much more difficult to pack in a liquid crystal type structure with all the hydrocarbon chains stretched out. Then there’s cholesterol which makes up 1/5 or so of membranes by weight. It also breaks up the tendency of fatty acid hydrocarbon chains to align with each other because it doesn’t pack with them very well. So cholesterol is another fluidizer of membranes.

How thick is the cellular membrane? If you figure the hydrocarbon chains of a saturated fatty acid stretched out as far as they can go, you get 1.54 Angstroms * cosine (30 degrees) = 1.33 Angstroms/carbon, times 16 = 21 Angstroms. Now double that because cellular membranes are lipid bilayers, meaning that they are made of two layers of hydrocarbons facing each other, with the hydrophilic ends (carboxyls, phosphate groups) pointing outward. So we’re up to 42 Angstroms of thickness for the hydrocarbon part of the membrane. Add another 10 Angstroms or so on each side for the hydrophilic ends (which include things like serine, choline etc. etc.) and you’re up to about 60 Angstroms thickness for the membrane (which is usually cited as 70 Angstroms; I don’t know why).

Because the electric field across our membranes is huge. The potential difference across our cell membranes is 70 milliVolts (70 x 10^-3 volts). 70 Angstroms is 7 nanoMeters (7 x 10^-9 meters). Divide 70 x 10^-3 volts by 7 x 10^-9 meters and you get a field of 10,000,000 Volts/Meter, which is 100,000 Volts/centiMeter.
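Since it is easy to drop a factor of 100 between meters and centimeters, here is the membrane arithmetic as a short sketch. It uses only the figures in the text (1.54 Angstrom C – C bonds at a 30 degree zigzag, 16 carbons per chain, 70 milliVolts across 70 Angstroms); the variable names are mine.

```python
import math

# Geometry of a fully extended saturated chain, using the text's figures.
CC_BOND_ANGSTROM = 1.54
ZIGZAG_ANGLE_DEG = 30.0
CARBONS_PER_CHAIN = 16

per_carbon = CC_BOND_ANGSTROM * math.cos(math.radians(ZIGZAG_ANGLE_DEG))
chain_length = per_carbon * CARBONS_PER_CHAIN
hydrocarbon_core = 2 * chain_length  # two leaflets, tail to tail

# Field across the membrane, taking the usually cited 70 Angstrom thickness.
POTENTIAL_VOLTS = 70e-3
THICKNESS_METERS = 70e-10  # 70 Angstroms = 7 nanometers

field_v_per_m = POTENTIAL_VOLTS / THICKNESS_METERS
field_v_per_cm = field_v_per_m / 100.0  # 100 centimeters per meter

print(f"advance per carbon: {per_carbon:.2f} Angstroms")
print(f"hydrocarbon core of the bilayer: {hydrocarbon_core:.1f} Angstroms")
print(f"field: {field_v_per_m:.2e} V/m = {field_v_per_cm:.2e} V/cm")
```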

So our membrane proteins live and function quite nicely in this intense electric field. Which brings us to [ Nature vol. 540 pp. 400 – 405 ’16 ] which zaps protein crystals with electric fields within an order of magnitude of this intensity, and then does Xray crystallography at various intervals to watch how the protein backbone and side chains move. The technique is called Electric Field stimulated Xray crystallography (EF-X). Unlike in solution, where proteins are all in slightly different conformations, in a crystal the starting line is the same for every molecule, as is the finish line.

The electric pulse durations range from 50 – 500 nanoSeconds (50 – 500 * 10^-9 seconds). The xray pulse for doing Xray crystallography lasts all of 100 picoSeconds (100 * 10^-12 seconds). By timing the delay between the electric pulse and the Xray pulse you watch the protein move in time in response to the electric pulse. Hardly physiologic, but it seems likely that protein motions will follow the path of least resistance, which should tell us which conformations are closest in energy to the energy minimum found in proteins. The pulses are collected 50, 100, 200 nanoSeconds after pulse onset. The crystals tolerated ‘hundreds’ of 100 – 500 nanoSecond megaVolt electric field pulses. But even 50 nanoSeconds is pretty long when protein dynamics is concerned, as bond vibrations are as fast as a few femtoSeconds (10^-15 seconds). An electric field of this strength exerts a force of 10^

The technology enabling this is fantastic, but it is quite similar in concept to what the late Nobelist Ahmed Zewail was doing. Of course his work was even faster, looking at chemical reactions at the femtoSecond level of time (10^-15 seconds). So as the year draws to a close, it’s nice to see his ideas live on, even if he didn’t.

Scramblase

Pickings have been slim lately, but here’s a great paper and a puzzle for you chemists out there. Most chemists (and biologists) know what a lipid bilayer is. It’s basically a soap bubble, with water loving (hydrophilic) groups on the outside of both sides of the bilayer, and hydrocarbon chains within. If the hydrocarbon chains are all stretched out, the distance between carbons 1 and 3 is 2.66 Angstroms, so if you have an 18 carbon fatty acid (stearic acid) it should be 8 * 2.66 + 1.33 Angstroms long (22.6 Angstroms). Double this for the bilayer and you have a thickness of 45 Angstroms. It’s probably less because carbon chains aren’t extended, partially because of entropy and largely because of cholesterol which breaks up any chance of such order (which may be an important function for it). Sitting on either side of the lipid bilayer are phosphates esterified to one of the 3 hydroxyls of glycerol, with fatty acids of at least 16 – 18 carbons esterified to the other two. Hanging off the phosphates are a variety of things, but mostly serine and choline, forming phosphatidyl serine (PS) and phosphatidyl choline (PC). Here’s a picture: https://en.wikipedia.org/wiki/Lipid_bilayer.

Scramblases are enzymes which move phospholipids from one side of the lipid bilayer to the other, essentially randomizing their composition. They undo the action of other enzymes (called flippases believe it or not) which make the lipid composition of the two leaflets of the lipid bilayer rather different. This isn’t trivial, and is behind an elegant mechanism to show scavenger cells that a cell is dead. Flippases work to put phosphatidyl serine (PS) on the side of the lipid bilayer (the leaflet) facing the cytoplasm. This, of course, takes energy, and when a cell lacks energy, entropy takes its course and PS appears on the outer leaflet, telling scavenger cells (phagocytes) to eat (phagocytose) the cell.

So how does an enzyme drag phosphatidyl choline (PC) https://en.wikipedia.org/wiki/Phosphatidylcholine or phosphatidyl serine (PS) across the lipid bilayer, scrambling the compositional asymmetry? Can you figure out a mechanism for a membrane protein to do this without looking at Proc. Natl. Acad. Sci. vol. 113 pp. 14049 – 14054 ’16? Chemists think they’re smart, and if you can design a protein to do this you’re smarter than I am, because I’ve wondered (ineffectually) for a long time how this is done.

The authors describe the structure of a fungal scramblase. It functions as a dimer with each subunit containing a hydrophilic groove containing polar and charged amino acid side chains facing the dimer interface. The protein itself does something unusual — it twists the sheet of the membrane, and decreases the thickness of the membrane from 29 to 18 Angstroms (remember the maximum possible thickness of the lipid bilayer was 45 Angstroms, but isn’t that thick for the reasons given above).

Phosphatidyl choline is a zwitterion (i.e. it contains both negative and positive charges although it is electrically neutral overall). The charges are separated in space, forming a dipole. On the cytoplasmic side of the bilayer the scramblase has some amino acid side chains which also form a dipole, right near the channel formed by the two hydrophilic grooves of the dimer. So it attracts the head group of PC (phosphate plus choline) as one dipole attracts another; the head group is then further attracted to the hydrophilic groove and enters it, while its hydrocarbon tail remains in the lipid part of the membrane. Then another PC joins the fun, pushing PC #1 farther into the groove, so that a chain of PCs fills the groove, wagging their lipid tails behind them (a la Little Bo Peep).

Clever no?

All is not perfect as the model doesn’t explain how phosphatidyl serine (which isn’t a zwitterion) moves across, but it’s an incredible start.

A scary paper: Cancer by proxy

Can a good kid growing up in a bad neighborhood turn bad? Most think so. What about a genetically normal cell growing up in a bad neighborhood? Can it turn cancerous if its neighbors have a mutation? A recent paper [ Nature vol. 539 pp. 304 – 308 ’16 ] demonstrates how this can happen.

A gene called PTPN11 is mutated in myelomonocytic leukemia (MML) in humans and mice. Expressing the mutant in blood cells causes leukemia in mice (nothing spectacular there).

However, expressing the mutant in marrow supporting cells (not blood cells or blood stem cells) for long enough gives MML in mice, which can be transplanted into normal mice, producing MML there.

Note that the blood stem cells don’t contain the mutant gene. One theory has it that mutant PTPN11 recruits monocytes, which then produce other stuff (CCL3 also known as MIP1alpha and interleukin1Beta), which then turns on blood stem cells to proliferate madly causing leukemia. Giving a CCL3 receptor antagonist reverses the myeloproliferation (but it isn’t clear to me if it reverses the leukemia once established)

As far as we know the cells developing into MML don’t contain mutant PTPN11. So it’s cancer by proxy. Obviously some changes (mutations, epigenetic changes) have occurred in the leukemic cells, but at this point we don’t know what they are.

What is ICP27 trying to tell us? One of you could get a PhD if you figure it out !

It wouldn’t be the first time a viral protein led us to an important cellular mechanism. Consider what the polio virus taught us about the translation of mRNA into protein. It cleaves two components of eIF-4F (eukaryotic Initiation (of ribosome translation of mRNA into protein) Factor 4F), totally shutting down translation of mRNAs with a cap on their 5′ end (which is most of them). Poliovirus mRNAs don’t have these caps so their proteins continue to be made.

Well this brings us to ICP27 (Infected Cell Protein 27) a product of the Herpes Simplex virus. You can read all about it in [ Proc. Natl. Acad. Sci. vol. 113 pp. 12256 – 12261 ’16 ]. ICP27 is essential for herpes virus infection. This work shows that it inhibits intron splicing (but in under 1% of cellular genes) and also promotes the use of alternative 5′ splice sites.

It also induces the expression of pre-mRNAs prematurely cleaved and polyAdenylated from cryptic polyAdenylation signals located in intron 1 or intron 2 of an amazing 1% of all cellular genes. These prematurely cleaved and polyAdenylated mRNAs sometimes contain novel open reading frames (ORFs). They are typically intronless (they should be) and under 2 kiloBases long. They are expressed early during viral infection and efficiently exported to cytoplasm. The ICP27 targeted genes are GC rich (as are all Herpes simplex genes), and contain cytosine rich sequences near the 5′ splice site.

The paper also showed that optimization of splice site sequences, or mutation of nearby cytosines eliminated ICP27 mediated splicing inhibition. Introduction of cytosine rich sequences to an ICP27 INsensitive splicing reporter conferred susceptibility to ICP27.

How is this going to help you get a PhD? Ask yourself: what are cryptic polyAdenylation signals doing in the first two introns in so many genes? It seems obvious (to me) that as well as the virus the cell is using them for some purpose. It isn’t hard to mutate something to the signal for polyadenylation AAUAAA. Interestingly cleavage doesn’t occur here, but 30 nucleotides or so downstream. The sequence occurs on average once every 4^6 = 4096 nucleotides (if they’re random). I’m not sure what the total length of introns #1 and #2 is for our 20,000 or so protein coding genes, but someone should be able to find out and see if 200 occurrences of this sequence is more than would be expected by chance.
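The chance calculation is simple enough to sketch. The code below assumes what the paragraph assumes, namely random sequence with all four bases equally likely, and it needs an intron length that I don’t have, so the 10,000 nucleotide figure is purely a hypothetical placeholder to show how the numbers would be plugged in.

```python
import math


def expected_hits(sequence_length: int, motif_length: int = 6, alphabet: int = 4) -> float:
    """Expected number of occurrences of one fixed motif in random sequence.

    Assumes independent, equally likely bases (the 'if they're random' case).
    """
    positions = max(sequence_length - motif_length + 1, 0)
    return positions / alphabet ** motif_length


def prob_at_least_one(expected: float) -> float:
    """Poisson approximation to the chance of seeing the motif at least once."""
    return 1.0 - math.exp(-expected)


# HYPOTHETICAL combined length of introns 1 and 2 for a single gene.
# This is a placeholder, not a measured value.
HYPOTHETICAL_INTRON_NT = 10_000

per_gene = expected_hits(HYPOTHETICAL_INTRON_NT)
print(f"expected AAUAAA hits in {HYPOTHETICAL_INTRON_NT:,} nt: {per_gene:.2f}")
print(f"chance of at least one hit: {prob_at_least_one(per_gene):.2f}")
```

With real intron lengths in place of the placeholder, the interesting question becomes how many of these chance hexamers actually function as polyadenylation sites, since real introns are neither random nor equal in base composition.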

The plot thickens when the paper notes that “Over 200 genes are affected by ICP27. Over 30 (including PML, STING, TRAF6, PPP6C, MAP3K7, FBXw11, IFNAR2, NFKB1, RELA and CREBP) are related to the immune pathway.” Do you think the cell doesn’t use this pathway as well?

What about the existence of other viral (and cellular) proteins doing the same sort of thing (but on different introns perhaps)? What are those novel open reading frames in the alternatively spliced mRNAs doing?

Fascinating stuff. Time to get busy if you’re an enterprising grad student, or young faculty member.

The proteasome branches out

The surface of a protein is not at all like a ball of yarn, even though they are both one long string. This has profound implications for the immune system. Look at any solved protein structure. The backbone bobs and weaves taking water hating (hydrophobic) amino acids into the center of the protein, and putting water loving (hydrophilic) amino acids on the surface. So even though the peptide backbone is continuous, only discontinuous patches of it are displayed on the protein surface.

Which is a big problem for the immune system which wants to recognize the surface of the protein (which is all it first gets to see with an invading bug). Now we know that foreign proteins are ingested by the cell, chopped up by the proteasome, and fragments loaded on to immune molecules (class I Major Histocompatibility Complex antigens) and displayed on the cell surface so the immune system can learn what it looks like and react to it. The peptides aren’t very long — under 11 or so amino acids, but they are continuous.

What if the really distinct part of the protein surface (i.e. the immunogen) is made of two distinct patches from the backbone? A fascinating paper shows how the immune system might still recognize it. Chop the protein up into fragments by the proteasome, and then have the fragments from adjacent patches put back together. You know that any enzyme can be run in reverse, so if the proteasome can split peptide bonds apart it can also join them together.

This is exactly what was found in a recent paper — Science vol. 354 pp. 354 – 358 ’16. The small peptides (containing at most 11 amino acids) finding their way to the cell surface were analyzed in a technical tour de force. In aggregate they go by the fancy name of immunopeptidome. They found that the proteasome IS actually splicing peptide fragments together. This is called Proteasome Catalyzed Peptide Splicing (PCPS). The present work shows that it accounts for 1/3 of the class I immunopeptidome in terms of diversity and 1/4 in terms of abundance. One-third of self antigens are represented on the cell surface of the immune cell line they studied (GR-LCL the GR-lymphoblastoid cell line) ONLY by spliced peptides. The ordering of the spliced peptide was the same as the parent protein in only half. There was no preference for the length of the protein skipped by the splice.

The work has huge implications for immunology, not least autoimmune disease.

So today I wrote the author the following

Dr. Mishto

Terrific paper ! Do you have any evidence for the spliced peptides being spatially contiguous on the surface of the parent protein. Have you looked?

This makes a lot of sense, because the immune system should ‘want’ to recognize protein conformations as they exist in the living cell, rather than stretches of amino acid sequence in the parent protein. Also, with few exceptions the surface of a given protein in vivo is a collection of discontinuous peptide sequences of the parent protein. I’ve always wondered how the immune system did this, and perhaps your paper explains things.

Luysii

and got this back almost immediately

Dear Luysii

Interesting idea. We shall have a look for few examples where the crystallography structure or the parental protein is disclosed already.

regards

Michele

It doesn’t get any better than this. Tomorrow I will be exactly 78 years and 6 months old. It shows I can still think (on occasion).

Addendum 17 Nov ’16: It looks as though proteins are fed into the central cavity of the proteasome as a completely denatured single strand. See figure 5 of PNAS 113 pp 12991 – 12996 ’16. The channel to get in appears quite narrow.