One reason our brain is 3 times that of a chimpanzee

Just based on the capacity of the skull, our brain is 3 – 4 times larger than that of our closest primate relative, the chimp. Most of the increase in size occurs in the cerebral cortex (the gray matter) just under the skull. Our cortex is thrown into folds because there is so much of it. Compare the picture of the mouse brain (smooth) with ours, wrinkled like a walnut.

We now may have part of the explanation. A fascinating paper studied genetic differences between the progenitor cells from which the cortex arises (radial glia) in man and mouse. They found 56 protein coding genes expressed in our radial glia not present in the mouse (out of 20,000 or so).

One in particular, saddled with the awful name ARHGAP11B, is especially fascinating. Why? Because it’s the product of a duplication of the gene ARHGAP11A. When did this happen? After the human line split off from the chimp 6 million years ago. Chimps have no such duplication, just the original.

Put ARHGAP11B into a developing mouse and its cortex expands so much it forms folds.

There has been all sorts of work on the genetic differences between man and chimp. There are almost too many of them, some 20,000,000 [ Nature vol. 486 pp. 481 – 482 ’12 ]. Finding the relevant ones is the problem. ARHGAP11B is by far the best we’ve found to date.

Another fascinating story is the ‘language gene’ discovered in a family suffering from a speech and language disorder. It’s called FOXP2. Since the last common ancestor of humans and mice (70 megaYears ago) there have been only 3 changes in the 715 amino acids comprising the protein. 2 of them have occurred in the human lineage since it split with the chimps 6 megaYears ago. So far no one has put the human FOXP2 gene into a chimp and got it to talk. For more details see

There is all sorts of fascinating molecular biology about what these two genes actually do in the cell, but that would make this post too long. This is, in part, a chemistry blog, and just what FOXP2 and ARHGAP11A actually do involves some beautiful and elegant chemistry: look up RhoGAP and Winged Helix transcription factors. Ferraris are beautiful cars, and become even more beautiful when you understand what’s going on under the hood. Chemistry gives you that for molecular, cellular and organismal biology.

Of what use is an inactive enzyme?

Why should a cell take the trouble to make an enzyme protein with no enzymatic activity? It takes metabolic energy to store the information for a protein in DNA, transcribe the DNA into RNA and then translate the RNA into protein. Is this junk protein a la junk DNA? Not at all, and therein lies a tale.

All sorts of nasty bugs inveigle their way into cells, among them viruses (such as influenza) whose genome is made of RNA rather than DNA. Not only that, but in many viruses the genome is not single stranded (like mRNA) but double stranded, with two RNA strands base paired to each other (just like DNA, except for an extra oxygen on the ribose sugars in the backbone).

Nucleated cells don’t contain much double stranded RNA (dsRNA) outside the nucleus, so it almost always means trouble. An extremely elegant mechanism exists to find and respond to such RNA. Recall that double helix molecules can reach enormous lengths. The 3.2 billion base pairs of our genome, if stretched out, would be more than a yard.
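A quick back-of-the-envelope check of that yard, as a minimal sketch. It assumes the textbook rise of roughly 0.34 nanometers per base pair for B-form DNA, a figure not given above.

```python
# Back-of-the-envelope contour length of the genome figure quoted in the text.
BASE_PAIRS = 3.2e9        # base pairs, as quoted above
RISE_PER_BP_NM = 0.34     # ASSUMPTION: B-form DNA rise per base pair, in nm
METERS_PER_YARD = 0.9144  # exact definition of the yard

length_m = BASE_PAIRS * RISE_PER_BP_NM * 1e-9  # convert nm to m
length_yd = length_m / METERS_PER_YARD

print(f"{length_m:.2f} m, or {length_yd:.2f} yards")
```

Under that assumption the stretched-out double helix comes to a bit over one meter, so "more than a yard" holds with a little room to spare.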

Well, we have at least 4 genes whose protein products bind dsRNA and then signal trouble. They all make a molecule called 2′ – 5′ oligoadenylic acid (2-5A) from ATP, so they are called OligoAdenylate Synthetases (OASs). The 2-5A, once made, wanders about the cell until it finds another enzyme called RNAase L. 2-5A binds to RNAase L, causing it to dimerize and become active. RNAase L then destroys all the RNA in the cell, killing the cell along with the invading virus. Pretty harsh, but it’s one way to stop the virus from spreading and killing more cells.

A recent paper concerns OAS3, which has 3 catalytic modules rather than just one like most enzymes. Even worse, 2 of the 3 catalytic modules can’t make 2-5A (but they still can bind dsRNA). OAS3 is a large protein (over 1,000 amino acids), so it has some length to it. The 3 catalytic modules are spread out along OAS3 with the active catalytic module at one end and one of the inactive modules at the other.

The modules at both ends bind dsRNA, but only the active module makes 2-5A when it does. Interestingly, the inactive module binds dsRNA much more strongly than the active one.

OK, you’ve got the picture — what possible use is this rather Byzantine set up?

See if you can figure it out.

It’s incredibly clever and elegant, and shows the danger of regarding anything within the cell as functionless (or junk). Teleology rides supreme in molecular and cellular biology.

Give up?

OAS3 essentially acts as a molecular ruler, making 2-5A only when long dsRNA (e.g. over 50 nucleotides long) binds to it. The inactive module gloms onto longish dsRNA, holding it tightly until Brownian motion brings it to the other end of OAS3, activating the catalytic module to make 2-5A. This is good, as the cell normally contains all sorts of shorter RNA duplexes (the binding of microRNAs to the 3′ end of mRNAs comes to mind, but these are much shorter, 22 nucleotides at most).

No wonder we get sick

“It is estimated that a human cell repairs 10,000 – 20,000 DNA lesions per day” This is the opening sentence of Proc. Natl. Acad. Sci. vol. 112 pp. 3997 – 4002 ’15, but no source for this estimate is given. The lesions range from single and double strand breaks in the sugar phosphate backbone of the DNA helix, to hydrolytic losses of a DNA base from the backbone, to chemical modification of the DNA bases themselves — oxidation etc. etc.

What needs explaining then, is why we stay as well as we do.

Why we imperfectly understand randomness the way we do.

The cognoscenti think the average individual is pretty dumb when it comes to probability and randomness. Not so, says a fascinating recent paper [ Proc. Natl. Acad. Sci. vol. 112 pp. 3788 – 3792 ’15 ]. The average Joe (this may mean you), when asked to write down a random series of fifty or so heads and tails, never puts in enough runs of heads or runs of tails. This leads to the gambler’s fallacy: that if an honest coin gives a run of, say, 5 heads, the next result is more likely to be tails.

There is a surprising amount of structure lurking within purely random sequences such as the toss of a fair coin where the probability of heads is exactly 50%. Even with a series with 50% heads, the waiting time for two heads (HH) or two tails (TT) to appear is significantly longer than for an alternation (HT or TH). On average 6 tosses will be required for HH or TT to appear while only an average of 4 are needed for HT or TH.
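Those two averages are easy to check. The sketch below is not the paper's code. It uses the standard overlap rule for a fair coin (add 2^k for every k at which the first k symbols of the pattern match its last k) and a small simulation as a sanity check. The function names are made up for illustration.

```python
import random


def expected_wait(pattern: str) -> int:
    """Exact mean number of fair-coin tosses until `pattern` first appears.

    Overlap rule: add 2**k for every k where the first k symbols of the
    pattern equal its last k symbols.
    """
    n = len(pattern)
    return sum(2 ** k for k in range(1, n + 1) if pattern[:k] == pattern[-k:])


def simulated_wait(pattern: str, trials: int = 20_000, seed: int = 0) -> float:
    """Average waiting time over many simulated runs (sanity check only)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        window = ""   # the most recent len(pattern) tosses
        tosses = 0
        while window != pattern:
            window = (window + rng.choice("HT"))[-len(pattern):]
            tosses += 1
        total += tosses
    return total / trials


for p in ("HH", "TT", "HT", "TH"):
    print(p, expected_wait(p), round(simulated_wait(p), 2))
```

The repetitions overlap with themselves (the H that ends one attempt at HH can start the next), and that self-overlap is exactly what the rule charges extra for. The alternations have no such overlap.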

This is why Joe SixPack never puts in enough runs of Hs or Ts.

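Why should the wait be longer for HH or TT when you get an H or a T 50% of the time? The mean time between successive occurrences is the same for HH and TT as for HT and TH. The variance is different, because the occurrences of HH and TT are bunched in time while those of HT and TH are spread evenly. Bunching means long droughts between the clusters, so starting from scratch you wait longer on average for the first HH.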
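To see the bunching for yourself, here is a minimal simulation (a sketch, not the paper's model). It tosses a fair coin many times, records where HH and HT occur, and compares the gaps between successive occurrences. The helper name and the seed are arbitrary choices.

```python
import random
from statistics import mean, pvariance


def gaps_between(pattern: str, tosses: str) -> list[int]:
    """Distances between successive (possibly overlapping) occurrences."""
    starts = [i for i in range(len(tosses) - len(pattern) + 1)
              if tosses.startswith(pattern, i)]
    return [b - a for a, b in zip(starts, starts[1:])]


rng = random.Random(1)
tosses = "".join(rng.choice("HT") for _ in range(200_000))

for pattern in ("HH", "HT"):
    gaps = gaps_between(pattern, tosses)
    # Print the mean gap and the variance of the gaps for each pattern.
    print(pattern, round(mean(gaps), 2), round(pvariance(gaps), 1))
```

The mean gap should come out close to 4 for both patterns, while the variance of the gaps is several times larger for HH than for HT: lots of gaps of 1 (inside a run of heads) balanced by occasional long droughts.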

It gets worse for longer repetitions, because they can build on each other. HHH contains two instances of HH, while alternations do not. Repetitions bunch together as noted earlier. We are very good at perceiving waiting times, and this is probably why we think repetitions are less likely and soon to break up.

The paper goes a lot farther constructing a neural model, based on the way our brains integrate information over time when processing sequences of events. It takes into consideration our perceptions of mean time AND waiting times. We average the two. This produces the best fitting bias gain parameter for an existing Bayesian model of randomness.

See, you’re not as dumb as they thought you were.

Another reason for our behavior comes from neuropsychology and physiological psychology. We have ways to watch the electrical activity of your brain and find out when you perceive something as different. It’s called mismatch negativity (see for more detail). It is a brain potential (called P300) peaking 0.1 – 0.25 seconds after a deviant tone or syllable.

Play 5 middle c’s in a row followed by a d, then c’s again. The potential doesn’t occur after any of the c’s, just after the d. This has been applied to the study of perception in infants long before they can speak.

It has shown us that Asian and Western newborn infants both hear ‘r’ and ‘l’ quite well (showing mismatch negativity to a sudden ‘r’ or ‘l’ in a sequence of other sounds). If the Asian infant never hears people speaking words with r and l in them for 6 months, it loses mismatch negativity to them (and clinical perception of them). So our brains are literally ‘tuned’ to understand the language we hear.

So we are more likely to notice the T after a run of H’s, or an H after a run of T’s. We are also likely to notice just how long it has been since it last occurred.

This is part of a more general phenomenon: the ability of our brains to pick up and focus on changes in stimuli. Exactly the same phenomenon explains why we see edges of objects so well, and here at least we have a solid physiologic explanation: surround inhibition (for details see). It happens in the complicated circuitry of the retina, before the brain is involved.

Philosophers should note that this destroys the concept of the pure (e.g. uninterpreted) sensory percept — information is being processed within our eyes before it ever gets to the brain.

Update 31 Mar — I wrote the following to the lead author

“Dr. Sun:

Fascinating paper. I greatly enjoyed it.

You might be interested in a post from my blog (particularly the last few paragraphs). I didn’t read your paper carefully enough to see if you mention mismatch negativity, P300 and surround inhibition. If not, you should find this quite interesting.”


And received the following back in an hour or two

“Hi, Luysii- Thanks for your interest in our paper. I read your post, and find it very interesting, and your interpretation of our findings is very accurate. I completely agree with you making connections to the phenomenon of change detection and surround inhibition. We did not spell it out in the paper, but in the supplementary material, you may find some relevant references. For example, the inhibitory competition between HH and HT detectors is a key factor for the unsupervised pattern association we found in the neural model.


Nice ! ! !

Should pregnant women smoke pot?

Well, maybe this is why college board scores have declined so much in recent decades that they’ve been normed upwards. Given sequential MRI studies on brain changes throughout adolescence (with more to come), we know that it is a time of synapse elimination (this will be the subject of another post). We also know that endocannabinoids, the stuff in the brain that marihuana is mimicking, are retrograde messengers there, setting synaptic tone for information transmission between neurons.

But there’s something far scarier in a paper that just came out [ Proc. Natl. Acad. Sci. vol. 112 pp. 3415 – 3420 ’15 ]. Hedgehog is a protein so named because its absence in fruitflies (Drosophila) causes excessive bristles to form, making them look like hedgehogs. This gives you a clue that Hedgehog signaling is crucial in embryonic development. A huge amount is known about it with more being discovered all the time — for far more details than I can provide see

Unsurprisingly, embryonic development of the brain involves hedgehog, e.g. [ Neuron vol. 39 pp. 937 – 950 ’03 ]. Hedgehog (Shh) signaling is essential for the establishment of the ventral pattern along the whole neuraxis (including the telencephalon). It plays a mitogenic role in the expansion of granule cell precursors during CNS development. This work shows that absence of Shh decreases the number of neural progenitors in the postnatal subventricular zone and hippocampus. Similarly, conditional inactivation of smoothened results in the formation of fewer neurospheres from progenitors in the subventricular zone. Stimulation of the hedgehog pathway in the mature brain results in elevated proliferation in telencephalic progenitors. It’s a lot of unfamiliar jargon, but you get the idea.

Of interest is the fact that the protein is extensively covalently modified by lipids (cholesterol at the carboxy terminal end and palmitic acid at the amino terminal end). These allow hedgehog to bind to its receptor (smoothened). It stands to reason that other lipids might block this interaction. The PNAS work shows this is exactly the case (in Drosophila at least). One or more lipids present in Drosophila lipoprotein particles are needed in vivo to keep Hedgehog signaling turned off in wing discs (when hedgehog ligand isn’t around). The lipids destabilize Smoothened. This work identifies endocannabinoids as the inhibitory lipids from extracts of human very low density lipoprotein (VLDL).

It certainly is a valid reason for women not to smoke pot while pregnant. The other problem with the endocannabinoids and exocannabinoids (e.g. delta 9 tetrahydrocannabinol), is that they are so lipid soluble they stick around for a long time — see

It is amusing to see regulatory agencies wrestling with ‘medical marihuana’ when it never would have gotten through the FDA given the few solid studies we have in man.

A post which may actually be of some use to Safari users

This post may actually be of some use (to those of you using Safari on a Mac anyway). Yesterday, I had the awful experience of a pop-up that I couldn’t get rid of. It said that I had to call a number right away to protect my identity etc. etc. I’d heard about malware that got on your computer encrypting everything so you couldn’t use it, except to pay them a ransom.

So I tried quitting Safari and restarting. No luck. There it was along with sites I always go to on Safari (PNAS, Nature, Science, Cell and Neuron).

So I tried to shut down (which wasn’t possible because I got a note that Safari was busy).

Then I used Force Quit to shut down Safari and was then able to shut down.

Rebooting was of no help whatsoever, as the pop-up appeared along with all 5 sites I usually have open whenever I opened Safari. This happened several times, yours truly being bull headed enough to try it again and again against all hope.

Time to call Applecare, and they fixed it immediately. Apparently Safari has some sort of cache which reopens everything you had open on your last visit. This is what brought up my favored sites and the annoying pop-up.

The trick is to open Safari from the Dock (and you must do it this way, not from recently used items) with the shift key held down. This flushes the cache (and the pop-up along with it).

Applecare said this pop-up wasn’t malware, just a scam which charged money to get rid of it (which you can now do free of charge).

Why drug discovery is so hard: Reason #26 — We’re discovering new players all the time

Drug discovery is so very hard because we don’t understand the way cells and organisms work very well. We know some of the actors (DNA, proteins, lipids, enzymes), but new ones are being discovered all the time (even among categories known for decades such as microRNAs).

Briefly, microRNAs bind to messenger RNAs, usually decreasing their stability so less protein is made from them (translated) by the ribosome. It’s more complicated than that (see later), but that’s not bad for a first pass.

Presently some 2,800 human microRNAs have been annotated. Many of them are promiscuous, binding more than one type of mRNA. However, the following paper more than doubled their number, finding some 3,707 new ones [ Proc. Natl. Acad. Sci. vol. 112 pp. E1106 – E1115 ’15 ]. How did they do it?

Simplicity itself. They just looked at samples of ‘short’ RNA sequences from 13 different tissue types. MicroRNAs are all under 30 nucleotides long (although their precursors are not). The reason that so few microRNAs have been found in the past 20 years is that cross-species conservation has been used as a criterion to discover them. The authors abandoned that criterion. How did they know that this stuff wasn’t just transcriptional chaff? Two enzymes (DROSHA, DICER) are involved in microRNA formation from larger precursors, and inhibiting them decreased the abundance of the ‘new’ RNAs, implying that they’d been processed by the enzymes rather than just being runoff from the transcriptional machinery. Further evidence is that half were found associated with a protein called Argonaute, which applies the microRNA to the mRNA. 92% of the microRNAs were found in 10 or more samples. An incredible 23 billion sequenced reads were needed to find them.

If that isn’t complex enough for you, consider that we now know that microRNAs bind mRNAs everywhere, in introns and exons as well as in the 3′ untranslated region (3′ UTR). MicroRNAs also bind pseudogenes, SINEs, circular RNAs and nonCoding RNAs. So it’s a giant salad bowl of various RNAs binding each other, affecting their stability and other functions. This may be an echo of prehistoric life before DNA arrived on the scene.

It’s early times, and the authors estimate that we have some 25,000 microRNAs in our genome — more than the number of protein genes.

As always, the Category “Molecular Biology Survival Guide” found on the left should fill in any gaps you may have.

One rather frightening thought: if, as Dawkins said, we are just large organisms designed to allow DNA to reproduce itself, is all our DNA, proteins, lipids etc. just a large chemical apparatus to allow our RNA to reproduce itself? Perhaps the primitive RNA world from which we are all supposed to have arisen never left.

The dietary guidelines have been changed — what are the faithful to believe now ?

While we were in China dietary guidelines shifted. Cholesterol is no longer bad. Shades of Woody Allen and “Sleeper”. It’s life imitating art.

Sleeper is one of the great Woody Allen movies from the 70s. Woody plays Miles Monroe, the owner of (what else?) a health food store who through some medical mishap is frozen in nitrogen and is awakened 200 years later. He finds that scientific research has shown that cigarettes and fats are good for you. A McDonald’s restaurant is shown with a sign “Over 795 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 Served”

Seriously then, should you believe any dietary guidelines? In my opinion you shouldn’t. In particular I’d forget the guidelines for salt intake (unless you actually have high blood pressure in which case you should definitely limit your salt). People have been fighting over salt guidelines for decades, studies have been done and the results have been claimed to support both sides.

So what’s a body to do? Well here are 4 things which are pretty solid (which few docs would disagree with, myself included)

1. Don’t smoke
2. Don’t drink too much (over 2 drinks a day), or too little (no drinks). Study after study has shown that mortality is lowest with 1 – 2 drinks/day
3. Don’t get fat — by this I mean fat (Body Mass Index over 30) not overweight (Body Mass Index over 25). The mortality curve for BMI in this range is pretty flat. So eat whatever you want, it’s the quantities you must control.
4. Get some exercise — walking a few miles a week is incredibly much better than no exercise at all — it’s probably half as good as intense workouts — compared to doing nothing.

Not very sexy, but you’re very unlikely to find anyone telling you the opposite 50 years from now.
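For item 3 in the list, Body Mass Index is just weight in kilograms divided by the square of height in meters. A minimal sketch using the two cutoffs mentioned there (over 25 and over 30); the function names and the sample people are made up for illustration.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def category(value: float) -> str:
    """Label a BMI using the two cutoffs from the list above."""
    if value > 30:
        return "fat"
    if value > 25:
        return "overweight"
    return "neither"


# Two hypothetical people, purely for illustration.
for weight_kg, height_m in ((80, 1.80), (100, 1.75)):
    value = bmi(weight_kg, height_m)
    print(f"{weight_kg} kg, {height_m} m: BMI {value:.1f} ({category(value)})")
```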

It’s off topic, but I’d use the same degree of skepticism about the dire predictions of the Global Warming (AKA Climate change) people, particularly since there has been no change in global mean temperature this century.

When the active form of a protein is intrinsically disordered

Back in the day, biochemists talked about the shape of a protein, influenced by the spectacular pictures produced by Xray crystallography. Now, of course, we know that a protein has multiple conformations in the cell. I still find it miraculous that the proteins making us up have only relatively few. For details see —

Presently, we also know that many proteins contain segments which are intrinsically disordered (e.g. no single shape). The pendulum has swung the other way, with estimations that contiguous disordered regions longer than 50 amino acids ‘may be present’ in ‘up to’ 50% of proteins coded in eukaryotic genomes [ Proc. Natl. Acad. Sci. vol. 102 pp. 17002 – 17007 ’05 ].

[ Science vol. 325 pp. 1635 – 1636 ’09 ] Compared to ordered regions, disordered regions of proteins have evolved rapidly, contain many short linear motifs that mediate protein/protein interactions, and have numerous phosphorylation sites. Disordered regions are enriched in serine and threonine residues, while ordered sequences are enriched in tyrosines, which highlights functional differences in the types of phosphorylation. Interestingly, tyrosines have been lost during evolution.

What are unstructured protein segments good for? One theory is that the disordered segment can adopt different conformations to bind to different partners — this is the moonlighting effect. Then there is the fly casting mechanism — by being disordered (hence extended rather than compact) such proteins can flail about and find partners more easily.

Given what we know about enzyme function (and by inference protein function), it is logical to assume that the structured form of a protein which can be unstructured is the functional form.

Not so, according to this recent example [ Nature vol. 519 pp. 106 – 109 ’15 ]. 4EBP2 is a protein involved in the control of protein synthesis. It binds to another protein also involved in synthesis (eIF4E) to suppress a form of translation of mRNA into protein (cap dependent translation if you must know). 4EBP2 is intrinsically disordered. When it binds to its target it undergoes a disorder-to-order transition. However eIF4E binding only occurs from the intrinsically disordered form.

Control of 4EBP2 activity is due, in part, to phosphorylation on multiple sites. This induces folding of amino acids #18 – #62 into a 4 stranded beta domain which sequesters the canonical YXXXLphi motif with which 4EBP2 binds eIF4E (Y stands for tyrosine, X for any amino acid, L for leucine and phi for any bulky hydrophobic amino acid). So here we have an inactive (i.e. nonbinding) form of a protein being the structured rather than the unstructured form. The unstructured form of 4EBP2 is therefore the physiologically active form of the protein.
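Motif notation like this translates almost directly into a regular expression. The sketch below is only an illustration: which residues count as "bulky hydrophobic" is an assumption made here (L, I, V, M, F, W), the spacer length follows the motif as written above and can be changed if the consensus you work from differs, and the two test peptides are invented, not taken from 4EBP2.

```python
import re

# ASSUMPTION: residues treated as "bulky hydrophobic" (phi) for this sketch.
PHI = "LIVMFW"


def motif_regex(n_spacer: int = 3) -> re.Pattern:
    """Regex for Y, then n_spacer arbitrary residues, then L, then one phi."""
    return re.compile(rf"Y.{{{n_spacer}}}L[{PHI}]")


def find_motifs(sequence: str, n_spacer: int = 3) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each motif hit."""
    return [(m.start() + 1, m.group())
            for m in motif_regex(n_spacer).finditer(sequence)]


# Invented peptides, purely for illustration.
print(find_motifs("GGYAAALMGG"))  # contains Y-A-A-A-L-M
print(find_motifs("GGYAAAGMGG"))  # the leucine is replaced, so no hit
```

A sequence search like this only says the motif is present. Whether it is exposed or tucked away inside a folded domain, which is the whole point of the paper, is something the sequence alone cannot tell you.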

Off to China

No posts until March. Off to meet our new Granddaughter. Will be Email and Internet free until then.

To fill up the empty hours until I’m back, drug chemists should study the physical chemistry of protein/protein interaction, since that’s where most cellular work is done (and where new drugs should be useful). The interactions are multiple, transient and nonequivalent (the WordPress processor substituted this for nonCovalent).

An interesting paper made all 160,000 possible variants of 4 amino acids at the interface between two bacterial proteins [ Science vol. 347 pp. 673 – 677 ’15 ]. For bacterial histidine kinases mutating just 3 or 4 interfacial amino acids to match those in another kinase is enough to reprogram their specificity. The key amino acids are Ala284, Val285, Ser288, Thr289. The results were rather surprising.
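Where does 160,000 come from? Four positions, 20 possible amino acids at each, so 20 × 20 × 20 × 20. A minimal sketch that enumerates them, using the standard one-letter codes (the variable names are just for illustration):

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every possible combination at the 4 interface positions.
variants = ["".join(combo) for combo in product(AMINO_ACIDS, repeat=4)]

print(len(variants))       # 20**4 = 160000
print("AVST" in variants)  # the Ala, Val, Ser, Thr combination named above
```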


