Why we imperfectly understand randomness the way we do.

The cognoscenti think the average individual is pretty dumb when it comes to probability and randomness. Not so, says a fascinating recent paper [ Proc. Natl. Acad. Sci. vol. 112 pp. 3788 – 3792 ’15 ]. The average joe (this may mean you) when asked to draw a random series of fifty or so heads and tails never puts in enough runs of heads or runs of tails. This leads to the gambler’s fallacy, that if an honest coin gives a run of say 5 heads, the next result is more likely to be tails.

There is a surprising amount of structure lurking within purely random sequences such as the toss of a fair coin where the probability of heads is exactly 50%. Even with a series with 50% heads, the waiting time for two heads (HH) or two tails (TT) to appear is significantly longer than for an alternation (HT or TH). On average 6 tosses will be required for HH or TT to appear while only an average of 4 are needed for HT or TH.
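For the skeptical, the 6 versus 4 figures are easy to check. The sketch below is mine, not the paper’s. It computes the exact mean wait for a fair coin with the standard overlap rule (add 2^k for every k where the first k symbols of the pattern equal its last k), then cross-checks with a seeded simulation. The function names, trial counts and seed are arbitrary choices for illustration.

```python
import random

def expected_wait(pattern):
    """Exact mean number of fair-coin tosses until pattern first appears (overlap rule)."""
    n = len(pattern)
    return sum(2 ** k for k in range(1, n + 1) if pattern[:k] == pattern[-k:])

def simulated_wait(pattern, trials=50_000, seed=0):
    """Average number of tosses until pattern first appears, estimated by simulation."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    total = 0
    for _ in range(trials):
        window = ""   # the most recent len(pattern) tosses
        tosses = 0
        while window != pattern:
            window = (window + rng.choice("HT"))[-len(pattern):]
            tosses += 1
        total += tosses
    return total / trials

for pattern in ("HH", "TT", "HT", "TH"):
    print(pattern, expected_wait(pattern), round(simulated_wait(pattern, trials=20_000), 2))
```

The overlap rule also shows why longer runs are worse: HHH overlaps itself at every length, so its mean wait is 2 + 4 + 8 = 14 tosses.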

This is why Joe SixPack never puts in enough runs of Hs or Ts.

Why should the wait be longer for HH or TT even when 50% of the time you get a H or T? The mean time between occurrences (how often a pattern turns up over a long run of tosses) is the same for HH and TT as for HT and TH. What differs is the variance: occurrences of HH and TT are bunched in time, while those of HT and TH are spread more evenly, so the first HH or TT takes longer to arrive.
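The same mean but different variance claim can also be checked by brute force. This is a rough sketch under my own assumptions (200,000 simulated tosses, an arbitrary seed, overlapping occurrences counted separately so that HHH counts as two HHs one toss apart); it is not the paper’s analysis.

```python
import random
from statistics import mean, pvariance

def gaps(sequence, pattern):
    """Distances between successive (possibly overlapping) occurrences of pattern."""
    hits = [i for i in range(len(sequence) - len(pattern) + 1)
            if sequence.startswith(pattern, i)]
    return [later - earlier for earlier, later in zip(hits, hits[1:])]

rng = random.Random(1)  # arbitrary seed, for repeatability only
tosses = "".join(rng.choice("HT") for _ in range(200_000))

summary = {}
for pattern in ("HH", "HT"):
    g = gaps(tosses, pattern)
    summary[pattern] = (mean(g), pvariance(g))
    print(pattern, "mean gap", round(summary[pattern][0], 2),
          "variance", round(summary[pattern][1], 2))
```

Both patterns should show a mean gap close to 4, with a much larger variance for HH: gaps of 1 (inside a run of heads) are paid for by long droughts, which is the bunching described above.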

It gets worse for longer repetitions, because they can build on each other. HHH contains two instances of HH, while alternations do not. Repetitions bunch together as noted earlier. We are very good at perceiving waiting times, and this is probably why we think repetitions are less likely and soon to break up.

The paper goes a lot farther, constructing a neural model based on the way our brains integrate information over time when processing sequences of events. It takes into consideration our perceptions of mean time AND waiting times. We average the two. This produces the best fitting bias gain parameter for an existing Bayesian model of randomness.

See, you’re not as dumb as they thought you were.

Another reason for our behavior comes from neuropsychology and physiological psychology. We have ways to watch the electrical activity of your brain and find out when you perceive something as different. It’s called mismatch negativity (see http://en.wikipedia.org/wiki/Mismatch_negativity for more detail). It is a brain potential (called P300) peaking 0.1 to 0.25 seconds after a deviant tone or syllable.

Play 5 middle C’s in a row followed by a D, then C’s again. The potential doesn’t occur after any of the C’s, only after the D. This has been applied to the study of perception in infants long before they can speak.

It has shown us that Asian and Western newborn infants both hear ‘r’ and ‘l’ quite well (showing mismatch negativity to a sudden ‘r’ or ‘l’ in a sequence of other sounds). If the Asian infant never hears people speaking words with r and l in them for 6 months, it loses mismatch negativity to them (and clinical perception of them). So our brains are literally ‘tuned’ to understand the language we hear.

So we are more likely to notice the T after a run of H’s, or an H after a run of T’s. We are also likely to notice just how long it has been since it last occurred.

This is part of a more general phenomenon — the ability of our brains to pick up and focus on changes in stimuli. Exactly the same phenomenon explains why we see edges of objects so well — at least here we have a solid physiologic explanation — surround inhibition (for details see — http://en.wikipedia.org/wiki/Lateral_inhibition). It happens in the complicated circuitry of the retina, before the brain is involved.

Philosophers should note that this destroys the concept of the pure (e.g. uninterpreted) sensory percept — information is being processed within our eyes before it ever gets to the brain.

Should pregnant women smoke pot?

Well, maybe this is why college board scores have declined so much in recent decades that they’ve been normed upwards. Given sequential MRI studies on brain changes throughout adolescence (with more to come), we know that it is a time of synapse elimination. (this will be the subject of another post). We also know that endocannabinoids, the stuff in the brain that marihuana is mimicking, are retrograde messengers there, setting synaptic tone for information transmission between neurons.

But there’s something far scarier in a paper that just came out [ Proc. Natl. Acad. Sci. vol. 112 pp. 3415 – 3420 ’15 ]. Hedgehog is a protein so named because its absence in fruitflies (Drosophila) causes excessive bristles to form, making them look like hedgehogs. This gives you a clue that Hedgehog signaling is crucial in embryonic development. A huge amount is known about it with more being discovered all the time — for far more details than I can provide see http://en.wikipedia.org/wiki/Hedgehog_signaling_pathway.

Unsurprisingly, embryonic development of the brain involves hedgehog, e.g. [ Neuron vol. 39 pp. 937 – 950 ’03 ]. Hedgehog (Shh) signaling is essential for the establishment of the ventral pattern along the whole neuraxis (including the telencephalon). It plays a mitogenic role in the expansion of granule cell precursors during CNS development. This work shows that absence of Shh decreases the number of neural progenitors in the postnatal subventricular zone and hippocampus. Similarly, conditional inactivation of smoothened results in the formation of fewer neurospheres from progenitors in the subventricular zone. Stimulation of the hedgehog pathway in the mature brain results in elevated proliferation in telencephalic progenitors. It’s a lot of unfamiliar jargon, but you get the idea.

Of interest is the fact that the protein is extensively covalently modified by lipids (cholesterol at the carboxy terminal end and palmitic acid at the amino terminal end). These allow hedgehog to bind to its receptor (smoothened). It stands to reason that other lipids might block this interaction. The PNAS work shows this is exactly the case (in Drosophila at least). One or more lipids present in Drosophila lipoprotein particles are needed in vivo to keep Hedgehog signaling turned off in wing discs (when hedgehog ligand isn’t around). The lipids destabilize Smoothened. This work identifies endocannabinoids as the inhibitory lipids from extracts of human very low density lipoprotein (VLDL).

It certainly is a valid reason for women not to smoke pot while pregnant. The other problem with the endocannabinoids and exocannabinoids (e.g. delta 9 tetrahydrocannabinol), is that they are so lipid soluble they stick around for a long time — see https://luysii.wordpress.com/2014/05/13/why-marihuana-scares-me/

It is amusing to see regulatory agencies wrestling with ‘medical marihuana’ when it never would have gotten through the FDA given the few solid studies we have in man.

A post which may actually be of some use to Safari users

This post may actually be of some use (to those of you using Safari on a Mac anyway). Yesterday, I had the awful experience of a pop-up that I couldn’t get rid of. It said that I had to call a number right away to protect my identity etc. etc. I’d heard about malware that got on your computer encrypting everything so you couldn’t use it, except to pay them a ransom.

So I tried quitting Safari and restarting. No luck. There it was along with sites I always go to on Safari (PNAS, Nature, Science, Cell and Neuron).

So I tried to shut down (which wasn’t possible because I got a note that Safari was busy).

Then I used Force Quit to shut down Safari and was then able to shut down.

Rebooting was of no help whatsoever, as the pop-up appeared along with all 5 sites I usually have open whenever I opened Safari. This happened several times, yours truly being bull headed enough to try it again and again against all hope.

Time to call Applecare. They fixed it immediately. Apparently Safari has some sort of cache which reopens everything you had open on your last visit. This is what brought up my favored sites and the annoying popup.

The trick is to Open Safari from the Dock (and you must do it this way, not from recently used items) with the shift key held down — this flushed the cache (and the pop-up along with it).

Applecare said this pop-up wasn’t malware, just a scam which charged money to get rid of it (which you can now do free of charge).

Why drug discovery is so hard: Reason #26 — We’re discovering new players all the time

Drug discovery is so very hard because we don’t understand the way cells and organisms work very well. We know some of the actors — DNA, proteins, lipids, enzymes but new ones are being discovered all the time (even among categories known for decades such as microRNAs).

Briefly microRNAs bind to messenger RNAs usually decreasing their stability so less protein is made from them (translated) by the ribosome. It’s more complicated than that (see later), but that’s not bad for a first pass.

Presently some 2,800 human microRNAs have been annotated. Many of them are promiscuous binding more than one type of mRNA. However the following paper more than doubled their number, finding some 3,707 new ones [ Proc. Natl. Acad. Sci. vol. 112 pp. E1106 – E1115 ’15 ]. How did they do it?

Simplicity itself. They just looked at samples of ‘short’ RNA sequences from 13 different tissue types. MicroRNAs are all under 30 nucleotides long (although their precursors are not). The reason that so few microRNAs have been found in the past 20 years is that cross-species conservation has been used as a criterion to discover them. The authors abandoned the criterion. How did they know that this stuff wasn’t just transcriptional chaff? Two enzymes (DROSHA, DICER) are involved in microRNA formation from larger precursors, and inhibiting them decreased the abundance of the ‘new’ RNAs, implying that they’d been processed by the enzymes rather than just being runoff from the transcriptional machinery. Further evidence is that half were found associated with a protein called Argonaute, which applies the microRNA to the mRNA. 92% of the microRNAs were found in 10 or more samples. An incredible 23 billion sequenced reads were performed to find them.

If that isn’t complex enough for you, consider that we now know that microRNAs bind mRNAs everywhere (introns, exons), not just in the 3′ untranslated region (3′ UTR). MicroRNAs also bind pseudogenes, SINEs, circular RNAs and noncoding RNAs. So it’s a giant salad bowl of various RNAs binding each other, affecting their stability and other functions. This may be echoes of prehistoric life before DNA arrived on the scene.

It’s early times, and the authors estimate that we have some 25,000 microRNAs in our genome — more than the number of protein genes.

As always, the Category “Molecular Biology Survival Guide” found on the left should fill in any gaps you may have.

One rather frightening thought: if, as Dawkins said, we are just large organisms designed to allow DNA to reproduce itself, is all our DNA, proteins, lipids etc. just a large chemical apparatus to allow our RNA to reproduce itself? Perhaps the primitive RNA world from which we are all supposed to have arisen never left.

The dietary guidelines have been changed: what are the faithful to believe now?

While we were in China dietary guidelines shifted. Cholesterol is no longer bad. Shades of Woody Allen and “Sleeper”. It’s life imitating art.

Sleeper is one of the great Woody Allen movies from the 70s. Woody plays Miles Monroe, the owner of (what else?) a health food store who through some medical mishap is frozen in nitrogen and is awakened 200 years later. He finds that scientific research has shown that cigarettes and fats are good for you. A McDonald’s restaurant is shown with a sign “Over 795 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 Served”

Seriously then, should you believe any dietary guidelines? In my opinion you shouldn’t. In particular I’d forget the guidelines for salt intake (unless you actually have high blood pressure in which case you should definitely limit your salt). People have been fighting over salt guidelines for decades, studies have been done and the results have been claimed to support both sides.

So what’s a body to do? Well here are 4 things which are pretty solid (which few docs would disagree with, myself included)

1. Don’t smoke
2. Don’t drink too much (over 2 drinks a day), or too little (no drinks). Study after study has shown that mortality is lowest with 1 – 2 drinks/day
3. Don’t get fat — by this I mean fat (Body Mass Index over 30) not overweight (Body Mass Index over 25). The mortality curve for BMI in this range is pretty flat. So eat whatever you want, it’s the quantities you must control.
4. Get some exercise — walking a few miles a week is incredibly much better than no exercise at all — it’s probably half as good as intense workouts — compared to doing nothing.

Not very sexy, but you’re very unlikely to find anyone telling you the opposite 50 years from now.

It’s off topic, but I’d use the same degree of skepticism about the dire predictions of the Global Warming (AKA Climate change) people, particularly since there has been no change in global mean temperature this century.

When the active form of a protein is intrinsically disordered

Back in the day, biochemists talked about the shape of a protein, influenced by the spectacular pictures produced by Xray crystallography. Now, of course, we know that a protein has multiple conformations in the cell. I still find it miraculous that the proteins making us up have only relatively few. For details see — https://luysii.wordpress.com/2010/08/04/why-should-a-protein-have-just-one-shape-or-any-shape-for-that-matter/.

Presently, we also know that many proteins contain segments which are intrinsically disordered (i.e. no single shape). The pendulum has swung the other way, with estimates that contiguous disordered regions longer than 50 amino acids ‘may be present’ in ‘up to’ 50% of proteins coded in eukaryotic genomes [ Proc. Natl. Acad. Sci. vol. 102 pp. 17002 – 17007 ’05 ].

[ Science vol. 325 pp. 1635 – 1636 ’09 ] Compared to ordered regions, disordered regions of proteins have evolved rapidly, contain many short linear motifs that mediate protein/protein interactions, and have numerous phosphorylation sites. Disordered regions are enriched in serine and threonine residues, while ordered sequences are enriched in tyrosines, which highlights functional differences in the types of phosphorylation. Interestingly, tyrosines have been lost during evolution.

What are unstructured protein segments good for? One theory is that the disordered segment can adopt different conformations to bind to different partners — this is the moonlighting effect. Then there is the fly casting mechanism — by being disordered (hence extended rather than compact) such proteins can flail about and find partners more easily.

Given what we know about enzyme function (and by inference protein function), it is logical to assume that the structured form of a protein which can be unstructured is the functional form.

Not so, according to this recent example [ Nature vol. 519 pp. 106 – 109 ’15 ]. 4EBP2 is a protein involved in the control of protein synthesis. It binds to another protein also involved in synthesis (eIF4E) to suppress a form of translation of mRNA into protein (cap dependent translation if you must know). 4EBP2 is intrinsically disordered. When it binds to its target it undergoes a disorder-to-order transition. However, eIF4E binding only occurs from the intrinsically disordered form.

Control of 4EBP2 activity is due, in part, to phosphorylation on multiple sites. This induces folding of amino acids #18 – #62 into a 4 stranded beta domain which sequesters the canonical YXXXXLphi motif with which 4EBP2 binds eIF4E (Y stands for tyrosine, X for any amino acid, L for leucine and phi for any bulky hydrophobic amino acid). So here we have an inactive (i.e. nonbinding) form of a protein being the structured rather than the unstructured form. The unstructured form of 4EBP2 is therefore the physiologically active form of the protein.

Off to China

No posts until March. Off to meet our new Granddaughter. Will be Email and Internet free until then.

To fill up the empty hours until I’m back, drug chemists should study the physical chemistry of protein/protein interaction, since that’s where most cellular work is done (and where new drugs should be useful). The interactions are multiple, transient and noncovalent.

An interesting paper made all 160,000 possible variants of 4 amino acids at the interface between two bacterial proteins [ Science vol. 347 pp. 673 – 677 ’15 ]. For bacterial histidine kinases mutating just 3 or 4 interfacial amino acids to match those in another kinase is enough to reprogram their specificity. The key amino acids are Ala284, Val285, Ser288, Thr289. The results were rather surprising.
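The 160,000 figure is just 20 possible amino acids at each of 4 positions (20 to the 4th power). A quick check using the standard one-letter codes; enumerating rather than multiplying is only to make the point concrete, and the variable names are mine.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes

# Every possible choice of residue at 4 interface positions
variant_count = sum(1 for _ in product(AMINO_ACIDS, repeat=4))

print(variant_count)  # 160000
```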


Scary stuff

While you were in your mother’s womb, endogenous viruses were moving around the genome in your developing brain according to [ Neuron vol. 85 pp. 49 – 59 ’15 ].

The evidence is pretty good. For a while half our genome was called ‘junk’ by those who thought they had molecular biology pretty well figured out. For instance 17% of our 3.2 gigaBase DNA genome is made of LINE1 elements. These are ‘up to’ 6 kiloBases long. Most are defective in the sense that they stay where they are in the genome. However, some are able to be transcribed into RNA and the RNA translated into proteins, among which are a reverse transcriptase (just like the AIDS virus) and an integrase. The reverse transcriptase makes a DNA copy of the RNA, and the integrase puts it back into the genome in a different place.

Most LINE1 DNA transcribed into RNA has a ‘tail’ of polyAdenine (polyA) tacked onto the 3′ end. The number of A’s tacked on isn’t coded in the genome, so it’s variable. This allows the active LINE1’s (under 1/1,000 of the total) to be recognized when they move to a new place in the genome.

It’s unbelievable how far we’ve come since the Human Genome Project, which took over a decade and over a billion dollars to sequence a single human genome (still being completed, by the way, by filling in gaps using a haploid human tumor called a hydatidiform mole [ Nature vol. 517 pp. 608 – 611 ’15 ]). The Neuron paper sequenced the DNA of 16 single neurons. They found LINE1 movement in 4.

Once a LINE1 element has moved (something very improbable) it stays put, but all cells derived from it have the LINE1 element in the new position.

They found multiple lineages and sublineages of cells marked by different LINE1 retrotransposition events and subsequent mutation of polyA microsatellites within L1. One clone contained thousands of cells limited to the left middle frontal gyrus, while a second clone contained millions of cells distributed over the whole left hemisphere (did they do whole genome on millions of cells?).

There is one fly in the ointment. All 16 neurons were from the same ‘neurologically normal’ individual.

Mosaicism is a term used to mean that different cells in a given individual have different genomes. This is certainly true in everyone’s immune system, but we’re talking brain here.

Is there other evidence for mosaicism in the brain? Yes. Here it is

[ Science vol. 345 pp. 1438 – 1439 ’14 ] 8/158 kids with brain malformations with no genetic cause (as found by previous techniques) had disease causing mutations in only a fraction of their cells (hopefully not brain cells produced by biopsy). Some mosaicism is obvious: the cafe au lait spots of McCune Albright syndrome for example. DNA sequencing takes the average of multiple reads (of the DNA from multiple cells?). Mutations found in only a few reads are interpreted as part of the machine’s inherent error rate. The trick was to use sequencing of candidate gene regions to a depth of 300 (rather than the usual 50 – 60).
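To see why depth matters, here is a back of the envelope sketch. The numbers are assumptions of mine, not from the Science paper: a heterozygous mutation present in 10% of cells (so about 5% of reads carry it), and a variant caller that wants at least 5 supporting reads before believing a variant isn’t machine error. Reads are treated as independent, which is a simplification.

```python
from math import comb

def prob_at_least(k, depth, read_fraction):
    """Binomial probability of seeing k or more mutant reads among `depth` reads."""
    below = sum(comb(depth, i) * read_fraction ** i * (1 - read_fraction) ** (depth - i)
                for i in range(k))
    return 1 - below

READ_FRACTION = 0.05  # assumed: heterozygous mutation in 10% of cells
MIN_READS = 5         # assumed: supporting reads needed to call a variant

for depth in (50, 300):
    print(depth, round(prob_at_least(MIN_READS, depth, READ_FRACTION), 3))
```

Under these assumptions the mutation is usually missed at a depth of 50 and almost always picked up at 300.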

It is possible that some genetically ‘normal’ parents who have abnormal kids are mosaics for the genetic abnormality.

[ Science vol. 342 pp. 564 – 565, 632 – 637 ’13 ] Our genomes aren’t perfect. Each human genome contains 120 protein gene inactivating variants, with 20/120 being inactivated in both copies.

The blood of ‘many’ individuals becomes increasingly clonal with age, and the expanded clones often contain large deletions and duplications, a risk factor for cancer.

Some cases of hemimegalencephaly are due to somatic mutations in AKT3.

30% of skin fibroblasts ‘may’ have somatic copy number variations in their genomes.

The genomes of 110 individual neurons from the frontal cortex of 3 people were sequenced. 45/110 of the neurons had copy number variations (CNVs), ranging in size from 3 megaBases to a whole chromosome. 15% of the neurons accounted for 73% of the CNVs. However, 59% of neurons showed no CNVs, while 25% showed only 1 or 2.

The chemical ingenuity of the Lake Ontario midge

Well we’re freezing our butts off here in sunny New England, so it’s time to discourse upon the chemical ingenuity of antifreeze proteins. They’ve long been known, with most found in fish living in arctic waters. A very unusual structure is found in a 79 amino acid protein from an insect living near Lake Ontario. A set of 10 amino acid tandem repeats makes up most of the protein. Here is the repeat.

X X Cys X Gly X Tyr Cys X Gly ; X = any amino acid.

Can you as a computational chemistry expert figure out what it forms?

The 10 amino acids form a complete circle with the peptide backbone looking nothing like an alpha helix, a beta sheet or anything else I’ve seen. It just sort of wanders around for 360 degrees. In cross section the ‘circle’ resembles the Greek letter theta with a disulfide bond between the two cysteines forming a crossbar inside the circle. This puts all 7 tyrosines from the 7 repeats in a row on one side of the circle, where they form the presumed ice binding site. The solenoid is reinforced by intrachain hydrogen bonds, and side chain salt bridges. You can read about it and see some pictures in [ Proc. Natl. Acad. Sci. vol. 112 pp. 737 – 742 ’15 ].
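For those who like to search sequences, the 10 residue repeat given earlier translates directly into a regular expression. This is only a sketch: the sequence below is made up for illustration (7 identical copies of an arbitrary repeat plus arbitrary flanking residues, 79 residues in all) and is not the real midge protein.

```python
import re

# X X Cys X Gly X Tyr Cys X Gly, with '.' standing in for X (any amino acid)
REPEAT = re.compile(r"..C.G.YC.G")

# Hypothetical toy sequence, NOT the actual antifreeze protein
toy_sequence = "MKLAV" + "AKCTGSYCDG" * 7 + "SGKR"

matches = list(REPEAT.finditer(toy_sequence))  # non-overlapping, left to right
repeat_count = len(matches)
tyrosine_positions = [m.start() + 6 for m in matches]  # 0-based index of the Tyr in each repeat

print(repeat_count, tyrosine_positions)
```

In the toy sequence the tyrosines fall exactly 10 residues apart, one per turn of the solenoid, which is what lines them up along one side.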

The chemical ingenuity of some of these proteins is remarkable. None of them (except one) appear to have been figured out before their structures were determined.

[ Proc. Natl. Acad. Sci. vol. 108 pp. 7281 – 7282 ’11 ] Even now, the structural differences between the surface of ice nuclei and liquid water are poorly characterized (we don’t even know how many hydrogen bonds are involved), yet antifreeze proteins somehow recognize it. Some 12 different structural motifs have been found in antifreeze proteins. 3 are given: one is a small globular protein (sea pout), another is an alpha helix (winter flounder), and the third is a stack of left handed polyproline II helices (snow flea). The present work gives a fourth example, a right handed parallel beta helix from Marinomonas primoryensis. It is a 34 kiloDalton domain, a calcium bound parallel beta helix with an extensive array of icelike surface waters that are anchored via hydrogen bonds directly to the protein backbone and adjacent side chains. The bound waters make an excellent 3 dimensional match to the primary prism and basal planes of ice.

Probably the most counterintuitive antifreeze protein is the following. It stands a lot of what we thought we knew about protein structure on its head.

[ Science vol. 343 pp. 743 – 744, 795 – 798 ’14 ] Almost all globular proteins reported to date have a dry (i.e. water free) protein core. An antifreeze protein called Maxi from the winter flounder (Pseudopleuronectes americanus) has been found with a water filled core. It is a 3 kiloDalton alanine rich 4 helix bundle 145 Angstroms long. The periodicity of the alpha helices is 11 amino acids. A single turn of an alpha helix is 5.4 Angstroms high and 11 Angstroms wide. So 11 amino acids fairly neatly comes out to 16 Angstroms in length, because each helical turn is 3.7 residues (vs. the normal 3.6 in the classic alpha helix), giving 11 / 3.7 or about 3 turns of 5.4 Angstroms each. The ice binding residues are Threonine at position i, Alanine at position i + 4 and Alanine at position i + 8 (putting them along one face of the helix). The protein is a dimer of monomers each containing two helices. The core comprises 400 (yes 400!) highly organized water molecules. The water is interleaved as a roughly two molecule thick layer between both intra and intermonomer helix interfaces, extending to the ice binding surfaces. Maxi must bind ice nuclei and inhibit their growth. The water molecules inside the bundle form pentagons! Amazingly, this was predicted 50 years ago by Scheraga. The 5 membered water rings form cages around individual amino acid side chains, illustrating a semi-clathrate structure rather than ice. Most of the carbonyls are involved in hydrogen bonding interactions with water, helping to keep the protein soluble. The protein denatures at low temperatures (16 C).

Ordered water can be found in most high resolution Xray crystallography protein structures, but it is usually found between the proteins. Maxi retains the very structure of water.

Removal of water has been proposed as a potential rate limiting step in protein folding. Maxi folds to the point where water not in direct contact with the protein chain is removed from its core. It then arrests further folding to retain a beautifully ordered core of water interleaved between the protein helices.

Amazing! No one would ever have predicted something like Maxi (except Scheraga).

The butterfly effect in cancer

Fans of Chaos know all about the butterfly effect, where a tiny change in air current produced by a butterfly’s wings in Brazil leads to a typhoon in Java. Could such a thing happen in cell biology? [ Proc. Natl. Acad. Sci. vol. 112 pp. 1131 – 1136 ’15 ] comes close.

The Cancer Genome Project has spent a ton of money looking at all the mutations of all our protein coding genes which occur in various types of cancers. It was criticized as we already knew that cancer is effectively a hypermutable state, and that it would just prove the obvious. Well it did, but it also showed us just what a formidable problem cancer actually is.

For instance [ Nature vol. 489 pp. 519 – 525 ’12 ] is a report from the Cancer Genome Atlas of 178 cases of squamous cell cancer of the lung. There are a mean of 360 exonic mutations, 165 genomic rearrangements, and 323 copy number alterations per tumor. The technical details in the rest of the paragraph can be safely ignored, but the point is that no consistent pattern of mutation was found (except for p53, which is mutated in over 50% of all types of cancer, which we knew long before the Cancer Genome Atlas). Recurrent mutations were found in 11 genes. p53 was mutated in nearly all. Previously unreported loss of function mutations were seen in the class I major histocompatibility gene (HLA-A). Several pathways were altered relatively consistently (NFE2L2, KEAP1 in 34%, squamous differentiation genes in 44%, PI3K genes in 47% and CDKN2A and RB1 in 72%). EGFR and KRAS mutations are rare in squamous cell cancer of the lung (but quite common in adenocarcinoma). Alterations in FGFR are quite common in squamous cell carcinomas.

This sort of thing (which has been found in all the many types of tumors studied by the Cancer Genome Atlas) led to a degree of hopelessness in looking for the holy grail of a single ‘driver mutation’ which leads to cancer with its attendant genomic instability.

All is not lost however.

MCF-10A is an immortalized epithelial cell line derived from human breast tissue. It is capable of continuous growth, but is far from normal: (1) an abnormal complement of chromosomes, (2) threefold amplification of the MYC oncogene, and (3) deletion of a known tumor suppressor. It does lack some mutations found in breast cancer. For instance, the Epidermal Growth Factor Receptor 2 (ERBB2) is not amplified. The cell line doesn’t express the estrogen and progesterone receptors, making it similar to triple negative breast cancer.

A single amino acid mutation (Arginine for Histidine at amino acid #1047) in the catalytic subunit of a very important protein kinase (p110alpha of the PIK3CA gene) was put into the MCF-10A cell line (giving a line called MCF-10A-H1047R). The mutation was chosen because it is one of the most frequently encountered cancer specific mutations known. Exome sequencing showed that this was the only change, but the control sequences outside the exons weren’t studied, a classic case of the protein centric style of molecular biology.

In the (admittedly not completely normal) cell line, the mutation produced a cellular reorganization that far exceeds the known signaling activities of PI3K. The proteins expressed were similar to the protein and RNA signatures of basal breast cancer. The phosphoproteins of MCF-10A-H1047R are extremely different. Inhibitors of the kinase induce only a partial reversion to the normal phenotype.

They plan to study the epigenome. This is significant as breast cancers are said in the paper to have tons of mutations changing amino acids in proteins (4,000 per tumor). In my opinion they should do whole genome sequencing of MCF-10A-H1047R as well.

The mutant becomes fully transformed when a second mutation (of KRAS, an oncogene) is put in. This allows the cells to form tumors in nude mice. Recall that nude mice (another rodent beloved of experimental biologists; see the previous post on the Naked Mole Rat) have a very limited immune system, allowing grafts of human cells to take root and proliferate.

How close the initial cell line is to normal is another matter. Work on a similar cell line (the 3T3 fibroblast) has been criticized because that cell is so close to neoplastic. At least the mutant MCF-10A-H1047R cells aren’t truly neoplastic, as they won’t produce tumors in nude mice. However, mutating just one more gene (KRAS) turns MCF-10A-H1047R malignant when transplanted.

The paper is also useful for showing how little we really understand about cause and effect in the cell. PI3K has been intensively studied for years because it is one of the major players telling cells to grow in size rather than divide. And yet “the mutation produced a cellular reorganization that far exceeds the known signaling activities of PI3K”.

