Category Archives: Molecular Biology

Why drug discovery is so hard: Reason #26 — We’re discovering new players all the time

Drug discovery is so very hard because we don’t understand the way cells and organisms work very well. We know some of the actors (DNA, proteins, lipids, enzymes), but new ones are being discovered all the time, even in categories known for decades such as microRNAs.

Briefly, microRNAs bind to messenger RNAs, usually decreasing their stability, so less protein is made from them (translated) by the ribosome. It’s more complicated than that (see later), but that’s not bad for a first pass.

Presently some 2,800 human microRNAs have been annotated. Many of them are promiscuous, binding more than one type of mRNA. However, the following paper more than doubled their number, finding some 3,707 new ones [ Proc. Natl. Acad. Sci. vol. 112 pp. E1106 – E1115 ’15 ]. How did they do it?

Simplicity itself. They just looked at samples of ‘short’ RNA sequences from 13 different tissue types. MicroRNAs are all under 30 nucleotides long (although their precursors are not). The reason that so few microRNAs have been found in the past 20 years is that cross-species conservation has been used as a criterion to discover them. The authors abandoned the criterion. How did they know that this stuff wasn’t just transcriptional chaff? Two enzymes (DROSHA, DICER) are involved in microRNA formation from larger precursors, and inhibiting them decreased the abundance of the ‘new’ RNAs, implying that they’d been processed by the enzymes rather than just being runoff from the transcriptional machinery. Further evidence is that half were found associated with a protein called Argonaute, which applies the microRNA to the mRNA. 92% of the microRNAs were found in 10 or more samples. An incredible 23 billion sequenced reads were analyzed to find them.

If that isn’t complex enough for you, consider that we now know that microRNAs bind mRNAs everywhere, not just in the 3′ untranslated region (3′ UTR) but also in introns and exons. MicroRNAs also bind pseudogenes, SINEs, circular RNAs and nonCoding RNAs. So it’s a giant salad bowl of various RNAs binding each other, affecting their stability and other functions. This may be an echo of prehistoric life before DNA arrived on the scene.

It’s early times, and the authors estimate that we have some 25,000 microRNAs in our genome — more than the number of protein genes.

As always, the Category “Molecular Biology Survival Guide” found on the left should fill in any gaps you may have.

One rather frightening thought: if, as Dawkins said, we are just large organisms designed to allow DNA to reproduce itself, is all our DNA, proteins, lipids etc. just a large chemical apparatus to allow our RNA to reproduce itself? Perhaps the primitive RNA world from which we are all supposed to have arisen never left.

When the active form of a protein is intrinsically disordered

Back in the day, biochemists talked about the shape of a protein, influenced by the spectacular pictures produced by Xray crystallography. Now, of course, we know that a protein has multiple conformations in the cell. I still find it miraculous that the proteins making us up have only relatively few. For details see — https://luysii.wordpress.com/2010/08/04/why-should-a-protein-have-just-one-shape-or-any-shape-for-that-matter/.

Presently, we also know that many proteins contain segments which are intrinsically disordered (i.e. no single shape). The pendulum has swung the other way: “estimations that contiguous regions longer than 50 amino acids ‘may be present’ in ‘up to’ 50% of proteins coded in eukaryotic genomes” [ Proc. Natl. Acad. Sci. vol. 102 pp. 17002 – 17007 ’05 ].

[ Science vol. 325 pp. 1635 – 1636 ’09 ] Compared to ordered regions, disordered regions of proteins have evolved rapidly, contain many short linear motifs that mediate protein/protein interactions, and have numerous phosphorylation sites. Disordered regions are enriched in serine and threonine residues, while ordered sequences are enriched in tyrosines, which highlights functional differences in the types of phosphorylation. Interestingly, tyrosines have been lost during evolution.

What are unstructured protein segments good for? One theory is that the disordered segment can adopt different conformations to bind to different partners — this is the moonlighting effect. Then there is the fly casting mechanism — by being disordered (hence extended rather than compact) such proteins can flail about and find partners more easily.

Given what we know about enzyme function (and by inference protein function), it is logical to assume that the structured form of a protein which can be unstructured is the functional form.

Not so according to this recent example [ Nature vol. 519 pp. 106 – 109 ’15 ]. 4EBP2 is a protein involved in the control of protein synthesis. It binds to another protein also involved in synthesis (eIF4E) to suppress a form of translation of mRNA into protein (cap dependent translation if you must know). 4EBP2 is intrinsically disordered. When it binds to its target it undergoes a disorder-to-order transition. However, eIF4E binding only occurs from the intrinsically disordered form.

Control of 4EBP2 activity is due, in part, to phosphorylation on multiple sites. This induces folding of amino acids #18 – #62 into a 4 stranded beta domain which sequesters the canonical YXXXXLphi motif with which 4EBP2 binds eIF4E (Y stands for tyrosine, X for any amino acid, L for leucine and phi for any bulky hydrophobic amino acid). So here we have an inactive (i.e. nonbinding) form of a protein being the structured rather than the unstructured form. The unstructured form of 4EBP2 is therefore the physiologically active form of the protein.

Scary stuff

While you were in your mother’s womb, endogenous viruses were moving around the genome in your developing brain according to [ Neuron vol. 85 pp. 49 – 59 ’15 ].

The evidence is pretty good. For a while half our genome was called ‘junk’ by those who thought they had molecular biology pretty well figured out. For instance 17% of our 3.2 gigaBase DNA genome is made of LINE1 elements. These are ‘up to’ 6 kiloBases long. Most are defective in the sense that they stay where they are in the genome. However some are able to be transcribed into RNA, and the RNA translated into proteins, among which are a reverse transcriptase (just like the AIDS virus) and an integrase. The reverse transcriptase makes a DNA copy of the RNA, and the integrase puts it back into the genome in a different place.
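A rough sanity check on those numbers, as a minimal Python sketch. It assumes every LINE1 copy is a full 6 kiloBases, which the text says is only the upper limit, so the copy count it gives is a floor rather than an estimate.

```python
GENOME_BASES = 3.2e9      # genome size quoted above
LINE1_FRACTION = 0.17     # share of the genome made of LINE1 elements
MAX_LINE1_BASES = 6_000   # 'up to' 6 kiloBases per element

line1_bases = GENOME_BASES * LINE1_FRACTION
# Lower bound only: shorter (truncated) copies would mean more of them.
min_copies = line1_bases / MAX_LINE1_BASES

print(f"{line1_bases / 1e6:.0f} megaBases of LINE1, at least {min_copies:,.0f} copies")
```

That works out to roughly 544 megaBases of LINE1 sequence and at least some 90,000 copies, which gives a feel for how much raw material there is for the rare active elements.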

Most LINE1 DNA transcribed into RNA has a ‘tail’ of polyAdenine (polyA) tacked onto the 3′ end. The numbers of A’s tacked on isn’t coded in the genome, so it’s variable. This allows the active LINE1’s (under 1/1,000 of the total) to be recognized when they move to a new place in the genome.

It’s unbelievable how far we’ve come since the Human Genome Project which took over a decade and over a billion dollars to sequence a single human genome (still being completed by the way filling in gaps etc. etc [ Nature vol. 517 pp. 608 – 611 ’15 ] using a haploid human tumor called a hydatidiform mole ). The Neuron paper sequenced the DNA of 16 single neurons. They found LINE1 movement in 4

Once a LINE1 element has moved (something very improbable) it stays put, but all cells derived from it have the LINE1 element in the new position.

They found multiple lineages and sublineages of cells marked by different LINE1 retrotransposition events and subsequent mutation of polyA microsatellites within L1. One clone contained thousands of cells limited to the left middle frontal gyrus, while a second clone contained millions of cells distributed over the whole left hemisphere (did they do whole genome on millions of cells?).

There is one fly in the ointment. All 16 neurons were from the same ‘neurologically normal’ individual.

Mosaicism is a term used to mean that different cells in a given individual have different genomes. This is certainly true in everyone’s immune system, but we’re talking brain here.

Is there other evidence for mosaicism in the brain? Yes. Here it is

[ Science vol. 345 pp. 1438 – 1439 ’14 ] 8/158 kids with brain malformations with no genetic cause (as found by previous techniques) had disease causing mutations in only a fraction of their cells (hopefully not brain cells produced by biopsy). Some mosaicism is obvious: the cafe au lait spots of McCune Albright syndrome, for example. DNA sequencing takes the average of multiple reads (of the DNA from multiple cells?). Mutations found in only a few reads are interpreted as part of the machine’s inherent error rate. The trick was to use sequencing of candidate gene regions to a depth of 300 (rather than the usual 50 – 60).
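Why depth matters can be seen with a little binomial arithmetic. The following is a minimal Python sketch, not the paper's method: it assumes reads are independent, that the mutation is heterozygous and present in 10% of cells (so about 5% of reads carry it), and that a variant is only believed when at least 8 reads show it. All three numbers are made up for illustration.

```python
from math import comb

def prob_at_least(k, depth, p):
    """P(X >= k) for X ~ Binomial(depth, p)."""
    below = sum(comb(depth, i) * p**i * (1 - p)**(depth - i) for i in range(k))
    return 1.0 - below

# Illustrative assumptions, not numbers from the paper:
cell_fraction = 0.10            # mutation present in 10% of cells
read_fraction = cell_fraction / 2   # heterozygous, so ~5% of reads carry it
min_reads = 8                   # hypothetical threshold for believing a variant

for depth in (50, 300):
    chance = prob_at_least(min_reads, depth, read_fraction)
    print(f"depth {depth}: chance of seeing >= {min_reads} variant reads = {chance:.3f}")
```

Under these assumptions the chance of clearing the threshold is tiny at a depth of 50 and better than 9 in 10 at a depth of 300, which is the whole point of sequencing deeper.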

It is possible that some genetically ‘normal’ parents who have abnormal kids are mosaics for the genetic abnormality.

[ Science vol. 342 pp. 564 – 565, 632 – 637 ’13 ] Our genomes aren’t perfect. Each human genome contains 120 protein gene inactivating variants, with 20/120 being inactivated in both copies.

The blood of ‘many’ individuals becomes increasingly clonal with age, and the expanded clones often contain large deletions and duplications, a risk factor for cancer.

Some cases of hemimegalencephaly are due to somatic mutations in AKT3.

30% of skin fibroblasts ‘may’ have somatic copy number variations in their genomes.

The genomes of 110 individual neurons from the frontal cortex of 3 people were sequenced. 45/110 of the neurons had copy number variations (CNVs), ranging in size from 3 megaBases to a whole chromosome. 15% of the neurons accounted for 73% of the CNVs. However, 59% of neurons showed no CNVs, while 25% showed only 1 or 2.

The butterfly effect in cancer

Fans of Chaos know all about the butterfly, where a tiny change in air current produced by a butterfly’s wings in Brazil leads to a typhoon in Java. Could such a thing happen in cell biology? [ Proc. Natl. Acad. Sci. vol. 112 pp. 1131 – 1136 ’15 ] comes close.

The Cancer Genome Project has spent a ton of money looking at all the mutations of all our protein coding genes which occur in various types of cancers. It was criticized as we already knew that cancer is effectively a hypermutable state, and that it would just prove the obvious. Well it did, but it also showed us just what a formidable problem cancer actually is.

For instance [ Nature vol. 489 pp. 519 – 525 ’12 ] is a report from the Cancer Genome Atlas of 178 cases of squamous cell cancer of the lung. There are a mean of 360 exonic mutations, 165 genomic rearrangements, and 323 copy number alterations per tumor. The technical details in the rest of the paragraph can be safely ignored, but the point is that no consistent pattern of mutation was found (except for p53 which is mutated in over 50% of all types of cancer, which we knew long before the Cancer Genome Atlas). Recurrent mutations were found in 11 genes. p53 was mutated in nearly all. Previously unreported loss of function mutations were seen in the class I major histocompatibility gene (HLA-A). Several pathways were altered relatively consistently (NFE2L2, KEAP1 in 34%, squamous differentiation genes in 44%, PI3K genes in 47% and CDKN2A and RB1 in 72%). EGFR and kRAS mutations are rare in squamous cell cancer of the lung (but quite common in adenocarcinoma). Alterations in FGFR are quite common in squamous cell carcinomas.

This sort of thing (which has been found in all the many types of tumors studied by the Cancer Genome Atlas) led to a degree of hopelessness in looking for the holy grail of a single ‘driver mutation’ which leads to cancer with its attendant genomic instability.

All is not lost however.

MCF-10A is an immortalized epithelial cell line derived from human breast tissue. It is capable of continuous growth, but is far from normal: (1) an abnormal complement of chromosomes, (2) threefold amplification of the MYC oncogene, and (3) deletion of a known tumor suppressor. It does lack some mutations found in breast cancer. For instance, the Epidermal Growth Factor Receptor 2 (ERBB2) is not amplified. The cell line doesn’t express the estrogen and progesterone receptors, making it similar to triple negative breast cancer.

A single amino acid mutation (Arginine for Histidine at amino acid #1047) in the catalytic subunit of a very important kinase (p110alpha, coded by the PIK3CA gene) was put into the MCF-10A cell line (which they call MCF-10A-H1047R). The mutation was chosen because it is one of the most frequently encountered cancer specific mutations known. Exome sequencing showed that this was the only change, but the control sequences outside the exons weren’t studied, a classic case of the protein centric style of molecular biology.

In the (admittedly not completely normal) cell line, the mutation produced a cellular reorganization that far exceeds the known signaling activities of PI3K. The proteins expressed were similar to the protein and RNA signatures of basal breast cancer. The phosphoproteins of MCF-10A-H1047R are extremely different. Inhibitors of the kinase induce only a partial reversion to the normal phenotype.

They plan to study the epigenome. This is significant as breast cancers are said in the paper to have tons of mutations changing amino acids in proteins (4,000 per tumor). In my opinion they should do whole genome sequencing of MCF-10A-H1047R as well.

The mutant becomes fully transformed when a second mutation (of KRAS, an oncogene) is put in. This allows the cells to form tumors in nude mice. Recall that the nude mouse (another rodent beloved of experimental biologists, see the previous post on the Naked Mole Rat) has a very limited immune system, allowing grafts of human cells to take root and proliferate.

How close the initial cell line is to normal is another matter. Work on a similar cell line (the 3T3 fibroblast) has been criticized because that cell is so close to neoplastic. At least the mutant MCF-10A-H1047R cells aren’t truly neoplastic as they won’t produce tumors in nude mice. However, mutating just one more gene (KRAS) turns MCF-10A-H1047R malignant when transplanted.

The paper is also useful for showing how little we really understand about cause and effect in the cell. PI3K has been intensively studied for years because it is one of the major players telling cells to grow in size rather than divide. And yet “the mutation produced a cellular reorganization that far exceeds the known signaling activities of PI3K”

Time for drug chemists to hit the cell biology books

The (undeservedly) obscure Naked Mole Rat should be of interest to drug chemists for two reasons (1) it lives 8 times as long as its fellow rodent the lab mouse (2) it never gets cancer (despite being under observation for the past 40 years). So untangling the mechanisms behind this should tell us about aging and cancer, particularly since cancer accounts for over 50% of the mortality in lab rodents. They age healthy. Until the last few years of their long lives, they show minimal morphological and physical changes of aging.

This post will concern a possible way Naked Mole Rats escape cancer. I’ve attempted to provide a molecular biological background for chemists about DNA, RNA, gene transcription etc. See https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and follow the links. There is very little in these posts about cell physiology and biology. I suggest having a look at “Molecular Biology of The Cell” and “Cancer” by Robert Weinberg. Get the latest editions, as things are moving rapidly.

The following paper tried to find out why Naked Mole Rats don’t get cancer [ Proc. Natl. Acad. Sci. vol. 112 pp. 1053 – 1058 ’15 ]. In tissue culture, naked mole rat fibroblasts show hypersensitivity to contact inhibition (aka early contact inhibition aka ECI). That is, they stop dividing or die when they get too close to each other. The signal triggering ECI comes from hyaluronan (which has a very high molecular weight) outside the cell. Removal of high MW hyaluronan abrogates ECI and makes naked mole rat cells susceptible to malignant transformation.

ECI is associated with an increase in expression of p16^INK4a, a tumor suppressor (here is where the cell biology comes in). Cells losing expression no longer show ECI. Deletion and/or silencing of INK4a/b is found in human cancers as well. The genomic locus containing p16^INK4a is small (under 50 kiloBases), but it codes for 3 different tumor suppressors (p16^INK4a, p15^INK4b and p14^ARF). The 3 proteins coordinate a signaling network depending on the activities of the retinoblastoma protein (RB) and p53 (more cell molecular biology).

In the naked mole rat, the INK4a/b locus codes for an additional product which consists of p15^INK4b exon #1 joined to p16^INK4a exons #2 and #3, due to alternative splicing. They call this pALT^INK4a/b. It is present in cultured cells from naked mole rat tissues, but is absent in human and mouse cells. pALT^INK4a/b expression is induced during early contact inhibition and by a variety of stresses such as ultraviolet light, gamma radiation, loss of substrate attachment and expression of oncogenes. When over expressed in human cells, pALT^INK4a/b has more ability to induce cell cycle arrest than either p16^INK4a or p15^INK4b. So pALT^INK4a/b might explain the increased resistance to tumors.

There’s also a lot of work concerning why they live so long, but that’s for another post.

An interesting way to study the hydrophobic effect between protein surfaces

Protein interaction domains haven’t been studied to nearly the extent they need to be, and we know far less about them than we should. All the large molecular machines of the cell (ribosome, mediator, spliceosome, mitochondrial respiratory chain) involve large numbers of proteins interacting with each other not by the covalent bonds beloved by organic chemists, but by much weaker forces (van der Waals, charge attraction, hydrophobic entropic forces etc. etc.).

Designing drugs to interfere with (or promote) such interactions will be tricky, yet they should have profound effects on cellular and organismal physiology. Off target effects are almost certain to occur (particularly since we know so little about the partners of a given motif). Showing how potentially useful such a drug can be, a small molecule has been developed which inhibits the interaction of the AIDS virus capsid protein with two cellular proteins (CPSF6, TNPO3) that the capsid protein must interact with to get into the nucleus. (Unfortunately I’ve lost the reference). For more about the host of new protein interaction domains (and potential druggable targets) just discovered please see https://luysii.wordpress.com/2015/01/04/microexons-great-new-drugable-targets/

Hydrophobic ‘forces’ are certain to be important in protein protein interactions. A very interesting paper figured out a way to measure them using atomic force microscopy (AFM). [ Nature vol. 517 pp. 277 – 279, 347 – 350 ’15 ]. This is particularly interesting to me because entropy has nothing to do with the force as measured. I’ve always assumed that the hydrophobic force was entropic, similar to the force exerted by rubber when you stretch it. It’s what pushes hydrophobic side chains into the interior of proteins (i.e. water doesn’t have to decrease its entropy by organizing itself to solvate hydrophobic side chains). Not so in this case.

The authors prepared self-assembled monolayers using dodecyl thiol (CH3(CH2)10CH2SH) bound to gold. Every now and then an amino group or a guanidino group was placed at the other end of the thiol. This allowed them to produce a mixture of hydrophobic groups (60%) and ionic species (R-NH3+ or guanidinium ions) within nanoMeters of the hydrophobic regions. The amino and the guanidino groups were the same distance as the hydrocarbon ends from the gold surface. An atomic force microscope (AFM) with a hydrophobic gold tip (carrying the same C12 moiety) was then used to measure the adhesive force between the tip and the surface in aqueous solution.

This is important because it is a measurement not a theoretical calculation (apologies Ashutosh). This is particularly useful since water is so complex that we don’t have a good understanding (potential function) for it.

Methanol was added (which eliminated most of the hydrophobic interactions). Sensitivity to methanol was taken as a signature of the hydrophobic component of the force. The pH could be manipulated, so the R-NH2 could be charged to R-NH3+, and likewise guanidinium could be converted to the uncharged species.

So guess what the effects of the amino and guanidino groups were on the hydrophobic interaction. I was rather surprised.

The strength of hydrophobic interactions between the mixed monolayers and the tip doubled when neutral amino groups found within nanoMeters of hydrophobic regions were charged to form R-NH3+ ions by lowering the pH. A similarly placed guanidinium ion eliminated the hydrophobic interactions at all pHs. So the effects of the two side chains (NH2 for lysine, guanidinium for arginine) are opposite.

They note that the ammonium ion is well hydrated, but guanidinium is hydrated only at the edges of the plane (where the electrons are) but not above it. This allows guanidinium an amphipathic behavior, which is why it can be a denaturant (did you know this? I didn’t).

I’m sure that the effect of negative ions (e.g. carboxyl groups) and every other conceivable side chain will be studied in the future.

Thus hydrophobicity is not an intrinsic property of any given nonPolar domain. It can be changed by functional groups within 10 Angstroms. So placing a charged group near a hydrophobic domain should allow tuning of the hydrophobic driving force. I’d be amazed if this isn’t found to be the case evolutionarily.

They also studied some weird looking stuff resembling proteins (beta peptides, i.e. the amino and carboxyl groups on adjacent carbons rather than the same one as with alpha amino acids) with weird side chains which are known to adopt an amphipathic helical conformation. The nonpolar side chains were trans 2 aminocyclohexanecarboxylic acid (ACHC), and the cationic side chains were beta3 homolysine. Why didn’t they use something more natural? The peptide forms an ACHC rich nonPolar square domain 10 Angstroms on a side with a polar patch on the other side of the helix.

So it’s a fascinating piece of work with large implications for the design of drugs attacking protein protein interfaces.

How little we know

Who would have thought that a random mutagenesis experiment, throwing Ethyl Nitroso Urea (ENU) at unsuspecting mice to identify novel immune regulatory genes, would point to a possible treatment for muscular dystrophy? When the experimenters looked at the mutated offspring, they found that the muscles appeared unusually red.

What happened?

You need to know a bit more about muscles. On a very simplistic level there are only two types of muscle fibers, red and white. Carnivores eating chicken know about dark meat and white meat. The dark meat is composed of red fibers, which have that appearance because of large numbers of mitochondria (which are full of iron) giving them the same red appearance as blood (which is also full of iron). In both cases the iron is bound by porphyrin rings. As one might expect, these muscles consume a lot of energy, being postural for the most part. The white meat made of white fibers has muscle which can contract very quickly and strongly, for flight and fight. They don’t have nearly the endurance of red muscle, because they can’t produce energy for the long term.

Humans have the two types of muscle fibers mixed up in each of our muscles.

The ENU had produced a mutation in something called fnip1 (Folliculin INteracting Protein 1). What’s folliculin? It prevents a gene transcription factor (TFE3) from getting into the nucleus. Folliculin prevents an embryonic stem cell from differentiating. It is mutated in the Birt Hogg Dube syndrome which is characterized by many benign hair follicle tumors. What in the world does this have to do with muscular dystrophy? It’s not something someone would start investigating looking for a cure is it? Knock out both copies of folliculin and the embryo dies in utero.

It gets deeper.

What does fnip1 do to folliculin? It and its cousin fnip2 form complexes with folliculin. The complex binds an enzyme called AMPK (which is turned on by energy depletion in the cell). AMPK phosphorylates both fnip1 and folliculin. Folliculin binds and inhibits AMPK.

So animals lacking fnip1 have a more activated AMPK. So what? Well AMPK activates a transcriptional coactivator called PGC1alpha (you don’t want to know what the acronym stands for). This ultimately results in production of more mitochondria (recall that AMPK is an energy sensor, and one of the main functions of mitochondria is to produce energy, lots of it).

This ultimately means more red muscle fibers. There is a mouse model of Duchenne dystrophy called the mdx mouse (which has a premature termination codon in the dystrophin gene, resulting in a protein only 27% as long as it should be). That still leaves a lot, as normal dystrophin contains 3,685 amino acids. Knocking out fnip1 in the mdx mice improved muscle function. Impressive !!

I’m quite interested in this sort of work, as I ran a muscular dystrophy clinic from ’72 to ’87 and watched a lot of kids die. The major advance during that time wasn’t anything medical. It came from engineering — lighter braces using newer materials allowed the kids to stay out of wheelchairs longer.

You can read all about it in [ Proc. Natl. Acad. Sci. vol. 112 pp. 424 – 429 ’15 ]. Clearly we know a lot (AMPK, dystrophin, PGC1alpha, fnip1, fnip2, folliculin, TFE3), but what we didn’t know was how in the world they function together in the cell. We’re sure to learn a lot more, but this whole affair was uncovered when looking for something else (immune regulators) using the bluntest instrument possible (throw a mutagen at an animal and see what happens). No one applying for a muscular dystrophy grant would dare to offer the original work as a rationale, yet here we are.

So directed research isn’t always the way to go. Although we know a lot, we still know very little.

Framingham shows us just how there is more to biology than genetics

If you have two copies of a particular variant (rs9939609) of the FTO gene (FaT mass and Obesity associated gene) you are likely to weigh 7 pounds more than if you have neither. Pretty exciting stuff for the basic scientist, given the problems obesity causes (or at least is associated with). The study involved 39,000 people [ Science vol. 316 pp. 889 – 894 ’07 ]. At the end of the post, I’ll have a lot of technical stuff about just what FTO is thought to do inside the cell, but that’s not why I’m posting this.

Framingham Massachusetts is a town about 30 miles west of Boston. Thanks to the cooperation of its citizenry, it has taught us huge swaths of human biology since the study there began nearly 70 years ago. Briefly, the Framingham Heart Study (FHS) was initiated in 1948 when 5,209 people were enrolled in the original cohort; since then, the study has come to be composed of four separate but related populations. The Framingham Offspring Study began in 1971, consisting of 5,124 individuals who represented the children of the original cohort population and their spouses. Participants in the offspring study were given physical examinations and detailed questionnaires at regular intervals starting in 1972, with a total of eight waves completed through 2008. The Body Mass Index (BMI) was calculated from measured height and weight. The offspring cohort was born over a 40-year period, with participants ranging in age from their teens to their late 50s at the time of study onset in 1971. In addition to providing survey and examination data, a large fraction of participants (73.0%, 3,742 individuals) had their DNA genotyped using the 100K Affymetrix array. Genotypes at the rs9939609 allele of FTO were extracted using PLINK from data contained in the Framingham SHARe database.

Given the same gene, its effects should be constant through time, other things being equal. The following work [ Proc. Natl. Acad. Sci. vol. 112 pp. 354 – 359 ’15 ] mined the Framingham study to see if when you were born mattered to how fat you became if you carried the fat variant. There were 8 waves of data collection from ’71 to ’08. Those born before ’42 showed less penetrance of the FTO gene.

Figure 1 p.356 is particularly impressive. Everyone became heavier as they got older. This is because height declines with age, raising BMI even in the presence of constant weight. As far as I know, the following explanation from another post ( https://luysii.wordpress.com/2013/05/30/something-is-wrong-with-the-model-take-2/ ) is original: “People lose height as they age, yet the BMI is quite sensitive to it (remember the denominator has height squared). The great thing about BMI is that it’s easily measured, and doesn’t rely on what people remember about their weight or their height. Well as a high school basketball player my height was 6′ 1”+, now (at age 75) it’s 6’0″. So even with constant weight my BMI goes up.

Well it’s time to do the calculation to see what a fairly common shrinkage from 73.5 inches to 72 would do to the BMI (at a constant weight). Surprisingly it is not trivial: (72/73.5) * (72/73.5) = .9596. So the divisor is 4% less, meaning the BMI is about 4% more, which is almost exactly what the low point on the curve does with each passing decade after 50 ! ! ! This might even be an original observation, and it would explain a lot.”
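The height arithmetic in the quoted passage is easy to check. Here is a minimal Python sketch using the usual pounds-and-inches BMI formula (703 × weight / height squared); the 180 pound weight is an arbitrary value chosen for illustration and drops out of the ratio.

```python
def bmi(weight_lb, height_in):
    """Body mass index from pounds and inches (703 is the usual conversion factor)."""
    return 703.0 * weight_lb / height_in ** 2

weight = 180.0                      # hypothetical constant weight in pounds
bmi_young = bmi(weight, 73.5)       # height as a high school player
bmi_old = bmi(weight, 72.0)         # height at age 75

divisor_ratio = (72.0 / 73.5) ** 2  # how much the height-squared divisor shrinks
bmi_ratio = bmi_old / bmi_young     # how much the BMI grows at constant weight

print(f"divisor ratio {divisor_ratio:.4f}, BMI ratio {bmi_ratio:.4f}")
```

The divisor shrinks to 0.9596 of its former value, so the BMI itself rises by a bit over 4% (a ratio of about 1.042) with no change in weight at all.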

What is impressive about figure 1, is that those born before 1942 with two copies of the risk allele weren’t much heavier than those with one or no copies of the risk allele. This was true at all ages measured (remember these people were sequentially followed). Those born after 1942 carrying two copies of the high risk allele were 2 – 4 pounds heavier (again measured at all ages).

This is as good proof as one could hope for that environment affects gene expression, something we all assumed instinctively. There is no way one could repeat the experiment, except to start a new one in the future, which, as this shows, will occur in a different environment, which should make a difference. MDs gradually woke up to the fallacy of using historical rather than concurrent controls particularly in studies of therapies to prevent heart attack and stroke, as the rates of both dropped significantly in the past 50 years, and survival from individual heart attacks and strokes also improved.

So what does FTO actually do? Naturally anyone dealing with strokes wants to know as much as possible about one of the largest risk factors — obesity. What follows is a fairly undigested copy of my notes over the years on papers concerning FTO. I make no attempt to provide the relevant background, although most readers will have some. It’s interesting to see how our knowledge about FTO has grown over the years. Enjoy ! !

*****
[ Science vol. 316 p. 185, 889 – 894 ’07 ] FTO was first found in type II diabetics by looking for single nucleotide polymorphisms distinguishing 1924 UK type II diabetics from 2938 UK controls (were southeast Asians included?). Subsequently, larger populations (3757 type IIs and 5346 controls) were independently studied and the findings replicated. [ Cell vol. 134 p. 714 ’08 ] — The association hasn’t held up in the Han Chinese.

The FTO gene is found on chromosome #16. 16% of white adults have two copies of the variant (46% have one copy). They are 1.67 times more likely to be obese. At this point (13 Apr ’07) no one knows what the gene does.
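Those two genotype percentages can be checked against each other. The sketch below, in Python, assumes the population is in Hardy-Weinberg equilibrium (random mating, no selection at this locus), which is my assumption and not something the notes above claim. It derives the variant's allele frequency from the quoted figures and then asks what genotype frequencies that allele frequency would predict.

```python
two_copies = 0.16   # fraction with two copies of the variant (from the notes)
one_copy = 0.46     # fraction with one copy (from the notes)

# Allele frequency of the variant: all alleles of homozygotes, half of heterozygotes.
q = two_copies + one_copy / 2
p = 1 - q

# Genotype frequencies expected under Hardy-Weinberg equilibrium (an assumption).
expected = {
    "two copies": q * q,
    "one copy": 2 * p * q,
    "no copies": p * p,
}

for genotype, freq in expected.items():
    print(f"{genotype}: {freq:.3f}")
```

The variant allele frequency comes out at 0.39, predicting about 15% with two copies and about 48% with one, close to the 16% and 46% quoted, so the figures hang together.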

FTO is a gene of unknown function in an unknown pathway that was originally cloned as a result of a fused-toe mutant mouse, that results from a 1.6 megaBase deletion of mouse chromosome #8. The deletion removes some 6 genes.

[ Cell vol. 131 p. 827 ’07 ] A blurb about something to be published in Science. This work shows that FTO codes for a nucleic acid demethylase. It has the enzymatic activity of a 2 oxo-glutaric acid oxygenase. The enzyme removes methyl groups from 3 methyl thymine (in DNA) and 3 methyl uracil (in RNA). The SNPs linking FTO to obesity are in introns in the gene. In mice, the mRNA for FTO is highly enriched in the hypothalamus. Levels of FTO mRNA drop by 60% in fasting mice.

[ Science vol. 318 pp. 1469 – 1472 ’07 ] The Science paper at last. The gene product catalyzes the Fe++ and 2-oxoglutaric acid dependent demethylation of 3 methyl thymine (which may not be the relevant substrate) in single stranded DNA with production of succinic acid, formaldehyde, and CO2. FTO is found in the nucleus in transfected cells. The mRNA for FTO is most abundant in the brain, particularly in hypothalamic nuclei governing energy balance. FTO is inhibited by Krebs cycle intermediates (isn’t 2 oxoglutarate a Krebs cycle intermediate?), particularly fumaric acid.

[ Science vol. 334 pp. 569 – 571 ’11 ] FTO removes methyl groups from 3 Methylthymine, and 3 methylUridine in single stranded DNA and RNA (ssDNA, ssRNA). The present work shows FTO converts 6 methylamino Adenine to adenine in RNA. FTO associates with speckles containing RNA splicing factors and RNA polymerase II.

[ Nature vol. 457 p. 1095 ’09 ] Mice lacking FTO were normal at birth, but at 6 weeks weighed 30 – 40% less than normal mice (or haploinsufficients). This was due to loss of white fat — which was nearly completely absent at 15 months. The mutants ate more (in proportion to their body weight) than normal. On a high fat diet, both groups gained less weight than normals. Mice lacking FTO use more energy while not moving much.

[ Nature vol. 458 pp. 894 – 898 ’09 ] Loss of FTO in mice leads to postnatal growth retardation and a significant reduction both in fat and in lean body mass. The leanness is due to increased energy expenditure and sympathetic activation, despite decreased spontaneous motor activity and relative hyperphagia.

[ Proc. Natl. Acad. Sci. vol. 107 pp. 8404 – 8409 ’10 ] Carriers of the fat allele of FTO have smaller brains (8% smaller in the frontal lobes, 12% smaller in the occipital lobes). The brain differences weren’t due to differences in cholesterol, hypertension or white matter hyperintensities. So FTO risk isn’t a surrogate for the metabolic changes of obesity. The study was done in 206 cognitively normal adults (average age 76). Every 1 unit increase in BMI was associated with 1 – 1.5% reduction in brain volume in a variety of brain regions.

The highest expression of FTO is in the cerebral cortex. Whether expression in the hypothalamus changes after food deprivation is controversial.

It is known that obesity (BMI > 30) is associated with smaller brains. In this group temporal lobe atrophy was found in people with higher BMI but not in people with the risk allele of FTO.

There was no effect of BMI on brain size in noncarriers of the FTO allele. So FTO status may influence the effect of BMI on the brain.

[ Cell vol. 149 pp. 1635 – 1646 ’12 ] A study of just what 6methylamino adenine (m6A) is doing and where in the genome it is doing it. m6A is the physiologically relevant target of FTO. It is found in tRNA, rRNA and mRNA. In fact m6A is found in 7,676 different mRNAs. The modification is markedly increased throughout brain development. m6A sites are enriched near stop codons and in 3′ untranslated regions (3′ UTRs). Even more interestingly, there is an association between m6A and microRNA binding sites in the 3′ UTRs ! ! ! m6A is not enriched at splice junctions. 30% of genes are said to have microRNA binding sites, but 67% of the 3′ UTRs containing m6A have microRNA binding sites. However, the two can’t overlap in the 3′ UTR. Many features of m6A localization are the same in man and mouse.

[ Nature vol. 490 pp. 267 – 272 ’12 ] In some way the SNP rs7202116 in FTO is associated with phenotypic variability per se. No other locus causes BMI variability this way.

[ Proc. Natl. Acad. Sci. vol. 110 pp. 2557 – 2562 ’13 ] FTO is widely expressed, with highest levels in brain, particularly the hypothalamus. FTO expression in the hypothalamus is decreased after a 48 hour fast, and increased after a 10 week exposure to a high fat diet.

Carriers of the obesity promoting allele are hyperphagic and show altered (how?) macronutrient preference. This work shows that cells lacking FTO show decreased activation of the mTORC1 pathway, decreased rates of mRNA translation, and increased autophagy — all of which helps explain the stunted growth seen in man homozygous for FTO mutations.

FTO is rapidly degraded when cells are deprived of amino acids (this decreases TORC1 activity, making it a part of the physiological response to starvation). How this relates to the demethylase activity of FTO isn’t known (yet). The demethylase action is crucial for its ability to sustain mTORC1 activity in the face of amino acid deprivation.

[ Nature vol. 507 pp. 309 – 310, 371 – 375 ’14 ] Amazingly, the association between obesity and FTO involves another gene (IRX3) which is 500 kiloBases away. This was determined by chromosome conformation capture (CCC). The promoter of IRX3 physically interacts with the first intron of FTO; this was found in human cell lines and in other organisms. Obesity linked SNPs are associated with IRX3 expression in these samples, but not with expression of FTO. Mice lacking a functional copy of IRX3 have 25 – 30% lower body weight than controls (primarily due to loss of fat mass and increase in BMR with browning of white fat).

There is another case: an enhancer in an intron of LMBR1 regulates the developmental gene SHH found over a megaBase away. Mutations in the enhancer can cause limb malformations due to altered SHH expression.

Do enzymes chase their prey?

Do enzymes chase their prey? At first thought, this seems ridiculous. However people have been measuring diffusion of substances in water for over a century. Even Einstein worked on it (his paper on Brownian motion). So it’s fairly easy to measure the diffusion of an enzyme in water. Several enzymes (catalase, one of the most efficient enzymes known, and urease) diffuse faster when their substrate is present. [ Nature vol. 517 pp. 149 – 150, 227 – 230 ’15 ] The hydrolysis of urea by urease and the conversion of H2O2 to O2 and water by catalase enhances the molecular diffusion of the enzymes (this is called anomalous diffusion). If you inhibit catalase enzymatic activity using azide the anomalous diffusion disappears (even though there’s still plenty of H2O2 around). This work also showed that the rate of diffusion of catalase, urease and 2 more enzymes correlates with the heat produced by the reaction catalyzed.

Heating the catalytic center of catalase (using a short laser pulse) produces the same anomalous diffusion. Proteins exist in a world in which Brownian motion is governed by viscous forces rather than by inertia, so coasting (a la Galileo and Newton’s law of inertia) isn’t an option — continuous force generation is required.

Heat generated from each catalytic cycle could be transmitted through the enzyme as a pressure wave. For this to happen the catalytic center must be NOT at the center of mass of the enzyme, so the pressure wave will create differential stress at the enzyme solvent interface (which should propel the enzyme). They call this the chemoacoustic effect.

Molecular dynamics simulations suggest that the transmission of energy through a protein can be quite fast (5 Angstroms/picoSecond) and nonuniformly distributed.

Some enzymes have a near perfect catalytic efficiency. Every time a substrate hits them, the substrate is converted to product. Examples include catalase, acetyl cholinesterase, fumarase, and carbonic anhydrase. The diffusion-limited encounter rate in solution is about 100 million to a billion per molar per second.

Could this be a product of evolution (to make enzymes actively search out substrates?). Note, this won’t work if the catalytic center of the enzyme is in the center of mass.

I doubt that much catalytic efficiency is gained by having a huge protein molecule sluggishly move through the cytoplasm. Why? The molecular mass of H2O2 is 34 Daltons (vs. 18 for water), so it moves somewhat more slowly than water, but water moves at 20C in water at 590 meters/second. Of course it doesn’t get very far before it bumps into another water molecule and gets deflected.
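The 590 meters/second figure can be checked with a few lines of arithmetic. This is only a sketch: it assumes the number is the mean speed from the Maxwell-Boltzmann distribution, sqrt(8RT/(pi M)), and it uses standard molar masses of about 18 g/mol for water and about 34 g/mol for H2O2. The function name is made up for illustration.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)

def mean_thermal_speed(molar_mass_g_per_mol, temp_celsius=20.0):
    """Mean molecular speed in m/s, assuming a Maxwell-Boltzmann distribution."""
    temp_kelvin = temp_celsius + 273.15
    molar_mass_kg = molar_mass_g_per_mol / 1000.0
    return math.sqrt(8 * GAS_CONSTANT * temp_kelvin / (math.pi * molar_mass_kg))

water_speed = mean_thermal_speed(18.015)     # H2O
peroxide_speed = mean_thermal_speed(34.015)  # H2O2

print(f"water:    {water_speed:.0f} m/s")
print(f"peroxide: {peroxide_speed:.0f} m/s")
print(f"ratio:    {peroxide_speed / water_speed:.2f}")
```

Under these assumptions the speed scales as one over the square root of the mass, so the small substrate is still moving at hundreds of meters per second between collisions.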

Is there an ace physical chemist out there who can put numbers on this? I couldn’t believe that I couldn’t find a simple expression for the relation between the diffusion coefficient and the mass of the diffuser, ditto for the atomic volume of a water molecule, although I’m guessing that its radius is pretty close to the length of the H – O bond (.95 Angstroms) giving a volume of 3.6 cubic Angstroms. I wanted this so I could see how much room to roam a water molecule has.
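Here is a rough way to put numbers on the room-to-roam question. It is a sketch under two assumptions of mine: the water molecule is treated as a sphere whose radius is the .95 Angstrom bond length guessed above (which likely understates its effective size), and liquid water at 20C has a density of about 0.998 g/cm^3. The function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def sphere_volume(radius_angstrom):
    """Volume of a sphere in cubic Angstroms."""
    return 4.0 / 3.0 * math.pi * radius_angstrom ** 3

def volume_per_molecule(molar_mass_g_per_mol, density_g_per_cm3):
    """Average volume available to one molecule in the liquid, in cubic Angstroms."""
    cm3_per_molecule = molar_mass_g_per_mol / density_g_per_cm3 / AVOGADRO
    return cm3_per_molecule * 1e24  # 1 cm^3 = 1e24 cubic Angstroms

molecule_volume = sphere_volume(0.95)                # the guess from the text
available_volume = volume_per_molecule(18.015, 0.998)  # assumed density at 20C

print(f"sphere of radius .95 A: {molecule_volume:.1f} cubic Angstroms")
print(f"volume per molecule:    {available_volume:.1f} cubic Angstroms")
print(f"ratio:                  {available_volume / molecule_volume:.1f}")
```

With a larger effective radius for the molecule the ratio shrinks quickly, since the volume goes as the cube of the radius.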

Cancer as the telephone game

An interesting paper just out [ Science vol. 347 pp. 78 – 81 ’14 ] basically says that cancer is just bad luck due to copying errors of the 3.2 gigaBase genome when cells divide. It’s a version of the telephone game in which a message is passed around a circle of people getting progressively garbled each time.

The evidence in support of the assertion is that the variation in cancer rates between tissues is strongly related to the number of divisions of the stem cells required to maintain that tissue. For instance the lifetime risk of being diagnosed with cancer is 7% for lung but .6% for brain (about this more later). Risk in the GI tract varies by a factor of 24 (.2% for the small intestine, .5% for the esophagus, 4.8% for the colon), which is proportional to the number of stem cell divisions undergone during lifetime.

They estimate that at most 1/3 of the variation in risk among tissues is due to environmental factors or inherited predisposition. That’s certainly not to say that you should go ahead and smoke.

The idea makes a lot of sense. Even though the error rate in copying the parental genome to a child is an amazingly low 1/100,000,000 that still is 32 mutations per generation (more from the father than the mother and more from him the older he is, not so for the mother)– for details please see https://luysii.wordpress.com/2012/08/30/how-fast-is-your-biological-clock-ticking-ii-latest-results/.
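The 32 mutations per generation is just the genome size times the error rate. A minimal sketch of that arithmetic, assuming a genome of about 3.2 billion bases and the 1/100,000,000 per-base error rate quoted above (the variable names are mine):

```python
GENOME_SIZE_BASES = 3.2e9       # assumed: roughly 3.2 billion bases
ERROR_RATE_PER_BASE = 1 / 100_000_000  # per base, per generation

# Expected number of new mutations passed from parents to a child.
mutations_per_generation = GENOME_SIZE_BASES * ERROR_RATE_PER_BASE

print(f"expected new mutations per generation: {mutations_per_generation:.0f}")
```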

There is even better evidence for this based on my clinical experience in neurology for 35+ years. The lifetime chance of a brain tumor is stated to be .6%. However in all these years I never saw a brain tumor made of neurons. They were all derived from glia (astrocytoma, glioblastoma) or the coverings of the brain (meningiomas). Why? Essentially neurons in the cerebral cortex (not the deeper parts of the brain) don’t divide. [ Cell vol. 153 pp. 1183 – 1185, 1219 – 1227 ’13, Science vol. 340 pp. 1180 – 1181 ’13 (Editorial) ] Even the parts that do divide add a trivial amount of neurons to the brain (700 neurons a day). Even if you live 100 years — that’s only 100 * 365 * 700 == 26 million neurons, a trivial amount compared to the 100 billion neurons you are estimated to have (this number grows each time I read about it).
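To see how trivial the addition is, the same arithmetic can be carried one step further to a fraction of the whole. This sketch uses only the figures above (700 new neurons a day, a 100 year life, 100 billion neurons in total); the variable names are mine.

```python
NEW_NEURONS_PER_DAY = 700
YEARS_LIVED = 100
TOTAL_NEURONS = 100e9  # the 100 billion estimate quoted above

# Neurons added over a lifetime, ignoring leap days.
neurons_added = YEARS_LIVED * 365 * NEW_NEURONS_PER_DAY
fraction_added = neurons_added / TOTAL_NEURONS

print(f"neurons added in {YEARS_LIVED} years: {neurons_added:,}")
print(f"fraction of all neurons: {fraction_added:.4%}")
```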

You might be interested in how we can make statements like this about new neuron formation in the brain. It’s very clever — Carbon-14 accumulated in the atmosphere between the mid 50s and early 60s as a byproduct of above ground testing of nuclear weapons. Such testing was banned by treaty in 1963 and carbon-14 levels in the atmosphere declined in the following decades to previous low background levels. Carbon-14 is used in archeologic dating because its halflife is 5730 years.

Using postmortem tissue samples of individuals born before and after the nuclear bomb tests, the integration of carbon-14 into genomic DNA was measured. This would have occurred during the cell’s last division cycle. One can calculate the birth dates of different cell types collected from various tissues including brain. The approach is accurate to within a few years. The 5730 year half life of 14-C means that whatever is in human DNA hasn’t had a chance to decay (by much) in 50 years. The amount of carbon-14 in cellular DNA therefore reflects the amount of carbon-14 in the atmosphere when the cells underwent their last division. The amount of carbon-14 in the atmosphere was determined by measuring it in the annual growth rings of pine trees in Sweden — a surrogate for atmospheric carbon-14 levels in the past 60 years. The birthdate of cells is determined as the year the C-14 in them matches those of the pine trees.
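The claim that carbon-14 hasn't had a chance to decay by much in 50 years follows from the standard exponential decay law. A small sketch, assuming only the 5730 year half life given above (the function name is made up for illustration):

```python
HALF_LIFE_YEARS = 5730  # carbon-14 half life from the text

def fraction_remaining(elapsed_years, half_life_years=HALF_LIFE_YEARS):
    """Fraction of the original carbon-14 still present after elapsed_years."""
    return 0.5 ** (elapsed_years / half_life_years)

remaining_after_50 = fraction_remaining(50)

print(f"remaining after 50 years: {remaining_after_50:.4f}")
print(f"decayed after 50 years:   {1 - remaining_after_50:.2%}")
```

Well under 1% is lost in 50 years, so the carbon-14 in a cell's DNA still reflects the atmospheric level at the time of its last division.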
