Category Archives: Molecular Biology

Here’s a drug target for schizophrenia and other psychiatric diseases

All agree that any drug getting schizophrenics back to normal would be a blockbuster. The more we study its genetics and biochemistry the harder the task becomes. Here’s one target — neuregulin1, one variant of which is strongly associated with schizophrenia (in Iceland).

Now that we know that neuregulin1 is a potential target, why should discovering a drug to treat schizophrenia be so hard? The gene stretches over 1.2 megaBases and the protein contains some 640 amino acids. Cells make some 30 different isoforms by alternative splicing of the gene. Since the gene is so large one would expect to find a lot of single nucleotide polymorphisms (SNPs) in the gene. Here’s some SNP background.

Our genome has 3.2 gigaBases of DNA. Sequencing has given each position a standard nucleotide (one of A, T, G, or C). If 5% of the population have any one of the other 3 at this position you have a SNP. By 2004 some 7 MILLION SNPs had been found and mapped to the human genome.

Well it’s 10 years later, and a mere 23,094 SNPs have been found in the neuregulin gene, of which 40 have been associated with schizophrenia. Unfortunately most of them aren’t in regions of the gene which code for amino acids (which is to be expected as 640 * 3 = 1920 nucleotides are all you need for coding out of the 1,200,000 nucleotides making up the gene). These SNPs probably alter the amount of the protein expressed but as of now very little is known (even whether they increase or decrease neuregulin1 protein levels).
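The arithmetic in the paragraph above can be checked in a few lines. This is only a back-of-the-envelope sketch using the numbers quoted in the post; the idea of spreading the SNPs uniformly along the gene is my own simplifying assumption, made purely for illustration.

```python
# Numbers quoted in the post
GENE_LENGTH_BP = 1_200_000   # neuregulin1 gene length in nucleotides
PROTEIN_LENGTH_AA = 640      # approximate protein length in amino acids
SNPS_IN_GENE = 23_094        # SNPs reported in the gene

# Three nucleotides code for one amino acid
coding_bp = PROTEIN_LENGTH_AA * 3
coding_fraction = coding_bp / GENE_LENGTH_BP

# Average spacing between SNPs along the gene
bases_per_snp = GENE_LENGTH_BP / SNPS_IN_GENE

# ASSUMPTION (illustrative only): SNPs are spread uniformly along the gene
expected_coding_snps = SNPS_IN_GENE * coding_fraction

print(f"coding nucleotides: {coding_bp}")
print(f"coding fraction of the gene: {coding_fraction:.2%}")
print(f"about one SNP every {bases_per_snp:.0f} bases")
print(f"SNPs expected in coding sequence if uniform: {expected_coding_snps:.0f}")
```

Under that (unrealistic) uniform assumption, only a few dozen of the 23,094 SNPs would be expected to fall in coding sequence, which fits the observation that most of them don't.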

An excellent review of Neuregulin1 and schizophrenia is available [ Neuron vol. 83 pp. 27 - 49 '14 ]. You’ll need a fairly substantial background in neuroanatomy, neuroembryology, molecular biology and neurophysiology to understand all of it. Included is some fascinating (but probably incomprehensible to the medicinal chemist) material on the different neurophysiologic abnormalities associated with different SNPs in the gene.

Here are a few of the high points (or depressing points for drug discovery) of the review. Neuregulin1 is a member of a 6 gene family, all fairly similar and most expressed in the brain. All of them have multiple splicing isoforms, so drug selectivity between them will be tricky. Also SNPs associated with increased risk of schizophrenia have been found in family members 2, 3 and 6 as well, so neuregulin1 may not be the actual target you want to hit.

It gets worse. The neuregulins bind to a family of receptors (the ERBBs) having 4 members. Tending to confirm the utility of the neuregulins as a drug target is the fact that SNPs in the ERBBs are also associated with schizophrenia. So which isoform of which neuregulin binding to which isoform of which ERBB is the real target? Knowledge isn’t always power.

A large part of the paper is concerned with the function of the neuregulins in embryonic development of the brain, leading to the rather depressing thought that the schizophrenic never had a chance, having an abnormal brain to begin with. A drug to reverse such problems seems only a hope.

The neuregulin/ERBB system is only one of many gene systems which have been linked to schizophrenia. So it looks like a post of 4 years ago on schizophrenia is largely correct: http://luysii.wordpress.com/2010/04/25/tolstoy-was-right-about-hereditary-diseases-imagine-that/

Happy hunting. It’s a horrible disease and well worth the effort. We’re just beginning to find out how complex it really is. Hopefully we’ll luck out, as we did with the phenothiazines, the first useful antipsychotics.

“A Troublesome Inheritance” – I

One of the joys of a deep understanding of chemistry is the appreciation of the ways in which life is constructed from the most transient of materials. Presumably the characteristics of living things that we can see (the phenotype) will someday be traceable back to the proteins, nucleic acids, and small metabolites (lipids, sugars, etc.) making us up.

For the time being we must content ourselves with understanding the code (our genes) and how it instructs the development of a trillion celled organism from a fertilized egg. This brings us to Wade’s book, which has been attacked as racist, by anthropologists, sociologists and other lower forms of animal life.

Their position is that races are a social, not a biological, construct and that differences between societies are due to the way they are structured, not to differences in the relative frequency of the gene variants (alleles) in the populations making them up. Essentially they are saying that evolution and its mechanism, descent with modification under natural selection, has not applied to humanity in the 50,000 years since the first modern humans left Africa.

Wade disagrees. His book is very rich in biologic detail and a single post discussing it all would try anyone’s attention span. So I’m going to go through it, page by page, commenting on the material within (the way I’ve done for some chemistry textbooks), breaking it up into digestible chunks.

As might be expected, there will be a lot of molecular biology involved. For some background see the posts in https://luysii.wordpress.com/category/molecular-biology-survival-guide/. Start with http://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and follow the links forward.

Wade won me over very quickly (on page 3), by his accurate citations to the current literature. He talks about how selection on a mitochondrial protein helped Tibetans to live at high altitude (while the same mutation in those living at low altitudes leads to blindness). Some 25% of Tibetans have the mutation while it is rare among those living at low altitudes.
Here’s my post of 10 June 2012 on the matter. That’s all for now.

Have Tibetans illuminated a path to the dark matter (of the genome)?

I speak not of the Dalai Lama’s path to enlightenment (despite the title). Tall people tend to have tall kids. Eye color and hair color are also hereditary to some extent. Pitched battles have been fought over just how much of intelligence (assuming one can measure it) is heritable. Now that genome sequencing is approaching a price of $1,000/genome, people have started to look at variants in the genome to help them find the genetic contribution to various diseases, in the hopes of understanding and treating them better.

Frankly, it’s been pretty much of a bust. Height is something which is 80% heritable, yet the 20 leading candidate variants picked up by genome wide association studies (GWAS) account for 3% of the variance [ Nature vol. 461 pp. 458 - 459 '09 ]. This has happened again and again, particularly with diseases. A candidate gene (or region of the genome), say for schizophrenia, or autism, is described in one study, only to be shot down by the next. This is likely due to the fact that many different genetic defects can be associated with schizophrenia; there are a lot of ways the brain cannot work well. For details see http://luysii.wordpress.com/2010/04/25/tolstoy-was-right-about-hereditary-diseases-imagine-that/ or see http://luysii.wordpress.com/2010/07/29/tolstoy-rides-again-autism-spectrum-disorder/.

Typically, even when an association of a disease with a genetic variant is found, the variant only increases the risk of the disorder by 2% or less. The bad thing is that when you lump all of the variants you’ve discovered together (for something like height) and add up the risk, you never account for over 50% of the heredity. It isn’t for want of looking, as by 2010 some 600 human GWAS studies had been published [ Neuron vol. 68 p. 182 '10 ]. Yet lots of the studies have shown various diseases to have a degree of heritability (particularly schizophrenia). The fact that we’ve been unable to find the DNA variants causing the heritability was totally unexpected. Like the dark matter in galaxies, which we know is there by the way the stars spin around the galactic center, this missing heritability has been called the dark matter of the genome.

Which brings us to Proc. Natl. Acad. Sci. vol. 109 pp. 7391 – 7396 ’12. It concerns an awful disease causing blindness in kids called Leber’s hereditary optic neuropathy. The ’cause’ has been found. It is a change of 1 base from thymine to cytosine in the gene for a protein (NADH dehydrogenase subunit 1) causing a change at amino acid #30 from tyrosine to histidine. The mutation is found in mitochondrial DNA not nuclear DNA, making it easier to find (it occurs at position 3394 of the 16,569 nucleotide mitochondrial DNA).
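To see why a single thymine to cytosine change can swap tyrosine for histidine, here is a minimal sketch using only the four relevant codons (written in the DNA alphabet of the coding strand). The post doesn't say which tyrosine codon sits at amino acid #30, so both are tried; the helper name `mutate_first_base` is my own, purely for illustration.

```python
# The two tyrosine codons and the two histidine codons (coding-strand DNA)
CODON_TO_AA = {
    "TAT": "Tyr",
    "TAC": "Tyr",
    "CAT": "His",
    "CAC": "His",
}

def mutate_first_base(codon: str, new_base: str) -> str:
    """Return the codon with its first nucleotide replaced by new_base."""
    return new_base + codon[1:]

# ASSUMPTION: the post does not state which Tyr codon is used, so try both
for codon in ("TAT", "TAC"):
    mutant = mutate_first_base(codon, "C")
    print(f"{codon} ({CODON_TO_AA[codon]}) -> {mutant} ({CODON_TO_AA[mutant]})")
```

Either way, a T to C change in the first position of the codon turns tyrosine into histidine. A T to C change in the third position (TAT to TAC) would leave tyrosine unchanged.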

Mitochondria in animal cells, and chloroplasts in plant cells, are remnants of bacteria which moved inside cells as we know them today (rest in peace Lynn Margulis).

Some 25% of Tibetans have the 3394 T–>C mutation, but they see just fine. It appears to be an adaptation to altitude, because the same mutation is found in nonTibetans on the Indian subcontinent living above 1500 meters (about as high as Denver). However, if you have the same genetic change living below this altitude you get Leber’s.

This is a spectacular demonstration of the influence of environment on heredity. Granted that the altitude you live at is a fairly impressive environmental change, but it’s at least possible that more subtle changes (temperature, humidity, air conditions etc. etc.) might also influence disease susceptibility to the same genetic variant. This certainly is one possible explanation for the failure of GWAS to turn up much. The authors make no mention of this in their paper, so these ideas may actually be (drumroll please) original.

If such environmental influences on the phenotypic expression of genetic changes are common, it might be yet another explanation for why drug discovery is so hard. Consider CETP (Cholesterol Ester Transfer Protein) and the very expensive failure of drugs inhibiting it. Torcetrapib was associated with increased deaths in a trial of 15,000 people for 18 – 20 months. Perhaps those dying somehow lived in a different environment. Perhaps others were actually helped by the drug.

Why marihuana scares me

There’s an editorial in the current Science concerning how very little we know about the effects of marihuana on the developing adolescent brain [ Science vol. 344 p. 557 '14 ]. We know all sorts of wonderful neuropharmacology and neurophysiology about delta-9 tetrahydrocannabinol (d9-THC); see http://en.wikipedia.org/wiki/Tetrahydrocannabinol. The point of the authors (the current head of the American Psychiatric Association, and the first director of the National (US) Institute on Drug Abuse) is that there are no significant studies of what happens to adolescent humans (as opposed to rodents) taking the stuff.

Marihuana would be the first mind-altering substance (or just about any drug in medical use, for that matter) NOT to have serious side effects in a subpopulation of people using it.

Any organic chemist looking at the structure of d9-THC (see the link) has to be impressed with what a lipid it is: 21 carbons, only 1 hydroxyl group, and an ether moiety. Everything else is hydrogen. Like most neuroactive drugs produced by plants, it is quite potent. A joint has only 9 milliGrams, and smoking undoubtedly destroys some of it. Consider alcohol, another lipid soluble drug. A 12 ounce beer with 3.2% alcohol content has 12 * 28.3 * .032 = 10.9 grams of alcohol. The molecular mass of ethanol is 46 grams, so the dose is about 11/46 moles. To get drunk you need more than one beer. Compare that to a dose of .009/300 moles of d9-THC.
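The comparison of doses in moles is easy to check. The sketch below uses the post's own round figures (a 9 milligram joint, roughly 300 grams per mole for d9-THC, a 12 ounce beer at 3.2% alcohol) together with ethanol's molar mass of about 46 grams per mole (C2H5OH); the variable names are mine.

```python
GRAMS_PER_OUNCE = 28.3

# One 12 ounce beer at 3.2% alcohol (figures from the post)
beer_grams = 12 * GRAMS_PER_OUNCE
ethanol_grams = beer_grams * 0.032
ETHANOL_MOLAR_MASS = 46.07          # g/mol for C2H5OH
ethanol_moles = ethanol_grams / ETHANOL_MOLAR_MASS

# One joint: 9 mg of d9-THC, using the post's round 300 g/mol
thc_grams = 0.009
THC_MOLAR_MASS_APPROX = 300         # the post's approximation
thc_moles = thc_grams / THC_MOLAR_MASS_APPROX

ratio = ethanol_moles / thc_moles

print(f"ethanol per beer: {ethanol_grams:.1f} g = {ethanol_moles:.2f} mol")
print(f"d9-THC per joint: {thc_moles:.1e} mol")
print(f"one beer carries about {ratio:,.0f} times more molecules than one joint")
```

On these numbers a single beer delivers several thousand times more molecules than a joint, and it takes more than one beer to get drunk.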

As we’ve found out — d9-THC is so potent because it binds to receptors for it. Unlike ethanol which can be a product of intermediary metabolism, there aren’t enzymes specifically devoted to breaking down d9-THC. In contrast, fatty acid amide hydrolase (FAAH) is devoted to breaking down anandamide, one of the endogenous compounds d9-THC is mimicking.

What really concerns me about this class of drugs is how long they must hang around. Teaching neuropharmacology in the 70s and 80s was great fun. Every year a new receptor for neurotransmitters seemed to be found. In some cases mind benders bound to them (e.g. LSD and a serotonin receptor). In other cases the endogenous transmitters being mimicked by a plant substance were found (the endogenous opiates and their receptors). Years passed, but the receptor for d9-THC wasn’t found. The reason it wasn’t is exactly why I’m scared of the drug.

How were the various receptors for mind benders found? You throw a radioactively labelled drug (say morphine) at a brain homogenate, and purify what it is binding to. That’s how the opiate receptors etc. etc. were found. Why did it take so long to find the cannabinoid receptors? Because the cannabinoids, being so incredibly lipid soluble, bind strongly to all the fats in the brain. So the vast majority of stuff bound wasn’t protein at all, but fat. The brain has the highest percentage of fat of any organ in the body (60%), unless you consider dispersed fatty tissue an organ (which it actually is from an endocrine point of view).

This has to mean that the stuff hangs around for a long time, without any specific enzymes to clear it.

It’s obvious to all that cognitive capacity changes from childhood to adult life. All sorts of studies with large numbers of people have done serial MRIs of children and adolescents as they develop and age. Here are a few references to get you started [ Neuron vol. 72 pp. 873 - 884 '11, Proc. Natl. Acad. Sci. vol. 107 pp. 16988 - 16993 '10, vol. 111 pp. 6774 - 6779 '14 ]. If you don’t know the answer, think about how the thickness of the cerebral cortex changes from age 9 to 20. Surprisingly, it gets thinner, not thicker. The effect happens later in the association areas thought to be important in higher cognitive function than in the primary motor or sensory areas. Paradoxical isn’t it? Based on animal work this is thought to be due to pruning of synapses.

So throw a long-lasting retrograde neurotransmitter mimic like d9-THC at the dynamically changing adolescent brain and hope for the best. That’s what the cited editorialists are concerned about. We simply don’t know and we should.

Having been in Cambridge when Leary was just getting started in the early 60s, I must say that the idea of turn on, tune in, drop out never appealed to me. Most of the heavy marihuana users I’ve known (and treated for other things) were happy, but rather vague and frankly rather dull.

Unfortunately as a neurologist, I had to evaluate physician colleagues who got in trouble with drugs (mostly with alcohol). One very intelligent polydrug user MD, put it to me this way — “The problem is that you like reality, and I don’t”.

Further (physical) chemical elegance

If the chemical name phosphatidyl serine (PS) draws a blank, read the verbatim copy of a previous post under the **** to find out why it is so important to our existence. It is an ‘eat me’ signal when there is lots of it around, telling professional scavenger cells to engulf the cell showing lots of PS on its surface.

Life, as usual, is more complicated. There are a variety of proteins exposed on cell surfaces which bind to phosphatidyl serine. Not only that, but exposing just a little PS on the surface of a cell can trigger a protective immune response. Immune cells binding to just a little PS on the surface of another cell proliferate rather than eat the cell expressing the PS. This brings us to Proc. Natl. Acad. Sci. vol. 111 pp. 5526 – 5531 ’14, which explains how a given PS receptor (called TIM4) acts differently depending on how much PS is present.

Some PS receptors such as Annexin V have essentially an all or none response to PS; if they bind at all, they trigger a response in the cell carrying them. Not so for TIM4, which only reacts if there is a lot of PS around, leaving cells which express less PS alone. This allows these cells to function in the protective immune response.

So how does TIM4 do this? See if you can think of a mechanism before reading the rest.

In addition to the PS binding pocket TIM4 has 4 peripheral basic residues in separate places. The basic residues are positively charged at physiologic pH and bind to the negatively charged phosphate group of phosphatidyl serine or to its carboxylate anion. The paper doesn’t explain how these basic residues don’t bind to the other phospholipids of the cell surface (such as phosphatidyl choline or sphingomyelin). It is conceivable that the basic side chains (arginine, lysine etc.) are so set up that they only bind to carboxylate anions and not phosphate anions (but this is a stretch). That would at least give them specificity for phosphatidyl serine as opposed to the other phospholipids present in both leaflets of the cell membrane. In any event TIM4 will be triggered only if these groups also bind PS, leaving cells which show relatively little PS alone. Clever no?

For the cognoscenti, the Hill coefficient of TIM4 is 2 while that of Annexin V is 8 (describing more than explaining the all or none character of Annexin V binding).
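For those who haven't met the Hill coefficient, here is a minimal sketch of the standard Hill equation and of what the exponent does to the steepness of a response. The coefficients 2 and 8 are the ones quoted above; the half-saturation value `K_HALF` is an arbitrary placeholder of mine, and whether the paper fitted exactly this form is an assumption.

```python
def hill(ligand: float, k_half: float, n: float) -> float:
    """Fractional response from the Hill equation: L^n / (K^n + L^n)."""
    return ligand**n / (k_half**n + ligand**n)

def fold_change_10_to_90(n: float) -> float:
    """Fold increase in ligand needed to go from 10% to 90% response.

    Solving the Hill equation for L gives L = K * (y / (1 - y))**(1/n),
    so the ratio L(90%) / L(10%) is 81**(1/n), independent of K.
    """
    return 81 ** (1 / n)

K_HALF = 1.0  # hypothetical half-saturation level, arbitrary units

for label, n in (("Hill coefficient 2", 2), ("Hill coefficient 8", 8)):
    print(f"{label}:")
    for ligand in (0.5, 0.8, 1.0, 1.25, 2.0):
        print(f"  ligand = {ligand:4.2f} x K -> response = {hill(ligand, K_HALF, n):.3f}")
    print(f"  10% -> 90% response needs a {fold_change_10_to_90(n):.2f}-fold rise in ligand")
```

With an exponent of 8 the response goes from nearly off to nearly on over less than a 2-fold change in ligand, which is what all or none looks like on paper. With an exponent of 2 the same swing takes a 9-fold change.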

****
Flippase. Eat me signals. Dragging their tails behind them. Have cellular biologists and structural biochemists gone over to the dark side? It’s all quite innocuous as the old nursery rhyme will show.

Little Bo Peep has lost her sheep
and doesn’t know where to find them
Leave them alone, and they’ll come home
wagging their tails behind them.

First, some cellular biochemistry. The lipid bilayer encasing all our cells is made of two leaflets, inner and outer. The composition of the two is different (unlike the soap bubble). On the inside we find phosphatidylethanolamine (PE) and phosphatidylserine (PS). The outer leaflet contains phosphatidylcholine (PC) and sphingomyelin (SM) and almost no PE or PS. This is clearly a low entropy situation compared to having all 4 randomly dispersed between the 2 leaflets.

What is the possible use of this (notice how teleology invariably creeps into cellular biology)? Chemistry is powerless to explain such things. Much as I love chemistry, such truths must be faced.

It takes energy to maintain this peculiar distribution. The enzyme moving PE and PS back inside the cell is the flippase. It requires energy in the form of ATP to operate. When a cell is dying ATP drops, and entropy takes its course moving PE and PS to the cell surface. Specialized cells (macrophages) exist to scoop up the dying or dead cells, without causing inflammation. They recognize PE and PS by a variety of receptors and munch up cells exposing them on the surface. So PE and PS are eat me signals which appear when there isn’t enough ATP around for flippase to use to haul PE and PS back inside. Clever no?

Now for some juicy chemistry (assuming that you consider transport of a molecule across a lipid bilayer actual chemistry; no covalent bonds to the transferred molecule are formed or removed, although they are to the transporter). Well, it certainly is physical chemistry, isn’t it?

Here are the structures of PE, PS, PC, SM http://www.google.com/search?q=phosphatidylserine&client=safari&rls=en&tbm=isch&tbo=u&source=univ&sa=X&ei=bDRLU5yfHOPLsQSOnoG4BA&ved=0CPABEIke&biw=1540&bih=887#facrc=_&imgdii=_&imgrc=qrLByG2vmhWdwM%253A%3BwAtgsTPwCxeZXM%3Bhttp%253A%252F%252Fscience.csumb.edu%252F~hkibak%252F241_web%252Fimg%252Fpng%252FCommon_Phospholipids.png%3Bhttp%253A%252F%252Fscience.csumb.edu%252F~hkibak%252F241_web%252Fcoursework_pages%252F2012_02_2.html%3B1297%3B934.

There are a few things to notice. Like just about every lipid found in our membranes, they are amphipathic: they have a very lipid soluble part (look at the long hydrocarbon chains hanging below them) and a very water soluble part, the head groups containing the phosphate.

This brings us to [ Proc. Natl. Acad. Sci. vol. 111 pp. E1334 - E1343 '14 ], which describes ATP8A2 (aka the flippase). Interestingly, the protein, with at least 10 alpha helices spanning the membrane and 3 cytoplasmic domains, closely resembles the classic sodium pump beloved of neurophysiologists everywhere, which pumps sodium ions out of neurons and pumps potassium ions inside, producing the equally beloved membrane potential of neurons.

Look at those structures again. While there are charges on PE, PS (on the phosphate group), these molecules are far larger than the sodium or the potassium ion (easily by a factor of 10). This has long been recognized and is called the ‘giant substrate problem’.

The paper solved the structure of ATP8A2 and used molecular dynamics simulations to try to understand how it works. What they found is that transmembrane alpha helices 1, 2, 4 and 6 (out of 10) form a water filled cavity, which dissolves the negatively charged phosphate of the head group. What happens to those long hydrocarbon tails? They are left outside the helices in the lipid core of the membrane. It is the charged head groups that are dragged through by the flippase, with the tails wagging along behind them, just like little Bo Peep.

There’s a lot more great chemistry in the paper, particularly how Isoleucine #364 directs the sequential formation and annihilation of the water filled cavities between alpha helices 1, 2, 4 and 6, and how a particular aspartic acid is phosphorylated (by ATP, explaining why the enzyme no longer works in energetically dying cells) changing conformation of all 10 transmembrane helices, so that only one half of the channel is open at a time (either to the inside or the outside).

Go read and enjoy. It’s sad that people who don’t know organic chemistry are cut off from appreciating such elegance. There is more to esthetics than esthetics.

The death of the synonymous codon – IV

The coding capacity of our genome continues to amaze. The redundancy of the genetic code has been put to yet another use. Depending on how much you know, skip the following three links and read on. Otherwise all the background to understand the following is in them.

http://luysii.wordpress.com/2011/05/03/the-death-of-the-synonymous-codon/

http://luysii.wordpress.com/2011/05/09/the-death-of-the-synonymous-codon-ii/

http://luysii.wordpress.com/2014/01/05/the-death-of-the-synonymous-codon-iii/

There really was no way around it. If you want to code for 20 different amino acids with only four choices at each position, two positions (4^2) won’t do. You need three positions, which gives you 64 possibilities (61 after the three stop codons are taken into account) and the redundancy that comes with it. The previous links show how the redundant codons for some amino acids aren’t redundant at all but used to code for the speed of translation, or for exonic splicing enhancers and inhibitors. Different codons for the same amino acid can produce wildly different effects leaving the amino acid sequence of a given protein alone.
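The counting argument can be made concrete with the standard genetic code. The sketch below builds the 64 codon table from the usual compact ordering (bases T, C, A, G; one letter amino acid codes, `*` for stop) and tallies how many codons each amino acid gets. The variable names are mine.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Two positions give 16 possibilities, too few for 20 amino acids
print("two positions:", len(BASES) ** 2)
print("three positions:", len(CODON_TABLE))

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print("stop codons:", stop_codons)
print("sense codons:", sum(degeneracy.values()))
print("amino acids coded:", len(degeneracy))
for aa, count in sorted(degeneracy.items(), key=lambda item: -item[1]):
    print(f"  {aa}: {count} codon(s)")
```

Only methionine and tryptophan get a single codon; every other amino acid has at least two to choose from, which is the redundancy being put to use.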

If anything will figure out a way to use synonymous codons for its own ends, it’s cancer. [ Cell vol. 156 pp. 1129 - 1131, 1324 - 1335 '14 ] analyzed protein coding genes in cancer. Not just a few cases, but the parts of the genome coding for the exons of a mere 3,851 cases of cancer. In addition they did whole genome sequencing in 400 cases of 19 different tumor types.

There are genes which suppress cancer (which cancer often knocks out, such as the retinoblastoma gene or the ubiquitous p53), and genes which when mutated promote it (oncogenes like ras). They found a 1.3 fold enrichment of synonymous mutations in oncogenes (which would tend to activate them) relative to the tumor suppressors. The synonymous mutations accounted for 20 – 40% of somatic mutations found in cancer exomes.

Unfortunately, synonymous mutations have been used to estimate the background mutation frequency for evolutionary analysis, on the theory that they are neutral (i.e. because they don’t change protein structure, they are assumed not to change how the gene for the protein functions). Wrong. Wrong. They can change how much of a protein is made, where it is made, or which exons are included in the final product.

Why drug discovery is so hard: Reason #25 — What if your drug target is really a pointer to the real target?

Any drug safely producing weight loss would be a big (or small) pharma blockbuster. Those finding it should get on the boat to Sweden. Finding a target to attack is the problem. Here’s one way to look. Take lots of fat people, lots of thin people and see what in their genomes differentiates them (assuming anything does). Actually what was done was to look at type II (non-insulin dependent) diabetics, the vast majority of them overweight, and controls. The first study involved the genomes of nearly 5,000 diabetics and controls. How did they interrogate the genomes? At the time of the work it was impossible to completely sequence this many genomes.

It’s time to speak of SNPs (single nucleotide polymorphisms). Our genome has 3.2 gigaBases of DNA. Sequencing has given each position a standard nucleotide (one of A, T, G, or C). If 5% of the population have one of the other 3 at this position you have a SNP. Already 10 years ago, some 7 MILLION SNPs had been found and mapped to the human genome.

The first study found some SNPs associated with obesity in the diabetics. This tells where to look for the gene. A second study with nearly 9,000 diabetics and controls, replicated the first.

Then the monster study, with 39,000 people [ Science vol. 316 pp. 889 - 894 '07 ] found FTO (FaT mass and Obesity associated gene) on chromosome #16. The 16% of Caucasian adults with two copies of the variant SNP in FTO were 1.67 times more likely to be obese. An intense flurry of work showed that the gene coded for an oxidase, using iron and 2 oxo-glutaric acid (alphaKG for you old timers). The enzyme removes methyl groups from the amino group at position #6 of adenine and the 3 position of thymine. Before this time, no one really paid much attention to them. Subsequently we’ve found 6 methyl adenine in a mere 7,676 mRNAs. Just what it does when it’s there, and why the cell wants to remove it is currently being worked out.

Clearly FTO is a great target for an obesity drug. Of course they knocked the gene out in the mouse. The animals were normal at birth, but at 6 weeks weighed 30 – 40% less than normal mice. FTO as a drug target looked even better after this.

It was somewhat surprising that the SNP was in an intron in the gene. This meant that even in the obese the protein product of the FTO gene was the same as in the skinny. Presumably this could mean more FTO, less FTO or a different splice variant. If some of this molecular biology is above your pay grade, the background you need is in 5 posts starting with https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/.

It was also surprising that FTO levels were the same in people with and without the fat SNP. That left splice variants as a possibility.

The denouement came this week [ Nature vol. 507 pp. 309 - 310, 371 - 375 '14 ]. The intron containing the SNP in FTO produces obesity by controlling another gene called IRX3 which is a mere 500,000 nucleotides away. The intron of FTO binds to the promoter of IRX3 turning the gene on resulting in more IRX3. Mice lacking a functional copy of IRX3 have a 25 – 30% lower body mass. As any C programmer would say, FTO is the pointer not the data.

I don’t know if big or small pharma was at work finding inhibitors or enhancers of FTO function, but this paper should have brought them to a screeching halt. The FTO/IRX3 story just shows how many pitfalls there are to finding new drugs, and why the search has shown relatively little success recently. We are trying to alter the function of an incredibly complex system, whose workings we only dimly understand.

Was I the last to find out?

Quick! Can you form a hydrogen bond from a carbon hybridized sp3 to an oxygen atom?

I didn’t think so, but you can. This, in spite of reading about proteins for over half a century. [ Proc. Natl. Acad. Sci. vol. 111 pp. E888 - E895 '14 ] describes such bonds (along with lots of references backing up the statements which follow) forming between the transmembrane segments of membrane proteins (estimated to be 30% of all our proteins).

Whether or not they contribute to membrane stability isn’t known. Consider the alpha carbon of an amino acid. It is adjacent to a carbonyl group of an amide (electron hungry, but less so than a pure carbonyl because of resonance) and the nitrogen atom of an amide (slightly more electronegative than carbon, and probably more electron hungry because it loses part of its lone pair to resonance). Both neighbors pull electron density away from the alpha carbon, leaving its C-H bond polarized enough to act as a hydrogen bond donor.

These hydrogen bonds are usually found from the alpha carbon of glycine on one helix to the carbonyl of an adjacent transmembrane helix. Glycine zippers (e.g. the G X X X G motif) have long been known in transmembrane helices. Since glycine is the smallest amino acid, having them on the same side of the helix was thought to be a way to pack adjacent helices together.

What would you consider good evidence for such a bond? Spectroscopy of model compounds with deuterium for the alpha hydrogen would be one way (it’s been done). The best evidence would be a shortened distance between the hydrogen and the carbonyl and this has been found as well.

Humbling ! !

What junk DNA is doing

I’ve never bought the idea that the 98% of our 3.2 gigaBase genome not coding for protein is junk. Consider the humble leprosy organism. It’s a mycobacterium (like the organism causing TB), but because it essentially is confined to man, and lives inside humans for most of its existence, it has jettisoned large parts of its genome, first by throwing about 1/3 of it out (the genome is 1/3 smaller than TB from which it is thought to have diverged 66 million years ago), and second by mutation of many of its genes so protein can no longer be made from them. Why throw out all that DNA? The short answer is that it is metabolically expensive to produce and maintain DNA that you’re not using.

Which brings us to Cell vol. 156 pp. 907 – 919 ’14. At least half of our genome is made of repetitive elements. We have some 520,000 (imperfect) copies of LINE1 elements — each up to 6,000 nucleotides long. There are 1,400,000 (imperfect) copies of Alu each around 300 nucleotides long. This stuff has been called junk for decades. However it has become apparent that over 50% of our entire genome is transcribed into RNA. This is also expensive metabolically.
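As a rough check on those copy numbers, here is a minimal back-of-the-envelope sketch in Python. It uses only the figures quoted above; the function name and the dictionary layout are made up for illustration. Since LINE1 copies are "up to" 6,000 nucleotides, the LINE1 number is a ceiling rather than an estimate.

```python
GENOME_BP = 3.2e9  # genome size quoted in the post

# (copy number, nucleotides per copy), both as quoted in the post
REPEATS = {
    "LINE1": (520_000, 6_000),  # 6,000 is the maximum length, so this is an upper bound
    "Alu": (1_400_000, 300),    # "around 300" nucleotides each
}


def repeat_fraction(copies, length_nt, genome_bp=GENOME_BP):
    """Fraction of the genome covered if every copy had the given length."""
    return copies * length_nt / genome_bp


for name, (copies, length_nt) in REPEATS.items():
    print(f"{name}: {repeat_fraction(copies, length_nt):.1%} of the genome")
```

Alu alone comes out at roughly 13% of the genome. The LINE1 ceiling is nearly the whole genome, which can only mean that most of the 520,000 copies are far shorter than the full 6,000 nucleotides.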

Addendum 17 Mar: Just the cost of making a single nucleotide from scratch to hook into mRNA is 50 ATP molecules (according to an estimate I read). It also takes energy for the polymerase to hook two nucleotides together — but I can’t find out what it is (anyone know?). It’s hard to avoid teleology when thinking about biology — but why should a cell expend all this metabolic energy to copy half or more of its genome into RNA, if it weren’t getting something useful back?

Why hasn’t evolution got rid of this stuff, like the leprosy organism? Probably because it’s doing several important things we don’t understand. Here’s one of them. The Cell paper did something clever and obvious (now that someone else thought of it). C0T-1 DNA is placental DNA predominantly 50 – 300 nucleotides in size, very enriched in repetitive DNA sequences. It is used to block nonspecific hybridization in microarray screening for mRNA coding for protein. The authors used C0T-1 DNA to look at whole cells to find RNA transcribed from these repetitive elements, and more importantly, to find where in the cell it was located.

Guess what they found? RNA transcribed from repetitive DNA is associated big time with interphase (i.e. not undergoing mitosis) active chromatin (aka euchromatin). So RNA transcribed from Alu and LINE1 is a structural component of our chromosomes. Since the length of the 3.2 gigaBases of our genome, if stretched out, is 1 METER, a lot of our DNA occurs in very compact structures (heterochromatin) which are thought to be transcriptionally inactive. What happens when you use RNAase (an enzyme breaking down RNA) to remove this RNA? The chromosomes condense to heterochromatin. So the junk may be keeping our chromosomes in an ‘open’ state, a fairly significant function.
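The 1 meter figure is easy to reproduce. The sketch below assumes the standard textbook rise of about 0.34 nanometers per base pair for B-form DNA (a value not given in the post), and the function name is just for illustration.

```python
RISE_PER_BP_NM = 0.34  # assumption: axial rise per base pair in B-form DNA
GENOME_BP = 3.2e9      # genome size quoted in the post


def contour_length_m(n_bp, rise_nm=RISE_PER_BP_NM):
    """Stretched-out length in meters of n_bp base pairs of double helix."""
    return n_bp * rise_nm * 1e-9  # convert nanometers to meters


print(f"{contour_length_m(GENOME_BP):.2f} m")
```

Under that assumption one copy of the genome comes out at a little over a meter, which is where the round number comes from.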

This is the exact opposite of XIST, a 17,000 nucleotide RNA transcribed from the X chromosome, which keeps one of the two X’s each female possesses inactive by coating it like the ecRNAs.

The authors conclude with “we are far from understanding genome expression and regulation.” Amen.

If some of this is a bit above your molecular biological pay grade — please see a series of articles “Molecular Biology Survival Guide for Chemists” — here’s a link to the first one — https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/. There are 4 more.

Short and Sweet

Yamanaka strikes again. Citrulline is deiminated arginine, replacing a C=N-H (the imine) by a carbonyl C=O. An enzyme called PAD4 does the job. Why is it important? Because one of its targets is the H1 histone which links nucleosomes together. Recall that the total length of DNA in each and every one of our cells is about 2 METERS (two copies of a genome roughly 1 meter long). By wrapping the double helix around nucleosomes, the DNA is shortened by one order of magnitude.

So what? Well, at physiologic pH the imine probably binds another proton making it positively charged, making it bind to the negatively charged DNA phosphate backbone. Removing the imine makes this less likely to happen, so the linker doesn’t bind the double helix as tightly.
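To put a number on "probably", here is a minimal Henderson-Hasselbalch sketch. The pKa values are assumptions for illustration, not taken from the paper: about 12.5 for the arginine side chain (a common textbook value) and about 0 for the protonated urea-like group of citrulline.

```python
def fraction_protonated(pka, ph=7.4):
    """Henderson-Hasselbalch: fraction of a basic group carrying its extra proton."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


ARGININE_PKA = 12.5   # assumed textbook value for the arginine side chain
CITRULLINE_PKA = 0.0  # assumed rough value for the protonated ureido group

print(f"arginine:   {fraction_protonated(ARGININE_PKA):.6f}")
print(f"citrulline: {fraction_protonated(CITRULLINE_PKA):.2e}")
```

With those assumed values essentially every arginine side chain is positively charged at pH 7.4, while citrulline is essentially never charged, so converting one to the other costs the histone a full positive charge at that position.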

Duck soup for the chemist, but apparently no one had thought to look at this before.

This opens up the DNA (aka chromatin decondensation) for transcription. Why is Yamanaka involved? Because PAD4 is induced during cellular reprogramming to induced pluripotent stem cells (iPSCs), activating the expression of key stem cell genes. Inhibition of PAD4 lowers the percentage of pluripotent stem cells, reducing reprogramming efficiency. The paper is Nature vol. 507 pp. 104 – 108 ’14.

While this may be nice for forming iPSCs, it should be noted that PAD4 is upregulated in a variety of tumors.
