Category Archives: Molecular Biology

The plural of anecdote IS data

Five years ago I wrote a post on the perils of implicating a gene as the cause of a disease because one or two people with the disease had a mutation there (see the bottom). That is now back in spades with a new report from the Exome Aggregation Consortium (ExAC) [ Nature vol. 536 pp. 249, 277 – 278, 285 – 291 ’16 ].

What they did was to aggregate sequence data from 60,704 people on the parts of their genomes coding for the amino acids making up proteins (the exome — https://en.wikipedia.org/wiki/Exome). The paper has 80+ authors. The data is publicly available and is planned to grow to 120,000 exomes and 20,000 whole genomes in the next year. Both are orders of magnitude larger than any individual exome study so far. So study enough anecdotes (small studies) and pretty soon you have real data.

The articles state that over a million people have now had either their exomes or their whole genomes sequenced ! ! !

The amount of variation in the human genome is simply incredible. Some 7,404,909 variants in the exome were described, of which 54% had never been seen before. These account for 1/8 of all the sites in all our exomes, implying that the exome comprises 60 megaBases of the 3200 megaBase human genome (1.8%). Most of the variants were single amino acid changes due to changes in a single nucleotide, but there were 317,381 insertions or deletions (95% shorter than 6 nucleotides).
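The exome size quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes each reported variant occupies its own site (which slightly overstates the number of variable sites), and the variable names are mine, not ExAC's.

```python
# Back-of-the-envelope check of the exome size implied by the ExAC numbers.
variants = 7_404_909          # variants reported in the exome
sites_per_variant = 8         # "1/8 of all the sites" carry a variant
genome_megabases = 3200       # genome size used in the post

# Assumption: one variant per variable site, so sites = variants * 8.
exome_bases = variants * sites_per_variant
exome_megabases = exome_bases / 1e6
exome_fraction = exome_megabases / genome_megabases

print(f"implied exome size: {exome_megabases:.1f} megabases")
print(f"fraction of genome: {exome_fraction:.2%}")
```

The result lands just under 60 megabases and a little under 2% of the genome, in line with the rounded figures in the text.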

99% of all variants had a frequency of under 1% (i.e. not found in more than 607 people), with half being found only once in the 60,704. 8% of the sites with variation contain more than one variant (consistent with what you’d expect of a Poisson distribution).
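A rough way to see why a Poisson picture is plausible: if variants land on sites independently, and one site in eight carries at least one, then the share of variable sites carrying two or more follows directly. The independence assumption and the choice of 1/8 as the share of variable sites are mine; the papers may have done this differently.

```python
import math

cohort = 60_704
rare_cutoff = 0.01 * cohort          # 1% of the cohort, about 607 people

# Assumption: variants land on sites independently, so the number of
# distinct variants at a site is Poisson with mean lam.
p_variable = 1 / 8                   # share of sites with at least one variant
lam = -math.log(1 - p_variable)      # solves P(k >= 1) = 1 - exp(-lam)

p_zero = math.exp(-lam)              # no variant at the site
p_one = lam * math.exp(-lam)         # exactly one variant at the site
p_multi_given_variable = (1 - p_zero - p_one) / (1 - p_zero)

print(f"1% of the cohort: {rare_cutoff:.0f} people")
print(f"variable sites with 2+ variants: {p_multi_given_variable:.1%}")
```

Under these assumptions the expected share comes out between 6% and 7%, the same ballpark as the 8% reported, which is about as much as a sketch this crude can say.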

What is so remarkable is that the average participant has 54 variants previously classified as responsible for a genetic disorder. Not only that, 183/192 variants thought to cause a rare hereditary disease were found in many healthy people, implying that they were incidental findings (anecdotes) rather than causal. It shows you what happens when you have adequate data.

They are pretty sure that their work will stand, because the exomes were sequenced many times over (deeply sequenced, in the lingo): more than 10x in over 80% of the cohort.

I’d also written earlier about how full of errors our genomes are — see https://luysii.wordpress.com/2012/07/31/how-badly-are-thy-genomes-oh-humanity/

A lot of the variants produced termination codons in the body of the exome, so a full-length protein couldn’t be produced from the gene (these are called truncation variants) — some 179,774 in the 7,404,909. Most occurred just once. Even so this means that most of the cohort had at least one or two. Even this rather negative knowledge was useful — since we have about 20,000 protein coding genes, they found 3,230 in which truncation variants NEVER occurred, implying that the protein is crucial to survival.

****

We’ve found the mutation causing your disease — not so fast, says this paper (posted 17 July 2011)

This post takes a while to get to the main points, but hang in there, the results are striking (and disturbing).

First: a bit of history. In the bad old days (any time over about 30 years ago) there was basically only one way to look for a disc in the spinal canal pressing on a nerve producing symptoms (usually pain, followed by numbness and weakness). It was the myelogram, where a spinal tap was done, an oily substance (containing iodine which Xrays don’t penetrate well) was injected into the spinal canal, and Xrays taken. The disc showed up as a defect in the column of dye (not really a dye as any chemist can see). This usually led to surgery if a disc was found, even if it was one or two spinal levels from where clinicians thought it should be based on their examination and other tests such as electromyography (EMG). This was usually put down to anatomic variability. Results were less than perfect.

Myelography was a rather stressful procedure, and I usually brought patients into the hospital the night before, got a cardiogram (to make sure their heart could take it, and that they hadn’t had a silent heart attack). Then the myelography itself, which wasn’t painful as the radiologist put the needle in under fluoroscopy so they could see exactly where to go. However, many people got severe post-spinal headaches (invariably doctors’ wives), sometimes requiring a blood patch to plug the hole where the (large) needle used to inject the ‘dye’ went — it had to be large because the ‘dye’ was rather oily (viscous). The bottom line was that you didn’t subject a patient to a myelogram unless they were having a significant problem. Only very symptomatic people had the test, and usually when nonsurgical therapy had been tried and failed.

Fast forward to the MRI (Magnetic Resonance Imaging) era (nuclear magnetic resonance to the chemist, but radiologists were smart enough to get the word nuclear removed so patients would submit to the test). A painless technique, but stressful for some because of the close quarters in the MRI machine. You could look at the whole spinal canal, and see far more anatomic detail, because you actually see the disc (rather than its impression on a column of dye) and the surrounding bones, ligaments etc. etc.

What did we find? There were tons of people with discs where they shouldn’t be (e.g. herniated discs) who were having no problems at all. This led to a lot more careful assessment of patients, with far better correlation of anatomic defect and clinical symptoms.

What in the world does this all have to do with the genetics of disease? Patience; you’re about to find out.

There’s an interesting interview with Eric Lander (of Human Genome Project fame) in the current PNAS (p. 11319). He notes that in 1990 sequencing a single genome cost $3,000,000,000. He thinks that at some time in the next 5 years we’ll be able to do this for $1,000, a 3 million-fold improvement in cost. The genome has around 3,000,000,000 positions to sequence. As things stand now, it’s literally nothing to determine the sequence of a few million positions in DNA.

On to Cell vol. 145 pp. 1036 – 1048 ’11 which sequenced some 9,000,000 positions of DNA. This didn’t make a big splash (but its implications might). Just a single paper, buried in the middle of the 24 June ’11 Cell — it didn’t even rate an editorial. Now, as chemists, if you’re a bit shaky on what follows, all the background you need can be found in the series of articles found here: https://luysii.wordpress.com/category/molecular-biology-survival-guide/

As a neurologist, I treated a lot of patients with epilepsy (recurrent convulsions, recurrent seizures). 2% of children and 1% of adults have it (meaning that half of the kids with it will outgrow it, as did the wife of an old friend I saw this afternoon). Some forms of epilepsy run in families with strict inheritance (like sickle cell anemia or cystic fibrosis). 20 such forms have been tied down to single nucleotide polymorphisms (SNPs) in 20 different genes coding for protein (there are other kinds of genes — all is explained in the background material above). 17/20 of these SNPs are in a type of protein known as an ion channel. These channels are present in all our cells, but in neurons they are responsible for the maintenance of a membrane potential across the membrane, which has the ability to change abruptly, causing a nerve cell to fire an impulse. In a very simplistic way, one can regard a convulsion (epileptic seizure) as nerve cells gone wild, firing impulses without cease, until the exhausted neurons shut down and the seizure ends.

However, the known strictly hereditary forms of epilepsy account for at most 1 – 2% of all people with epilepsy. The 9,000,000 determinations of DNA sequence were performed on 237 ion channel genes, but just those parts of the genes actually coding for amino acids (these are the exons). They studied 152 people with nonhereditary epilepsy (also known as idiopathic epilepsy) and, most importantly, they looked at the same channels in 139 healthy normal people with no epilepsy at all.

Looking at the 17/237 ion channels known to cause strictly hereditary epilepsy they found that 96% of cases of nonhereditary (idiopathic) epilepsy had one or more missense mutations (an amino acid at a given position different than the one that should be there). Amazingly, 70% of normal people also had missense mutations in the 17. Looking at the broader picture of all 237 channels, they found 300 different mutations in the 139 normals, of which 23 were in the 17. Overall they found 989 SNPs in all the channels in the whole group, of which 415 were nonsynonymous.

Well what about mutational load? Suppose you have more than one mutation in the 17 genes. 77% of the cases with idiopathic epilepsy had 2 or more mutations in the 17, but so did 30% of the people without epilepsy at all.
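To put numbers on how little a "2 or more mutations" finding tells you about an individual, here is a small sketch using Bayes' rule. The two percentages are the ones quoted above; using the 1% adult prevalence of epilepsy as the prior is my assumption, not something the paper does.

```python
# Figures quoted in the post for the 17 known epilepsy channel genes.
p_two_plus_given_epilepsy = 0.77   # cases with 2 or more mutations
p_two_plus_given_healthy = 0.30    # controls with 2 or more mutations

# Assumption (not from the paper): use the 1% adult prevalence quoted
# earlier in the post as a rough prior for having epilepsy.
prevalence = 0.01

def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)

odds_ratio = odds(p_two_plus_given_epilepsy) / odds(p_two_plus_given_healthy)

# Bayes' rule: P(epilepsy | 2+ mutations in the 17 genes).
true_pos = p_two_plus_given_epilepsy * prevalence
false_pos = p_two_plus_given_healthy * (1 - prevalence)
ppv = true_pos / (true_pos + false_pos)

print(f"odds ratio: {odds_ratio:.1f}")
print(f"chance of epilepsy given 2+ mutations: {ppv:.1%}")
```

The odds ratio looks impressive, yet under this prior only a few percent of people carrying 2 or more such mutations would have epilepsy, because carriers without the disease vastly outnumber carriers with it.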

The relation between myelography and early genetic work on disease should be clear. Back then, a lot was taken as abnormal as only the severely afflicted could be studied, due to time, money and technological constraints. As the authors note “causality cannot be assigned to any particular variant”. Many potentially pathogenic genetic variants in known dominant channel genes are present in normals.

What was not clear to me from reading the paper is whether any of the previously described mutations in the 17 that are thought to be causative of strictly hereditary epilepsy were present in the 139 normals.

A very interesting point is how genetically diverse the human population actually is (and they only studied Caucasians and Hispanics — apparently no Blacks). No individual was free of SNPs. No two individuals (in the 139 + 152) had the same set of SNPs. Since they found 989 SNPs in the combined group, even in this small sample of proteins (237 of 20,000) this averages out to more than 3 per individual. Well, are there ‘good’ SNPs in the asymptomatic group, and ‘bad’ SNPs in the patients with idiopathic epilepsy? Not really, the majority of the SNPs were present in both groups.

I leave it to your imagination what this means for ‘personalized medicine’. We’re literally just beginning to find out what’s out there. This is the genetic analog of the asymptomatic disc. We may not know all we thought we knew about genetics and disease. Heisenberg must be smiling, wherever he is.

What reading the literature is like when things are barely understood

There is a very exciting paper to be described in a post to appear shortly. I ran a muscular dystrophy clinic for 15 years, and saw lots of Amyotrophic Lateral Sclerosis (ALS) — even though, strictly speaking it is not a muscular dystrophy. The Muscular Dystrophy Association was founded by parents of weak children, before we could actually separate motor neuron disease from myopathy. In retirement, I’ve kept up an interest in ALS (particularly since all I could do for patients as a doc was — (drumroll) — basically nothing).

The fact that a fair amount of even sporadic ALS has a problem with a protein called C9ORF72 was particularly fascinating. All this came out less than five years ago (October 2011). Everything is far from clearcut even now.

That being the case, it might be of interest to look at the notes I accumulated as scientists began to explore what was wrong with C9ORF72, how the protein normally does whatever it does (we still don’t know really) and how the mutated product of the gene causes trouble (there are 3 main theories).

What you’ll see in what follows is the heat of scientific battle (warts and all), where things are far from clear. Enjoy. This is basically what used to be called a core-dump (back in the day when computer memory was made of metallic cores). Things are far from cut and dried even now so it might be of interest to see the many angles of attack on the problem, the confusion, the conflicting theories, as things became a bit more clear. It’s the scientific enterprise in action against a very horrible disease (trust me).

I’ll try and clear up the typos. I’ll also try to put the notes on the papers in semi-chronological order, but I make no guarantees. The notes may be incomprehensible, as they include only what I didn’t know rather than all the background needed to understand what’s in them.

First a bit of background — FTD stands for FrontoTemporal Dementia.

The #9p21 chromosomal region is another locus for ALS/FTD. It contains something called C9orf72, which contains a GGGGCC hexanucleotide repeat in the intron between noncoding exons 1a and 1b. Normal alleles contain less than 24 repeats (range 2 – 23). Those with ALS + FTD contain over 30 (actually they think the repeat length is much higher — 700 to 1,600 ! ! !). ORF probably stands for open reading frame.
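As a toy illustration of the repeat thresholds just described, the sketch below counts back-to-back GGGGCC copies in a DNA string and labels the count. The function names and the "indeterminate" label for 24 to 30 repeats are my own inventions for illustration; this is not how repeat length is measured in the lab.

```python
import re

REPEAT_UNIT = "GGGGCC"

def longest_repeat_run(dna: str, unit: str = REPEAT_UNIT) -> int:
    """Return the largest number of back-to-back copies of unit in dna."""
    runs = re.findall(f"(?:{unit})+", dna.upper())
    return max((len(run) // len(unit) for run in runs), default=0)

def classify(repeat_count: int) -> str:
    # Thresholds taken from the notes: normal alleles have fewer than 24
    # repeats, ALS/FTD alleles more than 30. The gap in between is left
    # as "indeterminate" because the notes do not say how to call it.
    if repeat_count < 24:
        return "normal range"
    if repeat_count > 30:
        return "expanded"
    return "indeterminate"

toy_allele = "AT" + REPEAT_UNIT * 5 + "TT"   # made-up sequence, 5 repeats
count = longest_repeat_run(toy_allele)
print(count, classify(count))
```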

The expansion is present in 12% of familial FTD and 22.5% of familial ALS — making it the most common genetic abnormality in both conditions. More importantly it is found in 21% of sporadic ALS and 29% of FTD in the Finnish population. Later they say it is the most common genetic cause of sporadic ALS (but only in 4%).

There are 3 possible mechanisms of toxicity:
1. The RNA transcribed from the repeat acts as an RNA sponge, binding all sorts of RNAs it shouldn’t
2. Repeat Associated Non-ATG translation (RAN translation) see later
3. Decreased expression of the mRNA for C9ORF72.

[ Science vol. 338 pp. 1282 – 1283 ’12 ] Now 40% of familial ALS, 21% of familial frontotemporal dementia, and 8% of sporadic ALS, 5% of sporadic frontotemporal dementia have expansions in C9orf72.

Not much is known about C9orf72 — it is conserved across species. It contains no previously known protein domains. The expansion leads to loss of one alternatively spliced C9ORF72 isoform (normally 3 isoforms are expressed), and to the formation of nuclear RNA foci (which appear to be composed mostly of the expansion). [ Neuron vol. 79 pp. 416 – 438 ’13 ] The function of C9ORF72 is unknown (8/13).

The current (12/12) thinking is that the repeats produce a glob of RNA which traps RNA binding proteins which have better things to do. The best analogy is myotonic dystrophy in which an expanded 3 nucleotide repeat sequesters muscleblind, an RNA binding protein involved in splicing.

The expansion is present in 46% of familial ALS in Finland and 21% of sporadic ALS there. But Finns are somewhat different genetically. The expansion is found in 1/3 of European ancestry familial ALS.

Interestingly some of the patients with FTD presented with nonfluent progressive aphasia.

[ Cell vol. 152 pp. 691 – 698 ’13, Neuron vol. 77 pp. 639 – 646 ’13 ] The protein aggregates of C9orf72 mutants contain TDP43 inclusions. But they also show additional p62 and ubiquilin positive pathology (with no TDP43 present). The abnormal proteins are due to translation of the expanded GGGGCC repeats (which should be nonCoding as they are in introns). This is an example of Repeat Associated Non-ATG translation (RAN). This was first shown for expanded CAG repeats, which can be translated in all 3 reading frames giving polyGlutamine, polyAlanine and polySerine. A minimum of 58 CAG repeats was required for translation.

This work looked for translation of GGGGCC in all 3 reading frames (poly glycine-proline, poly glycine-alanine, poly glycine-arginine). They found poly glycine-proline in the protein inclusions which were p62 positive and TDP43 negative. Similar inclusions weren’t present in other neurodegenerative diseases known to have nucleotide inclusions.
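For chemists who want to see where the three repeat proteins come from, here is a minimal sketch that reads a tandem repeat in each of its three reading frames using the standard genetic code. It works on DNA letters (T rather than U), and the function name and the choice of 6 copies are arbitrary.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate_frames(unit: str, copies: int = 6) -> list[str]:
    """Translate a tandem repeat in reading frames 0, 1 and 2."""
    dna = unit.upper() * copies
    peptides = []
    for offset in range(3):
        # Step through complete codons only, starting at this offset.
        codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
        peptides.append("".join(CODON_TABLE[c] for c in codons))
    return peptides

for unit in ("GGGGCC", "CAG"):
    print(unit, translate_frames(unit))
```

GGGGCC gives alternating Gly-Ala, Gly-Pro and Gly-Arg depending on the frame, while CAG gives runs of Gln (Q), Ser (S) and Ala (A).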

[ Proc. Natl. Acad. Sci. vol. 110 pp 7533 – 7534, 7778 – 7783 ’13 ] The expanded C9orf72 repeat is enough to cause neurodegeneration (mammalian neurons, and D. melanogaster). They placed either 3 or 30 copies of GGGGCC into an epidermal growth factor vector between the start of transcription and the first ATG codon. The repeat can sequester the RNA binding protein Pur alpha (and other Pur family members). Interestingly, TDP43 didn’t bind to the repeat RNA, nor did hnRNP A2/B1 which binds to fragile X CGG repeat containing RNA. Overexpression of Pur alpha is able to abort the neurodegeneration in the mammalian neuronal cell line (Neuro-2a). So probably the excessive repeat number is acting as an RNA sponge.

Pur alpha is evolutionarily conserved. It controls the cell cycle and differentiation. It is also a component of the RNA transport granule. It interacts with Pur beta.

30 was as many repeats as they could manipulate experimentally — normals have 2 – 8 repeats, but patients with disease have from 100s to 1,000s of repeats, so the pathogenesis might be different.

[ Neuron vol. 80 pp. 257 – 258, 415 – 428 ’13 ] Expression of C9orf72’s mRNA in frontotemporal dementia/ALS (FTD/ALS) patients is reduced by 50%, and the expanded repeat and neighboring CpG islands are hypermethylated consistent with transcriptional silencing. Also the cytoplasmic aggregates staining positively for P62 appear to result from protein translation through the hexanucleotide repeat.

This work used induced pluripotent stem cells (iPSCs) derived from C9ALS/FTD patients. They show decreased C9orf72 mRNA, nuclear and cytoplasmic GGGGCC RNA foci, and expression of one RAN product (Gly Pro dipeptide). Neurons derived from the iPSCs also show enhanced sensitivity to glutamic acid excitotoxicity, and a transcriptional profile that ‘partially’ overlaps with transcriptional changes seen in iPSC neurons derived from mutant SOD1 ALS patients.

In addition, some 19 proteins were found which associate with the GGGGCC repeats in vitro. ADARB2 does this and participates in RNA editing.

ASOs (AntiSense Oligonucleotides) were used to suppress C9orf72 RNA expression. This led to reversal in many of the phenotypes of the iPSC neurons (suppression of glutamic acid toxicity, reduction in RNA foci formation). This implies that the GGGGCC repeats trigger toxicity through a gain of function mechanism.

[ Proc. Natl. Acad. Sci. vol. 110 pp. E4530 – E4539 ’13 ] Nuclear RNA foci containing GGGGCC in patient cells (wbc’s, fibroblasts, glia, neurons) were seen in patients with repeat expansion. The foci weren’t present in sporadic ALS or ALS/FTD caused by other mutations (SOD1, TDP43, tau), Parkinsonism, or nonNeurological controls. Antisense oligonucleotides reduced the GGGGCC containing nuclear foci without altering overall C9orf72 RNA levels. siRNAs didn’t work.

The Rx was applied to living mice and it was well tolerated.

[ Proc. Natl. Acad. Sci. vol. 110 pp E4968 – E4977 ’13 ] C9orf72 antisense transcripts are elevated in the brains of those with the expansion. Repeat expansion GGCCCC RNAs accumulate in nuclear foci in the brain. Sense and antisense foci accumulate in the blood and are potential biomarkers. RAN translation occurs in BOTH sense and antisense expansion transcripts — so all 6 proteins described above are made. The proteins accumulate in cytoplasmic aggregates in affected brain regions (e.g. frontal and motor cortex, spinal cord neurons).

[ Nature vol. 507 pp. 175 – 177, 195 – 200 ’14 ] C9orf72 has repeated hexanucleotide units (GGGGCC). Two or more G quartets stacked on top of one another form a G-quadruplex. In the expanded repeats of C9orf72 in ALS and frontotemporal dementia, stable quadruplexes form in DNA as well as the RNA transcribed from it.
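A crude way to see why GGGGCC repeats are candidates for quadruplex formation is to scan for the pattern often used as a first-pass screen: four runs of at least three G's separated by short loops. That pattern is an assumption I'm bringing in for illustration; it is not taken from the Nature paper, and a match says nothing about whether a quadruplex actually forms in a cell.

```python
import re

# Commonly used heuristic for a potential intramolecular G-quadruplex:
# four runs of three or more G's separated by loops of one to seven bases.
# Assumed here for illustration only; it works on DNA (T) or RNA (U) letters.
G4_MOTIF = re.compile(r"(?:G{3,}[ACGTU]{1,7}){3}G{3,}")

def has_g4_motif(seq: str) -> bool:
    """True if the sequence contains at least one match to the heuristic."""
    return G4_MOTIF.search(seq.upper()) is not None

for copies in (2, 3, 4, 30):
    print(copies, has_g4_motif("GGGGCC" * copies))
```

With this pattern, four or more consecutive GGGGCC units are enough to supply the four G runs; two or three are not.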

Sequences which can form G-quadruplexes are conserved during evolution, so they presumably are doing something useful. They are found in transcriptional start sites. This work shows that G-quadruplex assembly in DNA increases transcriptional pauses in the expanded repeat (unsurprising). Also the G-quadruplexes in C9orf72 DNA promote the formation of stable R-loops — triple stranded structures that assemble when a newly formed RNA transcript exiting RNA polymerase II invades the double helix and binds to one DNA strand, displacing the other. If the R-loops aren’t resolved, they can halt transcriptional elongation.

Not only that, but abortive GGGGCC containing RNAs accumulate in the spinal cord and motor cortex of patients with the expanded repeats. The RNAs are truncated in the GGGGCC region, and the amount is linearly proportional to the length of the hexanucleotide repeat. This explains how they could accumulate along with decreased level of full length C9orf72 mRNA (and presumably the protein made from it).

A ‘few dozen’ proteins binding the GGGGCC repeats have been found. One of them is nucleolin, involved in the formation of the ribosome within the nucleolus. It is mislocalized to RNA foci in neurons of the motor cortex of patients with C9orf72 related disease. The lack of mature ribosomes results in the buildup of untranslated mRNA in the cytoplasm.

[ Science vol. 345 pp. 1118 – 1119, 1139 – 1145, 1192 – 1194 ’14 ] Normally the number of GGGGCC repeats in C9orf72 ranges from 2 to 23, with hundreds or even thousands of copies in the disease range. Possibilities:
1. Interference with C9orf72 expression — i.e. loss of function
2. Sponging up RNA binding proteins by the transcript
3. Repeat associated non-ATG translation (RAN translation) in all reading frames (sense and antisense).

A series of stop codons in both the sense and antisense RNAs was engineered every 12 repeats, stopping formation of the dipeptide repeat proteins. The new RNAs still formed the G-quadruplexes, and both RNAs formed RNA foci when expressed in cultured neurons.

Putting them into Drosophila showed that the pure repeats, which are able to form dipeptides, caused degeneration in the fly eye, while the interrupted constructs (producing RNA only) did not. The same was true when expressed in the nervous systems of adult flies. Blocking translation of the RNA partially suppressed the phenotype.

There are 5 possible dipeptide products of RAN of GGGGCC (GA, GP, PA, GR, PR — G = Glycine, P = Proline, A = Alanine, R = Arginine). Then RNAs using alternate codons for the dipeptides were used (so GGGGCC wasn’t present). Expressing Glycine Arginine (GR) or Proline Arginine (PR) was toxic, with Glycine Alanine showing ‘some’ toxicity later in life.
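Why 5 products rather than 6, given three reading frames on each strand? A short sketch makes the bookkeeping explicit: read the sense repeat and its reverse complement in all three frames and treat Gly-Pro and Pro-Gly as the same polymer. The helper names are mine, and the small codon table covers only the codons that occur in these two repeats.

```python
# Only the codons that occur in GGGGCC repeats and their reverse complement.
CODONS = {"GGG": "G", "GGC": "G", "GCC": "A",
          "CCG": "P", "CCC": "P", "CGG": "R"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def dipeptide_units(unit: str) -> set[str]:
    """Dipeptide repeat made in each of the 3 frames of a hexanucleotide."""
    doubled = unit * 2   # two copies are enough to read a codon pair per frame
    units = set()
    for offset in range(3):
        first = CODONS[doubled[offset:offset + 3]]
        second = CODONS[doubled[offset + 3:offset + 6]]
        # Sort the letters so "PG" and "GP" count as the same polymer.
        units.add("".join(sorted(first + second)))
    return units

sense = dipeptide_units("GGGGCC")
antisense = dipeptide_units(reverse_complement("GGGGCC"))
all_products = sense | antisense
print(sorted(sense), sorted(antisense), len(all_products))
```

Letters are printed alphabetically, so "AG" here is the GA of the notes and "AP" is PA. Gly-Pro turns up on both strands, which is why six frames give only five distinct dipeptide repeats.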

Some RNA binding proteins containing low complexity sequences (aka prion-like domains) — these are FUS, EWSR1, TAF15, hnRNPA2 — form polymeric assemblies, which incorporate into hydrogels in vitro. The assemblies are similar to RNA granules. Many of the RNA binding proteins associating with hydrogels have serine arginine (SR) sequences. The SR domain proteins are regulated by phosphorylation on serine, also controlling the association with hydrogels. It is hypothesized that the GR and PR transcripts associate with hydrogels (or similar assemblies such as RNA granules), but are impervious to the regulatory action of the kinases (no serine to phosphorylate), so they might clog up the trafficking of SR domain containing RNA binding proteins moving in and out of the granules to transfer information throughout the cell.

[ Neuron vol 84 pp. 1213 – 1225 ’14 ] Proline Arginine dipeptides are neurotoxic. They form aggregates in nucleoli in experimental systems. Nuclear aggregates were also found in postmortem spinal cord from C9ORF72 ALS and ALS/FTD patients. Intronic GGGGCC transcripts are also toxic. Repeat associated non-ATG translation (RAN translation) is thought to depend on RNA hairpin structures using GC pairing.

[ Cell vol. 158 pp. 967 ’14 (abstract of something to appear in Science) ] Peptides translated from GGGGCC expansions containing arginines (Gly Arg and Pro Arg) are harmful — 3 other dipeptide repeats are harmless. The peptides bind to nucleoli and impede RNA biogenesis. Interestingly Ser-Arg repeat proteins (SR proteins) are important in RNA splicing. The GlyArg and ProArg repeat peptides alter splicing of the amino acid transporter EAAT2, similar to that seen in ALS. Interestingly, the peptides are readily taken up by cells in culture, translocating to the nucleus.

Also a small molecule has been developed which targets GGGGCC RNA expansions. It inhibits translation of the dipeptide repeat proteins from the expansions (see Science vol. 353 pp. 64 ****

GlyPro in CSF is a biomarker of ALS patients with the C9orf72 expansion.

The normal function of C9orf72 isn’t known. It is structurally related to DENN (Differentially Expressed in Normal and Neoplastic cells) proteins, which are GDP/GTP exchange factors for Rab GTPases.

At this point it isn’t known if the proteins generated by RAN are toxic. The protein inclusions are present in unaffected areas of the brain (lateral geniculate) as well as the vulnerable areas (cortex, hippocampus).

The initiation of RNA translation is thought to depend on RNA hairpin structures which use C:G complementary pairing. CAG (but not CAA) repeats undergo RAN translation. Protein aggregates occurred only in brain intestes despite the fact that C9orf72 is expressed all over the body (but expression is highest in brain).

It is possible that antisense RNA could be formed from the opposite strand (e.g. CCCCGG) giving poly pro-ala, poly pro-gly and poly pro-arg.

[ Science vol. 1106 – 1112 ’15 ] Just expressing 66 GGGGCC repeats without an ATG start codon using an AdenoAssociated Virus (AAV) vector in mice was enough to produce neurodegeneration with RNA foci, inclusions of poly GP, GA and GR and TDP43 pathology. There was cortical neuron and cerebellar Purkinje cell loss and gliosis.

[ Nature vol. 525 pp. 36 – 37, 56 – 61, 129 – 133 ’15 ] (GGGGCC)30 was expressed in the Drosophila eye. This leads to the rough eye trait and is easily scored, allowing you to look at the effect of other genes on it. Mutations activating RanGAP suppressed rough eyes. RanGAP binds to GGGGCC on the cytoplasmic face of the nuclear pore. Enhancing nuclear import or suppressing nuclear export of proteins also suppressed neurodegeneration. RanGAP physically interacts with the GGGGCC Hexanucleotide Repeat Expansion resulting in its mislocalization. The mislocalization is found in neurons derived from iPSCs from a patient with C9orf72 type ALS, and also in brain tissue from other patients with C9orf72 ALS.

Nuclear import is impaired due to HRE expression (fly and iPSC derived neurons). The defects can be ‘rescued’ by small molecules and antisense oligonucleotides targeting the HRE G-quadruplexes. This may actually be a way to Rx ALS ! ! ! !

Another paper crossed (GGGGCC)58 flies with missing chromosomal segments. They found a variety of nuclear import factors whose inactivation worsened rough eye.

Expression of constructs of (GGGGCC)8, 28 and 58 lacking an AUG start codon in Drosophila was done. The constructs could only produce Repeat Associated NonAUG translation products (e.g. dipeptides). The dipeptides disrupt nuclear import of fluorescent test substrates and of normal nuclear proteins (notably TDP43). In addition RNA export from the nucleus is also compromised. The deleterious effects could be modified by 18 genetic regions (found by large scale unbiased genetic screening). They coded for components of the nuclear pore complex, nuclear RNA export machinery and nuclear import.

Dipeptides produced from GGGGCC and GGGGCCn’s disrupt the nucleolus, so this may be an additional cause of repeat toxicity.

[ Neuron vol. 88 pp. 892 – 901 ’15 ] A mouse model was made containing the full human C9orf72 repeat, either normal (15 repeats) or expanded (100 – 1,000 repeats), using bacterial artificial chromosomes (BACs). These mice are called C9-BACexpanded. They show widespread RNA foci and RAN translated dipeptides. Nucleolin distribution was altered. However the mice showed normal behavior and there was no neurodegeneration. This is surprising.

[ Nature vol. 535 p. 327’16 (abstr. of Sci. Transl. Med ’16) ] Mice with mutations diminishing or eliminating the function of C9ORF72 (unknown as of 8/13) developed autoimmune disease.

[ Science vol. 351 pp. 1324 – 1329 ’16 ] Two independent mouse lines lacking the ortholog of C9orf72 (3110043021Rik) in all tissues developed normally and aged without any motor neuron disease. Instead they developed progressive splenomegaly and lymphadenopathy with accumulation of engorged macrophagelike cells. There was age related neuroInflammation similar to C9orf72 ALS but not sporadic ALS. There was no evidence of neurodegeneration however.

[ Neuron vol. 90 pp. 427 – 430, 531 – 534, 535 – 550 ’16 ] BAC transgenic mice using patient derived gene constructs expressing (some of? all of?) C9ORF72 are reported.

A germline knockout develops blood abnormalities (splenomegaly, lymphadenopathy and premature death). The data conflict on which of the 5 products of RAN (Repeat Associated NonATG) translation are the most toxic (GP, GA, GR, PA, PR).

In this study, mice with increased levels of repeats (up to 450) showed no evidence of motor neuron disease, and the brain was normal. They at least did have some trouble with cognition.

The second study put in the full C9 gene with 5′ and 3′ flanking sequences. 4 lines of transgenics with repeats ranging from 37 to 500 were characterized. These mice did have peripheral and central neurodegeneration, with motor deficits. There was a decrease in cortical neurons and Purkinje cells. This is the first time any transgenic has shown neurodegeneration. The deficits are reversible with antisense oligonucleotides. There was a disparity in disease expression between male and female mice.

RNA foci and DPR (DiPeptide Repeat) proteins don’t accumulate in the most affected brain regions.

[ Science vol. 353 pp. 647 – 648, 708 – 712 ’16 ] Spt4 is a highly conserved transcription elongation factor which regulates RNA polymerase II processivity (along with its binding partner Spt5). Spt4 is required to transcribe long trinucleotide repeats found in open reading frames, or in non protein coding regions of DNA templates (in S. cerevisiae). Mutations of Spt4 decrease synthesis of (and restore enzymatic activity to) expanded polyQ proteins (in yeast) without affecting genes lacking the excessive CAG repeats. It might also work in nonCAG repeats.

Targeting Spt4 (with antiSense oligonucleotides) reduces production of the C9orf72 expansion associated RNA and protein, and helps neurodegeneration in model systems. Repeat expansions are transcribed in both the sense and antisense directions. Yeast Spt4 (human homolog SUPT4H) is a small evolutionarily conserved zinc finger protein which forms a complex with Spt5, which then binds to RNA polymerase II regulating transcription elongation (pol II processivity).

DRB is an RNA polymerase II inhibitor. The complex of Spt4 and Spt5 homologs in man (SUPT4H, SUPT5H) is called DSIF (DRB Sensitivity Inducing Factor).

Depletion of Spt4 or its binding partner (Spt5) decreases the number of both sense and antisense repeat transcripts and RNA foci. One of the 6 RAN translation products (polyGlyPro) is substantially reduced by Spt4 depletion.

The study was in human c9ALS fibroblasts. However, side effects are certainly possible — in addition to decreasing the expression of C9ORF72, 95% depletion of SUPT4H1 altered (how?) the expression of another 300 genes. In mice deletion of both copies of SUPT4 is embryonic lethal, but deleting one produced no effects up to 18 months of age.

Time for drug chemists to go to the Multiplex

30 – 40% of all the drugs currently in clinical use are thought to target G Protein Coupled Receptors (GPCRs). Just how many GPCRs inhabit our genome isn’t clear. The latest estimate is 850 which is 4.2% of the 20,077 annotated protein genes we have. That being the case, it behooves drug chemists to know everything about them and how they work.

A recent paper [ Cell vol. 166 pp. 907 – 919 ’16 ] shows that a lot of the old thinking about GPCRs is wrong. Binding of a ligand to a GPCR results in a conformational change in its 7 transmembrane segments, so that the parts inside the cell bind to a heterotrimer of proteins which bind (and hydrolyze) GTP — this is the G protein. So far so good. The trimer splits up into its 3 constituents, unimaginatively called alpha, beta and gamma, each of which can act as a messenger that a ligand from outside the cell has landed on a GPCR, binding to other proteins causing all sorts of effects (e.g. can act as a second messenger).

All good things must end, and termination of GPCR signaling was thought to involve phosphorylation of the intracellular segment of the GPCR, binding of another protein (betaArrestin), removal from the cell membrane (so it can no longer bind its extracellular ligand) and then either destruction or recycling back to the cell membrane. So the old paradigm was betaArrestin binding equals the end of signaling.

It was thought that betaArrestin and the G protein competed for binding to the same intracellular amino acids of the GPCR. Not so says this paper. For some GPCRs both can bind, and signaling can continue, even though the complex of GPCR, G protein and betaArrestin is now inside the cell in an endosome. The complex is called the Multiplex. The examples given are GPCRs for parathyroid hormone (PTH) and Thyroid Stimulating Hormone (TSH). Blurry pictures are given of the complex. GPCRs have been divided into several classes and GPCRs for TSH and PTH are class B GPCRs — which contain a long phosphorylatable tail in the cytoplasm. The G protein binds to these GPCRs by its core region, while betaArrestin binds to the tail. Signaling continues apace.

You are alive because the lipid bilayer of your plasma membrane is asymmetric

You are an organism with trillions of cells. A mosquito bit you depositing millions of viruses in your tissues. The virus can reproduce only within one of your cells and it has exploited all sorts of protein-protein chemistry to get in. Antibodies (if you are fortunate enough to have them) can get rid of the extracellular critters. However, 500,000 have made it into the same number of your cells, and are merrily trying to reproduce.

How does the asymmetry of the lipid bilayer of your plasma membrane help you survive? If each virus infected cell killed itself before the virus reproduced, you’d survive. Although 500,000 is a large number, it is less than 1 millionth of your cell total.

Well you do have intracellular defenses against viruses, called the innate immune system. One of them is a protein with the ugly name of gasdermin D. The activated innate immune system (in the form of inflammatory caspases) cleaves gasdermin. This breaks up the inhibition of the amino terminal part of gasdermin by the carboxy terminal part giving a fragment which binds to one particular membrane component (phosphatidyl serine) which makes up 20% of the inner leaflet of the cell membrane. It then forms a large diameter (to a cell 140 Angstroms is quite large) pore in the cell membrane. No cell can survive this, so it dies, releasing cellular contents (probably some viral components but not fully formed ones). For details see [ Nature vol. 535 pp. 111 – 116, 153 – 158 ’16 ].

Wait a minute. The toxic gasdermin fragment is also released. So how come it doesn’t kill everything in sight? Because our cellular membranes keep phosphatidyl serine confined to the inner leaflet, normal cells don’t show it on their exterior, so they can be bathed in gasdermin with no ill effect. What is responsible for this asymmetry? Believe it or not, an ATP consuming enzyme called flippase (about this more later) which takes any phosphatidyl serine it finds on the outer leaflet and schleps it back inside the cell.

There is all sorts of elegant chemistry which explains just how gasdermin binds to phosphatidyl serine and none of the many other phospholipids found on the inner leaflet. There is more elegant chemistry explaining how flippase works (see later).

What chemistry cannot explain, is why organisms would ‘want’ an asymmetric membrane. As soon as you get into the function of a particular compound in an organism, chemistry is powerless to tell you why. Nothing else can explain how a given molecule does what it does on the molecular level but that is not enough for a satisfying explanation.

One further explanation before some hard core cellular biochemistry follows (after ***). Our cells are dying all the time. The lining of your gut is replaced every 5 days. Even the longest lasting element of your blood is gone after half a year, and most other elements are turned over at least once a month. When these cells die, they must be cleaned up, without undue fuss (such as inflammation). The cleaners are cells called macrophages. A dying cell releases chemical signals, actually called ‘eat me’, one of which is phosphatidyl serine found on the membrane fragments of a dead cell. The fact that flippases keep it on the inner leaflet means that macrophages won’t attack a normal cell.

Slick isn’t it?

***

Flippase is a MgATP-dependent aminophospholipid translocase. It localizes phosphatidylserine and phosphatidylethanolamine to the inner membrane leaflet by rapidly translocating them from the outer to the inner leaflet against an electrochemical gradient. The stoichiometry between amino phospholipid translocation and ATP hydrolysis is close to one (how will the cell have enough ATP to do anything else?). The flippase is inhibited by high calcium, by pseudosubstrates such as vanadate, acetylphosphate and para-nitrophenyl phosphate, and by SH reactive reagents such as N-ethylmaleimide and pyridyldithioethylamine (PDA), a specific inhibitor of phospholipid translocation.

[ Proc. Natl. Acad. Sci. vol. 109 pp. 1449 – 1454 ’12 ] P4-ATPases are a subfamily of P-type ATPases. They transport aminophospholipids from the exoplasmic to the cytoplasmic leaflet (and are known as flippases). Man has 14 P4-ATPases, expressed in various cell types. They are thought to be similar to the catalytic subunits of the Ca++ ATPase, and the Na, K ATPase, consisting of cytoplasmic, N, P and A domains and a membrane domain made of 10 transmembrane helices (M1 – M10).

[ Proc. Natl. Acad. Sci. vol. 111 pp. E1334 – E1343 ’14 ] The P4-ATPases are thought to resemble the classic P-type ATPase cation pumps — a transmembrane domain of 10 helices and 3 cytoplasmic domains (P for phosphorylation, N for nucleotide binding and A for actuator). ATP8A2 forms an intermediate phosphorylated on aspartic acid (E2P) and undergoes a catalytic cycle similar to the sodium pump (Na+, K+ ATPase). Dephosphorylation of E2P is activated by the transported substrates phosphatidyl serine (PS) and phosphatidyl ethanolamine (PE), similar to the K+ activation of dephosphorylation in the sodium pump.

PE and PS are 10x as large as the cations transported by the sodium pump. This is known as the giant substrate problem. This work shows that isoleucine #364 (mutated in — patients with the ataxia, retardation and dysequilibrium syndrome Eur. J. Hum. Genet. vol. 21 pp. 281 – 285 ’13 aka CAMRQ syndrome ) forms a hydrophobic gate separating the entry and exit sites of PS. I364 likely directs the sequential formation and annihilation of water filled cavities (as shown by molecular dynamics simulations) allowing transport of the hydrophilic phospholipid head group, in a groove outlined by TMs 1, 2, 4 and 6, with the hydrocarbon chains following passively, still in the membrane lipid phase (and presumably outside the channel) — this must disrupt the hell out of the protein as it passes. They call this the credit card model — only the interaction with part of the molecule is important — just as the magnetic stripe is the only important thing about the credit card.

Another fail safe mechanism used by the cell — readthrough

Nothing is perfect in this world, not even the translation of mRNA into protein. The error rate is one amino acid misincorporated into a protein for every 10,000 or so done correctly — but these results are for one celled organisms (E. coli, yeast). I can’t find a number for mammals, primates etc. etc.

This means that occasionally one of the 3 codons which tell the ribosome to quit (stop codons) will be misread as an amino acid. This is called readthrough, and means that the ribosome will merrily march on producing a much larger protein than coded for by the mRNA until one of two things happens. 1. the ribosome reaches the end of the mRNA and stops. 2. the mRNA contains another stop codon (there are 3). The probability of this is 3/64 per codon. If stop codons are randomly distributed (which they are most certainly not in the protein coding segment of an mRNA) the chance of 100 codons in a row not containing a stop codon is under 1% (.822 % to be exact). So any protein containing more than 100 amino acids is a statistical freak in this sense. Since the 3′ untranslated region (3’UTR) of mRNA doesn’t code for protein, it should have stop codons randomly distributed (there being no selective pressure to keep them away).
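For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes (as the paragraph above does) that each codon is independently one of the 3 stop codons with probability 3/64. The function names are mine, made up for illustration. Run as a script it prints roughly 0.82%, the figure quoted above.

```python
from fractions import Fraction

STOP_CODONS = 3
TOTAL_CODONS = 64  # 4 nucleotides at each of 3 positions


def prob_no_stop(n_codons: int) -> float:
    """Probability that n_codons random codons in a row contain no stop codon."""
    p_not_stop = Fraction(TOTAL_CODONS - STOP_CODONS, TOTAL_CODONS)  # 61/64
    return float(p_not_stop ** n_codons)


def mean_codons_to_first_stop() -> float:
    """Mean of a geometric distribution with success probability 3/64."""
    return TOTAL_CODONS / STOP_CODONS


if __name__ == "__main__":
    print(f"P(no stop in 100 random codons) = {prob_no_stop(100):.3%}")
    print(f"Mean codons until a random stop = {mean_codons_to_first_stop():.1f}")
```

On the same assumption, a readthrough into random sequence should run only about 21 codons on average before hitting the next stop.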

Enter Nature vol. 534 pp. 719 – 723 ’16 — if you attach a 3′ UTR section of an mRNA to a normal protein sequence (mimicking readthrough) you get much less protein. The authors think the 3’UTRs code for peptide sequences destabilizing the attached protein. They don’t know what this might be, so it’s terra incognita for researchers, and a worthwhile PhD project to figure it out. Another example of ‘coding’ by a presumably nonCoding sequence in the genome. It may also tell us something about protein structure.

Why you do and don’t need chemistry to understand why we have big brains

You need some serious molecular biological chops to understand why primates such as ourselves have large brains. For this you need organic chemistry. Or do you? Yes and no. Yes to understand how the players are built and how they interact. No because it can be explained without any chemistry at all. In fact, the mechanism is even clearer that way.

It’s an exercise in pure logic. David Hilbert, one of the major mathematicians at the dawn of the 20th century famously said about geometry — “One must be able to say at all times–instead of points, straight lines, and planes–tables, chairs, and beer mugs”. The relationships between the objects of geometry were far more crucial to him than the objects themselves. We’ll take the same tack here.

So instead of the nucleotides Uridine (U), Adenine (A), Guanine (G), Cytosine (C), we’re going to talk about lock and key and hook and eye.

We’re going to talk about long chains of these four items. The order is crucial. Two long chains of them can pair up only if there are segments on each where the locks on one pair with the keys on the other and the hooks with the eyes. How many possible combinations of the four are there on a chain of 20? Just 4^20 or 2^40 = 1,099,511,627,776. So to get two randomly chosen chains to pair up exactly is pretty unlikely, unless in some way you or the blind Watchmaker chose them to do so.
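Here is a toy Python sketch of the lock and key, hook and eye game, for those who like to see it run. The pairing table (A with U, G with C) is the standard one. The function names are invented for illustration, and the sketch compares the two chains position by position, ignoring the fact that real RNA strands pair in opposite orientations.

```python
# Lock/key and hook/eye: which item pairs with which.
PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_chains(length: int) -> int:
    """Number of distinct chains of the given length built from 4 items."""
    return 4 ** length


def can_pair(chain_a: str, chain_b: str) -> bool:
    """True if every item in chain_a pairs with the item opposite it in chain_b.

    Simplification: chains are compared position by position, with no
    attempt to model strand orientation.
    """
    if len(chain_a) != len(chain_b):
        return False
    return all(PAIRS[a] == b for a, b in zip(chain_a, chain_b))


if __name__ == "__main__":
    print(f"{count_chains(20):,}")      # chains of length 20
    print(can_pair("AUGC", "UACG"))     # every position pairs
    print(can_pair("AUGC", "AUGC"))     # a chain does not pair with itself here
```

For any given chain of 20 there is exactly one partner that pairs at every position, which is why a random match is a one in a trillion affair.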

Now you need a Turing machine to take a long string of these 4 items and turn it into a protein. In the case of the crucial Notch protein the string of locks, keys, hooks and eyes contains at least 5,000 of them, and their order is important, just as the order of letters in a word is crucial for its meaning (consider united and untied).

The cell has tons of such Turing machines (called ribosomes) and lots of copies of strings coding for Notch (called Notch mRNAs).

The more Notch protein around in the developing brain, the more the proliferating precursors to neurons proliferate before differentiating into neurons, resulting in a bigger brain.

The Notch string doesn’t all code for protein. At one end is a stretch of locks, keys, hooks and eyes which bind other strings, which when bound cause the Notch string to be degraded, meaning less Notch and a smaller brain. The other strings are about 20 long and are called microRNAs.

So to get more Notch and a bigger brain, you need to decrease the number of microRNAs specifically binding to the Notch string. One particular microRNA (called miR-143-3p) has it in for the Notch string. So how did primates get rid of miR-143-3p? They have an insert (unique to them) in another string which contains 16 binding sites for miR-143-3p. So this string, called lncND, essentially acts as a sponge for miR-143-3p meaning it can’t get to the Notch string, meaning that neuronal precursor cells proliferate more, and primate brains get bigger.

So can you forget organic chemistry if you want to understand why we have big brains? In the above sense you can. Your understanding won’t be particularly rich, but it will be at a level where chemical explanation is powerless.

No amount of understanding of polyribonucleotide double helices will tell you why a particular choice out of the 1,099,511,627,776 possible strings of 20 will be important. Literally we have moved from physicality to the realm of pure ideas, crossing the Cartesian dichotomy in the process.

Here’s a copy of the original post with lots of chemistry in it and all the references you need to get the molecular biological chops you’ll need.

Why our brains are large: the elegance of its molecular biology

Primates have much larger brains in proportion to their body size than other mammals. Here’s why. The mechanism is incredibly elegant. Unfortunately, you must put a sizable chunk of recent molecular biology under your belt before you can comprehend it. Anyone can listen to Mozart without knowing how to read or write music. Not so here.

I doubt that anyone can start from ground zero and climb all the way up, but here is all the background you need to comprehend what follows. Start here — https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/
and follow the links (there are 5 more articles).

Also you should be conversant with competitive endogenous RNA (ceRNA) — here’s a link — https://luysii.wordpress.com/2014/01/20/why-drug-discovery-is-so-hard-reason-24-is-the-3-untranslated-region-of-every-protein-a-cerna/

Also you should understand what microRNAs are — we’re still discovering all the things they do — here’s the background you need — https://luysii.wordpress.com/2015/03/22/why-drug-discovery-is-so-hard-reason-26-were-discovering-new-players-all-the-time/weith.

Still game?

Now we must delve into the embryology of the brain, something few chemists or nonbiological type scientists have dealt with.

You’ve probably heard of the term ‘water on the brain’. This refers to enlargement of the ventricular system, a series of cavities in all our brains. In the fetus, nearly all our neurons are formed from cells called neuronal precursor cells (NPCs) lining the fetal ventricle. Once formed they migrate to their final positions.

Each NPC has two choices — Choice #1 –divide into two NPCs, or Choice #2 — divide into an NPC and a daughter cell which will divide no further, but which will mature, migrate and become an adult neuron. So to get a big brain make NPCs adopt choice #1.

This is essentially a choice between proliferation and maturation. It doesn’t take many doublings of an NPC to eventually make a lot of neurons. Naturally cancer biologists are very interested in the mechanism of this choice.
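To see how much leverage choice #1 gives, here is a toy Python sketch. It is bookkeeping, not biology: it assumes every NPC makes choice #1 for some number of rounds and then every NPC makes choice #2 for some number of rounds. The function name and the round counts are made up for illustration and come from no paper.

```python
def neurons_produced(choice1_rounds: int, choice2_rounds: int, founders: int = 1) -> int:
    """Neurons made when all NPCs first double, then all divide asymmetrically.

    Choice #1 (NPC -> 2 NPCs) doubles the NPC pool each round.
    Choice #2 (NPC -> NPC + neuron) leaves the pool unchanged and adds
    one neuron per NPC per round.
    """
    npc_pool = founders * 2 ** choice1_rounds
    return npc_pool * choice2_rounds


if __name__ == "__main__":
    for rounds in (10, 11, 12):
        print(rounds, neurons_produced(rounds, choice2_rounds=5))
```

In this toy model each extra round of choice #1 doubles the final neuron count, no matter how many rounds of choice #2 follow.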

Well to make a long story short, there is a protein called NOTCH — vitally important in embryology and in cancer biology which, when present, causes NPCs to make choice #1. So to make a big brain keep Notch around.

Well we know that some microRNAs bind to the mRNA for NOTCH which helps speed its degradation, meaning less NOTCH protein. One such microRNA is called miR-143-3p.

We also know that the brain contains a lncRNA called lncND (ND for Neural Development). The incredible elegance is that there is a primate specific insert in lncND which contains 16 (yes 16) binding sites for miR-143-3p. So lncND acts as a sponge for miR-143-3p meaning it can’t bind to the mRNA for NOTCH, meaning that there is more NOTCH around. Is this elegant or what? Let’s hear it for the Blind Watchmaker, assuming you have the faith to believe in such things.

Fortunately lncND is confined to the brain, otherwise we’d all be dead of cancer.

Should you want to read about this, here’s the reference [ Neuron vol. 90 pp. 1141 – 1143, 1255 – 1262 ’16 ] where there’s a lot more.

Historically, this was one of the criticisms of the Star Wars Missile Defense — the Russians wouldn’t send over a few missiles, they’d send hundreds which would act as sponges to our defense. Whether or not attempting to put Star Wars in place led to Russia’s demise is debatable, but a society where it was a crime to own a copying machine could never compete technically to produce such a thing.

ONTX Good news and semibad news.

The stock I recommended 1 June (ONTX) was up 11% today on a fourfold increase in volume. The rationale based on a Cell paper (vol. 165 pp. 643 – 655 ’16 ) will be found in a copy of the entire post below the ****

It is worth looking at the chart — https://finance.yahoo.com/echarts?s=ONTX+Interactive#{“range”:”1d”,”allowChartStacking”:true}

After a delay in opening, it exploded most of the way up on high volume (for it). Why? My guess is that people looked at the poster of the study in progress using their Ras blocking drug Rigosertib. Who looked? Why some of the 30,000 attendees at the 2016 American Society of Clinical Oncology Annual Meeting in Chicago, Illinois.

Why is this good (aside from the rise)? Assuming the people who bought ONTX were attendees at the convention, these are very informed buyers (e.g. professional oncologists) laying down their long green (e.g. very smart money). In one of the many books I read about the Bernie Madoff Ponzi scheme, the people who invested with him were described as ‘dumb money’. They’d made their pile elsewhere and were babes in the woods when it came to investing.

Why is this also bad for what I predicted? Have a look at the abstract of one of the posters. Here’s a link to it —
http://meetinglibrary.asco.org/content/165681-176

The skinny is that the phase III study I was so excited about began only last December. It likely will be years before the results will be in. So goodbye 10x – 100x pop in the stock right away. Possibly big pharma will be impressed with their work and buy out the company which should also mean a significant gain.

Now 30,000 people can’t crowd around a single poster presentation. The stock is likely to continue moving up on volume this week as word spreads from the people who’ve already bought it and more people see the possibilities.

Here’s the post of 1 June — note that I didn’t own ONTX when I wrote the first post 3 May ’16, but did when I wrote the 1 June post.

*****

In a gambling mood?

If a pair of posters to be presented Monday 6 June at the 2016 American Society of Clinical Oncology Annual Meeting in Chicago, Illinois, contains the results of a phase III clinical trial of rigosertib, and if the results are as good as the paper discussed below suggests, the stock of Onconova Therapeutics (ONTX) will jump by a factor of 10 to 100.

Full disclosure: I own some. The posters may just describe the clinical trial rather than report the results in which case all bets are off. In that case, I’ll just hold the stock until the results are in. This isn’t the ‘pump and dump’ beloved of boiler room operators everywhere. The rationale for the drug and my take on the original paper (3 May ’16) are reproduced below.

Has the great white whale of oncology finally been harpooned?

The ras oncogene is the great white whale of oncology. Mutations in 20 – 40% of cancer turn its activity on so that nothing can turn it off, resulting in cellular proliferation. People have been trying to turn mutated ras off for years with no success.

A current paper [ Cell vol. 165 pp. 643 – 655 ’16 ] describes a new and different way to attack it. Once ras is turned on (either naturally or by mutation) many other proteins must bind to it, to produce their effects — they are called RAS effectors, among which are the uneuphoniously named RAF, RalGDS and PI3K. They bind to activated ras by the cleverly named Ras Binding Domain (RBD) which has 78 amino acids.

The paper describes rigosertib, a not that complicated molecule to the chemist, which inhibits the binding (by resembling the site on ras that the RBD binds to). It is a styryl benzyl sulfone and you can see the structure here — https://en.wikipedia.org/wiki/Rigosertib.

What’s good about it? Well it is in phase III trials for a fairly uncommon form of cancer (myelodysplastic syndrome). That means it isn’t horribly toxic or it wouldn’t have made it out of phase I.

Given the mechanism described, it is possible that Rigosertib will be useful in 20 – 40% of all cancer. Can you say blockbuster drug?

Do you have a speculative bent? Buy the company testing the drug and owning the patent — Onconova Therapeutics. It’s quite cheap — trading at $.40 (yes 40 cents !). It once traded as high as $30.00 — symbol ONTX. I don’t own any (yet), but for the price of a movie with a beer and some wings afterwards you could be the proud owner of 100 shares. If Rigosertib works, the stock will certainly increase more than a hundredfold.

Enough kidding around. This is serious business. In what follows you will find some hardcore molecular biology and cellular physiology showing just what we’re up against. Some of the following is quite old, and probably out of date (like yours truly), but it does give you the broad outlines of what is involved.

The pathway from Ras to the nucleus

The components of the pathway had been found in isolation (primarily because mutations in them were associated with malignancy). Ras was discovered as an oncogene in various sarcoma viruses. Mutations in ras found in tumors left it in a ‘turned on’ state, but just how ras (and everything else) fit into the chain of binding of a growth factor (such as platelet derived growth factor, epidermal growth factor, insulin, etc. etc.) to its receptor on the cell surface to alterations in gene expression wasn’t clear. It is certain to become more complicated, because anything as important as cellular proliferation is very likely to have a wide variety of control mechanisms superimposed on it. Although all sorts of protein kinases are involved in the pathway it is important to remember that ras is NOT a protein kinase.

1. The first step is binding of a growth factor to its receptor on the cell surface. The receptor is usually a tyrosine kinase. Binding of the factor to the receptor causes ‘activation’ of the receptor. Activation usually means increasing the enzymatic activity of the receptor in the tyrosine kinase reaction (most growth factor receptors are tyrosine kinases). The increase in activity is usually brought about by dimerization of the receptor (so it phosphorylates itself on tyrosine).

2. Most activated growth factor receptors phosphorylate themselves (as well as other proteins) on tyrosine. A variety of other proteins have domains known as SH2 (for src homology 2) which bind to phosphorylated tyrosine.

3. A protein called grb2 binds via its SH2 domain to a phosphorylated tyrosine on the receptor. Grb2 binds to the polyproline domain of another protein called sos1 via its SH3 domain. At this point, the uninitiated must find the proceedings pretty hokey, but the pathway is so general (and fundamental) that proteins from yeast may be substituted into the human pathway and still have it work.

4. At last we get to ras. This protein is ‘active’ when it binds GTP, and inactive when it binds GDP. Ras is a GTPase (it can hydrolyze GTP to GDP). Most mutations which make ras an oncogene decrease the GTPase activity of RAS leaving it in a permanently ‘turned on’ state. It is important for the neurologist to know that the gene which is defective in type I neurofibromatosis normally activates the GTPase activity of ras, turning ras off. Deficiencies (in ras inactivation) lead to a variety of unusual tumors familiar to neurologists.

Once RAS has hydrolyzed GTP to GDP, the GDP remains bound to RAS inactivating it. This is where sos1 comes in. It catalyzes the exchange of GDP for GTP on ras, thus activating ras.

5. What does activated ras do? It activates Raf-1 silly. Raf-1 is another oncogene. How does activated ras activate Raf-1 ? Ras appears to activate raf by causing raf to bind to the cell membrane (this doesn’t happen in vitro as there is no membrane). Once ras has done its job of localizing raf to the plasma membrane, it is no longer required. How membrane localization activates raf is less than crystal clear. [ Proc. Natl. Acad. Sci. vol. 93 pp. 6924 – 6928 ’96 ] There is increasing evidence that Ras may mediate its actions by stimulating multiple downstream targets of which Raf-1 is only one.

6. Raf-1 is a protein kinase. Protein kinases work by adding phosphate groups to serine, threonine or tyrosine. In general protein kinases fall into two classes: those phosphorylating on serine or threonine and those phosphorylating on tyrosine. Biochemistry has a well documented series of examples of enzymes being activated (or inhibited) by phosphorylation. The best worked out is the pathway from the binding of epinephrine to its cell surface receptor to glycogen breakdown. There is a whole sequence of one enzyme phosphorylating another which then phosphorylates a third. Something similar goes on between Raf-1 and a collection of protein kinases called MAPKs (mitogen activated protein kinases). These were discovered as kinases activated when mitogens bound to their extracellular receptors. There may be a kinase lurking about which activates Raf (it isn’t Ras which has no kinase activity). Removal of phosphate from Raf (by phosphatases) inactivates it.

7. Raf-1 activates members of the MAPK family by phosphorylating them. There may be several kinases in a row phosphorylating each other. [ Science vol. 262 pp. 1065 – 1067 ’93 ] There are at least three kinase reactions at present at this point. It isn’t known if some can be sidestepped. Raf-1 activates mitogen activated protein kinase kinase (MAPK-K) by phosphorylation (it is called MEK in the ras pathway). MAPK-K activates mitogen activated protein kinase (MAPK) by phosphorylation. Thus Raf-1 is actually mitogen activated protein kinase kinase kinase (sort of like the character in Catch-22 named Junior Junior Junior). (1/06 — I think that Raf-1 is now called BRAF)

8. The final step in the pathway is activation of transcription factors (which turn genes off or on) by MAP kinases by (what else) phosphorylation. Thus the pathway from cell surface is complete.
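The on/off switch described in step 4 is simple enough to caricature in a few lines of Python. This is a sketch under loud assumptions: the class and its names are invented for illustration, and the oncogenic mutation is modeled as a GTPase that doesn’t work at all, whereas the text above says only that the activity is decreased.

```python
from dataclasses import dataclass


@dataclass
class Ras:
    """Toy model of the ras switch: active with GTP bound, inactive with GDP."""

    nucleotide: str = "GDP"
    gtpase_works: bool = True  # False is a crude stand-in for an oncogenic mutation

    @property
    def active(self) -> bool:
        return self.nucleotide == "GTP"

    def sos1_exchange(self) -> None:
        """sos1 catalyzes exchange of bound GDP for GTP, turning ras on."""
        self.nucleotide = "GTP"

    def hydrolyze(self) -> None:
        """The GTPase turns GTP into GDP, turning ras off (if it works)."""
        if self.active and self.gtpase_works:
            self.nucleotide = "GDP"


if __name__ == "__main__":
    for label, ras in (("normal", Ras()), ("mutant", Ras(gtpase_works=False))):
        ras.sos1_exchange()
        ras.hydrolyze()
        print(label, "ras active after one cycle:", ras.active)
```

The normal switch ends the cycle off while the mutant stays on, which is the whole problem in miniature.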

Mind the gap (junction that is)

Gap junctions don’t get much play in pharmacology, or even in neurology, where they are widespread in the central nervous system, linking neurons to neurons, astrocytes to astrocytes. They may get quite a bit more if blocking them is a way of treating metastatic disease (see later).

A bit of background if you’re unfamiliar with them. This is from my notes on Molecular Biology of the Cell 4th Edition p. 1074

The gap junction is a cylindrical oligomer composed of 6 identical rod shaped subunits (called connexins). They have 4 transmembrane segments and two extracellular loops which contain a beta-strand structure (and which are an essential structural basis for the docking of the two connexons). Multiple connexons in a membrane tend to form hexagonal arrays.

The gap junction spans the lipid bilayer creating a channel along the central axis. The pore is made of two such protein hexamers, one from each cell (each called a hemichannel or a connexon), arranged end to end. Different tissues have different specific gap junction proteins (connexins). Man has 14 distinct connexins each encoded by a separate gene (20 homologous proteins in man PNAS 103 pp. 5213 – 5218 ’06). Most cell types express more than one. Connexins are capable of assembling into a heteromeric connexon. Adjacent cells expressing different connexins can form intercellular channels in which the two aligned half-channels are different. Each gap junction can contain a cluster of a few to MANY THOUSANDS of CONNEXONs.

Neuroscientists should be interested in them as they form a functional ‘synapse’ between cells, e.g. a way of transferring information between them. For the aficionado there will be much more at the end. To flog a nearly dead horse, this is yet another way a wiring diagram of the brain won’t help you understand it — gap junctions don’t show up when you’re looking at classic synapses. For details see https://luysii.wordpress.com/2011/04/10/would-a-wiring-diagram-of-the-brain-help-you-understand-it/

A recent paper in Nature implied that cancer cells can form gap junctions with astrocytes (a glial cell of the brain). Usually we think of gap junctions being of the same cell type, but not here apparently.

Then they describe a mechanism for the cancer cell to tweak the astrocyte so it produces something enabling the cancer cell to survive. Here’s what they claim.

[ Nature vol. 533 pp. 493 – 498 ’16 ] Human and mouse breast and lung cancer cells express protocadherin7 (PCDH7) which promotes (how?) the assembly of carcinoma – astrocyte gap junctions made of connexin43. PCDH7 normally is only expressed in brain. It joins the sialyl transferase ST6GALNAC5 and neuroserpin as brain restricted proteins which metastatic cells from breast and lung cancer use to colonize the brain.

Metastatic cells then use the channels to transfer cGAMP to astrocytes, activating the STING pathway, which results in the production of InterferonAlpha (IFNalpha) and Tumor Necrosis Factor (TNF), both paracrine signals. These activate STAT1 and NFkappaB in the metastatic cells, supporting tumor growth and chemoresistance.

Meclofenamate and tonabersat are ‘modulators’ of gap junctions, breaking the loop between metastatic cancer cell and the astrocyte. Adding them to the tissue culture studied in the paper inhibited tumor growth. So here might be a way to treat metastatic cancer, particularly since meclofenamate is an FDA approved generic drug available without a prescription.

I think the mechanism described above is incomplete: why should a tumor cell transfer something to another cell to have it secrete something which makes the original cell use something it already had?

Now for a few of the things gap junctions are doing in the brain.
****

[ Neuron vol. 90 pp. 810 – 823 ’16 ] Many GABAergic interneurons (are there other kinds?) IN VITRO are coupled by gap junctions. This work used dual patch clamp recordings of interneurons IN VIVO. They studied coupled cerebellar Golgi cells, and showed that, in the presence of spontaneous background synaptic activity, electrically coupled cerebellar Golgi cells showed robust milliSecond precision correlated activity. This was further enhanced by sensory stimulation.

The electrical coupling equalized membrane potential fluctuations, so that coupled neurons approach action potential threshold together. They say that something called spike triggered spikelets transmitted through gap junctions conditionally triggered postJunctional spikes, if both neurons were close to threshold.

Spikelets are brief low amplitude potentials which look like action potentials but which are much smaller. A spike cannot be generated without a much larger potential change than provided by a spikelet, because the spikelet voltage is too small to activate the ion channels of electrically excitable membranes.

So gap junctions control the temporal precision and degree of both spontaneous and sensory evoked correlated activity between interneurons, by the cooperative effects of shared synaptic depolarization and spikelet transmission.
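The ‘conditional’ part of the spikelet story can be written as a one-line rule. This is a toy sketch, not the paper’s model: the function name, the 1 mV spikelet and the -50 mV threshold are all assumed numbers chosen only to illustrate the logic.

```python
def postjunctional_spike(pre_spiked: bool,
                         post_mv: float,
                         spikelet_mv: float = 1.0,      # assumed spikelet size
                         threshold_mv: float = -50.0    # assumed spike threshold
                         ) -> bool:
    """Toy rule: a spikelet triggers a spike only in a neuron already near threshold."""
    # No spike in the first neuron means no spikelet crosses the junction.
    if not pre_spiked:
        return False
    # The spikelet is small, so it only matters if the second neuron
    # is already within spikelet_mv of its threshold.
    return post_mv + spikelet_mv >= threshold_mv


print(postjunctional_spike(True, -50.5))  # second neuron close to threshold
print(postjunctional_spike(True, -65.0))  # second neuron near rest
```

Because the coupling also equalizes the membrane potentials of the two cells, the first case (both near threshold at once) happens more often than it would by chance.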

[ Neuron vol. 90 pp. 912 – 913, 1043 – 1056 ’16 ] It has been found that the strength of electrical coupling between neurons in a network is highly variable (even in the same neuron, so it could be coupled at different strengths with each of its partners). Site specific modulation of electrical coupling quickly reconfigures networks of electrically coupled neurons in the retina. Phosphorylation of connexin36 alters its conductivity.

The number of gap junctions determines the strength of electrical coupling between cerebellar Golgi cells. Ultrastructural analysis shows that gap junctions vary widely in size, which also influences coupling strength (according to a computer simulation). These are dendro-dendritic electrical synapses (widespread in the brain between inhibitory interneurons).

Only 18% or so of the channels present at the gap junctions account for the observed strength of electrical transmission between cerebellar Golgi cells.

Somato-somatic junctions occur in the mammalian trigeminal mesencephalic nucleus. Could the excess junctions be acting as adhesion molecules?

In one system, the turnover of gap junction channel proteins is rapid and comparable with that of glutamic acid receptors.

Gap junctions are ‘low pass filters’ (they pass slow fluctuations of membrane potential better than they pass rapid fluctuations). This is why the electrical synapses are inhibitory — each action potential from a Golgi cell consists of a rapid (but brief) depolarizing spike followed by a relatively deep and protracted afterhyperpolarization — which is 200 times longer than the spike — and transmitted much more effectively.
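A back-of-the-envelope sketch shows why a low pass filter favors the afterhyperpolarization over the spike. It treats the junction as a simple first-order (RC) filter, which is my simplification and not the paper’s model. The 1 millisecond spike width and the 10 Hertz cutoff are assumed numbers; only the factor of 200 comes from the text.

```python
import math


def low_pass_gain(freq_hz: float, cutoff_hz: float) -> float:
    """Amplitude gain of a first-order (RC) low-pass filter at freq_hz."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** 2)


def characteristic_freq(duration_s: float) -> float:
    """Rough frequency of an event, treating it as half a cycle of a sine wave."""
    return 1.0 / (2.0 * duration_s)


SPIKE_S = 0.001         # assumed spike width (1 ms), not from the paper
AHP_S = 200 * SPIKE_S   # the text's ratio: the AHP lasts 200 times longer
CUTOFF_HZ = 10.0        # assumed cutoff frequency of the junction

spike_gain = low_pass_gain(characteristic_freq(SPIKE_S), CUTOFF_HZ)
ahp_gain = low_pass_gain(characteristic_freq(AHP_S), CUTOFF_HZ)
print(f"spike gain {spike_gain:.2f}, afterhyperpolarization gain {ahp_gain:.2f}")
```

With these assumed numbers the brief spike is attenuated roughly fiftyfold while the slow afterhyperpolarization gets through nearly intact, so what the neighboring cell mostly sees is the hyperpolarization.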

Inhibition by sparse excitatory input breaks up Golgi network synchronization, because the coupling to adjacent cells is different for each one, causing dispersion of the spikes.

In quietly attentive animals cerebellar Golgi cells generate rhythmic synchronous activity at 8 Hertz. The same behavior is seen in cerebellar slices. The hyperpolarizing electrical post-synaptic potentials (PSPs) are the only synchronizing force. This is the default state, but it can be disrupted by a variety of sensory stimuli (or by movements) which reduce spiking frequency and rhythmicity.

Golgi cells can inhibit thousands of granule cells, and every granule cell gets inhibitory input from 4 – 8 Golgi cells. The transient nature of network desynchronization ‘could’ allow the cerebellar input layer to act as a timing device over the 10 milliSecond to 1 second timescale.

In a gambling mood?

If a pair of posters to be presented Monday 6 June at the 2016 American Society of Clinical Oncology Annual Meeting in Chicago, Illinois, contains the results of a phase III clinical trial of rigosertib, and if the results are as good as those in a paper discussed below, the stock of Onconova Therapeutics (ONTX) will jump by a factor of 10 to 100.

Full disclosure: I own some. The posters may just describe the clinical trial rather than report the results in which case all bets are off. In that case, I’ll just hold the stock until the results are in. This isn’t the ‘pump and dump’ beloved of boiler room operators everywhere. The rationale for the drug and my take on the original paper (3 May ’16) are reproduced below.

Has the great white whale of oncology finally been harpooned?

The ras oncogene is the great white whale of oncology. Mutations in 20 – 40% of cancer turn its activity on so that nothing can turn it off, resulting in cellular proliferation. People have been trying to turn mutated ras off for years with no success.

A current paper [ Cell vol. 165 pp. 643 – 655 ’16 ] describes a new and different way to attack it. Once ras is turned on (either naturally or by mutation) many other proteins must bind to it, to produce their effects — they are called RAS effectors, among which are the uneuphoniously named RAF, RalGDS and PI3K. They bind to activated ras by the cleverly named Ras Binding Domain (RBD) which has 78 amino acids.

The paper describes rigosertib, a not that complicated molecule to the chemist, which inhibits the binding (by resembling the site on ras that the RBD binds to). It is a styryl benzyl sulfone and you can see the structure here — https://en.wikipedia.org/wiki/Rigosertib.

What’s good about it? Well it is in phase III trials for a fairly uncommon form of cancer (myelodysplastic syndrome). That means it isn’t horribly toxic or it wouldn’t have made it out of phase I.

Given the mechanism described, it is possible that Rigosertib will be useful in 20 – 40% of all cancer. Can you say blockbuster drug?

Do you have a speculative bent? Buy the company testing the drug and owning the patent — Onconova Therapeutics. It’s quite cheap — trading at $.40 (yes 40 cents !). It once traded as high as $30.00 — symbol ONTX. I don’t own any (yet), but for the price of a movie with a beer and some wings afterwards you could be the proud owner of 100 shares. If Rigosertib works, the stock will certainly increase more than a hundredfold.

Enough kidding around. This is serious business. In what follows you will find some hardcore molecular biology and cellular physiology showing just what we’re up against. Some of the following is quite old, and probably out of date (like yours truly), but it does give you the broad outlines of what is involved.

The pathway from Ras to the nucleus

The components of the pathway had been found in isolation (primarily because mutations in them were associated with malignancy). Ras was discovered as an oncogene in various sarcoma viruses. Mutations in ras found in tumors left it in a ‘turned on’ state, but just how ras (and everything else) fit into the chain of binding of a growth factor (such as platelet derived growth factor, epidermal growth factor, insulin, etc. etc.) to its receptor on the cell surface to alterations in gene expression wasn’t clear. It is certain to become more complicated, because anything as important as cellular proliferation is very likely to have a wide variety of control mechanisms superimposed on it. Although all sorts of protein kinases are involved in the pathway it is important to remember that ras is NOT a protein kinase.

1. The first step is binding of a growth factor to its receptor on the cell surface. The receptor is usually a tyrosine kinase. Binding of the factor to the receptor causes ‘activation’ of the receptor. Activation usually means increasing the enzymatic activity of the receptor in the tyrosine kinase reaction (most growth factor receptors are tyrosine kinases). The increase in activity is usually brought about by dimerization of the receptor (so it phosphorylates itself on tyrosine).

2. Most activated growth factor receptors phosphorylate themselves (as well as other proteins) on tyrosine. A variety of other proteins have domains known as SH2 (for src homology 2) which bind to phosphorylated tyrosine.

3. A protein called grb2 binds via its SH2 domain to a phosphorylated tyrosine on the receptor. Grb2 binds to the polyproline domain of another protein called sos1 via its SH3 domain. At this point, the uninitiated must find the proceedings pretty hokey, but the pathway is so general (and fundamental) that proteins from yeast may be substituted into the human pathway and still have it work.

4. At last we get to ras. This protein is ‘active’ when it binds GTP, and inactive when it binds GDP. Ras is a GTPase (it can hydrolyze GTP to GDP). Most mutations which make ras an oncogene decrease the GTPase activity of RAS leaving it in a permanently ‘turned on’ state. It is important for the neurologist to know that the gene which is defective in type I neurofibromatosis normally activates the GTPase activity of ras, turning ras off. Deficiencies (in ras inactivation) lead to a variety of unusual tumors familiar to neurologists.

Once RAS has hydrolyzed GTP to GDP, the GDP remains bound to RAS, inactivating it. Reversing this is the function of sos1. It catalyzes the exchange of GDP for GTP on ras, thus activating ras.

5. What does activated ras do? It activates Raf-1 silly. Raf-1 is another oncogene. How does activated ras activate Raf-1 ? Ras appears to activate raf by causing raf to bind to the cell membrane (this doesn’t happen in vitro as there is no membrane). Once ras has done its job of localizing raf to the plasma membrane, it is no longer required. How membrane localization activates raf is less than crystal clear. [ Proc. Natl. Acad. Sci. vol. 93 pp. 6924 – 6928 ’96 ] There is increasing evidence that Ras may mediate its actions by stimulating multiple downstream targets of which Raf-1 is only one.

6. Raf-1 is a protein kinase. Protein kinases work by adding phosphate groups to serine, threonine or tyrosine. In general protein kinases fall into two classes: those phosphorylating on serine or threonine and those phosphorylating on tyrosine. Biochemistry has a well documented series of examples of enzymes being activated (or inhibited) by phosphorylation. The best worked out is the pathway from the binding of epinephrine to its cell surface receptor to glycogen breakdown. There is a whole sequence of one enzyme phosphorylating another which then phosphorylates a third. Something similar goes on between Raf-1 and a collection of protein kinases called MAPKs (mitogen activated protein kinases). These were discovered as kinases activated when mitogens bound to their extracellular receptors. There may be a kinase lurking about which activates Raf (it isn’t Ras which has no kinase activity). Removal of phosphate from Raf (by phosphatases) inactivates it.

7. Raf-1 activates members of the MAPK family by phosphorylating them. There may be several kinases in a row phosphorylating each other. [ Science vol. 262 pp. 1065 – 1067 ’93 ] There are at least three kinase reactions at present at this point. It isn’t known if some can be sidestepped. Raf-1 activates mitogen activated protein kinase kinase (MAPK-K) by phosphorylation (it is called MEK in the ras pathway). MAPK-K activates mitogen activated protein kinase (MAPK) by phosphorylation. Thus Raf-1 is actually mitogen activated protein kinase kinase kinase (sort of like the character in Catch-22 named Junior Junior Junior). (1/06: I think that Raf-1 is now called BRAF)

8. The final step in the pathway is activation of transcription factors (which turn genes off or on) by MAP kinases by (what else) phosphorylation. Thus the pathway from cell surface is complete.
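For those who think better in code, the eight steps above can be boiled down to a toy switch. This is a cartoon under loud assumptions: the class and function names are made up, each step is all-or-none, and everything the text calls unclear (how raf is activated at the membrane, other ras targets, phosphatases) is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Ras:
    gtpase_works: bool = True  # False models an oncogenic mutation (step 4)
    bound: str = "GDP"         # "GDP" means off, "GTP" means on

    @property
    def active(self) -> bool:
        return self.bound == "GTP"

    def exchange(self) -> None:
        """sos1 swaps GDP for GTP on ras, turning it on."""
        self.bound = "GTP"

    def hydrolyze(self) -> None:
        """ras turns itself off by hydrolyzing GTP, if its GTPase works."""
        if self.gtpase_works:
            self.bound = "GDP"


# Steps 5 to 8: each activated component activates the next one.
CASCADE = ["Raf-1", "MAPK-K (MEK)", "MAPK", "transcription factors"]


def signal(ras: Ras, growth_factor: bool) -> List[str]:
    """Return the downstream components switched on in one round of signaling."""
    if growth_factor:
        # Steps 1 to 3: the activated receptor recruits grb2, which brings sos1.
        ras.exchange()
    return list(CASCADE) if ras.active else []


normal, mutant = Ras(), Ras(gtpase_works=False)
for ras in (normal, mutant):
    signal(ras, growth_factor=True)   # growth factor present: pathway on
    ras.hydrolyze()                   # growth factor gone: ras tries to shut off
    print(ras.gtpase_works, signal(ras, growth_factor=False))
```

The normal ras shuts the cascade off once the growth factor is gone; the mutant keeps everything downstream switched on with no growth factor at all, which is the whole problem.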
