Technology marches on — or does it?

Technology marches on — perhaps.  But it certainly did in the following Alzheimer’s research [ Neuron vol. 104 pp. 256 – 270 ’19 ] .  The work used (1) CRISPR (2) iPSCs (3) transcriptomics (4) translatomics to study Alzheimer’s.  Almost none of this would have been possible 10 years ago.

Presently over 200 mutations are known in (1) the amyloid precursor protein (APP), (2) presenilin1 and (3) presenilin2.  The presenilins are components of the gamma secretase complex which cleaves APP on the way to the major components of the senile plaque, Abeta40 and Abeta42.

There’s a lot of nomenclature, so here’s a brief review.  The amyloid precursor protein (APP) comes in 3 isoforms containing 770, 751 and 695 amino acids.  APP is embedded in the plasma membrane with most of the amino acids extracellular.  APP is first cut just outside the membrane by one of two enzymes.  If beta secretase makes the cut, a large amino terminal fragment is released and a small carboxy terminal fragment (which the paper calls beta-CTF) stays behind in the membrane.  The crucial enzyme for the next step is gamma secretase, which is made of 4 proteins, one of which is a presenilin (either presenilin1 or presenilin2).  Gamma secretase cleaves beta-CTF inside the membrane, and Abeta40 and Abeta42 are formed.  If alpha secretase (a third enzyme) makes the first cut instead, it cuts within the Abeta sequence, and Abeta42 is not formed.   Got all that?

Where do CRISPR and iPSCs come in?  iPSC stands for induced pluripotent stem cell; iPSCs can be made from cells in your skin (but not easily).  Subsequently adding the appropriate witches’ brew can cause them to differentiate into a variety of cells (cortical neurons in this case).

CRISPR was then used to introduce mutations characteristic of familial Alzheimer’s disease into either APP or presenilin1.  Some 16 cell lines each containing a different familial Alzheimer disease mutation were formed.

Then the iPSCs were differentiated into cortical neurons, and the mRNAs (transcriptomics) and proteins made from them (translatomics) were studied.

Certainly a technological tour de force.

What did they find?  Both the APP and the presenilin1 mutations had effects on Abeta peptide production (but the effects differed).  Both, however, increased the accumulation of beta-CTF.  This could be ‘rescued’ by inhibition of beta-secretase, but unfortunately clinical trials have not shown beta-secretase inhibitors to be helpful.

What did increased beta-CTF actually do?  There was enlargement of early endosomes in all the cell lines.   How this produces Alzheimer’s disease is anyone’s guess.

Also quite interesting is the fact that translatomics and transcriptomics of all 16 cell lines showed ‘dysregulation’ of genes which have been associated with Alzheimer’s disease risk, including APOE, CLU and SORL1.

Certainly a masterpiece of technological virtuosity.

So technology gives us bigger and better results

Or does it?

There was a very interesting paper on the effect of sleep on cerebrospinal fluid and blood flow in the brain [ Science vol. 366 pp. 372 – 373 ’19 ].  It contained the following statement:

“During slow wave sleep, the cerebral blood flow is reduced by 25%, which lowers cerebral blood volume by ~10%.”  The reference for this statement was work done in 1991.

I thought this was a bit outre, so I wrote one of the authors.

Dr. X “Isn’t there something more current (and presumably more accurate) than reference #3 on cerebral blood flow in sleep?  If there isn’t, the work should be repeated”

I got the following back “The old studies are very precise, more precise than current studies.”

Go figure.

18 at one blow said the molecular biologist

With apologies to the brothers Grimm, molecular biologists may have found a way to treat 18 genetic diseases at one blow [ Cell vol. 170 pp. 899 – 912 ’17 ]. They use adeno-associated virus (AAV) packing a modified enzyme and an RNA to remove repeat expansions from RNA.   The paper gives a list of the 18, all but one of which are neurologic.  They include such horrors as Huntington’s chorea, the most common form of familial ALS, 3 forms of spinocerebellar ataxia and 6 forms of spinocerebellar atrophy.

They use Cas9 from Streptococcus pyogenes, part of the CRISPR system (https://en.wikipedia.org/wiki/CRISPR) that bacteria use to defend themselves against viruses, with a single guide RNA.  Even more interestingly, Cas9 is an enzyme which cuts its target (here RNA rather than the usual DNA), but the Cas9 they used is catalytically dead.  They think that just binding to the aggregated RNA containing the repeats is enough to break up the aggregate.  This is the way antiSense oligoNucleotides are thought to work.

The problem of getting a bacterial enzyme into a human cell is avoided here by using a virus (AAV) to infect the cells.  It did get rid of RNA aggregates in patients’ cells from 4 of the diseases (two myotonic dystrophies, and the familial ALS).

It is almost too fantastic to be true.

Why almost all of these repeat expansion diseases affect the nervous system is anyone’s guess.  As you can imagine, theories abound.  So all we have to do is figure out how to get the therapy into the brain (hardly a small task).

Is that mutation significant?

Face it, our genomes are a real mess. A study of just the parts of the genome coding for amino acids (2% at most) in about 2,500 people found an average of 205 variants which change the amino acid coded for IN EACH PERSON. Each person also had an average of 3 termination codons in the 15,000+ protein coding sequences they studied. So they are wandering around with 3 abnormally short proteins. You can read more about it in this old post: https://luysii.wordpress.com/2012/07/31/how-badly-are-thy-genomes-oh-humanity/

Here’s the problem — these people were healthy. Obviously, not a problem for them, but a big problem for physicians attempting to do genetic counseling. For how it affected epilepsy counseling see — https://luysii.wordpress.com/2011/07/17/weve-found-the-mutation-causing-your-disease-not-so-fast-says-this-paper/.

This brings us to Lynch syndrome (aka Hereditary NonPolyposis Colorectal Cancer, HNPCC). It is a familial cancer syndrome, and we now know what the problem is: mutations in any of four genes involved in one type of DNA repair (there are many). The genes are called MSH2, MSH6, MLH1 and PMS2 (acronyms whose full names you don’t need to know) and the type of repair is called MisMatch Repair (MMR).

This isn’t academic at all. Suppose your aunt comes down with colon cancer and you get tested for mutations in one of the four, and a mutation is found. You’re fine now. The question before the house is — should you have your colon out? Colonoscopy won’t help because this kind of colon cancer doesn’t arise from polyps (which is what colonoscopy is looking for).

The problem is that the 4 genes are ‘peppered’ with missense variants (which change the amino acid coded for). They are called VUS (Variants of Unknown Significance). The following paper [ Proc. Natl. Acad. Sci. vol. 113 pp. 3918 – 3920, 4128 – 4133 ’16 ] used a clever way to test a VUS for significance. This would have been impossible 5 years ago. What they did was use CRISPR to introduce the variant into the appropriate gene in mouse Embryonic Stem cells. Then they tested the manipulated stem cells for defects in MisMatch Repair. They tested 59 (yes fifty-nine) such VUSs and found that about 1/3 (19) produced MMR defects.

Fascinating time to be alive and reading about all this stuff.

Activating a proto-oncogene without mutating it

Many proto-oncogenes have to be mutated to cause cancer. Not so the TAL1 and LMO2 genes. They drive blood formation, and are aberrantly activated (i.e. more protein is made from them) in T cell Acute Lymphoblastic Leukemia (TALL). The authors of [ Science vol. 351 pp. 1298 – 1299, 1454 – 1458 ’16 ] activated them experimentally using the CRISPR technique, and therein hangs a tale.

Addendum 11 April: LMO2 is well known to gene therapists, as early work (2002) using retroviruses inserted randomly in the genome to cure SCID (Severe Combined Immunodeficiency) resulted in TALL in 4 kids.  The problem was that the vector integrated in multiple sites all over the genome and one such random site turned on expression of LMO2.

I’ve written a series of six posts trying to imagine the incredible mass of DNA in a 10 micron nucleus on a human scale — we take it for granted, but it’s far from obvious how this is accomplished — here’s the link to the first — https://luysii.wordpress.com/2010/03/22/the-cell-nucleus-and-its-dna-on-a-human-scale-i/. — just follow the links to the rest.

[ Cell vol. 153 pp. 1187 – 1189, 1281 – 1295 ’13 ] Hi-C and 5C (Carbon Copy Chromosome Conformation Capture) allow determination of chromatin organization and long range chromatin interactions in an unbiased genome wide manner at the megaBase scale. Topologically associated domains (TADs) are the way the genome in the nucleus is organized into megabase to submegaBase sized interacting domains. TADs are conserved between species and are invariant across cell types. [ Cell vol. 156 p. 19 ’14 ] They average 700 – 800 kiloBases and are said to contain 5 – 10 protein coding genes and a few hundred enhancers. The expression of genes within a TAD is ‘somewhat correlated’. Some TADs have active genes, while others have repressed genes. Genomic interactions are strong within a domain, but are sharply depleted on crossing the boundary between two TADs.

Well, TADs have to be separated from each other. The current thinking is that the boundaries are formed by sites in the DNA which bind the CTCF protein, and possibly cohesin proteins as well. CTCF is a large protein (although maddeningly I can’t seem to find out how many amino acids it has) with a molecular mass of 80 kiloDaltons. Its DNA binding is quite specific as it contains 11 zinc fingers (each of which can specifically bind a 3 nucleotide stretch of DNA). In addition to binding to DNA it can bind to itself, making it a perfect way to form loops of DNA.

All the Science paper did was to delete a few CTCF binding sites using the CRISPR technique around the two oncogenes and bang — expression increased. Why?  Because the insulation between the TAD containing the genes and adjacent TADs was broken, allowing control of the genes by enhancers in the new and larger TAD that had been previously sequestered in an adjacent TAD.  The deletions were thousands of basepairs away from the coding sequence of the genes themselves.  All very nice, but it’s fairly artificial.
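The insulation argument can be put as a toy model. What follows is only a sketch under stated assumptions: the coordinates, the enhancer names and the rule "an enhancer reaches a gene only if no boundary lies between them" are all invented for illustration, and none of it comes from the Science paper.

```python
from bisect import bisect_right

def tad_index(position, boundaries):
    """Return the index of the TAD containing position (boundaries must be sorted)."""
    return bisect_right(boundaries, position)

def reachable_enhancers(gene_pos, enhancers, boundaries):
    """Toy rule: an enhancer can act on the gene only if both sit in the same TAD."""
    sorted_bounds = sorted(boundaries)
    gene_tad = tad_index(gene_pos, sorted_bounds)
    return [name for name, pos in enhancers.items()
            if tad_index(pos, sorted_bounds) == gene_tad]

# Hypothetical positions in kilobases, made up for this example
boundaries = [0, 800, 1600]                # CTCF boundary sites
enhancers = {"enh_A": 700, "enh_B": 900}   # enh_B sits in the neighboring TAD
gene_pos = 750

before = reachable_enhancers(gene_pos, enhancers, boundaries)

# "Delete" the CTCF site at 800, merging the two TADs into one larger one
after = reachable_enhancers(gene_pos, enhancers,
                            [b for b in boundaries if b != 800])

print("before deletion:", before)
print("after deletion: ", after)
```

In the toy, the gene sees only enh_A until the boundary at 800 is removed, after which enh_B (previously sequestered next door) can reach it as well. Real chromatin is obviously not this tidy.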

However the paper notes that across a large pan-cancer cohort, there was a 2 fold enrichment for boundary CTCF site mutations.

That’s not a bug — that’s a feature

Back in the early days of computers you could own (aka personal computers) it wasn’t point and click, but hunt and peck, where commands in the early operating systems (DOS, etc.) had to be typed onto the command line using a keyboard. The interfaces were far from intuitive, to say the least, and the unexpected was always expected. When things went south software designers quickly learned to say “That’s not a bug, that’s a feature!”

Essentially the same thing has happened to the latest and greatest tool in genetic engineering, the CRISPR system. It’s fascinating that it has been hiding in plain sight for FOUR decades. In med school in the mid-60s the basic book about heredity and DNA was “Sexuality and the Genetics of Bacteria” (1961) by Francois Jacob. No one had any idea that DNA would be sequenced. Bacterial viruses (called bacteriophages back then) were studied.

No one had any idea that bacteria could defend themselves against viruses, but defend they do by their CRISPR system. It’s only been known for a decade. Earlier papers on the subject by 3 different authors (Mojica, Gilles Vergnaud, Alexander Bolotin) were rejected before eventual publication.

Briefly, when a bacterium is infected by a virus, it makes a copy of fragments of the virus’s DNA, and pastes them into its own genome. On subsequent invasions, it uses the DNA copy to make RNA, which along with a complex enzyme binds to the genome of the new invader, and destroys it.

It turns out that a PAM (Protospacer Adjacent Motif) is crucial for the whole system to work. The copy stored in the bacterial DNA doesn’t have such a sequence next to it, so the system doesn’t attack its own genome. The PAM isn’t large (just 3 nucleotides in a row) and the system looks for it in invading viral DNA double helices.
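The remember-then-search logic above can be sketched in a few lines. This is a toy under loud assumptions, not how Cascade actually works: the sequences are invented, the 8 nucleotide spacers are far shorter than real ones, only one strand is searched, and putting the PAM immediately before the match is a choice made for the sketch. The function names are hypothetical.

```python
def acquire_spacer(memory, invader, start, length=8):
    """Copy a fragment of the invader's DNA into the bacterium's memory."""
    memory.append(invader[start:start + length])

def find_targets(dna, spacers, pams=("ATG",)):
    """Return (position, spacer) for each remembered spacer found in dna
    with an accepted PAM in the 3 nucleotides just before it."""
    hits = []
    for spacer in spacers:
        start = dna.find(spacer)
        while start != -1:
            pam = dna[max(0, start - 3):start]
            if pam in pams:           # no PAM, no attack
                hits.append((start, spacer))
            start = dna.find(spacer, start + 1)
    return hits

# Invented sequences for illustration only
virus = "GGCATGACGTTACGGATCC"
memory = []
acquire_spacer(memory, virus, 6)      # remembers "ACGTTACG"

# The bacterium's own copy of the spacer has no PAM in front of it
own_genome = "TTTCCC" + memory[0] + "CCC"

virus_hits = find_targets(virus, memory)
self_hits = find_targets(own_genome, memory)
print("virus:", virus_hits)
print("self: ", self_hits)
```

The same remembered sequence is attacked in the virus and ignored in the bacterium's own genome, purely because of the 3 nucleotides next door. Passing several motifs in pams gives a crude picture of a system that tolerates more than one PAM.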

But where does it look? On the side of the double helix with the least information — the minor groove

Look at the following http://pharmafactz.com/wp/wp-content/uploads/2014/11/watson-crick-base-pairing.jpg

It shows classic Watson Crick base pairing. The major groove is a lot bigger than the minor groove (taking up 210 degrees, hardly a groove) and has more chemical information. So binding to the major groove is likely to be far more accurate (as well as easier because it’s a larger space).

So why does E. coli do this? Because different viruses contain different PAM sequences. [ Nature vol. 530 pp. 499 – 503 ’16 ] This is the crystal structure of the E. coli Cascade complex (the business end of CRISPR) bound to a foreign double stranded DNA target. The 5′ ATG PAM is recognized in duplex form, from the minor groove side, by 3 structural features in the Cse1 subunit of Cascade. The promiscuity inherent to minor groove DNA recognition explains how a single Cascade complex can respond to several distinct PAM sequences. This is a feature, not a bug.

The twists and turns of topoisomerase (pun intended)

It is very sad that my late friend Nick Cozzarelli isn’t around to enjoy the latest exploits of the enzyme class he did so much great work on — the topoisomerases. For a social note about him see the end of the post.

We tend to be quite glib about just what goes on inside a nucleus when DNA is opened up and transcribed into mRNA by RNA polymerase II (Pol II). We think of DNA as a linear sequence of 4 different elements (which it is) and stop there. But DNA is a double helix, and the two strands of the helix wind around each other every 10 elements (nucleotides), meaning that within the confines of our nuclei this happens 320,000,000 times.

I’ve written a series of six posts on what we would see if our nuclei were blown up to a human scale.  Our DNA must be compacted by a factor of 100,000 to fit inside the nucleus, which is 10 microns (10 millionths of a meter) in diameter, since fully extended our DNA would be 1 meter long.  So if you compacted the distance from New York to Seattle (2840 miles or 14,995,200 feet) down by this factor you’d get a sphere 150 feet in diameter, or half the length of a football (US) field.  Now imagine blowing up the diameter and length of the DNA helix on the same scale (1 meter of DNA stretched to 2,840 miles, a factor of roughly 4.6 million) and you’d get something looking like a 2,840 mile long strand of linguini which twists on itself 320,000,000 times.  The two strands are 3/8th of an inch thick.  They twist around each other every 9/16ths of an inch.
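For anyone who wants to check the arithmetic of the analogy, here is a rough sketch. The 1 meter, 10 micron and 2,840 mile figures are from the paragraph above, and the 20 Angstrom helix thickness is the figure quoted for the DNA helix further down. The 3.4 Angstroms of helix length per nucleotide is my assumption (a textbook figure, not something stated in this post), and the variable names are just for illustration.

```python
FEET_PER_MILE = 5280
METERS_PER_FOOT = 0.3048
INCHES_PER_METER = 1 / 0.0254

# Figures taken from the post
dna_length_m = 1.0             # fully extended DNA
nucleus_diameter_m = 10e-6     # 10 microns
ny_to_seattle_miles = 2840
helix_width_m = 20e-10         # 20 Angstroms

# Assumption, not from the post: 3.4 Angstroms per nucleotide along the helix
rise_per_nt_m = 3.4e-10
nt_per_turn = 10

# How much the DNA is compacted to fit in the nucleus
compaction = dna_length_m / nucleus_diameter_m

# Compacting New York to Seattle by the same factor
ny_to_seattle_feet = ny_to_seattle_miles * FEET_PER_MILE
sphere_feet = ny_to_seattle_feet / compaction

# Blow-up factor implied by stretching 1 meter of DNA to 2,840 miles
blow_up = ny_to_seattle_feet * METERS_PER_FOOT / dna_length_m

strand_inches = helix_width_m * blow_up * INCHES_PER_METER
turn_inches = rise_per_nt_m * nt_per_turn * blow_up * INCHES_PER_METER
nucleus_feet = nucleus_diameter_m * blow_up / METERS_PER_FOOT

print(f"compaction factor:        {compaction:,.0f}")
print(f"compacted NY to Seattle:  {sphere_feet:.0f} feet")
print(f"blow-up factor:           {blow_up:,.0f}")
print(f"helix thickness at scale: {strand_inches:.2f} inches")
print(f"one helical turn at scale:{turn_inches:.2f} inches")
print(f"nucleus at scale:         {nucleus_feet:.0f} feet")
```

Under these assumptions the blown-up nucleus comes out at about 150 feet and the helix a bit over a third of an inch thick, which agrees with the linguini picture. The length of one turn comes out near 0.6 inch, in the same ballpark as 9/16ths.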

For the gory details start at https://luysii.wordpress.com/2010/03/22/the-cell-nucleus-and-its-dna-on-a-human-scale-i/ and follow the links.

Well, we know that for DNA to be copied into mRNA it must be untwisted, the strands separated so RNA polymerase II (Pol II) can get to it.  Pol II is enormous — a mass of 500 kiloDaltons and 7 times thicker at 140 Angstroms than the DNA helix of 20 Angstrom thickness.

Consider the fos gene (which we’ll be talking about later). It codes for a protein of 380 amino acids (meaning that the gene contains at least 1140 nucleotides). The actual gene is longer because of introns (3,461 nucleotides), which means that the gene contains 346 complete turns of the double helix, all of which must be unwound to transcribe it into mRNA.
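The numbers in that paragraph are simple enough to check by machine. A minimal sketch follows: the 380 amino acids, the 3,461 nucleotides and the 10 nucleotides per turn are from the post, while the 3.2 billion nucleotide genome is my assumption, chosen because it is the size that gives the 320,000,000 turns quoted earlier. The function names are hypothetical.

```python
NT_PER_CODON = 3
NT_PER_TURN = 10    # one turn of the double helix every 10 nucleotides

def min_coding_nucleotides(amino_acids):
    """Fewest nucleotides needed to code for a protein (stop codon ignored)."""
    return amino_acids * NT_PER_CODON

def complete_turns(nucleotides):
    """Number of complete helical turns in a stretch of DNA."""
    return nucleotides // NT_PER_TURN

fos_coding_nt = min_coding_nucleotides(380)
fos_turns = complete_turns(3461)

# Assumed genome size of 3.2 billion nucleotides (not stated in the post)
genome_turns = complete_turns(3_200_000_000)

print(f"fos coding nucleotides: {fos_coding_nt:,}")
print(f"fos helical turns:      {fos_turns:,}")
print(f"genome helical turns:   {genome_turns:,}")
```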

So it’s time for an experiment. Get about 3 feet of cord roughly 3/8 of an inch thick. Tie the ends together, loop one end around a hook in your closet, put a pencil in the other end and rotate it about 100 times (or until you get tired). Keeping everything the same, have a friend put another pencil between the two strands in the middle, separating them. Now pull on the strands to make the separation wider and move the middle pencil toward one end. In the direction of motion the strands will coil even tighter (supercoiling) and behind they’ll unwind.

This should make it harder for Pol II to do its work (or for enzymes which copy DNA to more DNA). This is where the various topoisomerases come in. They cut DNA, allowing supercoils to unwind. They remain attached to the DNA they cut so that the DNA can be put back together. There are basically two classes of topoisomerase: type I topoisomerase cuts one strand, leaving the other intact, while type II cuts both.

Who would have thought that type II topoisomerase would be involved in the day to day function of our brain?

Neurons are extended things, with information flowing from dendrites on one side of the cell body to much longer axons on the other. The flow involves depolarization of the cell body as impulses travel toward the axon. We know that certain genes are turned on by this activity (i.e. the DNA coding for the protein is transcribed into mRNA which is translated into protein by the ribosome). They are called activity dependent genes.

This is where [ Cell vol. 1496 – 1498, 1592 – 1605 ’15 ] comes in. Prior to neuronal activity, when activity dependent genes are expressed at low levels, the genes still show the hallmarks of highly expressed genes (e.g. binding by transcription factors and RNA polymerase II, Histone H3 trimethylation of lysine #4 {H3K4Me3 } at promoters).

This work shows that such genes are highly negatively supercoiled (see above), preventing RNA polymerase II (Pol II) from extending into the gene body. On depolarization of the cell body, Topoisomerase IIB is activated in some way, leading to double strand breaks (dsbs) within promoters, allowing the DNA to unwind and Pol II to productively elongate through gene bodies.

There is evidence that neuronal stimulation leads to dsbs ( Nature NeuroScience vol. 16 pp. 613 – 621 ’13 ) throughout the transcription of immediate early genes (i.e. genes turned on by neural activity). The evidence is that there is phosphorylation of serine #139 on histone variant H2AX (gammaH2AX), which is a chromatin mark deposited on adjacent histones by the DNA damage response pathway immediately after dsbs are formed.

Etoposide (a topoisomerase inhibitor) traps the enzyme in a state where it remains bound to the DNA of the dsb. On etoposide Rx, there is an increase in expression of activity dependent genes (Fos, FosB, Npas4). Knocking down topoisomerase IIB (the most prevalent topoisomerase in neurons) by RNA interference (RNAi) blunts activity dependent induction of these genes. This implies that DNA cutting by topoisomerase IIB is required for gene activation in response to neuronal activity.

Further supporting this idea, the authors induced dsbs at promoters of activity dependent genes (Fos, FosB, Npas4) using the CRISPR system. A significant increase in transcription was found when the Fos promoter was targeted.

I frankly find this incredible. Double strand breaks are considered bad things for good reason and the cell mounts huge redundant machines to repair them, yet apparently neurons, the longest lived cells in our bodies, are doing this day in and day out. The work is so fantastic that it needs to be replicated.

Social Note: Nick Cozzarelli is one of the reasons Princeton was such a great institution back in the 50s (and hopefully still is). Nick’s father was an immigrant shoemaker living in Jersey City, N. J. Princeton recognized his talent, took him in, allowing him to work his way through on scholarship, waiting tables in commons, etc. etc. He obtained a PhD in biochemistry from Harvard and later became a prof at Berkeley, where he edited the Proceedings of the National Academy of Sciences USA for 10 years. He passed away far too soon of Burkitt’s lymphoma in his late 60s. We were friends as undergraduates and in grad school.

I can only wonder what Nick would say about the latest twists of the topoisomerase story