Category Archives: Molecular Biology

So hush, little baby, don’t you cry/transcribe

“So hush, little baby, don’t you cry” is from the lyrics of Summertime in Porgy and Bess, the first true American opera.  Although first performed in 1935, scandalously it didn’t get its Met performance until 1985.

But HUSH in this context means Human Silencing Hub, which keeps invaders of our genome quiet.  One class of invader is the L1 retrotransposon, which comprises 17% of our genome (there are others).  It is not transcribed into mRNA, clearly a good thing.  Unfortunately HUSH will block some attempts at gene therapy (those putting a new gene into our genome, rather than just correcting a mutated one).

HUSH is made from 3 proteins: TASOR, MPP8 and periphilin. A great paper (Nature vol. 601 pp. 440 – 445 ’22) shows how it works.  HUSH recruits an ATP dependent chromatin remodeler producing chromosome compaction (so the transcription machinery can’t get to the gene) and SETDB1, which methylates the lysine at position #9 of histone H3.  This modification binds other proteins which further compact DNA.

Why doesn’t HUSH shut everything down?  It appears to recognize transcripts longer than 1.5 kiloBases — typical of L1 which is > 5 kiloBases long.   How HUSH does this isn’t known at present.

Our genes are broken into pieces called exons (average length 237 bases) separated by introns.  L1 and all other retrotransposons don’t have introns, as they are reverse transcribed from cytoplasmic RNA which has had the introns removed.

So HUSH’s incredible cleverness silences a DNA sequence it (and you) have never heard of, just by its absence of introns.
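As a toy illustration of the selection rule described above (and nothing more, since how HUSH actually does this isn't known), here is a minimal Python sketch combining the two criteria mentioned: long transcript, no introns. The 1.5 kilobase threshold comes from the post; the function name, the example lengths and the intron counts are hypothetical, made up purely for illustration.

```python
# Toy sketch only: real HUSH targeting is not understood at this level.
MIN_LENGTH_BASES = 1500  # threshold quoted in the post (1.5 kilobases)


def looks_like_hush_target(transcript_length_bases: int, intron_count: int) -> bool:
    """Return True for a transcript that is both long and intronless."""
    return transcript_length_bases > MIN_LENGTH_BASES and intron_count == 0


# Hypothetical examples (lengths and intron counts are illustrative, not data).
examples = {
    "L1-like element, 6000 bases, no introns": (6000, 0),
    "spliced gene, 6000 bases, 8 introns": (6000, 8),
    "short intronless transcript, 900 bases": (900, 0),
}

for name, (length, introns) in examples.items():
    print(name, "->", looks_like_hush_target(length, introns))
```

Only the first example satisfies both conditions, which is the point of the post: the rule needs no knowledge of the sequence itself.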

 

We now understand what amyloid actually is

Lately we have received an embarrassment of riches about amyloid and the diseases it causes.  I’ll start with the latest — the structure of TDP amyloid.

I must say it is a pleasure to get back to chemistry and away from the pandemic, however briefly.  So relax and prepare to enjoy some great chemistry and protein structure.

TDP43 (you don’t need to know what the acronym stands for) is a protein which binds to RNA (among other things).  It also forms aggregates, and some 50 mutations are known producing FrontoTemporal Dementia (FTD) and/or Amyotrophic Lateral Sclerosis (ALS).  I saw a case as a resident (before things were worked out) and knew something was screwy because while ALS is a horrible disease, patients are clear to the end (witness Stephen Hawking) and my patient was clearly dementing.

Mutations in TDP43 occur in 5% of familial ALS.  More to the point cytoplasmic aggregates of TDP43 occur in 95% of sporadic cases of ALS (no mutations), so neurologists have been fascinated with TDP43 for years.

Back before we knew much about the structure of amyloid, it was characterized by the dyes that would bind to it (Congo Red, thioflavin etc.) and birefringence (see below).  None of this is true for the aggregates of TDP43.

Well we now know what the structure of amyloid is.  You simply can’t do better than  Cell vol. 184 pp. 4857 – 4873 ’21 — but it might be behind a paywall.

So here’s the skinny about what amyloid actually is —

 

It is a significantly long polypeptide chain  flattening  out into a 4.8 Angstrom thick sheet, essentially living in 2 dimensions.  Thousands of sheets then pile on top of each other forming amyloid.  So amyloid is not a particular protein, but a type of conformation a protein can assume (like the alpha helices, beta pleated sheets etc. etc. ).

The structure also explained why planar molecules like Congo Red bind to amyloid (it slips between the sheets).   Or at least that’s what I thought.

 

Enter Nature vol. 601 pp. 29 – 30, 139 – 143 ’22 showing that some 79 amino acids of the 414 amino acids of TDP43 flatten out into a single sheet in the aggregates, with the sheets piling on top of each other.  If that isn’t amyloid, what is?

 

Where are the beta strands producing birefringence if this is amyloid?  In fact where is the birefringence? (see below). The paper says that there are 10 beta strands in the 79 amino acids, but they are short with only two of them containing more than 3 amino acids (I guess they can see beta strands by measuring backbone angles a la Ramachandran plots).  The high number of glycine mediated turns prevents beta sheets from stacking next to each other, precluding the crossBeta structure (and birefringence).

 

Why doesn’t Congo Red bind?  My idea about how it binds to other amyloids (slipping between the sheets) clearly is incorrect.

 

There are all sorts of fascinating points about the amyloid of TDP43.  The filaments derived from patients are stable to heating to 65 C.   The structure of the TDP43 fibrils derived from patients with FTD/ALS is quite different from that of synthetic filaments made from parts of TDP43, so possibly a lot of work will have to be done again.

 

Here is some more detail on amyloid structure:

 

So start with NH – CO – CHR.  NH  CO and C in the structure all lie in the same plane (the H and the side chain of the amino acid < R >  project out of the plane).
Here’s a bit of elaboration for those of you whose organic chemistry is a distant memory.  The carbon in the carbonyl bond (CO) has 3 bonding orbitals in one plane 120 degrees apart, with the 4th orbital perpendicular to the plane; this is called sp2 hybridization.  The nitrogen can also be hybridized to sp2.  This lets the pair of electrons above the plane roam around moving toward the carbon.  Why is this good?  Because any time you let electrons roam around you increase their entropy (S) and anything increasing entropy lowers their free energy (F), which is given by the formula F = H – TS where H is enthalpy (a measure of bond strength), and T is the absolute temperature in Kelvin.
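To make the formula concrete, here is a small Python sketch of F = H – TS. The numbers are hypothetical, chosen only to show the direction of the effect: holding H and T fixed, a larger S gives a lower F.

```python
def free_energy(enthalpy: float, temperature_kelvin: float, entropy: float) -> float:
    """F = H - T*S, with H in kJ/mol, T in Kelvin and S in kJ/(mol*K)."""
    return enthalpy - temperature_kelvin * entropy


BODY_TEMPERATURE_K = 310.0  # roughly 37 C
ENTHALPY = -10.0            # hypothetical value, kJ/mol

# Two hypothetical entropies: electrons pinned down vs. allowed to roam.
localized = free_energy(ENTHALPY, BODY_TEMPERATURE_K, entropy=0.010)
delocalized = free_energy(ENTHALPY, BODY_TEMPERATURE_K, entropy=0.020)

print(f"F with smaller S: {localized:.2f} kJ/mol")
print(f"F with larger S:  {delocalized:.2f} kJ/mol")
print("larger S lowers F:", delocalized < localized)
```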

 

So N and CO are in one plane, and so are the bonds from N and C to the adjacent atoms (C in both cases).

 

You can fit the plane atoms into a rectangle 4.8 Angstroms high.  Well that’s one 2 dimensional rectangle, but the peptide bond between NH and CO in adjacent rectangles allows you to tack NH – CO – C s together while keeping them in a 3 dimensional parallelepiped 4.8 Angstroms high.

 

Notice that in the rectangle the NH and CO bonds are projecting toward the top and bottom of the rectangle, which means that in each plane  NH – CO – CHR s, the NH and CO are pointing out of the 2 dimensional plane (and in opposite directions to boot). This is unlike protein structure in which the backbone NHs and COs hydrogen bond to each other.  There is nothing in this structure for them to bond to.

 

What they do is hydrogen bond to another 3 dimensional parallelepiped (call it a sheet, but keep in mind that this is NOT the beta sheet you know about from the 3 dimensional structures of proteins we’ve had for years).
So thousands of sheets stacked together form the amyloid fibril.

 

Where does the 9 Angstrom reflection of cross beta (and birefringence) come from?  Consider the  [ NH – CHR – CO ]  backbone as it lies in the 4.8 Angstrom thick plane (Having studied protein structure since entering med school in ’62, I never thought such a thing would even be possible ! ).  It curves around like a snake lying flat.  Where are the side chains?  They are in the 4.8 Angstrom thick plane, separating parts of the meandering backbone from each other by an average of 9 Angstroms.
Here is an excellent picture of the Alzheimer culprit — the aBeta42 peptide as it forms the amyloid of the senile plaque
You can see the meandering backbone and the side chains keeping the backbone apart.

Then Nature [ vol. 598,  pp. 359 – 363 ’21] blows the field wide open, finding 19 different conformations of tau in clinically distinct diseases. Each clinical disease appears to be associated with a distinct polymorphism.  This is also true for the polymorphisms of alpha-synuclein, with distinct conformations being seen in each of Parkinsonism, multiple system atrophy and Lewy body dementia.

In none of the above diseases is there a mutation (change in amino acid sequence) in the protein.

Henry J. Heinz claimed to have 57 varieties of pickles in 1896, but Cell [ vol. 184 pp. 4857 – 4873 ’21  ] Page 4862 claims that 24 amyloid polymorphs of alpha-synuclein have been found and structurally characterized.  Recall that alpha-synuclein amyloid is the principal component of the Lewy body of Parkinsonism and Lewy Body disease.

How did they get the 24 different conformations?  They incubated the protein under different conditions (e.g. different salt concentrations, different alpha-synuclein concentrations, different salts).

Why is this incredibly good news? 

Because it moves us past amyloid itself, to the conditions which cause amyloid to form.  Certainly, removing amyloid or attacking it hasn’t resulted in any clinical benefit for the Alzheimer patient despite billions being spent by Big Pharma to do so.

We will start to study the ‘root causes’ of amyloid formation.   The amino acid sequence of each protein is identical despite the different conformations of the chain in the amyloid. Clearly the causes must be different for each of the different polymorphs of the protein.  This just has to be true.

Sometimes you’re licked before you’ve even started

The first issue of Neuron for 2022 got off to a rather depressing start, with two papers stating that the die was cast before you leave the uterus.

The first concerned Huntington’s chorea, a hereditary movement disorder which doesn’t begin for decades after birth (like many hereditary disorders).  Or does it?

Well over 10,000 papers have been written to figure out why mutations in the huntingtin protein produce neurodegeneration, primarily in a small set of neurons deep in the brain.

Now that we know the genetic cause, it is possible to study people with the mutation long before they get sick.  One finding is thinning of the tract of axons connecting the two hemispheres (the corpus callosum).

So a transgenic mouse was created with a typical huntingtin mutation (111 glutamines in a row instead of the normal 7 for the mouse).  Even in utero axons forming the corpus callosum were shorter and fewer.  This was due to defects in bundling of microtubules in the axon growth cone.  It was due to a deficiency not of huntingtin but of a microtubule binding protein called NUMA1 (Nuclear Mitotic Apparatus 1) whose function we thought we knew in the mitotic spindle.  Giving NUMA1 reversed the axonal defect (in tissue culture).   So the problems with Huntington’s chorea go back to fetal life.

Even worse, 4 human fetuses of 13 weeks gestation (presumably aborted because the parents didn’t want a child with the gene) showed all sorts of abnormalities in the developing brain [ Science vol. 369 pp. 771 – 772, 787 – 792 ’20 ].  Read it if you wish, I found it rather creepy.

The second paper is on congenital hydrocephalus [ Neuron vol. 110 pp. 12 – 15 ’22 ].  Several parent-child trios were sequenced, and damaging mutations were found in 25% of the probands.  All of them were regulators of neural stem cell fate.  So the ventricles got big (that’s hydrocephalus after all), because there weren’t enough neurons to keep them small.

Even worse, the most consistent macroscopic (i.e. visible) finding in schizophrenia is large ventricles.  Were they doomed from birth?

If that isn’t depressing enough here’s a Supreme Court Justice holding forth on COVID19 in children. “We have over 100,000 children, which we’ve never had before, in serious condition, and many on ventilators” due to the coronavirus.  Presumably the Justice believes in science as many of her party claim.

The RNA world strikes again

Life is said to have originated in the RNA world.  We all know about the big 3 important RNAs for the cell, mRNA, ribosomal RNA and transfer RNA.  But just like the water, sewer, power and subway systems under Manhattan, there is another world down there in the cell which is just beginning to come into focus

I’ve written several posts about the RNA world in our cells (links at the end), but the latest is really staggering, in that RNA is helping to organize how our DNA lies in the nucleus.

As usual the discoveries depended on new technologies, RD-SPRITE in this case (you don’t want to know what the acronym stands for; by the bye, have you noticed how many more acronyms are appearing in papers you read?).  It is extremely complex, but the technique is said to be able to simultaneously map thousands of RNA and DNA molecules at high resolution relative to all other RNA and DNA molecules.  Details in Cell vol. 184 pp. 5775 – 5790 ’21 .

The count of long nonCoding (for protein that is) RNAs is now in the tens of thousands [ Science vol. 373 pp. 623 – 624 ’21 ]. They have all sorts of functions, but the present work shows that 93% of them stay close to the gene they are transcribed from in the nucleus.  Here they bind proteins in precise territories in the nucleus (because the genes for lncRNAs are found in equally precise territories in the nucleus).   This establishes functional compartments in the nucleus to regulate gene expression.

Interestingly long nonCoding RNAs are transcribed at very low levels, which led people to dismiss them as chaff.  Their binding of proteins explains how so few molecules can do so much.

That’s pretty abstract.  Consider Xist, a large nonCoding RNA which inactivates one of the X chromosomes in females.  Just two Xist molecules are able to seed a multiprotein cloud around the Xist locus on the X.

Later to be described is Jpx, which is crucial in establishing TADs (topologically associating domains).

Here are some older posts on the RNA world

Forgotten but not gone

Forgotten but not gone — take II

Forgotten but not gone — take III

Trogocytosis

Trogocytosis — sounds like something out of Lord of the Rings.  The word comes from the Greek trogo meaning nibble — I didn’t think Greeks had words for such trivial things.

But don’t laugh, trogocytosis is intimately involved in the immune system, and diminishing it may be a way to treat cancer.

So what is Trogocytosis?  It is the transfer of membrane fragments between two cells in physical contact.  This includes proteins embedded in the membrane.

[ PNAS vol. 118 e2110241118 ’21 ] tells us that human colon cancer cells can use trogocytosis to acquire attacking lymphocyte proteins, such as the immune regulatory proteins CTLA4 and Tim3.  Capturing these proteins allows the cancer to turn off the immune response against it.  This may be where part of the immunosuppressive tumor microenvironment comes from.

Neuroscientists and clinicians will be interested to know that trogocytosis is the process by which microglia phagocytose dendritic spines and synapses in the process of synaptic pruning. [ Proc. Natl. Acad. Sci. vol. 118 pp. 17605 – 17607 ’09 ].

Trogocytosis is important in immune system function: it is how dendritic cells transfer MHC class I antigen peptide complexes to other cells.  This allows direct transfer of preformed antigen/peptide complexes from an infected cell to an antigen presenting cell without the need of further processing [ Nature vol. 471 pp. 581 – 582, 629 – 632 ’11 ].  It was a simpler time 10 years ago, when this was also called cross-dressing.

So trogocytosis has its fingers in many pies, and inhibiting it to decrease the immunosuppressive tumor microenvironment is likely to have many other consequences — you heard about some of them here.

Solid evidence for acupuncture at last

The early hype about acupuncture was so extreme (bathwater) that I stopped looking for the medical baby within.  Part of the hype was a reaction against all things western.

However when stimulation of a mouse at the knee point (ST36) decreases mortality due to exposure to lipopolysaccharide by 40%, it’s time to sit up and take notice [ Nature vol. 598 pp. 573 – 574, 641 – 645 ’21 ].

Not only that but the authors found the neurons responsible for the effect.  These neurons in the dorsal root ganglion express the G Protein Coupled Receptor (Prokr2) which is a  receptor for prokineticin, a secreted protein which increases gut motility.

Stimulation of these neurons (or the point behind the knee they innervate) produces anti-inflammatory effects.  Destruction of these neurons (by expressing diphtheria toxin in them) prevents low intensity stimulation of ST36 from dampening inflammation.

The paper even gives a possible explanation for some of the irreproducible results in the field.  High intensity stimulation of ST36 activates the sympathetic system, while low intensity stimulation activates the parasympathetic nervous system.  The latter activates the vagus nerve which stimulates the adrenal medulla to produce catecholamines (which are anti-inflammatory).  So high intensity stimulation of the same site produces no useful therapeutic effect.

I never thought I’d see high quality work like this on acupuncture, but there it is.  More is sure to follow.

Amyloid Structure at Last ! – 2 Birefringence

This was the state of the art 19 years ago in a PNAS paper (vol. 99 pp. 16742 – 16747 ’02).  “Amyloid fibrils are filamentous structures with typical diameters of 10 nanoMeters and lengths up to several microns.  No high resolution molecular structure of an amyloid fibril has yet been determined experimentally because amyloid fibrils are noncrystalline solid materials and are therefore incompatible with Xray crystallography and liquid state NMR.”

Well solid state NMR and cryo electron microscopy have changed all that and we now have structures for many amyloids at near atomic resolution.  It’s probably behind a pay wall but look at Cell vol. 184 pp. 4857 – 4873 ’21 if you have a chance.  I’ve spent the last week or so with it, and a series of posts on various aspects of the paper will be forthcoming.  The paper contains far too much to cram into a single post.

So lacking an Xray machine to do diffraction, what did we have 57 years ago when I started getting seriously interested in neurology?  To find amyloid we threw a dye called Congo Red on a slide, found that it bound amyloid and became birefringent when it did so.

Although the Cell paper doesn’t even mention Congo Red, the structure of amyloid they give explains why this worked.

What is birefringence anyway?  It means that light moving through a material travels at different speeds in different directions.  The refractive index of a material is the ratio of the speed of light in a vacuum to the speed of light in that material.   Stand in a shallow pool.  Your legs look funny because light travels slower in water than in air (which is nearly a vacuum).
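For those who like numbers, here is a minimal Python sketch of the refractive index relation n = c / v, rearranged to give the speed of light in a material. The refractive indices for air and water are approximate textbook values, and the names are made up for this illustration.

```python
SPEED_OF_LIGHT_VACUUM_M_PER_S = 299_792_458  # exact, by definition of the metre


def speed_in_material(refractive_index: float) -> float:
    """Speed of light in a material with the given refractive index (v = c / n)."""
    return SPEED_OF_LIGHT_VACUUM_M_PER_S / refractive_index


N_AIR = 1.0003  # approximate: air is nearly a vacuum
N_WATER = 1.33  # approximate

print(f"air:   {speed_in_material(N_AIR):.3e} m/s")
print(f"water: {speed_in_material(N_WATER):.3e} m/s")
```

A birefringent material is one where this index, and so the speed, depends on the direction (and polarization) of the light rather than being a single number.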

Look at the structure of Congo Red — https://en.wikipedia.org/wiki/Congo_red.  It’s a long thin planar molecule, containing 6 aromatic rings, kept planar with each other by pi electron delocalization.

The previous post contained a more detailed description of amyloid — but suffice it to say that instead of wandering around in 3 dimensional space, the protein backbone in amyloid is confined to a single plane 4.8 Angstroms thick — here’s a link — https://luysii.wordpress.com/2021/10/11/amyloid-structure-at-last/

Plane after plane stacks on top of each other in amyloid.  So a micron (which is 10,000 Angstroms) can contain over 2,000 such planes, and an amyloid fibril can be several microns long.
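A quick check of the stacking arithmetic, as a Python sketch. The 4.8 Angstrom spacing and the 10,000 Angstroms per micron are from the text; the function name is made up for illustration.

```python
import math

ANGSTROMS_PER_MICRON = 10_000
PLANE_THICKNESS_ANGSTROMS = 4.8  # thickness of one amyloid plane, from the text


def planes_in_length(length_microns: float) -> int:
    """Whole number of 4.8 Angstrom planes that fit in the given fibril length."""
    length_angstroms = length_microns * ANGSTROMS_PER_MICRON
    return math.floor(length_angstroms / PLANE_THICKNESS_ANGSTROMS)


print(planes_in_length(1.0))  # 2083 planes in one micron
```

So one micron of fibril holds roughly two thousand planes, and a fibril several microns long holds several times that.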

It isn’t hard to imagine the Congo Red molecule slipping between the sheets, making its orientation fixed.  Sounds almost pornographic doesn’t it? This orients the molecule and clearly light moving perpendicular to the long axis of Congo Red will move at a different speed than light going parallel to the long axis of Congo Red, hence its birefringence when the dye binds amyloid.

Well B-DNA (the form we all know and love as the double helix) has its aromatic bases stacked on top of each other every 3.4 Angstroms.  So why isn’t it birefringent with Congo Red?  It has a persistence length of 150 basePairs or about 0.05 microns, which means that over longer distances its orientation averages out, unlike the amyloid in a senile plaque.
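The conversion from base pairs to microns can be checked with a few lines of Python. The 3.4 Angstrom rise per base pair and the 150 base pair persistence length are from the text; the variable names are made up for illustration.

```python
RISE_PER_BASE_PAIR_ANGSTROMS = 3.4   # stacking distance in B-DNA, from the text
PERSISTENCE_LENGTH_BASE_PAIRS = 150  # from the text
ANGSTROMS_PER_MICRON = 10_000

persistence_length_angstroms = PERSISTENCE_LENGTH_BASE_PAIRS * RISE_PER_BASE_PAIR_ANGSTROMS
persistence_length_microns = persistence_length_angstroms / ANGSTROMS_PER_MICRON

print(f"{persistence_length_angstroms:.0f} Angstroms")  # about 510
print(f"{persistence_length_microns:.3f} microns")      # about 0.051
```

That is a small fraction of a micron, whereas an amyloid fibril keeps the same stacking direction over several microns.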

There is tons more to come.  The Cell paper is full of fascinating stuff.

Amyloid structure at last !

As a neurologist, I’ve been extremely interested in amyloid since I started in the late 60s.  The senile plaque of Alzheimer’s disease is made of amyloid.  The stuff was insoluble gunk. All we had back in the day was Xray diffraction patterns showing two prominent reflections at 4 and 9 Angstroms, so we knew there was some sort of repetitive structure.

My notes on papers on the subject over the past 20 years contain  about 100,000 characters (but relatively little enlightenment until recently).

A while ago I posted some more homework problems — https://luysii.wordpress.com/2021/09/30/another-homework-assignment/

Homework assignment #1:  design a sequence of 10 amino acids which binds to the same sequence in the reverse order forming a plane 4.8 Angstroms thick.

Homework assignment #2:  design a sequence of 60 amino acids which forms a similar plane 4.8 Angstroms thick, such that two 60 amino acid monomers bind to each other.

Feel free to use any computational or theoretical devices currently at our disposal: density functional theory, force fields, Rosetta etc. etc.

Answers to follow shortly

Hint:  hundreds to thousands of planes can stack on top of each other.

 

If you have a subscription to Cell take a look at a marvelous review full of great pictures and diagrams [ Cell vol. 184 pp. 4857 – 4873 ’21 ].

 

Despite all that reading I never heard anyone predict that a significantly long polypeptide chain could flatten out into a 4.8 Angstrom thick sheet, essentially living in 2 dimensions.  All the structures we had  (alpha helix, beta pleated sheet < they were curved >, beta barrel, solenoid, Greek key) live in 3 dimensions.

 

 

So amyloid is not a particular protein, but a type of conformation a protein can assume (like the structures mentioned above).

 

 

So start with NH – CO – CHR.  NH  CO and C in the structure all lie in the same plane (the H and the side chain of the amino acid < R >  project out of the plane).

 

Here’s a bit of elaboration for those of you whose organic chemistry is a distant memory.  The carbon in the carbonyl bond (CO) has 3 bonding orbitals in one plane 120 degrees apart, with the 4th orbital perpendicular to the plane; this is called sp2 hybridization.  The nitrogen can also be hybridized to sp2.  This lets the pair of electrons above the plane roam around moving toward the carbon.  Why is this good?  Because any time you let electrons roam around you increase their entropy (S) and anything increasing entropy lowers their free energy (F), which is given by the formula F = H – TS where H is enthalpy (a measure of bond strength), and T is the absolute temperature in Kelvin.

 

 

So N and CO are in one plane, and so are the bonds from N and C to the adjacent atoms (C in both cases).

 

You can fit the plane atoms into a rectangle 4.8 Angstroms high.  Well that’s one 2 dimensional rectangle, but the peptide bond between NH and CO in adjacent rectangles allows you to tack NH – CO – C s together while keeping them in a 3 dimensional parallelepiped 4.8 Angstroms high.

 

 

Notice that in the rectangle the NH and CO bonds are projecting toward the top and bottom of the rectangle, which means that in each plane  NH – CO – CHR s, the NH and CO are pointing out of the 2 dimensional plane (and in opposite directions to boot). This is unlike protein structure in which the backbone NHs and COs hydrogen bond to each other.  There is nothing in this structure for them to bond to.

 

 

What they do is hydrogen bond to another 3 dimensional parallelepiped (call it a sheet, but keep in mind that this is NOT the beta sheet you know about from the 3 dimensional structures of proteins we’ve had for years).

 

 

So thousands of sheets stacked together form the amyloid fibril.

 

Where does the 9 Angstrom reflection of cross beta come from?  Consider the  [ NH – CHR – CO ]  backbone as it lies in the 4.8 Angstrom thick plane (I never thought such a thing would be even possible ! ).  It curves around like a snake lying flat.  Where are the side chains?  They are in the 4.8 Angstrom thick plane, separating parts of the meandering backbone from each other by an average of 9 Angstroms.

 

Here is an excellent picture of the Alzheimer culprit — the aBeta42 peptide as it forms the amyloid of the senile plaque

 

 

You can see the meandering backbone and the side chains keeping the backbone apart.

 

 

That’s just the beginning of the paper, and I’ll have lots more to say about amyloid as I read further.   Once again, biology instructs chemistry and biochemistry giving it more “things in heaven and earth, Horatio, than are dreamt of in your philosophy.”

The most interesting paper I’ve read this year

We all know what the estrogen receptor is and what it does.  It’s a large protein with 3  functional components.  Actually there are several estrogen receptor proteins, but I’m going to discuss just one — Estrogen Receptor alpha (ERalpha).

Here are the components:

a DNA binding domain (which binds to stretches of bases called the Estrogen Response Element (ERE))

a domain which binds estrogen changing the conformation of the third domain which is —

a domain which binds to RNA polymerase II activating it so it transcribes genes into mRNA.

Given the complexity of the hormonal cycles, it is far from surprising that the estrogen receptor controls the levels of 15% of all annotated protein coding genes < Cell vol. 145 pp. 622 – 634 ’11 >.

Given its importance in breast cancer, ERalpha has been intensively studied for years.

Now various regions of proteins have been assigned functions: the SH2 domain binds to phosphotyrosine in proteins, SH3 binds to proline rich motifs, and RNA Binding Domains (RBDs) bind to (what else?) RNAs.  Each of them has a characteristic sequence of amino acids allowing them to be picked out from DNA sequences.

Enter Cell vol. 184 pp. 5086 – 5088, 5215 – 5229 ’21 where ERalpha was found to bind to over 1,200 messenger RNAs (mRNAs).   It was not supposed to do that as it doesn’t contain any RBDs (well, at least the RBDs we knew about; back to the drawing board on that one).  Even more interesting is the fact that most of the mRNAs bound by ERalpha aren’t from the genes whose ERE ERalpha binds.

Life is said to have originated in the RNA world.  We all know about the big 3 important RNAs for the cell, mRNA, ribosomal RNA and transfer RNA.  But just like the water, sewer, power and subway systems under Manhattan, there is another world down there in the cell which doesn’t much get talked about.  These are RNAs, whose primary (and possibly only) function is to interact with other RNAs.

The RNA world is still alive and well in all our cells.  It’s like DOS still out there under Windows, or Unix and its command line under the Apple interface. We studied proteins and DNA because they were (relatively) easy to study.

The papers go on to study how ERalpha RNA binding affects cancer (which it does).

But there are far larger questions the work brings forth.

1. ERalpha is just one nuclear hormone receptor and estrogen is just one hormone.  Do the nuclear hormone receptors for other hormones also bind RNA? Have we been missing some of their actions inside cells, and if so are there mechanisms to exploit?

2. Why stop at nuclear hormone receptors?  ERalpha binds RNA with no RNA binding  domains (RBDs) in sight.  Do other proteins also bind RNAs and if so what does it mean?

Fantastic stuff.  There is a whole world of possibilities opening before our eyes thanks to these papers.

It reminds me of what an anatomy professor told us when we were studying neuroanatomy — ‘unfortunately everything is connected to everything else’.