Category Archives: Molecular Biology

Bye bye stoichiometry

I’m republishing this old post from 2018, to refresh my memory (and yours) about liquid liquid phase separation before writing a new post on one of the most interesting papers I’ve read in recent years.  The field has exploded since this was written.

Until recently, developments in physics basically followed earlier work by mathematicians.  Think relativity following Riemannian geometry by 40 years.  However in the past few decades, physicists have developed mathematical concepts before the mathematicians (think mirror symmetry, which came out of string theory: https://en.wikipedia.org/wiki/Mirror_symmetry_(string_theory) ). You may skip the following paragraph, but here is what it meant to mathematics, from a description of a 400+ page book by Amherst College’s own David A. Cox:

Mirror symmetry began when theoretical physicists made some astonishing predictions about rational curves on quintic hypersurfaces in four-dimensional projective space. Understanding the mathematics behind these predictions has been a substantial challenge. This book is the first completely comprehensive monograph on mirror symmetry, covering the original observations by the physicists through the most recent progress made to date. Subjects discussed include toric varieties, Hodge theory, Kahler geometry, moduli of stable maps, Calabi-Yau manifolds, quantum cohomology, Gromov-Witten invariants, and the mirror theorem. This title features: numerous examples worked out in detail; an appendix on mathematical physics; an exposition of the algebraic theory of Gromov-Witten invariants and quantum cohomology; and, a proof of the mirror theorem for the quintic threefold.

Similarly, advances in cellular biology have come from chemistry.  Think DNA and protein structure, enzyme analysis.  However, cell biology is now beginning to return the favor and instruct chemistry by giving it new objects to study. Think phase transitions in the cell, liquid liquid phase separation, liquid droplets, and many other names (the field is in flux) as chemists begin to explore them.  Unlike most chemical objects, they are big, or they wouldn’t have been visible microscopically, so they contain many, many more molecules than chemists are used to dealing with.

These objects do not have any sort of definite stoichiometry and are made of RNA and the proteins which bind them (and sometimes DNA).  They go by any number of names (processing bodies, stress granules, nuclear speckles, Cajal bodies, promyelocytic leukemia bodies, germline P granules).  Recent work has shown that DNA may be compacted similarly using the linker histone [ PNAS vol.  115 pp.11964 – 11969 ’18 ].

The objects are defined essentially by looking at them.  By golly they look like liquid drops, and they fuse and separate just like drops of water.  Once this is done they are analyzed chemically to see what’s in them.  I don’t think theory can predict them now, and they were never predicted a priori as far as I know.

No chemist in their right mind would have made them to study.  For one thing they contain tens to hundreds of different molecules.  Imagine trying to get a grant to see what would happen if you threw that many different RNAs and proteins together in varying concentrations.  Physicists have worked for years on phase transitions (but usually with a single molecule — think water).  So have chemists — think crystallization.

Proteins move in and out of these bodies in seconds.  Proteins found in them do have low complexity of amino acids (mostly made of only a few of the 20), and unlike enzymes, their sequences are intrinsically disordered, so forget the key and lock and induced fit concepts for enzymes.

Are they a new form of matter?  Is there any limit to how big they can be?  Are the pathologic precipitates of neurologic disease (neurofibrillary tangles, senile plaques, Lewy bodies) similar?  There certainly are plenty of distinct proteins in the senile plaque, but they don’t look like liquid droplets.

It’s a fascinating field to study.  Although made of organic molecules, there seems to be little for the organic chemist to say, since the interactions aren’t covalent.  Time for physical chemists and polymer chemists to step up to the plate.

The silence is deafening

3 weeks ago I published a post about a paper that I thought would be a real bombshell, in effect contradicting a paper in a prestigious journal, and strongly arguing from real data that the pandemic virus could have been made in a lab, quite possibly in Wuhan.

Absolutely nothing has happened. No letters to PNAS (the source of the article) or to Cell (the source of the criticized study).  With a question of this magnitude and importance you’d think Nature or Science would weigh in about it.  The origin of the pandemic virus is certainly something they’ve covered extensively.

So I’m going to send this to all concerned and see if I get any feedback.

Here is the original post.

Evidence that the pandemic virus was made in a lab

 

Everyone knows that the Chinese have been less than forthcoming about the origin of the pandemic virus (SARS-CoV-2).  An article in the current Proceedings of the National Academy of Sciences (https://doi.org/10.1073/pnas.2202769119) argues that US data, which hasn’t been released, and some 290 pages of which have been redacted, could shed a good deal of light on the subject (without any help from China).  One of the authors is an economist, but the other has serious biochemical chops: https://www.pharmacology.cuimc.columbia.edu/profile/neil-l-harrison-phd.

Basically a variety of US institutions (see the paper — it’s freely available) have been working with the lab at Wuhan for years modifying the virus, long before the pandemic.  The paper names the names etc. etc. and is quite detailed, but I want to explain the evidence that the virus could have been produced (by human modification) at the Wuhan lab.  It has to do with a site in a viral protein which says ‘cut here’.

Here is more background than many readers will need, but the virus has affected us all and I want to make it accessible to as many as possible.

Proteins are linear strings of amino acids, just as this post is a linear sequence of letters, spaces and punctuation.

We have fewer amino acids (20 to be exact) than letters, and to save space each one has a one letter abbreviation (A for alanine, V for valine, etc. etc.).  The spike protein (the SARS-CoV-2 protein binding to the receptor for it on our cells) is quite long (1,273 amino acids all in a row).

Our genome codes for 588 proteins (called proteases) whose job it is to cut up other proteins. Obviously, it would be a disaster if they worked indiscriminately.  So each cuts at a particular sequence of amino acids. Think of the protease as a key and the sequence as a lock.  One protease called furin cuts in the middle of an 8 amino acid sequence RRAR’SVAS (R stands for aRginine and S for Serine; the apostrophe marks the cut).  This is called the furin cleavage site (FCS).

A paper (The origins of SARS-CoV-2: A critical review. Cell 184, 4848–4856 (2021)) argued that the amino acid sequence of the FCS in SARS-CoV-2 is an unusual, nonstandard sequence for an FCS and that nobody in a laboratory would design such a novel FCS.  So, like many, I skimmed the paper and accepted its conclusions, as Cell is one of the premier molecular biology journals.

One final quote “The NIH has resisted the release of important evidence, such as the grant proposals and project reports of EHA, and has continued to redact materials released under FOIA, including a remarkable 290-page redaction in a recent FOIA release.”

Sounds like Watergate doesn’t it?

 

Watch this space

BMOR is a bad actor

RNA and proteins have long been known to interact, but classic molecular biology pretty much had proteins down as something that modified RNA function.   Not so for BMOR, a long nonCoding RNA (1,247 nucleotides) expressed in breast cancer cells metastatic to the brain.  BMOR binds to IRF3 (Interferon Regulatory Factor 3), inhibiting its phosphorylation by TBK1.  Phosphorylated IRF3 normally moves to the nucleus, where it stimulates interferon expression, which then turns on hundreds of genes producing inflammation.  All this is described in Proc. Natl. Acad. Sci. vol. 119 (22) e2200230119 ’22 (May 26, 2022).

Not sure if it is behind a paywall.    Definitely worth a read because knocking down BMOR in breast cancer cells prevents them from spreading to the brain (the cells probably use BMOR to turn off the brain’s immune response to them).  Even more interestingly, BMOR was found to be substantially expressed only in breast cancer metastases to brain tissue, not in breast cancer metastases to nonbrain tissues.

 

 

Teleology as always raises its head.  What in the world is the normal function of BMOR?  It can’t be what it is doing in the animal model described here.  Why would a cell make something to help it kill the organism containing it?

 

Then of course, as is typical of all interesting research, larger questions are raised.  Are there other RNAs whose function is to modify protein function?  Remember that 75% of the genome is transcribed into RNA.  Most of this has been thought of as molecular chaff, like the turnings of a lathe.   Time to pick up the chaff from the factory floor and give it a look.

Brilliant structural work on the Arp2/3 complex with actin filaments and why it makes me depressed

The Arp2/3 complex of 7 proteins forms side branches on existing actin filaments.  The following paper shows its beautiful structure along with movies.  Have a look; it’s open access: https://www.pnas.org/doi/10.1073/pnas.2202723119

Why should it make me depressed? Because I could spend the next week studying all the ins and outs of the structure and how it works without looking at anything else.  Similar cryoEM studies of other multiprotein machines are coming out which will take similar amounts of time.  Understanding how single enzymes work is much simpler, although similarly elegant — see Cozzarelli’s early work on topoisomerase.

So I’m depressed because I’ll never understand them to the depth I understand enzymes, DNA, RNA etc. etc.

Also the complexity and elegance of these machines brings back my old worries about how they could possibly have arisen simply by chance with selection acting on them.  So I plan to republish a series of old posts about the improbability of our existence, and the possibility of a creator, which was enough to get me thrown off Nature Chemistry as a blogger.

Enough whining.

Here is why the Arp2/3 complex is interesting.  Actin filaments are long (1,000 – 20,000 Angstroms) and thin (70 Angstroms).  If you want to move a cell forward by having them grow toward its leading edge, growing actin filaments would puncture the membrane like a bunch of needles, hence the need for side branches, making actin filaments a brush-like mesh which could push the membrane forward as it grows.

The Arp2/3 complex has a molecular mass of 225 kiloDaltons, or probably 2,250 amino acids or 16 thousand atoms.
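The conversion from mass to amino acid count is a back-of-the-envelope one, and a minimal sketch of it follows.  The 100 Daltons per amino acid is the round number implied by the figures above, and 110 Daltons is a commonly quoted average; both values, and the function name, are assumptions made for illustration rather than anything taken from the paper.

```python
def estimate_residues(mass_kda: float, daltons_per_residue: float = 100.0) -> int:
    """Rough number of amino acids in a protein (or complex) of the given mass."""
    return round(mass_kda * 1000 / daltons_per_residue)


arp23_mass_kda = 225  # molecular mass of the Arp2/3 complex quoted above

# Round-number assumption (100 Da per amino acid) implied by the post's figure
print(estimate_residues(arp23_mass_kda))         # 2250

# Commonly quoted average (about 110 Da per amino acid) gives a lower estimate
print(estimate_residues(arp23_mass_kda, 110.0))  # 2045
```

Either way the answer is "a couple of thousand amino acids", which is the point.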

Arp2 stands for actin related protein 2, something quite similar to the normal actin monomer so it can sneak into the filament. So can Arp3.  The other 5 proteins grab actin monomers and start them polymerizing as a branch.

But even this isn’t enough, as Arp2/3 is intrinsically inactive and multiple classes of nucleation promoting factors (NPFs) are needed to stimulate it.  One such NPF family is the WASP proteins (for Wiskott Aldrich Syndrome Protein) mutations of which cause the syndrome characterized by hereditary thrombocytopenia, eczema and frequent infections.

The paper’s pictures do not include WASP, just the 7 proteins of the complex snuggling up to an actin filament.

In the complex the Arps are in a twisted conformation, in which they resemble actin monomers rather than filamentous actin subunits, which have a flattened conformation.  After activation Arp2 and Arp3 mimic the arrangement of two consecutive subunits along the short pitch helical axis of an actin filament and each Arp transitions from a twisted (monomerLike) to a flattened (filamentLike) conformation.

So look at the pictures and the movies and enjoy the elegance of the work of the Blind Watchmaker (if such a thing exists).

If the right hand don’t get you, the left hand will

Do you know the source of the title?  I found it surprising.  Answer at the end.

Some cancer cells have elevated levels of an enzyme called PHosphoGlycerate DeHydrogenase (PHGDH); others have decreased levels.  Many cancers contain both types of cells.  Neither is good news.

Those cancers  with low levels of PHGDH  have slower growth.  That’s good news isn’t it?  No.  Such cells are more likely to metastasize.

Those with high levels of PHGDH are less likely to metastasize.  That’s good news isn’t it?  No.  Such cells grow faster.

So cancers with both types of cells are more aggressive.

Here’s how it works [ Nature vol. 605 pp. 617 – 617, 747 – 753 ’22 ].

PHGDH is on the pathway for synthesis of serine, an amino acid required for protein synthesis (like all of them).  So low levels of the enzyme result in less protein synthesis and less tumor growth.

So how is this bad?  PHGDH binds to another enzyme PFK (PhosphoFructoKinase) stabilizing it.  When PHGDH is low, PFK enzyme levels are low, so the substrate of PFK (fructose 6 phosphate) is diverted to making sialic acid, which modifies cell surface proteins, making the cells more likely to migrate.

So blocking sialic acid synthesis reverses the effects of low PHGDH on cancer migration and metastasis — but it does potentiate cell proliferation.

You just can’t win

Things like this may explain other paradoxic and unexpected effects of enzyme blockade.

16 Tons by Tennessee Ernie Ford

A new way to look at ALS (thank God)

It’s always good when a new way to look at a basically untreatable disease comes along.  We’ll know soon if looking at filamin A will be useful for Alzheimer’s disease.  Here’s another:  something we’ve known about for years (polyphosphate) may be important in Amyotrophic Lateral Sclerosis (ALS).   I used riluzole for ALS, but never saw any benefit.  It may have slowed the decline, but riluzole never stopped disease progression.

It is stated that 10% of ALS is familial, but I think this is an overstatement.  Even so, mutations in a variety of proteins (superoxide dismutase 1 (SOD1), TDP43, C9orf72) do cause ALS, and studying them has taught us a lot about ALS.  There is plenty of work to do.  In 2016 a mere 160 mutations in the 153 amino acids of SOD1 had been found, but we still don’t know how they cause ALS despite hundreds of papers on the subject.  The proteins have allowed us to make mouse models of ALS, by putting one or the other of mutated SOD1, TDP43, C9orf72 in motor neurons (or in whole animals).

Some real gumshoe work led to polyphosphate [ Neuron vol. 110 pp. 1603 – 1605 ’22 ].  Obviously in ALS, the motor neurons die, but recent work has shown that motor neurons are killed by neighboring astrocytes (containing any of the 3 mutant proteins), when they are cultured together.   Normal astrocytes don’t do this.

So a lot of hard work found that it was polyphosphate in the supernatant fluid that was the killer.

So what is polyphosphate?  It’s been known for years, and is found in ALL cells: bacterial, plant, animal.  It is also produced abiotically in volcanic exudates and deep sea steam vents.  No one knows what it does, so it has been called a molecular fossil.  Again teleology should inform biologic research (but it doesn’t).  Polyphosphate must be doing something useful or it wouldn’t be present in all living cells.

Chemically, polyphosphate is a chain of HUNDREDS to THOUSANDS of phosphate residues linked by high energy phosphoanhydride bonds.

Like this —

HO – PO2 – OH  +  HO – PO2 – OH  –>  HO – PO2 – O – PO2 – OH  +  H2O

— the – O – in the middle is the phosphoanhydride bond

The authors treated motor neurons in culture with polyphosphate and found that it killed 40% of them.  So what?  Schmidt’s law of pharmacology says that enough of anything will do anything.  So they looked at the spinal cords of patients dying of ALS and found that polyphosphate levels were higher than in neurologically normal controls.

So it’s open season on polyphosphate. Finding out what it does in normal cells, finding out how it kills motor neurons, finding out if decreasing its levels will help ALS (it does in cultures of motor neurons but that’s a long way from a living patient).  It’s an entirely new angle on an awful disease, with no useful treatment.  There is simply an enormous amount of work to be done.

Watch this space.

 

 

The cell is not a bag of water

We have over 800 G protein coupled receptors (GPCRs).  We have not found 800 distinct intracellular messengers (such as cyclic adenosine monophosphate, aka CAMP).  A single cell can express up to 100 GPCRs [ Mol. Pharm. vol. 88 pp. 181 – 187 ’15 ].  Some of them raise CAMP levels, others decrease it.  CAMP is supposed to diffuse freely within the cell.  If so, different GPCRs which change cellular CAMP levels to the same extent should produce identical effects. But they don’t.

One example — Isuprel stimulation of beta adrenergic GPCRs increases cardiac contractile force and activates glycogen metabolism.  Prostaglandin E1 (PGE1) GPCR causes the same CAMP increase without this effect.

A recent fascinating paper may explain why [ Cell vol. 185 pp. 1130 – 1142 ’22 ]  The authors had previously done work showing that under basal conditions CAMP is mostly bound to a protein (regulatory protein kinase A subunit — aka PKA RIalpha ) leading to very low concentrations of free CAMP.

So free diffusion occurs only if CAMP levels are elevated well above the number of binding sites for it.

As usual, to get new interesting results, new technology had to be used.  A biosensor for CAMP based on Forster Resonance Energy Transfer aka FRET —  https://en.wikipedia.org/wiki/Förster_resonance_energy_transfer, was added to two different GPCRs — one for the Glucagon Like Peptide 1 (GLP1) and the other for the beta2 adrenergic receptor.

Even better, they fused the biosensor to the GPCRs using rulerlike spacers each 300 Angstroms (30 nanoMeters) long.  So they could measure CAMP levels at 30 and 60 nanoMeters from the GPCR.  Levels were highest close to the receptor, but even at 30 and 60 nanoMeters away they were higher than the levels in the cytoplasm away from the cell membrane.  So this is pretty good evidence for what the authors call RAINs (Receptor Associated Independent camp Nanodomains; God they love acronyms don’t they?).

Similar localized responses were seen with the beta2 adrenergic receptors, suggesting that RAINs might be a general phenomenon of GPCRs — but a lot more work is needed.

Even more interesting was the fact that there was no crosstalk between the RAINs of GLP1R and the beta2 adrenergic receptor.  Stimulation of one GPCR changed only the RAIN associated with it and didn’t travel to other RAINs.

So the cell with its GPCRs resembles a neuron with its synapses on dendritic spines, where processing at each synapse remains fairly local before the neuron cell body integrates all of them.  It’s like Las Vegas — what happens at GPCR1 (synapse1) stays in GPCR1 (synapse1).  Well not quite, but you get the idea.

A synonymous codon that isn’t

Molecular biology is simply too elegant and beautiful to be left to the molecular biologists.  So I’m going to present the intriguing result of a recent paper as I would take notes on it for myself, and then unpack it explaining the various terms contained as I go along.

If you’re really adventurous, start reading a series of 5 posts I wrote starting with https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and follow the links.

It should explain everything in the paper below.

The paper itself is Nature vol. 602 pp. 335 – 342 ’22 — https://www.nature.com/articles/s41586-022-04451-4.pdf.

The unvarnished result:  Just mutating glutamine to lysine at position 61 of the KRAS oncogene (Q61K) isn’t enough to make KRAS resistant to an anticancer drug that attacks it (Osimertinib).  One of the synonymous codons for glycine at position 60 must be switched to another.

OK:  let’s unpack this starting with synonymous codon.

The DNA making up our genome is a string of elements (nucleotides also known as bases) strung together.  Similarly, our proteins are strings of elements (amino acids).  The order is crucial; just as it is with the 26 letters making up words. Consider the two words united and untied.

Bases come in 4 varieties (A, T, G and C).  Amino acids come in twenty varieties, of which three are glycine (G), glutamine (Q) and lysine (K); the one letter abbreviations don’t make much sense but that’s the way it is.

Since the order of both bases and amino acids is important, it’s clear that A T and T A are different.  2 bases can only code for 16 amino acids (4 x 4).  Go up to 3 bases and you can code for 64 (4 x 4 x 4), which is overkill.   A sequence of 3 bases is called a codon. All 64 codons code for an amino acid (except for three of them about which much more later).  This means that there must be several codons coding for the same amino acid; these are the synonymous codons.

The number of codons for a given amino acid ranges from 1 (methionine M) to 6 (Leucine L).  Here are the 4 synonymous codons for glycine — GGA, GGC, GGG and GGT.  Note how similar they are.
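For anyone who likes to see the counting done, here is a small sketch that builds the standard genetic code and pulls out the synonymous codons.  The table is the usual textbook one written as a 64 letter string in T, C, A, G order (with * for the three codons that don’t code for an amino acid); the variable and function names are just ones I made up for the illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 three-base codons to its one letter amino acid code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid: str) -> list[str]:
    """All codons that code for the given one letter amino acid."""
    return sorted(codon for codon, aa in CODON_TABLE.items() if aa == amino_acid)


print(4 ** 2, 4 ** 3)                  # 16 64
print(synonymous_codons("G"))          # ['GGA', 'GGC', 'GGG', 'GGT']
print(len(synonymous_codons("M")))     # 1
print(len(synonymous_codons("L")))     # 6
print(synonymous_codons("*"))          # ['TAA', 'TAG', 'TGA']
```

Note that the table is written in DNA letters (T rather than the U found in RNA), to match the codons as they are given in this post.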

Now the human genome has 3,200,000,000 bases strung together divided into 46 pieces (the chromosomes).  If placed end to end (Dorothy Parker fashion) they would be 3 feet 3 inches (1 meter) long.  All this is in a cell so small it is invisible to the naked eye.   If this is too much to get your head around, you might enjoy the following series of 6 posts — start here and follow the links https://luysii.wordpress.com/2010/03/22/the-cell-nucleus-and-its-dna-on-a-human-scale-i/
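The one meter figure is easy to check.  The sketch below assumes the textbook spacing of about 0.34 nanoMeters between successive bases along the double helix, a number which is not given above, so treat the result as a rough estimate.

```python
bases = 3_200_000_000        # bases in the genome, as quoted above
rise_per_base_nm = 0.34      # assumed spacing between successive bases (textbook double helix value)

length_m = bases * rise_per_base_nm * 1e-9   # nanoMeters to meters
print(f"{length_m:.2f} meters")              # roughly one meter
```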

Any 3 bases linked together code for an amino acid, but there are many different ways to ‘read’ the genome. Among the many proteins our genome codes for are the transcription factors (1,639 of them as of 2018) which bind to stretches of 10 or more bases, to activate certain genes.   That’s 4^10 possibilities (over a million) allowing a unique binding site for the 1,639.  So transcription factors read the genome in groups of 10 or so not 3.

There is yet another way to read the genome, and this has to do with the fact the genes coding for proteins are much longer (have more bases) than the 3 times the number of amino acids they code for.  The classic example is dystrophin, a gene mutated in Duchenne muscular dystrophy.  It’s a monster protein with 3,685 amino acids — so it needs 3,685 *3 = 11,055 bases in a row to code for them at 3 bases/amino acids.  The dystrophin gene, however, stretches for 2,220,223  bases.  So the protein coding parts of the gene (the exons) come in 79 different pieces separated by parts that don’t code for amino acids (the introns).
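To get a feel for how little of the dystrophin gene actually codes for protein, here is the arithmetic as a short sketch.  It uses only the numbers quoted above and ignores the stop codon and the untranslated ends of the mRNA, so the fraction is approximate.

```python
amino_acids = 3_685        # length of the dystrophin protein
gene_length = 2_220_223    # bases spanned by the dystrophin gene
exon_count = 79            # number of protein coding pieces

coding_bases = amino_acids * 3               # 3 bases per amino acid
coding_fraction = coding_bases / gene_length

print(coding_bases)                          # 11055
print(f"{coding_fraction:.2%}")              # 0.50%
print(round(coding_bases / exon_count))      # average coding bases per exon: 140
```

So about one base in 200 of the gene ends up specifying an amino acid; the rest is intron.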

I’m skipping a lot here, but the introns must be spliced out of a copy of the gene (mRNA).  Again the genome is read by yet another machine (the spliceosome) which removes introns from newly formed copies of the gene (the mRNA).  The spliceosome is a huge molecular machine containing 5 RNAs (called small nuclear RNAs, aka snRNAs), along with 50 or so proteins with a total molecular mass of around 2,500,000 Daltons (a carbon atom is 12 Daltons).  Most protein coding genes have introns and exons, and most of their proteins exist in multiple forms due to alternative splicing of introns.  The spliceosome reads the mRNA in 6 – 8 base chunks looking for sites (splicing sites) to bind and begin splicing out introns. Yet another way to ‘read’ a sequence of bases.   Exon sequences which promote or repress alternative splicing sites are known (these are called ESEs = exonic splicing enhancers, and ESSs = exonic splicing silencers).

And now, at very long last, we get to the four synonymous codons of glycine which aren’t functionally synonymous at all.  This isn’t trivial: they determine the base sequence a mutated gene must have to produce cancer.

Here’s the unvarnished result once again — Just mutating glutamine to lysine at position 61 of the KRAS oncogene (Q61K) isn’t enough to make KRAS resistant to an anticancer drug that attacks it (Osimertinib).  One of the synonymous codons for glycine at position 60 must be switched to another.

What is KRAS?  A protein which gets its name from a virus causing cancer in rats.  Kirsten RAt Sarcoma virus.  KRAS, when active, relays signals from outside the cell to the nucleus to make the cell proliferate.  The protein exists in active and inactive forms.  Humans have KRAS, and 3 similar proteins.  Mutations causing  members of the protein family to remain in constantly active form are found in 1/3 of all cancers.  In the case of KRAS some activating mutations occur at positions 60 and 61 of the 189 amino acid protein.  That’s all it takes.

The codon for glutamine at position 61 in KRAS is CAA.  To change it to the codon for lysine requires a change of just one base e.g. from CAA (glutamine) to AAA (lysine) and now you have  a KRAS which is always active producing cancer.

Recall that glycine has 4 codons (GGA, GGC, GGG and GGT).  The one found in unmutated KRAS is GGT.  This codon is never found in the KRAS Q61K mutant seen in tumors.  Why?  Because GGTAAA forms a splice site, which the splicing machine uses to cut out a different set of introns, leading to an exon which contains one of the 3 codons mentioned above not coding for an amino acid.  They are called termination codons or stop codons, and tell the machinery making protein from mRNA to quit.   This means that the full mutated KRAS with its 188 amino acids is never made.  So tumor producing KRAS has GGGAAA or GGAAAA or GGCAAA at positions 60 and 61 and never GGTAAA.
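The logic can be laid out in a few lines.  This is only a toy illustration: it does a literal search for the six bases GGTAAA named above, which is nothing like a real splice site predictor, and the function and variable names are invented for the example.

```python
GLYCINE_CODONS = ["GGA", "GGC", "GGG", "GGT"]  # the four synonymous codons for glycine
LYSINE_CODON = "AAA"       # codon 61 after the Q61K mutation (CAA changed to AAA)
SPLICE_MOTIF = "GGTAAA"    # the six bases said above to form a splice site


def creates_splice_site(codon_60: str, codon_61: str = LYSINE_CODON) -> bool:
    """True if codons 60 and 61, read together, spell out the splice motif."""
    return SPLICE_MOTIF in codon_60 + codon_61


for codon in GLYCINE_CODONS:
    six_bases = codon + LYSINE_CODON
    if creates_splice_site(codon):
        verdict = "new splice site, full KRAS never made"
    else:
        verdict = "full-length mutant KRAS"
    print(six_bases, verdict)
```

All four six-base stretches code for the same two amino acids (glycine then lysine), yet only the three without the motif can give the tumor a working mutant KRAS.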

So the 4 synonymous glycine codons have very nonsynonymous effects.  Now you know.  Elegant isn’t it?

 

 

The RNA world strikes again (it never stopped)

Jpx is a long (over 200 nucleotides) nonCoding (for protein that is) RNA (i.e. a lncRNA).  It is an example of the RNA world from which we (presumably) sprang. One of its functions is to control another RNA, and a fairly important one at that, namely Xist, which inactivates one of a woman’s two X chromosomes.  The jpx gene is just 10 kiloBases away from that of Xist. Jpx turns on the transcription of Xist which then goes and coats the X chromosome from which it is transcribed, shutting off most of its genes.

One of the mechanisms by which Jpx turns on Xist production is by binding to a protein called CTCF.  CTCF sits on the promoter of the Xist gene until Jpx binds to it displacing CTCF from the promoter.

CTCF is a much better known actor, and along with cohesin is thought to be responsible for the formation of chromosome loops, and the establishment of TADs (topologically associated domains) which are basically loops of chromosomes containing about a million nucleotides with an average of 8 protein coding genes which are coordinately expressed as a result.

That’s fairly impressive.  What happens when you knock out the jpx gene?  [ Cell vol. 184 pp. 6157 – 6173 ’21 ] did just this and all Hell broke loose.  Jpx keeps CTCF from binding promoters, and without jpx thousands of chromosome loops are replaced by others, with downregulation of some 700 protein coding genes.

Again, the RNA world is like some legacy software (think DOS) underlying the latest stuff (think Windows), forgotten but not gone.