Tag Archives: XIST

The RNA world strikes again (it never stopped)

Jpx is a long (over 200 nucleotides) nonCoding (for protein that is) RNA (i.e. a lncRNA). It is an example of the RNA world from which we (presumably) sprang. One of its functions is to control another RNA, and a fairly important one at that: Xist, which inactivates one of a woman’s two X chromosomes. The Jpx gene is just 10 kiloBases away from that of Xist. Jpx turns on the transcription of Xist, which then coats the X chromosome from which it is transcribed, shutting off most of its genes.

One of the mechanisms by which Jpx turns on Xist production is by binding to a protein called CTCF. CTCF sits on the promoter of the Xist gene until Jpx binds to CTCF and displaces it from the promoter.

CTCF is a much better known actor. Along with cohesin, it is thought to be responsible for the formation of chromosome loops and the establishment of TADs (topologically associating domains), which are basically loops of chromosome containing about a million nucleotides and an average of 8 protein coding genes, which are coordinately expressed as a result.

That’s fairly impressive. What happens when you knock out the Jpx gene? [ Cell vol. 184 pp. 6157 – 6173 ’21 ] did just this and all Hell broke loose. Jpx keeps CTCF from binding promoters, and without Jpx thousands of chromosome loops are replaced by others, with downregulation of some 700 protein coding genes.

Again, the RNA world is like some legacy software (think DOS) underlying the latest stuff (think Windows), forgotten but not gone.

The RNA world strikes again

Life is said to have originated in the RNA world. We all know about the big 3 important RNAs for the cell: mRNA, ribosomal RNA and transfer RNA. But just like the water, sewer, power and subway systems under Manhattan, there is another world down there in the cell which is just beginning to come into focus.

I’ve written several posts about the RNA world in our cells (links at the end), but the latest is really staggering, in that RNA is helping to organize how our DNA lies in the nucleus.

As usual the discoveries depended on new technologies, RD-SPRITE in this case (you don’t want to know what the acronym stands for; by the bye, have you noticed how many more acronyms are appearing in papers you read?). It is extremely complex, but the technique is said to be able to simultaneously map thousands of RNA and DNA molecules at high resolution relative to all other RNA and DNA molecules. Details in Cell vol. 184 pp. 5775 – 5790 ’21.

The count of long nonCoding (for protein that is) RNAs is now in the tens of thousands [ Science vol. 373 pp. 623 – 624 ’21 ]. They have all sorts of functions, but the present work shows that 93% of them stay close, within the nucleus, to the gene from which they are transcribed. Here they bind other proteins in precise territories in the nucleus (because the genes for lncRNAs are found in equally precise territories in the nucleus). This establishes functional compartments in the nucleus to regulate gene expression.

Interestingly, long nonCoding RNAs are transcribed at very low levels, which led people to dismiss them as chaff. Their binding of proteins explains how so few molecules can do so much.

That’s pretty abstract. Consider Xist, a large nonCoding RNA which inactivates one of the X chromosomes in females. Just two Xist molecules are able to seed a multiprotein cloud around the Xist locus on the X.

Jpx, to be described later, is crucial in establishing TADs (topologically associating domains).

Here are some older posts on the RNA world

Forgotten but not gone

Forgotten but not gone — take II

Forgotten but not gone — take III

Maybe there really is junk DNA

Until about 20 years ago, molecular biology was incredibly protein-centric.  Consider the following terms — nonsense codon, noncoding DNA, junk DNA.  All are pejorative and arose from the view that all the genome does is code for protein.  Nonsense codon means one of the 3 termination codons, which tells the ribosome to stop making protein.  Noncoding DNA means not coding for protein (with the implication that DNA not coding for protein isn’t coding for anything).

The term Junk DNA goes back to the 60s, a time of tremendous hubris as the grand biochemical plan of life was being discovered. People were not embarrassed to use the term ‘central dogma’ which was DNA makes RNA makes protein. It therefore came as a shock once we had a better handle on the size of the genome to discover that less than 2% of it coded for protein. Since much of it was made of repetitive sequences it was called junk DNA.

I never bought it, thinking it very dangerous to dismiss as unimportant what you did not understand or could not measure. Probably this was influenced by my experience as an Air Force M.D. ’68 – ’70 during the Vietnam war.

But now comes a sure to be contentious but well reasoned paper arguing that junk DNA does exist, even though it is occasionally transcribed [ Cell vol. 183 pp. 1151 – 1161 ’20 ]. The paper discusses all RNAs in the cell not part of the ribosome, or small nucleolar RNAs (snoRNAs) or microRNAs.

They note that no enzyme is perfect, acting only on the substrate we think evolution optimized it for; they call this promiscuous behavior. So a transcription factor which binds to a particular promoter sequence will also bind to near-miss sequences. Moreover such near misses are constantly being generated in our genome by random mutation. This is why they think that the ENCODE (ENCyclopedia Of DNA Elements) project found that the entire genome is transcribed into RNA. The implication made by many is that this must be functional.
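To get a feel for how common near misses could be, here is a rough back-of-envelope sketch. It is not from the paper: the motif length of 8 is a hypothetical choice, bases are assumed to be independent and equally likely (which real genomes are not), and the 3.2 gigaBase genome size is the figure quoted elsewhere on this page. Only one strand is counted.

```python
from math import comb

def match_probability(k: int, max_mismatches: int) -> float:
    """Chance that a random k-mer matches a fixed motif with at most
    max_mismatches substitutions, assuming independent, equally likely bases."""
    # Choose which i positions differ, then one of 3 wrong bases at each.
    ways = sum(comb(k, i) * 3**i for i in range(max_mismatches + 1))
    return ways / 4**k

MOTIF_LENGTH = 8              # hypothetical binding-site length (assumption)
GENOME_SIZE = 3_200_000_000   # 3.2 gigaBases, one strand only (assumption)

exact = GENOME_SIZE * match_probability(MOTIF_LENGTH, 0)
near = GENOME_SIZE * match_probability(MOTIF_LENGTH, 1)

print(f"expected exact sites:          {exact:,.0f}")
print(f"expected sites, <= 1 mismatch: {near:,.0f}")
```

Under these assumptions, allowing a single mismatch multiplies the number of candidate sites 25-fold (from roughly 49,000 to roughly 1.2 million), which gives some intuition for why a slightly promiscuous transcription factor has so many places to land.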

However many random pieces of DNA can activate transcription [ Genes Dev. vol. 30 pp. 1895 – 1907 ’16 ] producing what the authors call transcriptional noise.

There is evidence that the cell has evolved a way to stop some of this. U1 snRNP recognizes the 5′ splice site motif. It is present in nuclei at levels an order of magnitude higher than other spliceosomal subcomplexes, so it monitors for RNAs which have a 5′ splice site motif but which lack the 3′ splice site. These RNAs are subsequently destroyed, never making it out of the nucleus.

They think the primary function of lncRNA is chromatin remodeling affecting gene expression — this is certainly true of XIST which silences one of the two X chromosomes females carry.

There is a lot more very technical molecular biology and close reasoning in the paper, but this should be enough to whet your interest. It is well worth reading. Probably, like me, you’ll be mentally arguing with the authors as you read it, but that’s the sign of a good paper.

Now for a question which has always puzzled me. Consider the leprosy organism. It’s a mycobacterium (like the organism causing TB), but because it is essentially confined to man, and lives inside humans for most of its existence, it has jettisoned large parts of its genome, first by throwing about 1/3 of it out (the genome is about 1/3 smaller than that of TB, from which it is thought to have diverged 66 million years ago), and second by mutation of many of its genes so protein can no longer be made from them. Why throw out all that DNA? The short answer is that it is metabolically expensive to produce and maintain DNA that you’re not using.

If you want a few numbers here they are:
Genome of M. tuberculosis: 4,441,529 nucleotides
Genome of M. leprae: 3,268,203 nucleotides
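
As a quick check on the arithmetic, this minimal sketch computes how much smaller the leprosy genome is, using only the two genome sizes just quoted (the variable names are mine).

```python
# Genome sizes in nucleotides, as quoted above.
M_TB = 4_441_529
M_LEPRAE = 3_268_203

lost = M_TB - M_LEPRAE        # nucleotides missing relative to TB
fraction_lost = lost / M_TB   # share of the TB genome that is gone

print(f"nucleotides lost: {lost:,}")
print(f"fraction of the TB genome lost: {fraction_lost:.1%}")
```

On these figures the reduction comes to a bit over a quarter of the TB genome, somewhat less than a full third but in the same ballpark.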

Clearly microorganisms are under high selective pressure, and the paper says that humans are under almost none, but it seems to me that multicellular organisms would have found a way to get rid of DNA they don’t need.

It may well be that all this DNA and the RNA transcribed from it is evolutionary potting soil, waiting for some new environmental stress to put it to use.

What junk DNA is doing

I’ve never bought the idea that the 98% of our 3.2 gigaBase genome not coding for protein is junk. Consider the humble leprosy organism. It’s a mycobacterium (like the organism causing TB), but because it is essentially confined to man, and lives inside humans for most of its existence, it has jettisoned large parts of its genome, first by throwing about 1/3 of it out (the genome is about 1/3 smaller than that of TB, from which it is thought to have diverged 66 million years ago), and second by mutation of many of its genes so protein can no longer be made from them. Why throw out all that DNA? The short answer is that it is metabolically expensive to produce and maintain DNA that you’re not using.

Which brings us to Cell vol. 156 pp. 907 – 919 ’14. At least half of our genome is made of repetitive elements. We have some 520,000 (imperfect) copies of LINE1 elements — each up to 6,000 nucleotides long. There are 1,400,000 (imperfect) copies of Alu each around 300 nucleotides long. This stuff has been called junk for decades. However it has become apparent that over 50% of our entire genome is transcribed into RNA. This is also expensive metabolically.
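
To put those copy numbers in perspective, here is a small sketch of how much of the genome they could account for. It uses only the figures above and the 3.2 gigaBase genome size. Treating every LINE1 copy as full length is a deliberate overestimate, since the text only says "up to" 6,000 nucleotides.

```python
GENOME_SIZE = 3_200_000_000   # 3.2 gigaBases

ALU_COPIES = 1_400_000
ALU_LENGTH = 300              # "around 300" nucleotides each

LINE1_COPIES = 520_000
LINE1_MAX_LENGTH = 6_000      # upper bound per copy, not a typical length

alu_bases = ALU_COPIES * ALU_LENGTH
alu_fraction = alu_bases / GENOME_SIZE

# Upper bound only: assumes every LINE1 copy is full length.
line1_max_bases = LINE1_COPIES * LINE1_MAX_LENGTH
line1_max_fraction = line1_max_bases / GENOME_SIZE

print(f"Alu: {alu_bases:,} bases, about {alu_fraction:.1%} of the genome")
print(f"LINE1 (if all full length): {line1_max_bases:,} bases, "
      f"{line1_max_fraction:.1%} of the genome")
```

The Alu estimate comes out at about 13% of the genome. The LINE1 upper bound would by itself fill nearly the whole genome, so most of the 520,000 copies must be far shorter than 6,000 nucleotides.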

Addendum 17 Mar: Just the cost of making a single nucleotide from scratch to hook into mRNA is 50 ATP molecules (according to an estimate I read). It also takes energy for the polymerase to hook two nucleotides together — but I can’t find out what it is (anyone know?). It’s hard to avoid teleology when thinking about biology — but why should a cell expend all this metabolic energy to copy half or more of its genome into RNA, if it weren’t getting something useful back?
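
For a sense of scale, here is a minimal sketch that applies the 50 ATP per nucleotide estimate to a few transcript lengths mentioned on this page. It counts only the cost of making the nucleotides. The polymerase's own energy cost is left out because, as noted, I don't have a number for it. The function name is just for illustration.

```python
ATP_PER_NUCLEOTIDE = 50   # the estimate quoted above, taken at face value

def synthesis_cost_atp(n_nucleotides: int) -> int:
    """ATP spent making the nucleotides for one transcript of this length.
    Ignores the (unknown) cost of polymerization itself."""
    return n_nucleotides * ATP_PER_NUCLEOTIDE

# Lengths in nucleotides, as quoted elsewhere on this page.
transcripts = {
    "Alu": 300,
    "full-length LINE1": 6_000,
    "XIST": 17_000,
}

for name, length in transcripts.items():
    print(f"{name}: {synthesis_cost_atp(length):,} ATP per copy")
```

A single XIST transcript comes to 850,000 ATP on this estimate, before counting polymerization or how many copies the cell makes.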

Why hasn’t evolution got rid of this stuff, like the leprosy organism? Probably because it’s doing several important things we don’t understand. Here’s one of them. The Cell paper did something clever and obvious (now that someone else thought of it). C0T-1 DNA is placental DNA predominantly 50 – 300 nucleotides in size, very enriched in repetitive DNA sequences. It is used to block nonspecific hybridization in microarray screening for mRNA coding for protein. The authors used C0T-1 DNA to look at whole cells to find RNA transcribed from these repetitive elements, and more importantly, to find where in the cell it was located.

Guess what they found? RNA transcribed from repetitive DNA is associated big time with interphase (i.e. not undergoing mitosis) active chromatin (aka euchromatin). So RNA transcribed from Alu and LINE1 is a structural component of our chromosomes. Since the length of the 3.2 gigaBases of our genome, if stretched out, is 1 METER, a lot of our DNA occurs in very compact structures (heterochromatin) which are thought to be transcriptionally inactive. What happens when you use RNase (an enzyme breaking down RNA) to remove the RNA? The chromosomes condense to heterochromatin. So the junk may be keeping our chromosomes in an ‘open’ state, a fairly significant function.
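
The 1 meter figure is easy to check. This sketch assumes the textbook rise of about 0.34 nanometers per base pair for ordinary B-form DNA, a number that does not appear in the paper or in this post, and applies it to one 3.2 gigaBase copy of the genome.

```python
GENOME_SIZE_BP = 3_200_000_000   # 3.2 gigaBases, one copy of the genome
RISE_PER_BP_NM = 0.34            # B-form DNA rise per base pair (assumption)

length_nm = GENOME_SIZE_BP * RISE_PER_BP_NM
length_m = length_nm * 1e-9      # nanometers to meters

print(f"stretched-out length: {length_m:.2f} m")
```

That comes to a little over 1 meter per genome copy. A typical human cell carries two copies, so the total to be packed into each nucleus is roughly double that.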

This is the exact opposite of XIST, a 17,000 nucleotide RNA transcribed from the X chromosome, which keeps one of the two X’s each female possesses inactive by coating it like the ecRNAs.

The authors conclude with “we are far from understanding genome expression and regulation.” Amen.

If some of this is a bit above your molecular biological pay grade — please see a series of articles “Molecular Biology Survival Guide for Chemists” — here’s a link to the first one — https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/. There are 4 more.