I’ve never bought the idea that the 98% of our 3.2 gigaBase genome not coding for protein is junk. Consider the humble leprosy organism. It’s a mycobacterium (like the organism causing TB), but because it essentially is confined to man, and lives inside humans for most of its existence, it has jettisoned large parts of its genome, first by throwing about 1/3 of it out (the genome is 1/3 smaller than that of TB, from which it is thought to have diverged 66 million years ago), and second by mutation of many of its genes so protein can no longer be made from them. Why throw out all that DNA? The short answer is that it is metabolically expensive to produce and maintain DNA that you’re not using.
Which brings us to Cell vol. 156 pp. 907 – 919 ’14. At least half of our genome is made of repetitive elements. We have some 520,000 (imperfect) copies of LINE1 elements — each up to 6,000 nucleotides long. There are 1,400,000 (imperfect) copies of Alu each around 300 nucleotides long. This stuff has been called junk for decades. However it has become apparent that over 50% of our entire genome is transcribed into RNA. This is also expensive metabolically.
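To put those copy numbers in perspective, here is a rough back-of-the-envelope sketch in Python using only the figures quoted above. The variable and function names are my own, and the LINE1 number is strictly an upper bound: it assumes every copy is the full 6,000 nucleotides, which the "up to" already tells you is not the case (many copies are truncated).

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the post.
GENOME_NT = 3.2e9          # 3.2 gigaBase genome

LINE1_COPIES = 520_000
LINE1_MAX_NT = 6_000       # "up to" 6,000, so this gives an upper bound only

ALU_COPIES = 1_400_000
ALU_NT = 300               # "around" 300


def genome_fraction(copies, length_nt, genome_nt=GENOME_NT):
    """Fraction of the genome occupied by `copies` elements of `length_nt` each."""
    return copies * length_nt / genome_nt


alu_fraction = genome_fraction(ALU_COPIES, ALU_NT)
line1_upper_bound = genome_fraction(LINE1_COPIES, LINE1_MAX_NT)

print(f"Alu: about {alu_fraction:.1%} of the genome")
print(f"LINE1: at most {line1_upper_bound:.1%} (if every copy were full length)")
```

Even on this crude estimate, Alu alone comes to roughly an eighth of the genome, far more than the 2% coding for protein.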
Addendum 17 Mar: Just the cost of making a single nucleotide from scratch to hook into mRNA is 50 ATP molecules (according to an estimate I read). It also takes energy for the polymerase to hook two nucleotides together — but I can’t find out what it is (anyone know?). It’s hard to avoid teleology when thinking about biology — but why should a cell expend all this metabolic energy to copy half or more of its genome into RNA, if it weren’t getting something useful back?
Why hasn’t evolution got rid of this stuff, like the leprosy organism? Probably because it’s doing several important things we don’t understand. Here’s one of them. The Cell paper did something clever and obvious (now that someone else thought of it). C0T-1 DNA is placental DNA predominantly 50 – 300 nucleotides in size, very enriched in repetitive DNA sequences. It is used to block nonspecific hybridization in microarray screening for mRNA coding for protein. The authors used C0T-1 DNA to look at whole cells to find RNA transcribed from these repetitive elements, and more importantly, to find where in the cell it was located.
Guess what they found? RNA transcribed from repetitive DNA is associated big time with active chromatin (aka euchromatin) in interphase cells (i.e. cells not undergoing mitosis). So RNA transcribed from Alu and LINE1 is a structural component of our chromosomes. Since the 3.2 gigaBases of our genome, if stretched out, would be 1 METER long, a lot of our DNA occurs in very compact structures (heterochromatin), which are thought to be transcriptionally inactive. What happens when you use RNase (an enzyme breaking down RNA) to remove the RNA? The chromosomes condense to heterochromatin. So the junk may be keeping our chromosomes in an ‘open’ state, a fairly significant function.
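The 1 meter figure is easy to check for yourself. A minimal sketch, assuming the standard textbook rise of about 0.34 nanometers per base pair for B-form DNA (that number is my assumption, it is not from the Cell paper, and the names below are just for illustration):

```python
# Rough check of the "1 METER" figure.
GENOME_BP = 3.2e9            # base pairs, as quoted in the post
RISE_PER_BP_NM = 0.34        # assumed: textbook rise per base pair in B-DNA

# Convert nanometers to meters (1 nm = 1e-9 m).
stretched_length_m = GENOME_BP * RISE_PER_BP_NM * 1e-9

print(f"Stretched-out genome: about {stretched_length_m:.2f} meters")
```

That is for one copy of the genome. All of it has to fit inside a nucleus a few millionths of a meter across, which is why so much of it is compacted.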
Keeping chromatin open is the exact opposite of what XIST does. XIST is a 17,000 nucleotide RNA transcribed from the X chromosome, which keeps one of the two X’s each female possesses inactive by coating it like the ecRNAs.
The authors conclude with “we are far from understanding genome expression and regulation.” Amen.
If some of this is a bit above your molecular biological pay grade — please see a series of articles “Molecular Biology Survival Guide for Chemists” — here’s a link to the first one — https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/. There are 4 more.