How many distinct RNA polymers can be made using the mass of the earth to do so?

The comments by Another O-chemist and Yggdrasil on the last post were excellent, and just the type I’d hoped to get, but before responding I’d like to throw this post into the mix.

Why RNA? Because that’s what the earliest forms of life were made of, according to the best current speculations. What is the mass of the average RNA nucleotide (base + sugar + phosphate)? Phosphate has a mass of 96 Daltons, ribose a mass of 115 Daltons, and the ‘average’ base a mass of (112 + 115 + 134 + 150)/4 = 128. So the average mass of an RNA nucleotide is 96 + 115 + 128 = 339 Daltons, or very nearly 3 nucleotides per kiloDalton.
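The bookkeeping is easy to check in a few lines (a sketch using the post's round figures for the component masses):

```python
# Average RNA nucleotide mass, using the post's round figures (in Daltons).
PHOSPHATE = 96
RIBOSE = 115
BASE_MASSES = (112, 115, 134, 150)  # the four bases, as given in the post

avg_base = sum(BASE_MASSES) / len(BASE_MASSES)         # 127.75, rounded to 128
avg_nucleotide = PHOSPHATE + RIBOSE + round(avg_base)  # 339 Da

print(avg_nucleotide)         # 339
print(1000 / avg_nucleotide)  # ~2.95, i.e. very nearly 3 nucleotides per kDa
```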

As before, according to Halliday’s Physics, 6th Edition, the mass of the earth is 6 * 10^27 grams. Assume the earth is made entirely of C, H, O, N and P in just the proportions we need. By the calculations in the previous post, a kiloDalton has a mass of roughly 10^-21 grams. Each position in the polyribonucleotide can be one of the 4 bases.

Now it’s time to calculate the number of distinct possibilities for a polyribonucleotide of length n. Pretty simple — it’s just 4^n. Order is crucial, just as united has a different meaning from untied, GACU is different from AGCU (the nucleotides of RNA are abbreviated A, G, C, U).

So for a polyribonucleotide of length n = 21 there are 4^21 = 4,398,046,511,104 distinct orderings of the nucleotides (over 4 trillion), each with a mass of 21/3 = 7 kiloDaltons. One copy of every ordering comes to about 3 * 10^-8 grams (tens of nanoGrams) without breathing hard.

Length 42 gets us to 4^42 = 19,342,813,113,834,066,795,298,816, about 2 * 10^25 possibilities, each with a mass of 42/3 = 14 kiloDaltons. One copy of every ordering now weighs about 3 * 10^5 grams: hundreds of kilograms already.
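The counting so far can be redone programmatically, a sketch assuming the post's conversions of 3 nucleotides per kiloDalton and 10^-21 grams per kiloDalton:

```python
# For each length n: 4**n distinct sequences, each of mass (n/3) kDa,
# with 1 kDa taken as 1e-21 grams (as in the post).
KDA_IN_GRAMS = 1e-21

def enumerate_mass(n):
    """Return (number of distinct n-mers, mass in grams of one copy of each)."""
    count = 4 ** n
    total_g = count * (n / 3) * KDA_IN_GRAMS
    return count, total_g

for n in (21, 42, 60):
    count, total_g = enumerate_mass(n)
    print(f"n={n}: {count:.3e} sequences, {total_g:.3e} g")
```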

Given the genetic code’s ratio of 3 nucleotides per amino acid, that’s only enough for a 14 amino acid peptide. Now the 64 possible 3-nucleotide codons code for only 20 amino acids plus stop (3 of the 64 are stop codons), so there is some coding overkill in these numbers.

Not so fast. Consider progeria, a terrible (but fortunately rare) disease, with only some 50 affected children worldwide [ Nature vol. 440 pp. 32 – 34 ’06 ]. Unfortunates with progeria age rapidly and die of the diseases of old age (heart attack and stroke) in their teens [ Nature vol. 423 pp. 293 – 298, 298 – 301 ’03 ]. The defective gene has been found: Lamin A (a component of the nucleus which helps to shape it). 18 of 20 cases showed a de novo mutation at the same place in the gene (1825 C –> T), in codon #608. This doesn’t change the amino acid (which is glycine), but it creates a cryptic splice site within exon 11, resulting in the production of a protein with 50 amino acids missing near the carboxy terminus (the carboxy terminal end of the protein is still there and can be farnesylated). The truncated mutant is called progerin.

So even two distinct codons mapping to the same amino acid can have profoundly different effects. Further examples include the exonic splicing enhancers and silencers. For details see my post of 20 Jan ’09, “The Death of the Synonymous Codon”, under Chemiotics in the blog of “The Skeptical Chymist”. It’s too long to go into here, but pretty interesting.

Onward and upward. 4^60 is 1,329,227,995,784,915,872,903,807,060,280,344,576, or about 10^36 polynucleotides 60 bases long, each with a mass of 60/3 = 20 kiloDaltons. The mass of all 10^36 of them is then about 3 x 10^37 kiloDaltons. Recall that a kiloDalton is 10^-21 grams, so this group has an aggregate mass of about 3 x 10^16 grams. It’s pretty clear that well before we get to a polynucleotide of 90 units we’ll have exhausted the mass of the earth.

4^90 = 1,532,495,540,865,888,858,358,347,027,150,309,183,618,739,122,183,602,176, or about 1.5 x 10^54, each 90-mer with a mass of 90/3 = 30 kiloDaltons. One copy of every ordering would weigh about 5 x 10^34 grams, some ten million times the mass of the earth.
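As a check on where the crossover actually falls, a few lines find the smallest length at which one copy of every distinct sequence outweighs the earth (assuming the post's figures of 6 * 10^27 grams for the earth and 10^-21 grams per kiloDalton):

```python
# Smallest n such that one copy of each of the 4**n distinct n-mers,
# at (n/3) kDa apiece, outweighs the earth.
EARTH_MASS_G = 6e27
KDA_IN_GRAMS = 1e-21

def total_mass_g(n):
    return 4 ** n * (n / 3) * KDA_IN_GRAMS

n = 1
while total_mass_g(n) < EARTH_MASS_G:
    n += 1
print(n)  # 79: a 90-mer is already well past the crossover
```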

The ribosome is thought to be a molecular fossil of the RNA world. Although there are some 50 proteins to be found on its surface, its catalytic center is pure RNA. How large are the RNAs of the ribosome? Here’s what Molecular Biology of the Cell 4th edition says (p. 343). The eukaryotic ribosome has a molecular mass of 4.2 megaDaltons and is an 80S particle (S stands for Svedberg unit). It is composed of a 60S subunit of mass 2.4 megaDaltons and a 40S subunit of mass 1.4 megaDaltons. The 60S subunit has 3 ribosomal RNAs of 5S (120 nucleotides), 28S (4700 nucleotides) and 5.8S (160 nucleotides). The 40S subunit has a single 18S rRNA of 1900 nucleotides.
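Those rRNA lengths can be cross-checked against the quoted particle mass, using the post's average of 339 Daltons per nucleotide (a rough sketch; polymerization losses and base modifications are ignored):

```python
# Total rRNA nucleotides in the eukaryotic ribosome, per the figures above.
RRNA_LENGTHS = {"5S": 120, "5.8S": 160, "28S": 4700, "18S": 1900}
AVG_NT_DA = 339  # the post's average nucleotide mass, in Daltons

total_nt = sum(RRNA_LENGTHS.values())       # 6,880 nucleotides
rrna_mass_mda = total_nt * AVG_NT_DA / 1e6  # ~2.3 MDa of RNA
print(total_nt, round(rrna_mass_mda, 2))
# Roughly 2.3 of the 4.2 MDa is RNA; the balance is ribosomal protein.
```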

I leave it to the readers to propose a mechanism to achieve this combinatorial feat. I’m satisfied that the above argument shows that randomly trying out all possibilities and coming up with the RNAs of the ribosome is physically impossible. In some way the nucleotides of the ribosomal RNAs must be linked together consistently. RNA-dependent polymerases are known which can do it (but they are proteins). Assume that there exists an RNA which can act as the enzyme linking RNA nucleotides together (the way the ribosome links amino acids together) — a big assumption, but one which current speculation seems to require. Such an enzyme made of RNA (a ribozyme) must still have a pre-existing template of ribosomal RNA to work from. Where did the template come from? How did it arise?

And so grubby old chemistry, the province of nerds and other lower forms of animal life, puts us in direct contact with profound questions of existence. Perhaps it will supply an answer as well.



  • Yggdrasil  On December 30, 2009 at 11:17 pm

    Perhaps here is a thought to consider. Here we’re considering the probability of life arising on Earth. But, perhaps a more realistic calculation is the probability of life arising on any earth-like planet. If life is (for example) a one in one million event then it does seem unlikely for it to arise on Earth. However, if there are one million Earth-like planets, then the occurrence of life somewhere in the universe seems much more likely if not inevitable.

    This doesn’t even consider the somewhat crazy theories of multiple parallel universes that give even more chances for life to arise somewhere in a collection of universes, although this seems like somewhat of a cop-out to describe why an unlikely event could occur.

  • luysii  On January 7, 2010 at 10:27 pm

    Yggdrasil: thanks: keep on commentin’

    “Here we’re considering the probability of life arising on Earth.” Agreed. Interesting how a rather hardheaded, back-of-the-envelope sort of chemical calculation gets us right there. You seem to be advocating a sort of Anthropic Cosmological Principle (see the book by Barrow and Tipler — well written and fascinating, and not out of date even though decades old). The idea isn’t any more refutable than solipsism, but it seems (to me) like an easy way out.

    Hugh Everett (parallel universes) was John Wheeler’s grad student when I was an undergraduate there. None of my physicist/philosopher friends (Heinz Pagels, Ray Chao, John Graves, philosophy majors I roomed with, etc. etc.) ever mentioned him or his work.

    Compared to this stuff, Genesis seems downright prosaic.

  • Sili  On January 16, 2010 at 1:21 pm

    I cannot take credit, that is due to Darwin – the answer is: Selection.

  • luysii  On January 16, 2010 at 3:01 pm

    Sili — no question that selection is operative. The point of this post and the last one is just how tiny a fragment of the space of all potential amino acid and nucleic acid sequences selection has to operate on. We’ll never know if a completely different set of amino acid and nucleic acid sequences would produce something like life. Possibly the quantum computer could help us out, but at this point the largest number one has successfully factored is 15 (I think).

  • Galaxy Rising  On September 28, 2010 at 9:04 pm

    Wow, I suppose my exceptionally long comment on the OTHER page was completely ignored then?

    I’ll simplify it for you then: probabilistic processes can find good results in a search space without brute-forcing the entire search space. This is a basic tenet of pretty much every heuristic, a good deal of data structures, and quantum computation as a whole, something you seem to imply you know something about (and yet consistently ignore the actual basis for).

    Shall I simplify it more? By allowing some kind of error rate, an algorithm becomes a heuristic. Heuristics tend to work well on average, but have error or failure conditions. Quantum computations, by the way, are faster for two reasons: parallelization, and an arbitrarily small error rate.

    Ok, so I think I’ve pushed it home that the entire search space need not be explored, because we can use heuristics instead of algorithms, if we are ok with not finding optimal outcomes, only good outcomes. That’s all well and good, but how does this work?

    Well, obviously, some RNA combinations aren’t going to be very active and some are. A heuristic can disregard paths which have a high probability of not working (which may or may not mean ignoring some valuable finds), but will definitely speed up the process and lower the number of tests necessary.

    An interesting example would be that very few things incorporate elements above lead, noble gases, or what we regard as rare metals (gold, silver, etc). Why not incorporate these things? After all, zinc, iron, copper, and sulfur are as bad as lead and gold. And it isn’t even that they are rarer; some things much rarer than lead are used in proteins and RNA complexes. We can think of a chunk of elements and complexes that COULD exist as being eliminated at the root, because they simply weren’t very good at their jobs to begin with.

    Does this mean they can’t be good? No. But they were rejected off hand, their tree of discovery nipped in the bud, and that resulted in a huge amount of possible combinations and complexes being not searched.

    Evolution is iterative; each success is repeated, refined, and constantly altered. However, each failure is discarded, many times never to be heard of again. This pruning allows for a much smaller subset of the search space to be searched, but that smaller subset is much more likely to have useful and active compounds.

    Your question is simply flawed by that single fact; we need not have the BEST complex, merely one that works well enough. Finding one that works AT ALL is trivial, as the barest and smallest forms of the working complexes are usually self-forming. Refining that trivial and abundant strand of RNA into one which is very, very good at its job is also trivial; all it takes is time and alterations.

    As a final note, it would probably do you a world of good to actually look into computational mechanics deeper than a few inches; decision trees that do exactly what you are looking for have been well researched for over 50 years, along with a myriad of pruning mechanisms which exactly account for the combinational complexities you wish to solve using brute force. We’re not on Babbage machines anymore; man up.
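A toy sketch of the search idea in the comment above: selection with random mutation reaches an arbitrary 60-mer "target" in thousands of trials rather than the 4^60 of brute force. (Purely illustrative; the target and the fitness function are arbitrary stand-ins, not chemistry.)

```python
import random

random.seed(1)  # deterministic, for illustration only
ALPHABET = "ACGU"
N = 60
TARGET = "".join(random.choice(ALPHABET) for _ in range(N))  # arbitrary goal

def fitness(seq):
    """Positions matching the target: a stand-in for 'activity'."""
    return sum(a == b for a, b in zip(seq, TARGET))

current = "".join(random.choice(ALPHABET) for _ in range(N))
steps = 0
while fitness(current) < N:
    i = random.randrange(N)  # mutate one random position
    mutant = current[:i] + random.choice(ALPHABET) + current[i + 1:]
    if fitness(mutant) >= fitness(current):  # keep neutral or better variants
        current = mutant
    steps += 1

print(steps)  # a few thousand steps, nowhere near 4**60
```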

    • luysii  On October 19, 2010 at 11:06 pm

      Galaxy Rising: Sorry for the delay in responding. The current post is a roadmap to all the posts bearing on this topic, and I’ll begin responding to your (and other) criticisms in subsequent posts.

  • luysii  On September 28, 2010 at 10:10 pm

    Apologies ! ! Your comments got lost in the shuffle. Not sure what happened. Thanks for taking the trouble to write both of them. You’ll have a response soon.
