Category Archives: The cell nucleus on a human scale

If I were a billionaire

If I were a billionaire I’d fund the following research study immediately.  Where?  Not Research Triangle Park, the Acela corridor or the Bay Area, but Sturgis, South Dakota.  Why?

Spend 11 minutes of your time looking at the following video —

The 80th Annual Sturgis Motorcycle Rally began there 9 August. It is expected to attract between 250,000 and 500,000 people and last 10 days.  Masks are not required and the video shows that very few are wearing them.  Note the rather close seating for eating and drinking, and the stores and restaurants with low ceilings and long horizontal extent (and rather poor ventilation).  I’m sure the actual event will have far closer human contact than shown, as the video was shot when the festival was about to begin.

You’d never get an experiment like this to pass an institutional review board, but there it is for the taking, Mr. Billionaire.

Spend some of that cash getting vans to Sturgis and offer free COVID-19 testing (both antibody testing and genome testing) to anyone wanting it.  This is an independent bunch, so all you ask is that they stay in touch and let you know how they’re doing in the weeks and months ahead.   Tell them they’ll hear the results after the 19th when the festival ends if they want, so they’ll need a way for you to contact them.

Probably most will not divulge information about themselves, but you will surely find some cooperative people.  So ask them to tell you about age, sex, medical conditions.  Offer to do a BMI for them.  Have your staff eyeball their ethnicity, rather than ask.  Since you are funding the study, you’ll be able to keep the information completely private.

The study will tell us a huge amount about transmission, susceptibility, clinical course etc. etc.  You don’t need another house or mistress.  Take that cash and do something for humanity.

Of course, the population isn’t representative.  Almost entirely white, very few people over 70 or under 15 (please spend 11 minutes of your time looking at the video.  The level of obesity and smoking  is impressive).

Of course there will be ethical concerns.  Suppose you find someone shedding the virus — do you contact them?  Probably best to wait a few weeks before testing.   This is a naturalistic study after all and you’re a billionaire not a doc.

Hurry, there are just 7 more days to go.



The butterfly effect in embryology

How the snake lost its legs. No, this isn’t a Just So story a la Rudyard Kipling, but a fascinating paper in Cell (vol. 167 pp. 598 – 600, 633 – 642 ’16 ). All it takes is a 17 nucleotide deletion in ZRS (Zone of polarizing activity Regulatory Sequence), an enhancer of gene expression involved in limb development. The enhancer is at least 1,300 nucleotides long (but I can’t find out just how long ZRS is). The deletion removes a binding site for a transcription factor (ETS) which turns on some limb development genes.

ZRS has long been known to be involved in limb development, and mutations distributed over 700 nucleotides are associated with a variety of human limb malformations. So the authors sequenced the enhancer in a variety of species (including many snakes) and found that only snakes had the deletion.

Then they put the snake ZRS into genetically engineered transgenic mice and found markedly shortened limbs. That was all it took. Reintroducing the missing 17 nucleotides into the transgenics restores normal limb development. Staggering what genetic technology is capable of.

Where does the butterfly effect come in? Because the enhancer is 1,000,000 nucleotides away from some of the genes it controls. If you were studying sequences around the genes it controls, you’d never find the deletion (until you’d run through a large number of grad students). Human biology (with limb malformations) told the authors where to look.

Straightened out, 1,000,000 nucleotides is 3,200,000 Angstroms, or 320 microns (32 times the size of the average 10 micron nucleus). Remarkable how it finds its target. You might be interested in a series of posts which try to imagine these goings on at human scale — blowing up the nucleus so it fits in a football stadium with our double stranded DNA blown up to the size of linguini with a total length of 2840 miles. Start here –
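As a check on the stretched-out length above, here is a minimal sketch of the arithmetic. The rise per nucleotide is the assumption that matters: the 320 micron figure corresponds to about 3.2 Angstroms per nucleotide, while the 3.4 Angstrom thickness used elsewhere in these posts would give about 340 microns. The function name is made up for illustration.

```python
# Contour length of straightened double stranded DNA.
ANGSTROMS_PER_MICRON = 10_000  # 1 micron = 10^4 Angstroms
NUCLEUS_DIAMETER_MICRONS = 10  # the "average 10 micron nucleus" from the text


def contour_length_microns(nucleotides, rise_angstroms=3.4):
    """Length in microns, given an assumed rise per nucleotide in Angstroms."""
    return nucleotides * rise_angstroms / ANGSTROMS_PER_MICRON


for rise in (3.2, 3.4):
    length = contour_length_microns(1_000_000, rise)
    diameters = length / NUCLEUS_DIAMETER_MICRONS
    print(f"rise {rise} A: {length:.0f} microns, about {diameters:.0f} nucleus diameters")
```

Either way the enhancer sits some 30 nuclear diameters away from its target when the DNA is laid out straight.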

The tail of RNA polymerase II and the limits of chemical explanation

When I study math books, I’m always amazed at how much the reader is expected to internalize and retain.  A theorem proved 100 pages or so ago is referred to in the course of a proof without further ado.  The pure chemist reading this longest of posts, with minimal exposure to modern molecular biology, may feel the same way.  You’ll need all 4 articles of, and all 6 articles of at your fingertips to get through this one.  The stuff is at my mental fingertips because I’ve been learning and thinking about it for decades.  Perhaps mathematicians are the same way, or perhaps they really are smarter than everyone else.

The article assumes you have a solid chemical background.  I find it somewhat sad that only a chemist with a decent molecular biological background can fully understand the elegance and beauty of what is to follow. I hope this post and the 10 above provide enough background for what is to follow.

Recall that eukaryotic RNA polymerase II (pol II) is really a complex of 12 distinct proteins in man with a total mass of 550 kiloDaltons.  The RPB1 subunit is the largest of the 12 and contains a truly fascinating carboxy terminal domain (CTD) — to be discussed in some detail later in this post.  The function of pol II is transcription of a protein coding gene into messenger RNA (mRNA). Pol II binds to DNA upstream (5′ to) the DNA which actually codes for the amino acids making up the protein. Just binding there (this site is called the promoter) is far from enough for gene transcription to actually begin.  5 general transcription factors (pol II transcription factors B, D, E, F, H — aka TFIIB, etc.) are required.  All 5 general transcription factors are actually multiprotein complexes.  Then there is the mediator complex, a complex of more than 20 proteins which allows communication between transcriptional activators (enhancers) and repressors found elsewhere in the DNA.  So the whole gemish contains 60 proteins with a mass of 3,500,000 Daltons.  The heaviest atom in all this is phosphorus, so this means at least 100,000 atoms are involved.  Have a look at Science vol. 288 pp. 632 – 633, 640 – 649 ’00 — it’s old but good and written by Kornberg fils who won his Nobel for this work.
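The "at least 100,000 atoms" bound comes from dividing the total mass by the mass of the heaviest atom. A minimal sketch of that division follows, taking the post's premise of phosphorus (about 31 Daltons) as the ceiling. Using sulfur's 32 Daltons instead barely moves the bound. The function name is hypothetical.

```python
COMPLEX_MASS_DALTONS = 3_500_000  # mass of the whole initiation complex, from the text


def minimum_atom_count(total_mass_da, heaviest_atom_da):
    """Lower bound on atom count: pretend every atom is as heavy as the heaviest one."""
    return int(total_mass_da // heaviest_atom_da)


print(minimum_atom_count(COMPLEX_MASS_DALTONS, 31))  # phosphorus as the ceiling
print(minimum_atom_count(COMPLEX_MASS_DALTONS, 32))  # sulfur as the ceiling
```

Since most of the atoms are hydrogen, carbon, nitrogen and oxygen, the real count is several times larger than this floor.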

I’ve mentioned some of the processing that goes on after the section of the DNA actually coding for amino acids is transcribed into RNA (splicing, the polyA tail, etc. etc.).  There is also some modification of the 5′ end of the RNA (called the cap), requiring a variety of binding proteins and enzymes to occur.

Just binding to the promoter, separating the two strands of DNA and starting to copy (transcribe) one of them into RNA is not enough.  This happens all the time, but after making  RNAs 5 – 10 nucleotides long, pol II pauses, releases the RNA just made and pops back to the promoter (which it really never left).  The other proteins of the 3.5 megaDalton initiation complex hold onto pol II keeping it there.

Here is where the carboxy terminal domain of the largest subunit of pol II comes in.  It is a fascinating structure, which can only be completely understood by the chemist.  It is made of 52 imperfect repeats of 7 amino acids.  Here is the consensus repeat (listed from the amino terminal end to the carboxy terminal end — as protein sequences are always presented).

Tyrosine Serine Proline Threonine Serine Proline Serine

What should strike the biochemically oriented chemist is that the 3 (out of 20) amino acids with hydroxyl groups account for 5/7ths of the structure.  This means that all of them can be phosphorylated.  The two prolines are hardly dull, because they make it impossible for classic alpha helices to form — sometimes they are called helix breakers.  The OH groups mean that the heptad is quite hydrophilic.  Phosphorylation of any two OHs of the heptad means that the chain will be pretty much straight out due to charge-charge repulsion.  The number of distinct phosphorylated states of even one heptad is 2^5 = 32, that for the whole CTD is 32^52.
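To get a feel for the size of 32^52, here is a small sketch that counts the hydroxyl-bearing residues in the consensus heptad and multiplies out the states. It assumes, as the count above does, that all 52 repeats match the consensus and that each hydroxyl is simply phosphorylated or not. The variable names are made up for illustration.

```python
# Consensus heptad of the pol II carboxy terminal domain, amino to carboxy end.
HEPTAD = ["Tyr", "Ser", "Pro", "Thr", "Ser", "Pro", "Ser"]
HYDROXYL_RESIDUES = {"Ser", "Thr", "Tyr"}  # the 3 amino acids carrying an OH
REPEATS = 52

# Each hydroxyl is either phosphorylated or not, so every site doubles the count.
sites_per_heptad = sum(residue in HYDROXYL_RESIDUES for residue in HEPTAD)
states_per_heptad = 2 ** sites_per_heptad
states_whole_ctd = states_per_heptad ** REPEATS

print(sites_per_heptad, "phosphorylatable sites per heptad")
print(states_per_heptad, "states per heptad")
print(len(str(states_whole_ctd)), "decimal digits in the count for the whole CTD")
```

The count for the whole tail is 2^260, a number with 79 decimal digits.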

Chemists more familiar with biochemistry know that phosphorylation and dephosphorylation of serine, threonine and tyrosine is extensively used by the cell to control protein/protein interactions.  That’s why our genome codes for 518 different protein kinases (which esterify hydroxyls with phosphate, despite the rather weird name) and 137 phosphatases.

So the phosphorylation state (how much, which ones) of the carboxy terminal domain determines which proteins bind to it.  Here is where the fun begins.

Just to give a glimpse of what is going on in our cells all the time, here are the gory details of formation of the cap at the 5′ end of mRNA.  You don’t have to read the details between the asterisks to follow the rest of the post.


   [ Proc. Natl. Acad. Sci. vol. 86 pp. 5795 – 5799 ’89 ] All cellular cytoplasmic mRNAs have a 7 methyl guanylate cap attached to their 5′ ends.  The cap structure is added early during the transcription of mRNA by RNA polymerase II in the nucleus (after the first 25 nucleotides of a given mRNA are formed).
   Three enzymes are involved in mRNA cap formation:
   (1) an RNA triphosphatase which cleaves the 5′ triphosphate terminus of the primary transcript to a 5′ diphosphate terminated RNA
   (2) a guanylyltransferase, which caps the structure with GMP — forming a 5′ – 5′ linkage
   (3) a methyl transferase which adds a methyl group to the nitrogen at position #7 of guanine (see the structure of 7 methyl guanosine).
   (4) The cap structure can then be further methylated by a ribose 2′-O-methyltransferase.

The 3 capping enzymes bind to the phosphorylated carboxy terminal domain of pol II, so they can grab the newly formed 5′ end of the mRNA as it emerges from a tunnel in pol II.  Not only that, but the enzymes bind to a specific pattern of phosphorylation of the tail (namely serine #5 by a kinase called Cdk7).

         An intricate mechanism exists to stop transcription from proceeding too far, so the 5′ end of the emerging RNA is properly processed.  During the formation of the transcription initiation complex (or soon after initiation) DRB sensitivity inducing factor (DSIF) is recruited to the transcription complex (by binding to the CTD).  Additionally, after initiation of transcription, the negative elongation factor (NELF) is recruited through interaction with DSIF.  This results in the arrest of the transcription complex before it enters into productive elongation. DSIF/NELF mediated arrest is then relieved by means of phosphorylation of the carboxy terminal domain on serine #2 by positive transcription elongation factor b (P-TEFb) and the transcription complex resumes elongation.  This causes DSIF and NELF (both are proteins) to drop away from the CTD.

       Even so, pol II is still linked to the initiation complex at the promoter.  How does it get started again and move away from the promoter? The process is called promoter clearance or promoter escape.  Another phosphorylation of the CTD is involved — this time on serine #5 by a kinase called Cdk7, which is found in one of the general transcription factor complexes (TFIIH).     

       Eventually a whole bunch of proteins (called the super elongation complex) binds to the CTD allowing not just escape, but movement down DNA.  The complex includes the P-TEFb, ELL2, AFF4, AFF1, ENL and AF9 proteins.  So now pol II is chugging down DNA adding a new base every 50 milliSeconds or so.  A whole other group of kinases modifies the CTD so different proteins can bind to it after the terminal codon is reached and finish processing the mRNA.  I’m going to skip this as you have the general idea, but rest assured it is just as complicated as putting on the 5′ cap described above.

Now for the exquisite mechanisms described in Proc. Natl. Acad. Sci. vol. 108 pp. 14717 – 14718 ’11.  In the previous post – — I wondered how the large pol II enzyme transcribes DNA wound twice around the nucleosome (I really haven’t found an answer that satisfies me).  Work has shown that pol II slows down when it reaches a nucleosome (it incorporates fewer nucleotides into the growing mRNA per second).

“95% of human multiexon protein coding genes are alternatively spliced” [ Nature vol. 465 pp. 16 – 17 ’10 ]  So how is the decision made between two alternative exons by the splicing machinery?  It turns out that pol II is involved here as well.  There is no logical reason it has to be.  The whole mRNA could be formed by the polymerase and then it could move elsewhere in the nucleus to the splicing machinery.  But in this one well studied case, alternative splicing occurs as pol II is transcribing one particular gene (which is mutated in type I neurofibromatosis).

Now for a side trip to neurology.  There is an awful disease called paraneoplastic encephalomyelitis.  The brain is subject to an immune attack in some patients with cancer (and in some it can be the first symptom) with resultant dementia, convulsions, incoordination and death.  For years we wondered what the immune system was attacking.  Now we know it is any of three proteins (HuB, HuC, HuD) found only in the brain.  They bind to messenger RNA.  Why the immune system sometimes chooses them for attack and how cancer sometimes triggers this isn’t known for sure.  One of the theories is that the cancer cells produce something that immunologically looks like the Hu proteins, which the immune system regards as foreign.  Fortunately it is fairly rare, but I did see a few cases.

Also recall that the nucleosome is only the first stage of the 100,000 fold compaction of DNA required to fit it into the nucleus.  The higher order arrangement of nucleosomes has been the subject of decades of intense study which unfortunately hasn’t reached a conclusion, but there is no question that nucleosomes are close together in the nucleus, whether or not the 30 nanoMeter fiber (packing 6 or so nucleosomes per level of the fiber) actually exists.

So the 3 Hu’s are yet another set of proteins binding to the carboxy terminal domain (CTD) of the large subunit of pol II.  So what?  They interact with histone deacetylase 2 (HDAC2) which removes the acetyl group from the epsilon amino group of lysine, changing an amide to an amine — increasing the positive charge on the nitrogen.  This has the effect of compacting DNA as the protonated amine can then bind to the zillions of negatively charged phosphates of the DNA backbone.  Here’s another place where you simply must know chemistry to understand what’s going on.

So a protein bound to the CTD of pol II recruits another protein which chemically modifies another protein around which DNA is wrapped.  This has the remarkable effect of directly linking the epigenetic machinery to the transcription machinery.  Epigenetics had been thought of as determining which proteins were made in a given cell (e.g. an on/off effect) rather than how they were spliced.

How does this work? The theory is advanced that certain splicing signals are stronger than others. This means that if the transcription machinery is slowed down (say by more chromosome compaction), it will have a chance to splice at the weaker splicing signal.

Things are even more complicated.  Back in the day, newsreels were shown before movies (rather than the hideous trailers of today). They sometimes amused American audiences by showing sped up films of crazed foreigners playing the sport of curling — see  A (very heavy) stone is essentially slid on ice toward a target.  In front of the stone are two guys sweeping furiously, to alter the surface of the ice, so the stone lands where they want it to.  With sped up film, they look like idiots.

The PNAS article proposes that something like that happens during transcription — preceding the pol II complex are enzymes called histone acetyl transferases (HATs), the yang to the yin of the HDAC. They acetylate the epsilon amino group of lysines on the histones making up the nucleosomes (making it harder for lysine to bind to the phosphates of DNA).  This presumably opens up compacted DNA, letting pol II (which is pretty large itself at 5 x 5 x 7 nanoMeters) get through the chromatin, easing transcription. These are the sweepers of curling.  Then along comes pol II.  Near the end of its run along the gene, it recruits Hu proteins which recruit HDAC2 which closes up chromatin again.

Elegant yes?  Incredible, no?

Hopefully, a few readers have actually made it this far.  For questions, critiques, ambiguities, errors of fact, etc. etc., just post a comment.

Now for some philosophy. You can’t really understand any of this without knowing a fair amount of organic chemistry and some protein chemistry as well.   Chemistry explains how all this happens.  It is totally useless in explaining why.  As soon as you ask just what the CTD, the Hu proteins, HDACs, HATs, pol II or anything else in the cell are for, you are in the land of Aristotle, where everything had an innate purpose and function.  You have crossed the Cartesian divide between the physical and the world of ideas, a place where chemistry can no longer help you.

        Still, it is a magnificent thing to have the background to contemplate all this.  Even so,  I’m sure our knowledge is far from complete.  No one said it better than Pascal — “Man is but a reed, the most feeble thing in nature, but he is a thinking reed.”

The Cell and its Nucleus on a human scale – VI — untwisting the linguini

Recall that the 150 foot sphere sitting on the 50 yard line contains some 15,000,000 feet of twisted linguini (DNA).  The two strands are 3/8th of an inch thick.  They twist around each other every 9/16ths of an inch.  We now have the problem of separating the two strands to transcribe one of them into messenger RNA (mRNA) so the ribosome can make protein from the mRNA.  The RNA is single stranded.  The machine which accomplishes transcription is RNA polymerase II (see for more detail).  We’re going to essentially forget chemistry at this point and just consider what must occur physically when this happens.

At this point, since word pictures can only go so far,  get two pieces of string each about 1/8th of an inch thick — say a bass guitar string, and wrap them around each other 20 – 50 times, approximating our linguini.  Staple both ends down to some wood. Now start in the middle of the strings, and separate the two strands by a few turns using a pencil.  RNA polymerase must actually do this to copy one strand of DNA into RNA.   You will quickly see that the string knots up in front of the separated parts.  Now imagine moving the separated part forward — the knots get worse.  The DNA (and the strings) respond by forming supercoils — hard to draw but easy to see if you have string in front of you.  The supercoils are called positive (overwound) in front of the separated part and negative (underwound) behind it.

Now of course the ends of DNA aren’t fixed in the cell, but they might as well be, because even the shortest chromosome (#21) is 46 miles long, with a twist every 9/16ths of an inch (in the linguini model at least).  The solution is an ingenious family of enzymes called topoisomerases.   They cut either one or both strands of DNA, allowing the overwound sections to unravel, and the underwound sections to tighten up.  After this happens topoisomerases hook the broken strands back together.  Such a type of enzyme must have been present at very early times, when life began to use double stranded DNA (or double stranded RNA for that matter).  How double stranded DNA could have coded for something absolutely required to allow double stranded DNA to be transcribed into anything, I’ll leave to the ‘it all happened by chance, and with natural selection producing incremental improvement’ boys.  I don’t have a clue, and regard the existence of topoisomerase as rather miraculous.

Now it’s time to consider the size of RNA polymerase II.  It’s much larger in man than bacteria.  Even in bacteria it’s much larger than the width of the double helix.  The longest part is 7 times the diameter of the helix and the other axes are 5 – 6 times larger.  The transcription rate is around 3 kiloBases/minute, or an additional nucleotide added to the growing chain every 20 milliSeconds.  So every 5th of a second it has transcribed a complete turn of the helix, merrily inducing coiling upstream and downstream.  It’s pretty clear that it’s the DNA which must move rather than the polymerase, as the polymerase is so much larger than the DNA it’s working on.

In our model the polymerase is only moving 50 * 1/16 of an inch per second or about 3 inches a second — we should be able to see that.  We should be able to see the polymerase as well, as it’s 4 inches long in its longest part.
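The rate arithmetic in the last two paragraphs can be laid out as a short sketch. The inputs are the figures already quoted (3 kiloBases/minute, 1/16 of an inch per base in the model) plus the assumption of roughly 10 bases per helical turn. The variable names are made up for illustration.

```python
KILOBASES_PER_MINUTE = 3     # transcription rate quoted in the text
BASES_PER_TURN = 10          # assumed: about 10 bases per turn of the double helix
INCHES_PER_BASE = 1 / 16     # thickness of one base in the linguini model

bases_per_second = KILOBASES_PER_MINUTE * 1000 / 60
milliseconds_per_base = 1000 / bases_per_second
seconds_per_turn = BASES_PER_TURN / bases_per_second
model_inches_per_second = bases_per_second * INCHES_PER_BASE

print(f"{bases_per_second:.0f} bases per second")
print(f"{milliseconds_per_base:.0f} milliseconds per base")
print(f"{seconds_per_turn:.1f} seconds per helical turn")
print(f"{model_inches_per_second:.3f} inches per second in the model")
```

So 3 kiloBases/minute is 50 bases a second, one base every 20 milliseconds, one turn every fifth of a second, and just over 3 inches a second at stadium scale.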

Lastly, consider the nucleosome which most of the DNA is wrapped around.  It’s still far from clear just how the 4 inch polymerase can move along the two turns of DNA wrapped around a nucleosome 2 inches in diameter and an inch high, separating the strands and copying one of them as it goes.  Perhaps the nucleosome is displaced and then quickly reforms after the polymerase passes.

When you consider what’s really going on inside us all the time, it’s hard to imagine that this works flawlessly enough to allow us to live.  The more you know about molecular biology, the more miraculous life becomes, not less.

Finally, a rest in peace to my late friend Nick Cozzarelli, gone far too soon, who did seminal work on the topoisomerases.

The Cell and its Nucleus on a human scale – V

Now that we have the nucleosome in hand, we’ve compacted cellular DNA down by about 1/9  (call it 10 fold).  But we’ve still got another 10,000 fold of compaction to get it all into the nucleus. What’s next?  It’s time to pause and consider that as structures in the cell get larger and larger, they get less and less regular. You can do Xray crystallography and solution NMR of nucleosomes and get something believable.  Obviously the cell and its contents don’t form a crystal, so at some point we’ll have to stop assembling nice structures out of regular building blocks, as DNA was assembled from nucleotides, the double helix from two strands of DNA, the nucleosome from 146 elements of the double helix and 8 proteins. Sadly, that point has probably been reached at the next level of DNA compaction.

This is the infamous and much studied 30 nanoMeter fiber.  Recall that DNA is 2 nanoMeters in width, the nucleosome is 11 nanoMeters wide and 6 nanoMeters high and shaped like a cylinder.  So you can probably only pack 6 nucleosomes in a fiber 30 nanoMeters in diameter.  Call this another factor of 10 of compaction.  This still leaves over a thousandfold more DNA compaction to discover.  Going back to the linguini metaphor, the 30 nanoMeter fiber (at 3/8 of an inch/2 nanoMeters) would be between 5 and 6 inches thick.  If all 15,000,000 feet of DNA were put into this structure, the two tenfold compactions would leave us with 150,000 feet of 6 inch rope in our 150 foot sphere. (To see where the pasta and the numbers come from, go back to the first post in the series).
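The scaling in this paragraph can be sketched in a few lines. It takes the two roughly tenfold compaction steps (nucleosome, then fiber) at face value and uses the 100,000 fold total from the earlier post in the series. The variable names are made up for illustration.

```python
INCHES_PER_NANOMETER = (3 / 8) / 2   # model scale: 2 nanoMeters of DNA = 3/8 inch
FIBER_DIAMETER_NM = 30
TOTAL_DNA_FEET = 15_000_000          # linguini in the model nucleus
NUCLEOSOME_FOLD = 10                 # first compaction step, rounded as in the text
FIBER_FOLD = 10                      # second compaction step, rounded as in the text
TOTAL_FOLD_NEEDED = 100_000          # overall compaction needed to fit the nucleus

fiber_thickness_inches = FIBER_DIAMETER_NM * INCHES_PER_NANOMETER
rope_feet = TOTAL_DNA_FEET / (NUCLEOSOME_FOLD * FIBER_FOLD)
remaining_fold = TOTAL_FOLD_NEEDED / (NUCLEOSOME_FOLD * FIBER_FOLD)

print(f"fiber thickness: {fiber_thickness_inches} inches")
print(f"rope length after two tenfold steps: {rope_feet:,.0f} feet")
print(f"compaction still unaccounted for: {remaining_fold:,.0f} fold")
```

Under these assumptions two tenfold steps shorten the DNA a hundredfold, which is why a further thousandfold is still unexplained.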

Unfortunately “The structure of the compacted 30 nanoMeter fiber remains unresolved despite intensive effort” [ Cell vol. 128 pp. 651 – 654 ’07 ]. Molecular Biology of the Cell 5th edition p. 217 has some nice pictures of two different models of the fiber.  But models they are.  Consider how you get to see the 30 nanoMeter fiber.  First, kill the cell to get the DNA.  Then swell the chromosomes in hypotonic buffer, then fix them, then dehydrate them with alcohol and then embed them into plastic, and then take an electron micrograph.  30 nanoMeters is 300 Angstroms, less than 1/10 of the smallest wavelength of visible light (4000 Angstroms) so you have to do all this to get a glimpse of them.

After years of effort by many workers, the following paper throws in the towel [ Proc. Natl. Acad. Sci. vol. 105 pp. 19732 – 19737 ’08 ].  They think the compacted chromosome in metaphase (which is thick enough to be seen by visible light and presumably made by mushing the 30 nanoMeter fibers together) is highly disordered, with NO regular arrangement of the nucleosomes.  The whole thing is said to resemble a polymer melt, in which the nucleosomes no longer interact with adjacent nucleosomes on the DNA helix (as they do in the 30 nanoMeter fiber), but with any nucleosome which happens to be nearby.   This fits with the ability of large proteins (topoisomerase IIalpha, condensins — don’t worry if you don’t know what they are, just accept that they are as large or larger than the nucleosome) to diffuse into the chromosome.  A truly tight regular structure would not permit this.

Has this sort of thing happened before?  You bet, and I got into it on day #1 of medical school back in 1962.  Those of you not interested in ancient history or philosophical musings on scientific error can stop reading now and wait for the next post in the series, which will concern what must happen when DNA is read by chemical complexes far larger than the nucleosome.

I’d just finished two years of graduate work in organic chemistry in a department which eventually produced 6 Nobelists. When I was there they were doing the work that netted them the prize and definitely not resting on their laurels.  Back then electron microscopy of the cell was pretty new, and something called the unit membrane which bounds all eukaryotic cells had been described.  It consisted of two dark lines with a light line sandwiched between,  the whole business being about 70 Angstroms thick. The question at the time was did it represent lipid sandwiched by protein or protein sandwiched by lipid.  Well, I knew as much chemistry as anyone and I tried to figure it out.  One of the chemicals used to get the pictures was good old osmium tetroxide — which said vic-diol to me.  The more I read about fixing, dehydrating and embedding, the more I realized that no chemist could possibly figure out what was going on.

Those of you knowing some cellular biology will realize that the whole thing was an artifact.  Yes, cells are bounded by the plasma membrane, but it is a lipid bilayer in which float proteins passing through (transmembrane proteins).  There are a few proteins attached to the extracellular surface only (the glypiated proteins), and a few bound to the intracellular surface, but not in the regular fashion seen on the electron micrographs back in the day.

So did anyone write a paper saying that the ‘unit membrane’ was an artifact?  No. People just began to ignore it, and stopped mentioning it.  Well not everyone.  I laid out $100 of the long green 4 years ago for “Basic Neurochemistry” based on fabulous reviews and there on p. 7 is the unit membrane.  I stopped reading at this point.

Pretty harmless.  Not in medicine though.  Here’s one example, but there are many, many more.  The late Michael DeBakey was a great man, a pioneer cardiac surgeon, teacher, medical statesman etc. etc.   His word was law in the profession.  He spent his life opening up narrowed or even occluded arteries (in the heart, or leading to the head).  There are four main vessels leading to the brain, two carotid arteries which carry most of the blood and two vertebral arteries. Narrowing of the carotid artery in the neck is fairly common and can lead to stroke. The carotid is quite accessible and is the pulsing artery you can feel just interior to the angle of the jaw (not too hard now! ).  DeBakey was the first to open a narrow carotid artery up in 1953.

He also said you could open up a completely occluded carotid artery to treat stroke. Surgeons all over the country tried it, but almost everyone who had the procedure died.  So they stopped doing it.  As far as I know, no paper ever appeared contradicting DeBakey.

Why this happened takes us pretty far afield, into brain metabolism etc. etc., and if anyone wants to know, I’ll put it in as a comment. 

It’s been fun socializing with old friends in the past month or so.  Look for the frequency of posts to increase.

Here’s a link to the next paper in the series

Unfortunately as of 3/14 it’s the last paper in the series — there will be more if I ever get to it.

The cell nucleus and its DNA on a human scale – IV

At this point our parts catalogue of the 150 foot nucleus contains just 15,000,000 feet of double stranded DNA and water.  No proteins to copy DNA into RNA or DNA, or to repair it or to mush it down (the subject of this post).  In our blown up model nucleus,  DNA is a cylinder 20 Angstroms (3/8 of an inch) thick.  The thickness of each base of DNA is 1/16th of an inch.  For how I arrived at these numbers see the first post in the series.  You can visualize DNA at this level of magnification as two strands of linguini wrapped around each other every 10/16ths of an inch forming a right handed helix (which I’m never sure how to draw).

What do you do with this much linguini?  Well, an Italian friend (uncle Tom) showed me how to eat it properly by using a spoon to curl it around a fork.

So does the cell.  Except that fork is a set of 8 proteins (called histones) packed together.  When the DNA is wrapped around it, the particle is called a nucleosome.  How big is it?  How much DNA is wrapped around the 8 histones of each nucleosome?  Not very much, just 147 nucleotides in about 1.7 complete turns (around the nucleosome — recall that the two strands of DNA wrap around each other every 10 nucleotides or so).  The turns around the proteins of the nucleosome form a left handed helix (as opposed to the right handed turns of the double helix).

How big is the nucleosome?  Voet & Voet (Biochemistry 3rd Ed. p. 1424) gives the diameter at 110 Angstroms and the thickness at 60 Angstroms.  Given that 20 Angstroms = 3/8 of an inch in our model, this means that the nucleosome is just over 1 inch high and just over 2 inches wide.
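For anyone who wants to convert other structures to stadium scale, here is a minimal sketch of the conversion used above. The only input is the model's scale of 20 Angstroms to 3/8 of an inch. The function name is made up for illustration.

```python
INCHES_PER_ANGSTROM = (3 / 8) / 20   # model scale: 20 Angstroms = 3/8 inch


def to_model_inches(angstroms):
    """Convert a real length in Angstroms to inches in the blown up model."""
    return angstroms * INCHES_PER_ANGSTROM


print(f"nucleosome diameter: {to_model_inches(110):.4f} inches")
print(f"nucleosome height:   {to_model_inches(60):.4f} inches")
print(f"DNA width:           {to_model_inches(20):.4f} inches")
```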

The net effect is to shorten the overall length of DNA.  Well, by just how much? 147 nucleotides is about 147 1/16ths of an inch or about 9 inches long.  Molecular Biology of the Cell (5th Ed) says that we have 30,000,000 nucleosomes per nucleus, which if you multiply it out is more than the 3.2 billion base pairs we have. But the 3.2 is the size of half the genome as we have two copies of each chromosome.  Recall that we decided not to wait while the grounds crew pumped in the other 15,000,000 feet, (see the second post in the series).

Going from 9 inches to just over an inch is roughly a 10-fold compaction of DNA.  But we’ve got to compact DNA down by a factor of 100,000.  Why? Because the DNA in our cells, if stretched out is 1 meter long, while our nuclei are only 10 millionths of a meter (10 microns) in diameter.  We still have a factor of 10,000 to account for.
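The bookkeeping in the last two paragraphs can be sketched as follows, using only figures quoted in this post. The unrounded fold for the nucleosome step comes out nearer 8 than 10, which is why it is "roughly" tenfold. The variable names are made up for illustration.

```python
NUCLEOTIDES_PER_NUCLEOSOME = 147
INCHES_PER_NUCLEOTIDE = 1 / 16
NUCLEOSOME_HEIGHT_INCHES = 1.125      # 60 Angstroms at 3/8 inch per 20 Angstroms
NUCLEOSOMES_PER_NUCLEUS = 30_000_000
HAPLOID_BASE_PAIRS = 3_200_000_000
DNA_LENGTH_MICRONS = 1_000_000        # 1 meter of DNA, in microns
NUCLEUS_DIAMETER_MICRONS = 10

wrapped_inches = NUCLEOTIDES_PER_NUCLEOSOME * INCHES_PER_NUCLEOTIDE
nucleosome_fold = wrapped_inches / NUCLEOSOME_HEIGHT_INCHES
base_pairs_in_nucleosomes = NUCLEOSOMES_PER_NUCLEUS * NUCLEOTIDES_PER_NUCLEOSOME
total_fold = DNA_LENGTH_MICRONS / NUCLEUS_DIAMETER_MICRONS
remaining_fold = total_fold / 10      # after calling the nucleosome step 10 fold

print(f"DNA per nucleosome: {wrapped_inches:.2f} inches")
print(f"nucleosome compaction: {nucleosome_fold:.1f} fold")
print(f"base pairs in nucleosomes: {base_pairs_in_nucleosomes:,}")
print(f"total compaction needed: {total_fold:,.0f} fold")
print(f"still to account for: {remaining_fold:,.0f} fold")
```

The 4.41 billion base pairs in nucleosomes exceed the 3.2 billion of a single genome copy but fit within the 6.4 billion of two copies.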

Before going on to higher levels of chromosomal organization think a bit about what the nucleosome looks like.  The 1.7 turns of DNA are pretty close to each other.  From top to bottom the nucleosome is just over an inch, but DNA in our model is 3/8 of an inch thick.  Not much room between the turns.   Also, there isn’t any room to speak of between the DNA and the histone core of the nucleosome, as I’ve described it (more on this in future posts).  Wikipedia says that there are over 120 direct protein DNA interactions (probably salt bridges and hydrogen bonds) and ‘several hundred’ water mediated protein DNA interactions.

Voet has the mass of the nucleosome core particle (8 proteins + 150 nucleotides of DNA) at 205 kiloDaltons.  So how fast does it move? Recall from the last post that at 80 Fahrenheit (27 Celsius) something with a molecular mass of 100 kiloDaltons moves at 9 meters/second, while something with a mass 10 times that moves at 2.7 meters/second.  So the nucleosome has plenty of time to travel all over the nucleus (if it were free, but it isn’t).  Even with binoculars sitting in the stands we couldn’t see the nucleosomes zooming about, but we certainly could if we got close to the sphere.
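The speeds quoted above are consistent with the root mean square thermal speed from kinetic theory, sqrt(3RT/M). That formula is an assumption here, since this post only quotes the results, but it reproduces both numbers and puts the free nucleosome in between. The function name is made up for illustration.

```python
import math

GAS_CONSTANT = 8.314  # joules per mole per kelvin


def rms_thermal_speed(mass_kilodaltons, celsius=27.0):
    """Root mean square thermal speed in meters/second.

    1 kiloDalton corresponds to a molar mass of 1 kilogram per mole,
    so the mass in kiloDaltons can be used directly as M in sqrt(3RT/M).
    """
    kelvin = celsius + 273.15
    return math.sqrt(3 * GAS_CONSTANT * kelvin / mass_kilodaltons)


for mass in (100, 205, 1000):
    print(f"{mass} kiloDaltons: {rms_thermal_speed(mass):.1f} meters/second")
```

This is the instantaneous speed between collisions, not the net progress of a particle jostled by water, so it is an upper limit on how fast anything gets anywhere.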

DNA nucleosomal compaction potentially introduces another problem. Although the DNA has been shortened by the nucleosome, it takes up more volume.  What is the volume of a cylinder 110 Angstroms wide and 60 Angstroms tall?  It’s pi * 5.5 * 5.5 * 6.0 cubic nanoMeters or 570 cubic nanoMeters.  We know how much volume 147 nucleotides of double stranded DNA takes up: it’s pi * 1 * 1 * 147 * .34 cubic nanoMeters = 157 (the .34 is the 3.4 Angstrom thickness of a given nucleotide).  In the second post of the series, I calculated that even assuming two copies of each chromosome, the DNA occupied 6.28 cubic microns of a 523 cubic micron nucleus. Assuming most DNA is found in nucleosomes (and given 30,000,000 nucleosomes/nucleus over half of it is) we have to multiply the 6.28 by 570/157, obtaining 23 cubic microns taken up by DNA bundled into nucleosomes in a 523 cubic micron nucleus, which still leaves plenty of room. (But we have not accounted for proteins manipulating the DNA: copying it, repairing it, etc. etc.).
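For anyone who wants to rerun the arithmetic, here is a minimal Python sketch of the volume comparison. It assumes, as the paragraph does, that both the nucleosome and the naked DNA can be treated as simple cylinders; the function and variable names are mine, chosen only for illustration.

```python
import math

# All lengths in nanometers (10 Angstroms = 1 nanometer).
def cylinder_volume(radius_nm, height_nm):
    """Volume of a right circular cylinder in cubic nanometers."""
    return math.pi * radius_nm ** 2 * height_nm

# Nucleosome treated as a disk 110 Angstroms wide and 60 Angstroms tall.
nucleosome_nm3 = cylinder_volume(5.5, 6.0)

# 147 base pairs of naked double helix: radius 1 nm, 0.34 nm per base pair.
naked_dna_nm3 = cylinder_volume(1.0, 147 * 0.34)

# Scale the 6.28 cubic microns of naked DNA by the swelling factor.
swelling = nucleosome_nm3 / naked_dna_nm3
packed_um3 = 6.28 * swelling

print(f"nucleosome cylinder: {nucleosome_nm3:.0f} cubic nm")
print(f"naked 147 bp of DNA: {naked_dna_nm3:.0f} cubic nm")
print(f"DNA packed in nucleosomes: {packed_um3:.1f} of 523 cubic microns")
```

The swelling factor comes out a bit over 3.6, so the packed DNA is still only a few percent of the nucleus.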

While this blog is mostly about chemistry, it’s time to pause and think of what must happen to untangle  the 30,000,000 feet of linguini into each of the 46 chromosomes. Then the cell must pair chromosome #1 with chromosome #1 (not with any other), chromosome #2 with chromosome #2, etc. etc.  Then it must line them up on the meiotic (not mitotic) spindle so each daughter cell gets one member of each pair.  No wonder 30% of conceptions are thought to be spontaneously aborted because of chromosomal abnormalities (not 30% of clinical pregnancies, these abortions happen very early on before a woman realizes that she has conceived).

Even worse, think about duplicating each of the two strands of DNA on each chromosome (bringing the total to 60,000,000 feet of linguini), lining up all 46 chromosomes on the mitotic plate, splitting the chromosomes so that each daughter cell gets the correct 30,000,000 feet in the trillion or so cells that make us up. What’s miraculous is that we’re here at all, not that we get sick.   Such thoughts helped me deal with the Godawful stuff a neurologist sees (and I saw it for 38 years). Molecular biology as psychotherapy (or religion if you look at it that way).

Finally, people have been screwing around for over 30 years trying to figure out the next level of compaction of DNA (which must exist in some form to fit all the DNA into the nucleus).  They have found something called the 30 nanoMeter fiber.  Therein hangs a tale, and some scientific philosophizing, but that’s for the next post.

Here’s a link to the next post in the series

The cell nucleus and its DNA on a human scale – III

So we’re in the grandstand looking at a sphere 150 feet in diameter, which contains 15,000,000 feet of linguini which is 3/8 of an inch thick.  The sphere is a 10 micron spherical nucleus blown up. The volume of the nucleus is 523 x 10^-18 cubic meters, and a cubic meter has 10^3 liters in it, as a liter is 1000 cubic centimeters. The 3/8 of an inch is what 20 Angstroms looks like at this magnification.  Can we see water?  Well water is about 4 Angstroms across, or 1/16 of an inch. We’re not going to see any water from our perch, even with good binoculars.  But there’s an even better reason why.  Try and figure out what it is before reading the next paragraph.

Let’s assume that everything in the sphere is at Superbowl temperature (27 Centigrade, 80 Fahrenheit; it is New Orleans after all).  How fast is water moving at this temperature?  The average (root-mean-square) speed of water (mass 18 Daltons, or 0.018 kiloGrams/mole = M) at 300 Kelvin is 

Sqrt[ 3 * R * T /M ]   in Meters/second

R is the gas constant = 8.314 Joules/(mole * Kelvin)

T is 300 Kelvin

This is 645 Meters/second.  That’s a lot of times around a 10 micron nucleus.  It’s also why molecular dynamics simulations have trouble computing times longer than 1 microSecond, and why they need to see what’s happening on a nanoSecond to picoSecond scale.  Things happen fast at the chemical level. 
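Here is a small Python sketch of that calculation, for anyone who wants to plug in other masses or temperatures. The function name rms_speed is my own, and the "crossings per second" figure is only a feel for the scale: it pretends the molecule flies straight across without hitting anything, which it certainly doesn't in liquid water.

```python
import math

GAS_CONSTANT = 8.314  # Joules per (mole * Kelvin)

def rms_speed(molar_mass_kg_per_mol, temperature_k=300.0):
    """Root-mean-square speed, Sqrt[3 * R * T / M], in meters/second."""
    return math.sqrt(3 * GAS_CONSTANT * temperature_k / molar_mass_kg_per_mol)

# Water: 18 Daltons is 18 grams/mole, i.e. 0.018 kilograms/mole.
water_speed = rms_speed(0.018)

# Hypothetical collision-free trips across a 10 micron nucleus each second.
nucleus_diameter_m = 10e-6
crossings_per_second = water_speed / nucleus_diameter_m

print(f"water at 300 K: {water_speed:.0f} meters/second")
print(f"nucleus crossings per second (no collisions): {crossings_per_second:.1e}")
```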

But we’ve blown up 10 microns to 150 feet, or increased distances by a factor of 4,500,000, so 645 meters/second times 4,500,000 is about 2.9 billion meters/second, roughly ten times the speed of light (which we know is impossible).  So the water molecules can’t be seen, even if they were quite large, and they’re not.   
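The blow-up factor and the scaled speed are easy to check. This sketch takes the 645 meters/second figure as given and uses the standard conversions (0.3048 meters per foot, 299,792,458 meters/second for light); the variable names are just illustrative.

```python
FEET_TO_METERS = 0.3048
SPEED_OF_LIGHT = 299_792_458.0  # meters/second

model_diameter_m = 150 * FEET_TO_METERS   # the 150 foot sphere
nucleus_diameter_m = 10e-6                # the real 10 micron nucleus

# How much every distance has been magnified.
scale_factor = model_diameter_m / nucleus_diameter_m

# Speed of water as it would appear in the blown-up model.
scaled_speed = 645 * scale_factor
times_light_speed = scaled_speed / SPEED_OF_LIGHT

print(f"scale factor: {scale_factor:,.0f}")
print(f"scaled water speed: {scaled_speed:.2e} meters/second")
print(f"that is {times_light_speed:.1f} times the speed of light")
```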

We’re going to be dealing with far heavier entities than water, so what is the mean speed of something with a mass of 1,000 Daltons (1 kiloGram/Mole)?  It’s 87 meters/second.  10,000 Daltons (10 kiloGrams/Mole) has an average speed of 27 meters/second, 100,000 Daltons (100 kiloGrams/Mole) moves at 9 meters/second, 1,000,000 Daltons (1 megaDalton or 1000 kiloGrams/Mole) clips along at 2.7 meters/second (about as fast as you jog).

 Is it meaningful to even think about something with a molecular mass of 1,000,000,000 Daltons (a gigaDalton)?  Of course it is;  any chromosome has a mass far greater than this. Figuring around 1000 Daltons per base pair of the double helix (including the sugars and the phosphates), a gigaDalton is only a megaBase.  The velocity is .09 meters/second.  The smallest chromosome contains 47 megabases, giving a velocity of about .0126 meters/second.  Well, that’s about one centiMeter/second, and our nucleus is 10 microns or 1/1000th of a centimeter.
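The whole ladder of speeds can be generated in a few lines of Python. This is a sketch under the same assumptions as the text: 300 Kelvin, the Sqrt[3 * R * T / M] formula, and a rough 1000 Daltons per base pair. Whether that gas formula really describes a chromosome tangled up in a crowded nucleus is another question; the code only reproduces the arithmetic.

```python
import math

GAS_CONSTANT = 8.314          # Joules per (mole * Kelvin)
DALTONS_PER_BASE_PAIR = 1000  # rough figure used in the text

def rms_speed_from_daltons(mass_daltons, temperature_k=300.0):
    """Sqrt[3 * R * T / M] with M converted from Daltons to kilograms/mole."""
    molar_mass_kg = mass_daltons / 1000.0  # 1 Dalton = 1 gram/mole
    return math.sqrt(3 * GAS_CONSTANT * temperature_k / molar_mass_kg)

masses = {
    "1 kiloDalton": 1e3,
    "10 kiloDaltons": 1e4,
    "100 kiloDaltons": 1e5,
    "1 megaDalton": 1e6,
    "1 gigaDalton (1 megaBase)": 1e9,
    "47 megaBase chromosome": 47e6 * DALTONS_PER_BASE_PAIR,
}

speeds = {name: rms_speed_from_daltons(m) for name, m in masses.items()}
for name, speed in speeds.items():
    print(f"{name:>28}: {speed:.4g} meters/second")
```

Every factor of 100 in mass costs only a factor of 10 in speed, which is why even a whole chromosome still comes out at about a centimeter per second.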

This means that even something as big (and this long) as a chromosome will be all over the nucleus many times in the course of a second.  We’ll see a writhing 15,000,000 feet of linguini, if we see anything at all.   We’re going to have to slow time down if we want to see anything at all.  That’s for next time. 

How many water molecules can our nucleus hold?  By a previous calculation we know that the volume of our nucleus is 523 * 10^-18 cubic meters.  But there are 10^3 liters in a cubic meter.   A liter is 1,000 cubic centimeters (pretty nearly).  So nuclear volume is 523 * 10^-15 liters.  The concentration of water is 1000/18 or 55.5 molar, and there are 6.022 * 10^23 molecules/mole so a liter of water contains 55.5 * 6.022 * 10^23 molecules, and our nucleus contains 1.7 * 10^13 molecules of water (convert that to dollars and you have something of the order of magnitude of the national debt).
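The same count as a short Python sketch. It assumes, as the text does, that the nucleus is a 10 micron sphere filled with nothing but water at 1000 grams per liter; the names are mine.

```python
import math

AVOGADRO = 6.022e23            # molecules per mole
WATER_GRAMS_PER_LITER = 1000.0
WATER_GRAMS_PER_MOLE = 18.0

# Volume of a sphere 10 microns in diameter, in cubic meters.
radius_m = 5e-6
nucleus_m3 = 4.0 / 3.0 * math.pi * radius_m ** 3

# 1 cubic meter = 1000 liters.
nucleus_liters = nucleus_m3 * 1000.0

water_molar = WATER_GRAMS_PER_LITER / WATER_GRAMS_PER_MOLE
water_molecules = nucleus_liters * water_molar * AVOGADRO

print(f"nuclear volume: {nucleus_liters:.3g} liters")
print(f"water concentration: {water_molar:.1f} molar")
print(f"water molecules in the nucleus: {water_molecules:.2g}")
```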

So it’s amazing that DNA holds up against the pounding that it takes.  645 Meters/second is about 1443 miles/hour, and if there’s nothing in our nucleus but DNA and water (with the DNA making up only about 0.6% of the volume of the nucleus, 3.14 of 523 cubic microns) I shudder to think of how many times a second our DNA is getting hit throughout its extent (perhaps one of you can figure it out).  It seems nothing short of miraculous that DNA holds up for a second, let alone a lifetime. Familiarity does not breed contempt.

But every compound chemists deal with is this strong.  Most solvents have molecular masses under 1,000 daltons, so the solvent molecules are moving faster than 87 meters/second.    Most compounds chemists deal with are at most an order of magnitude greater than the solvent, so they’re getting clobbered by something nearly their size (and surviving).  

Most chemistry books (even the magnificent Clayden) mention solvent, but never show it in any reaction mechanism.  It’s the elephant no one sees.   I don’t know enough about molecular dynamics simulations to talk about how (or whether) they handle solvent.

Here’s a link to the next paper in the series

The cell and its nucleus on a human scale – II

The grounds crew has finished pumping the 15,000,000 feet of linguini into the 150 foot sphere sitting on the 50 yardline.  The head groundskeeper is coming our way.  He tells us we will have to come back, because they aren’t done if we want the exercise to be realistic.  Think what else they have to do before you look at the next paragraph.  

The 15,000,000 feet covers all 3 billion bases (3 gigaBases) of the human genome.  But being human, we have a backup just to be sure.  We have two of each chromosome in our nuclei (if you’re a woman).  Males have two of each of the 22 nonSex chromosomes and an X and a Y; women have two of each of the 22 plus two X chromosomes.  This means 30,000,000 feet of linguini, bringing the space occupied by it up to 6.28 cubic microns, in the 523 cubic micron nucleus.  The adventurous can figure out what this means in cubic feet in the sphere, but the ratio will be the same.   Do we want the high priced spread?   This would be the set of chromosomes just before the nucleus is to divide, where each strand of the double helix of each chromosome has been copied, bringing the total amount of linguini up to 60,000,000 feet.  We tell him to just bring in the second set and we’ll be back.

2 Days later. 

We’re back in the grandstand 75 feet up at eye level with the equator of the sphere.  What can we see?  A la Clinton it depends on what you mean by see. The smallest wavelength of visible light is 4000 Angstroms (4 * 10^3 * 10^-10 meters) = 400 nanometers (400 x 10^-9 meters) = .4 microns or 4% of our 10 micron sphere. To see an object, we throw light at it, and the object alters the trajectory of the light wave.  To do so the object must be at least roughly the same size as the wavelength of the light wave.  This means that we can’t see anything closer together than .04 * 150 = 6 feet apart on or in the sphere.  Light of wavelength 6 feet will pass over the linguini like a gentle ocean swell passing over a swimmer. Parenthetically, this is why a century ago the brain wasn’t thought to be made of cells.  The neurons and glia are plastered so close together that visible light couldn’t see the boundaries between them.  You can also imagine why light microscopists had such a hard time seeing what was going on.  

Well, we’re going to use visible light to look at our 150 foot sphere, and 3/8 of an inch (the thickness of the linguini) is a lot bigger than .4 microns (which is .0004 milliMeters), so we can probably see the linguini (if we use high powered binoculars) and if the strands of linguini stick together enough.  Why are some of the strands of linguini likely to be close together?  Think a bit before reading the next paragraph. 

Even though the strands account for only 0.6% or so of the sphere’s volume, they can be at most 150 feet long before hitting the edge of the sphere, and we’ve got 15,000,000 feet of the stuff.  So there have to be at least 100,000 150 foot lengths of it in the nucleus (and probably a lot more) so it has to curl back on itself.  Well, the genome is chopped up into 23 different chromosomes (24 if you count the X and Y as separate).  This doesn’t help much as even the smallest (chromosome #21) contains 47 megaBases.  Each base is 1/16 of an inch (roughly) so there are 192 per foot.  So chromosome #21 is 244,792 feet long or 46 miles long.  Something must be done.  That’s for next time.
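A quick Python sketch of the chromosome #21 arithmetic, using the same round numbers as above (1/16 of an inch per base, 47 megaBases, 15,000,000 feet in all). The variable names are only for illustration.

```python
BASES_PER_INCH = 16          # each base is roughly 1/16 of an inch in the model
INCHES_PER_FOOT = 12
FEET_PER_MILE = 5280

bases_per_foot = BASES_PER_INCH * INCHES_PER_FOOT

# Smallest chromosome (#21) at model scale.
chromosome_21_bases = 47_000_000
chromosome_21_feet = chromosome_21_bases / bases_per_foot
chromosome_21_miles = chromosome_21_feet / FEET_PER_MILE

# Minimum number of sphere-spanning 150 foot lengths in the whole genome.
genome_feet = 15_000_000
sphere_diameter_feet = 150
minimum_spans = genome_feet / sphere_diameter_feet

print(f"bases per foot: {bases_per_foot}")
print(f"chromosome #21: {chromosome_21_feet:,.0f} feet, {chromosome_21_miles:.0f} miles")
print(f"150 foot lengths needed: {minimum_spans:,.0f}")
```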

Here’s a link to the next paper in the series

The cell nucleus and its DNA on a human scale – I

The nucleus is a very crowded place, filled with DNA, proteins packing up DNA, proteins patching up DNA, proteins opening up DNA to transcribe it etc. Statements like this produce no physical intuition of the sizes of the various players (to me at least).  How do you go from the 1 Angstrom hydrogen atom, the 3.4 Angstrom thickness per nucleotide (base) of DNA, the roughly 20 Angstrom diameter of the DNA double helix, to any intuition of what it’s like inside a spherical nucleus with a diameter of 10 microns?

How many bases are in the human genome?  It depends on who you read — but 3 billion (3 * 10^9) is a lowball estimate — Wikipedia has 3.08, some sources have 3.4 billion.  3 billion is a nice round number.  How physically long is the genome?  Put the DNA into the form seen in most textbooks — e.g. the double helix.  Well, an Angstrom is one ten billionth (10^-10) of a meter, and multiplying it out we get

3 * 10^9 (bases/genome) * 3.4 * 10^-10 (meters/base) = 1 (meter).

The diameter of a typical nucleus is 10 microns (10 one millionths of a meter == 10 * 10^-6 = 10^-5 meter).   So we’ve got to fit the textbook picture of our genome into something 1/100,000 its length. We’ll definitely have to bend it like Beckham.

As a chemist I think in Angstroms, as a biologist in microns and millimeters, but as an American I think in feet and inches.  To make this stuff comprehensible, think of driving from New York City to Seattle.  It’s 2840 miles or 14,995,200 feet (according to one source on the internet). Now we’re getting somewhere.  I know what a foot is, and I’ve driven most of those miles at one time or other.  Call it 15 million feet, and pack this length down by a factor of 100,000.  It’s 150 feet, half the size of a (US) football field.

Next, consider how thick DNA is relative to its length.  20 Angstroms is 20 * 10^-10 meters or 2 nanoMeters (2 * 10^-9 meters), so our DNA is 500 million times longer than it is thick.  What is 1/500,000,000 of 15,000,000 feet?  Well, it’s 3% of a foot which is .36  of an inch, very close to 3/8 of an inch.   At least in my refrigerator that’s a pair of cooked linguini twisted around each other (the double helix in edible form).  The twisting is pretty tight, a complete turn of the two strands every 35.36 angstroms, or about 1 complete turn every 1.8 thicknesses, more reminiscent of fusilli than linguini, but fusilli is too thick.  Well, no analogy is perfect.  If it were, it would be a description.   One more thing before moving on.

How thinly should the linguini be sliced to split it apart into the constituent bases?  There are roughly 6 bases/thickness, and since the thickness is 3/8 of an inch, about 1/16 of an inch.  So relative to driving from NYC to Seattle, just throw a base out the window every 1/16th of an inch, and you’ll be up to 3 billion before you know it.
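All of the scaling so far fits in one short Python sketch. It uses the round numbers from the text (a 1 meter genome, a 10 micron nucleus, 2840 miles from NYC to Seattle); the names are my own and nothing here is more than unit conversion.

```python
FEET_PER_MILE = 5280
INCHES_PER_FOOT = 12

genome_length_m = 1.0        # 3 * 10^9 bases * 3.4 * 10^-10 meters/base, rounded
nucleus_diameter_m = 1e-5    # 10 microns
dna_thickness_m = 2e-9       # 20 Angstroms
rise_per_base_angstroms = 3.4

# The genome is about 100,000 times longer than the nucleus is wide.
compaction = genome_length_m / nucleus_diameter_m

# Model: stretch the genome to the NYC-Seattle drive, then shrink by the same factor.
drive_feet = 2840 * FEET_PER_MILE
sphere_feet = drive_feet / compaction

# DNA is 500 million times longer than thick, so scale the thickness too.
aspect_ratio = genome_length_m / dna_thickness_m
thickness_inches = 15_000_000 / aspect_ratio * INCHES_PER_FOOT

# Bases stacked across one thickness, and the slice width per base.
bases_per_thickness = 20 / rise_per_base_angstroms
slice_inches = thickness_inches / bases_per_thickness

print(f"drive: {drive_feet:,} feet, sphere: {sphere_feet:.0f} feet")
print(f"linguini thickness: {thickness_inches:.2f} inches")
print(f"one base: {slice_inches:.3f} inches (1/16 inch is {1/16:.4f})")
```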

You’ve been so good following to this point that you get tickets for 50 yardline seats in the Superdome.  You’re sitting far enough back so that you’re 75 feet above the field, placing you right at the equator of our 150 foot sphere. The north and south poles of the sphere are over the 50 yard line, halfway between the two sides.  You are about to watch the grounds crew pump 15,000,000 feet of linguini into the sphere. Will it burst?  We know it won’t (or we wouldn’t exist).  But how much of the sphere will the linguini take up?

The volume of any sphere is 4/3 * pi * radius^3.  So the volume of our sphere of 10 microns diameter is 4/3 * 3.14 * 5 * 5 * 5 = 523 cubic microns. There are 10^18 cubic microns in a cubic meter.  So our spherical nucleus has a volume of 523 * 10^-18 cubic meters.  What is the volume of the DNA cylinder? Its radius is 10 Angstroms or 1 nanoMeter.  So its volume is 1 meter (length of the stretched out DNA) * pi * 10^-9 * 10^-9 square meters = 3.14 * 10^-18 cubic meters (or 3.14 cubic microns, since a cubic micron is 10^-6 * 10^-6 * 10^-6 = 10^-18 cubic meters).
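Here is the volume comparison as a minimal Python sketch, assuming a perfectly spherical nucleus and DNA stretched out as a straight cylinder 1 meter long. The names are mine.

```python
import math

CUBIC_MICRONS_PER_CUBIC_METER = 1e18

# Nucleus: sphere 10 microns in diameter, worked directly in microns.
nucleus_radius_um = 5.0
nucleus_um3 = 4.0 / 3.0 * math.pi * nucleus_radius_um ** 3

# DNA: cylinder of radius 1 nanometer and length 1 meter, worked in meters.
dna_radius_m = 1e-9
dna_length_m = 1.0
dna_m3 = math.pi * dna_radius_m ** 2 * dna_length_m
dna_um3 = dna_m3 * CUBIC_MICRONS_PER_CUBIC_METER

fraction = dna_um3 / nucleus_um3

print(f"nucleus: {nucleus_um3:.1f} cubic microns")
print(f"DNA: {dna_um3:.2f} cubic microns")
print(f"DNA fills {fraction:.2%} of the nucleus")
```

Doubling the DNA to two copies of each chromosome doubles the 3.14 to 6.28 cubic microns, which is still only a little over 1% of the sphere.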

Even though it’s 15,000,000 feet long, the volume of the linguini is only 3.14/523 of the sphere.  Plenty of room for the grounds crew who begin reeling it in at 60 miles an hour.  Since they have 2840 miles of the stuff to reel in, we’ll have to come back in a few days to watch the show.  While we’re waiting, we might think of how anything can be accurately located in 2840 miles of linguini in a 150 foot sphere.

Here’s a link to the next paper in the series