Category Archives: Molecular Biology

Does she or doesn’t she? Only her geneticist knows for sure

Back in the day there was a famous ad for Clairol — Does she or doesn’t she? Only her hairdresser knows for sure.  Now it’s the geneticist who can sequence genes for Two Pore Channels in pigment forming cells (melanocytes) who really knows.

Except for redheads, skin and hair color is determined by how much eumelanin you have.  All human melanins are  polymers of oxidation products of tyrosine (DOPA, DOPAquinone) and indole 5,6 quinone, so its chemical structure isn’t certain.  It is made inside a specialized organelle of the melanocyte called (logically enough) the melanosome.

There is all sorts of interesting chemistry and physiology involved.  In particular a melanosome protein called Pmel17 adopts an amyloid-like structure (so not all amyloid is bad !) for the construction of melanin.  The crucial enzyme oxidizing tyrosine is tyrosinase, and its activity strongly depends on pH, being most active at pH 7 (neutral pH).

In the melanosome membrane is found TPC2, which helps control ion flow in and out of the melanosome.  Two mutations Methionine #484 –> Leucine (or M484L) and Glycine #734 –> Glutamic acid (G734E) are associated with a shift from brown to blond.  You have blond hair if your melanosomes make less melanin.  Both mutations result in an increase in TPC2 activity resulting in lower pH, lower tyrosinase activity and less melanin in the melanosome — voila — a blond.

So it doesn’t take a big change (one amino acid in over 734) in the huge TPC2 protein for the shift to occur.

Who knew Marshall McLuhan was a molecular biologist

Marshall McLuhan famously said “the medium is the message”. Who knew he was talking about molecular biology?  But he was, if you think of the process of transcription of DNA into various forms of RNA as the medium and the products of transcription as the message.  That’s exactly what this paper [ Cell vol. 171 pp. 103 – 119 ’17 ] says.

T cells are a type of immune cell formed in the thymus.  One of the important transcription factors which turns on expression of the genes which make a T cell a T cell is called Bcl11b.  Early in T cell development the gene for Bcl11b is sequestered away near the nuclear membrane in highly compacted DNA. Remember that you must compress your 1 meter of DNA down 100,000-fold to have it fit in the nucleus, which is 1/100,000th of a meter (10 microns) across.

What turns it on?  Transcription of a nonCoding (for protein) RNA called ThymoD.  From my reading of the paper, ThymoD doesn’t do anything, but just the act of opening up compacted DNA near the nuclear membrane produced by transcribing ThymoD is enough to cause this part of the genome to move into the center of the nucleus where the gene for Bcl11b can be transcribed into RNA.

There’s a lot more to the paper,  but that’s the message if you will.  It’s the act of transcription rather than what is being transcribed which is important.

The paper doesn’t talk about the structure of ThymoD — how long it is, whether it binds to anything in the nucleus — etc. etc.  Perhaps I’ve missed it.  I’ve written the lead author. Hopefully I won’t be too embarrassed by what he responds.

Here’s more about McLuhan — https://en.wikipedia.org/wiki/Marshall_McLuhan

If some of the terms used here are unfamiliar — look at the following post and follow the links as far as you need to.  https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/

 

How fast is your biological clock ticking — latest results

Our family breeds like sequoias.  Medicine has improved, but biology hasn’t changed, and problems with fertility and miscarriages have emerged in the generation behind me.   A cousin had a child at 46 who is now in grad school.  My brother had a child at 48, also doing OK. One son, who is north of 50 has an infant and a 3 year old.  That’s why the following paper from Iceland is so relevant.  I’ve posted on this subject before, but the new paper has 10 times the data of the old [ Nature vol. 549 pp. 519 – 522 ’17 ].

The paper is from Iceland, and whether the data can be extrapolated to other populations isn’t clear — but the biology in question is so basic that I think it can. Some 1,548 mother father child trios had their entire genomes sequenced (to 35 fold coverage).  In addition, 225 of the children had reproduced, providing a few 2 generation families.  If any position in the 3,200,000,000 bases of the child’s genome differs from that of both the mother and the father, then a mutation has taken place.  It isn’t clear how old the children were when sequenced, so possibly some of the mutations arose since birth.

Some 108,778 de novo mutations were found in over 1548 + 225 (at least) individuals — so each individual carried an average of 61 de novo mutations.  When the number of mutations was plotted against the ages of both parents, it was found that each year a father waited to reproduce added 1.51 mutations.  Previous work (with much less data) stated that the age of the mother didn’t matter.  Not so, although the mutational burden of an additional year before reproduction in a woman increased the mutations 4 times less (0.37 extra mutations/year of maternal life).

The previous paper reported on was somewhat suspect, because the 78 parent child trios had a child with autism.  Not so in this population study.

The numbers were large enough that the type of mutation could be studied.  Mothers and fathers had different types of mutations in different frequencies.   They found one 20 megaBase region on chromosome #8 with a mutation rate of cytosine to guanine (C to G) 50 times higher than the rest of the genome.

People use ‘molecular clocks’ to time evolution of species, based on the assumption that the mutation rate is constant.  But it isn’t with age, and a shift in the average age for reproduction could seriously screw up the molecular clock predictions.

An average of 61 de novo mutations per individual sounds pretty horrible, but it isn’t when you consider that 3,200,000,000 – 61 positions were copied faithfully (an error rate of 1 in 50 million).
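The arithmetic behind these figures is easy to check. Here is a minimal sketch using only the numbers quoted in this post; the variable names are mine, and the head count of 1,548 + 225 is the post's own lower bound rather than an exact figure from the paper.

```python
# Numbers as quoted in the post (Nature vol. 549 pp. 519 - 522 '17).
DE_NOVO_MUTATIONS = 108_778
INDIVIDUALS = 1_548 + 225          # trios plus children who had reproduced (a lower bound)
GENOME_POSITIONS = 3_200_000_000   # bases in the genome, as used in the post
PATERNAL_PER_YEAR = 1.51           # extra mutations per year of paternal age
MATERNAL_PER_YEAR = 0.37           # extra mutations per year of maternal age

# Average number of de novo mutations carried by each individual.
mutations_per_individual = DE_NOVO_MUTATIONS / INDIVIDUALS

# How many positions are copied for every one copied wrongly.
positions_per_error = GENOME_POSITIONS / mutations_per_individual

# How much more a year of paternal age costs than a year of maternal age.
paternal_to_maternal = PATERNAL_PER_YEAR / MATERNAL_PER_YEAR

print(f"mutations per individual: {mutations_per_individual:.1f}")
print(f"one error per {positions_per_error / 1e6:.0f} million positions")
print(f"paternal/maternal ratio: {paternal_to_maternal:.1f}")
```

The results land at roughly 61 mutations per individual, one error in a bit over 50 million positions, and a paternal effect about 4 times the maternal one, which matches the rounded figures in the text.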

 

The worst name for a drug I’ve ever heard of

It is simply impossible for me to think of a worse name for a drug which might help people with Down syndrome than ALGERNON.   The authors can be excused as they’re all from Japan, but the editor of the paper Fred Gage should have known about ‘Flowers for Algernon’– https://en.wikipedia.org/wiki/Flowers_for_Algernon.  Briefly, it’s a story about a drug which tripled the intelligence of Algernon a laboratory mouse which was then given to a retarded individual (Charlie Gordon) whose intelligence similarly tripled, only to decline like Algernon’s.  It was originally a short story, then a book, then a play etc. etc.

The drug is potentially quite exciting (ALGERNON is an acronym for ALtered GenERatioN Of Neurons).  It increases the number of neurons formed by mice with a model of trisomy 21.  The brain is bigger, and the animals do better on tests.  It is thought to work by inhibiting an enzyme (DYRK1A) which adds phosphate to serine, threonine and tyrosine, making it a dual specificity kinase.  It phosphorylates a variety of proteins known to have significant effects on brain development (tau, cyclin D1, caspase9, Notch, gli1, etc). The net effect of DYRK1A inhibition is to increase neural stem cell proliferation during fetal life.

Chemists will be interested in just how simple the structure of ALGERNON is — it’s an all aromatic compound made of a pyridine linked to a fused 6:5 ring system in which the 5 membered ring contains 2 nitrogens.  That’s it.  No alcohols, methyls, ethyls, ..  amines, amides, ethers etc., etc.

The authors blue-sky a bit at the end.  They note that mice show neural proliferation during adult life (we do as well, but to a much lesser extent).  It might be useful to improve function in living Down syndrome individuals, and just about any other neurological problem in which neural proliferation would be beneficial.  It might also be offered to women carrying a Down fetus who object to abortion on moral grounds.  Exciting stuff, but for god’s sake change the name.

DNA solves a 25 year old rape/murder in a new way

It probably won’t make national news, but a 25 year old rape/murder of a 24 year old school teacher was solved with a new way to use DNA.  Yes, the cops had a DNA sample; but no matches were found in the national databases.  Recently they sent some DNA to https://www.parabon-nanolabs.com, a company that claims to produce a descriptive profile of the source of any human DNA sample, including pigmentation, face morphology, and other forensically relevant traits.  Sounds like total BS but it worked.

Law enforcement had worked on the case for 25 years, and the parents never gave up. Over the years they compiled a long list of suspects and persons of interest.  The physical description produced by the company allowed the police to narrow down the list of possible suspects and focus on a smaller group.  Prior to receiving the Parabon information, the DA’s office had begun reexamining all the evidence and reviewing the stories of all the many subjects interviewed over the years, in an attempt to rule out suspects.

In the last few months, the DA’s office went through legal processes to get court orders to compel people of interest to provide DNA samples.  Now that’s fascinating — aren’t you supposed to be protected against self-incrimination — clearly something to ask my nephew — who is an attorney very interested in such matters.  I’ll put in an addendum from him.

That was how investigators showed up at the suspect’s residence last week to execute a court order compelling him to give a sample.  He was not home at the time, but investigators left a message explaining why they were there.  The suspect upon learning this, bolted to another state, where he apparently tried to kill himself.  He has apparently confessed.  God only knows what else he’s done in the past 25 years.

Addendum 25 Sep ’17

The following work by Venter is scary with its implications for privacy [ Proc. Natl. Acad. Sci. vol. 114 pp. 10166 – 10171 ’17 ].

Prediction of human physical traits and demographic information from genomic data challenges privacy and data deidentification in personalized medicine. To explore the current capabilities of phenotype-based genomic identification, we applied whole-genome sequencing, detailed phenotyping, and statistical modeling to predict biometric traits in a cohort of 1,061 participants of diverse ancestry. Individually, for a large fraction of the traits, their predictive accuracy beyond ancestry and demographic information is limited. However, we have developed a maximum entropy algorithm that integrates multiple predictions to determine which genomic samples and phenotype measurements originate from the same person. Using this algorithm, we have reidentified an average of >8 of 10 held-out individuals in an ethnically mixed cohort and an average of 5 of either 10 African Americans or 10 Europeans. This work challenges current conceptions of personal privacy and may have far-reaching ethical and legal implications.

18 at one blow said the molecular biologist

With apologies to the brothers Grimm, molecular biologists may have found a way to treat 18 genetic diseases at one blow [ Cell vol. 170 pp. 899 – 912 ’17 ]. They use adeno-associated virus (AAV) packing a modified enzyme and an RNA to remove repeat expansions from RNA.   The paper gives a list of the 18, all but one of which are neurologic.  They include such horrors as Huntington’s chorea, the most common form of familial ALS, 3 forms of spinocerebellar ataxia and 6 forms of spinocerebellar atrophy.

They use Cas9 from Streptococcus pyogenes, part of the CRISPR system (https://en.wikipedia.org/wiki/CRISPR) bacteria use to defend themselves against viruses, with a single guide RNA.  Even more interestingly, Cas9 is normally an enzyme which cuts nucleic acid, but the Cas9 they used is catalytically dead.  They think that just binding to the aggregated RNA containing the repeats is enough to break up the aggregate.  This is the way antiSense oligoNucleotides are thought to work.

The problem with getting a bacterial enzyme into a human cell is avoided here by using a virus to infect them (AAV).  It did get rid of RNA aggregates in patients’ cells from 4 of the diseases (two myotonic dystrophies, and the familial ALS).

It is almost too fantastic to be true.

Why almost all of these repeat expansion diseases affect the nervous system is anyone’s guess.  As you can imagine, theories abound.  So all we have to do is figure out how to get the therapy into the brain (hardly a small task).

The emperor has no clothes

As an old organic chemist, I’ve always been fascinated with size of proteins (n functional groups in a protein of length n — not counting the amide bonds), and the myriad of shapes they can assume.  It seems nothing short of miraculous (to me at least) that the proteins making us up assume just a few shapes out of the nearly 3^n possible shapes (avoiding self intersection removes a few).

This has been ‘explained’ by the potential energy funnel, down which newly formed proteins slide to their final few destinations.  Now I took quantum mechanics 56+ years ago, and back then a lot of heavy lifting was required just to calculate the potential energy surface required to bring two hydrogen atoms together to form molecular hydrogen.

I’ve never seen a potential energy surface for a protein actually calculated, and I’m not sure molecular dynamics simulations do this (please correct me if I’m wrong).

So I was glad to see the following in a paper by

S. WALTER ENGLANDER, Ph.D.

Jacob Gershon-Cohen Professor of Medical Science
Professor of Biochemistry and Biophysics

at my alma mater Penn Med (the hell with the Perelman’s, Penn sold themselves out to the Perelman’s very cheaply).

“A critical feature of the funneled ELT (Energy Landscape Theory) model is that the many-pathway residue-level conformational search must be biased toward native-like interactions. Otherwise, as noted by Levinthal, an unguided random search would require a very long time. How this bias might be implemented in terms of real protein interactions has never been discovered. One simply asserts that natural evolution has made it so, formulates this view as a so-called principle of minimal frustration, and attributes it to the shape of the funneled energy landscape.”

 Proc. Natl. Acad. Sci. vol. 114 pp. 8253 – 8258 ’17.

So the potential energy funnel of energy landscape theory is not something you can calculate explicitly (like a gravitational or an electrical potential), but just a high-falutin’ description of what happens inside our cells, masquerading as an explanation.

So when does a description become an explanation?  Newton famously said Hypotheses non fingo (Latin for “I feign no hypotheses”) when discussing the action at a distance which his theory of gravity entailed.

Well it becomes an explanation when you can use the description to predict and define new phenomena — e.g. using Newton’s laws to send a projectile to Jupiter, using Einstein’s theory of gravitation to predict black holes and gravitational waves etc. etc.

In this sense Energy Landscape Theory is just words.  If it wasn’t you could predict the shape an arbitrary string of amino acids would assume (and you can’t).  Theory does work fairly well when folding algorithms are given a protein of known shape (but not published), but try them out on an arbitrary string — which I don’t think has been done.

But it gets worse.  ELT sweeps the problem of why a protein should have one (or a few) shapes under the rug, by assuming that they do.  I’m far from convinced that this is the case in general, which means that the proteins which make us up are quite special.

I’ll conclude with an earlier post on this subject, which basically says that an experiment to decide the issue, while possible in theory is physically impossible to fully perform.

A chemical Gedanken experiment

This post is mostly something I posted on the Skeptical Chymist 2 years ago.  Along with the previous post “Why should a protein have just one shape (or any shape for that matter)” both will be referred to in the next one –“Gentlemen start your motors”, concerning the improbability of the chemistry underlying our existence and whether it is reasonable to believe that it arose by chance.

In the early days of quantum mechanics Einstein and Bohr threw thought experiments (gedanken experiments) at each other like teenagers tossing cherry bombs.  None of the gedanken experiments were regarded as remotely possible back then, although thanks to Bell and Aspect, quantum nonlocality and entanglement now have a solid experimental basis.  To read more about this you can’t do much better than “The Age of Entanglement” by Louisa Gilder.

Frankly, I doubt that most strings of amino acids have a dominant shape (e.g., biological meaning), and even if they did, they couldn’t find it quickly enough (the Levinthal paradox).  For details see the previous post.

How would you prove me wrong? The same way you’d prove a pair of dice was loaded. Just make (using solid-phase protein synthesis a la Merrifield) a bunch of random strings of amino acids (each 41 amino acids long) and see how many have a dominant shape. Any sequence forming a crystal does have a dominant shape; if the sequence doesn’t crystallize, use NMR to look at it in solution. You can’t make all of them, because the earth doesn’t have enough mass to do so (see https://luysii.wordpress.com/2009/12/20/how-many-proteins-can-be-made-using-the-entire-earth-mass-to-do-so/). That’s why this is a gedanken experiment — it can’t possibly be performed in toto.

Even so, the experiment is over (and I’m wrong) if even 1% of the proteins you make turn out to have a dominant shape.

However, choosing a random string of amino acids is far from trivial. Some amino acids appear more frequently than others depending on the protein. Proteins are definitely not a random collection of amino acids. Consider collagen. In its various forms (there are over 20, coded for by at least 30 distinct genes) collagen accounts for 25% of body protein. Statistically, each of the 20 amino acids should account for 5% of the protein, yet one amino acid (glycine) accounts for 30% and proline another 15%. Even knowing this, the statistical chances of producing 300 copies in a row of glycine–any amino acid–any amino acid by a random distribution of the glycines are less than zilch. But one type of bovine collagen protein has over 300 such copies in its 1042 amino acids.
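To put a rough number on "less than zilch", here is a back-of-the-envelope sketch. It rests on simplifying assumptions of my own, not on anything in the collagen literature: every position is drawn independently, glycine turns up with the 30% frequency quoted above, and we only ask that glycine occupy the first position of 300 consecutive triplets starting from one fixed place.

```python
import math

# Assumptions for this sketch (mine, not measured values):
GLYCINE_FREQUENCY = 0.30   # chance that any one position is glycine
REPEATS = 300              # consecutive glycine-X-Y triplets wanted

# Probability that glycine leads every one of the 300 triplets is
# GLYCINE_FREQUENCY ** REPEATS. Work in log10 so the size is readable.
log10_probability = REPEATS * math.log10(GLYCINE_FREQUENCY)

print(f"probability is about 10^{log10_probability:.0f}")
```

Even with glycine at 30% rather than the 'statistical' 5%, the chance comes out on the order of 1 in 10^157. Allowing the run to start anywhere in a 1042 amino acid chain buys back only about three of those 157 orders of magnitude.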

One further example of the nonrandomness of proteins. If you were picking out a series of letters randomly hoping to form a word, you would not expect a series of 10 ‘a’s to show up. But we normally contain many such proteins, and for some reason too many copies of the repeated amino acid produce some of the neurological diseases I (ineffectually) battled as a physician. Normal people have 11 to 34 glutamines in a row in a huge (molecular mass 384 kiloDaltons — that’s over 3000 amino acids) protein known as huntingtin. In those unfortunate individuals with Huntington’s chorea, the number of repeats expands to over 40. One of Max Perutz’s last papers [Proc. Natl. Acad. Sci. USA 99, 5591–5595 (2002)] tried to figure out why this was so harmful.

On to the actual experiment. Suppose you had made 1,000,000 distinct random sequence proteins containing 41 amino acids and none of them had a dominant shape. This proves/disproves nothing. 10^6 is fewer than the possibilities inherent in a string of 5 amino acids, and you’ve only explored 10^6/(20^41) of the possibilities.
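The sizes involved are easy to verify, since Python integers don't overflow. A small sketch follows; the names are mine, and the 20 letter alphabet and length of 41 come from the text.

```python
import math

AMINO_ACIDS = 20        # the standard alphabet
LENGTH = 41             # length of each random protein in the experiment
TESTED = 10**6          # proteins made and examined

# Number of distinct strings of 5 amino acids: already more than a million.
pentapeptides = AMINO_ACIDS**5

# Number of distinct proteins of length 41 (an exact integer).
sequence_space = AMINO_ACIDS**LENGTH

# Fraction of the length-41 space covered by a million proteins.
fraction_explored = TESTED / sequence_space

print(f"20^5  = {pentapeptides:,}")
print(f"20^41 is about 10^{math.log10(sequence_space):.1f}")
print(f"fraction explored is about 10^{math.log10(fraction_explored):.1f}")
```

So a million proteins doesn't even exhaust the 3.2 million possible pentapeptides, and it covers well under 10^-47 of the length-41 space.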

Would Karl Popper, philosopher of science, even allow the question of how commonly proteins have a dominant shape to be called scientific? Much of what I know about Popper comes from a fascinating book “Wittgenstein’s Poker” and it isn’t pleasant. Questions not resolvable by experiment fall outside Popper’s canon of questions scientific. The gedanken experiment described can resolve the question one way, but not the other. In this respect it’s like the halting problem in computer science (there is no general rule to tell if a program will terminate).

Would Ludwig Wittgenstein, uberphilosopher, think the question philosophical? Probably not. His major work “Tractatus Logico-Philosophicus” concludes with “What we cannot speak of we must pass over in silence”. While he’s the uberphilosopher he’s also the antiscientist. It’s exactly what we don’t know which leads to the juiciest speculation and most creative experiments in any field of science. That’s what I loved about organic chemistry years ago (and now). It is nearly always possible to design a molecule from scratch to test an idea. There was no reason to make [7]paracyclophane, other than to get up close and personal with the ring current.

If the probability or improbability of our existence, to which the gedanken experiment speaks, isn’t a philosophical question, what is?

Back then, this post produced the following excellent comment.

I’m not sure your assessment of what Popper would regard as science is accurate. Popper advocated “falsifiability”, i.e. that a statement cannot be proved true, only false. Non-scientific statements are those for which evidence that they are false cannot be found. You are in fact giving a perfect example of a situation where falsifiability is useful. If you tested, as you suggested, a million random proteins and many of them formed structures reliably, this would in fact disprove the hypothesis fairly conclusively (if only probabilistically). The fact that the test was passed by the first million proteins would be evidence that the theory was true (though obviously not concrete).

Also, it is relatively easy to choose what random proteins to make. Just use a random number generator (a pseudorandom generator would do too, probably). It doesn’t matter that they would be unlikely to produce a specific sequence generated in nature, as we are looking at specifically wanting to look at random sequences. The idea that 300 glycines is particularly unusual if protein generation is random is probably one which should be treated with a degree of caution. As the sequence was not specified as an unusual sequence beforehand, there are a large number of possible sequences that you could have seized on, and so care is needed.

This is only the most obvious experiment that could be carried out to test this idea, and I’m sure with advances, there is the distinct chance that more ingenious ways could be devised.

Additionally the mass restriction is not in fact terribly useful except as an illustration that there is a massively large number of proteins, as once you have made and tested a protein, you can in fact reuse its atoms to make another protein.

Finally, I haven’t read Wittgenstein, but that final quote does not really support your statement that he is “anti-science” or would be against the production of novel cyclophanes. Organic chemistry clearly lies in the realm of “what we can speak”, as we are in fact speaking about it.

Posted by: MCliffe

My response —

MCliffe — thank you for your very thoughtful comments on the post. It’s great to know that someone out there is reading them.

Popper and the logical positivists solved many philosophical problems by declaring them meaningless (which Popper later took to mean not falsifiable). Things got to such a point in the 50s that Bertrand Russell was moved to come up with the meaningful (to most) but non-falsifiable statement — In the event of a nuclear war we shall all be dead.

You are quite right that it is easy to make a random sequence of amino acids using a computer. It’s been shown again and again that our intuitive notion of randomness is usually incorrect. I chose collagen because it is the most common protein in our body, and because it is highly nonrandom. Huntingtin was used because I dealt with its effects as a Neurologist (and because there are 8 more diseases with too many identical amino acids in a row — all of which for some unfathomable reason produce neurologic disease — they are called triplet diseases because it takes 3 nucleotides of DNA to code for a single amino acid).

Even accepting 300 glycines in 1000 or so amino acids (collagen) and putting that frequency into the random generator and turning it on, we would not expect those 300 glycines to appear at position n, position n+3, n+6, . . . , n + 897 randomly.

The idea of using the atoms over and over to escape the mass restriction is clever. Unfortunately it runs up against a time restriction. Let us suppose there is a super-industrious post-doc who can make a new protein every nanosecond (reusing the atoms). There are 60 * 60 * 24 * 365 = 31,536,000 ~ 10^7 seconds in a year and 10^10 years (more or less) since the big bang. This is 10^9 (proteins/second) * 10^7 * 10^10 = 10^26 different proteins he could make since the dawn of time. But there are 20^41 = 2^41 * 10^41 proteins of length 41 amino acids. 2^41 = 2,199,023,255,552 ~ 10^12. So he has only tested 10^26 of 10^53 possible 41 amino acid proteins in all this time.
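For anyone who wants to rerun the post-doc's career without the rounding, here is a minimal sketch. The production rate and the age of the universe are the round figures used above, not measured values, and the variable names are mine.

```python
import math

SECONDS_PER_YEAR = 60 * 60 * 24 * 365   # 31,536,000
PROTEINS_PER_SECOND = 10**9             # one new protein every nanosecond
YEARS_SINCE_BIG_BANG = 10**10           # more or less

# Everything the post-doc could have made since the dawn of time.
proteins_made = PROTEINS_PER_SECOND * SECONDS_PER_YEAR * YEARS_SINCE_BIG_BANG

# All possible proteins of length 41, written the same way as in the text.
sequence_space = 2**41 * 10**41          # identical to 20**41

fraction_tested = proteins_made / sequence_space

print(f"proteins made:   about 10^{math.log10(proteins_made):.1f}")
print(f"sequence space:  about 10^{math.log10(sequence_space):.1f}")
print(f"fraction tested: about 10^{math.log10(fraction_tested):.1f}")
```

Keeping the factors of 3 and 2 that the rounding drops changes nothing important: the fraction tested is still around 10^-27.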

This is what I was getting at by saying the gedanken experiment was not a priori falsifiable — we lack the time, space and mass to run it to completion. As you note, it could well end quite early if I’m wrong. Suppose 10^9/10^53 of the proteins DO have a dominant shape — the postdoc will be very unlikely to find any of them.

I think your final point is well taken. My reading of “Wittgenstein’s Poker” is that what he was saying in his last sentence really was “What we cannot speak (with certainty) of we must pass over in silence”. We cannot speak of the outcome of this Gedanken experiment with any degree of certainty.

Once again Thanks

What our DNA looks like inside a living cell

Time to rewrite the textbooks.  DNA in the living cell looks nothing like the pictures that have appeared in textbooks for years. Gone are the 30 nanoMeter fiber and higher order structures.

Here is the old consensus of how DNA in the nucleus is organized.

There are two different structural models of the 30 nanoMeter fiber (1) solenoid — diameter 33 nanoMeters with 6 nucleosomes every 11 nanoMeters along the axis (2) two start zigzag fiber — diameter 27 – 30 nanoMeters with 5 – 6 nucleosomes every 11 nanoMeters.
The 30 nanoMeter fiber is thought to assemble into helically folded 120 nanoMeter chromonema, 300 – 700 nanoMeter chromatids and mitotic chromosomes (1,400 nanoMeters).  The chromonema structures (measured between 100 and 130 nanoMeters) are based on electron micrographic studies of permeabilized nuclei from which other components have been extracted with detergents and high salt to visualize chromatin — hardly physiologic.

Got all that?  Good, now forget it.  It’s wrong.

First off, forget nanoMeters.  Organic chemists think in Angstroms — the diameter of the smallest atom Hydrogen is almost exactly 1 Angstrom, making it the perfect organic chemical yardstick.  If you must think in nanoMeters, just divide the number of Angstroms by 10.

First, a few numbers to get started.
 The classic form of DNA is B-DNA (this is still correct). https://en.wikipedia.org/wiki/Nucleic_acid_double_helix.  Each nucleotide pair is 3.4 Angstroms above the next and there are 10.4 nucleotides per turn of the helix  (so 1 full turn of B DNA is 35.36 Angstroms).  The diameter of B-DNA is 19 Angstroms.

The nucleosome consists of 147 bases of DNA wrapped around a central mass made of 8 histone proteins. The histone octamer is made of two copies each of histones H2A, H2B, H3 and H4.  The core particle in its entirety is 100 Angstroms in diameter and 57 Angstroms along the axis of the disk and possesses nearly dyadic symmetry.  There are 1.65 turns of DNA around the histone octamer, and during the trip there are 14 contact points between histones and DNA.

Now on to the actual paper [ Science vol. 357 pp. 354 – 355, 370, eaag0025 1 –> 13 ’17 ].  The movies contained within alone are worth a year’s subscription to Science.

To visualize DNA in the living cells the authors invented a technique called  Chromatin Electron Microscopy Tomography (ChromEMT).

 DNA is transparent to electrons.  They use a fluorescent  DNA binding dye (Deep Red fluorescing AnthraQuinone Nr.5  ). For a structure see — http://onlinelibrary.wiley.com/doi/10.1002/1097-0320(20000801)40:4%3C280::AID-CYTO4%3E3.0.CO;2-7/full.  It has 3 probably aromatic rings fused together like anthracene, so it could easily intercalate between the bases of the double helix.   Then there are OH groups and amines to bind to the backbone.  The dye gets into cells easily.  Most importantly, DRAQ5 produces reactive oxygen species when hit by the right kind of light.  Somehow they get diaminobenzidine in the cells, which the reactive oxygen species polymerizes to polybenzimidazole.

We’re not done yet.  The polymer is also transparent to electrons, but it can react with good old Osmium tetroxide (which is electron dense), permitting visualization of DNA on electron microscopy (at last).

The technique is the first that can be used in living cells.  It shows that chromatin in the nucleus is mostly organized as a disordered polymer of 50 to 240 Angstroms diameter.   This is consistent with beads on a string (with nucleosomes being the beads).  They found little evidence for higher order structures (the 300 to 1,200 Angstrom fibers of classic textbook models, which are in fact based on in vitro visualization of non-native chromatin).  The 30 nanoMeter chromatin fiber (300 Angstroms) is nowhere to be seen.  However, they do find 300 Angstrom fibers using their new method, but only in nuclei purified from hypotonically lysed chicken RBCs treated with MgCl2 (hardly physiologic).

They were able to make a movie of an electron micrograph in the nucleus using eight tilts of the stage.  There is more DNA at the nuclear rim (as that’s where the heterochromatin mostly is), but you still see the little 5 – 24 nanoMeter circles (just more of them the closer you get to the nuclear membrane).
Another movie of a mitotic chromosome shows the same little circles (50 – 240 Angstroms) just packed together more closely.  You just see a lot of them, but there is no obvious bunching of them into higher structures.
The technique (ChromEMT) is amazing in that it allows the ultrastructure of individual chromatin chains, megabase domains and mitotic chromosomes to be resolved and visualized as a continuum in serial slices.   They found that the 5 – 12 and 12 – 24 nanoMeter chromatin diameters were the same regardless of how heavily the chromatin was compacted.
The paper is incredible and worth a year’s subscription to Science.  It likely is behind a paywall.
It’s hard to get your mind around the amount of compaction involved in getting the meter of DNA of the human genome into a nucleus.  Molecular Biology of the Cell 4th Edition p. 198 put it this way —  Compacting the meter of DNA into a 6 micron nucleus is like compacting 24 miles of very fine thread into a tennis ball.
I actually wrote a series of posts, trying to put the amount of compaction into human scale.  Here is the first post — follow the links at the end to the others.

The cell nucleus and its DNA on a human scale – I

The nucleus is a very crowded place, filled with DNA, proteins packing up DNA, proteins patching up DNA, proteins opening up DNA to transcribe it etc. Statements like this produce no physical intuition of the sizes of the various players (to me at least).  How do you go from the 1 Angstrom hydrogen atom, the 3.4 Angstrom thickness per nucleotide (base) of DNA, the roughly 20 Angstrom diameter of the DNA double helix, to any intuition of what it’s like inside a spherical nucleus with a diameter of 10 microns?

How many bases are in the human genome?  It depends on who you read — but 3 billion (3 * 10^9) is a lowball estimate — Wikipedia has 3.08, some sources have 3.4 billion.  3 billion is a nice round number.  How physically long is the genome?  Put the DNA into the form seen in most textbooks — e.g. the double helix.  Well, an Angstrom is one ten billionth (10^-10) of a meter, and multiplying it out we get

3 * 10^9 (bases/genome) * 3.4 * 10^-10 (meters/base) = 1 (meter).

The diameter of a typical nucleus is 10 microns (10 one millionths of a meter == 10 * 10^-6 = 10^-5 meter).   So we’ve got to fit the textbook picture of our genome into something 1/100,000 the size. We’ll definitely have to bend it like Beckham.
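
The arithmetic so far is easy to check. Here is a minimal sketch in Python, using only the round numbers from the text (3 billion bases, 3.4 Angstroms per base, a 10 micron nucleus); the variable names are mine and purely illustrative.

```python
# Round numbers taken from the text; the names are illustrative only.
BASES_PER_GENOME = 3e9        # lowball estimate of bases in the human genome
METERS_PER_BASE = 3.4e-10     # 3.4 Angstroms of helix length per base
NUCLEUS_DIAMETER_M = 10e-6    # 10 microns, expressed in meters

# Length of the genome stretched out as one textbook double helix
genome_length_m = BASES_PER_GENOME * METERS_PER_BASE

# How many times longer the DNA is than the nucleus is wide
compaction_factor = genome_length_m / NUCLEUS_DIAMETER_M

print(f"stretched-out genome: {genome_length_m:.2f} meters")
print(f"length / nuclear diameter: {compaction_factor:,.0f}")
```

Unrounded, the length comes out at 1.02 meters and the ratio at about 102,000, which is where the round figures of 1 meter and 100,000 come from.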

As a chemist I think in Angstroms, as a biologist in microns and millimeters, but as an American I think in feet and inches.  To make this stuff comprehensible, think of driving from New York City to Seattle.  It’s 2840 miles or 14,995,200 feet (according to one source on the internet). Now we’re getting somewhere.  I know what a foot is, and I’ve driven most of those miles at one time or other.  Call it 15 million feet, and pack this length down by a factor of 100,000.  It’s 150 feet, half the size of a (US) football field.

Next, consider how thick DNA is relative to its length.  20 Angstroms is 20 * 10^-10 meters or 2 nanoMeters (2 * 10^-9 meters), so our DNA is 500 million times longer than it is thick.  What is 1/500,000,000 of 15,000,000 feet?  Well, it’s 3% of a foot which is 0.36 of an inch, very close to 3/8 of an inch.   At least in my refrigerator that’s a pair of cooked linguini twisted around each other (the double helix in edible form).  The twisting is pretty tight, a complete turn of the two strands every 35.36 Angstroms, or about 1 complete turn every 1.8 thicknesses, more reminiscent of fusilli than linguini, but fusilli is too thick.  Well, no analogy is perfect.  If it were, it would be a description.   One more thing before moving on.

How thinly should the linguini be sliced to split it apart into the constituent bases?  There are roughly 6 bases/thickness, and since the thickness is 3/8 of an inch, about 1/16 of an inch.  So relative to driving from NYC to Seattle, just throw a base out the window every 1/16th of an inch, and you’ll be up to 3 billion before you know it.
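
For anyone who wants to redo the scaling, here is a rough Python sketch of the road-trip model. It assumes the 1 meter genome maps onto the 2840 mile drive and uses the 20 Angstrom diameter and 3.4 Angstrom rise per base from the text; the names are mine, not from any library.

```python
FEET_PER_MILE = 5280
INCHES_PER_FOOT = 12

trip_feet = 2840 * FEET_PER_MILE   # NYC to Seattle: 14,995,200 feet
dna_length_m = 1.0                 # stretched-out genome (rounded)
dna_diameter_m = 20e-10            # 20 Angstroms
base_rise_m = 3.4e-10              # 3.4 Angstroms per base
nucleus_diameter_m = 10e-6         # 10 microns

# Scale factor: feet of road per meter of real DNA
scale = trip_feet / dna_length_m

sphere_ft = nucleus_diameter_m * scale                    # the scaled nucleus
thickness_in = dna_diameter_m * scale * INCHES_PER_FOOT   # the linguini
slice_in = base_rise_m * scale * INCHES_PER_FOOT          # one base
bases_per_thickness = dna_diameter_m / base_rise_m

print(f"nucleus: {sphere_ft:.0f} feet across")
print(f"double helix: {thickness_in:.2f} inches thick")
print(f"one base: {slice_in:.3f} inches (1/16 inch is {1/16:.4f})")
print(f"bases per thickness: {bases_per_thickness:.1f}")
```

The unrounded numbers land close to the ones in the text: a sphere of about 150 feet, linguini about 0.36 inches thick, and a slice per base just under 1/16 of an inch.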

You’ve been so good following to this point that you get tickets for 50 yard line seats in the Superdome.  You’re sitting far enough back so that you’re 75 feet above the field, placing you right at the equator of our 150 foot sphere. The north and south poles of the sphere are over the 50 yard line, halfway between the two sides.  You are about to watch the grounds crew pump 15,000,000 feet of linguini into the sphere. Will it burst?  We know it won’t (or we wouldn’t exist).  But how much of the sphere will the linguini take up?

The volume of any sphere is 4/3 * pi * radius^3.  So the volume of our sphere of 10 microns diameter is 4/3 * 3.14 * 5 * 5 * 5 = 523 cubic microns. There are 10^18 cubic microns in a cubic meter.  So our spherical nucleus has a volume of 523 * 10^-18 cubic meters.  What is the volume of the DNA cylinder? Its radius is 10 Angstroms or 1 nanoMeter.  So its volume is 1 meter (length of the stretched out DNA) * pi * 10^-9 * 10^-9 square meters = 3.14 * 10^-18 cubic meters (or 3.14 cubic microns, since a cubic micron is 10^-6 * 10^-6 * 10^-6 = 10^-18 cubic meters).

Even though it’s 15,000,000 feet long, the volume of the linguini is only 3.14/523 of the sphere (about 0.6%).  Plenty of room for the grounds crew who begin reeling it in at 60 miles an hour.  Since they have 2840 miles of the stuff to reel in, we’ll have to come back in a few days to watch the show.  While we’re waiting, we might think of how anything can be accurately located in 2840 miles of linguini in a 150 foot sphere.
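
The volume comparison and the reeling time can be checked the same way. This is a sketch only, treating the nucleus as a perfect sphere and the DNA as a straight cylinder, with names of my own choosing.

```python
import math

nucleus_radius_m = 5e-6    # half of the 10 micron diameter
dna_radius_m = 1e-9        # 10 Angstroms = 1 nanoMeter
dna_length_m = 1.0         # stretched-out genome (rounded)

# Sphere: 4/3 * pi * r^3; cylinder: pi * r^2 * length
nucleus_volume_m3 = 4 / 3 * math.pi * nucleus_radius_m ** 3
dna_volume_m3 = math.pi * dna_radius_m ** 2 * dna_length_m

# Fraction of the nucleus occupied by naked DNA
fraction = dna_volume_m3 / nucleus_volume_m3

# Time for the grounds crew to reel in 2840 miles at 60 miles an hour
reel_hours = 2840 / 60

print(f"nucleus: {nucleus_volume_m3 * 1e18:.0f} cubic microns")
print(f"DNA: {dna_volume_m3 * 1e18:.2f} cubic microns")
print(f"fraction of nucleus filled: {fraction:.1%}")
print(f"reeling time: {reel_hours:.0f} hours, about {reel_hours / 24:.0f} days")
```

Pi cancels in the ratio, so the fraction is exactly 3/500 with these round numbers, and the reeling takes a bit under 48 hours.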

Here’s a link to the next post in the series

https://luysii.wordpress.com/2010/03/23/the-cell-and-its-nucleus-on-a-human-scale-ii/

A possible new player

Drug development is very hard because we don’t know all the players inside the cell. A recent paper describes an entirely new class of player: circular DNA derived from an ancient virus.  The authoress is Laura Manuelidis, who would have been a med school classmate had I chosen to go to Yale med instead of Penn.   She is the last scientist standing who doesn’t believe Prusiner’s prion hypothesis.  Being female, she couldn’t marry the boss’s daughter, so she married the boss instead: Elias Manuelidis, a Yale neuropathologist who would be 99 today had he not passed away at 72 in 1992.

The circular DNAs go by the name of SPHINX, an acronym for Slow Progressive Hidden INfections of X origin.  They have no sequences in common with bacterial or eukaryotic DNA, but there is some homology to a virus infecting Acinetobacter, a wound pathogen common in soil and water.

How did she find them?  By doggedly pursuing the idea that neurodegenerative diseases such as Creutzfeldt-Jakob Disease (CJD) and scrapie were due to an infectious agent triggering aggregation of the prion protein.

As she says:  “The cytoplasm of CJD and scrapie-infected cells, but not control cells, also contains virus-like particle arrays and because we were able to isolate these nuclease-protected particles with quantitative recovery of infectivity, but with little or no detectable PrP (Prion Protein), we began to analyze protected nucleic acids. Using Φ29 rolling circle amplification, several circular DNA sequences of <5 kb (kilobases) with ORFs (Open Reading Frames) were thereby discovered in brain and cultured neuronal cell lines. These circular DNA sequences were named SPHINX elements for their initial association with slow progressive hidden infections of X origin."

SPHINX itself codes for a 324 amino acid protein, which is found in human brain, concentrated in synaptic boutons.  Strangely, even though the DNAs are presumably viral derived, they contain intervening sequences which don't code for protein.

The use of rolling circle amplification is quite clever, as it will copy only circular DNA.

Stanley Prusiner is sure to weigh in.  Remarkably, Prusiner was at Penn Med when I was and was even in my med school fraternity (Nu Sigma Nu), primarily a place to eat lunch and dinner.  I probably ate with him, but have no recollection of him whatsoever.

Circular DNAs outside chromosomes are called plasmids. Bacteria are full of them. The best known eukaryote containing plasmids is yeast. Perhaps we have them as well. Manuelidis may be the first person to look.

Should you take aspirin after you exercise?

I just got back from a beautiful four and a half mile walk around a reservoir behind my house.  I always take 2 adult aspirin after things like this.  A recent paper implies that perhaps I should not [ Proc. Natl. Acad. Sci. vol. 114 pp. 6675 – 6684 ’17 ].  Here’s why.

Muscle has a set of stem cells all its own.  They are called satellite cells.  After injury they proliferate and make new muscle. One of the triggers for this is a prostaglandin known as PGE2 — https://en.wikipedia.org/wiki/Prostaglandin_E2 — clearly a delightful structure for the organic chemist to make.  It binds to a receptor on the satellite cell (called EP4R) following which all sorts of things happen, which will make sense to you if you know some cellular biochemistry.  Activation of EP4R triggers activation of the cyclic AMP (CAMP) phosphoCREB pathway.  This activates Nurr1, a transcription factor which causes cellular proliferation.

Why no aspirin? Because it inhibits cyclo-oxygenase which forms the 5 membered ring of PGE2.

I think you should still take aspirin afterwards, as the injury produced in the paper was pretty severe (muscle toxins, cold injury etc. etc.). Probably the weekend warriors among you don’t damage your muscles that much.

A few further points about aspirin and the NSAIDs

Now aspirin is an NSAID (NonSteroidal AntiInflammatory Drug), along with a zillion others (advil, anaprox, ansaid, clinoril, daypro, dolobid, feldene, indocin, etc. etc., a whole alphabet’s worth). It is rather different in that it has an acetyl group on the benzene ring.  Could it be an acetylating agent for things like histones and transcription factors, producing far more widespread effects than those attributable to cyclo-oxygenase inhibition?   I’ve looked at the structures of a few of them. Some have CH2-COOH moieties in them, which might be metabolized to an acetyl group, but I doubt it.  Naproxen (Anaprox, Naprosyn) does have an acetyl group, but the other 13 structures I looked at do not.

Another possible negative of aspirin after exercise is the fact that inhibition of platelet cyclo-oxygenase makes it harder for them to stick together and form clots (this is why it is used to prevent heart attack and stroke). So aspirin might result in more extensive micro-hemorrhages in muscle after exercise (if such things exist).