Category Archives: Molecular Biology

Just because a receptor is there doesn’t mean it’s doing anything

Normally when we see a receptor on the surface of a cell (such as the receptor for Vascular Endothelial Growth Factor, VEGF) we assume that when its ligand binds, something happens inside the cell. Not always so, says Cell vol. 159 pp. 473 – 474, 584 – 596 ’14. VEGF is crucial in fetal development (inactivation of just one of the two copies of the gene is lethal for the mouse embryo [ PNAS vol. 95 pp. 14389 - 14394 '98 ]).

One of the problems in diabetic retinopathy is proliferation of retinal blood vessels. For those who don’t already know — light outside the eye has to pass through all 10 cellular layers of the retina before it hits the photoreceptors which can absorb it. So more vessels in the neuronal layers isn’t good.

The Cell paper shows that in the developing retina, VEGF is depleted near neurons, by neuronal engulfment of the VEGF/VEGF receptor complex and degradation. The complex doesn’t do anything metabolically to the neuron. This prevents misdirected angiogenesis into the neuronal plane.

This turns what we’ve always thought about receptors on the cell surface on its head — they must be doing something inside the cell, when a ligand binds to them. Apparently not always.

Ebola — an update (25 Oct ’14)

The experiment of nature referred to in a previous post (https://luysii.wordpress.com/2014/10/16/an-experiment-of-nature/) is almost over. It began when Amber Vinson, a nurse who had helped care for a fatal case of Ebola, took a commercial flight from Cleveland to Dallas the day she became symptomatic with Ebola. She was diagnosed 14 October, the day she took the flight, and so far no one on the flight has become ill (presumably the 100+ or so are under surveillance).

However, another experiment of Nature has just begun. An M. D. who’d been in Africa treating Ebola victims was diagnosed with it on the 23rd. He had returned to NYC from Africa 14 October and had been up and about in the city. According to the Times he began to feel sluggish the evening of the 21st, went all over the city on the 22nd, including a 3 mile jog on the west side, and noted a mild temperature (100.3 not 103 as initially reported) the morning of the 23rd. He reported it immediately and was hospitalized the same day. New York City, chastened by the disastrous response to the first case in Texas, sent 3 guys in Hazmat suits to his apartment to pick the doctor up, according to the NYT of 26 October. Some contacts, such as his fiancee, are easy to trace; the people he rode with on the subway are not.

The incubation period is said to be no more than 21 days, so neither experiment of nature is truly over. From this case we now know the incubation period can be as short as 7 – 9 days.

As noted in the previous post, the genome of Ebola is RNA, which mutates much more rapidly than DNA genomes. RNA viruses do this so quickly that at death from AIDS (caused by HIV, another RNA virus), there are so many viral variants present that the infecting ensemble is called a quasispecies. With a large population infected in Africa there is more Ebola virus extant than at any time in the past.

We have a small handle on just how fast the virus is mutating [ Science vol. 345 pp. 1369 - 1372 '14 (12 Sep '14) ]. This is a report of 98 virus genomes from 78 patients from Sierra Leone (all this year). The Ebola genome contains 18,959 to 18,961 nucleotides and codes for at least 7 proteins. Compared to all previously known Ebola genome sequences, the virus from Sierra Leone contains 341 fixed changes (i.e. the changes were present in every virus they sequenced). The changes were present in all 7 proteins.

It isn’t clear (to me) from reading the paper how much variation in the viral genome there is (1) in a given individual (2) between individuals. Note that all samples were obtained from late May to early June this year, so the work is a good baseline.

Why is this scary? Because, as is typical for a virus with a genome made of RNA, Ebola is mutating rapidly. This means that we can’t be sure that its incubation characteristics, or its ability to spread from human to human will remain constant.

Producing the paper required lots of collaboration between people in the USA and Africa, so there are 58 co-authors. Showing just how bad the disease is, five of the fifty-eight co-authors died of Ebola. R. I. P. Mohamed Fullah, Mbalu Fonnie, Alex Moigboi, Alice Kovoma, S. Humarr Khan.

The incredible information economy of frameshifting

Her fox and dog ate our pet rat

H erf oxa ndd oga teo urp etr at

He rfo xan ddo gat eou rpe tra t

The last two lines make no sense at all, but (neglecting the spaces) they have identical letter sequences.

Here are similar sequences of nucleotides making up the genetic code as transcribed into RNA

ATG CAT TAG CCG TAA GCC GTA GGA

TGC ATT AGC CGT AAG CCG TAG GA.

GCA TTA GCC GTA AGC CGT AGG A..

Again, in our genome there are no spaces between the triplets. But all the triplets you see are meaningful in the sense that they each code for one of the twenty amino acids (except for TAA and TAG, which say stop). ATG codes for methionine (the purists will note that all the T’s should be U). I’m too lazy to look the rest up, but the ribosome doesn’t care, and will happily translate each of the 3 sequences into the sequential amino acids of a protein until it hits a stop.

Both sets of sequences have undergone (reading) frame shifts.
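For anyone who would rather let a machine do the looking up, here is a minimal Python sketch that splits the sentence and the nucleotide sequence into triplets in all three frames and translates the latter. It uses the standard genetic code written with T instead of U; the function names (triplets, translate) are made up for this illustration, and the translation deliberately shows stop codons as '*' rather than halting at them.

```python
# Standard genetic code, written with T instead of U as in the sequences above.
# Codons are enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def triplets(seq, offset):
    """Complete triplets of seq when reading starts at offset 0, 1 or 2."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def translate(seq, offset):
    """One-letter amino acids for each complete codon; stops are shown, not obeyed."""
    return "".join(CODON_TABLE[codon] for codon in triplets(seq, offset))


sentence = "herfoxanddogateourpetrat"
dna = "ATGCATTAGCCGTAAGCCGTAGGA"

for offset in range(3):
    print(offset, " ".join(triplets(sentence, offset)))
    print(offset, " ".join(triplets(dna, offset)), "->", translate(dna, offset))
```

Read this way, the first frame runs into a stop (TAG) at its third codon and the second frame ends on one, while the third frame contains no stop at all. A real ribosome would quit at the first stop it met, which this sketch ignores on purpose.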

A previous post https://luysii.wordpress.com/2014/10/13/the-bach-fugue-of-the-genome/ marveled about how something too small even to be called a virus coded for a protein whose amino acids were read in two different frames.

Frameshifting is used by viruses to get more mileage out of their genomes. Why? There is only so much DNA you can pack into the protein coat (capsids) of a virus.

[ Proc. Natl. Acad. Sci. vol. 111 pp. 14675 - 14680 '14 ] Usually DNA density in cell nuclei or bacteria is 5 – 10% of volume. However, in viral capsids it is 55% of volume. The pressure inside the viral capsid can reach ten atmospheres. Ejection is therefore rapid (60,000 basepairs/second).

The AIDS virus (HIV1) relies on frame shifting of its genome to produce viable virus. The genes for two important proteins (gag and pol) have 240 nucleotides (80 amino acids) in common. Frameshifting occurs to allow the 240 nucleotides to be read by the cell’s ribosomes in two different frames (not at once). Granted that there are 61 three-nucleotide combinations (codons) to code for only 20 amino acids, so some redundancy is built in, but the 80 amino acids coded by the two frames are usually quite different.

That the gag and pol proteins function at all is miraculous.

The phenomenon is turning out to be more widespread. [ Proc. Natl. Acad. Sci. vol. 111 pp. E4342 - E4349 '14 ] KSHV (Kaposi’s Sarcoma HerpesVirus) causes (what else?) Kaposi’s sarcoma, a tumor quite rare until people with AIDS started developing it (due to their lousy immune system being unable to contend with the virus). Open reading frame 73 (ORF73) codes for a major latency associated nuclear antigen 1 (LANA1). It has 3 domains: a basic amino terminal region, an acidic central repeat region (divisible into CR1, CR2 and CR3) and another basic carboxy terminal region. LANA1 is involved in maintaining KSHV episomes, regulation of viral latency, and transcriptional regulation of viral and cellular genes.

LANA1 is made of multiple high and lower molecular weight isoforms, i.e. a LANA ladder band pattern is seen in immunoblotting.

This work shows that LANA1 (and also Epstein Barr Nuclear antigen 1) undergoes highly efficient +1 and -2 programmed frameshifting, to generate previously undescribed alternative reading frame proteins in their repeat regions. Programmed frameshifting to generate multiple proteins from one RNA sequence can increase coding capacity, without increasing the size of the viral capsid.

The presence of similar repeat sequences in human genes (such as huntingtin, the defective gene in Huntington’s chorea) implies that we should look for frame shifting translation in ourselves as well as in viruses. In the case of mutant huntingtin, frame shifting in the abnormally expanded CAG tracts produces proteins containing polyAlanine or polySerineArginine tracts.

Well, G, A, T and C are the 1’s and 0’s of the way genetic information is stored in our genomic computer. It really isn’t surprising that the genome can be read in alternate frames. In the old days, textual information in bytes had parity bits to make sure the 1’s and 0’s were read in the correct frame. There is nothing like that in our genome (except for the 3 stop codons).

What is truly surprising is that reading in alternate frames produces ‘meaningful’ proteins. This gets us into philosophical waters. Clearly

Erf oxa ndd oga teo urp etr at

Rfo xan ddo gat eou rpe tra t

aren’t meaningful to us. Yet gag and pol are quite meaningful (even life and death meaningful) to the AIDS virus. So meaningful in the biologic sense, means able to function in the larger context of the cell. That really is the case for linguistic meaning. You have to know a lot about the world (and speak English) for the word cat to be meaningful to you. So meaning can never be defined by the word itself. Probably the same is true for concepts as well, but I’ll leave that to the philosophers, or any who choose to comment on this.

The Bach Fugue of the Genome

There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy.
– Hamlet (1.5.167-8), Hamlet to Horatio

Just when you thought we’d figured out what genomes could do, the virusoid of rice yellow mottle virus performs a feat of dense coding I’d have thought impossible. The following work requires a fairly sophisticated understanding of molecular biology, for which the articles in “Molecular Biology Survival Guide for Chemists” might provide the background. Give it a shot. This is fascinating stuff. If the following seems incomprehensible, start with https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and then follow the links forward.

Virusoids are single stranded circular RNAs which are dependent on a virus for replication. They are distinct from viroids because viroids need nothing else to replicate. Neither the virusoid nor the viroid was thought to code for protein (until now). They are usually found inside the protein shells of plant viruses.

[ Proc. Natl. Acad. Sci. vol. 111 pp. 14542 - 14547 '14 ] Viroids and virusoids (viroid like satellite RNAs) are small (220 – 450 nucleotide) covalently closed circular RNAs. They are the smallest known replicating circular RNA pathogens. They replicate via a rolling circle mechanism to produce larger concatemers which are then processed into monomeric forms by a self-splicing hammerhead ribozyme, or by cellular enzymes.

The rice yellow mottle virus (RYMV) contains a virusoid which is a covalently closed circular RNA of a mere 220 nucleotides. A 16 kiloDalton basic protein is made from it. How can this be? Figure the average molecular mass of an amino acid at 100 Daltons, and 3 nucleotides per codon (one codon per amino acid). This means that 220 nucleotides can code for 73 amino acids at most (i.e. a 7 – 8 kiloDalton protein).

So far the RYMV virusoid is the only RNA among the viroids and virusoids which actually codes for a protein. The virusoid sequence contains an internal ribosome entry site (IRES) of the following form: UGAUGA. Initiation starts at the AUG, and since 220 isn’t an integral multiple of 3 (the size of amino acid codons), translation continues around the circle in another reading frame until it gets to one of the termination codons (the first or the second UGA of UGAUGA). Termination codons can be ignored (leaky codons) to obtain larger read through proteins. So this virusoid is a circular RNA with no NONcoding sequences which codes for a protein in either 2 or 3 of the 3 possible reading frames. Notice that UGAUGA contains UGA in both of the alternate reading frames! So it is likely that the same nucleotide is being read 2 or 3 ways. Amazing!!!

It isn’t clear what function the virusoid protein performs for the virus when the virus has infected a cell. Perhaps there isn’t one, and the only function of the protein is to help the virusoid continue existence inside the virus.
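The arithmetic of going around a 220 nucleotide circle can be sketched in a few lines of Python. This is only bookkeeping under stated assumptions: the start codon is placed at position 0, every stop codon is ignored (the leaky case), and the 100 Daltons per amino acid is the round figure used above. The function names are invented for the illustration.

```python
from math import gcd

CIRCLE_NT = 220        # length of the virusoid
CODON_NT = 3
AVG_RESIDUE_DA = 100   # round figure for the mass of one amino acid


def single_pass_limit(length_nt=CIRCLE_NT):
    """Amino acids and kiloDaltons obtainable from one trip around the circle."""
    codons = length_nt // CODON_NT
    return codons, codons * AVG_RESIDUE_DA / 1000


def frame_of_codon(k, length_nt=CIRCLE_NT):
    """Frame (0, 1 or 2, relative to the start codon) in which codon number k begins."""
    return (CODON_NT * k) % length_nt % CODON_NT


def codons_until_repeat(length_nt=CIRCLE_NT):
    """Codons read before the ribosome is back at the start in the original frame."""
    return length_nt // gcd(length_nt, CODON_NT)


frames_seen = []
for k in range(codons_until_repeat()):
    frame = frame_of_codon(k)
    if frame not in frames_seen:
        frames_seen.append(frame)

residues_for_16_kda = 16_000 // AVG_RESIDUE_DA
laps_for_16_kda = residues_for_16_kda * CODON_NT / CIRCLE_NT

print(single_pass_limit())
print(frames_seen, codons_until_repeat())
print(residues_for_16_kda, round(laps_for_16_kda, 2))
```

Because 220 leaves a remainder of 1 when divided by 3, each lap starts in a new frame, and only after three laps (220 codons) is the ribosome back where it began. On these round numbers a 16 kiloDalton protein needs about 160 amino acids, which is a bit more than two laps.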

Talk about information density. The RYMV virusoid is the Bach Fugue of the genome. Bach sometimes inverts the fugue theme, and sometimes plays it backwards (a musical palindrome if you will).

It is unfortunate that more people don’t understand the details of molecular biology so they can appreciate mechanisms of this elegance. Whether you think understanding it is an esthetic experience, is up to you. I do. To me, this resembles the esthetic experience that mathematics offers.

A while back I wrote a post, wondering if the USA was acquiring brains from the MidEast upheavals, the way we did from Europe because of WWII. Here’s the link https://luysii.wordpress.com/2014/09/28/maryam-mirzakhani/.

Clearly Canada has done just that. Here are the authors of the PNAS paper above and their affiliations. Way to go Canada !

Mounir Georges AbouHaidar
Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada M5S 3B2
Srividhya Venkataraman
Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada M5S 3B2
Ashkan Golshani
Biology Department, Carleton University, Ottawa, ON, Canada K1S 5B6
Bolin Liu
Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada M5S 3B2
Tauqeer Ahmad
Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada M5S 3B2

The thermodynamic subtlety of cholera

Who knew that the cholera organism passed a thermodynamics course with flying colors? Consider that it has to function at widely different temperatures (37 C when it infects us, and 20 – 30 C when it’s out in the world). When it infects us it needs to make toxins and build a secretion system to export them. This costs a lot of metabolic money (ATP). Clearly there’s no point in doing this at temperatures outside the body and a lot of reasons not to (at least 60, as turning on toxin production and building the secretion system involves synthesizing at least 60 different proteins).

If some of the following terms are unfamiliar have a look at https://luysii.wordpress.com/2010/07/07/molecular-biology-survival-guide-for-chemists-i-dna-and-protein-coding-gene-structure/ and follow the links.

How does thermodynamics help the organism turn on these genes at body temperature (37 C in us)? ToxT is a protein which turns on production of the 60 proteins. The mRNA for ToxT is only translated into protein by the ribosome at 37 C.

[ Proc. Natl. Acad. Sci. vol. 111 pp. 14241 - 14246 '14 ] The mRNA for ToxT has what the authors call an RNA thermometer in its untranslated region. It is just a sequence of nucleotides which binds to the Shine Dalgarno (SD) element (http://en.wikipedia.org/wiki/Shine-Dalgarno_sequence) in the ToxT mRNA, tying it up so the SD element can’t bind to the ribosome, meaning the mRNA for ToxT can’t be translated into protein. Guess what? The thermometer only binds to the SD element at low temperatures; at higher temperatures the binding is unstable, leaving the SD sequence free, turning on synthesis of ToxT which then turns on the 60 proteins involved in toxin production. Clever, no?
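To see how sharp such a switch can be, here is a toy two-state calculation in Python: the thermometer is either paired with the SD element (closed) or not (open), and the open fraction follows from the free energy of opening, ΔG = ΔH(1 - T/Tm). The enthalpy and the midpoint temperature below are hypothetical values picked purely for illustration; they are not taken from the paper, and real RNA folding is rarely this simple.

```python
import math

R = 8.314  # gas constant, J / (mol K)

# HYPOTHETICAL parameters, chosen only for illustration (not from the paper):
DELTA_H = 250_000.0   # enthalpy of opening the hairpin, J/mol
T_MELT_C = 33.0       # temperature at which half the mRNAs have a free SD element


def fraction_open(temp_c, delta_h=DELTA_H, t_melt_c=T_MELT_C):
    """Two-state estimate of the fraction of mRNA with the SD element unpaired."""
    t = temp_c + 273.15
    t_melt = t_melt_c + 273.15
    delta_g_open = delta_h * (1.0 - t / t_melt)   # free energy of opening, J/mol
    return 1.0 / (1.0 + math.exp(delta_g_open / (R * t)))


for temp_c in (20, 25, 30, 37):
    print(f"{temp_c} C: {fraction_open(temp_c):.2f} of ToxT mRNA open")
```

With these made-up numbers nearly all the mRNA is closed at 20 C and most of it is open at 37 C, which is the qualitative behavior described above. A larger enthalpy makes the transition steeper; a different midpoint slides it along the temperature axis.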

Cholera is a terrible disease, afflicting less developed countries causing terrible infant mortality. I can’t resist mentioning a completely avoidable epidemic inflicted in the name of risk reduction years ago.

[ Nature vol. 354 p. 255 '91 ] An amazing article places the blame for the cholera epidemic sweeping South America, starting in Peru, on a misguided application of an Environmental Protection Agency (EPA) study implicating water chlorination as a cause of cancer. During the 80’s Peruvian officials, citing the EPA study, stopped chlorinating many of the wells in Lima. However, others say that the decision might have been based more on economics than on data from the EPA.

It is comforting to know that the 3516 who have died so far have been spared a long bout with cancer.

9 Oct ’14 — Emo wrote the following comment today

Story of Peruvian officials stopping chlorinating water supply based on EPA study was debunked in a study published in Lancet one year after the nature news story: Swerdlow et al. “Waterborne Transmission of Epidemic Cholera in Trujillo, Peru: Lessons for a Continent at Risk,” Lancet Vol. 340 No. 8810 (July 4, 1992), pgs. 28-33. They never chlorinated water in Trujillo, second largest city in the country because they didn’t believe deep well water needed disinfection and cost of chlorinator and chlorine was too much

Thanks Emo

Can losing one gene do all that? Yes it can — there’s still hope

The Cancer Genome Atlas has dashed our hopes of finding ‘the’ cause of cancer. It has sequenced the genomes of a large number of cancers — the following paper looked at 21 tumor types sequencing the protein coding parts (exomes) of 4,742 specimens, along with that of normal tissues [ Nature vol. 505 pp. 495 - 501 '14 ].

The problem is that lots of mutations have been found in every type of cancer studied this way.

The following is typical: 178 cases of lung cancer (squamous cell variety) were studied. Some 360 mutations in exons, 165 genomic rearrangements, and 323 copy number alterations were found, but this doesn’t represent the results for the 178 cases as a whole. This was the average amount of genomic mayhem seen in each individual tumor. How do you find ‘the’ cause of the cancer in this mess? One way might be to find a gene mutated in all 178 cases (i.e. recurrent mutations). This would be the holy grail: the mutation driving cancer formation, the rest being the chaff of the well known genomic instability due to the high mutation rate of cancer cells. They found 11 such genes, but they were far from mutated in all cases. Pretty depressing, isn’t it?

A recent paper [ Proc. Natl. Acad. Sci. vol. 111 pp. 14009 - 14010, E4066 - E4075 '14 ] gave an example of a huge number of changes in the clinical activity of a cancer cell line due to the functional loss of just one gene (called COSMC). Here’s what happened. In a pancreatic cancer cell line, COSMC knockout produced malignant xenografts (i.e. placing the cells in an immunodeficient animal and watching what happens), which could be reversed by reintroduction of COSMC. The changes include (1) increased proliferation, (2) loss of contact inhibition of growth, (3) loss of tissue architecture, (4) less basement membrane adhesion and (5) invasive growth. It is remarkable that knocking out just one gene could do so much. Perhaps not a driver mutation, but certainly a delicious drug target. Before getting too excited, remember that this occurred in a cell line which was cancerous to begin with.

The quick and dirty explanation of what is going on is that COSMC is a protein chaperone for an enzyme adding a sugar to proteins destined either for secretion or for insertion into the cell membrane. Lose COSMC and the whole pattern of sugar attachments to these proteins changes. There are a lot of proteins modified by adding sugars (glycosylated proteins), actually 446 of them, with 1,471 sites for this to happen.

The rest of the post is for the cognoscenti and concerns the gory details.

From the paper itself: “Neoplastic transformation of human cells is virtually always associated with aberrant glycosylation of proteins and lipids.” The most frequently seen glycophenotypes are the Tn and STn carbohydrate epitopes of epithelial cell cancers. They arise when mucin-type O-linked glycans (normally more complex) are truncated so that only a single N-acetylgalactosamine (Tn) or N-acetylgalactosamine modified with sialic acid (STn) remains attached to the protein by a serine or a threonine. There are ‘up to’ 20 GalNAc transferases adding GalNAc to serine or threonine. Overall there are some 200 glycosyltransferases found in the secretory pathway. In most cases the GalNAc is modified with beta 1 –> 3 galactose by a single enzyme (called C1GalT1). This reaction is dependent on COSMC, a protein chaperone.

Although there weren’t mutations in the glycosyltransferases studied in 46 cases of pancreatic cancer, 40% of them showed hypermethylation of COSMC (i.e. methylated cytosines in the promoter region, which shut down transcription of COSMC). This correlated with expression of truncated O-Glycans (i.e. the Tn and STn antigens) and loss of C1GalT expression.

A very UNtheoretical approach to cancer diagnosis

We have tons of different antibodies in our blood. Without even taking mutation into account we have 65 heavy chain genes, 27 diversity segments, and 6 joining regions for them (making 10,530 possibilities). Then there are 40 genes for the kappa light chains and 30 for the lambda light chains, or over 1,200 * 10,530 combinations. That’s without the mutations we know do occur to increase antibody affinity. So the number of antibodies probably ramming around in our blood is over a billion (I doubt that anyone has counted them, just as no one has ever counted the neurons in our brain). Antibodies can bind to anything (sugars, fats), but we think of them as mostly binding to protein fragments.

We also know that cancer is characterized by mutations, particularly in the genes coding for proteins. Many of these mutations have never been seen by the immune system, so they act as neoantigens. So what [ Proc. Natl. Acad. Sci. vol. 111 pp. E3072 - E3080 '14 ] did was make a chip containing 10,000 peptides, and see which of them were bound by antibodies in the blood.
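The multiplication is easy to check. A small Python sketch, using only the counts quoted above; the 1,200 multiplier for the light chains is taken as given rather than derived from the 40 kappa and 30 lambda genes, and the variable names are invented for the illustration.

```python
heavy_v, heavy_d, heavy_j = 65, 27, 6    # heavy chain counts quoted in the post
heavy_combinations = heavy_v * heavy_d * heavy_j

light_factor = 1_200                     # the post's light chain multiplier, taken as given
pairings = heavy_combinations * light_factor

shortfall = 1_000_000_000 / pairings     # how far germline pairing falls short of a billion

print(heavy_combinations)
print(pairings)
print(round(shortfall))
```

On these numbers germline combinations alone give about 12.6 million pairings, so something like an 80-fold factor has to come from the mutations mentioned above to get past a billion.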

The peptides were 20 amino acids long, with 17 randomly chosen amino acids, and a common 3 amino acid linker to the chip. While 10,000 seems like a lot of peptides, it is a tiny fraction (actually about 10^-18) of the 20^17 = 2^17 * 10^17 = 1.3 * 10^22 possible 17 amino acid peptides.
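Python’s exact integers make the size of peptide space easy to verify. A minimal check, assuming all 20 standard amino acids are allowed at each of the 17 random positions (the constant names are made up for this sketch):

```python
ALPHABET = 20           # standard amino acids
RANDOM_POSITIONS = 17   # randomized residues per peptide
CHIP_PEPTIDES = 10_000  # peptides on the chip

possible = ALPHABET ** RANDOM_POSITIONS   # exact integer
fraction = CHIP_PEPTIDES / possible

print(possible == 2**17 * 10**17)
print(f"{possible:.2e} possible peptides")
print(f"{fraction:.1e} of them are on the chip")
```

The fraction comes out a little under 10^-18, so the chip samples less than a billionth of a billionth of the possibilities.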

The blood was first diluted 500x so blood proteins other than antibodies don’t bind significantly to the arrays. The assay is disease agnostic. The pattern of binding of a given person’s blood to the chip is called an immunosignature.

What did they measure? 20 samples from each of five cancer cohorts collected from multiple geographic sites and 20 noncancer samples. A reference immunosignature was generated. Then 120 blinded samples from the same diseases gave 95% classification accuracy. To investigate the breadth of the approach and test sensitivity, the immunosignatures of 75% of over 1,500 historical samples (some over 10 years old) comprising 14 different diseases were used as training, then the other 25% were read blind with an accuracy of over 98%. That’s not too impressive; they need to get another 1,500 samples. Once you’ve trained on 75% of the sample space, you’d pretty much expect the other 25% to look the same.

The immunosignature of a given individual consists of an overlay of the patterns from the binding signals of many of the most prominent circulating antibodies. Some are present in everyone, some are unique.

A 2002 reference (Molecular Biology of the Cell 4th Edition) states that there are 10^9 antibodies circulating in the blood. How can you pick up a signature on 10K peptides from this? Presumably neoAntigens from cancer cells elicit higher affinity antibodies than self-antigens. High affinity monoclonals can be diluted hundreds of times without diminishing the signal.

The next version of the immunosignature peptide microArray under development contains over 300,000 peptides.

The implication is that each cancer and each disease produces either different antigens and or different B cell responses to common antigens.

Since the peptides are random, you can’t align the peptides in the signature to the natural proteomic space to find out what the antibody is reacting to.

It’s a completely atheoretical approach to diagnosis, but intriguing. I’m amazed that such a small sample of protein space can produce a significant binding pattern diagnostic of anything.

It’s worth considering just what a random peptide of 17 amino acids actually is. How would you make one up? Would you choose randomly giving all 20 amino acids equal weight, or would you weight the probability of a choice by the percentage of that amino acid in the proteome of the tissue you are interested in? Do we have such numbers? My guess is that proline, glycine and alanine would be the most common amino acids, since there is so much collagen around, and these 3 make up a high percentage of the amino acids in the various collagens we have (over 15 at least).
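The two ways of making up a random peptide differ by one argument in Python’s standard library. A minimal sketch: random_peptide is a name invented here, and the weights that triple the odds of glycine, proline and alanine are a made-up stand-in for real proteome frequencies, which I don’t have.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # one-letter codes of the 20 standard amino acids


def random_peptide(length=17, weights=None, rng=None):
    """Draw a peptide; weights=None gives every amino acid equal probability."""
    rng = rng or random.Random()
    return "".join(rng.choices(AMINO_ACIDS, weights=weights, k=length))


# HYPOTHETICAL weighting: glycine, proline and alanine three times as likely as the rest.
collagen_like = [3 if aa in "GPA" else 1 for aa in AMINO_ACIDS]

rng = random.Random(2014)              # fixed seed so the draw is repeatable
print(random_peptide(rng=rng))
print(random_peptide(weights=collagen_like, rng=rng))
```

Swapping the made-up weights for measured amino acid frequencies from a tissue of interest would give the second kind of random peptide described above.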

Bad news on the cancer front

[ Nature vol. 512 pp. 143 - 144, 155 - 160 '14 ] Nuc-seq is an innovative sequencing method which achieves almost complete sequencing of whole genomes in single cells. It sequences DNA from cells about to divide (the G2/M stage of the cell cycle which has twice the DNA content of the usual cell). The authors sequenced the genomes of multiple single cells from two types of human breast cancer (estrogen receptor positive and triple negative, the latter much more aggressive) and found that no two genomes of individual tumor cells were identical. Many cells had new mutations unique to them.

This brings into question what we actually mean by a cancer cell clone. They validated some of the single cell mutations by deep sequencing of a single molecule (not really sure what this is).

Large scale structural changes in DNA (amplification and deletion of large blocks of DNA) occurred early in tumor development. They remain stable as clonal expansion of the tumor occurs (i.e. they were found in all the cancer cells whose genome was sequenced). Point mutations accumulated more gradually, generating extensive subclonal diversity. Many of the mutations occur in less than 10% of the tumor mass. Triple negative breast cancers (aggressive) have mutation rates 13 times greater than the slower growing estrogen receptor positive breast cancer cells.

This implies that the mutations are there BEFORE chemotherapy. This has always been a question as most types of chemotherapy attack DNA replication and are inherently mutagenic. It also implies that slamming cancer with chemotherapy early before it has extensively mutated is locking the barn door after the horse has been stolen. It still might help in preventing metastasis, so the approach remains viable.

However nuc-seq may only be useful for cancer cells without aneuploidy http://en.wikipedia.org/wiki/Aneuploidy which is extremely common in cancer cells.

Why is this such bad news? It means that before chemotherapy even starts there is a high degree of genetic diversity present in the tumor cell population. This means that natural selection (in the form of chemotherapy) has a diverse population to work on at the get go, making resistance far more likely to occur.

Had enough? Here’s more. [ Nature vol. 511 pp. 543 - 550 '14 ] A report of 230 resected lung adenocarcinomas using mRNA, microRNA and DNA sequencing found an incredible 8.8 mutations/megaBase, i.e. 3.2 * 8.8 * 1,000 = about 28,000 mutations over a genome of 3,200 megaBases. Aberrations in NF1, MET, ERBB2 and RIT1 occurred in 13% and were enriched in samples otherwise lacking an activated oncogene. Even when not mutated, mRNA splicing was different in tumors. As far as oncogenic pathways, multiple pathways were involved: p53 in 63%, PI3K mTOR in 25%, Receptor Tyrosine Kinase in 76%, cell cycle regulators in 64%.

This is the opposite side of the coin from the first paper, where the genomes of single tumor cells were sequenced. It is doubtful that all cells have the 28,000 mutations, which probably result from each cell having a subset. The first paper didn’t count how many mutations a single cell had (as far as I could see).
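The genome-wide figure is just the reported rate scaled up. A quick Python check, assuming a human genome of roughly 3,200 megaBases and assuming (my extrapolation, not the paper’s) that the rate applies uniformly across it:

```python
MUTATIONS_PER_MEGABASE = 8.8   # rate reported for the lung adenocarcinomas
GENOME_MEGABASES = 3_200       # ASSUMED size of the human genome, in megaBases

genome_wide = MUTATIONS_PER_MEGABASE * GENOME_MEGABASES
print(round(genome_wide))
```

That is where a number in the neighborhood of 28,000 comes from.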

So oncologists are attacking a hydra-headed monster.

Are Van der Waals interactions holding asteroids together?

A recent post of Derek’s concerned the very weak (high kD) but very important interactions of proteins within our cells. http://pipeline.corante.com/archives/2014/08/14/proteins_grazing_against_proteins.php

Most of this interaction is due to Van der Waals forces (http://en.wikipedia.org/wiki/Van_der_Waals_force). Shape complementarity (i.e. steric factors) and dipole dipole interactions are also important.

Although important, Van der Waals interactions have always seemed like a lot of hand waving to me.

Well guess what, they are now hypothesized to be what is holding an asteroid together. Why are people interested in asteroids in the first place? [ Science vol. 338 p. 1521 '12 ] “Asteroids and comets .. reflect the original chemical makeup of the solar system when it formed roughly 4.5 billion years ago.”

[ Nature vol. 512 p. 118 '14 ] The Rosetta spacecraft reached the comet 67P/Churyumov-Gerasimenko after a 10 year journey, becoming the first spacecraft to rendezvous with a comet. It will take a lap around the sun with the comet and will watch as the comet heats up and releases ice in a halo of gas and dust. It is now flying triangles in front of the comet, staying 100 kiloMeters away. In a few weeks it will settle into a 30 kiloMeter orbit around the comet. It will attempt to place a lander (Philae) the size of a washing machine on its surface in November. The comet is 4 kiloMeters long.

[ Nature vol. 512 pp. 139 - 140, 174 - 176 '14 ] A kiloMeter sized near Earth asteroid called (29075) 1950 DA (how did they get this name?) is covered with sandy regolith (the heterogeneous material covering solid rock; on earth it includes dust, soil and broken rock). The asteroid rotates every 2+ hours, and it is so small that gravity alone can’t hold the regolith to its surface. An astronaut could scoop up a sample from its surface, but would have to hold on to the asteroid to avoid being flung off by the rotation. So the asteroid must have some degree of cohesive strength. The strength required is 64 pascals to hold the rubble together, about the pressure that a penny exerts on the palm of your hand. A Pascal is 1/101,325 of atmospheric pressure.
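Why gravity alone isn’t enough can be seen with a back of the envelope comparison in Python: the pull of a uniform sphere at its surface against the outward acceleration at the equator of the spinning body. The radius, rotation period and bulk density below are assumptions chosen for illustration (the text above only says kiloMeter sized and 2+ hours), so treat the result as a plausibility check, not as the paper’s calculation.

```python
import math

G = 6.674e-11              # gravitational constant, m^3 kg^-1 s^-2
PASCALS_PER_ATM = 101_325  # as quoted in the post

# ASSUMED values for illustration only:
RADIUS_M = 650.0           # hypothetical radius of a roughly kiloMeter sized body
PERIOD_S = 2.12 * 3600     # hypothetical rotation period ("2+ hours")
DENSITY = 1_700.0          # hypothetical bulk density, kg/m^3


def surface_gravity(radius_m, density):
    """Gravitational acceleration at the surface of a uniform sphere."""
    return 4.0 / 3.0 * math.pi * G * density * radius_m


def centrifugal_acceleration(radius_m, period_s):
    """Outward acceleration at the equator of a body spinning with this period."""
    omega = 2.0 * math.pi / period_s
    return omega ** 2 * radius_m


gravity = surface_gravity(RADIUS_M, DENSITY)
spin = centrifugal_acceleration(RADIUS_M, PERIOD_S)
print(f"gravity {gravity:.2e} m/s^2, spin {spin:.2e} m/s^2, ratio {spin / gravity:.2f}")
print(f"64 Pa = {64 / PASCALS_PER_ATM:.1e} atm")
```

With these assumed values the outward acceleration at the equator exceeds gravity, so loose material there would drift off unless something else holds it. Both accelerations scale with the radius, so the ratio depends only on the spin period and the density.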

They think the strength comes from van der Waals interactions between small (1 – 10 micron) grains — making it fairy dust. It’s rather unsatisfying as no one has seen these particles.

The ultimate understanding of the large multi-protein and RNA machines (ribosome, spliceosome, RNA polymerase etc. etc. ) without which life would be impossible will involve the very weak interactions which hold them together. Along with permanent dipole dipole interactions, charge interactions and steric complementarity, the van der Waals interaction is high on anyone’s list.

Some include dipole dipole interactions as a type of van der Waals interaction. The really fascinating interaction is the London dispersion force. These are attractions seen between transient induced dipoles formed in the electron clouds surrounding each atomic nucleus.

It’s time to attempt to surmount the schizophrenia which comes from trying to see how quantum mechanics gives rise to the macroscopic interactions between molecules which our minds naturally bring to matters molecular (with a fair degree of success).

Steric interactions come to mind first — it’s clear that an electron cloud surrounding molecule 1 should repel another electron cloud surrounding molecule 2. Shape complementarity should allow two molecules to get closer to each other.

What about the London dispersion forces, which are where most of the van der Waals interaction is thought to be? We all know that quantum mechanical molecular orbitals are static distributions of electron probability. They don’t fluctuate (at least the ones I’ve read about). If something is ‘transiently inducing a dipole’ in a molecule, it must be changing the energy level of a molecule, somehow. All dipoles involve separation of charge, and this always requires energy. Where does it come from? The kinetic energy of the interacting molecules? Macroscopically it’s easy to see how a collision between two molecules could change the vibrational and/or rotational energy levels of a molecule. What does a collision between molecules look like in terms of the wave functions of both? I’ve never seen this. It has to have been worked out for single particle physics in accelerators, but that’s something I’ve never studied.

One molecule inducing a transient dipole in another, which then induces a complementary dipole in the first molecule, seems like a lot of handwaving to me. It also appears to be getting something for nothing, contradicting the first law of thermodynamics (conservation of energy).

Any thoughts from the physics mavens out there?

I sincerely hope it works, but I’m very doubtful

A fascinating series of papers offers hope (in the form of a small molecule) for the truly horrible Werdnig Hoffmann disease, which basically kills infants by destroying neurons in their spinal cord. For why this is especially poignant for me, see the end of the post.

First some background:

Our genes occur in pieces. Dystrophin is the protein mutated in the commonest form of muscular dystrophy. The gene for it is 2,220,233 nucleotides long but dystrophin contains ‘only’ 3685 amino acids, not the 740,000+ amino acids the gene could specify (three nucleotides per amino acid). What happens? The whole gene is transcribed into an RNA of this enormous length, then 78 distinct segments of RNA (called introns) are removed by a gigantic multimegadalton machine called the spliceosome, and the 79 segments actually coding for amino acids (these are the exons) are linked together and the RNA sent on its way.
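The arithmetic here is easy to check. The following is a minimal sketch in Python using only the two figures quoted above; the function names are made up for illustration, and the calculation ignores the stop codon and untranslated regions, so treat the percentage as a rough figure.

```python
# Figures quoted in the post for the dystrophin gene.
GENE_LENGTH_NT = 2_220_233    # nucleotides in the whole gene
PROTEIN_LENGTH_AA = 3_685     # amino acids in the dystrophin protein
NT_PER_CODON = 3              # three nucleotides specify one amino acid


def max_codons(n_nucleotides: int) -> int:
    """Number of whole codons that fit in a stretch of nucleotides."""
    return n_nucleotides // NT_PER_CODON


def coding_fraction(protein_aa: int, gene_nt: int) -> float:
    """Fraction of the gene's nucleotides needed to code for the protein."""
    return protein_aa * NT_PER_CODON / gene_nt


potential_aa = max_codons(GENE_LENGTH_NT)
fraction = coding_fraction(PROTEIN_LENGTH_AA, GENE_LENGTH_NT)

print(f"Codons the gene could hold: {potential_aa:,}")
print(f"Fraction of the gene that codes for protein: {fraction:.2%}")
```

So roughly half a percent of the gene ends up specifying amino acids; the rest is intron.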

All this was unknown in the 70s and early 80s when I was running a muscular dystrophy clinic and taking care of these kids. Looking back, it’s miraculous that more of us don’t have muscular dystrophy; there is so much that can go wrong with a gene this size, let alone transcribing and correctly splicing it to produce a functional protein.

One final complication: alternative splicing. The spliceosome removes introns and splices the exons together. But sometimes exons are skipped, or one of several exons is used at a particular point in a protein. So one gene can make more than one protein. The record holder is something called the Dscam gene in the fruit fly, which can make over 38,000 different proteins by alternative splicing.
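Where does a number like 38,000 come from? A minimal sketch, assuming the cluster sizes usually given for fruit fly Dscam (12, 48, 33 and 2 mutually exclusive alternatives for exons 4, 6, 9 and 17). Those four numbers are not from this post, and the dictionary and function names are hypothetical. If one alternative is chosen independently from each cluster, the counts simply multiply.

```python
from math import prod

# Assumed cluster sizes (not given in the post): the number of mutually
# exclusive alternative exons at each variable position of Dscam.
DSCAM_CLUSTERS = {
    "exon 4": 12,
    "exon 6": 48,
    "exon 9": 33,
    "exon 17": 2,
}


def isoform_count(clusters: dict) -> int:
    """One alternative is picked per cluster, so the choices multiply."""
    return prod(clusters.values())


total = isoform_count(DSCAM_CLUSTERS)
print(f"Possible Dscam isoforms: {total:,}")
```

Four modest choices are enough to get past 38,000 combinations.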

There is nothing worse than watching an infant waste away and die. That’s what Werdnig Hoffmann disease is like, and I saw one or two cases during my years at the clinic. It is also called infantile spinal muscular atrophy. We all have two genes for the same crucial protein (called unimaginatively SMN). Kids who have the disease have mutations in one of the two genes (called SMN1). Why isn’t the other gene protective? It codes for the same sequence of amino acids (but using different synonymous codons). What goes wrong?

[ Proc. Natl. Acad. Sci. vol. 97 pp. 9618 - 9623 '00 ] Why is SMN2 (the centromeric copy, i.e. the copy closest to the middle of the chromosome, which is normal in most patients) not protective? It has a single translationally silent nucleotide difference from SMN1 in exon 7 (i.e. the difference doesn’t change the amino acid coded for). This disrupts an exonic splicing enhancer and causes exon 7 skipping, leading to abundant production of a shorter isoform (SMN2delta7). Thus even though both genes code for the same protein, only SMN1 actually makes the full protein.
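To make ‘translationally silent’ concrete, here is a small sketch in Python. The two DNA sequences are invented for illustration (they are not the real SMN1 or SMN2 exon 7), and the codon table holds only the handful of standard codons the example needs. The two copies differ at one nucleotide, yet translate to the same amino acids, so anything that distinguishes them must act at the level of the RNA, such as a splicing signal.

```python
# A few entries from the standard genetic code (DNA codons, one-letter
# amino acid codes). Only the codons used below are included.
CODON_TABLE = {
    "ATG": "M",
    "TTT": "F", "TTC": "F",   # two synonymous codons for phenylalanine
    "AAA": "K", "AAG": "K",
    "GGT": "G", "GGC": "G",
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring any trailing bases."""
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical sequences: identical except for a C -> T change at index 5.
copy_a = "ATGTTCAAAGGT"
copy_b = "ATGTTTAAAGGT"

differences = [i for i, (a, b) in enumerate(zip(copy_a, copy_b)) if a != b]

print("Nucleotide positions that differ:", differences)
print("Protein from copy A:", translate(copy_a))
print("Protein from copy B:", translate(copy_b))
print("Same protein:", translate(copy_a) == translate(copy_b))
```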

Intellectually fascinating but ghastly to watch.

This brings us to the current papers [ Science vol. 345 pp. 624 - 625, 688 - 693 '14 ].

More background. The molecular machine which removes the introns is called the spliceosome. It’s huge, containing 5 RNAs (called small nuclear RNAs, aka snRNAs), along with 50 or so proteins, with a total molecular mass of around 2,500,000 Daltons (2.5 megadaltons). Think about it, chemists. Design 50 proteins and 5 RNAs with probably 200,000+ atoms so they all come together forming a machine to operate on other monster molecules, such as the mRNA for dystrophin alluded to earlier. Hard for me to believe this arose by chance, but current opinion has it that way.

Splicing out introns is a tricky process which is still being worked on. Mistakes are easy to make, and different tissues will splice the same pre-mRNA in different ways. All this happens in the nucleus before the mRNA is shipped outside where the ribosome can get at it.

The papers describe a small molecule which acts on the spliceosome to increase the inclusion of SMN2 exon 7. It does appear to work in patient cells and mouse models of the disease, even reversing weakness.

Why am I skeptical? Because the mRNA for just about every protein we make is spliced (histones are an exception), and any molecule altering the splicing machinery seems almost certain to produce effects on many genes, not just SMN2. If it really works, these guys should get a Nobel.

Why does the paper grip me so? I watched the beautiful infant daughter of a cop and a nurse die of it 30 – 40 years ago. Even with all the degrees and all the training, I was no better for the baby than my immigrant grandmother dispensing emotional chicken soup from her dry goods store (she only had a 4th grade education). Fortunately, the couple took the 25% risk of another child with Werdnig Hoffmann and produced a healthy infant a few years later.

A second reason: a beautiful baby granddaughter came into our world 24 hours ago.

Poets and religious types may intuit how miraculous our existence is, but the study of molecular biology proves it (to me at least).
