Tag Archives: dissociation constant

The uses and abuses of molarity — II

Just as the last post showed why a 1 Molar solution of a protein makes no sense at all, it is reasonable to ask what the highest concentration of a single protein in the cellular environment could be. Strangely, it was very hard for me to find an estimate of the percentage of protein mass inside a eukaryotic cell. There is one for the red blood cell, which is essentially a bag of hemoglobin. The amount is 33 grams/deciliter or 330 grams/liter. Hemoglobin (which is a tetramer) has a molecular mass of 64,000 Daltons. So that’s 330/64000 = 5.2 x 10^-3 Molar, or about 5 milliMolar. So all proteins in our cells have a maximum concentration at most in the milliMolar range.
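The arithmetic is easy to check in a few lines of Python. This is only a sketch: the variable names are mine, and the inputs are the two figures quoted above (330 grams/liter and 64,000 Daltons).

```python
# Hemoglobin concentration inside a red blood cell, from the figures above.
HB_GRAMS_PER_LITER = 330.0   # 33 grams/deciliter
HB_MOLAR_MASS = 64_000.0     # grams/mole for the tetramer

hb_molar = HB_GRAMS_PER_LITER / HB_MOLAR_MASS
print(f"{hb_molar:.2e} Molar = {hb_molar * 1e3:.1f} milliMolar")
```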

Before moving on, how do you think the red blood cell gets its energy?  Amazingly it is by anaerobic glycolysis, not using the oxygen carried by hemoglobin at all.  Why? If it used oxidative phosphorylation which runs on oxygen, it would burn up.  That’s why red cells do not contain mitochondria. 

On to Kd, the dissociation constant. At least 475 FDA approved drugs target G Protein Coupled Receptors (GPCRs), and our genome codes for some 826 of them. Almost 500 of them are smell receptors, and of the 300 or so not involved with smell 1/3 are orphans (as of 2019) with no known ligand. There are GPCRs for all neurotransmitters, which is why neurologists and psychiatrists are very interested in them.

The Kd is defined as [ free ligand ][ free receptor ]/ [ ligand bound to receptor ] where all the [ ]’s are concentrations in Moles/liter (i.e. Molar concentrations).
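For ligand and receptor free in solution, the definition turns into a one-line calculation. The sketch below uses made-up concentrations purely for illustration (they are not measurements), and the function names are mine. The occupancy formula [L]/([L] + Kd) follows from the definition by simple algebra, assuming one-to-one binding at equilibrium.

```python
def dissociation_constant(free_ligand, free_receptor, bound_complex):
    """Kd in Molar, from equilibrium concentrations (all in Molar)."""
    return free_ligand * free_receptor / bound_complex

def fraction_bound(free_ligand, kd):
    """Fraction of receptor occupied, assuming 1:1 binding at equilibrium."""
    return free_ligand / (free_ligand + kd)

# Illustrative numbers only (not measurements)
kd = dissociation_constant(free_ligand=1e-8, free_receptor=1e-9, bound_complex=1e-8)
print(kd)                        # Kd in Molar
print(fraction_bound(1e-8, kd))  # occupancy when free ligand is 10 nanoMolar
```

When the free ligand concentration equals Kd, exactly half the receptor is occupied, which is the usual way of reading a Kd.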

There’s the rub. Kd makes sense when ligand and receptor are swimming around in solution, but GPCRs never do this. The working GPCR is embedded in our cell membranes, which topologists tell us are 2 dimensional manifolds embedded in 3 dimensional space. What does concentration mean in a situation like this? Think of the entropy involved in getting all the GPCRs to lie in a single plane. Obviously not so simple.

People get around this by using radioactive ligands, embedding GPCRs in membranes, and measuring the time for ligands to bind and unbind (i.e. kinetics), but this is miles away from the physiologic situation — for details please see

2019 Apr 5; 485: 9–19.
The same is true for other proteins of interest — ion channels for the neurologist, hormone receptors for the endocrinologist, angiotensin converting enzyme 2 (ACE2) for the pandemic virus.
I think that all Kd’s of membrane embedded receptors do is give you an ordinal ordering (e.g. receptor A binds ligand B tighter than ligand C ) but not a quantitative one.
Next up, how a Nobel prizewinner totally misunderstood the nature and applicability of molarity and studies on a two dimensional gas (complete with Pressure * Area = n * Gas Constant * Temperature).


The uses and abuses of Molarity

Quick what does a one Molar solution of a protein look like?

Answer: It doesn’t. The average protein mass is 100 kiloDaltons — http://book.bionumbers.org/how-big-is-the-average-protein/. That’s 100,000 grams per mole (100 kilograms).

A mole of any chemical is Avogadro’s number of it — or 6.02 x 10^23.  The molar mass counts 1 gram for each hydrogen it contains, 12 for each carbon etc. etc. 

A 1 molar concentration of any chemical is its molecular mass in grams dissolved in 1 liter of water, which is 1,000 cubic centimeters (cc.). The density of water is pretty much the same between 32 and 212 Fahrenheit (or 0 – 100 centigrade).

What is the molar concentration of water, i.e. how many moles of water are in a liter of water? The molecular mass of water is 18, so there are 1000/18 = 55.6 moles of water per liter of water.

Well you can’t get 220 pounds of our 100 kiloDalton protein into 2.2 pounds (1 kiloGram) of water.  You could decorate each of the 6.02 x 10^23 protein molecules with 55 waters. 
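Here is the same bookkeeping as a short Python sketch. The constant names are mine; the inputs are the round numbers used above (1,000 grams of water per liter, molecular mass 18, a 100 kiloDalton protein).

```python
WATER_GRAMS_PER_LITER = 1000.0   # density of water, roughly
WATER_MOLAR_MASS = 18.0          # grams/mole
PROTEIN_MOLAR_MASS = 100_000.0   # grams/mole, the 100 kiloDalton 'average' protein

water_molar = WATER_GRAMS_PER_LITER / WATER_MOLAR_MASS
print(f"water: {water_molar:.1f} moles/liter")

# Grams of protein a 1 Molar solution would need in a single liter
protein_grams = 1.0 * PROTEIN_MOLAR_MASS
print(f"protein needed: {protein_grams / 1000:.0f} kilograms vs 1 kilogram of water")

# Waters available per protein molecule at 1 Molar (Avogadro's number cancels)
waters_per_protein = water_molar / 1.0
print(f"waters per protein molecule: {waters_per_protein:.1f}")
```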

Why belabor the obvious?  Because numbers are infinitely divisible and it is possible to talk about concentrations given in moles which make no chemical sense. Why?  Because matter is not infinitely divisible.  Divisibility for chemists stops at the atom level. 

Now let’s do some biology.  Cell size is measured in microns or 10^-6 meters.   A liter is a cube 10 centimeters on a side, so it is 10^-3 cubic meters.  A cubic micron is 10^-18 cubic meters, so there are 10^15 cubic microns in a liter. 

Now let’s put 1 molecule in our cubic micron and in each and every cubic micron in a liter of water. What is its concentration in moles? Our liter contains 10^15 molecules of our chemical, so its Molar concentration is 10^15/(6.02 x 10^23) or .16 x 10^-8 or 1.6 x 10^-9 or 1.6 nanoMolar. So 1 cubic micron is the volume at which concentrations less than 1.6 nanoMolar make no sense.
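The one-molecule-per-volume calculation generalizes to any volume. A minimal sketch, with a function name of my own choosing and Avogadro's number rounded to 6.02 x 10^23 as in the text:

```python
AVOGADRO = 6.02e23               # molecules per mole, rounded as in the text
CUBIC_MICRONS_PER_LITER = 1e15

def molar_one_molecule(volume_cubic_microns):
    """Molarity of a single molecule confined to the given volume."""
    liters = volume_cubic_microns / CUBIC_MICRONS_PER_LITER
    moles = 1 / AVOGADRO
    return moles / liters

print(f"{molar_one_molecule(1):.2e} Molar")   # one molecule per cubic micron
```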

It should be noted that 1 cubic micron contains plenty of water molecules to dissolve our molecule.  The actual number:

55 x 6.02 x 10^23/10^15 = 331 x 10^8  = 3 x 10^10 of them.
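The water count can be checked the same way. A sketch using the same rounded constants (the names are mine):

```python
AVOGADRO = 6.02e23
WATER_MOLAR = 1000 / 18          # moles of water per liter
CUBIC_MICRONS_PER_LITER = 1e15

waters_per_cubic_micron = WATER_MOLAR * AVOGADRO / CUBIC_MICRONS_PER_LITER
print(f"{waters_per_cubic_micron:.1e} water molecules per cubic micron")
```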

Notice that the mass of the molecule makes no difference.  Molar means moles/liter and liter is just a volume.  The number of molecules is what is crucial. 

As the volume goes up 1 molecule/volume makes sense at lower and lower concentrations. 

At this point the physicist says ‘consider a spherical cow’.  The biologist doesn’t have to.  We have lymphocytes which are nearly spherical with diameters ranging from 6 to 14 microns. 

Call it 10 microns. Then the volume of our lymphocyte is 4/3 * pi * 5^3 = 524 cubic microns (call it 1,000 cubic microns to make things easier). Recall that a liter contains 10^15 cubic microns. So a liter can contain at most 10^12 lymphocytes, or 10^12 of our molecules, so their concentration is 10^12/(6.02 x 10^23) or 1.6 x 10^-12 Molar, or 1.6 picoMolar. Molar concentrations lower than 1.6 picoMolar make no chemical or biological sense in volumes of 1000 cubic microns.
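To see how much the answer depends on cell size, here is a sketch that treats the lymphocyte as a perfect sphere (the spherical cow assumption) and computes the one-molecule concentration across the quoted diameter range. Function names are mine.

```python
import math

AVOGADRO = 6.02e23
CUBIC_MICRONS_PER_LITER = 1e15

def sphere_volume(diameter_microns):
    """Volume of a sphere in cubic microns."""
    radius = diameter_microns / 2
    return 4 / 3 * math.pi * radius ** 3

def one_molecule_molar(volume_cubic_microns):
    """Molarity of a single molecule confined to the given volume."""
    return CUBIC_MICRONS_PER_LITER / (volume_cubic_microns * AVOGADRO)

for diameter in (6, 10, 14):     # the lymphocyte size range quoted above
    volume = sphere_volume(diameter)
    print(f"{diameter} microns: {volume:.0f} cubic microns, "
          f"{one_molecule_molar(volume):.2e} Molar")

# The rounded 1,000 cubic micron cell used in the text
print(f"{one_molecule_molar(1000):.2e} Molar")
```

Whatever diameter you pick in that range, the floor stays in the picoMolar neighborhood, so rounding 524 up to 1,000 doesn't change the argument.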

Are there chemicals in the lymphocyte with concentrations that low?  Sure there are.  Each chromosome is a molecule, so in male lymphocytes there is exactly one X chromosome and one Y. 

Next up.  Is a dissociation constant (Kd) in the femtoMolar (10^-15 Molar) range biologically meaningful?   I’m not sure and am still thinking about it, but the answer has some relevance to Alzheimer’s disease. 

When the dissociation constant doesn’t tell you what you want to know

Drug chemists spend a lot of time getting their drugs to bind tightly to their chosen target. Kd’s (dissociation constants) are measured with care – https://en.wikipedia.org/wiki/Dissociation_constant. But Kd’s are only a marker for the biologic effects that are the real reason for the drug. That’s why it was shocking to find that Kd’s don’t seem to matter in a very important and very well studied system.

It’s not the small molecule ligand protein receptor most drug chemists deal with, it’s the goings on at the immunologic synapse between antigen presenting cell and T lymphocyte (a much larger ligand target interface — 1,000 – 2,000 Angstroms^2 — than the usual site of drug/protein binding).   A peptide fragment lies down in a groove on the Major Histocompatibility Complex (pMHC) where it is presented to the T lymphoCyte Receptor (TCR) — another protein complex.  The hope is that an immune response to the parent protein of the peptide fragment will occur.


However, the Kd’s (affinities) of strong (i.e. producing an immune response) peptide agonist ligands and those producing not much (i.e. weak) are similar and at times overlapping. High affinity yet nonStimulatory interactions occur with high frequency in the human T cell repertoire [ Cell vol. 174 pp. 672 – 687 ’18 ]. The authors determined the structure of both weak and strong ligands bound to the TCR. One particular TCR had virtually the same structure when bound to strong and weak agonist ligands. When studied in two dimensional membranes, the dwell time of ligand with receptor didn’t distinguish strong from weak antigens (surprising).

In general the affinities of pMHC for TCR are quite low (i.e. the Kds are high) — not in the nanoMolar range beloved by drug chemists (and found in antigen/antibody binding), but 1000 times weaker in the microMolar range. So [ Proc. Natl. Acad. Sci. vol. 115 pp. E7369 – E7378 ’18 ] cleverly added an extra few amino acids, which they call molecular velcro, to boost the affinity x 10 (actually this decreases Kd tenfold).

One rationale for the weak binding is that it facilitates scanning by the TCR of  the pMHC  repertoire allowing the TCR to choose the best.  So they added the velcro, expecting the repertoire to be less diverse (since the binding was tighter).  It was just the same. Again the Kd didn’t seem to matter.


Even more interesting, the first paper noted that productive TCR/pMHC interactions had catch bonds — i.e. bonds which get stronger the more you pull on them. The authors were actually able to measure the phenomenon. Catch bonds have been shown to exist in a variety of systems (white cells sticking to blood vessel lining, bacterial adhesion), but their actual mechanism is still under debate. The great thing about this paper (p. 682) is that molecular dynamics simulation showed the conformational changes which occurred during catch bond formation in one case. They even have videos. Impressive.

This sort of thing is totally foreign to all solution chemistry, as there is no way to pull on a bond in solution.  Optical tweezers allow you to pull and stretch molecules (if you can attach them to large styrofoam balls).

Consensus isn’t what it used to be.

Technology marches on. All 4^10 = 2^20 = 1,048,576 variants of the 5 nucleotides on either side of two consensus sequences for transcription factor binding were (1) synthesized and (2) had their dissociation constants (Kd’s) measured. The consensus sequences were for two yeast transcription factors (Pho4 and Cbf1). [ Proc. Natl. Acad. Sci. vol. 115 pp. E3692 – E3702 ’18 ]. The technique is called BET-seq (Binding Energy Topography by sequencing).

What do you think they found?

A ‘large fraction’ of the flanking mutations changed overall binding energies by as much as consensus site mutations. The numbers aren’t huge (only 2.6 kiloCalories/mole). However at 298 Kelvin (25 Centigrade, 77 Fahrenheit), where RT = .6 kiloCalories/mole, every 1.36 kiloCalories/mole is worth a factor of 10 in the equilibrium constant. So binding can vary by nearly 100 fold even in this range.
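The conversion from binding energy to fold change in Kd is the standard relation ratio = exp(delta G / RT). A small sketch, with function and constant names of my own, using R = 1.987 x 10^-3 kiloCalories/(mole Kelvin):

```python
import math

R_KCAL = 1.987e-3                # gas constant, kiloCalories/(mole * Kelvin)
T_KELVIN = 298.0

def fold_change(delta_g_kcal, temperature=T_KELVIN):
    """Factor by which Kd changes for a binding energy change of delta_g_kcal."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature))

rt = R_KCAL * T_KELVIN
print(f"RT = {rt:.2f} kiloCalories/mole")
print(f"a factor of 10 costs {rt * math.log(10):.2f} kiloCalories/mole")
print(f"2.6 kiloCalories/mole is worth a factor of {fold_change(2.6):.0f}")
```

Two full factors of 10 would take 2.72 kiloCalories/mole, so 2.6 lands a little short of 100 fold.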

The work may explain some ChIP data in which some strips of DNA are occupied despite the lack of a consensus site, with other regions containing consensus sites remaining unoccupied.  The authors make the interesting point that submaximal binding sites might be preferred to maximal ones because they’d be easier for the cell to control (notice the anthropomorphism of endowing the cell with consciousness, or natural selection with consciousness).  It is very easy to slide into teleological thinking in these matters.  Whether or not you like it is a matter of philosophical and/or theological taste.

Pity the poor computational chemist, trying to figure out binding energy to such accuracy with huge molecules like transcription factors and long segments of DNA.

It is also interesting to think what “Molar” means with these monsters. How much does a mole of hemoglobin weigh? 64 kiloGrams more or less. It simply can’t be put into 1000 milliliters of water (which weighs 1 kiloGram). A liter of water contains 1000/18 (55.6) moles of water. So solubilizing 1 molecule of hemoglobin would certainly use more than 55 molecules of water. Reality must intrude, but we blithely talk about concentration this way. Does anyone out there know what the maximum achievable concentration of hemoglobin actually is?

Homework assignment and answer

A few days ago I gave the following homework assignment for the ace protein chemist, and promised an answer.

Here’s the assignment

Homework assignment for the protein chemist

As an ace protein chemist you are asked to design two proteins, both intrinsically disordered which form a tight complex with a picoMolar dissociation constant.  To make the problem ‘easier’ there is no need for specific amino acid interactions between the proteins.  To make the problem harder, even in the tight complex formed, the two proteins remain intrinsically disordered.

Hint: ‘nature’, ‘evolution’, ‘God’ — whatever you choose to call it, has solved the problem.

Here’s the answer

The structure of an unstructured protein

Protein structure without structure.  No I haven’t fallen under the spell of a Zen master.  As Bill Clinton would say, it depends on what you mean by structure.

If you mean a segment of the protein chain which doesn’t settle down into one structure, you are talking about intrinsically disordered proteins. It is estimated that 40% of all human proteins contain at least one intrinsically disordered segment of 30 amino acids or more ( Nature vol. 471 pp. 151 – 153 ’11 ).   The same paper ‘estimates’ that 25% of all human proteins are likely to be disordered from beginning to end.

Frankly, I’ve always been amazed that any protein settles down into one shape — for details please see — https://luysii.wordpress.com/2010/08/04/why-should-a-protein-have-just-one-shape-or-any-shape-for-that-matter/. But that’s ‘old news’ as another Clinton would say.

Two fascinating papers in the current Nature (vol. 555 pp. 37 – 38, 61 – 66 ’18 1 March) describe the interaction of two very unstructured proteins.  One is prothymosin-alpha with 111 amino acids and a net negative charge of -44.  The other is Histone H1 with at least 189 amino acids and a net positive charge of + 53.  With such a charge imbalance it’s unlikely that they can coalesce into a compact single form.  So they are both intrinsically disordered proteins.

However the two proteins bind to each other quite tightly (the dissociation constant is in the picoMolar range). Even when they form a complex, a variety of techniques (NMR, single molecule fluorescence techniques, computation) show that neither settles down into a single form and both remain unstructured.

So where’s the structure?  It isn’t in the amino acid sequence.  It isn’t the conformations adopted in space.  The structure is  in the net charge.  Many intrinsically disordered proteins have levels of net charge similar to those of prothymosin alpha and histone H1.  In the human proteome alone, several hundred proteins that are predicted to be intrinsically disordered contain contiguous stretches of at least 50 residues with a fractional net charge similar to that of H1 or proThymosin alpha (Bioinformatics 21, 3433–3434 2005) — hopefully there’s something newer.

The amino-acid sequences of disordered regions in proteins evolve rapidly, yet (Proc. Natl. Acad. Sci. vol. 114 pp. E1450–E1459 2017) showed that the net charge is conserved despite a high degree of sequence diversity. This should be a current enough reference.

Why in the world would the cell have something like this?  Most readers probably know what histones are.  If so, stop and think how the binding of the two proteins could be used by the cell before reading what the authors say about it.

“The interaction mechanism of proThymosin alpha and Histone H1 probably aids their biological function.  proThymosin alpha assists with the assembly and disassembly of chromatin, the material in which DNA is packaged with histone proteins (such as H1) in cells. To perform its function, proThymosin alpha must recognize its histone substrates rapidly and with sufficient affinity to compete with the high affinity of histone–DNA interactions (a similar high positive charge high negative charge interaction). The high binding affinity of Pro-Tα for H1 and the association rate of the two proteins imply that the dissociation of proThymosin alpha–H1 complexes is slow enough to allow functional outcomes, but fast enough not to slow down biological turnover.”

Why don’t they form a coacervate — a bunch of molecules held together by hydrophobic forces? Why don’t they show liquid liquid phase separations? The authors speculate that it might be due to the complementarity of the two proteins in terms of effective length and opposite net charge. Also they don’t have the hydrophobic and aromatic side chains and cation pi interactions which are said to favor phase separation mediated by proteins.

Addendum 20 March’18 — from my comment and a response on Derek’s blog


Another way to look at these very charge imbalanced proteins, is that they are being strongly (and positively) selected for. They are incredibly improbable on a purely statistical basis. Prothymosin alpha has 111 amino acids of which 44 are negatively charged. There are 20 amino acids of which only 2 (glutamic acid and aspartic acid) have negative charges at physiologic pH — cysteine and tyrosine can form anions but under much more basic conditions. So, assuming a random assortment of amino acids, the idea that 10% of the amino acids could fight for space with 90% of the rest and win around 40% of the time in 111 battles is extremely improbable. You’d have to use Stirling’s approximation for factorials to figure out exactly how improbable this is. Any takers?

DCRogers says:
March 20, 2018 at 2:35 pm
CDF(N=111, X=44, p=0.1) = 1.87 * 10^-16
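No Stirling needed these days: Python's exact integer binomial coefficients make the sum easy to do directly. The sketch below assumes the commenter's "CDF" means the upper tail, i.e. the chance of getting 44 or more acidic residues out of 111 when each position is independently acidic with probability 2/20. That reading, and the function name, are my assumptions.

```python
from math import comb

def upper_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 111 residues, at least 44 acidic, each acidic with probability 2/20
print(f"{upper_tail(111, 44, 0.1):.2e}")
```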
