Götterdämmerung: The Twilight of the GWAS

Life may be like a well, but cellular biochemistry and gene function are like a mattress. Push on it anywhere and everything changes, because it’s all hooked together. That’s the only conclusion possible if a review of genome-wide association studies (GWAS) is correct [ Cell vol. 169 pp. 1177 – 1186 ’17 ].

It’s been a scandal for years that GWAS, as they grow larger and larger, are still missing large amounts of the heritability of conditions known to be very heritable (e.g. schizophrenia, height). This has been called the dark matter of the genome (i.e. we know it’s there, but we don’t know what it is).

If you’re a little shaky about how GWAS works have a look at — it will come up again later in this post.

We do know that less than 10% of the SNPs found by GWAS lie in protein coding genes — this means either that they are randomly distributed, or that they are in regions controlling gene expression. Arguing for randomness, the review states that the heritability contributed by each chromosome tends to be closely proportional to chromosome length. Schizophrenia is known to be quite heritable, and monozygotic twins have a concordance rate of 40%. Yet an amazing study (which is quoted but which I have not read) estimates that nearly 100% of all 1 megabase windows in the human genome contribute to schizophrenia heritability (Nature Genet. vol. 47 pp. 1385 – 1392 ’15). Given the 3.2 gigabase size of our genome, that’s about 3,200 such windows.
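
The window count is simple arithmetic. Here is a minimal sketch in Python, assuming a 3.2 gigabase genome tiled by non-overlapping 1 megabase windows; the function name `count_windows` is made up for illustration and does not come from the study.

```python
def count_windows(genome_bp: int, window_bp: int) -> int:
    """Number of non-overlapping windows needed to tile the genome.

    Uses ceiling division so a partial window at the end still counts.
    """
    return -(-genome_bp // window_bp)


GENOME_BP = 3_200_000_000  # 3.2 gigabases, the figure used in the post
WINDOW_BP = 1_000_000      # 1 megabase per window

n_windows = count_windows(GENOME_BP, WINDOW_BP)
print(n_windows)  # 3200
```

If nearly every one of those windows carries some signal, there is no short list of 'schizophrenia regions' to be found.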

Another example is the GIANT study of the heritability of height. The study was based on 250,000 people, and some 697 genome-wide significant loci were found. In aggregate they explain a mere SIXTEEN PERCENT of the variance in height.
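
To get a feel for how small each contribution is, one can split the explained share evenly among the loci. This is only a back-of-the-envelope sketch: real effect sizes are not equal, and the even split is an assumption made purely for illustration.

```python
N_LOCI = 697              # significant loci reported for height
EXPLAINED_PERCENT = 16.0  # share explained by those loci in aggregate


def average_share(explained_percent: float, n_loci: int) -> float:
    """Average percentage explained per locus, ASSUMING an even split."""
    return explained_percent / n_loci


avg = average_share(EXPLAINED_PERCENT, N_LOCI)
unexplained = 100.0 - EXPLAINED_PERCENT

print(f"{avg:.3f}% per locus on average")        # 0.023% per locus on average
print(f"{unexplained:.0f}% left unaccounted for")  # 84% left unaccounted for
```

On that crude average each locus accounts for about two hundredths of one percent, which is why adding a few hundred more hits barely moves the total.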

So what is going on?

It gets back to the link posted earlier. Its title, “Tolstoy rides again”, isn’t a joke. It refers to the opening sentence of Anna Karenina: “Happy families are all alike; every unhappy family is unhappy in its own way”. So there are many routes to schizophrenia (and they are spread all over the genome).

The authors of the review think that larger and larger GWAS (some are planned with over a million participants) are not going to help and are probably a waste of money. Whether the review is Götterdämmerung for GWAS isn’t clear, but it is provocative. The review is new, and it will be interesting to see the response by the GWAS people.

So what do they think is going on? That everything in organismal and cellular biochemistry, genetics and physiology is related to everything else. Push on it in one place and, like a box spring mattress, everything changes. The SNPs found outside the DNA coding for proteins are probably changing the control of protein synthesis of all the genes.

The dark matter of the genome is ‘the plan’ which makes the difference between animate and inanimate matter.   For more on this please see —

Fascinating and enjoyable to be alive at such a time in genetics, biochemistry and molecular biology.



  • Bryan  On July 6, 2017 at 11:33 am

    >We do know that less than 10% of the SNPs found by GWAS lie in protein coding genes — this means either that they are randomly distributed, or that they are in regions controlling gene expression.

    Two notes:
    1. Because only 2% of the genome encodes protein, that 10% of SNPs lie in protein-coding sequences suggests the opposite—that GWAS hits are enriched in protein coding sequences (unless more than 10% of the total SNPs analyzed happen to lie in protein-coding sequences).

    2. Because GWAS (genome wide _association_ studies) can only identify SNPs that correlate with genetic disease, SNPs identified by GWAS cannot be assumed to cause the phenotype studied. Very likely these SNPs are linked to a nearby locus (whether that is a coding or non-coding sequence) that causes the effect. In other words, the location of SNPs identified in GWAS tells us nothing about the locations of the underlying sequences that influence the trait studied.

  • luysii  On July 6, 2017 at 11:47 am

    Bryan — agree with both your points. It does seem logical, that control of the placement and amount of the bricks (proteins) would be more complicated and intricate than the number of the different types of bricks (not that protein structure is in any sense simple).

  • GCC  On July 11, 2017 at 11:19 am

    About Bryan’s first comment, I’m not sure that’s the right comparison to make. The post notes that “less than 10% of the SNPs found by GWAS lie in *protein coding genes*”, but then you say that “10% of SNPs lie in *protein-coding sequences*”. Even if we ignore the “less than”, presumably many of the ~10% of SNPs in protein coding genes are in introns, so based on this info alone, I don’t think the argument that GWAS hits are enriched in protein coding sequences makes sense.

    But your second comment is definitely true and important. The SNPs associated by GWAS are almost certainly not causative in most cases, but just nearby enough to be linked to the causative variants. So really, it’s the locations of the causative variants that will be interesting to see as more of them are discovered.

    Anyway, great post and comments. It’ll be interesting to see how this all plays out in the years to come.
