Reproducibility and its discontents

“Since the launch of the registry in 2000, which forced researchers to preregister their methods and outcome measures, the percentage of large heart-disease clinical trials reporting significant positive results plummeted from 57% to a mere 8%.” I leave it to you to speculate why this happened, but my guess is that the data were sliced and diced until something of significance was found. I’d love to know what the comparable figures are for antidepressant trials. The above direct quote is from Proc. Natl. Acad. Sci. vol. 113 pp. 6454–6459 ’16. The same article looked at the 100 papers published in ‘top’ psychology journals about which much has been written — here’s the reference to the actual paper — Open Science Collaboration (2015) Psychology. Estimating the reproducibility of psychological science. Science 349(6251):aac4716.

The sad news is that only 39% of these studies were reproducible. So why beat a dead horse? The authors came up with something quite useful — they looked at how sensitive to context each of the 100 studies actually was. By context they mean the time of the study (e.g., pre- vs. post-Recession), the culture (e.g., individualistic vs. collectivistic), the location (e.g., rural vs. urban setting), or the population (e.g., a racially diverse population vs. a predominantly White or Black or Latino population). Their conclusion was that the contextual sensitivity of the research topic was associated with replication success (i.e., the more context-sensitive the topic, the less likely it was that the study could be reproduced). This held even after statistically adjusting for several methodological characteristics (e.g., statistical power, effect size, etc.). The association between contextual sensitivity and replication success did not differ across psychological subdisciplines.

Addendum 15 June ’16 — Sadly, the best way to say this is — The more likely a study is to be true (replicable), the less likely it is to be generally applicable (i.e., useful).

So this is good. Up to now the results of psychology studies have been reported in the press as being of general applicability (particularly those which reinforce the writer’s preferred narrative). Caveat emptor is two millennia old. Carl Sagan said it best — “Extraordinary claims require extraordinary evidence.”

For an example of data slicing and dicing, please see —



  • Bryan  On June 13, 2016 at 9:32 am

    That 8% success rate looks even worse when you consider that most of the studies are probably using a p < 0.05 cutoff, meaning that we would expect 5% of studies to be false positives. So less than 40% of those positive results are actually real.
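
Bryan’s back-of-the-envelope arithmetic can be made explicit. A minimal sketch, assuming (as he does) that a p < 0.05 cutoff yields roughly a 5% false-positive rate under the null, combined with the 8% positive-result rate quoted above; both figures are illustrative:

```python
# Back-of-the-envelope: of the trials reporting "significant" results,
# how many are likely real rather than false positives?
# Assumes a 5% expected false-positive rate (p < 0.05 cutoff) and the
# 8% positive-result rate quoted from the PNAS article.

total_trials = 100
positive_rate = 0.08         # fraction of trials reporting significant results
false_positive_rate = 0.05   # expected positives by chance alone

positives = total_trials * positive_rate              # 8 trials
expected_false = total_trials * false_positive_rate   # 5 trials
likely_real = positives - expected_false              # 3 trials

fraction_real = likely_real / positives
print(f"{fraction_real:.1%} of positive results are likely real")  # 37.5%
```

This is the source of Bryan’s “less than 40%”: 3 of the 8 positive results survive the expected chance hits, or 37.5%.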
