You’ve heard of P-hacking: slicing and dicing your data until you get a statistically significant result. I wrote a post about null-hacking: https://luysii.wordpress.com/2019/12/22/null-hacking-reproducibility-and-its-discontents-take-ii/. Welcome to the world of pipeline hacking. Here is a brief explanation of the highly technical field of functional magnetic resonance imaging (fMRI). Skip to the **** if you know this already.
Chemists use MRI all the time, but they call it Nuclear Magnetic Resonance. Docs and researchers quickly changed the name to MRI because no one would put their head in something with Nuclear in the name.
There are now noninvasive methods to study brain activity in man. The most prominent one is called BOLD (Blood Oxygen Level Dependent), and is based on the fact that blood flow increases way past what is needed with increased brain activity. This was actually noted by Wilder Penfield operating on the brain for epilepsy in the 1930s. When a patient had a seizure on the operating table (they could keep things under control by partially paralyzing the patient with curare) the veins in the area producing the seizure turned red. Recall that oxygenated blood is red while the deoxygenated blood in veins is darker and somewhat blue. This implied that more blood was getting to the convulsing area than it could use.
BOLD depends on slight differences in the way oxygenated hemoglobin and deoxygenated hemoglobin interact with the magnetic field used in magnetic resonance imaging (MRI). The technique has had a rather checkered history, because very small differences must be measured, and there is lots of manipulation of the raw data (never seen in papers) to be done. 10 years ago functional magnetic resonance imaging (fMRI) was called pseudocolor phrenology.
Some sort of task or sensory stimulus is given and the parts of the brain showing increased hemoglobin + oxygen are mapped out. As a neurologist as far back as the 90s, I was naturally interested in this work. Very quickly, I smelled a rat. The authors of all the papers always seemed to confirm their initial hunch about which areas of the brain were involved in whatever they were studying.
****
Well, now we know why. The data produced by an MRI is so extensive and complex that computer programs (called pipelines) must be used to process it and make those pretty pictures you see. The brain has a volume of 1,200 cubic centimeters (or 1,200,000 cubic millimeters). Each voxel of an MRI (like the pixels on your screen) is about 1 cubic millimeter, and basically gives you a number for how much energy is absorbed by the voxel.
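To get a feel for the size of the data, here is the arithmetic above as a small Python sketch. The brain volume and voxel size come from the text; the number of time points per scan is purely an illustrative assumption (it is not from the NARPS paper), included only to show how quickly the numbers grow when each voxel is measured repeatedly.

```python
# Figures taken from the text above.
BRAIN_VOLUME_CM3 = 1200        # approximate brain volume in cubic centimeters
MM3_PER_CM3 = 10 ** 3          # 1 cm = 10 mm, so 1 cm^3 = 1,000 mm^3
VOXEL_VOLUME_MM3 = 1           # about 1 cubic millimeter per voxel

# ASSUMPTION: hypothetical number of brain volumes acquired in one session.
ASSUMED_TIME_POINTS = 300

voxels_per_volume = BRAIN_VOLUME_CM3 * MM3_PER_CM3 // VOXEL_VOLUME_MM3
numbers_per_session = voxels_per_volume * ASSUMED_TIME_POINTS

print(f"{voxels_per_volume:,} voxels per brain volume")
print(f"{numbers_per_session:,} numbers per session (assumed time points)")
```

Even under these rough assumptions, a single subject yields hundreds of millions of numbers, which is why no one looks at the raw data by eye.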
Enter Nature vol. 582 pp. 36 – 37, 84 – 88 ’20 and the Neuroimaging Analysis Replication and Prediction Study (NARPS). 70 different teams were given the raw data from 108 people, each of whom was performing one or the other of two versions of a task thought to study decision making under risk. The groups were asked to analyze the data to test 9 different hypotheses about what part of the brain should light up in relation to specific features of the task.
Now when a doc orders a hemoglobin from the lab, he’s pretty sure that all labs will give the same result, because they determine hemoglobin by the same method. Not so for functional MRI. All 70 teams analyzed the data using different pipelines and workflows.
Was there agreement? 20% of the teams reported a result different from most teams. Random is 50%. Remember they all got the same raw data.
From the News and Views commentary on the paper:
“It is unfortunately common for researchers to explore various pipelines to find the version that yields the ‘best’ results, ultimately reporting only that pipeline and ignoring the others.”
This explains why I smelled a rat 30 years ago. I call this pipeline hacking.
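Why pipeline hacking is so corrosive can be seen in a toy simulation. The Python sketch below is not the NARPS analysis and has nothing to do with real fMRI software; it is a minimal illustration under stated assumptions. Every simulated study has no true effect at all. Each hypothetical "pipeline" is modeled as the same subject data plus its own independent processing noise, followed by a one-sample t-test. The number of pipelines, subjects, and studies are all made-up parameters.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with 29 degrees of freedom.
# It is only valid for the default n_subjects=30 used below.
T_CRIT = 2.045


def is_significant(values):
    """One-sample t-test of 'mean differs from zero' at the 5% level."""
    n = len(values)
    t = statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))
    return abs(t) > T_CRIT


def simulate(n_studies=2000, n_subjects=30, n_pipelines=10, seed=0):
    """Return (honest_rate, hacked_rate) of false positives under the null."""
    rng = random.Random(seed)
    honest_hits = 0
    hacked_hits = 0
    for _ in range(n_studies):
        # Null data: the true effect is exactly zero for every subject.
        data = [rng.gauss(0, 1) for _ in range(n_subjects)]
        results = []
        for _ in range(n_pipelines):
            # ASSUMPTION: a pipeline is modeled as data plus its own noise.
            processed = [x + rng.gauss(0, 1) for x in data]
            results.append(is_significant(processed))
        honest_hits += results[0]    # one pipeline, chosen in advance
        hacked_hits += any(results)  # report whichever pipeline 'worked'
    return honest_hits / n_studies, hacked_hits / n_studies


honest_rate, hacked_rate = simulate()
print(f"one pre-chosen pipeline: {honest_rate:.3f}")
print(f"best of ten pipelines:   {hacked_rate:.3f}")
```

The pre-chosen pipeline should come out near the nominal 5% false positive rate, while picking the "best" of several pipelines pushes the rate well above it, even though there is nothing to find.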
Further infelicities in the field can be found in the following posts:
1. It was shown in 2014 that 70% of people having functional MRIs (fMRIs) were asleep during the test, and that until then fMRI researchers hadn’t checked for it. For details please see
https://luysii.wordpress.com/2014/05/18/how-badly-are-thy-researchers-o-default-mode-network/. You don’t have to go to med school to know that the brain functions quite differently in wake and sleep.
2. A devastating report in [ Proc. Natl. Acad. Sci. vol. 113 pp. 7699 – 7600, 7900 – 7905 ’16 ] showed that certain common settings in 3 software packages (SPM, FSL, AFNI) used to analyze fMRI data gave false positive results ‘up to’ 70% of the time. Some 3,500 of the 40,000 fMRI studies in the literature over the past 20 years used these settings. The paper also notes that a bug (now corrected after being used for 15 years) in one of them also led to false positive results. For details see https://luysii.wordpress.com/2016/07/17/functional-mri-research-is-a-scientific-sewer/
In fairness to the field, the new work and #1 and #2 represent attempts by workers in fMRI to clean it up. They’ve got a lot of work to do.