Reproducibility Article in “The Conversation”

I was asked to write 200-300 words on my views on whether there is a reproducibility crisis in the sciences for an article appearing in The Conversation. I was so passionate about what I was writing that I ended up writing over 1,200 words. The final article was, of course, edited down by their team to meet the 300-word guide. Below I have posted my full piece.

That there is a reproducibility crisis in psychological science—and arguably across all sciences—is, to me, beyond doubt. Murmurings of low reproducibility began in 2011, the so-called “year of horrors” for psychological science (Wagenmakers, 2012), with the infamous fraud case of Diederik Stapel being its low-light. But those murmurings now have empirical evidence behind them. In 2015, the Open Science Collaboration published the findings of our large-scale effort to closely replicate 100 studies in psychology (Open Science Collaboration, 2015). And the news was not good: only 36% of the replication attempts yielded statistically significant results.

Whilst low reproducibility is not unique to psychological science—indeed, cancer biology is currently reviewing its own reproducibility rate, and things are not looking great (see Baker & Dolgin, 2017)—psychology is leading the way in getting its house in order. Several pioneering initiatives have been introduced which, if embraced by the community, will leave psychological science in a strong position moving forward. Here I focus on three I believe are the most important.

Study Pre-Registration & Registered Reports

In a delightfully concerning study, Simmons, Nelson, and Simonsohn (2011) demonstrated that, in the absence of any true effect, researchers can find statistically significant effects in their studies by engaging in questionable research practices (QRPs), such as selectively reporting outcome measures that produced significant effects and dropping experimental conditions that produced no effect. Other QRPs include analysing your data in a variety of ways (for example, maybe a couple of participants didn’t show the effect you were looking for, so why not remove them from the analysis and see whether that “clears things up”?). What was concerning about this study is that many of these QRPs were not really considered “questionable” at the time. Indeed, many researchers have admitted to engaging in such QRPs (John et al., 2012).
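
To make this concrete, here is a minimal simulation of my own (a sketch in Python; the setup and numbers are illustrative, not code or data from any of the cited papers). Two groups are drawn from identical distributions, so there is no true effect at all, yet simply measuring two correlated outcomes, testing each of them plus their average, and then reporting whichever comparison happens to reach significance pushes the false-positive rate noticeably above the nominal 5%.

```python
# Illustrative sketch: how analytic flexibility inflates false positives
# even when there is no true effect (my own example, not the cited authors' code).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)
n_sims, n_per_group, alpha = 5_000, 20, 0.05
hits = 0

for _ in range(n_sims):
    # Two groups, two correlated outcome measures, and NO real group difference.
    cov = [[1.0, 0.5], [0.5, 1.0]]
    group_a = rng.multivariate_normal([0.0, 0.0], cov, n_per_group)
    group_b = rng.multivariate_normal([0.0, 0.0], cov, n_per_group)

    # QRP: test outcome 1, outcome 2, and their average, then report
    # whichever analysis gives the smallest p-value.
    p_values = [
        stats.ttest_ind(group_a[:, 0], group_b[:, 0]).pvalue,
        stats.ttest_ind(group_a[:, 1], group_b[:, 1]).pvalue,
        stats.ttest_ind(group_a.mean(axis=1), group_b.mean(axis=1)).pvalue,
    ]
    hits += min(p_values) < alpha

print(f"False-positive rate: {hits / n_sims:.3f} (nominal level: {alpha})")
```

Adding further flexible choices (dropping participants, peeking at the data and collecting more, trying extra covariates) inflates the rate further still, which is exactly the point the original demonstration made.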

As such, I do not believe that the presence of QRPs reflects explicit attempts at fraud. Rather, they likely stem from a blurred distinction between exploratory and confirmatory research. In exploratory research, many measures might be taken, many experimental conditions administered, and the data scrutinised using a variety of approaches in search of interesting patterns. Confirmatory research tests explicit hypotheses using pre-planned methods and analytical strategies. Both approaches are valid—exploratory research can generate interesting questions, and confirmatory research can address these questions—but what is not valid is to report an exploratory study as though it were confirmatory (Wagenmakers et al., 2012); that is, to find an effect in exploratory research and to publish the finding together with a narrative that the effect was expected all along.

Many researchers have started to pre-register their studies, detailing their predictions, experimental protocols, and planned analytical strategy before data collection begins. When the study is submitted for publication, researchers can demonstrate that no QRPs have occurred because they can point to a time-stamped document verifying their plans before data collection commenced, which increases confidence in the claims reported. This is confirmatory research at its finest.

Some journals have taken this one stage further by introducing Registered Reports, where papers containing a study’s rationale and detailed proposed methods are reviewed and accepted (or rejected!) for publication before the experiment has been conducted. The neuroscience journal Cortex—with its Registered Reports Editor Professor Chris Chambers of Cardiff University—has led the way with this format, and many other journals have now started to offer it.

This is an important contribution to the academic publishing structure because it incentivises best research practice. Here, research is judged on the soundness of the methods and the importance of the question being addressed, not on the particular results of the study. Current incentive structures in our universities—together with general pressure for increased publications (the so-called “publish or perish” attitude)—lead researchers to prioritise “getting it published” over “getting it right” (Nosek et al., 2012), potentially leading to implicit or explicit use of QRPs to ensure a publishable finding. With the advent of Registered Reports, researchers can finally do both: prioritise “getting it right” by submitting a strong and well-evidenced research proposal, which will then be published regardless of what the data say.

Open Data, Open Materials

Science works by independent verification, not by appeal to authority. As noted by Wicherts and colleagues (2011), independent verification of data analysis is important because “…analyses of research data are quite error prone, accounts of statistical results may be inaccurate, and decisions that researchers make during the analytical phase of a study may lean towards the goal of achieving a preferred (significant) result” (p. 1). Given this importance, most journal policies ask researchers to make their data available. Yet when Wicherts and colleagues (2006) requested data from published authors, 73% failed to provide it. Some researchers have begun to refuse to review journal submissions unless the authors make their data available (or give a convincing reason why this is not possible) as part of the Peer Reviewers’ Openness Initiative (see Morey et al., 2016); after all, if a reviewer cannot access the data a paper is based upon, how can a full review be completed?

Since 2014, the flagship psychology journal Psychological Science has incentivised researchers to share their experimental materials and data by awarding badges to papers that make them openly available. (The journal offers a third badge if the study is pre-registered.) This intervention has been remarkably effective: Kidwell et al. (2016) reported that 23% of studies in Psychological Science provided open data, up from less than 3% before the badges were introduced. As a consequence, more journals are now encouraging authors to make their data open.

Registered Replication Reports

I tell my students all the time that “replication is the most important statistic” (source of quote unknown). To me, an empirical finding in isolation doesn’t mean all that much until it has been replicated. In my own lab, I make an effort to replicate an effect before trying to publish it. As my scientific hero Richard Feynman famously said, “Science is a way of trying not to fool yourself… and you are the easiest person to fool.” As scientists, we have a professional responsibility to ensure the findings we report are robust and reproducible.

But we must not allow others’ findings to fool us, either. That is why replication of other people’s findings should become a core component of any working lab (a course of action we have facilitated by publishing a “Replication Recipe”, a guide to performing convincing replications; Brandt et al., 2014).

You’d be forgiven for thinking that reports of replications must be commonplace in the academic literature. This is not the case. Many journals seek novel theories and/or findings, and view replications as treading over old ground. As such, there is little incentive for career-minded academics to conduct replications. However, if the results of the Open Science Collaboration (2015) tell us nothing else, it is that old ground needs to be re-trodden.

The Registered Replication Report format in the high-impact journal Perspectives on Psychological Science seeks to change this (Simons, Holcombe, & Spellman, 2014). In this format, many teams of researchers each independently perform a close replication of an important finding in the literature, all following an identical, shared protocol. The final report—a single paper with all contributing researchers gaining authorship—collates the findings across all teams in a meta-analysis to firmly establish the size and reproducibility of an effect. Such large-scale replication attempts in a high-profile journal such as Perspectives can only help to encourage psychological scientists to view replication as a valid part of their research programme.
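
For readers curious what “collating the findings in a meta-analysis” amounts to, here is a deliberately simplified sketch (my own illustration in Python, with made-up numbers rather than data from any actual Registered Replication Report). Each lab’s effect size is weighted by its precision and combined into a single pooled estimate with a confidence interval; real reports use more sophisticated (typically random-effects) models, but the core idea is the same.

```python
# Toy fixed-effect meta-analysis: pool effect sizes from several labs
# (hypothetical numbers for illustration only).
import numpy as np

# Per-lab effect sizes (e.g., Cohen's d) and their sampling variances.
effects = np.array([0.12, 0.25, -0.05, 0.18, 0.02, 0.10])
variances = np.array([0.04, 0.05, 0.03, 0.06, 0.04, 0.05])

weights = 1.0 / variances                       # more precise labs count for more
pooled = np.sum(weights * effects) / np.sum(weights)
pooled_se = np.sqrt(1.0 / np.sum(weights))

low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"Pooled effect: {pooled:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```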

Conclusion

2011 was described as a year of horrors for psychological science. Whilst improvements can certainly still be made, our discipline has made impressive strides to improve our science. In just six years, psychological science has moved from a discipline in crisis to a discipline leading the way in how to conduct strong, rigorous, reproducible research.

References

Baker, M., & Dolgin, E. (2017). Cancer reproducibility project releases first results. Nature, 541(7637), 269.

Brandt, M.J., IJzerman, H., Dijksterhuis, A., Farach, F., Geller, J., Giner-Sorolla, R., Grange, J.A., Perugini, M., Spies, J., & van ‘t Veer, A. (2014). The replication recipe: What makes for a convincing replication? Journal of Experimental Social Psychology, 50, 214-224.

John, L.K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth-telling. Psychological Science, 23, 524-532.

Kidwell, M.C., Lazarević, L.B., Baranski, E., Hardwicke, T.E., Piechowski, S., Falkenberg, L-S., et al. (2016). Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency. PLoS Biology, 14(5), e1002456.

Morey, R. D., Chambers, C. D., Etchells, P. J., Harris, C. R., Hoekstra, R., Lakens, D., . . . Zwaan, R. A. (2016). The peer reviewers’ openness initiative: Incentivizing open research practices through peer review. Royal Society Open Science, 3(1), 150547.

Nosek, B.A., Spies, J.R., & Motyl, M. (2012). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7(6), 615-631.

Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.

Simmons, J.P., Nelson, L.D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359-1366.

Simons, D.J., Holcombe, A.O., & Spellman, B.A. (2014). An introduction to Registered Replication Reports at Perspectives on Psychological Science. Perspectives on Psychological Science, 9, 552-555.

Wagenmakers, E.-J. (2012). A year of horrors. De Psychonoom, 27, 12-13.

Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H.L.J., & Kievit, R. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7, 632-638.

Wicherts, J.M., Bakker, M., & Molenaar, D. (2011). Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results. PLoS ONE 6(11), e26828.

Wicherts, J.M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61, 726-728.
