TL;DR: Anne wrote a blog post for the first time in over 2 years. The situation must be quite serious indeed.
Update 28th March: I specified that the Everett et al. quote printed below came from the first version of their preprint and has since been changed. An earlier version of this blog post said that “their analyses didn’t link up very tightly with their preregistration”. To clarify this statement, I replaced it with “their analyses were underspecified by their preregistration”.
It’s a tough time for junior researchers. If you’re one and you ever tried to convince your advisor that this study you’re working on really will take much longer than they’d like, your COVID-19-alerted colleagues are currently putting you out of arguments. In Europe and North America (where most of the English-language literature in psychology is produced), the COVID-19 pandemic started having substantial effects on everyday life, mostly in the form of social distancing, no more than two weeks ago, but psych researchers haven’t been messing about. In this short time, the Psychological Science Accelerator put out a call for “rapid and impactful study proposals on COVID-19”, received 66(!!) proposals in four(!!!) days, sifted through them, decided to run three of them and started preparing the data collection. (Legend: ! = hovering on the brink of significance; !! = decisively significant; !!! = globally significant.) Chris Chambers called on researchers to sign up as reviewers for rapid-review Registered Reports on COVID-19 at Royal Society Open Science, got over 530(!!) responses within 48(!) hours, and moved the first RR to in-principle acceptance in just 6(!!) days, which saw 2(!!!) rounds of review.
Outside of these concerted efforts, individual researchers haven’t been dragging their feet either. If you spend anywhere near as much time as me on Twitter (oh god I really hope you don’t), you’ve probably seen the avalanche of “crisis papers” come rolling in, from how to combat COVID-19-related misinformation to the right type of moralistic finger-wagging to get people to stay the fuck home, to personality traits that predict who’s stealing all your toilet paper. (Hint: it’s people who like stealing things.) Earlier today (26th of March), I counted 20(!) English-language, psychology-related crisis preprints (15 with new data, 4 reviews, and one opinion piece), 16 of which have been published in the last two weeks(!!). (Method: Searched OSF Preprints on the 26th of March 2020 at ~12:30 CET for “COVID OR coronavirus OR corona” and included all English-language papers that were published on PsyArXiv and all non-PsyArXiv papers that were tagged with “Social and Behavioral Sciences” AND either “Psychology” or a psychology subfield.)
I think it’s great and genuinely heartening that so many colleagues want to help and are putting their usual work aside to invest time and money to combat the crisis. As psychologists, we know that we’re not in the front line of the war against the pandemic: research more essential than ours is provided by virologists, immunologists, epidemiologists, and public-health-ologists (inspired by the fabulous Alie Ward). But the battle we’re fighting has a huge behavioural component, and so it’s natural that we feel a call of duty too.
But I’m also worried. If the reform movement of the last years has brought us to any kind of consensus, it’s probably that a mix of problematic pressures and incentives should be handled with safety glasses because it has a tendency to explode in your face. And that’s exactly the situation we’re in right now: First, there’s time pressure like you’ve never seen before — authors who want to produce research with a real impact on the crisis have to publish now. Literally every day counts to flatten the curve or buffer side effects of social distancing policies. Second, the incentives are a gallery of pure glory — including hundreds of preprint downloads within mere hours, thousands of Twitter likes, being called “heroic”, a probability of getting attention from journalists and policy makers so high that you’ll consider changing the title of the “outreach” section on your CV to “impact”, and last but far from least, journals that welcome your paper with open arms and waive your APCs.
So, did we put on our safety goggles and kevlar gloves? It doesn’t feel like it. In fact, it feels as if we’ve let our guard down rather than put it up: We understand and accept that a study put together and written up in a few days can’t be as polished and rigorous as one carried out over weeks and months. We’re willing to put up with less-than-ideal study designs, outcome measures, and analysis methods. It’s an emergency after all, right? We just can’t afford our usual levels of rigour (not that our usual levels of rigour are anything to write home about)! (You should definitely write home instead of delivering the message in person though.) We’ll have to put up with somewhat higher error rates because the cause is just too important. Face masks made of t-shirt fabric are better than nothing, right? I’ve certainly felt this pull. But I think it might be based on a faulty analogy to other essential needs we’re currently facing.
What are those other essentials? We need a vaccine, we currently don’t have one. We need effective anti-viral drugs. We need face masks and protective gear, we currently don’t have enough. We need test kits. We need hospital beds and medical staff. We need respirators. In a situation where the demand for essentials exceeds the supply, it can be wise to lower one’s guard, take a gamble and accept second-best or risky alternatives. But in my opinion, most psychological crisis research is not in the same category of limited-supply essentials that are worth a gamble: We do have ways to communicate with the public and impose rules, we do have some idea of how to combat loneliness and anxiety. Crisis research can help fine-tune more general knowledge to the specific situation. But that means we’re talking about marginal gains, not an emergency supply.
Ok ok, so maybe our work isn’t essential. But what’s the harm? Every little helps, right? In the first version of a preprint published last Friday, Everett et al. wrote:
Perhaps most importantly, our effects are small and in many cases do not pass conventional levels of statistical significance. However, in the context of a public health crisis, even changing the behavior of a small percentage of individuals can save lives, and small changes in framing of messages are a cheaply-implemented tool that could be readily implemented on communications platforms.
I think this quote (which the authors changed in a later version of the preprint) nails a sentiment shared by many: impact may be small and uncertain, but it might save a few lives here and there, so we should try it! The problem with this argument is that any effect that might save a few lives might just as well cost a few lives. And if we operate in emergency mode, lowering our guard and accepting higher error rates for studies that, if anything, have a higher risk of bias and error than non-crisis research, we exacerbate the risk that what we thought were marginal gains will explode in our faces. Here’s a summary of five problems with crisis research that I worry about:
- Sign errors
The true size of that small, marginally significant positive effect you observed, that could save a few lives — it could be negative. It could cost a few lives. Even 2,000 MTurkers can give you really shitty power for tiny effects in the range of d < .1, which makes Type-S errors more likely.
- Moar errors
Not just sampling error, but errors of all shapes and colours are likely to increase — affecting study designs and materials, data processing and analyses, and the reporting of results. Rushing things means authors have less time to double-check, are more stressed and perhaps sleep deprived, and might expect that reviewers and the community at large will be more lenient and forgiving in this emergency situation. And they might be right: the research community probably is more lenient and forgiving, and, like the authors, more stressed than usual, making it less likely that errors get caught.
- Fast measures ≠ valid measures
Measures of behavioural intentions are having a field day. That’s not surprising given that fast research mostly means online research, and measuring actual behaviour is expensive and slow. But shaky proxies don’t magically become more valid because we’re in a pandemic. If you don’t know how your outcome measure relates to the thing you’re actually interested in, then you don’t know what your manipulation will do in the real world.
- A non-expert audience
Crisis papers will be read by many more people outside of the field — that’s kind of the point. But it means that a sizable chunk of the readership (and perhaps the most impactful chunk) is poorly prepared to understand methodological details and limitations. The “risk” of a study with rather humble real-world implications getting picked up and misinterpreted by the media and spread to perhaps millions of people is much greater than usual. Peder Isager also pointed out that authors with no experience of communicating their research to journalists or policymakers might now suddenly have to do just that, a situation that comes with its own risks.
- Opportunity cost
Yet another crisis paper isn’t just another flower in the garden that might be nice to look at but comes at no cost. The cost invested by the authors is the most obvious one and was probably chosen deliberately (MTurk/Prolific studies might be cheap but they’re not free; one week of work for five people isn’t nothing). But there are other costs that I think we’re likely to forget. People who read, think about, and discuss a crisis paper could have read, thought about, and discussed another paper instead. Who knows — if there were no crisis papers appearing at all, maybe we would instead go back to the literature and dig out relevant studies that were produced more slowly and cautiously? Max Maier pointed out that a number of papers written in response to the swine flu may be relevant now but largely seem to be getting ignored. And finally, the arguably most dangerous opportunity cost comes after papers have been read and discussed: Actually implementing one strategy, say framing messages to the public in a certain way, means not implementing another. What if the alternative would be better? What if the conventional way of calling people to reason is more effective or less harmful?
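To put rough numbers on the sign-error worry from the first point above, here is a minimal sketch of how power and the Type-S rate (the share of significant results that point in the wrong direction) can be computed. It rests on assumptions that are mine and not taken from any of the preprints: a simple two-group comparison, 2,000 participants split evenly into 1,000 per group, a two-sided test at alpha = .05, and a normal approximation for the estimated effect. The function name `power_and_type_s` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def power_and_type_s(d, n_per_group, alpha=0.05):
    """Return (power, type_s) for a two-sided, two-group comparison.

    Assumption: the estimated standardized mean difference is approximately
    Normal(d, sqrt(2 / n_per_group)), with a true effect d >= 0.
    """
    se = sqrt(2.0 / n_per_group)                    # approximate standard error of d
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = .05
    shift = d / se                                  # true effect in standard-error units

    # Probability of a significant estimate with the correct (positive) sign
    p_sig_right_sign = 1.0 - STD_NORMAL.cdf(z_crit - shift)
    # Probability of a significant estimate with the wrong (negative) sign
    p_sig_wrong_sign = STD_NORMAL.cdf(-z_crit - shift)

    power = p_sig_right_sign + p_sig_wrong_sign
    type_s = p_sig_wrong_sign / power  # wrong-sign share among significant results
    return power, type_s


if __name__ == "__main__":
    # Hypothetical true effect sizes, chosen only to illustrate the trend
    for d in (0.10, 0.05, 0.02):
        power, type_s = power_and_type_s(d, n_per_group=1000)
        print(f"d = {d:.2f}: power = {power:.2f}, Type-S rate = {type_s:.3f}")
```

Under these assumptions, power at d = .1 comes out at roughly 60%, and both power and the chance of getting the sign right deteriorate quickly as the true effect shrinks towards zero. Splitting the same 2,000 participants across more than two conditions lowers `n_per_group` and makes things worse.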
Some of these concerns already seem to be coming true: Farid Anvari took a closer look at the Everett et al. paper, found that their analyses were underspecified by their preregistration, and argued that many of the reported outcomes should have been corrected for multiple comparisons (and wouldn’t remain significant after doing that). Credit where credit is due: Senior author Molly Crockett immediately made these concerns public on Twitter, the preprint was updated, and to my knowledge the authors are currently preparing a replication of their study as a Registered Report. That’s an exemplary response, but I can’t help worrying about the impact of the >800 downloads their preprint had amassed before it was updated.
I’m certain that all authors of the crisis preprints I found were acting in good faith and with the best intentions. My point is not that all of this research is pointless or harmful; some of it may have a genuine positive impact. But I do feel that our concern about the extremely unusual and serious situation we’re in leads us to overlook the potential costs of conducting and consuming research in emergency mode. Let’s not let our guard down before we’ve considered the consequences. (When Daniël “justify everything” Lakens proclaims peace with the Redefine Significance folks, you better lower your alpha.) In fact, one of the best uses of our time right now might be to check, criticise, and help improve the work of others, with Farid Anvari’s commentary on Everett et al. being a stellar example.
PS: I started a Zotero library with COVID-19-related psychology preprints and a spreadsheet to track these papers’ publication and modification dates, downloads, whether they were preregistered and when, and if they have open data. I intend to keep these resources updated but can’t say in which intervals. It might be an interesting resource for future meta-research — do get in touch if you’re interested in using these data or want to join the tracking effort in some way. Finally, a hat tip goes to Flávio Azevedo and his compilation of COVID-19 social-science research as well as the COVID-19 social science project tracker.
This blog post was written in less than 48 hours and, at the request of the author, received an extra lenient The 100% CI Half-Arsed Rapid-Response Review™ to bask in as much COVID-19 publicity as possible. The author is particularly indebted to Malte Elson, who substantially lowered his standards for an acceptable rate of puns.