Mediation analysis has gotten a lot of flak, including classic titles such as “Yes, but what’s the mechanism? (Don’t expect an easy answer)” (Bullock et al., 2010), “What mediation analysis can (not) do” (Fiedler et al., 2011), “Indirect effect ex machina” (The 100% CI, 2019), “In psychology everything mediates everything” (Brown, 2020), “That’s a lot to process! Pitfalls of popular path models” (Rohrer et al., 2022), and “Mediation analysis is counterintuitively invalid” (Datacolada, 2022).[1] Nevertheless, mediation persisted.
The central concern is that claims about mediation are causal claims. We claim that some cause X affects an outcome Y via some mediator M: X → M → Y. Without reference to causality, in purely statistical terms, mediation is indistinguishable from confounding (X ← M → Y; MacKinnon et al., 2010) and simply not substantively meaningful. To be more precise, mediation analysis combines statements about three causal effects: the effect of X on M, the effect of M on Y (together, these two make up the indirect effect), and whatever effect of X on Y remains (the direct effect).
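To make that decomposition concrete, here is a small simulation sketch (Python with NumPy; the coefficients are made up for illustration). In a linear model, the indirect effect is the product of the X → M and M → Y coefficients, and the total effect of X on Y decomposes exactly into direct plus indirect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulate a linear mediation model X -> M -> Y with a direct path X -> Y.
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)             # a = 0.5 (X -> M)
y = 0.4 * m + 0.3 * x + rng.normal(size=n)   # b = 0.4 (M -> Y), direct c' = 0.3

def ols(predictors, outcome):
    """Least-squares coefficients (intercept included, then dropped)."""
    X = np.column_stack([np.ones(len(outcome))] + list(predictors))
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1:]

a = ols([x], m)[0]           # X -> M
b, c_prime = ols([m, x], y)  # M -> Y and X -> Y, each adjusting for the other
c_total = ols([x], y)[0]     # total effect of X on Y

print(a * b)      # indirect effect, approximately 0.5 * 0.4 = 0.2
print(c_prime)    # direct effect, approximately 0.3
print(c_total)    # total effect; equals a * b + c_prime for OLS on the same data
```

The last line reflects an algebraic identity of least squares in the linear case: the total-effect coefficient splits exactly into the product-of-coefficients indirect effect plus the direct effect.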
Cross-sectional observational mediation analysis
Causal inference is a tricky business, and causal inference about three effects is thrice as tricky. For example, if we only have observational cross-sectional data, we need to worry about (1) confounding between the cause of interest and the outcome, which includes (2) confounding between the cause of interest and the mediator,[2] and (3) confounding between the mediator and the outcome. That’s a lot of confounding to worry about. Only if we are willing to assume that we know all relevant confounders between the variables, have measured them, and adjusted for them appropriately could we be confident in the resulting mediation claims.
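A quick simulation illustrates the risk (a hypothetical setup: one unobserved confounder drives X, M, and Y, and there are no causal arrows among the three at all). A naive mediation analysis still produces a clearly nonzero “indirect effect,” which vanishes once the confounder is adjusted for:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# An unobserved confounder C drives X, M, and Y; there are NO causal
# arrows among X, M, and Y themselves.
c = rng.normal(size=n)
x = c + rng.normal(size=n)
m = c + rng.normal(size=n)
y = c + rng.normal(size=n)

def ols(predictors, outcome):
    X = np.column_stack([np.ones(len(outcome))] + list(predictors))
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1:]

# Naive mediation analysis: a nonzero "indirect effect" appears.
a_naive = ols([x], m)[0]
b_naive = ols([m, x], y)[0]
print(a_naive * b_naive)   # clearly positive, though the true value is 0

# Adjusting for the confounder removes the spurious effect.
a_adj = ols([x, c], m)[0]
b_adj = ols([m, x, c], y)[0]
print(a_adj * b_adj)       # approximately 0
```

Of course, in real data the confounder is rarely measured, which is exactly why the assumption of no unmodeled confounding does all the heavy lifting.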
Experimental mediation analysis
The “counterintuitive” thing about mediation analysis is that we still need to worry about confounding in a slightly better scenario in which the cause of interest is a randomized experimental manipulation. In this scenario, nothing can be confounded with the cause of interest, which is great. It means that we can causally identify the total effect of X on Y, as well as the effect of X on M. Unfortunately, we still need to consider the possibility of confounding between the mediator and the outcome.[3] If such confounding exists and is not accounted for (for example, by statistically controlling for the mediator-outcome confounder), then our estimate of M → Y will be biased. This, in turn, means that our estimate of the indirect effect (X → M combined with M → Y) is biased. Somewhat less intuitively, it also means that our estimate of the direct effect will be biased. The reason for this is that to estimate the direct effect, we have to condition on the mediator—but the mediator is a collider, and conditioning on it will introduce collider bias.[4]
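Here is a minimal sketch of that scenario (assumed coefficients: X randomized, an unobserved U confounding M and Y, and no true M → Y effect at all). Both the indirect-effect estimate and the direct-effect estimate come out biased:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# X is randomized, so X -> M and the total effect are identified.
x = rng.integers(0, 2, size=n).astype(float)
u = rng.normal(size=n)                    # unobserved M-Y confounder
m = 0.5 * x + u + rng.normal(size=n)      # true a = 0.5
y = 0.3 * x + u + rng.normal(size=n)      # true direct effect = 0.3; true M -> Y = 0

def ols(predictors, outcome):
    X = np.column_stack([np.ones(len(outcome))] + list(predictors))
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1:]

a_hat = ols([x], m)[0]               # fine: randomization identifies X -> M
b_hat, direct_hat = ols([m, x], y)   # conditions on the mediator, a collider

print(b_hat)          # close to 0.5, although the true M -> Y effect is 0
print(a_hat * b_hat)  # spurious "indirect effect" around 0.25
print(direct_hat)     # around 0.05, biased away from the true 0.3
```

Note that the direct effect is biased even though X itself is perfectly randomized: conditioning on M opens the path X → M ← U → Y.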
Longitudinal mediation analysis
Sometimes people make it sound as if the “cross-sectional” part of “cross-sectional mediation analysis” was the problem. So what changes if longitudinal data are available? This clearly seems like an improvement; at least we get some temporal order in there. Unfortunately, that does not fully fix confounding either.
If analyzed properly, longitudinal data allow us to rule out so-called time-invariant confounding. For example, imagine we collected intensive longitudinal observational data from a single individual, repeatedly measuring X, M, and Y. In our subsequent analyses, X, M and Y cannot be confounded by time-invariant effects of factors such as the individual’s upbringing and childhood experience and their stable dispositions. After all, those factors will be the same at any point in time. But time-varying confounding can still be an issue—other stuff might happen in the individual’s life which in turn affects two or three of the variables we are interested in.
What happens if we additionally randomize X with the help of an experimental manipulation? Think in terms of an intensive within-subject experiment. Now, we don’t need to worry about X → M or X → Y being confounded by anything, so we only need to concern ourselves with mediator-outcome confounding. And, due to the longitudinal nature of our design, we may be able to rule out time-invariant mediator-outcome confounding. This still leaves time-varying mediator-outcome confounding as a source of bias—stuff that happens in people’s lives and affects both the mediator and the outcome. I’d say that this leaves us in a much better spot requiring fewer assumptions than any other scenario. Alas, this is also a highly involved design which I don’t think I have seen implemented properly so far. And of course it won’t be universally applicable.
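As a rough sketch of why the within-person design helps, consider a simulation (a hypothetical setup) in which a stable trait raises both M and Y at every time point, X is randomized within person, and the true M → Y effect is zero. Pooled estimation of M → Y is biased by the trait, while subtracting each person’s own means (a fixed-effects transform) removes the time-invariant confounding:

```python
import numpy as np

rng = np.random.default_rng(3)
n_person, n_obs = 1_000, 50
n = n_person * n_obs

# A stable trait confounds M and Y at every time point;
# X is randomized within person; the true M -> Y effect is 0.
trait = np.repeat(rng.normal(size=n_person), n_obs)
person = np.repeat(np.arange(n_person), n_obs)
x = rng.integers(0, 2, size=n).astype(float)
m = 0.5 * x + trait + rng.normal(size=n)
y = 0.3 * x + trait + rng.normal(size=n)

def ols(predictors, outcome):
    X = np.column_stack([np.ones(len(outcome))] + list(predictors))
    return np.linalg.lstsq(X, outcome, rcond=None)[0][1:]

def demean(v):
    """Subtract each person's own mean (fixed-effects transform)."""
    means = np.bincount(person, weights=v) / n_obs
    return v - means[person]

b_pooled = ols([m, x], y)[0]                          # biased upward by the trait
b_within = ols([demean(m), demean(x)], demean(y))[0]  # approximately 0

print(b_pooled, b_within)
```

The within-person estimate is protected only against the *stable* confounder here; a time-varying confounder of M and Y would bias it just the same.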
So, what should you do with that mediation analysis of yours?
There are different ways to cope with all of this. One way is to outright dismiss mediation analysis. For example, some editors have stated that (cross-sectional, observational) mediation analyses will usually get desk-rejected. This may seem a bit extreme, but then if you think about it—if such an analysis is the centerpiece of an article and the authors do nothing to credibly address confounding, they are trying to sell elaborate causal storytelling based on three correlations in a trench coat.[5] There may be edge cases in which this is justified, but I’d want to see even just one of them before I conclude that cross-sectional observational mediation analysis should be taken seriously by default.[6]

But what if the mediation analysis is not quite that central to your paper, or if there’s something else that you believe renders your mediation claim more plausible (randomization, longitudinal data, a good understanding of the underlying causal net)?
I guess in those cases, the best you can do is actively engage with the possibility of confounding. This involves modeling confounders one way or another, and conducting sensitivity analyses to figure out how bad confounding would need to be to change the conclusions (MacKinnon & Pirlott, 2015, provide a nice overview). Even with those steps, mediation analysis still involves untested assumptions which should be spelled out. The very least I want to see in a paper reporting a mediation claim (no matter how plausible it is) is a section discussing that any interpretation of the mediation results rests on the assumption that there is no (unmodeled) confounding between the variables of interest. And if I end up handling your manuscript, you will definitely get bonus points for actually spelling out some possible confounders that one should worry about (and that future studies should additionally measure).
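To give a flavor of what such a sensitivity analysis can look like, here is a toy version for the linear, standardized case (all numbers are made up; see MacKinnon & Pirlott, 2015, for proper methods). For a grid of hypothetical confounder strengths, we compute what would remain of an observed mediator-outcome correlation after adjusting for the confounder:

```python
import numpy as np

# Observed (standardized) mediator-outcome correlation; hypothetical value.
r_my = 0.20

# Grid of hypothetical confounder strengths:
# correlation of U with M (r_um) and of U with Y (r_uy).
grid = np.round(np.arange(0.1, 0.6, 0.1), 1)

print("r_um  r_uy  adjusted b")
for r_um in grid:
    for r_uy in grid:
        # Partial coefficient of M on Y after adjusting for U
        # (standardized linear case).
        b_adj = (r_my - r_um * r_uy) / (1 - r_um**2)
        flag = "  <- explained away" if b_adj <= 0 else ""
        print(f"{r_um:4.1f} {r_uy:5.1f} {b_adj:10.3f}{flag}")
```

Reading the table tells you which combinations of confounder strengths would suffice to explain away the observed path entirely; whether those combinations are plausible is then a substantive, not a statistical, question.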
I understand that this approach to mediation analysis is not particularly sexy. There’s a nice section in a paper by Pearl and Bareinboim which states that “assumptions are self-destructive in their honesty.” That happens to apply here as well. Then again, I don’t think there’s an easy way out at this point—editors and reviewers are increasingly aware that mediation analysis is just not as simple as people used to think. The golden days of just throwing three variables into the statistical software of your choice to tell an elaborate story are coming to an end. You might as well prepare for it by talking about the underlying causal assumptions more explicitly.

Footnotes
↑1 Does mediation analysis cause catchy titles? Or maybe it’s just that I forgot the many other papers on the topic that had less memorable titles.
↑2 Variables that confound the cause of interest with the mediator, X ← C → M, will automatically confound the cause of interest with the outcome as long as the mediator actually does affect the outcome: X ← C → M → Y. Many thanks to Jeremy Labrecque, who pointed that out.
↑3 If you’re an experimental researcher, your preferred solution may be to simply randomize the hell out of the mediator as well. This leads to fully experimental “solutions to mediation”, although in my mind they are a slightly distinct topic—clever experimental design to narrow down mechanisms. If that works for you, go for it. But the mediators researchers are interested in often are variables that cannot be directly manipulated to begin with (at least not in a very targeted manner, such as psychological variables), so it’s unfortunately not a general solution for the use cases people have in mind.
↑4 We have a section explaining this in “That’s a lot to process!” titled “Identification of the direct effect.”
↑5 To my current knowledge, I stole this excellent phrase from Sanjay Srivastava.
↑6 The edge case that I have in mind is one in which the correlations at hand are strongly predicted by some theory of interest, but are most definitely not predicted by our background knowledge of the world (i.e., they are surprising unless we believe in the theory of interest). In that case, I guess a cross-sectional observational mediation analysis could constitute a severe test of some theory; see also this blog post trying to link the notion of hypothesis testing with causal inference. I just think that this is unlikely to happen in psychology in general, and even less likely for the pairs of variables that researchers usually throw into their mediation models. For example, for most pairs of self-reported psychological variables, only those who live under a rock (or try to publish a paper) would act surprised if they turned out to be mildly correlated.
Shameless plug for my blog post on this: http://steamtraen.blogspot.com/2020/04/in-psychology-everything-mediates.html
Given that you went through the trouble of coming up with a good title, I’ll include this in the opening.
Well, at least the title is impressive. 😂
It’s all that people remember. Source: me.
I am 100% stealing the phrase “three correlations in a trench coat”
By all means, feel free to use it — I think (not entirely sure) I actually stole it from Roger Giner-Sorolla.
I wish I’d said that! But actually it was Sanjay:
https://bsky.app/profile/sanjaysrivastava.com/post/3ldrluwran22t
Preach on! A couple of corollaries:
In the first wave of doubts about mediation there was a tendency to go causal-agnostic, and report mediations with disclaimers that “of course this mediation does not prove causation.” While technically correct, it is category-wrong, kind of like saying “of course this proof, done in plane geometry, does not prove that we’re dealing with objects on a plane.” I’m afraid to look because I can almost guarantee that some of my papers circa 2011 did this!
If you’re a doubter, try explaining to someone why you multiply the a and b path to get the indirect path, without resorting to a causal story. Multiplication is the operation of contingency!
I still think a lot of designs even in experimental psychology have inherent causal plausibility. If you manipulate a factor, get a subjective measure in the middle (e.g. self-report), and an objective measure at the end (e.g., a decision or behaviour), it is hard to argue that the decision could reach back in time and cause the subjective state. With two self-reports it is easier to claim that they both tap the same subjective moment.
Hi Roger,
thanks for chiming in!
The whole “The data are compatible with X, but of course they cannot prove it” thing seemed to have been popular for some time, and at least two people brought it up in response to this blog post. To me it mostly seems like a way to absolve oneself of inferential responsibilities and maintain plausible deniability (“I never claimed this proves anything!” which leads directly to “Science never proves anything; future experimental studies!”). But maybe it was just the obvious coping mechanism back in the day! I’d never judge a social psychologist for what they did in the 2010s 😂
I also can’t disagree with your point regarding the plausibility of mediation in experimental psychology. I guess it would depend on the variables involved — the subjective measure may just be correlated with the actual mediator (or the outcome, for that matter). One thing I have seen is treatment studies in which the outcome is some more “solid” symptom measure, but the mediator is essentially just some way to measure subjective well-being (say, a measure of meaning in life, or satisfaction with social relationships and the like). That is the type of mediation analysis that is guaranteed to result in a spurious indirect effect. But I guess that’s closer to self-report measures than to the type of experimental study you have in mind, where the fit between mediator and outcome is much tighter.
Your conclusion that mediation analysis with a cross-sectional design and observational data does not work is, in my opinion, overshooting and in some ways actually missing the point. If one does not buy into controlling for the set of potential confounders that we need for mediation analysis in a cross-sectional design with observational data, the situation is no better for the total effect, or, to put it more clearly, for any attempt to identify a causal effect with a cross-sectional design and observational data. There are good reasons to be sceptical about what you can do in such a setting. But in general, the problem does not get much worse for mediation analysis relative to identifying a total effect.
There are good reasons why some disciplines have shifted to natural experiments or actual experiments, or why, at the other end, observational disciplines field large surveys that become more and more comprehensive to close the gaps left by missing unobservables.
I agree with your assessment that the identification of indirect effects in cross-sectional observational data is essentially the same issue as the identification of any sort of effect in cross-sectional observational data (just with an additional set of confounders to consider). That I’m coming down more strongly on the side of “don’t do this” has probably a lot to do with practices in the more social corners of psychology.
Specifically, mediation analysis is often endorsed and explicitly taught while at the same time there is a strong consensus that you cannot possibly identify total causal effects in observational data. Which is of course logically inconsistent, but I think it has to do with the fact that people don’t see how “correlation does not imply causation” actually extends to mediation analysis. And then, more often than not, the mediator is some psychological self-report measure, as is the outcome, which opens up a huge space of possible confounders (for the treatment of interest there’s at least some chance that it’s slightly “more exogenous”). Of course, in that scenario, the same arguments would apply to any attempt to identify the total effect of some psychological self-report measure on some other psychological self-report measure. But that’s actually something where a lot of psychologists would immediately say “nah, you cannot do that” (except maybe personality psychologists, which tbh gives me a lot of headaches about my own field. But I digress).
So I absolutely can see how you’d see my framing as overshooting, but for me it’s more pushing back against a certain type of selective understanding in the field. Does that make sense?
“mediation analysis is often endorsed and explicitly taught when at the same time there is a strong consensus that you cannot possibly identify total causal effects in observational data” => Ok, that’s just self-inflicted pain, and I’m beginning to understand now where your take is coming from. But then again, the general condemnation of mediation analysis is too strong in my opinion, whereas you are probably simply arguing for a consistent and thorough perspective on causal inference.
I agree that it’s just self-inflicted pain 😂 I’m not even sure people read my post as a condemnation of mediation analysis; I already got social media comments that imply I’m being too permissive here 🙈 It’s the duality of causal inference
This is prevalent in my area, especially with SEM models and the ubiquitous Process macro. Editors are starting to push back, but cross-sectional observational still gets through and I am guilty as well. I think because we have sophisticated modeling tools and want to use them. Who wants to publish a bunch of regressions anymore? It’s not “sexy” enough /s.
Truly curious what the solution is? Is it back to simple regressions? Is it being more intentional about measuring possible confounders and putting them into model (and then having to increase sample sizes by a lot)?
Thanks for chiming in! I see two ways forward:
(1) As you say, being more intentional about measuring potential confounders, more generally — taking the causal inference task underlying mediation analysis more seriously. I think MacKinnon & Pirlott (2015) is a great source for that direction: https://pubmed.ncbi.nlm.nih.gov/25063043/
(2) Maybe do something else that is more reasonably achievable. We discuss that in the closing paragraphs of our process paper here: https://journals.sagepub.com/doi/10.1177/25152459221095827#sec-4. There seems to be a strong expectation in parts of psychology that one person, or one paper, does it all — present thrilling new data, establish an interesting effect, and then demonstrate the underlying mechanisms and maybe also moderators. And, of course, you don’t just publish one of these papers every five years, but ideally multiple a year. Taking causal inference more seriously may be used as an opening to scale back a bit and take things more slowly. For example, as we suggest in our paper, even just identifying a plausible causal identification strategy for a hard problem may warrant its own publication.
Which is the better way forward probably strongly depends on the variables involved as it determines whether the involved assumptions are strong but potentially defensible vs. just completely out there (as I believe is the case for many scenarios in which people really just look at 3 psychological self-report measures).