class: center middle main-title section-title-1 # Linear mediation .class-info[ **Session 12** .light[MATH 80667A: Experimental Design and Statistical Methods <br> HEC Montréal ] ] --- layout: false name: linear-mediation class: center middle section-title section-title-4 # Linear mediation --- layout: true class: title title-4 --- # Reminder: three types of causal associations .pull-left-3[ .box-4.medium.sp-after-half[Confounding] .box-inv-4.small.sp-after-half[Common cause] .box-inv-4.small[Causal forks **X** ← **Z** → **Y**] ] .pull-middle-3[ .box-4.medium.sp-after-half[Causation] .box-inv-4.small.sp-after-half[Mediation] .box-inv-4.small[Causal chain **X** → **Z** → **Y**] ] .pull-right-3[ .box-4.medium.sp-after-half[Collision] .box-inv-4.small.sp-after-half[Selection /<br>endogeneity] .box-inv-4.small[inverted fork **X** → **Z** ← **Y**] ] <img src="img/12/causal_dag_aheiss.jpg" width="70%" style="display: block; margin: auto;" /> --- # Notation Define - treatment of individual `\(i\)` as `\(X_i\)`, typically binary with `\(X_i \in \{0,1\}\)` and - `\(X=0\)` (control), else `\(X=x_0\)` - `\(X=1\)` (treatment) - potential mediation given treatment `\(x\)` as `\(M_i(x)\)` and - potential outcome for treatment `\(x\)` and mediator `\(m\)` as `\(Y_i(x, m)\)`. --- # Sequential ignorability assumption 1. Given pre-treatment covariates `\(\boldsymbol{Z}\)`, potential outcomes for mediation and treatment are conditionally independent of treatment assignment. $$ Y_i(x', m), M_i(x)\ {\perp\mkern-10mu\perp}\ X_i \mid \boldsymbol{Z}_i = \boldsymbol{z}$$ 2. Given pre-treatment covariates `\(\boldsymbol{Z}\)` and observed treatment `\(x\)`, potential outcomes for the response are independent of mediation. $$ Y_i(x', m)\ \perp\mkern-10mu\perp\ M_i(x) \mid X_i =x, \boldsymbol{Z}_i = \boldsymbol{z}$$ - Assumption 1 holds under randomization of treatment. - Assumption 2 implies there is no confounder affecting both `\(Y_i, M_i\)`. --- # Directed acyclic graph <div class="figure" style="text-align: center"> <img src="12-slides_files/figure-html/fig-dag-linearmed-1.png" alt="Directed acyclic graph of the linear mediation model" width="60%" /> <p class="caption">Directed acyclic graph of the linear mediation model</p> </div> .small[ `\(\boldsymbol{Z}_M\)` and `\(\boldsymbol{Z}_Y\)` are controls for confounders, may or not be present in the model. ] --- # Total effect **Total effect**: overall impact of `\(X\)` (both through `\(M\)` and directly) `$$\begin{align*}\mathsf{TE}(x, x_0) = \mathsf{E}[ Y \mid \text{do}(X=x)] - \mathsf{E}[ Y \mid \text{do}(X=x_0)]\end{align*}$$` This can be generalized for continuous `\(X\)` to any pair of values `\((x_1, x_2)\)`. .pull-left[ .box-inv-4[ **X** → **M** → **Y** <br>plus <br>**X** → **Y** ] ] .pull-right[ <img src="12-slides_files/figure-html/moderation-1.png" width="80%" style="display: block; margin: auto;" /> ] --- # Average controlled direct effect .small[ `\begin{align*} \textsf{ACDE}(m, x, x_0) &= \mathsf{E}\{Y_i(x, m) - Y_i(x_0, m)\} \\&= \mathsf{E}\{Y \mid \text{do}(X=x, m=m)\} - \mathsf{E}\{Y \mid \text{do}(X=x_0, m=m)\} \end{align*}` ] The average controlled direct effect (ACDE) is the expected change in response for the population when - the experimental factor changes from `\(x\)` to `\(x_0\)` and - the mediator is set to a fixed value `\(m\)` This typically requires experimental manipulation of both variables. --- # Direct and indirect effects **Natural direct effect**: the expected change in `\(Y\)` under treatment `\(x\)` if `\(M\)` is set to whatever value it would take under control `\(x_0\)` `$$\textsf{NDE}(x, x_0) = \mathsf{E}[Y\{x, M(x_0)\} - Y\{x_0, M({x_0})\}]$$` **Natural indirect effect**: the expected change in `\(Y\)` if we set `\(X\)` to its control value and change the mediator value which it would attain under `\(x\)` `$$\textsf{NIE}(x, x_0) = \mathsf{E}[Y\{x_0, M(x)\} - Y\{x_0, M(x_0)\}]$$` .small[ Counterfactual conditioning reflects a physical intervention (experimentation), not mere conditioning. ] --- # Necessary and sufficiency of mediation From Pearl (2014): > The difference `\(\textsf{TE}-\textsf{NDE}\)` quantifies the extent to which the response of `\(Y\)` is owed to mediation, while `\(\textsf{NIE}\)` quantifies the extent to which it is explained by mediation. These two components of mediation, the necessary and the sufficient, coincide into one in models void of interactions (e.g., linear) but differ substantially under moderation ??? This definition works under temporal reversal and gives the correct answer (the regression-slope approach of the linear structural equation model does not). In linear systems, changing the order of arguments amounts to flipping signs --- # The Baron−Kenny linear mediation model Consider the following two linear regression models with a binary treatment `\(X \in \{0,1\}\)` and `\(M\)` binary or continuous: `$$\begin{align} \underset{\text{mediator}}{M} &= \underset{\text{intercept}}{c_M} + \alpha X + \underset{\text{error term}}{\varepsilon_M}\\ \underset{\text{response}}{Y} &= \underset{\text{intercept}}{c_Y} + \underset{\text{direct effect}}{\beta X} + \gamma M + \underset{\text{error term}}{\varepsilon_Y} \end{align}$$` We assume that zero-mean error terms `\(\varepsilon_M\)` and `\(\varepsilon_Y\)` are **uncorrelated**. - This is tied to the *no confounders* assumption. --- # Total effect decomposition Plugging the first equation in the second, we get the marginal model for `\(Y\)` given treatment `\(X\)` `$$\begin{align} \mathsf{E}(Y \mid X=x) &= \underset{\text{intercept}}{(c_Y + \gamma c_M)} + \underset{\text{total effect}}{(\beta + \alpha\gamma)}\cdot x \end{align}$$` In an experiment, we can obtain the total effect via the ANOVA model, with $$ `\begin{align*} Y &= \underset{\text{average of control}}{\nu} + \underset{\text{total effect}}{\tau X} + \underset{\text{error term}}{\varepsilon_{Y'}} \\ \tau&= \mathrm{E}\{Y \mid \mathrm{do}(X=1)\} - \mathrm{E}\{Y \mid \mathrm{do}(X=0)\} \end{align*}` $$ --- # Example from Preacher and Hayes (2004) .small[ > Suppose an investigator is interested in the effects of a new cognitive therapy on life satisfaction after retirement. >Residents of a retirement home diagnosed as clinically depressed are randomly assigned to receive 10 sessions of a new cognitive therapy `\((X = 1)\)` or 10 sessions of an alternative (standard) therapeutic method `\((X = 0)\)`. > After Session 8, the positivity of the attributions the residents make for a recent failure experience is assessed `\((M)\)`. > Finally, at the end of Session 10, the residents are given a measure of life satisfaction `\((Y)\)`. The question is whether the cognitive therapy’s effect on life satisfaction is mediated by the positivity of their causal attributions of negative experiences. ” ] --- # Old method This approach has been discontinued, but still appears in older papers. Baron and Kenny recommended running three linear regressions and testing 1. whether `\(\mathscr{H}_0: \alpha=0\)` 2. whether `\(\mathscr{H}_0: \tau=0\)` (total effect) 3. whether `\(\mathscr{H}_0: \gamma=0\)` The average conditional mediation effect (ACME) in the linear mediation model is `\(\alpha\gamma\)` and we can check whether it's zero using Sobel's test statistic. --- # Problems with Baron–Kenny approach - We conduct three tests, so this inflates the Type I error. - The total effect can be zero because `\(\alpha\gamma = - \beta\)`, even if there is mediation. - The method has lower power to detect mediation when effect sizes are small. --- # Sobel's test Based on estimators of coefficients `\(\widehat{\alpha}\)` and `\(\widehat{\gamma}\)`, construct a test statistic $$ S = \frac{\widehat{\alpha}\widehat{\gamma}-0}{\mathsf{se}(\widehat{\alpha}\widehat{\gamma})} $$ The coefficient and variance estimates can be extracted from the output of the regression model. In large sample, `\(S \stackrel{\cdot}{\sim}\mathsf{Normal}(0,1)\)`, but this approximation may be poor in small samples. ??? Without interaction/accounting for confounders, `\(\alpha\gamma = \tau - \beta\)` and with OLS we get exactly the same point estimates. The derivation of the variance is then relatively straightforward using the delta method. --- # Other test statistics Sobel's test is not the only test. Alternative statistics are discussed in > MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G., & Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7(1), 83–104. https://doi.org/10.1037/1082-989X.7.1.83 --- # Alternative An alternative to estimate _p_-value and the confidence interval is through the nonparametric **bootstrap** with the percentile method, popularized by Preacher and Hayes (2004) Nonparametric bootstrap: repeat `\(B\)` times, say `\(B=10\ 000\)` 1. sample `\(n\)` (same as original number of observations) tuples `\((Y_i, X_i, M_i)\)` from the database **with replacement** to obtain a new sample. 2. recalculate estimates `\(\widehat{\alpha}^{(b)}\widehat{\gamma}^{(b)}\)` for each bootstrap dataset --- # Bootstrap confidence intervals **Percentile-based method**: for a equitailed `\(1-\alpha\)` interval 1. Run the nonparametric bootstrap and obtain estimates `\(\widehat{\alpha}^{(b)}\)` and `\(\widehat{\gamma}^{(b)}\)` from the `\(b\)`th bootstrap sample. 2. Compute the `\(\alpha/2\)` and `\(1-\alpha/2\)` empirical quantiles of `$$\{\widehat{\alpha}^{(b)}\widehat{\gamma}^{(b)}\}_{b=1}^B.$$` --- # Boostrap two-sided _p_-value Compute the sample proportion of bootstrap statistics that are larger/smaller than zero. 1. Order bootstrap statistics `\(S^{(1)} \leq \cdots \leq S^{(B)}\)` and let `\(S^{(0)} = -\infty\)`, `\(S^{(M+1)} = \infty\)`. 2. Find `\(M\)` ( `\(0 \leq M \leq B\)` ) such that `\(S^{(M)} < 0 \leq S^{(M+1)}\)` (if it exists) 3. The `\(p\)`-value is `$$p = 2\min\{M/B, 1-M/B\}.$$` ??? Note: many bootstraps! parametric, wild, sieve, block, etc. and many methods (basic, studentized, bias corrected and accelerated --- # Model assumptions Same assumptions as analysis of variance and linear models - Linearity of the mean model - residual plots, fitted values `\(\widehat{y}\)` against `\(m\)` and `\(x\)` - Independent/uncorrelated errors - no confounding, lack of serial correlation (e.g., cross-panels) - Equal variance of errors in each model (homoskedasticity) - Large samples --- # Causal assumptions Conclusions about mediation are valid only when causal assumptions hold. Assuming that `\(X\)` is randomized, we need - Lack of interaction between `\(X\)` and `\(M\)` - can be added to model, then use NID definition - Causal direction: `\(M \to Y\)`, so `\(M\)` must be an antecedent cause - `\(M\)` must be measured before `\(Y\)` - Reliability of `\(M\)` (no measurement error) - No confounding between `\(X\)` and `\(M\)` - can be included, but not mediators/colliders + correct form - effect constant over individuals/levels --- # Sensitivity analysis The no-unmeasured confounders assumption should be challenged. One way to assess the robustness of the conclusions to this is to consider correlation between errors, as (e.g., [Bullock, Green and Ha, 2010](https://doi.org/10.1037/a0018933)) `$$\mathsf{E}(\widehat{\gamma})= \gamma + \mathsf{Cov}(\varepsilon_M, \varepsilon_Y)/\mathsf{Va}(\varepsilon_M)$$` - We vary `\(\rho=\mathsf{Cor}(\varepsilon_M, \varepsilon_Y)\)` to assess the sensitivity of our conclusions to confounding. - The `medsens` function in the **R** package `mediation` implements the diagnostic of [Imai, Keele and Yamamoto (2010)](https://doi.org/10.1214/10-STS321) for the linear mediation model. --- # Defaults of linear mediation models .pull-left-wide[ - Definitions contingent on model - (even if causal quantities have a meaning regardless of estimation method) - It is possible to weaken assumptions (at the expense of more complicated models) - Most papers do not consider confounders, or even check for assumptions - Generalizations to interactions, multiple mediators, etc., requires care ] .pull-right-narrow[ ![](img/12/spherical_cow.png) .small[Keenan Crane] ] --- # Key references .small[ - Baron and Kenny (1986), [The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations](https://doi.org/10.1037/0022-4514.51.6.1173), *Journal of Personality and Social Psychology* - Imai, Keele and Tingley (2010), [A General Approach to Causal Mediation Analysis](https://doi.org/10.1037/a0020761), *Psychological Methods*. - Imai, Tingley and Yamamoto (2013), [Experimental designs for identifying causal mechanisms (with Discussion)](https://doi.org/10.1111/j.1467-985X.2012.01032.x), Journal of the Royal Statistical Society: Series A. - Pearl (2014), [Interpretation and Identification of Causal Mediation](http://dx.doi.org/10.1037/a0036434), *Psychological Methods*. - Bullock, Green, and Ha (2010), [Yes, but what’s the mechanism? (don’t expect an easy answer)](https://doi.org/10.1037/a0018933) - Uri Simonsohn (2022) [Mediation Analysis is Counterintuitively Invalid](http://datacolada.org/103) - Preacher, K. J., and Hayes, A. F. (2004). [SPSS and SAS procedures for estimating indirect effects in simple mediation models](https://doi.org/10.3758/BF03206553). Behavior Research Methods, Instruments & Computers. - [David Kenny's website](https://davidakenny.net/cm/mediate.htm) ]