Analysis of covariance and moderation

# Analysis of covariance and moderation

**Session 8**

]

---
layout: false
name: ancova
class: center middle section-title section-title-7

# Analysis of covariance

---

# Covariates

.box-inv-7.large.sp-after-half[Covariate]
Explanatory measured **before** the experiment

Typically, cannot be acted upon.

.box-inv-7.large.sp-after-half[Example]
socioeconomic variables <br> environmental conditions

]

---

# IJLR: It's Just a Linear Regression...

All ANOVA models covered so far are linear regression model.

The latter says that

`$$\underset{\text{average response}\vphantom{l}}{\textsf{E}(Y_{i}\vphantom{\beta_p})}  = \underset{\text{linear (i.e., additive) combination of explanatories}}{\beta_0 + \beta_1 \mathrm{X}_{1i} + \cdots + \beta_p \mathrm{X}_{pi}}$$`

In an ANOVA, the model matrix `$\mathbf{X}$` simply includes columns with `$-1$`, `$0$` and `$1$` for group indicators that enforce sum-to-zero constraints.

---

# What's in a model?

In experimental designs, the explanatories are

- experimental factors (categorical)
- continuous (dose-response)

.box-inv-7.medium[
Random assignment implies<br> no systematic difference between groups.
]

---

# ANCOVA = Analysis of covariance

- Analysis of variance with added continuous covariate(s) to reduce experimental error (similar to blocking).
- These continuous covariates are typically concomitant variables (measured alongside response).
- Including them in the mean response (as slopes) can help reduce the experimental error (residual error).

---

# Control to gain power!

.box-inv-7.sp-after-half[
Identify external sources of variations
]

- enhance balance of design (randomization)
- reduce mean squared error of residuals to increase power

These steps should in principle increase power **if** the variables used as control are correlated with the response.

---

# Example

Abstract of [van Stekelenburg et al. (2021)](https://doi.org/10.1177/09567976211007788)

> In three experiments with more than 1,500 U.S. adults who held false beliefs, participants first learned the value of scientific consensus and how to identify it. Subsequently, they read a news article with information about a scientific consensus opposing their beliefs. We found strong evidence that in the domain of genetically engineered food, this two-step communication strategy was more successful in correcting misperceptions than merely communicating scientific consensus.

???

Aart van Stekelenburg, Gabi Schaap, Harm Veling and Moniek Buijzen (2021), Boosting Understanding and Identification of Scientific Consensus Can Help to Correct False Beliefs, Psychological Science
https://doi.org/10.1177/09567976211007788
---

# Experiment 2: Genetically Engineered Food

We focus on a single experiment; preregistered exclusion criteria led to `$n=442$` total sample size (unbalanced design).

Three experimental conditions:

.float-left.center[.box-7[`Boost`] .box-7[`Boost Plus`] .box-7[Consensus only (`consensus`)] ]

---

# Model formulation

Use `post` as response variable and `prior` beliefs as a control variable in the analysis of covariance.

> their response was measured on a visual analogue scale ranging from –⁠100 (I am 100% certain this is false) to 100 (I am 100% certain this is true) with 0 (I don’t know) in the middle.

---
# Plot of post vs prior response

---

# Model formulation

Average for the `$r$`th replication of the `$i$`th experimental group is
`$$\begin{align*}\mathsf{E}(\texttt{post}_{ir}) &= \mu + \alpha_i\texttt{condition}_i + \beta \texttt{prior}_{ir}.\\
\mathsf{Va}(\texttt{post}_{ir}) &= \sigma^2\end{align*}$$`

We assume that there is no interaction between `condition` and `prior`
- the slopes for `prior` are the same for each `condition` group.
- the effect of prior is linear

---

# Contrasts of interest

1. Difference between average boosts (`Boost` and `BoostPlus`) and control (`consensus`)
2. Difference between `Boost` and `BoostPlus` (pairwise)

Inclusion of the `prior` score leads to increased precision for the mean (reduces variability).

---

# Contrasts with ANCOVA

- The estimated marginal means will be based on a fixed value of the covariate (rather than detrended values)
- In the `emmeans` package, the average of the covariate is used as value.
- the difference between levels of `condition` are the same for any value of `prior` (parallel lines), but the uncertainty changes.

Multiple testing adjustments:

- Methods of Bonferroni (prespecified number of tests) and Scheffé (arbitrary contrasts) still apply
- Can't use Tukey anymore (adjusted means are not independent anymore).

---

# Data analysis - loading data

``` r
library(emmeans)
options(contrasts = c("contr.sum", "contr.poly"))
data(SSVB21_S2, package = "hecedsm")
# Check balance
with(SSVB21_S2, table(condition))
```

```
## condition
##     Boost BoostPlus consensus 
##       149       147       146
```

---

# Data analysis - scatterplot
.pull-left.small[

``` r
library(ggplot2)
ggplot(data = SSVB21_S2,
       aes(x = prior, y = post)) + 
       geom_point() + 
       geom_smooth(method = "lm",
                   se = FALSE)
```
.small[Strong correlation; note responses that achieve max of scale.]
]
.pull-right[
<img src="img/07/Experiment2_scatter.png" width="70%" style="display: block; margin: auto;" />

]

---

# Data analysis - model

``` r
# Check that the data are well randomized
car::Anova(lm(prior ~ condition, data = SSVB21_S2), type = 2)
# Fit linear model with continuous covariate
model1 <- lm(post ~ condition + prior, data = SSVB21_S2)
# Fit model without for comparison
model2 <- lm(post ~ condition, data = SSVB21_S2)
# Global test for differences
car::Anova(model1)
car::Anova(model2)
```

---

# Data analysis - ANOVA table
.pull-left.small[
<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> sum of squares </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p-value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 14107 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 3.0 </td>
   <td style="text-align:right;"> 0.05 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;color: black !important;background-color: yellow !important;"> prior </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;"> 385385 </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;"> 1 </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;"> 166.1 </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;"> 0.00 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 1016461 </td>
   <td style="text-align:right;"> 438 </td>
   <td style="text-align:right;">  </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>
]
.pull-right.small[
<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> sum of squares </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p-value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 11680 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 1.83 </td>
   <td style="text-align:right;"> 0.162 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;color: black !important;background-color: yellow !important;"> Residuals </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;"> 1401846 </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;"> 439 </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;">  </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;">  </td>
  </tr>
</tbody>
</table>
]

---

# Data analysis - contrasts

``` r
emm1 <- emmeans(model1, specs = "condition")
# Note order: Boost, BoostPlus, consensus
emm2 <- emmeans(model2, specs = "condition")
# Not comparable: since one is detrended and the other isn't
contrast_list <- list(
   "boost vs control" = c(0.5,  0.5, -1), 
   #av. boosts vs consensus
   "Boost vs BoostPlus" = c(1, -1,  0))
contrast(emm1, 
         method = contrast_list, 
         p.adjust = "holm")
```

---
# Data analysis - _t_-tests

.pull-left.small[
<table>
 <thead>
  <tr>
   <th style="text-align:left;"> contrast </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;"> se </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> t stat </th>
   <th style="text-align:right;"> p-value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> boost vs control </td>
   <td style="text-align:right;"> -8.37 </td>
   <td style="text-align:right;"> 4.88 </td>
   <td style="text-align:right;"> 438 </td>
   <td style="text-align:right;"> -1.72 </td>
   <td style="text-align:right;"> 0.09 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Boost vs BoostPlus </td>
   <td style="text-align:right;"> 9.95 </td>
   <td style="text-align:right;"> 5.60 </td>
   <td style="text-align:right;"> 438 </td>
   <td style="text-align:right;"> 1.78 </td>
   <td style="text-align:right;"> 0.08 </td>
  </tr>
</tbody>
</table>
Contrasts with ANCOVA with `prior` (Holm-Bonferroni adjustment with `$k=2$` tests)
]
.pull-right.small[
<table>
 <thead>
  <tr>
   <th style="text-align:left;"> contrast </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;"> se </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> t stat </th>
   <th style="text-align:right;"> p-value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> boost vs control </td>
   <td style="text-align:right;"> -5.71 </td>
   <td style="text-align:right;"> 5.71 </td>
   <td style="text-align:right;"> 439 </td>
   <td style="text-align:right;"> -1.00 </td>
   <td style="text-align:right;"> 0.32 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Boost vs BoostPlus </td>
   <td style="text-align:right;"> 10.74 </td>
   <td style="text-align:right;"> 6.57 </td>
   <td style="text-align:right;"> 439 </td>
   <td style="text-align:right;"> 1.63 </td>
   <td style="text-align:right;"> 0.10 </td>
  </tr>
</tbody>
</table>
Contrasts for ANOVA (Holm-Bonferroni adjustment with `$k=2$` tests)
]

---

# Data analysis - assumption checks

.pull-left.small[

``` r
# Test equality of variance
levene <- car::leveneTest(
   resid(model1) ~ condition, 
   data = SSVB21_S2,
   center = 'mean')
# Equality of slopes (interaction)
car::Anova(lm(post ~ condition * prior, 
           data = SSVB21_S2),
           model1, type = 2)
```

Levene's test of equality of variance: _F_ (2, 439) = 2.04 with a `$p$`-value of 0.131.
]
.pull-right.small[

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> sum of squares </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p-value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 14107 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 3.0 </td>
   <td style="text-align:right;"> 0.05 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> prior </td>
   <td style="text-align:right;"> 385385 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 166.1 </td>
   <td style="text-align:right;"> 0.00 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> condition:prior </td>
   <td style="text-align:right;"> 3257 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 0.7 </td>
   <td style="text-align:right;"> 0.50 </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;color: black !important;background-color: yellow !important;"> Residuals </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;"> 1016461 </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;"> 438 </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;">  </td>
   <td style="text-align:right;font-weight: bold;color: black !important;background-color: yellow !important;">  </td>
  </tr>
</tbody>
</table>

Model with interaction `condition*prior`. Slopes don't differ between condition.
]

---

# The kitchen sink approach

.box-inv-7.medium.sp-after[Should we control for more stuff?]

.box-7.medium.sp-after[NO! ANCOVA is a design to reduce error]

- Randomization should ensure that there is no confounding
- No difference (on average) between group given a value of the covariate.
- If it isn't the case, adjustment matters

---

# Equal trends
.small[
- If trends are different, meaningful comparisons (?)
- Differences between groups depend on value of the covariate
]
<img src="08-slides_files/figure-html/ancovadifftrend-1.png" width="50%" style="display: block; margin: auto;" />
.small[Due to lack of overlap, comparisons hazardous as they entail extrapolation one way or another.]

---

# Testing equal slope

.box-7.sp-after[Compare two nested models]

- Null `$\mathscr{H}_0$`: model with covariate
- Alternative `$\mathscr{H}_a$`: model with interaction covariate * experimental factor

---

layout: false
name: linear-mediation
class: center middle section-title section-title-6

# Moderation

---

---

# Moderator

A **moderator** `$W$` modifies the direction or strength of the effect of an explanatory variable `$X$` on a response `$Y$` (interaction term).

<div class="figure" style="text-align: center">
<img src="08-slides_files/figure-html/fig-dag-moderation-1.png" alt="Directed acyclic graph of moderation" width="50%" />
<p class="caption">Directed acyclic graph of moderation</p>
</div>

.small[
Interactions are not limited to experimental factors: we can also have interactions with confounders, explanatories, mediators, etc.
]

---

# Moderation in a linear regression model

In a regression model, we simply include an **interaction** term to the model between `$W$` and `$X$`.

For example, if `$X$` is categorical with `$K$` levels and `$W$` is binary or continuous, imposing sum-to-zero constraints for `$\alpha_1, \ldots, \alpha_K$` and `$\beta_1, \ldots, \beta_K$` gives
$$
\underset{\text{average response of group `$k$` at `$w$`}}{\mathrm{E}(Y \mid X=k, W=w)} = \underset{\text{intercept of group `$k$`}}{\alpha_0 +\alpha_k} + \underset{\text{slope of group `$k$`}}{(\beta_0 + \beta_k)} w
$$

---

# Testing for the interaction

Test jointly whether coefficients associated to `$XW$` are zero, i.e., `$$\beta_1 = \cdots = \beta_K=0.$$`

The moderator `$W$` can be continuous or categorical with `$L \geq 2$` levels

The degrees of freedom (additional parameters for the interaction) in the `$F$` test are

- `$K-1$` for continuous `$W$`
   - are slopes parallel?
- `$(K-1) \times (L-1)$` for categorical `$W$`
   - are all subgroup averages the same?

---

# Example

We consider data from [Garcia et al. (2010)](https://doi.org/10.1002/ejsp.644), a study on gender discrimination. Participants were given a fictional file where a women was turned down promotion in favour of male colleague despite her being clearly more experimented and qualified.

The authors manipulated the decision of the participant, with choices:
.small[

- not to challenge the decision (no protest), 
- a request to reconsider based on individual qualities of the applicants (individual)
- a request to reconsider based on abilities of women (collective).

]
The postulated moderator variable is `sexism`, which assesses pervasiveness of gender discrimination.

---
# Model fit

We fit the linear model with the interaction.

``` r
data(GSBE10, package = "hecedsm")
lin_moder <- lm(respeval ~ protest*sexism, 
               data = GSBE10)
summary(lin_moder) # coefficients
car::Anova(lin_moder, type = 2) # tests
```

---

# ANOVA table

<table class="table" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> sum of squares </th>
   <th style="text-align:right;"> df </th>
   <th style="text-align:right;"> stat </th>
   <th style="text-align:left;"> p-value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> sexism </td>
   <td style="text-align:right;"> 0.27 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0.21 </td>
   <td style="text-align:left;"> .648 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> protest:sexism </td>
   <td style="text-align:right;"> 12.49 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 4.82 </td>
   <td style="text-align:left;"> .010 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 159.22 </td>
   <td style="text-align:right;"> 123 </td>
   <td style="text-align:right;">  </td>
   <td style="text-align:left;">  </td>
  </tr>
</tbody>
</table>

---

# Effects

<img src="08-slides_files/figure-html/unnamed-chunk-4-1.png" width="65%" style="display: block; margin: auto;" />
.small[

Results won't necessarily be reliable outside of the range of observed values of sexism.

]
---

# Comparisons between groups

Simple effects and comparisons must be done for a fixed value of sexism (since the slopes are not parallel).

The default value in `emmeans` is the mean value of `sexism`, but we could query for averages at different values of sexism (below for empirical quartiles).

``` r
quart <-  quantile(GSBE10$sexism, probs = c(0.25, 0.5, 0.75))
emmeans(lin_moder, 
        specs = "protest", 
        by = "sexism",
        at = list("sexism" = quart))
```

With moderating *factors*, give weights to each sub-mean corresponding to the frequency of the moderator rather than equal-weight to each category (`weights = "prop"`).

]
---

# Sensitivity analysis

The [Johnson and Neyman (1936)](https://doi.org/10.1007/bf02288864) method looks at the range of values of moderator `$W$` for which difference between treatments (binary `$X$`) is not statistically significant.

``` r
lin_moder2 <- lm(
  respeval ~ protest*sexism, 
  data = GSBE10 |> 
  # We dichotomize the manipulation, pooling protests together
  dplyr::mutate(protest = as.integer(protest != "no protest")))
# Test for equality of slopes/intercept for two protest groups
anova(lin_moder, lin_moder2)
# p-value of 0.18: fail to reject individual = collective.
```

---
# Syntax for plot

``` r
jn <- interactions::johnson_neyman(
  model = lin_moder2, # linear model
  pred = protest, # binary experimental factor
  modx = sexism, # moderator
  control.fdr = TRUE, # control for false discovery rate
  mod.range = range(GSBE10$sexism)) # range of values for sexism
jn$plot
```

---

# Plot of Johnson−Neyman intervals

<div class="figure" style="text-align: center">
<img src="08-slides_files/figure-html/fig-jn-1.png" alt="Johnson−Neyman plot for difference between protest and no protest as a function of sexism." width="70%" />
<p class="caption">Johnson−Neyman plot for difference between protest and no protest as a function of sexism.</p>
</div>

---

# Moderation

More generally, **moderation** refers to any explanatory variable (whether continuous or categorical) which **interacts** with the experimental manipulation.

- For categorical-categorical, this is a multiway ANOVA model
- For continuous-categorical, use linear regression

---

# Summary

* Inclusion of continuous covariates may help filtering out unwanted variability.
* These are typically variables measured before or alongside the response variable.
* This design reduce the residual error, leading to an increase in power (more ability to detect differences in average between experimental conditions).
* We are only interested in differences due to experimental condition (marginal effects).
* In general, there should be no interaction between covariates/blocking factors and experimental conditions. 
* This hypothesis can be assessed by comparing the models with and without interaction, if there are enough units (e.g., equality of slope for ANCOVA).
* Moderators are variables that interact with the experimental factor. We assess their presence by testing for an interaction in a linear regression model.

]