# Complete factorial designs

**Session 6**

.light[MATH 80667A: Experimental Design and Statistical Methods <br>for Quantitative Research in Management <br>
HEC Montréal
]


---

# Outline

.box-5.medium.sp-after-half[Unbalanced designs]

.box-6.medium.sp-after-half[Multifactorial designs]

---
layout: false
name: unbalanced-designs
class: center middle section-title section-title-5

# Unbalanced designs

---
class: title title-5
# Premise

So far, we have exclusively considered balanced samples.

.box-inv-5.large.sp-after-half.sp-before[
balanced = same number of observational units in each subgroup
]

Most experiments (even planned) end up with unequal sample sizes.

---

Unbalanced samples may have many causes, including randomization (which need not yield balanced groups) and loss to follow-up (dropout).

If dropout is random, this is not a problem.
- Example of Baumann, Seifert-Kessell and Jones (1992): 
   > Because of illness and transfer to another school, incomplete data were obtained for one subject each from the TA and DRTA group

---

If units are lost because of the treatment or underlying conditions, this is problematic!

Rosendaal (2020), rebuking a study on the effectiveness of hydroxychloroquine as a treatment for COVID-19 and reviewing the allocation:
   > Of these 26, six were excluded (and incorrectly labelled as lost to follow-up): three were transferred to the ICU, one died, and two terminated treatment or were discharged

Sick people were excluded from the treatment group, and the treatment was then claimed to be better.

Worse still: "The index [treatment] group and control group were drawn from different centres."

???

Review of: “Hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label non-randomized clinical trial Gautret et al 2010, DOI:10.1016/j.ijantimicag.2020.105949”
https://doi.org/10.1016/j.ijantimicag.2020.106063

---

Two main reasons to seek balanced samples:

1. Power considerations: with equal variance in each group, balanced samples give the best allocation.
2. Simplicity of interpretation and calculations: the interpretation of the `$F$` test in a linear regression is unambiguous.

---

Consider a t-test for assessing the difference between treatments `$A$` and `$B$` with equal variability
`$$t= \frac{\text{estimated difference}}{\text{estimated variability}} = \frac{(\widehat{\mu}_A - \widehat{\mu}_B) - 0}{\mathsf{se}(\widehat{\mu}_A - \widehat{\mu}_B)}.$$`

The standard error of the average difference is 
`$$\sqrt{\frac{\text{variance}_A}{\text{nb of obs. in }A} + \frac{\text{variance}_B}{\text{nb of obs. in }B}} = \sqrt{\frac{\sigma^2}{n_A} + \frac{\sigma^2}{n_B}}$$`

---
class: title title-5
# Optimal allocation of resources

<img src="06-slides_files/figure-html/stderrordiffcurve-1.png" width="65%" style="display: block; margin: auto;" />
.small[
The allocation of `$n=n_A + n_B$` units that minimizes the standard error is `$n_A = n_B = n/2$`.
]
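
The claim can be checked numerically. The following is a minimal sketch under assumed values (a common standard deviation `sigma = 1` and a total of `n = 100` units, both hypothetical): it evaluates the standard error of the difference over all possible splits and locates the minimum.

```r
# Standard error of the difference in means for every split of n units,
# assuming a common variance (sigma and n are illustrative values)
sigma <- 1
n <- 100
n_A <- 1:(n - 1)
n_B <- n - n_A
std_error <- sqrt(sigma^2 / n_A + sigma^2 / n_B)
# Group size n_A that minimizes the standard error
n_A[which.min(std_error)]   # 50, the balanced split
# Balanced allocation versus a 90/10 split
std_error[n_A == 50]        # 0.2
std_error[n_A == 90]        # about 0.333
```

With these values, the 90/10 split inflates the standard error by about two thirds relative to the balanced allocation.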

---
class: title title-5
# Example: tempting fate

We consider data from Many Labs 2, a replication study that re-examined Risen and Gilovich (2008), who
.small[
> explored the belief that tempting fate increases bad outcomes. They tested whether people judge the likelihood of a negative outcome to be higher when they have imagined themselves [...] tempting fate [...] (by not reading before class) or  not [tempting] fate (by coming to class prepared). Participants then estimated how likely it was that [they] would be called on by the professor (scale from 1, not at all likely, to 10, extremely likely).
]

The replication data gathered in 37 different labs focuses on a 2 by 2 factorial design with gender (male vs female) and condition (prepared vs unprepared) administered to undergraduates.

---

- We consider a 2 by 2 factorial design.
- The response is `likelihood`
- The experimental factors are `condition` and `gender`
- Two data sets: `RS_unb` for the full data, `RS_bal` for the artificially balanced one.


.pull-left.small[

```r
library(dplyr)
# Sample size and sample mean of the response for each condition
summary_stats <- 
  RS_unb |> 
  group_by(condition) |> 
  summarize(nobs = n(),
            mean = mean(likelihood))
```
]
.pull-right[

<table>
<caption>Summary statistics</caption>
 <thead>
  <tr>
   <th style="text-align:left;"> condition </th>
   <th style="text-align:right;"> nobs </th>
   <th style="text-align:right;"> mean </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> unprepared </td>
   <td style="text-align:right;"> 2192 </td>
   <td style="text-align:right;"> 4.606 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> prepared </td>
   <td style="text-align:right;"> 2241 </td>
   <td style="text-align:right;"> 4.060 </td>
  </tr>
</tbody>
</table>
  
]

.pull-left.small[

```r
# Enforce sum-to-zero parametrization for unordered factors
# (polynomial contrasts are kept for ordered factors)
options(contrasts = c("contr.sum", "contr.poly"))
# Anova is a linear model, fit using 'lm'
# 'aov' only for *balanced data*
model <- lm(
  likelihood ~ gender * condition,
  data = RS_unb)
library(emmeans)
emm <- emmeans(model, 
               specs = "condition")
```
]
.pull-right[

<table>
<caption>Marginal means for condition</caption>
 <thead>
  <tr>
   <th style="text-align:left;"> condition </th>
   <th style="text-align:right;"> emmean </th>
   <th style="text-align:right;"> SE </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> unprepared </td>
   <td style="text-align:right;"> 4.504 </td>
   <td style="text-align:right;"> 0.0540 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> prepared </td>
   <td style="text-align:right;"> 4.022 </td>
   <td style="text-align:right;"> 0.0535 </td>
  </tr>
</tbody>
</table>
.small[
Note unequal standard errors.
]
]
---
class: title title-5
# Explaining the discrepancies

Estimated marginal means are based on equiweighted groups:
`$$\widehat{\mu} = \frac{1}{4}\left( \widehat{\mu}_{11} + \widehat{\mu}_{12} + \widehat{\mu}_{21} + \widehat{\mu}_{22}\right)$$`
where `$\widehat{\mu}_{ij} = n_{ij}^{-1} \sum_{r=1}^{n_{ij}} y_{ijr}$`.

The sample mean is the sum of observations divided by the sample size.

The two coincide when `$n_{11} = \cdots = n_{22}$`.
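
To illustrate the difference between the two averages, here is a small sketch with made-up data (the data frame `toy` and its values are hypothetical, not taken from the replication study). The four cells have means 1, 2, 3 and 4, but unequal counts.

```r
# Hypothetical unbalanced 2 x 2 layout with cell counts 2, 6, 4 and 8
toy <- data.frame(
  A = factor(rep(c("a1", "a1", "a2", "a2"), times = c(2, 6, 4, 8))),
  B = factor(rep(c("b1", "b2", "b1", "b2"), times = c(2, 6, 4, 8))),
  y = c(rep(1, 2), rep(2, 6), rep(3, 4), rep(4, 8)))
# Matrix of cell means, one entry per subgroup
cell_means <- tapply(toy$y, list(toy$A, toy$B), mean)
mean(cell_means)   # equiweighted: (1 + 2 + 3 + 4) / 4 = 2.5
mean(toy$y)        # sample mean: 58 / 20 = 2.9, weights follow cell counts
```

The sample mean is pulled towards the larger cells, whereas the equiweighted average treats each subgroup alike.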

---
class: title title-5
# Why equal weight?

- The ANOVA and contrast analyses, in the case of unequal sample sizes, are generally based on marginal means (same weight for each subgroup).
- This choice is justified because research questions generally concern comparisons of means across experimental groups.

---
class: title title-5
# Revisiting the `$F$` statistic

Statistical tests contrast competing **nested** models:

- an alternative (full) model
- a null model, which imposes restrictions (a simplification of the alternative model)

The numerator of the `$F$`-statistic compares the sum of squares of a model with a (given) main effect, etc., to that of a model without.
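
As a sketch of this comparison, the statistic can be assembled by hand from the residual sums of squares of two nested fits. The data frame `toy` and its values are made up for illustration; the result is compared with the output of `anova`.

```r
# Hypothetical one-way layout with three groups of four observations
toy <- data.frame(
  group = factor(rep(c("g1", "g2", "g3"), each = 4)),
  y = c(4, 5, 6, 5, 6, 7, 8, 7, 5, 6, 5, 4))
null_model <- lm(y ~ 1, data = toy)      # restricted: common mean
full_model <- lm(y ~ group, data = toy)  # alternative: one mean per group
rss0 <- sum(resid(null_model)^2)
rss1 <- sum(resid(full_model)^2)
df_num <- df.residual(null_model) - df.residual(full_model)
# Reduction in residual sum of squares per degree of freedom,
# divided by the residual variance estimate of the full model
F_stat <- ((rss0 - rss1) / df_num) / (rss1 / df.residual(full_model))
F_stat
anova(null_model, full_model)$F[2]  # same statistic from the nested comparison
```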

---
class: title title-5
# What is explained by condition?

Consider the `$2 \times 2$` factorial design with factors `$A$`: `gender` and `$B$`: `condition` (prepared vs unprepared) without interaction.

What is the share of variability (sum of squares) explained by the experimental condition?
---
class: title title-5
# Comparing differences in sum of squares (1)

Consider a balanced sample

```r
anova(lm(likelihood ~ 1, data = RS_bal), 
      lm(likelihood ~ condition, data = RS_bal))
# When gender is present
anova(lm(likelihood ~ gender, data = RS_bal), 
      lm(likelihood ~ gender + condition, data = RS_bal))
```

---
class: title title-5
# Comparing differences in sum of squares (2)

Consider an unbalanced sample

```r
anova(lm(likelihood ~ 1, data = RS_unb), 
      lm(likelihood ~ condition, 
         data = RS_unb))
# When gender is present      
anova(lm(likelihood ~ gender, data = RS_unb), 
      lm(likelihood ~ gender + condition, 
         data = RS_unb))
```

---

Balanced designs yield orthogonal factors: the improvement in the goodness of fit (characterized by change in sum of squares) is the same regardless of other factors.

So the effects of `$B$` and `$B \mid A$` (read `$B$` given `$A$`) are the same.

- the test for `$B \mid A$` is based on `$\mathsf{SS}(A, B) - \mathsf{SS}(A)$`
- for balanced design, `$\mathsf{SS}(A, B) = \mathsf{SS}(A) + \mathsf{SS}(B)$` (factorization).

We lose this property with unbalanced samples: there are distinct formulations of ANOVA.
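
The factorization can be verified on simulated data. This is a sketch under stated assumptions: the data are simulated (arbitrary seed and effect sizes), and `ss_model` and `ss_gap` are hypothetical helpers written for this illustration.

```r
set.seed(80667)  # arbitrary seed
# Hypothetical balanced 2 x 2 data with 10 replications per cell
bal <- expand.grid(A = factor(c("a1", "a2")),
                   B = factor(c("b1", "b2")),
                   r = 1:10)
bal$y <- rnorm(nrow(bal), mean = as.integer(bal$A) + 2 * as.integer(bal$B))
# Unbalanced version: keep only three observations in one cell
unb <- bal[!(bal$A == "a1" & bal$B == "b1" & bal$r > 3), ]
# Sum of squares explained by a model, relative to the intercept-only fit
ss_model <- function(formula, data) {
  sum((fitted(lm(formula, data = data)) - mean(data$y))^2)
}
# SS(A, B) - SS(A) - SS(B)
ss_gap <- function(data) {
  ss_model(y ~ A + B, data) - ss_model(y ~ A, data) - ss_model(y ~ B, data)
}
ss_gap(bal)  # zero up to rounding error: orthogonal factors
ss_gap(unb)  # nonzero: the factorization fails
```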

---

The default method in **R** with `anova` is the sequential decomposition (type I), which follows the order of the variables `$A$`, `$B$` in the formula.

- So the `$F$` tests are for the effects of 
  - `$A$`, based on `$\mathsf{SS}(A)$`
  - `$B \mid A$`, based on `$\mathsf{SS}(A, B) - \mathsf{SS}(A)$`
  - `$AB \mid A, B$` based on `$\mathsf{SS}(A, B, AB) - \mathsf{SS}(A, B)$`

Since the order in which we list the variables is **arbitrary**, these `$F$` tests are not of interest.
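
A short sketch of this order dependence, using made-up unbalanced data (the data frame `toy` is hypothetical): swapping the order of the terms changes the sum of squares attributed to `$A$`.

```r
# Hypothetical unbalanced data with cell counts 1, 2, 3 and 2
toy <- data.frame(
  A = factor(c("a1", "a1", "a1", "a2", "a2", "a2", "a2", "a2")),
  B = factor(c("b1", "b2", "b2", "b1", "b1", "b1", "b2", "b2")),
  y = c(3, 5, 6, 4, 5, 6, 8, 9))
# Sequential decomposition with A listed first, then with B listed first
tab_AB <- anova(lm(y ~ A + B, data = toy))
tab_BA <- anova(lm(y ~ B + A, data = toy))
tab_AB["A", "Sum Sq"]  # SS(A)
tab_BA["A", "Sum Sq"]  # SS(A, B) - SS(B)
```

The residual sum of squares is the same in both tables, since the fitted model is identical; only the attribution to each factor changes.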

---

Impact of 
- `$A \mid B$`  based on `$\mathsf{SS}(A, B) - \mathsf{SS}(B)$`
- `$B \mid A$` based on `$\mathsf{SS}(A, B) - \mathsf{SS}(A)$`
- `$AB \mid A, B$` based on `$\mathsf{SS}(A, B, AB) - \mathsf{SS}(A, B)$`
- the tests for main effects are invalid if there is an interaction.
- In **R**, use `car::Anova(model, type = 2)`

---

Most commonly used approach

- Improvement due to `$A \mid B, AB$`, `$B \mid A, AB$` and `$AB \mid A, B$`
- What is improved by adding a factor, interaction, etc. given the rest 
- may require imposing equal mean for rows for `$A \mid B, AB$`, etc. 
   - (**requires** sum-to-zero parametrization)
- valid in the presence of interaction
- but `$F$`-tests for main effects are not of interest when an interaction is present
- In **R**, use `car::Anova(model, type = 3)`

---
class: title title-5
# ANOVA for unbalanced data
.pull-left.small[

```r
# Same order of factors as in the tables: gender, then condition
model <- lm(
  likelihood ~ gender * condition,
  data = RS_unb)
# Three distinct decompositions
anova(model) #type 1
car::Anova(model, type = 2)
car::Anova(model, type = 3)
```

<table>
<caption>ANOVA (type I)</caption>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Df </th>
   <th style="text-align:right;"> Sum Sq </th>
   <th style="text-align:right;"> F value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 164.94 </td>
   <td style="text-align:right;"> 29.1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 332.34 </td>
   <td style="text-align:right;"> 58.7 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> gender:condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 36.55 </td>
   <td style="text-align:right;"> 6.5 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 4429 </td>
   <td style="text-align:right;"> 25086.33 </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>
]

.pull-right.small[
<table>
<caption>ANOVA (type II)</caption>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Df </th>
   <th style="text-align:right;"> Sum Sq </th>
   <th style="text-align:right;"> F value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 166.33 </td>
   <td style="text-align:right;"> 29.4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 332.34 </td>
   <td style="text-align:right;"> 58.7 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> gender:condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 36.55 </td>
   <td style="text-align:right;"> 6.5 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 4429 </td>
   <td style="text-align:right;"> 25086.33 </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>

<table>
<caption>ANOVA (type III)</caption>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Df </th>
   <th style="text-align:right;"> Sum Sq </th>
   <th style="text-align:right;"> F value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 167.71 </td>
   <td style="text-align:right;"> 29.6 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 227.88 </td>
   <td style="text-align:right;"> 40.2 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> gender:condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 36.55 </td>
   <td style="text-align:right;"> 6.5 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 4429 </td>
   <td style="text-align:right;"> 25086.33 </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>

]
---
class: title title-5
# ANOVA for balanced data
.pull-left.small[

```r
model2 <- lm(
  likelihood ~ condition * gender,
  data = RS_bal)
anova(model2) #type 1
car::Anova(model2, type = 2)
car::Anova(model2, type = 3)
# Same answer - orthogonal!
```

<table>
<caption>ANOVA (type I)</caption>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Df </th>
   <th style="text-align:right;"> Sum Sq </th>
   <th style="text-align:right;"> F value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 141.86 </td>
   <td style="text-align:right;"> 24.1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 121.69 </td>
   <td style="text-align:right;"> 20.6 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> condition:gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 37.88 </td>
   <td style="text-align:right;"> 6.4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 2500 </td>
   <td style="text-align:right;"> 14733.84 </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>
]

.pull-right.small[
<table>
<caption>ANOVA (type II)</caption>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Df </th>
   <th style="text-align:right;"> Sum Sq </th>
   <th style="text-align:right;"> F value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 141.86 </td>
   <td style="text-align:right;"> 24.1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 121.69 </td>
   <td style="text-align:right;"> 20.6 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> condition:gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 37.88 </td>
   <td style="text-align:right;"> 6.4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 2500 </td>
   <td style="text-align:right;"> 14733.84 </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>

<table>
<caption>ANOVA (type III)</caption>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Df </th>
   <th style="text-align:right;"> Sum Sq </th>
   <th style="text-align:right;"> F value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> condition </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 141.86 </td>
   <td style="text-align:right;"> 24.1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 121.69 </td>
   <td style="text-align:right;"> 20.6 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> condition:gender </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 37.88 </td>
   <td style="text-align:right;"> 6.4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 2500 </td>
   <td style="text-align:right;"> 14733.84 </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>

]
---
class: title title-5
# Recap

- If each observation has the same variability, a balanced sample maximizes power.
- Balanced designs have interesting properties:
   - estimated marginal means coincide with (sub)sample averages
   - the tests of effects are unambiguous
- For unbalanced samples, we work with marginal means and type 3 ANOVA.
- If there are empty cells (no one assigned to a combination of treatments), we cannot estimate the corresponding coefficients (typically higher order interactions).

---

# Practice

From the OSC psychology replication

> People can be influenced by the prior consideration of a numerical anchor when forming numerical judgments. [...]  The anchor provides an initial starting point from which estimates are adjusted, and a large body of research demonstrates that adjustment is usually insufficient, leading estimates to be biased towards the initial anchor.

.small[
[Replication of Study 4a of Janiszewski & Uy (2008, Psychological Science) by J. Chandler](https://osf.io/aaudl/)
]

???

People can be influenced by the prior consideration of a numerical anchor when forming numerical judgments. The anchor provides an initial starting point from which estimates are adjusted, and a large body of research demonstrates that adjustment is usually insufficient, leading estimates to be biased towards the initial anchor. Extending this work, Janiszewski and Uy (2008) conceptualized people's attempt to adjust following presentation of an anchor as movement along a subjective representation scale by a certain number of units. Precise numbers (e.g. 9.99) imply a finer-resolution scale than round numbers (e.g. 10). Consequently, adjustment along a subjectively finer resolution scale will result in less objective adjustment than adjustment by the same number of units along a subjectively coarse resolution scale.

In three experimental studies the authors demonstrate this predicted basic effect and rule out various alternative explanations. Two additional studies (4a and b) found that this effect was especially strong when people were explicitly given more motivation to adjust their estimates (e.g., by implying that the initial anchor substantially overestimated the price).

---

layout: false
name: factorial-designs
class: center middle section-title section-title-6 animated fadeIn

# Multifactorial designs

---

We can consider multiple factors `$A$`, `$B$`, `$C$`, `$\ldots$` with respectively `$n_a$`, `$n_b$`, `$n_c$`, `$\ldots$` levels and with `$n_r$` replications for each.

The total number of treatment combinations is

.box-inv-6.sp-after-half[
`$n_a \times n_b \times n_c \times \cdots$`
]

.box-6.medium[
**Curse of dimensionality**
]
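
As a quick illustration of how fast the count grows, the sketch below uses the numbers of levels of the microwave popcorn experiment from these slides (three brands, two power settings, three times); the fourth factor `extra` is hypothetical.

```r
# Number of levels per factor in the popcorn experiment
n_levels <- c(brand = 3, power = 2, time = 3)
prod(n_levels)                 # 18 treatment combinations
# A hypothetical fourth factor with four levels multiplies the count
prod(c(n_levels, extra = 4))   # 72
# Total sample size with n_r replications per combination
n_r <- 2
n_r * prod(n_levels)           # 36 observations
```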

---

Each cell of the cube is allowed to have a different mean

`$$\begin{align*}
\underset{\text{response}\vphantom{cell}}{Y_{ijkr}\vphantom{\mu_{j}}} = \underset{\text{cell mean}}{\mu_{ijk}} + \underset{\text{error}\vphantom{cell}}{\varepsilon_{ijkr}\vphantom{\mu_{j}}}
\end{align*}$$`
where `$\varepsilon_{ijkr}$` are independent error terms for 
- row `$i$`
- column `$j$`
- depth `$k$`
- replication `$r$`

---
class: title title-6
# Parametrization of a three-way ANOVA model

With the **sum-to-zero** parametrization with factors `$A$`, `$B$` and `$C$`, write the mean response as

`$$\begin{align*}\underset{\text{theoretical average}}{\mathsf{E}(Y_{ijkr})} &= \quad \underset{\text{global mean}}{\mu} \\ &\quad +\underset{\text{main effects}}{\alpha_i + \beta_j + \gamma_k}  \\ & \quad + \underset{\text{two-way interactions}}{(\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}} \\ & \quad + \underset{\text{three-way interaction}}{(\alpha\beta\gamma)_{ijk}}\end{align*}$$`
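
A minimal sketch of this parametrization on simulated data (the data frame `toy`, the seed and the mean are assumptions for illustration): with `contr.sum` contrasts and all interactions included, the intercept estimates the global mean `$\mu$`, that is, the average of the eight cell means.

```r
set.seed(2024)  # arbitrary seed
# Hypothetical 2 x 2 x 2 design with three replications per cell
toy <- expand.grid(A = factor(1:2), B = factor(1:2), C = factor(1:2), r = 1:3)
toy$y <- rnorm(nrow(toy), mean = 5)
fit <- lm(y ~ A * B * C, data = toy,
          contrasts = list(A = "contr.sum", B = "contr.sum", C = "contr.sum"))
# Cell means for the eight treatment combinations
cell_means <- aggregate(y ~ A + B + C, data = toy, FUN = mean)$y
# The intercept coincides with the average of the cell means
coef(fit)[["(Intercept)"]]
mean(cell_means)
# Coding for a two-level factor: the two effects sum to zero
contr.sum(2)
```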

---
.small[
<div class="figure" style="text-align: center">
<img src="img/06/cube.png" alt="global mean, row, column and depth main effects" width="20%" /><img src="img/06/cube_rows.png" alt="global mean, row, column and depth main effects" width="20%" /><img src="img/06/cube_column.png" alt="global mean, row, column and depth main effects" width="20%" /><img src="img/06/cube_depth.png" alt="global mean, row, column and depth main effects" width="20%" />
<p class="caption">global mean, row, column and depth main effects</p>
</div>
]
.small[
<div class="figure" style="text-align: center">
<img src="img/06/cube_rowcol.png" alt="row/col, row/depth and col/depth interactions and three-way interaction." width="20%" /><img src="img/06/cube_rowdepth.png" alt="row/col, row/depth and col/depth interactions and three-way interaction." width="20%" /><img src="img/06/cube_coldepth.png" alt="row/col, row/depth and col/depth interactions and three-way interaction." width="20%" /><img src="img/06/cube_all.png" alt="row/col, row/depth and col/depth interactions and three-way interaction." width="20%" />
<p class="caption">row/col, row/depth and col/depth interactions and three-way interaction.</p>
</div>
]

---
class: title title-6
# Example of three-way design

.small[
Petty, Cacioppo and Heesacker (1981). Effects of rhetorical questions on persuasion: A cognitive response analysis. Journal of Personality and Social Psychology.

A `$2 \times 2 \times 2$` factorial design with 8 treatment groups and `$n=160$` undergraduates.

Setup: should a comprehensive exam be administered to bachelor students in their final year?

- **Response**: Likert scale from `$-5$` (do not agree at all) to `$5$` (completely agree)
- **Factors**
    - `$A$`: strength of the argument (`strong` or `weak`)
    - `$B$`: involvement of students, either `low` (far away, in a long time) or `high` (next year, at their university)
    - `$C$`: style of argument, either `regular` form or `rhetorical` (Don't you think?, ...)
]

---
class: title title-6

# Interaction plot

.small[
Interaction plot for a  `$2 \times 2 \times 2$` factorial design from Petty, Cacioppo and Heesacker (1981)
]

???

p.472 of Keppel and Wickens

---
class: title title-6
# The microwave popcorn experiment

What is the best brand of microwave popcorn?

- **Factors**
    - brand (two national, one local)
    - power: 500W and 600W
    - time: 4, 4.5 and 5 minutes
- **Response**: <s>weight</s>, <s>volume</s>, <s>number</s>, percentage of popped kernels.
- A pilot study showed an overall average of 70% popped kernels (standard deviation of 10%) and that the timing values were reasonable.
- Power calculations suggested at least `$r=4$` replicates, but the researchers proceeded with `$r=2$`...

---

```r
data(popcorn, package = 'hecedsm')
# Fit model with three-way interaction
model <- aov(percentage ~ brand*power*time,
             data = popcorn)
# ANOVA table - 'anova' is ONLY for balanced designs
anova_table <- anova(model) 
# Quantile-quantile plot
car::qqPlot(model)
```


All points fall roughly on a straight line.

```r
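library(dplyr)
library(ggplot2)
# Cell means for each combination of brand, time and power
popcorn |> 
   group_by(brand, time, power) |>
   summarize(meanp = mean(percentage), .groups = "drop") |>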
ggplot(mapping = aes(x = power, 
                     y = meanp, 
                     col = time, 
                     group = time)) + 
  geom_line() + 
  facet_wrap(~brand)
```

No evidence of three-way interaction (hard to tell with `$r=2$` replications).

---

| terms | degrees of freedom | 
|:---:|:-----|
| `$A$` | `$n_a-1$` | 
| `$B$` | `$n_b-1$` | 
| `$C$` | `$n_c-1$` | 
| `$AB$` | `$(n_a-1)(n_b-1)$` | 
| `$AC$` | `$(n_a-1)(n_c-1)$` | 
| `$BC$` | `$(n_b-1)(n_c-1)$` | 
| `$ABC$` | `${\small (n_a-1)(n_b-1)(n_c-1)}$` | 
| `$\text{residual}$` | `$n_an_bn_c(n_r-1)$` | 
| `$\text{total}$` | `$n_an_bn_cn_r-1$` |
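
As a worked instance of these formulas, the sketch below plugs in the dimensions of the popcorn experiment (three brands, two power levels, three times, two replications). The vector name `df` is only for illustration.

```r
# Levels of A (brand), B (power), C (time) and number of replications
n_a <- 3; n_b <- 2; n_c <- 3; n_r <- 2
df <- c(
  A = n_a - 1, B = n_b - 1, C = n_c - 1,
  AB = (n_a - 1) * (n_b - 1),
  AC = (n_a - 1) * (n_c - 1),
  BC = (n_b - 1) * (n_c - 1),
  ABC = (n_a - 1) * (n_b - 1) * (n_c - 1),
  residual = n_a * n_b * n_c * (n_r - 1))
df
# The entries add up to the total, n_a * n_b * n_c * n_r - 1 = 35
sum(df)
```

These values (2, 1, 2, 2, 4, 2, 4 and 18) match the degrees of freedom reported in the analysis of variance table for the popcorn data.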

---

<table class="table" style="margin-left: auto; margin-right: auto;">
<caption>Analysis of variance table for microwave-popcorn</caption>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Degrees of freedom </th>
   <th style="text-align:right;"> Sum of squares </th>
   <th style="text-align:right;"> Mean square </th>
   <th style="text-align:right;"> F statistic </th>
   <th style="text-align:right;"> p-value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> brand </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 331.10 </td>
   <td style="text-align:right;"> 165.55 </td>
   <td style="text-align:right;"> 1.89 </td>
   <td style="text-align:right;"> 0.180 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> power </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 455.11 </td>
   <td style="text-align:right;"> 455.11 </td>
   <td style="text-align:right;"> 5.19 </td>
   <td style="text-align:right;"> 0.035 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> time </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 1554.58 </td>
   <td style="text-align:right;"> 777.29 </td>
   <td style="text-align:right;"> 8.87 </td>
   <td style="text-align:right;"> 0.002 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> brand:power </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 196.04 </td>
   <td style="text-align:right;"> 98.02 </td>
   <td style="text-align:right;"> 1.12 </td>
   <td style="text-align:right;"> 0.349 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> brand:time </td>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:right;"> 1433.86 </td>
   <td style="text-align:right;"> 358.46 </td>
   <td style="text-align:right;"> 4.09 </td>
   <td style="text-align:right;"> 0.016 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> power:time </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 47.71 </td>
   <td style="text-align:right;"> 23.85 </td>
   <td style="text-align:right;"> 0.27 </td>
   <td style="text-align:right;"> 0.765 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> brand:power:time </td>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:right;"> 47.33 </td>
   <td style="text-align:right;"> 11.83 </td>
   <td style="text-align:right;"> 0.13 </td>
   <td style="text-align:right;"> 0.967 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residuals </td>
   <td style="text-align:right;"> 18 </td>
   <td style="text-align:right;"> 1577.87 </td>
   <td style="text-align:right;"> 87.66 </td>
   <td style="text-align:right;">  </td>
   <td style="text-align:right;">  </td>
  </tr>
</tbody>
</table>

---
class: title title-6
# Omitting terms in a factorial design

The more levels and factors, the more parameters to estimate (and replications needed)
- Costly to get enough observations / power
- The assumption of normality becomes more critical when `$r=2$`!

It may be useful not to consider some interactions if they are known or (strongly) suspected not to be present

- If important interactions are omitted from the model, biased estimates/output!

---
class: title title-6
# Guidelines for the interpretation of effects

Start with the most complicated term (top down)

- If the three-way interaction `$ABC$` is significant:
    - don't interpret main effects or two-way interactions!
    - comparison is done cell by cell within each level
- If the `$ABC$` term isn't significant:
    - can marginalize and interpret lower order terms
    - back to a series of two-way ANOVAs

---

# What contrasts are of interest?

- Can view a three-way ANOVA as a series of one-way ANOVA or two-way ANOVAs...

Depending on the goal, could compare for variable `$A$`
- marginal contrast `$\psi_A$` (averaging over `$B$` and `$C$`)
- marginal conditional contrast for a particular subgroup: `$\psi_A$` within `$c_1$`
- contrast involving two variables: `$\psi_{AB}$`
- differences in the contrast `$\psi_A$` across levels of `$B$` (interaction contrast `$\psi_A \times B$`), averaging over `$C$`.
- etc.

See helper code and chapter 22 of Keppel & Wickens (2004) for a detailed example.

---
class: title title-6
# Effects and contrasts for microwave-popcorn

The following comparisons were preplanned:

- Which combo (brand, power, time) gives the highest popping rate? (pairwise comparisons of all combos)
- Best brand overall (marginal means marginalizing over power and time, assuming no interaction)
- Effect of time and power on percentage of popped kernels 
    - pairwise comparison of time `$\times$` power
    - main effect of power
    - main effect of time

---

Let `$A$`=brand, `$B$`=power, `$C$`=time

Compare the difference between the percentage of popped kernels for 4.5 versus 5 minutes, for brands 1 and 2 (the dot in `$\mu_{i.k}$` denotes averaging over power).

`$$\mathscr{H}_0: (\mu_{1.2} -\mu_{1.3}) - (\mu_{2.2} - \mu_{2.3}) = 0$$`

```r
library(emmeans)
# marginal means for brand and time (averaging over power)
emm_popcorn_AC <- emmeans(model, 
                          specs = c("brand","time"))
# weights follow the order of the rows of emm_popcorn_AC
contrast_list <- 
  list(
    brand12with4.5vs5min = c(0, 0, 0, 1, -1, 0, -1, 1,0))
contrast(emm_popcorn_AC,  # marginal means (averaging over power)
         method = contrast_list) # list of contrasts
```

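
The mapping between weights and cells can be sketched in base R. This assumes that the nine marginal means are listed with the first variable in `specs` (brand) varying fastest, as in `expand.grid`; print the marginal means to confirm the order before relying on it. The level labels below are placeholders, not the labels used in the data set.

```r
# Assumed order of the nine (brand, time) cells, brand varying fastest
cells <- expand.grid(brand = 1:3, time = c(4, 4.5, 5))
weights <- c(0, 0, 0, 1, -1, 0, -1, 1, 0)
# Cells with nonzero weights: +1 for (brand 1, 4.5), -1 for (brand 2, 4.5),
# -1 for (brand 1, 5) and +1 for (brand 2, 5)
cbind(cells, weights)[weights != 0, ]
# Contrast weights sum to zero
sum(weights)
```

Under that ordering, the weights reproduce `$(\mu_{1.2} -\mu_{1.3}) - (\mu_{2.2} - \mu_{2.3})$`.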

---

Compare all three times (4, 4.5 and 5 minutes)

At level 99% with Tukey's HSD method

- Careful! Potentially misleading because there is a `brand * time` interaction present.

```r
# Variables to keep are listed in `specs`: keep only time
emm_popcorn_C <- emmeans(model, specs = "time")
pairs(emm_popcorn_C, 
      adjust = "tukey", 
      level = 0.99, 
      infer = TRUE)
```