5.3 Two-way ANOVA and irrelevant hypotheses

A two-way analysis of variance extends the one-way ANOVA by considering two factors \(\mathbf{D}_1\) and \(\mathbf{D}_2\) with \(L_1\) and \(L_2\) levels, respectively. The full model with interactions is usually specified using the design matrix \(\mathbf{X} = [\mathbf{1}_n, \mathbf{D}_{1}, \mathbf{D}_2, (\mathbf{D}_1\circ \mathbf{D}_2) ]\), so the mean model is \[\begin{align*} \mathrm{E}(Y_{i})= \omega + \sum_{j=1}^{L_1-1} \beta_j{\mathbf 1}_{D_{i1}=j+1} + \sum_{k=1}^{L_2-1} \gamma_k{\mathbf 1}_{D_{i2}=k+1} + \sum_{j=1}^{L_1-1}\sum_{k=1}^{L_2-1} \nu_{jk}{\mathbf 1}_{D_{i1}=j+1}{\mathbf 1}_{D_{i2}=k+1}. \end{align*}\] Different submodels can be specified:

  1. the intercept-only model, \(M_1\), has 1 parameter (\(\nu_{jk}=\beta_j=\gamma_k=0, j=1, \ldots, L_1-1, k = 1, \ldots, L_2-1\));
  2. the model \(M_2\) with only the second factor (\(\nu_{jk}=\beta_j=0, j=1, \ldots, L_1-1, k = 1, \ldots, L_2-1\)) has \(L_2\) parameters;
  3. the model \(M_3\) with only the first factor (\(\nu_{jk}=\gamma_k=0, j=1, \ldots, L_1-1, k = 1, \ldots, L_2-1\)) has \(L_1\) parameters;
  4. the additive model \(M_4\) (\(\nu_{jk}=0, j=1, \ldots, L_1-1, k = 1, \ldots, L_2-1\)) has \(L_1+L_2-1\) parameters;
  5. the full model \(M_5\), which includes the interaction terms, has \(L_1L_2\) parameters.
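
These parameter counts can be checked numerically. Below is a small Python sketch (the setting, with one observation per cell and hypothetical values \(L_1=3\), \(L_2=4\), is chosen for illustration) that builds the indicator columns of each submodel's design matrix and counts them.

```python
# Count the free parameters of each submodel by constructing the
# indicator columns of its design matrix (hypothetical L1 = 3, L2 = 4).
from itertools import product

L1, L2 = 3, 4
cells = list(product(range(1, L1 + 1), range(1, L2 + 1)))  # one obs per cell

def columns(main1=False, main2=False, interaction=False):
    """Design-matrix columns: intercept, then the requested indicators."""
    cols = [[1] * len(cells)]                       # intercept
    if main1:
        for j in range(2, L1 + 1):                  # 1{D1 = j}, j > 1
            cols.append([int(d1 == j) for d1, d2 in cells])
    if main2:
        for k in range(2, L2 + 1):                  # 1{D2 = k}, k > 1
            cols.append([int(d2 == k) for d1, d2 in cells])
    if interaction:
        for j in range(2, L1 + 1):
            for k in range(2, L2 + 1):              # 1{D1 = j} 1{D2 = k}
                cols.append([int(d1 == j and d2 == k) for d1, d2 in cells])
    return cols

p = {
    "M1": len(columns()),                                          # 1
    "M2": len(columns(main2=True)),                                # L2
    "M3": len(columns(main1=True)),                                # L1
    "M4": len(columns(main1=True, main2=True)),                    # L1 + L2 - 1
    "M5": len(columns(main1=True, main2=True, interaction=True)),  # L1 * L2
}
print(p)  # {'M1': 1, 'M2': 4, 'M3': 3, 'M4': 6, 'M5': 12}
```

With \(L_1=3\) and \(L_2=4\) the counts are \(1\), \(4\), \(3\), \(6\) and \(12\), matching the formulas above.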

Note that we consider interaction terms only if the corresponding main effects are included. Removing main effects or lower-order interactions while keeping higher-order terms implies that the levels of the factor are no longer arbitrarily labelled: rearranging the levels of the factors (changing the baseline) leads to potentially different conclusions.

To see this, consider the model with design matrix \([\mathbf{1}_n, \mathbf{D}_{1}, (\mathbf{D}_1\circ \mathbf{D}_2)]\), so \(\gamma_k=0\) \((k = 1, \ldots, L_2-1)\).

The mean levels for observations in the different groups are \[\begin{align*} \mathrm{E}(Y_i \mid D_{i1}=1, D_{i2}=k) &= \omega \\ \mathrm{E}(Y_i \mid D_{i1}=j, D_{i2}=1; j > 1) &= \omega + \beta_j \\ \mathrm{E}(Y_i \mid D_{i1}=j, D_{i2}=k; j, k > 1) &= \omega + \beta_j + \nu_{jk} \end{align*}\] The effect of moving from level \(D_{2}=1\) to \(D_{2}=k>1\) is zero if \(D_{1}=1\), whereas it is \(\nu_{jk}\) if \(D_{1}=j>1\). We are therefore treating the levels of the first factor as genuinely different, and the fitted models (and the resulting inference) will change if we change the baseline category of the factors. This means that the labelling of the levels is no longer arbitrary! In R, the baseline is the first level in alphabetical order, so it is essentially arbitrary.

Just as we do not consider interactions without the corresponding main effects, one would not consider a polynomial of degree \(k\) without including the lower-order coefficients. For a degree \(k\) polynomial of the form \(\sum_{i=0}^k \beta_ix^i\), testing the hypothesis \(\beta_i=0\) for some \(i < k\) means restricting attention to polynomials whose \(i\)th order term is zero, which is usually nonsensical. Thus, any Wald test (the \(t\) values output by `summary`) corresponding to such a hypothesis should be disregarded.
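
The baseline-dependence can be illustrated directly. The Python sketch below (the cell means and the two-level factors, \(L_1=L_2=2\), are hypothetical) checks whether a given table of cell means is attainable under the \(\gamma_k=0\) model, once with level 1 of the first factor as baseline and once with level 2: the same means are representable under one labelling but not the other.

```python
# With gamma_k = 0 and L1 = L2 = 2, the model has parameters (omega, beta, nu)
# and fitted cell means
#   (baseline, 1): omega        (baseline, 2): omega
#   (other, 1):    omega + beta (other, 2):    omega + beta + nu
# so the only restriction on the four cell means is that the two cells with
# D1 at its baseline level are equal. Which restriction that is depends on
# the (supposedly arbitrary) choice of baseline.

def representable(mu, baseline):
    """True if the cell means mu[(j, k)] are attainable when `baseline`
    (1 or 2) is the reference level of the first factor."""
    return mu[(baseline, 1)] == mu[(baseline, 2)]

# Hypothetical cell means: equal within level 1 of D1, unequal within level 2.
mu = {(1, 1): 5.0, (1, 2): 5.0, (2, 1): 7.0, (2, 2): 9.0}
print(representable(mu, baseline=1))  # True:  fits with level 1 as baseline
print(representable(mu, baseline=2))  # False: relabelling changes the model
```

The two calls give different answers for the same data, showing that the model class itself (not merely its parametrization) depends on the baseline once main effects are dropped.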
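
The polynomial analogue can be seen the same way: the hypothesis \(\beta_i = 0\) for \(i < k\) is not invariant to the (arbitrary) origin of the covariate, since shifting \(x \mapsto x + c\) in a pure power \(\beta_k x^k\) reintroduces all lower-order terms. A short sketch with hypothetical coefficients, expanding \(\beta_2(x+c)^2 = \beta_2 x^2 + 2\beta_2 c\, x + \beta_2 c^2\):

```python
# A "pure quadratic" beta2 * x**2 (beta1 = beta0 = 0) acquires a nonzero
# linear term under the shift x -> x + c, so the restriction beta1 = 0
# depends on where the origin of x is placed. Coefficients below are
# hypothetical and computed from the binomial expansion.
beta2, c = 1.5, 2.0
quadratic, linear, constant = beta2, 2 * beta2 * c, beta2 * c**2
print((quadratic, linear, constant))  # (1.5, 6.0, 6.0)
```

The linear coefficient \(2\beta_2 c\) vanishes only for \(c=0\), so the restricted model is tied to one particular choice of origin, just as the interaction-without-main-effect model is tied to one particular baseline.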