5 Analysis of variance

Consider the linear model \(\boldsymbol{Y} = \mathbf{1}_n\alpha + \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}\), where \(\mathbf{X}=(\mathbf{1}_n, \mathbf{Z})\) is a full-rank \(n \times p\) design matrix. As usual, let \(\mathrm{TSS} = \boldsymbol{y}^\top\mathbf{M}_{\mathbf{1}_n}\boldsymbol{y}\) denote the total sum of squares and \(\mathrm{RSS}= \boldsymbol{y}^\top\mathbf{M}_{\mathbf{X}}\boldsymbol{y}\) the sum of squared residuals. Under the assumptions of the Gaussian linear model (or asymptotically), the \(F\)-test statistic for testing the null hypothesis \(\mathrm{H}_0: \boldsymbol{\beta}=\mathbf{0}_{p-1}\) against the alternative \(\mathrm{H}_a: \boldsymbol{\beta} \in \mathbb{R}^{p-1}\), assuming the larger model is correctly specified, is \[F = \frac{(\mathrm{TSS}-\mathrm{RSS})/(p-1)}{\mathrm{RSS}/(n-p)}.\] Under the null hypothesis, \(F \sim \mathcal{F}(p-1, n-p)\).
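The computation of \(F\) can be illustrated with a minimal sketch for the special case of a single regressor (\(p = 2\)), where the least squares estimates have a closed form. The function name `f_statistic` and the toy data are assumptions made for illustration only; they do not come from the text.

```python
def f_statistic(x, y):
    """Global F statistic for the simple linear regression
    y_i = alpha + beta * x_i + eps_i, so that p = 2 (illustrative helper)."""
    n, p = len(y), 2
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Closed-form least squares estimates for a single regressor
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    # TSS: squared deviations from the intercept-only fit (the sample mean)
    tss = sum((yi - ybar) ** 2 for yi in y)
    # RSS: squared residuals from the full model
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    f = ((tss - rss) / (p - 1)) / (rss / (n - p))
    return tss, rss, f


# Hypothetical toy data
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]
tss, rss, f = f_statistic(x, y)
print(tss, rss, f)
```

With more than one regressor, the same formula applies once RSS is obtained from a general least squares fit.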

An ANOVA table (anova) arranges the information about the sum of squares decomposition, the degrees of freedom and the value of the \(F\)-test statistic in the following manner, where \(\mathrm{ESS} = \mathrm{TSS} - \mathrm{RSS}\) denotes the explained sum of squares.

| Sum of squares | Degrees of freedom | Scaled sum of squares | Test statistic | \(P\)-value |
|---|---|---|---|---|
| ESS | \(p-1\) | \(\mathrm{ESS}/(p-1)\) | \(F\) | \(1-\texttt{pf}(F, p-1, n-p)\) |
| RSS | \(n-p\) | \(\mathrm{RSS}/(n-p)\) | | |
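The entries of the table can be assembled from TSS, RSS, \(n\) and \(p\) alone. The sketch below is one way to do so using only the Python standard library. The names `anova_table`, `f_sf` and `reg_inc_beta` are hypothetical, and the input values are made up for illustration. The upper tail probability of the \(F\) distribution, which plays the role of \(1-\texttt{pf}(F, p-1, n-p)\), is computed here through the regularized incomplete beta function evaluated by a continued fraction; this is a swapped-in numerical technique, not necessarily what any statistical package does internally.

```python
import math


def _betacf(a, b, x, max_iter=200, eps=3e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd coefficients of the continued fraction
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
              + a * math.log(x) + b * math.log1p(-x))
    bt = math.exp(log_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def f_sf(f, df1, df2):
    """Upper tail probability P(F > f) for the F(df1, df2) distribution."""
    if f <= 0.0:
        return 1.0
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))


def anova_table(tss, rss, n, p):
    """Rows: (label, degrees of freedom, scaled sum of squares, F, P-value)."""
    ess = tss - rss
    f = (ess / (p - 1)) / (rss / (n - p))
    return [
        ("ESS", p - 1, ess / (p - 1), f, f_sf(f, p - 1, n - p)),
        ("RSS", n - p, rss / (n - p), None, None),
    ]


# Made-up inputs for illustration only
table = anova_table(tss=100.0, rss=40.0, n=20, p=3)
for row in table:
    print(row)
```

When \(p-1 = 2\), the tail probability has the closed form \((1 + 2F/(n-p))^{-(n-p)/2}\), which gives a convenient check on the numerical routine.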