2.5 Summary of week 2

If \mathbf{X} is an n×p design matrix containing covariates and \boldsymbol{y} is our response vector, we can obtain the ordinary least squares (OLS) coefficients for the linear model \boldsymbol{y} = \mathbf{X}\boldsymbol{\beta}+ \boldsymbol{\varepsilon}, \qquad \mathrm{E}(\boldsymbol{\varepsilon})=\boldsymbol{0}_n, by projecting \boldsymbol{y} onto the column space of \mathbf{X}; assuming \mathbf{X} has full column rank, it follows that \mathbf{X}\hat{\boldsymbol{\beta}}=\mathbf{X}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\boldsymbol{y} and hence \hat{\boldsymbol{\beta}} = (\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\boldsymbol{y}.
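The closed-form solution can be checked numerically. The following sketch uses NumPy with simulated data; the sample size, coefficients, and seed are arbitrary illustrative choices, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3

# Simulated design matrix with an intercept column (illustrative data only)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

# OLS coefficients: solve the normal equations (X^T X) beta = X^T y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values: the projection of y onto the column space of X
y_hat = X @ beta_hat
```

Solving the normal equations with np.linalg.solve avoids forming the explicit inverse (\mathbf{X}^\top\mathbf{X})^{-1}, which is numerically preferable.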

The first interpretation (which is used for graphical diagnostics) is the row geometry: each row of \mathbf{X} corresponds to an individual, and the response y_i is a one-dimensional point attached to it. The vector \hat{\boldsymbol{\beta}} describes the parameters of the hyperplane that minimizes the sum of squared vertical distances between the fitted values \hat{y}_i and the responses y_i. The problem is best written using vector-matrix notation, so

\hat{\boldsymbol{\beta}} = \mathrm{argmin}_{\boldsymbol{\beta}} \sum_{i=1}^n (y_i- \mathbf{x}_i\boldsymbol{\beta})^2 = \mathrm{argmin}_{\boldsymbol{\beta}} (\boldsymbol{y} - \mathbf{X}\boldsymbol{\beta})^\top(\boldsymbol{y}-\mathbf{X}\boldsymbol{\beta}), and the minimized value of the objective is the residual sum of squares \boldsymbol{e}^\top\boldsymbol{e}, where \boldsymbol{e} = \boldsymbol{y} - \mathbf{X}\hat{\boldsymbol{\beta}}.
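That \hat{\boldsymbol{\beta}} minimizes this quadratic objective can be illustrated by comparing the residual sum of squares at \hat{\boldsymbol{\beta}} with its value at random perturbations. This is a numerical sketch with simulated data; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.5]) + rng.normal(size=n)

def ssr(beta):
    """Residual sum of squares (y - X beta)^T (y - X beta)."""
    e = y - X @ beta
    return float(e @ e)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Any perturbation of beta_hat can only increase the objective
worse = [ssr(beta_hat + rng.normal(scale=0.1, size=p)) for _ in range(200)]
assert all(w >= ssr(beta_hat) for w in worse)
```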

The solution to the OLS problem has a dual interpretation in the column geometry, in which we treat the vector of stacked observations (y_1, \ldots, y_n)^\top (respectively the vertical distances (e_1, \ldots, e_n)^\top) as elements of \mathbb{R}^n. There, the response vector \boldsymbol{y} can be decomposed into fitted values \hat{\boldsymbol{y}} = \mathbf{H}_{\mathbf{X}}\boldsymbol{y} = \mathbf{X}\hat{\boldsymbol{\beta}} and residuals \boldsymbol{e} = \mathbf{M}_{\mathbf{X}}\boldsymbol{y} = \boldsymbol{y} - \mathbf{X}\hat{\boldsymbol{\beta}}, where \mathbf{H}_{\mathbf{X}} = \mathbf{X}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top is the hat (projection) matrix and \mathbf{M}_{\mathbf{X}} = \mathbf{I}_n - \mathbf{H}_{\mathbf{X}}. By construction, \boldsymbol{e} \perp \hat{\boldsymbol{y}}.
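The projection matrices \mathbf{H}_{\mathbf{X}} and \mathbf{M}_{\mathbf{X}} and the orthogonality of residuals and fitted values can be verified directly. The sketch below uses simulated data (sizes and seed are arbitrary assumptions for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 30, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# Hat matrix H_X projects onto col(X); M_X = I - H_X projects onto its
# orthogonal complement
H = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(n) - H

y_hat = H @ y   # fitted values
e = M @ y       # residuals

# Both matrices are idempotent, and the residuals are orthogonal
# to the fitted values
assert np.allclose(H @ H, H)
assert np.allclose(M @ M, M)
assert abs(e @ y_hat) < 1e-8
```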

We therefore get \boldsymbol{y} = \hat{\boldsymbol{y}} + \boldsymbol{e}, and since \hat{\boldsymbol{y}} and \boldsymbol{e} are orthogonal, they form a right-angled triangle with \boldsymbol{y}, so Pythagoras’ theorem can be used to show that \|\boldsymbol{y}\|^2 = \|\hat{\boldsymbol{y}}\|^2 + \|\boldsymbol{e}\|^2.
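This decomposition of squared norms can also be checked numerically. As before, the data below are simulated and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 25, 4
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat
e = y - y_hat

# Pythagoras: ||y||^2 = ||y_hat||^2 + ||e||^2
assert np.isclose(y @ y, y_hat @ y_hat + e @ e)
```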