3.1 Revisiting the interpretation of the parameters of a linear model

Geometrically, the linear model \boldsymbol{y} = \mathbf{X} \boldsymbol{\beta} + \text{residuals} corresponds to the orthogonal projection of \boldsymbol{y} onto the column span of \mathbf{X} and gives the line of best fit in that space.

It is perhaps easiest to visualize the two-dimensional case, when \mathbf{X} = (\mathbf{1}_n, \mathbf{x}_1) is an n \times 2 design matrix and \mathbf{x}_1 is a continuous covariate. In this case, the components of the coefficient vector \boldsymbol{\beta}=(\beta_0, \beta_1)^\top represent, respectively, the intercept and the slope.

If \mathbf{X} = \mathbf{1}_n, the model consists only of an intercept, which is interpreted as the mean level. Indeed, the projection matrix corresponding to \mathbf{1}_n, \mathbf{H}_{\mathbf{1}_n}, is a matrix whose entries are all identically n^{-1}. The fitted values of this model are thus all equal to the sample mean \bar{y} of \boldsymbol{y}, and the residuals are the centred values \boldsymbol{y}-\mathbf{1}_n \bar{y}, whose mean is zero.
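These two facts can be checked numerically on simulated data (a minimal sketch; the variable names and the random data are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
y = rng.normal(size=n)
ones = np.ones((n, 1))

# Hat matrix of the column of ones: H = 1 (1'1)^{-1} 1' = 1 1' / n,
# so every entry equals 1/n.
H = ones @ np.linalg.inv(ones.T @ ones) @ ones.T
assert np.allclose(H, np.full((n, n), 1 / n))

# Fitted values are all ybar; residuals sum (hence average) to zero.
fitted = H @ y
assert np.allclose(fitted, y.mean())
assert np.isclose((y - fitted).sum(), 0.0)
```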

More generally, for \mathbf{X} an n \times p design matrix, the interpretation is as follows: a unit increase in \mathrm{x}_{ij} (\mathrm{x}_{ij} \mapsto \mathrm{x}_{ij}+1) leads to a change of \beta_j units in y_i (y_i \mapsto y_i+\beta_j), other things being held constant. Beware of models with higher order polynomials and interactions: if for example one is interested in the coefficient for \mathbf{x}_j, but \mathbf{x}_j^2 is also a column of the design matrix, then a one-unit change in \mathrm{x}_{ij} will not change the mean response by \beta_j, because \mathrm{x}_{ij}^2 changes as well!
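The quadratic caveat can be made concrete: if the mean response is \beta_0 + \beta_1 x + \beta_2 x^2, then increasing x by one changes the mean by \beta_1 + \beta_2(2x+1), not by \beta_1 alone. A small numerical check with made-up coefficient values:

```python
import numpy as np

# Hypothetical coefficients for a model with both x and x^2 as columns.
beta0, beta1, beta2 = 1.0, 2.0, 0.5

def mean_response(x):
    return beta0 + beta1 * x + beta2 * x**2

x = 3.0
change = mean_response(x + 1) - mean_response(x)

# The change depends on the starting value x via the quadratic term.
assert np.isclose(change, beta1 + beta2 * (2 * x + 1))
assert not np.isclose(change, beta1)
```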

The Frisch–Waugh–Lovell (FWL) theorem says the coefficient \boldsymbol{\beta}_2 in the regression \boldsymbol{y} =\mathbf{X}_1\boldsymbol{\beta}_1 + \mathbf{X}_2\boldsymbol{\beta}_2 + \boldsymbol{\varepsilon} coincides with that of the regression \mathbf{M}_1\boldsymbol{y} =\mathbf{M}_1 \mathbf{X}_2\boldsymbol{\beta}_2 + \boldsymbol{\varepsilon}, where \mathbf{M}_1 = \mathbf{I}_n - \mathbf{X}_1(\mathbf{X}_1^\top\mathbf{X}_1)^{-1}\mathbf{X}_1^\top is the projection matrix onto the orthogonal complement of the span of \mathbf{X}_1. This can be useful to disentangle the effect of one variable from the others.
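The theorem is easy to verify numerically: fit the full regression, then fit the residualized regression and compare the coefficients (a sketch on simulated data; all names and dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + covariate
X2 = rng.normal(size=(n, 2))
y = rng.normal(size=n)

# Full regression of y on (X1, X2); keep the coefficients on X2.
X = np.hstack([X1, X2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
beta2_full = beta[X1.shape[1]:]

# FWL: regress M1 y on M1 X2, where M1 projects onto span(X1)'s complement.
M1 = np.eye(n) - X1 @ np.linalg.inv(X1.T @ X1) @ X1.T
beta2_fwl = np.linalg.lstsq(M1 @ X2, M1 @ y, rcond=None)[0]

assert np.allclose(beta2_full, beta2_fwl)
```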

The intercept coefficient does not correspond to the mean of \boldsymbol{y} unless the other variables in the design matrix have been centred (meaning they have mean zero). Otherwise, the coefficient \beta_0 associated to the intercept is nothing but the mean level of y when all the other variables are set to zero. Adding new variables affects the estimates of the coefficient vector \boldsymbol{\beta}, unless the new variables are orthogonal to the existing columns.
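The orthogonality claim can be checked directly: residualizing a random vector against the existing design yields a column orthogonal to it, and appending that column leaves the earlier coefficient estimates unchanged (a sketch on simulated data; the construction of `z_orth` is one convenient way to manufacture an orthogonal column):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)

# Build a new column orthogonal to span(X) by residualizing a random vector.
z = rng.normal(size=n)
H = X @ np.linalg.inv(X.T @ X) @ X.T
z_orth = z - H @ z
assert np.allclose(X.T @ z_orth, 0.0)

# The coefficients on the original columns are unaffected by the addition.
beta_old = np.linalg.lstsq(X, y, rcond=None)[0]
beta_new = np.linalg.lstsq(np.hstack([X, z_orth[:, None]]), y, rcond=None)[0]
assert np.allclose(beta_old, beta_new[:2])
```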