4.1 Confidence and prediction intervals

In the linear model with IID errors \(\boldsymbol{\varepsilon}\sim \mathsf{IID}(0, \sigma^2)\), we have \(\mathsf{Var}(\hat{\boldsymbol{\beta}}) = \sigma^2(\mathbf{X}^\top\mathbf{X})^{-1}\). The standard errors for \(\hat{\boldsymbol{\beta}}\) are then simply the square roots of the diagonal entries of this matrix (which are the variances \(\mathsf{Var}(\hat{\beta}_j)\) for \(j=1, \ldots, p\)), with \(\sigma^2\) replaced by the usual estimator \(s^2\). The 95% confidence intervals for the coefficients are given by \(\hat{\beta}_j \pm t_{n-p}({0.025})\mathsf{se}(\hat{\beta}_j)\), where \(t_{n-p}(0.025)\) is the critical value of the Student-\(t\) distribution with \(n-p\) degrees of freedom that leaves probability 0.025 in each tail.
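The calculation above can be sketched in R as follows. The printed output in this section suggests a regression of mpg on wt; the model call lm(mpg ~ wt, data = mtcars) and every object name other than ols are assumptions made for illustration.

```r
# Assumption: the fitted model is mpg ~ wt on the built-in mtcars data
ols <- lm(mpg ~ wt, data = mtcars)

X <- model.matrix(ols)               # n x p design matrix
n <- nrow(X)
p <- ncol(X)
s2 <- sum(resid(ols)^2) / (n - p)    # usual estimator s^2 of sigma^2

# Standard errors: square roots of the diagonal of s^2 (X'X)^{-1}
se_beta <- sqrt(diag(s2 * solve(crossprod(X))))

# Two-sided 95% critical value of the Student-t with n - p degrees of freedom
tq <- qt(0.975, df = n - p)
ci_manual <- cbind(coef(ols) - tq * se_beta,
                   coef(ols) + tq * se_beta)

# Compare with the built-in functions (dimnames dropped before comparing)
isTRUE(all.equal(unname(se_beta), unname(sqrt(diag(vcov(ols))))))
isTRUE(all.equal(unname(ci_manual), unname(confint(ols))))
```

Both comparisons use all.equal, which checks equality up to numerical tolerance rather than exact floating-point equality.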

We can also draw intervals around the regression line by considering combinations \(\mathbf{x} = (1, \texttt{wt})\) for different values of \(\texttt{wt}\) as illustrated below. The reasoning is similar, except that we now obtain the interval for a linear function of \(\widehat{\boldsymbol{\beta}}\). For each new vector of regressors \(\mathbf{x}^i\), we get a new fitted value \(\hat{y}^i= \mathbf{x}^{i\top}\widehat{\boldsymbol{\beta}}\) whose variance is, by the rule for the variance of a linear combination, \(\mathsf{Var}(\hat{y}^i)=\mathbf{x}^{i\top}\mathsf{Var}(\widehat{\boldsymbol{\beta}})\mathbf{x}^i={\sigma^2}\mathbf{x}^{i\top}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{x}^i\). We replace \(\sigma^2\) by the usual estimator \(s^2\) and thus the pointwise confidence interval is based on the usual Student-\(t\) quantile, with this time \[\hat{y}^i \pm t_{n-p}({0.025})\mathsf{se}(\hat{y}^i) = \mathbf{x}^{i\top}\widehat{\boldsymbol{\beta}} \pm t_{n-p}({0.025})\sqrt{s^2\mathbf{x}^{i\top}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{x}^i}.\]

For the prediction interval, we consider instead \[ \mathbf{x}^{i\top}\widehat{\boldsymbol{\beta}} \pm t_{n-p}({0.025})\sqrt{s^2 \left[ 1 + \mathbf{x}^{i\top}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{x}^i\right]}.\] Provided the model is correct, a new observation \(y_{\mathrm{new}}\) should fall within the reported prediction interval 19 times out of 20 on average.

As we move away from the bulk of the data (the average value of \(\mathbf{x}\)), the hyperbolic shape of the intervals becomes visible. Note how the prediction interval is necessarily wider than the confidence interval: it accounts for the variance \(\sigma^2\) of the new error term in addition to the uncertainty in the fitted value (iterated variance formula).
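To make the shape of the bands concrete, the following sketch compares interval widths over a grid of regressor values. It assumes the model is lm(mpg ~ wt, data = mtcars); the grid and the object names are illustrative choices.

```r
# Assumption: the fitted model is mpg ~ wt on the built-in mtcars data
ols <- lm(mpg ~ wt, data = mtcars)

# Grid of weights spanning the observed range (illustrative choice)
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 101))

conf <- predict(ols, newdata = grid, interval = "confidence")
pred <- predict(ols, newdata = grid, interval = "prediction")

width_conf <- conf[, "upr"] - conf[, "lwr"]
width_pred <- pred[, "upr"] - pred[, "lwr"]

# The confidence band is narrowest at the grid point closest to mean(wt)
grid$wt[which.min(width_conf)]
mean(mtcars$wt)

# The prediction band is wider than the confidence band at every grid point
all(width_pred > width_conf)
```

Plotting width_conf against grid$wt would show the widths growing on either side of the sample mean.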

## [1] TRUE
## [1] TRUE
##                 2.5 %    97.5 %
## (Intercept) 33.450500 41.119753
## wt          -6.486308 -4.202635

The function predict takes as input a data.frame object whose column names match those of the explanatory variables in the fitted lm object. These names can be obtained from names(ols$model)[-1].
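A minimal usage sketch, assuming ols was fitted as lm(mpg ~ wt, data = mtcars); the data frame newdat and its values are hypothetical and chosen only for illustration.

```r
# Assumption: the fitted model is mpg ~ wt on the built-in mtcars data
ols <- lm(mpg ~ wt, data = mtcars)

# Names of the explanatory variables that newdata must contain
vars <- names(ols$model)[-1]
vars

# New regressor values (illustrative), stored under the matching column name
newdat <- data.frame(wt = c(2.5, 3, 3.5))

# Each call returns a matrix with columns fit, lwr and upr
ci <- predict(ols, newdata = newdat, interval = "confidence", level = 0.95)
pi <- predict(ols, newdata = newdat, interval = "prediction", level = 0.95)
ci
pi
```

If a column in newdata is named differently from the variable in the model formula, predict cannot find that variable, so it is worth checking the names first.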

As usual, we can verify that we get the same results when computing the intervals manually.
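One way to carry out this check is sketched below, again assuming the model lm(mpg ~ wt, data = mtcars) and hypothetical new values of wt. The intervals are built from the formulas of this section and compared with the output of predict.

```r
# Assumption: the fitted model is mpg ~ wt on the built-in mtcars data
ols <- lm(mpg ~ wt, data = mtcars)
newdat <- data.frame(wt = c(2.5, 3, 3.5))   # illustrative values

X <- model.matrix(ols)
n <- nrow(X)
p <- ncol(X)
s2 <- sum(resid(ols)^2) / (n - p)           # estimator s^2 of sigma^2
XtXinv <- solve(crossprod(X))               # (X'X)^{-1}

Xnew <- cbind(1, newdat$wt)                 # each row is a vector (1, wt)
fit <- drop(Xnew %*% coef(ols))             # fitted values
# Quadratic form x'(X'X)^{-1}x, computed for each row of Xnew
quad <- rowSums((Xnew %*% XtXinv) * Xnew)
tq <- qt(0.975, df = n - p)

conf_manual <- cbind(fit,
                     fit - tq * sqrt(s2 * quad),
                     fit + tq * sqrt(s2 * quad))
pred_manual <- cbind(fit,
                     fit - tq * sqrt(s2 * (1 + quad)),
                     fit + tq * sqrt(s2 * (1 + quad)))

# Compare with predict (dimnames dropped before comparing)
isTRUE(all.equal(unname(conf_manual),
                 unname(predict(ols, newdat, interval = "confidence"))))
isTRUE(all.equal(unname(pred_manual),
                 unname(predict(ols, newdat, interval = "prediction"))))
```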