Solution 1
Exercise 1.1
The linear mixed effects regression model specifies that the response vector for individual \(i\), \(\boldsymbol{Y}_i \in \mathbb{R}^k\), is Gaussian. The model includes a model matrix \(\mathbf{X}_i\) with fixed effect coefficients \(\boldsymbol{\beta}\), and a \(k\times l\) model matrix \(\mathbf{Z}_i\) for the random effects. The hierarchical formulation of the model is \[\begin{align*} \boldsymbol{Y}_i \mid \mathcal{B}_i=\boldsymbol{b}_i &\sim \mathsf{Gauss}_k(\mathbf{X}_i\boldsymbol{\beta} + \mathbf{Z}_i\boldsymbol{b}_i, \sigma^2 \mathbf{I}_k) \\ \mathcal{B}_i & \sim \mathsf{Gauss}_l(\boldsymbol{0}_l, \boldsymbol{\Omega}) \end{align*}\]
- Using the tower property, derive the marginal mean and covariance matrix of \(\boldsymbol{Y}_i\).
- Hence obtain the parameters of the joint distribution of \((\boldsymbol{Y}_i^\top, \mathcal{B}_i^\top)^\top.\)
Solution. Using the law of iterated expectation and the law of total variance, we obtain
\[\begin{align*} \mathsf{E}_{\boldsymbol{Y}_i}(\boldsymbol{Y}_i) & =\mathsf{E}_{\mathcal{B}_i}\left\{\mathsf{E}_{\boldsymbol{Y}_i \mid \mathcal{B}_i}(\boldsymbol{Y}_i)\right\} \\&= \mathsf{E}_{\mathcal{B}_i}(\mathbf{X}_i\boldsymbol{\beta} + \mathbf{Z}_i\mathcal{B}_i) \\&= \mathbf{X}_i\boldsymbol{\beta}, \end{align*}\] since \(\mathsf{E}_{\mathcal{B}_i}(\mathcal{B}_i)=\boldsymbol{0}_l,\) and
\[\begin{align*} \mathsf{Va}_{\boldsymbol{Y}_i}(\boldsymbol{Y}_i) & =\mathsf{Va}_{\mathcal{B}_i}\left\{\mathsf{E}_{\boldsymbol{Y}_i \mid \mathcal{B}_i}(\boldsymbol{Y}_i)\right\} + \mathsf{E}_{\mathcal{B}_i}\left\{\mathsf{Va}_{\boldsymbol{Y}_i \mid \mathcal{B}_i}(\boldsymbol{Y}_i)\right\} \\&= \mathsf{Va}_{\mathcal{B}_i}(\mathbf{X}_i\boldsymbol{\beta} + \mathbf{Z}_i\mathcal{B}_i) +\mathsf{E}_{\mathcal{B}_i}(\sigma^2\mathbf{I}_k) \\&= \mathbf{Z}_i\boldsymbol{\Omega}\boldsymbol{Z}_i^\top + \sigma^2\mathbf{I}_k. \end{align*}\]
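These moment calculations are easy to verify by simulation. The following is a minimal sketch in Python, where the dimensions, design matrices, \(\boldsymbol{\beta}\), \(\sigma\) and \(\boldsymbol{\Omega}\) are arbitrary toy values chosen for illustration, not quantities from the exercise.

```python
import numpy as np

rng = np.random.default_rng(2024)
k, l = 4, 2                      # illustrative dimensions (assumed)
X = rng.normal(size=(k, 3))      # arbitrary fixed effects design
Z = rng.normal(size=(k, l))      # arbitrary random effects design
beta = np.array([1.0, -0.5, 2.0])
sigma = 0.8
Omega = np.array([[1.0, 0.3],
                  [0.3, 0.5]])   # a positive definite covariance

# simulate from the hierarchical formulation
B = 100_000
b = rng.multivariate_normal(np.zeros(l), Omega, size=B)    # B_i ~ Gauss_l(0, Omega)
Y = X @ beta + b @ Z.T + sigma * rng.normal(size=(B, k))   # Y_i | B_i = b_i

# empirical moments against X beta and Z Omega Z^T + sigma^2 I_k
print(np.abs(Y.mean(axis=0) - X @ beta).max())
print(np.abs(np.cov(Y, rowvar=False) - (Z @ Omega @ Z.T + sigma**2 * np.eye(k))).max())
```

Both printed deviations should be close to zero, up to Monte Carlo error.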
Since the conditional distribution of \(\boldsymbol{Y}_i\) given \(\mathcal{B}_i\) and the marginal distribution of \(\mathcal{B}_i\) are both Gaussian, the product of their density functions is of the form \(\exp(-q/2)\), where \(q\) is a quadratic form in \(\boldsymbol{Y}_i\) and \(\mathcal{B}_i\); the joint distribution must therefore be multivariate Gaussian. As the latter is fully characterized by its mean vector and covariance matrix, it suffices to derive the covariance between \(\boldsymbol{Y}_i\) and \(\mathcal{B}_i\), which is the only missing piece of information. By definition,
\[\begin{align*} \mathsf{Co}(\boldsymbol{Y}_i,\mathcal{B}_i) & = \mathsf{E}_{\boldsymbol{Y}_i,\mathcal{B}_i}\left[\left\{\boldsymbol{Y}_i - \mathsf{E}_{\boldsymbol{Y}_i}(\boldsymbol{Y}_i)\right\}\left\{\mathcal{B}_i - \mathsf{E}_{\mathcal{B}_i}(\mathcal{B}_i)\right\}^\top\right] \\&= \mathsf{E}_{\boldsymbol{Y}_i,\mathcal{B}_i}\{(\boldsymbol{Y}_i - \mathbf{X}_i\boldsymbol{\beta})\mathcal{B}_i^\top\} \\&= \mathsf{E}_{\mathcal{B}_i}\left[\mathsf{E}_{\boldsymbol{Y}_i \mid \mathcal{B}_i}\{(\boldsymbol{Y}_i - \mathbf{X}_i\boldsymbol{\beta})\mathcal{B}_i^\top\}\right] \\&= \mathbf{Z}_i\mathsf{E}_{\mathcal{B}_i}(\mathcal{B}_i\mathcal{B}_i^\top) \\&=\mathbf{Z}_i\boldsymbol{\Omega} \end{align*}\]
and so we find \[\begin{align*} \begin{pmatrix}\boldsymbol{Y}_i \\ \mathcal{B}_i \end{pmatrix}\sim \mathsf{Gauss}_{k+l} \left\{ \begin{pmatrix} \mathbf{X}_i \boldsymbol{\beta} \\ \boldsymbol{0}_l \end{pmatrix}, \begin{pmatrix} \mathbf{Z}_i\boldsymbol{\Omega}\boldsymbol{Z}_i^\top + \sigma^2\mathbf{I}_k & \mathbf{Z}_i\boldsymbol{\Omega} \\ \boldsymbol{\Omega}\mathbf{Z}_i^\top & \boldsymbol{\Omega}\end{pmatrix}\right\}. \end{align*}\]
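The cross-covariance block \(\mathbf{Z}_i\boldsymbol{\Omega}\) can be checked the same way; in the sketch below (again with toy values), we take \(\mathbf{X}_i\boldsymbol{\beta} = \boldsymbol{0}_k\) without loss of generality, since the covariance does not involve the fixed effects.

```python
import numpy as np

rng = np.random.default_rng(1)
k, l = 4, 2
Z = rng.normal(size=(k, l))
Omega = np.array([[1.0, 0.3],
                  [0.3, 0.5]])
sigma = 0.8

B = 200_000
b = rng.multivariate_normal(np.zeros(l), Omega, size=B)
Y = b @ Z.T + sigma * rng.normal(size=(B, k))   # X beta = 0 w.l.o.g.

# empirical Cov(Y_i, B_i) against Z Omega
emp = (Y - Y.mean(axis=0)).T @ (b - b.mean(axis=0)) / (B - 1)
print(np.abs(emp - Z @ Omega).max())            # close to zero
```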
Exercise 1.2
Consider a simple random sample of size \(n\) from the Wald distribution, with density \[\begin{align*} f(y; \nu, \lambda) = \left(\frac{\lambda}{2\pi y^{3}}\right)^{1/2} \exp\left\{ - \frac{\lambda (y-\nu)^2}{2\nu^2y}\right\}\mathrm{I}(y > 0) \end{align*}\] for location \(\nu >0\) and shape \(\lambda>0.\) You may take as given that the expected value of the Wald distribution is \(\mathsf{E}(Y) = \nu.\)
Write down the likelihood and show that it can be written in terms of the sufficient statistics \(\sum_{i=1}^n y_i\) and \(\sum_{i=1}^n y_i^{-1}.\)
Solution. The log likelihood for an independent and identically distributed sample is, up to terms not depending on the parameters, \[\begin{align*} \ell(\nu, \lambda) \stackrel{\nu, \lambda}{\propto} \frac{n}{2} \ln(\lambda) - \frac{\lambda}{2\nu^2} \sum_{i=1}^n y_i + \frac{n\lambda}{\nu} - \frac{\lambda}{2} \sum_{i=1}^n \frac{1}{y_i} \end{align*}\] and we readily see that the model is an exponential family with sufficient statistics \(t_1(\boldsymbol{y}) = \sum_{i=1}^n y_i\) and \(t_2(\boldsymbol{y}) = \sum_{i=1}^n y_i^{-1}.\)
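The sufficient statistic reduction can be verified numerically: evaluating the reduced log likelihood from \((t_1, t_2)\) and adding back the dropped constant should recover the full log density. Below is a minimal sketch, assuming numpy and scipy are available; scipy's invgauss uses a (mu, scale) parameterization, and setting mu \(=\nu/\lambda\) with scale \(=\lambda\) recovers the Wald density above.

```python
import numpy as np
from scipy import stats

def loglik_suff(nu, lam, t1, t2, n):
    # log likelihood up to additive constants, as a function of
    # the sufficient statistics t1 = sum(y_i) and t2 = sum(1 / y_i)
    return n / 2 * np.log(lam) - lam * t1 / (2 * nu**2) + n * lam / nu - lam * t2 / 2

rng = np.random.default_rng(123)
nu, lam, n = 2.0, 1.5, 50   # arbitrary toy values
y = stats.invgauss.rvs(mu=nu / lam, scale=lam, size=n, random_state=rng)

t1, t2 = y.sum(), (1 / y).sum()
full = stats.invgauss.logpdf(y, mu=nu / lam, scale=lam).sum()
const = -0.5 * np.log(2 * np.pi * y**3).sum()   # the term dropped above
print(np.isclose(full, loglik_suff(nu, lam, t1, t2, n) + const))  # True
```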
We derive the score vector and observed information matrix, \[\begin{align*} U(\nu, \lambda) &= \begin{pmatrix} \frac{\partial \ell(\nu, \lambda)}{\partial \nu} \\\frac{\partial \ell(\nu, \lambda)}{\partial \lambda} \end{pmatrix} \\&= \begin{pmatrix} \frac{\lambda \sum_{i=1}^n y_i}{\nu^3} - \frac{n\lambda}{\nu^2} \\ \frac{n}{2\lambda} - \frac{\sum_{i=1}^n y_i}{2\nu^2} + \frac{n}{\nu} - \frac{\sum_{i=1}^n y_i^{-1}}{2} \end{pmatrix} \end{align*}\] and \[\begin{align*} j(\nu, \lambda) &= -\begin{pmatrix} \frac{\partial^2 \ell(\nu, \lambda)}{\partial \nu^2} & \frac{\partial^2 \ell(\nu, \lambda)}{\partial \nu \partial \lambda} \\ \frac{\partial^2 \ell(\nu, \lambda)}{\partial \lambda \partial \nu } & \frac{\partial^2 \ell(\nu, \lambda)}{\partial \lambda^2} \end{pmatrix} \\&= \begin{pmatrix} \frac{3\lambda \sum_{i=1}^n y_i}{\nu^4} - \frac{2n\lambda}{\nu^3} & -\frac{\sum_{i=1}^n y_i}{\nu^3} + \frac{n}{\nu^2} \\ -\frac{\sum_{i=1}^n y_i}{\nu^3} + \frac{n}{\nu^2} &\frac{n}{2\lambda^2} \end{pmatrix}. \end{align*}\] To compute the expected information, we take the random counterpart of the observed information and replace each \(Y_i\) by its expectation \(\mathsf{E}(Y_i)=\nu\); this is legitimate because \(j(\nu,\lambda)\) depends on the data only through \(\sum_{i=1}^n Y_i\), which has expectation \(n\nu\). This yields \[\begin{align*} i(\nu, \lambda) = \begin{pmatrix} \frac{n\lambda}{\nu^3} & 0 \\ 0 &\frac{n}{2\lambda^2} \end{pmatrix} \end{align*}\] and the maximum likelihood estimators of \(\nu\) and \(\lambda\) are asymptotically independent.
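As a sanity check on the derivatives, the analytic score can be compared with a numerical gradient of the log likelihood; the sketch below uses forward differences via scipy.optimize.approx_fprime at an arbitrary test point (toy values assumed).

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(7)
nu0, lam0, n = 2.0, 1.5, 200
y = stats.invgauss.rvs(mu=nu0 / lam0, scale=lam0, size=n, random_state=rng)
t1, t2 = y.sum(), (1 / y).sum()

def loglik(theta):
    nu, lam = theta
    return n / 2 * np.log(lam) - lam * t1 / (2 * nu**2) + n * lam / nu - lam * t2 / 2

def score(theta):
    nu, lam = theta
    return np.array([lam * t1 / nu**3 - n * lam / nu**2,
                     n / (2 * lam) - t1 / (2 * nu**2) + n / nu - t2 / 2])

theta = np.array([1.7, 1.2])  # arbitrary test point
num = optimize.approx_fprime(theta, loglik, 1e-6)
print(np.abs(num - score(theta)).max())  # small, up to finite difference error
```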