10.2 Poisson model for contingency table
We analyze a \(4 \times 3\) contingency table containing information about tumour type.
The first factor is cancer type, with levels
- Hutchinson’s melanotic freckle,
- Superficial spreading melanoma,
- Nodular,
- Indeterminate type.
The second variable, site, is one of
1. Head and Neck,
2. Trunk,
3. Extremities
The data are counts, hence we proceed with the analysis using a Poisson likelihood. The model specification resembles that of ANOVA models with factors.
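As a sketch of what the models fitted below look like in symbols (the notation is an assumption introduced here, not taken from the text), let \(Y_{ij}\) denote the count for tumour type \(i\) and site \(j\), with mean \(\mu_{ij}\). The saturated log-linear model can then be written as

\[
Y_{ij} \sim \mathsf{Po}(\mu_{ij}), \qquad \log \mu_{ij} = \alpha + \beta_i + \gamma_j + (\beta\gamma)_{ij}, \qquad i = 1, \ldots, 4, \; j = 1, 2, 3,
\]

with the baseline constraints \(\beta_1 = \gamma_1 = 0\) and \((\beta\gamma)_{1j} = (\beta\gamma)_{i1} = 0\) implied by R's default treatment contrasts. This gives \(1 + 3 + 2 + 6 = 12\) free parameters, one per cell of the table. The additive model sets every \((\beta\gamma)_{ij}\) to zero, which amounts to assuming that tumour type and site are independent.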
# Create dataset
site <- gl(n = 3, k = 1, length = 12)
# gl generates levels of a factor
tumor <- gl(n = 4, k = 3) # each level repeated 3 times
cases <- c(22, 2, 10, 16, 54, 115, 19, 33, 73, 11, 17, 28)
cancer <- data.frame(tumor, site, cases)
# Four models: no effect, a single main effect (tumor or site), additive
cancer.m0 <- glm(cases ~ 1, family = poisson, data = cancer)
cancer.m1 <- glm(cases ~ tumor, family = poisson, data = cancer)
cancer.m2 <- glm(cases ~ site, family = poisson, data = cancer)
cancer.m3 <- glm(cases ~ tumor + site, family = poisson, data = cancer)
# Saturated model
cancer.m4 <- glm(cases ~ tumor * site, family = poisson, data = cancer)
# Analysis of deviance
# Same syntax as for GLM
drop1(cancer.m4)
## Single term deletions
##
## Model:
## cases ~ tumor * site
## Df Deviance AIC
## <none> 0.000 83.111
## tumor:site 6 51.795 122.906
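# Likelihood ratio test: additive versus saturated model
anova(cancer.m3, cancer.m4, test = "Chisq")
## Analysis of Deviance Table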
##
## Model 1: cases ~ tumor + site
## Model 2: cases ~ tumor * site
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 6 51.795
## 2 0 0.000 6 51.795 2.05e-09
# anova(cancer.m4) returns three sequential tests,
# but only the comparison with the additive model is justified
summary(cancer.m4)
##
## Call:
## glm(formula = cases ~ tumor * site, family = poisson, data = cancer)
##
## Deviance Residuals:
## [1] 0 0 0 0 0 0 0 0 0 0 0 0
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.0910 0.2132 14.498 < 2e-16
## tumor2 -0.3185 0.3286 -0.969 0.332432
## tumor3 -0.1466 0.3132 -0.468 0.639712
## tumor4 -0.6931 0.3693 -1.877 0.060511
## site2 -2.3979 0.7385 -3.247 0.001167
## site3 -0.7885 0.3814 -2.067 0.038701
## tumor2:site2 3.6143 0.7915 4.566 4.96e-06
## tumor3:site2 2.9500 0.7927 3.721 0.000198
## tumor4:site2 2.8332 0.8338 3.398 0.000679
## tumor2:site3 2.7608 0.4655 5.931 3.00e-09
## tumor3:site3 2.1345 0.4602 4.638 3.52e-06
## tumor4:site3 1.7228 0.5216 3.303 0.000957
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 2.9520e+02 on 11 degrees of freedom
## Residual deviance: -1.8652e-14 on 0 degrees of freedom
## AIC: 83.111
##
## Number of Fisher Scoring iterations: 3
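# P-value of the likelihood ratio test (additive versus saturated)
pchisq(deviance(cancer.m3), df = df.residual(cancer.m3), lower.tail = FALSE)
## [1] 2.050453e-09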
The likelihood ratio test soundly rejects the null hypothesis of no interaction, hence we would keep the saturated model. For such a model, the fitted values correspond to the observed counts.
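The two claims above can be checked by hand. The following is a minimal sketch that rebuilds the data used in this section; the object names `additive`, `saturated`, `lrt_stat`, `lrt_df`, `lrt_pval` and `fit_equals_obs` are introduced here for illustration only.

```r
# Rebuild the data exactly as in this section
site <- gl(n = 3, k = 1, length = 12)
tumor <- gl(n = 4, k = 3)
cases <- c(22, 2, 10, 16, 54, 115, 19, 33, 73, 11, 17, 28)
cancer <- data.frame(tumor, site, cases)

# Nested models: additive (no interaction) and saturated
additive <- glm(cases ~ tumor + site, family = poisson, data = cancer)
saturated <- glm(cases ~ tumor * site, family = poisson, data = cancer)

# Likelihood ratio statistic: difference in deviance between the nested models,
# compared to a chi-square with df equal to the number of interaction terms
lrt_stat <- deviance(additive) - deviance(saturated)
lrt_df <- df.residual(additive) - df.residual(saturated)
lrt_pval <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)

# The saturated model has one parameter per cell,
# so its fitted means reproduce the observed counts
fit_equals_obs <- isTRUE(all.equal(unname(fitted(saturated)), cases))
```

Since the saturated model has zero deviance, `lrt_stat` is simply the deviance of the additive model, which is why the statistic and the degrees of freedom match the `tumor:site` row of the `drop1` output.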
This is an example where the Poisson assumption of equal mean and variance does not seem to hold. Handling overdispersion is beyond the scope of this course.
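One common informal diagnostic, which is an assumption on our part and not necessarily the check the author had in mind, is the Pearson statistic of the additive model divided by its residual degrees of freedom. Values well above one suggest more variability than the Poisson model allows, although a large ratio can equally reflect a missing interaction. The names `pearson_stat` and `dispersion` are hypothetical.

```r
# Rebuild the data exactly as in this section
site <- gl(n = 3, k = 1, length = 12)
tumor <- gl(n = 4, k = 3)
cases <- c(22, 2, 10, 16, 54, 115, 19, 33, 73, 11, 17, 28)
cancer <- data.frame(tumor, site, cases)

# The saturated model has no residual degrees of freedom,
# so the diagnostic is computed for the additive model
additive <- glm(cases ~ tumor + site, family = poisson, data = cancer)

# Pearson chi-square statistic: sum of squared Pearson residuals
pearson_stat <- sum(residuals(additive, type = "pearson")^2)

# Informal dispersion estimate; close to 1 under a well-specified Poisson model
dispersion <- pearson_stat / df.residual(additive)
dispersion
```

This only flags a problem. It does not distinguish between lack of fit and genuine overdispersion, and it does not correct for either.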