Nonparametric maximum likelihood estimation for survival data
Léo Belzile
2023-11-09
Source: vignettes/nonparametric.Rmd
The longevity package includes an implementation of Turnbull’s EM algorithm, which computes the nonparametric maximum likelihood estimator of the distribution function for data subject to arbitrary censoring and truncation patterns.
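To give a sense of the kind of iteration involved, here is a minimal sketch of a self-consistency (EM) update for interval-censored data without truncation. It is not the package's implementation: the function name `turnbull_em_sketch` and the incidence matrix `alpha` are hypothetical, and the candidate support intervals are assumed to have been identified beforehand.

```r
# Minimal sketch (assumption: interval censoring only, no truncation).
# alpha is an n x m 0/1 matrix; alpha[i, j] = 1 if candidate support
# interval j lies inside the censoring interval of observation i.
turnbull_em_sketch <- function(alpha, tol = 1e-10, maxit = 10000L) {
  n <- nrow(alpha)
  m <- ncol(alpha)
  p <- rep(1 / m, m)  # start from equal masses on all candidate intervals
  for (it in seq_len(maxit)) {
    # Probability assigned to each observation's interval under current p
    denom <- as.vector(alpha %*% p)
    # E-step and M-step combined: average the conditional probabilities
    # that observation i falls in candidate interval j
    p_new <- colMeans(alpha * rep(p, each = n) / denom)
    converged <- max(abs(p_new - p)) < tol
    p <- p_new
    if (converged) break
  }
  p
}

# Toy data (made up): two observations pin down interval 1, one covers
# both intervals, and one pins down interval 2.
alpha <- rbind(c(1, 0), c(1, 0), c(1, 1), c(0, 1))
p_hat <- turnbull_em_sketch(alpha)
p_hat
```

In this toy case the likelihood is proportional to p1^2 * p2 with p1 + p2 = 1, so the iteration should settle near masses of 2/3 and 1/3.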
For example, consider the interval-censored data analysed in Lindsey and Ryan (1998). The vectors left and right give the lower and upper bounds of each censoring interval, respectively.
library(longevity)
left <- c(0,15,12,17,13,0,6,0,14,12,13,12,12,0,0,0,0,3,4,1,13,0,0,6,0,2,1,0,0,2,0)
right <- c(16, rep(Inf, 4), 24, Inf, 15, rep(Inf, 5), 18, 14, 17, 15,
           Inf, Inf, 11, 19, 6, 11, Inf, 6, 12, 17, 14, 25, 11, 14)
test <- np_elife(time = left,        # left bound for time
                 time2 = right,      # right bound for time
                 type = "interval2", # data are interval censored
                 event = 3)          # specify interval censoring, argument recycled
plot(test)
We can also extract the equivalence classes and compare them to those reported by Lindsey and Ryan (1998): they match the values given in the paper. The summary statistics reported by the print method include the restricted mean, which is computed as the area under the survival curve up to the stated upper bound.
test$xval
##      left right
## [1,]    4     6
## [2,]   13    14
## [3,]   15    16
## [4,]   17    18
print(test)
## Nonparametric maximum likelihood estimator
##
## Routine converged
## Number of equivalence classes: 4
## Restricted mean at upper bound 18 : 10.47143
## Quartiles of the survival function: 15.5 14 8
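The restricted mean is described above as the area under the survival curve. The following minimal sketch shows that calculation for a step survival function. The helper `restricted_mean_sketch` is hypothetical, the toy numbers are made up, and the sketch assumes that each probability mass sits at a single positive time point, which may differ from how the package treats mass spread over an equivalence class.

```r
# Minimal sketch: area under a right-continuous step survival curve on
# [0, upper]. Assumes strictly increasing, positive jump times.
restricted_mean_sketch <- function(times, probs, upper) {
  stopifnot(length(times) == length(probs),
            all(times > 0),
            all(diff(times) > 0))
  # Survival just after each jump: S(t_j) = 1 - sum of masses up to t_j
  surv <- 1 - cumsum(probs)
  # Only jumps strictly before the upper bound change the area
  keep <- times < upper
  knots <- c(0, times[keep], upper)  # segment endpoints
  heights <- c(1, surv[keep])        # value of S on each segment
  sum(diff(knots) * heights)
}

# Toy example: mass 0.5 at time 2 and mass 0.5 at time 5
rm_toy <- restricted_mean_sketch(times = c(2, 5), probs = c(0.5, 0.5), upper = 5)
rm_toy
```

Here the curve equals 1 on [0, 2) and 0.5 on [2, 5), so the area is 2 + 1.5 = 3.5.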
References
Lindsey, Jane C., and Louise M. Ryan. 1998. “Methods for Interval-Censored Data.” Statistics in Medicine 17 (2): 219–38. https://doi.org/10.1002/(SICI)1097-0258(19980130)17:2<219::AID-SIM735>3.0.CO;2-O.