Is there a limit to human longevity?

Department of Mathematics and Statistics, Dalhousie University

Léo Belzile, HEC Montréal
Based on joint work with Anthony Davison, Jutta Gampe, Holger Rootzén, Dmitrii Zholud

1 / 40

Estimating human lifespan

The study of human longevity is full of pitfalls for the unwary...

The question raises several statistical problems revolving around

  • data quality
  • models
  • extrapolation

2 / 40

Glossary

  • Supercentenarian: person who lives beyond their 110th birthday.
  • Semi-supercentenarian: person who dies between their 105th and 110th birthdays.
  • Lifetime: life-length of an individual.
  • Lifespan: upper limit (if any) on distribution of lifetimes.
3 / 40

Why study longevity?

Statistical analysis needed to assess biological theories about

natural selection

mortality plateau

existence of
finite lifespan

Lots of interest in the news!

4 / 40

Exponential growth of mortality

It is believed that exponential growth of mortality with age (Gompertz law) is followed by a period of deceleration, with slower rates of mortality increase at older ages.

Recent studies found that the exponential increase of the mortality risk with age (the famous Gompertz law) continues even at extreme old ages in humans, rats, and mice, thus challenging traditional views about old-age mortality deceleration, mortality leveling-off, and late-life mortality plateaus.

Gavrilova & Gavrilov (2015), Journals of Gerontology: Biological Sciences

5 / 40

The Gompertz–Makeham model is extremely popular in demography and fits the distribution of lifetimes well at lower ages, up to 102-105 years.

Theory of senescence

Figure 3 of Pyrkov et al. (2021), Nature Communications, doi:10.1038/s41467-021-23014-1.

6 / 40

In Japan in 2016, there were 8167 male and 57 525 female centenarians.

  • rate of 5.1 per 10K in 2015

In Canada, there were 6116 female and 835 male centenarians.

  • rate of 1.9 per 10K

Between 1983 and 2009 in Quebec, a total of

  • 12 supercentenarians died
  • 321 semi-supercentenarians died

  • oldest living today: Soeur André (Lucile Randon, 118 years, 236 days);
  • oldest ever: Jeanne Calment (122 years, 164 days).

Data quality

7 / 40

Sampling

Information limited due to availability of historical records.

  • Validation is key

    • necronyms
    • record falsification
    • mistakes in data registers
  • Most databases (e.g., Gerontology Research Group) include self-reported records.

Opportunity samples

8 / 40

Shigechiyo Izumi was excluded from the records because he bore his dead brother's name, so he died aged 105 rather than 120.

In Japan, the first modern family registration system (Jinshin-KOSEKI) was established in 1872 and amended in 1886 by the Family Registration Law (Chapter 10 of Exceptional Lifespan).

Italy: example of a teenager, reportedly born in the 1800s, who died aged 13 but was initially recorded as having died aged 113.

Jeanne Calment, the controversy

9 / 40

Nikolai Zak's (conspiracy?) theory, published in Rejuvenation Research, is that Yvonne Calment took the place of her mother Jeanne to avoid inheritance taxes.

International Database on Longevity

To draw reliable conclusions, we need representative samples.

  • validated supercentenarians (110+) from 13 countries

  • plus (partly validated) semi-supercentenarians (105-109) for 9 countries

  • free of age-ascertainment bias

1081 validated supercentenarians

10 / 40
  • Nearly 50% of the records fail the validation check.
  • Random sample validated for semi-supercentenarians in France

Sampling mechanisms

Data are obtained by casting a net on the population of potential (semi)-supercentenarians.

  • for IDL, (only) supercentenarians in a country who died between dates c_1 and c_2.

  • records for the candidates are then individually validated.

11 / 40

Lexis diagram for interval truncation

Lexis diagrams showing the selection mechanism.

12 / 40

More complex truncation schemes!

Semi-supercentenarians (105-109) who died in a window (d_1, d_2), which may differ from the window (c_1, c_2) used for supercentenarians.

Lexis diagrams for IDL data with semi-supercentenarians and supercentenarians.

13 / 40

Lexis diagram for Italian data

Lexis diagrams for Istat data with semi-supercentenarians and supercentenarians.

14 / 40

Truncation can be hidden

  • Extinct cohort method: birth cohorts for which no death has been reported for X consecutive years.
  • Counts are cross-tabulated by year of birth, age and gender.

Annual Vital Statistics Report of Japan (Hanayama & Sibuya, 2014).

15 / 40

Why does it matter?

Ignoring truncation leads to underestimation of the survival probability: population increase and reduction in mortality at lower ages translate into a larger impact for later birth cohorts.

Impact of truncation on quantile-quantile plots (left) and maximum age by birth year (right).

16 / 40

Incorrect conclusions

Failing to account for truncation and increase in population.

17 / 40

Models

18 / 40

Survival analysis

Denote the lifetime T, a continuous random variable with distribution F, density f, lifespan t_F = \sup\{t: F(t) < 1\}, and survivor and hazard functions

\begin{align*} S(t) &= \Pr(T>t) =1-F(t), \\h(t) &= \frac{f(t)}{S(t)}, \quad t>0. \end{align*}

19 / 40

Poisson process

  • Suppose individuals independently reach age u_0 at calendar time x at rate \nu(x), and subsequently die at age t+u_0 with density f.
  • Events in \mathcal{C} = [c_1 , c_2] \times [u_0, \infty) follow a Poisson process of rate \begin{align*} \lambda(c, t) = \nu(c - t)f(t), \qquad c \in \mathbb{R}, t > 0 \end{align*} at calendar time c and excess lifetime t.
20 / 40

Poisson process

  • The lifetime density for dying in \mathcal{C} is \begin{align*} f_{\mathcal{C}}(t) \propto f(t)w_{\mathcal{C}}(t), \quad w_{\mathcal{C}}(t) = \int_{c_1 - t}^{c_2-t} \nu(x) \mathrm{d}x, \quad t>0. \end{align*} If the rate \nu is increasing, then w_{\mathcal{C}} is decreasing, so the distribution with density f_{\mathcal{C}} is stochastically smaller than that with density f.
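
As a worked illustration, assume (hypothetically; this form is not in the slides) that the arrival rate grows exponentially, \nu(x) = \exp(\gamma x) with \gamma > 0, and that f is exponential with scale \sigma. Then \begin{align*} w_{\mathcal{C}}(t) &= \int_{c_1-t}^{c_2-t} e^{\gamma x}\mathrm{d}x = \gamma^{-1}\left(e^{\gamma c_2}-e^{\gamma c_1}\right)e^{-\gamma t} \propto e^{-\gamma t}, \\ f_{\mathcal{C}}(t) &\propto \sigma^{-1}e^{-t/\sigma}e^{-\gamma t}, \end{align*} so lifetimes observed in \mathcal{C} would be exponential with mean \sigma/(1+\gamma\sigma) < \sigma. Under these assumptions, treating the truncated sample as if it came from f would understate survival.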
21 / 40

Likelihood contributions

The likelihood depends on \nu, hence consider the conditional likelihood, with contribution \begin{align*} \frac{f(t)}{F(b)-F(a)}, \quad a < t< b \end{align*} for interval-truncated data and, for left-truncated and right-censored data, \begin{align*} \frac{h(t)^\delta S(t)}{1-F(a)}, \quad t> a, \end{align*} where [a, b] = [\max\{0, c_1 - x\}, c_2 - x], and \delta = 1 if the death is observed and \delta = 0 if the lifetime is right-censored.
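
The two contributions can be sketched in code. This is a minimal illustration, assuming a generalized Pareto model for the excess lifetime; the function names are hypothetical and the code is not taken from the authors' software.

```python
import math

def gp_cdf(t, sigma, xi):
    """Generalized Pareto distribution function of the excess lifetime t."""
    if t <= 0:
        return 0.0
    if abs(xi) < 1e-12:                 # exponential special case
        return 1.0 - math.exp(-t / sigma)
    z = 1.0 + xi * t / sigma
    if z <= 0:                          # beyond the finite endpoint (xi < 0)
        return 1.0
    return 1.0 - z ** (-1.0 / xi)

def gp_logpdf(t, sigma, xi):
    """Log density of the generalized Pareto distribution."""
    if abs(xi) < 1e-12:
        return -math.log(sigma) - t / sigma
    z = 1.0 + xi * t / sigma
    if z <= 0:
        return -math.inf
    return -math.log(sigma) - (1.0 / xi + 1.0) * math.log(z)

def loglik_interval_truncated(t, a, b, sigma, xi):
    """log{ f(t) / (F(b) - F(a)) } for a < t < b."""
    return gp_logpdf(t, sigma, xi) - math.log(gp_cdf(b, sigma, xi) - gp_cdf(a, sigma, xi))

def loglik_ltrc(t, a, delta, sigma, xi):
    """log{ h(t)^delta S(t) / (1 - F(a)) } for t > a, with t below the endpoint.

    delta = 1 for an observed death, 0 for a right-censored lifetime.
    """
    log_S_t = math.log1p(-gp_cdf(t, sigma, xi))
    log_S_a = math.log1p(-gp_cdf(a, sigma, xi))
    log_h = gp_logpdf(t, sigma, xi) - log_S_t
    return delta * log_h + log_S_t - log_S_a

# Hypothetical values: excess lifetime 1.0 year, truncation bounds (0.5, 3.0)
print(loglik_interval_truncated(1.0, 0.5, 3.0, sigma=1.5, xi=-0.1))
print(loglik_ltrc(2.0, 0.5, delta=0, sigma=1.5, xi=0.0))
```

Summing these contributions over individuals gives a log-likelihood that can be passed to any numerical optimizer; neither term involves \nu.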

22 / 40

Models

Many models are popular in demography, many of them with an infinite endpoint.

  • exponential: h(t) = \sigma^{-1} for \sigma>0.
  • Gompertz–Makeham (1825, 1860): h(t) = \lambda + \sigma^{-1}\exp(\beta t/\sigma), \qquad \beta, \sigma>0, \lambda \geq 0.
  • Logistic (Thatcher, 1999): h(t) = \lambda + \frac{A\exp(\beta t/ \sigma)}{1+B\exp(\beta t/ \sigma)}, \qquad A>0, B \ge 0.
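
The three hazards listed above can be written as short functions. This is a sketch with hypothetical parameter values, chosen only to show the shapes: the Gompertz–Makeham hazard grows without bound, whereas the logistic hazard levels off at \lambda + A/B when B > 0.

```python
import math

def hazard_exponential(t, sigma):
    """Constant hazard 1/sigma (mortality plateau)."""
    return 1.0 / sigma

def hazard_gompertz_makeham(t, sigma, beta, lam=0.0):
    """h(t) = lam + exp(beta * t / sigma) / sigma."""
    return lam + math.exp(beta * t / sigma) / sigma

def hazard_logistic(t, sigma, beta, A, B, lam=0.0):
    """h(t) = lam + A * exp(beta * t / sigma) / (1 + B * exp(beta * t / sigma))."""
    e = math.exp(beta * t / sigma)
    return lam + A * e / (1.0 + B * e)

# Hypothetical parameter values, for illustration only
for t in (0.0, 5.0, 10.0, 50.0):
    gm = hazard_gompertz_makeham(t, sigma=2.0, beta=0.2)
    lo = hazard_logistic(t, sigma=2.0, beta=0.2, A=0.5, B=0.5)
    print(t, gm, lo)
```

With B = 0 and A = 1/\sigma the logistic hazard reduces to the Gompertz–Makeham hazard.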
23 / 40

Generalized Pareto distribution

Most records include only lifetimes above u_0 (threshold exceedances).

If a scaling function a_u exists such that, conditional on X > u, (X - u)/a_u has a non-degenerate limiting distribution as u increases to the upper endpoint, then (Pickands, 1975) \begin{align*} \frac{\Pr\{(X-u)/a_u > t\}}{\Pr(X >u)} \to \begin{cases} (1+\xi t/\sigma)_{+}^{-1/\xi}, & \xi \neq 0\\ \exp(-t/\sigma), & \xi = 0, \end{cases} \end{align*} where c_+ = \max\{c, 0\} for a real number c.

The unique nondegenerate limiting distribution for exceedances of a threshold u is generalized Pareto.

24 / 40

Penultimate approximation

At lower levels, the behaviour of the fitted model depends on the reciprocal hazard, r(t) = 1/h(t); under mild regularity conditions,

\xi = \lim_{t \to t_{F}} r'(t) and a pre-asymptotic shape is \xi_u = r'(u).

For example, the Gompertz model has \xi_u \nearrow 0: estimates of \xi tend to be negative.
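
A short derivation of the Gompertz example, taking the Gompertz–Makeham hazard with \lambda = 0 (an assumption made here for simplicity): \begin{align*} h(t) &= \sigma^{-1}\exp(\beta t/\sigma), \\ r(t) &= \sigma\exp(-\beta t/\sigma), \\ \xi_u &= r'(u) = -\beta\exp(-\beta u/\sigma) < 0. \end{align*} Since \xi_u increases to 0 as u \to \infty, the limiting shape is \xi = 0, but the penultimate shape is negative at any finite threshold.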

25 / 40

The speed of convergence is quite fast, so we would expect the exceedances to be well approximated by an exponential distribution.

Threshold stability

A key property of the generalized Pareto distribution is threshold stability.

  • can extrapolate behaviour of F at higher levels
  • useful for choosing u in applications
    • fit model at multiple thresholds u_1 < \cdots < u_k.
    • check whether shape \xi agrees over range.
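
Threshold stability can be checked numerically: if exceedances of u are generalized Pareto with scale \sigma and shape \xi, then exceedances of a higher threshold u + v are generalized Pareto with scale \sigma + \xi v and the same shape. The sketch below uses hypothetical parameter values and function names.

```python
import math

def gp_survival(t, sigma, xi):
    """Generalized Pareto survival function for an excess t >= 0."""
    if abs(xi) < 1e-12:
        return math.exp(-t / sigma)
    z = max(1.0 + xi * t / sigma, 0.0)
    return z ** (-1.0 / xi)

def conditional_survival(t, v, sigma, xi):
    """Pr(X - u > v + t | X - u > v) when X - u is generalized Pareto."""
    return gp_survival(v + t, sigma, xi) / gp_survival(v, sigma, xi)

sigma, xi = 1.5, -0.1      # hypothetical scale and shape above threshold u
v = 2.0                    # raise the threshold from u to u + v
sigma_v = sigma + xi * v   # scale implied at the higher threshold

for t in (0.5, 1.0, 3.0):
    lhs = conditional_survival(t, v, sigma, xi)
    rhs = gp_survival(t, sigma_v, xi)
    print(t, lhs, rhs)     # the two columns agree up to rounding
```

In applications the check runs the other way: the shape estimates obtained at u_1 < \cdots < u_k should agree, up to sampling variability, once the threshold is high enough.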
26 / 40

Lack of (threshold) stability

Threshold stability plots for France and Italy (left), and Netherlands (right).

27 / 40

How good is the approximation?

Quantile-quantile plots with 95% pointwise and simultaneous bands (left) and conditional cumulative hazard (right) for Istat.

28 / 40

Bootstrap estimates are obtained by conditioning on truncation times and birth dates. New observations are simulated from doubly truncated distributions.
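
One way to simulate from a doubly truncated distribution is inversion: draw U uniformly on (0, 1) and return F^{-1}[F(a) + U\{F(b) - F(a)\}]. The sketch below assumes a generalized Pareto model with hypothetical parameters and truncation bounds; it illustrates the idea and is not the authors' implementation.

```python
import math
import random

def gp_cdf(t, sigma, xi):
    """Generalized Pareto distribution function for an excess t >= 0."""
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-t / sigma)
    z = max(1.0 + xi * t / sigma, 0.0)
    return 1.0 - z ** (-1.0 / xi)

def gp_quantile(p, sigma, xi):
    """Generalized Pareto quantile function for 0 <= p < 1."""
    if abs(xi) < 1e-12:
        return -sigma * math.log1p(-p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)

def sample_doubly_truncated(a, b, sigma, xi, rng):
    """Draw one excess lifetime conditional on falling in the interval (a, b)."""
    Fa, Fb = gp_cdf(a, sigma, xi), gp_cdf(b, sigma, xi)
    return gp_quantile(Fa + rng.random() * (Fb - Fa), sigma, xi)

rng = random.Random(2022)
# Hypothetical truncation bounds (a_i, b_i), held fixed across bootstrap replicates
bounds = [(0.0, 3.5), (1.2, 4.0), (0.4, 2.1)]
boot = [sample_doubly_truncated(a, b, 1.5, -0.05, rng) for a, b in bounds]
print(boot)
```

Each replicate keeps the individual bounds (a_i, b_i) and only redraws the lifetimes, which matches conditioning on truncation times and birth dates.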

Accounting for interval truncation

The plotting position for the x-axis of the Q-Q plot for observation y_i is \begin{align*} F_0^{-1}\left[F_0(a_i) +\left\{ F_0(b_i)-F_0(a_i) \right\} \frac{F_n(y_i) - F_n(a_i)}{F_n(b_i)-F_n(a_i)}\right], \end{align*} where

  • F_0 is the postulated (i.e., fitted) parametric distribution,
  • F_0^{-1} is the corresponding quantile function,
  • F_n is the NPMLE of the distribution function (Turnbull, 1976).
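
The plotting position can be written as a small function taking the three ingredients as callables. The sketch below does not compute the NPMLE; as a sanity check it plugs in F_n = F_0 (an artificial choice made only for this example), in which case the position reduces to y_i itself.

```python
import math

def qq_plotting_position(y, a, b, F0, F0_inv, Fn):
    """x-axis position for observation y with truncation bounds (a, b)."""
    frac = (Fn(y) - Fn(a)) / (Fn(b) - Fn(a))
    return F0_inv(F0(a) + (F0(b) - F0(a)) * frac)

# Hypothetical fitted model: exponential with scale 1.5
sigma = 1.5
F0 = lambda t: 1.0 - math.exp(-t / sigma)
F0_inv = lambda p: -sigma * math.log1p(-p)

# With Fn equal to F0, the plotting position is the observation itself
pos = qq_plotting_position(1.2, 0.3, 4.0, F0, F0_inv, Fn=F0)
print(pos)
```

In practice Fn would be a step function estimated from the truncated sample, so the positions deviate from the observations and the Q-Q plot shows the lack of fit, if any.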

Censored observations not displayed.

29 / 40

Flexibility is key

We fit a semiparametric hazard function \begin{align*} h(t) = \{\sigma + \xi t + g(t)\}^{-1}_{+}, \end{align*} where g(t) is a cubic regression spline with g(t) \to 0 as t \to t_{F}:

  • generalizes generalized Pareto model
  • reduces to parametric model in upper tail
  • equispaced knots
30 / 40

Left: figure obtained with bshazard for left-truncated right-censored data.

Right: data are discretized into daily bins; for interpretability, use the cumulative hazard H(t) = \sum_{z=1}^{t} h(z)/365, so the survival function is \exp\{-H(t)\}.
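
The daily-bin calculation can be sketched as follows, with the hazard expressed per year and evaluated on each day. The constant hazard of 0.7 per year is a hypothetical value chosen for illustration.

```python
import math

def survival_from_daily_hazard(hazard_per_year, days):
    """exp{-H(t)} with H(t) = sum_{z=1}^{t} h(z) / 365 and t counted in days."""
    H = sum(hazard_per_year[:days]) / 365.0
    return math.exp(-H)

# Hypothetical constant hazard of 0.7 per year on each of 5 * 365 daily bins
h = [0.7] * (5 * 365)
one_year = survival_from_daily_hazard(h, 365)
print(one_year, math.exp(-0.7))   # the two values agree up to rounding
```

Dividing by 365 keeps h on the yearly scale, so a plateau in the plot reads directly as a yearly hazard.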

Let the tail speak for itself!

Nonparametric hazard (left) and semiparametric generalized Pareto (right).

The semiparametric estimator suggests a wide range of plausible behaviour, including constant risk.

31 / 40

Take home messages

  • cannot use low thresholds for extrapolation.
  • goodness-of-fit diagnostics suggest generalized Pareto model fits well.
  • hazard doesn't stabilize until about 108 years.
  • shape estimates suggest a decrease of the risk above.

32 / 40

Is there a finite lifespan?

Mathematically speaking, is t_F=\sup\{t: F(t)<1\} = \infty?

Hard to convey to the average reader:

  • t_F=\infty does not imply immortality.
  • \Pr(T > u) < \varepsilon does not imply finite lifespan t_F < u.

The answer may be in the model.

  • Gompertz–Makeham has no finite right endpoint: t_F =\infty.
  • exponential model has a constant hazard (plateau of mortality).
  • the generalized Pareto implies a lifespan of u-\sigma/\xi if \xi < 0.
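
The endpoint formula follows because (1+\xi t/\sigma)_{+} vanishes at excess t = -\sigma/\xi when \xi < 0. A minimal sketch, with hypothetical parameter values and a hypothetical function name:

```python
import math

def gp_lifespan(u, sigma, xi):
    """Upper endpoint implied by a generalized Pareto fit above threshold u."""
    if xi < 0:
        return u - sigma / xi   # finite lifespan
    return math.inf             # xi >= 0: no finite endpoint

# Hypothetical fit above 110 years
print(gp_lifespan(110.0, 2.0, -0.125))  # 126.0
print(gp_lifespan(110.0, 2.0, 0.0))     # inf
```

Small changes in a shape estimate near zero move the implied endpoint a great deal, which is one reason endpoint estimates come with wide uncertainty.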
33 / 40

Extrapolation

34 / 40

Is the lifetime distribution bounded?

Profile likelihood for endpoint for various countries and three thresholds.

35 / 40

Human lifespan

No discernible differences between

  • earlier and later birth cohorts,
  • countries,
  • men and women, except that after age 108 French men have lower survival.

Not to be confused with the gender imbalance due to the lower survival of men.

36 / 40

You have no power in here!

The power of a likelihood ratio test for detecting a finite endpoint (obtained by simulating records with a generalized Pareto distribution with lifespan t_F) is high: based on France/Italy/IDL data (2016 version),

  • 125 years: combined power of 97%;
  • 130 years: combined power of 83%;
  • 135 years: combined power of 66%.

Suggests that the human lifespan lies well beyond any lifetime yet observed.

37 / 40

Huge uncertainty

Japanese (unvalidated) data are interval-censored and right-truncated

Posterior credible intervals by threshold (left) and sampling distribution with(out) rounding (right).

38 / 40

Supercentenarians [don't] live forever...

The exponential distribution estimated above 110 years for IDL implies a one-year survival probability of 0.5 (0.46, 0.53): a coin toss.

Surviving until 130 years conditional on surviving until 110 years

  • is equivalent to obtaining 20 heads in a row,
  • a less than one-in-a-million chance...
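
The arithmetic behind the coin-toss analogy, assuming a constant one-year survival probability of 0.5 beyond age 110:

```python
p_year = 0.5                 # one-year survival probability (a fair coin)
years = 130 - 110            # number of further years to survive
p_130 = p_year ** years      # probability of 20 heads in a row

print(p_130)                 # 9.5367431640625e-07
print(p_130 < 1e-6)          # True
```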

The anticipated increase in the number of supercentenarians makes it possible to observe age 130, but a higher record is highly unlikely (Pearce & Raftery 2021).

39 / 40

References

  • Léo R. Belzile, Anthony C. Davison, Jutta Gampe, Holger Rootzén and Dmitrii Zholud (2022). Is there a cap on longevity? A statistical review, Annual Review of Statistics and Its Application, 9, 21–45, doi:10.1146/annurev-statistics-040120-025426.
  • Léo R. Belzile and Anthony C. Davison (2022). Improved inference for risk measures of univariate extremes, Annals of Applied Statistics, 16(3), 1524–1549, doi:10.1214/21-AOAS1555.
  • Léo R. Belzile, Anthony C. Davison, Holger Rootzén and Dmitrii Zholud (2021). Human mortality at extreme age, Royal Society Open Science, 8, doi:10.1098/rsos.202097.
40 / 40
