These functions estimate the extremal coefficient using an approximate sample from the Frechet distribution.
Arguments
- dat: an n by D matrix of unit Frechet observations
- coord: an optional d by D matrix of location coordinates
- thresh: threshold parameter (default is to keep all data if prob = 0)
- estimator: string indicating which model estimates to compute, one of smith, schlather or fmado
- standardize: logical; should observations be transformed to unit Frechet scale? Default is to transform
- method: string indicating which method to use to transform the margins. See Details
- prob: probability of not exceeding the threshold thresh
- plot: logical; should the cloud of pairwise empirical estimates be plotted? Defaults to TRUE
- ...: additional parameters passed to the function, currently ignored
Value
an invisible list with the vector dist if coord is non-null (otherwise a matrix of pairwise indices ind), the vector of pairwise estimates extcoef and the supplied estimator, as well as fmado and binned. If estimator == "schlather", an additional two-column matrix binned contains the binned distances h and the corresponding binned extremal coefficient estimates.
Details
The Smith estimator: suppose \(Z(x)\) is a simple max-stable vector (i.e., with unit Frechet marginals). Then \(1/Z\) is unit exponential and \(1/\max\{Z(s_1), Z(s_2)\}\) is exponential with rate equal to the pairwise extremal coefficient \(\theta(s_1, s_2)\). The extremal coefficient for the pair can therefore be estimated by the reciprocal of the sample mean of \(1/\max\{Z(s_1), Z(s_2)\}\).
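As an illustration, a minimal sketch of the pairwise Smith estimate (the helper below is hypothetical and is not the internal code of the package; z1 and z2 are unit Frechet observations at two sites):
smith_pair <- function(z1, z2) {
  # 1/max{Z(s1), Z(s2)} is exponential with rate theta, so invert the sample mean
  1 / mean(1 / pmax(z1, z2))
}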
The Schlather and Tawn estimator: the log-likelihood of the naive estimator for a pair
of sites \(A\) is
$$ \mathrm{card}\left\{ j: \max_{i \in A} (X_i^{(j)}\bar{X}_i)>z \right\}
\log(\theta_A) - \theta_A \sum_{j=1}^n \left[ \max \left\{z, \max_{i \in A}
(X_i^{(j)}\bar{X}_i)\right\}\right]^{-1},$$
where \(\bar{X}_i = n^{-1} \sum_{j=1}^n 1/X_i^{(j)}\) is the harmonic mean and \(z\)
is a threshold on the unit Frechet scale.
The search for the maximum likelihood estimate for every pair \(A\)
is restricted to the interval \([1,3]\). A binned version of the extremal coefficient cloud is also returned.
The Schlather estimator is not self-consistent. The Schlather and Tawn estimator includes the Smith estimator as a special case
if the data are not censored (prob = 0
) and the observations are not standardized by their harmonic mean.
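A minimal sketch of the censored pairwise likelihood maximization described above (the helper is hypothetical and only illustrates the estimating equation; x1 and x2 are unit Frechet columns and z is a threshold on the Frechet scale):
schlather_pair <- function(x1, x2, z = 0) {
  # standardize each site by the mean of the reciprocals (cf. the harmonic mean above)
  m <- pmax(x1 * mean(1 / x1), x2 * mean(1 / x2))
  nllh <- function(theta) {
    -(sum(m > z) * log(theta) - theta * sum(1 / pmax(z, m)))
  }
  # search restricted to the interval [1, 3], as described above
  optimize(nllh, interval = c(1, 3))$minimum
}
With z = 0 and no harmonic-mean standardization this reduces to the Smith estimate, consistent with the special case noted above.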
The F-madogram estimator is a nonparametric estimate based on a stationary process \(Z\); the extremal coefficient satisfies $$\theta(h)=\frac{1+2\nu(h)}{1-2\nu(h)},$$ where $$\nu(h) = \frac{1}{2} \mathsf{E}\left[|F(Z(s+h))-F(Z(s))|\right]$$ is the F-madogram. The implementation only uses complete pairs to calculate the relative ranks.
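For illustration, a hypothetical pairwise version of the F-madogram estimator based on relative ranks of complete pairs could read:
fmado_pair <- function(x1, x2) {
  keep <- complete.cases(x1, x2)      # only complete pairs enter the ranks
  n <- sum(keep)
  f1 <- rank(x1[keep]) / (n + 1)      # relative ranks, i.e. the empirical CDF
  f2 <- rank(x2[keep]) / (n + 1)
  nu <- mean(abs(f1 - f2)) / 2        # F-madogram
  (1 + 2 * nu) / (1 - 2 * nu)         # extremal coefficient theta(h)
}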
All estimators are coded in plain R and computations are not optimized. The estimation
time can therefore be significant for large data sets. If there are no missing observations,
the routine fmadogram
from the SpatialExtremes
package should be preferred, as it is
noticeably faster.
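A call along the following lines would compute the F-madogram based extremal coefficient with that package (a sketch only; the SpatialExtremes documentation should be consulted for the exact interface, and dat and coord are as in the Examples below):
library(SpatialExtremes)
fmadogram(data = dat, coord = coord, which = "ext")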
The data will typically consist of max-stable vectors or block maxima.
Both the Smith and the Schlather--Tawn estimators require unit Frechet margins; the margins will be standardized
to the unit Frechet scale, either parametrically or nonparametrically, unless standardize = FALSE
.
If method = "parametric"
, a parametric GEV model is fitted to each column of dat
using maximum likelihood
estimation and the observations are transformed to unit Frechet using the probability integral transform. If method = "nonparametric"
,
the empirical distribution function is used instead. The latter is the default, as it is appreciably faster.
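As a sketch, the nonparametric transformation amounts to mapping each column to relative ranks and applying the unit Frechet quantile function (the helper below is hypothetical, for illustration only):
to_frechet <- function(x) {
  u <- rank(x, na.last = "keep") / (sum(!is.na(x)) + 1)  # empirical CDF, values in (0, 1)
  -1 / log(u)                                            # unit Frechet quantile transform
}
# datf <- apply(dat, 2, to_frechet)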
References
Schlather, M. and J. Tawn (2003). A dependence measure for multivariate and spatial extremes, Biometrika, 90(1), pp.139--156.
Cooley, D., P. Naveau and P. Poncet (2006). Variograms for spatial max-stable random fields, In: Bertail P., Soulier P., Doukhan P. (eds) Dependence in Probability and Statistics. Lecture Notes in Statistics, vol. 187. Springer, New York, NY
Erhardt, R. J. and R. L. Smith (2012). Approximate Bayesian computing for spatial extremes, Computational Statistics and Data Analysis, 56, pp.1468--1481.
Examples
if (FALSE) {
coord <- 10*cbind(runif(50), runif(50))
di <- as.matrix(dist(coord))
dat <- rmev(n = 1000, d = 50, param = 3, sigma = exp(-di/2), model = 'xstud')
res <- extcoef(dat = dat, coord = coord)
# Extremal Student extremal coefficient function
XT.extcoeffun <- function(h, nu, corrfun, ...){
  if(!is.function(corrfun)){
    stop('Invalid function "corrfun".')
  }
  h <- unique(as.vector(h))
  rhoh <- sapply(h, corrfun, ...)
  cbind(h = h, extcoef = 2*pt(sqrt((nu+1)*(1-rhoh)/(1+rhoh)), nu+1))
}
#This time, only one graph with theoretical extremal coef
plot(res$dist, res$extcoef, ylim = c(1,2), pch = 20); abline(v = 2, col = 'gray')
extcoefxt <- XT.extcoeffun(seq(0, 10, by = 0.1), nu = 3,
                           corrfun = function(x){exp(-x/2)})
lines(extcoefxt[,'h'], extcoefxt[,'extcoef'], type = 'l', col = 'blue', lwd = 2)
# Brown--Resnick extremal coefficient function
BR.extcoeffun <- function(h, vario, ...){
  if(!is.function(vario)){
    stop('Invalid function "vario".')
  }
  h <- unique(as.vector(h))
  gammah <- sapply(h, vario, ...)
  cbind(h = h, extcoef = 2*pnorm(sqrt(gammah/4)))
}
extcoefbr <- BR.extcoeffun(seq(0, 20, by = 0.25), vario = function(x){2*x^0.7})
lines(extcoefbr[,'h'], extcoefbr[,'extcoef'], type = 'l', col = 'orange', lwd = 2)
coord <- 10*cbind(runif(20), runif(20))
di <- as.matrix(dist(coord))
dat <- rmev(n = 1000, d = 20, param = 3, sigma = exp(-di/2), model = 'xstud')
res <- extcoef(dat = dat, coord = coord, estimator = "smith")
}