A Survey of Statistical Methods for Climate Extremes
Chris Ferro
Climate Analysis Group, Department of Meteorology, University of Reading, UK
9th International Meeting on Statistical Climatology, Cape Town, 26 May 2004
Overview
Climate extremes
– Aims and issues
– PRUDENCE project
Extreme-value theory
– Fundamental idea
– Spatial modelling
– Clustering
Concluding remarks
Aims and Issues
Description – Statistical properties
Comparison – Space, time, model, observations
Prediction – Space, time, magnitude
Non-stationarity – Space, time
Dependence – Space, time
Data – Size, inhomogeneity
PRUDENCE
European climate
Control 1961–1990
Scenarios 2071–
High-resolution, limited-domain regional GCMs
6 driving global GCMs
Fundamental Idea
Data sparsity requires efficient methods
Extrapolation must be justified by theory
Probability theory identifies appropriate models
Example: for large n, $X_1 + \dots + X_n$ is approximately Normal and $\max\{X_1, \dots, X_n\}$ is approximately GEV
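The limit for maxima can be checked exactly in one simple case. The sketch below is an illustration, not material from the talk: it assumes iid unit-exponential $X_i$, for which $\Pr(\max\{X_1, \dots, X_n\} - \log n \le x) = (1 - e^{-x}/n)^n$, and compares this with the Gumbel distribution $\exp(-e^{-x})$, the member of the GEV family obtained as the shape parameter tends to zero. The function names are hypothetical.

```python
import math

def exact_max_cdf(x, n):
    """Exact Pr(max{X_1, ..., X_n} - log(n) <= x) for iid unit-exponential X_i."""
    p = 1.0 - math.exp(-x) / n
    return max(p, 0.0) ** n

def gumbel_cdf(x):
    """Limiting GEV distribution in the Gumbel (shape -> 0) case."""
    return math.exp(-math.exp(-x))

x = 1.0  # arbitrary evaluation point, chosen for illustration
errors = {n: abs(exact_max_cdf(x, n) - gumbel_cdf(x)) for n in (10, 100, 10000)}
for n, err in errors.items():
    print(f"n = {n:>5}: |exact - Gumbel| = {err:.2e}")
```

The discrepancy shrinks as n grows, which is the sense in which the GEV family is the appropriate model for block maxima.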
Spatial Statistical Models
Single-site models
Conditioned independence: $Y(s', t) \perp Y(s, t) \mid \theta(s)$
– Deterministically linked parameters
– Stochastically linked parameters
Residual dependence: $Y(s', t) \not\perp Y(s, t) \mid \theta(s)$
– Multivariate extremes
– Max-stable processes
Generalised Extreme Value (GEV)
Block maximum $M_n = \max\{X_1, \dots, X_n\}$ for iid $X_i$
$\Pr(M_n \le x) \approx G(x) = \exp[-\{1 + \xi (x - \mu) / \sigma\}^{-1/\xi}]$ for large n
Single-site Model
Annual maximum $Y(s, t)$ at site s in year t
Assume $Y(s, t) \mid \theta(s) \overset{\text{iid}}{\sim} \mathrm{GEV}(\theta(s))$ for all t, where $\theta(s) = (\mu(s), \sigma(s), \xi(s))$
m-year return level $y_m(s)$ satisfies $G(y_m(s); \theta(s)) = 1 - 1/m$
Data: daily max 2 m air temperature (°C) at 35 grid points over Switzerland from control run of HIRHAM driven by HadAM3H
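The return-level equation can be solved in closed form. Writing the GEV location, scale and shape as mu, sigma and xi, inverting G gives $y_m = \mu - (\sigma/\xi)[1 - \{-\log(1 - 1/m)\}^{-\xi}]$ for $\xi \ne 0$. The sketch below implements this together with the GEV distribution function; the parameter values are hypothetical and are not estimates from the talk.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function G(x) with location mu, scale sigma, shape xi."""
    if abs(xi) < 1e-12:  # Gumbel limit
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:  # x lies outside the support
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def return_level(m, mu, sigma, xi):
    """m-year return level y_m solving G(y_m) = 1 - 1/m."""
    y = -math.log(1.0 - 1.0 / m)
    if abs(xi) < 1e-12:  # Gumbel limit
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))

# Hypothetical parameter values for one site (illustration only).
mu, sigma, xi = 30.0, 1.5, -0.2
y100 = return_level(100, mu, sigma, xi)
print(f"100-year return level: {y100:.2f}")
print(f"G(y100) = {gev_cdf(y100, mu, sigma, xi):.4f}")
```

Evaluating G at the computed level recovers 1 - 1/m, which is a convenient check on the inversion.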
Temperature – Single-site Model
[Figure: 100-year return level $y_{100}$ from the single-site model]
Generalised Pareto (GP)
Points $(i/n, X_i)$, $1 \le i \le n$, for which $X_i$ exceeds a high threshold approximately follow a Poisson process
$\Pr(X_i - u > x \mid X_i > u) \approx (1 + \xi x / \sigma_u)^{-1/\xi}$ for large u
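A useful consequence of the GP form is threshold stability: if excesses of u are GP with scale sigma_u and shape xi, then excesses of a higher threshold u + d are GP with scale sigma_u + xi * d and the same shape. This is a standard property rather than something stated on the slide. The sketch below checks it numerically, with hypothetical parameter values and function names.

```python
import math

def gp_survival(x, sigma, xi):
    """Pr(excess > x) under a GP distribution with scale sigma and shape xi."""
    if abs(xi) < 1e-12:  # exponential limit
        return math.exp(-x / sigma)
    t = 1.0 + xi * x / sigma
    return t ** (-1.0 / xi) if t > 0.0 else 0.0

# Hypothetical values (illustration only).
sigma_u, xi = 2.0, -0.1
d, x = 1.5, 0.8  # threshold increase and excess level

# Pr(X - (u + d) > x | X > u + d), computed from the model at threshold u ...
lhs = gp_survival(x + d, sigma_u, xi) / gp_survival(d, sigma_u, xi)
# ... equals the GP survival function with the rescaled parameter.
rhs = gp_survival(x, sigma_u + xi * d, xi)
print(lhs, rhs)
```

This property underlies the common diagnostic of checking that estimated shape parameters remain stable as the threshold is raised.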
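Deterministic Links
Assume $Y(s, t) \mid \theta(s) \overset{\text{iid}}{\sim} \mathrm{GEV}(\theta(s))$ for all t, where $\theta(s) = (\mu(s), \sigma(s), \xi(s))$
Global model: $\theta(s) = h(x(s); \theta_0)$ for all s, e.g. $\mu(s) = \mu_0 + \mu_1\,\mathrm{ALT}(s)$
Local model: $\theta(s) = h(x(s); \theta_0)$ for all $s \in N(s_0)$
Spline model: $\theta(s) = h(x(s); \theta_0) + \epsilon(s)$ for all s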
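A global model of this kind can be fitted by maximum likelihood, pooling all sites. The sketch below is one possible implementation under stated assumptions, not the method used in the talk: the location is linear in altitude, the scale and shape are taken as constant across sites, the data are simulated with hypothetical true values, and the optimiser (Nelder-Mead from SciPy) is an arbitrary choice.

```python
import numpy as np
from scipy.optimize import minimize

def gev_nll(params, y, alt):
    """Negative log-likelihood; y has shape (sites, years), alt has shape (sites,)."""
    mu0, mu1, log_sigma, xi = params
    sigma = np.exp(log_sigma)
    z = (y - (mu0 + mu1 * alt)[:, None]) / sigma
    if abs(xi) < 1e-6:  # Gumbel limit
        return float(np.sum(log_sigma + z + np.exp(-z)))
    t = 1.0 + xi * z
    if np.any(t <= 0.0):  # some observation outside the support
        return np.inf
    return float(np.sum(log_sigma + (1.0 + 1.0 / xi) * np.log(t) + t ** (-1.0 / xi)))

# Simulated annual maxima (hypothetical truth; not the PRUDENCE data).
rng = np.random.default_rng(0)
n_sites, n_years = 35, 30
alt = rng.uniform(0.3, 3.0, n_sites)  # altitude in km
true = {"mu0": 31.8, "mu1": -6.1, "sigma": 1.5, "xi": -0.2}
u = rng.uniform(size=(n_sites, n_years))
mu = (true["mu0"] + true["mu1"] * alt)[:, None]
y = mu + true["sigma"] * ((-np.log(u)) ** (-true["xi"]) - 1.0) / true["xi"]

# Starting values: least squares of site means on altitude, pooled spread.
slope, intercept = np.polyfit(alt, y.mean(axis=1), 1)
start = np.array([intercept, slope, np.log(y.std(axis=1).mean()), 0.1])
fit = minimize(gev_nll, start, args=(y, alt), method="Nelder-Mead",
               options={"maxiter": 5000, "maxfev": 5000, "xatol": 1e-6, "fatol": 1e-6})
mu0_hat, mu1_hat, log_sigma_hat, xi_hat = fit.x
print(f"mu0 = {mu0_hat:.2f}, mu1 = {mu1_hat:.2f}, "
      f"sigma = {np.exp(log_sigma_hat):.2f}, xi = {xi_hat:.2f}")
```

Pooling sites in this way trades flexibility for efficiency: four parameters are estimated from all site-years instead of three per site.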
Temperature – Global Model
$\mu(s) = \mu_0 + \mu_1\,\mathrm{ALT}(s)$
$\mu_0$ = 31.8 °C (0.2), $\mu_1$ = –6.1 °C/km (0.1), p = 0.03
[Figure: $y_{100}$ from the single-site and global models; axis: altitude (km)]
Stochastic Links
Model $l(\theta(s)) = h(x(s); \theta_0) + Z(s; \theta_1)$ for a random process Z
Continuous Gaussian process, i.e. $\{Z(s_j) : j = 1, \dots, J\} \sim N(0, \Sigma(\theta_1))$, $\Sigma_{jk}(\theta_1) = \mathrm{cov}\{Z(s_j), Z(s_k)\}$
Discrete Markov random field, e.g. $Z(s) \mid \{Z(s') : s' \ne s\} \sim N(\nu(s) + \sum_{s' \in N(s)} \gamma(s, s')\{Z(s') - \nu(s)\}, \tau^2)$
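Stochastic Links – Example
Model
$\mu(s) = \mu_0 + \mu_1\,\mathrm{ALT}(s) + Z_\mu(s \mid a_\mu, b_\mu, c_\mu)$
$\log \sigma(s) = \log \sigma_0 + Z_\sigma(s \mid a_\sigma, b_\sigma, c_\sigma)$
$\xi(s) = \xi_0 + Z_\xi(s \mid a_\xi, b_\xi, c_\xi)$
$\mathrm{cov}\{Z_*(s_j), Z_*(s_k)\} = a_*^2 \exp[-\{b_* d(s_j, s_k)\}^{c_*}]$
Independent, diffuse priors on $a_*$, $b_*$, $c_*$, $\mu_0$, $\mu_1$, $\sigma_0$ and $\xi_0$
Metropolis-Hastings with random-walk updates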
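The covariance function above determines the latent Gaussian processes completely. As a minimal sketch, the code below builds the covariance matrix $a^2 \exp[-\{b\,d\}^c]$ for a set of sites and draws one realisation of Z. The site coordinates, the parameter values and the use of Euclidean distance for d are all assumptions made for illustration.

```python
import numpy as np

def powered_exp_cov(coords, a, b, c):
    """Covariance matrix a^2 * exp(-(b * d)^c), with d the Euclidean distance (assumed)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return a ** 2 * np.exp(-(b * d) ** c)

rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 100.0, size=(35, 2))  # hypothetical site coordinates (km)
cov = powered_exp_cov(coords, a=0.5, b=0.02, c=1.0)  # hypothetical parameter values

# One realisation of the latent process at the sites via a Cholesky factor;
# the small diagonal jitter guards against numerical loss of positive-definiteness.
chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(coords)))
z = chol @ rng.standard_normal(len(coords))
print(z[:5])
```

Here a sets the standard deviation of Z, b the rate at which correlation decays with distance, and c (between 0 and 2 for a valid covariance) the smoothness of that decay.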
Temperature – Stochastic Links
[Figure: $y_{100}$ from the latent and global models]
Multivariate Extremes
Maxima $M_{nj} = \max\{X_{1j}, \dots, X_{nj}\}$ for iid $X_i = (X_{i1}, \dots, X_{iJ})$
$\Pr(M_{nj} \le x_j \text{ for } j = 1, \dots, J) \approx$ MEV for large n
e.g. logistic: $\Pr(M_{n1} \le x_1, M_{n2} \le x_2) = \exp\{-(z_1^{-1/\alpha} + z_2^{-1/\alpha})^{\alpha}\}$
Model $\{Y(s, t) : s \in N(s_0)\} \mid \{\alpha, \theta(s) : s \in N(s_0)\} \sim$ MEV
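The role of the dependence parameter in the logistic model is easiest to see at its two extremes. The sketch below assumes, as is conventional but not stated on the slide, that $z_1$ and $z_2$ are the margins transformed to the unit Fréchet scale and that the dependence parameter alpha lies in (0, 1]. The function name is hypothetical.

```python
import math

def logistic_joint_cdf(z1, z2, alpha):
    """Bivariate logistic model: exp{-(z1^(-1/alpha) + z2^(-1/alpha))^alpha}."""
    return math.exp(-(z1 ** (-1.0 / alpha) + z2 ** (-1.0 / alpha)) ** alpha)

def frechet_cdf(z):
    """Unit Frechet distribution function (assumed marginal scale)."""
    return math.exp(-1.0 / z)

z1, z2 = 2.0, 3.0  # arbitrary evaluation point

# alpha = 1: the joint probability factorises (independence).
independent = logistic_joint_cdf(z1, z2, 1.0)
# alpha near 0: the joint probability approaches the smaller marginal
# probability (complete dependence).
dependent = logistic_joint_cdf(z1, z2, 0.01)

print(independent, frechet_cdf(z1) * frechet_cdf(z2))
print(dependent, min(frechet_cdf(z1), frechet_cdf(z2)))
```

Intermediate values of alpha interpolate between these two cases, so a single parameter summarises the strength of dependence between the two maxima.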
Temperature – Multivariate Extremes
Assume $Y(s, t) \perp Y(s', t) \mid Y(s_0, t)$ for all $s, s' \in N(s_0)$ and locally constant $\alpha$
[Figure: $y_{100}$ from the single-site and multivariate models]
Max-stable Processes
Maxima $M_n(s) = \max\{X_1(s), \dots, X_n(s)\}$ for iid $\{X(s) : s \in S\}$
$\Pr\{M_n(s) \le x(s) \text{ for } s \in S\} \approx$ max-stable for large n
Model $Y^*(s, t) = \max\{r_i\,k(s, s_i) : i \ge 1\}$ where $\{(r_i, s_i) : i \ge 1\}$ is a Poisson process on $(0, \infty) \times S$
e.g. $k(s, s_i) \propto \exp\{-(s - s_i)' \Sigma(\theta_1)^{-1} (s - s_i) / 2\}$
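The construction above can be simulated directly. The sketch below is an approximate one-dimensional version under several assumptions that the slide does not state: the Poisson process has intensity $r^{-2}\,dr\,ds$, the kernel is a Gaussian density with standard deviation `scale`, and storm centres are restricted to a padded interval around the sites. With these choices the margins are approximately unit Fréchet. The function name and parameter values are hypothetical.

```python
import numpy as np

def simulate_smith_1d(sites, scale, rng, pad=5.0):
    """Approximate draw of Y*(s) = max_i r_i k(s, s_i) on a 1-D transect."""
    lo, hi = sites.min() - pad * scale, sites.max() + pad * scale
    width = hi - lo
    k_max = 1.0 / (scale * np.sqrt(2.0 * np.pi))  # peak of the Gaussian kernel
    y = np.zeros(len(sites))
    gamma = 0.0
    while True:
        gamma += rng.exponential(1.0)  # arrival times of a unit-rate Poisson process
        r = width / gamma              # storm magnitudes, generated in decreasing order
        if r * k_max <= y.min():       # no later storm can raise any site
            break
        centre = rng.uniform(lo, hi)   # storm centre s_i
        kernel = k_max * np.exp(-0.5 * ((sites - centre) / scale) ** 2)
        y = np.maximum(y, r * kernel)
    return y

rng = np.random.default_rng(2)
sites = np.linspace(0.0, 10.0, 11)
field = simulate_smith_1d(sites, scale=1.0, rng=rng)
print(np.round(field, 2))
```

Generating the magnitudes in decreasing order allows the loop to stop as soon as the largest possible remaining contribution falls below the current field minimum, so only finitely many points of the process are ever needed.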
Precipitation – Max-stable Process
Estimate $\Pr\{Y(s_j, t) \le y(s_j) \text{ for } j = 1, \dots, J\}$
Max-stable model: 0.16
Spatial independence: 0.54
[Figure: realisation of $Y^*$]
Clustering
Extremes can cluster in stationary sequences $X_1, \dots, X_n$
Points $i/n$, $1 \le i \le n$, for which $X_i$ exceeds a high threshold approximately follow a compound Poisson process
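Clusters of exceedances have to be identified before their properties can be summarised. The sketch below uses runs declustering, one common scheme and an assumption here since the slides do not name a method: exceedances separated by at least r consecutive non-exceedances are assigned to different clusters, and the extremal index is estimated by the number of clusters divided by the number of exceedances. The data, threshold and run length are hypothetical.

```python
import numpy as np

def runs_declustering(x, u, r):
    """Cluster sizes when clusters are separated by at least r non-exceedances of u."""
    idx = np.flatnonzero(np.asarray(x) > u)
    if idx.size == 0:
        return np.array([], dtype=int)
    breaks = np.flatnonzero(np.diff(idx) > r)  # gaps that start a new cluster
    starts = np.concatenate(([0], breaks + 1))
    ends = np.concatenate((breaks + 1, [idx.size]))
    return ends - starts

# Toy series (illustration only): exceedances of u = 4 occur at positions 1, 2, 6, 9, 10.
x = [0, 5, 6, 0, 0, 0, 7, 0, 0, 8, 9, 0]
sizes = runs_declustering(x, u=4, r=2)
theta_hat = len(sizes) / sizes.sum()  # extremal index estimate
p_multi = np.mean(sizes > 1)          # estimate of Pr(cluster size > 1)
print(sizes, theta_hat, p_multi)
```

For this toy series the clusters have sizes 2, 1 and 2, giving an extremal index estimate of 3/5. In practice both the threshold and the run length affect the estimate, which is why such quantities are usually examined across a range of thresholds.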
Zurich Temperature (June–July)
[Figure: two panels, Extremal Index and Pr(cluster size > 1), each plotted against Threshold Percentile]
Review
Linkage – efficiency, continuous space, interpretation, bias, expense – description, comparison
Multivariate – discrete space, model choice, dimension limitation – description
Max-stable – continuous space, estimation, model choice – prediction
Future Directions
Wider application of EV theory in climate science
– combine with physical understanding
– shortcomings of models, new applications
Improved methods for non-identically distributed data
– especially threshold methods with dependent data
Further Information
Climate Analysis Group
NCAR
Alec Stephenson’s R software
PRUDENCE
ECA&D project
My personal website