Extreme Value Analysis


Extreme Value Analysis FISH 558 Decision Analysis in Natural Resource Management 11/30/2015 Noble Hendrix QEDA Consulting LLC Affiliate Faculty UW SAFS

Lecture Overview
- Motivating examples of extreme events
- Generalized Extreme Value
- Statistical development
- Case study: the white cliffs of Dover
- Generalized Pareto Distribution
- Case study: whale strikes in SE Alaska
- Additional resources

Why should we care about extreme events?
- They are rare by definition, so why spend much time thinking about them?
- The consequences of an extreme event often have significant impacts on the system: mortality, colonization, episodic recruitment.
- We tend to focus on averages, but extremes may be more important in some situations.
- We may also be interested in estimating extremes beyond what has been observed.

Distribution of outcomes

Motivation 100 year floodplain

Motivation Surpassing the 100 year floodplain Road and home construction is based on flood frequency and intensity, e.g., the 100 year floodplain

Motivation Hurricanes

Financial Markets

Statistical Foundations Central Limit Theorem Consider a sequence of iid random variables X1, …, Xn with mean μ and finite variance σ². The sum Sn = X1 + … + Xn, when normalized, leads to the CLT:
$$\frac{S_n - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty$$

Generalized Extreme Value Fisher-Tippett Asymptotic Theorem Define the maximum of a sequence of iid random variables, Mn = max(X1, …, Xn). If the suitably normalized maxima converge to a non-degenerate limiting distribution H(x), then H is a GEV distribution

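Generalized Extreme Value Cumulative Distribution Function
$$H(x) = \exp\left\{-\left[1 + v\,\frac{x - u}{s}\right]^{-1/v}\right\}, \quad 1 + v\,\frac{x - u}{s} > 0,\ v \neq 0$$
$$H(x) = \exp\left\{-\exp\left[-\frac{x - u}{s}\right]\right\}, \quad v = 0$$
u – location, s – scale (s > 0), v – shape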
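As a rough illustration of the GEV distribution function with location u, scale s and shape v, the base-R sketch below evaluates H(x) directly. The name `pgev_sketch` is made up for this example and is not part of the evd package (which ships its own `pgev`). It assumes the standard parameterisation, in which v < 0 gives the Weibull type, v = 0 the Gumbel type and v > 0 the Fréchet type.

```r
# Hypothetical helper (not from evd): GEV cumulative distribution function.
# u = location, s = scale (> 0), v = shape.
pgev_sketch <- function(x, u = 0, s = 1, v = 0) {
  stopifnot(s > 0)
  z <- (x - u) / s
  if (abs(v) < 1e-8) {
    # Gumbel limit as v -> 0
    return(exp(-exp(-z)))
  }
  # pmax() clamps points outside the support: below the lower endpoint
  # (v > 0) this gives H = 0, above the upper endpoint (v < 0) it gives H = 1.
  t <- pmax(1 + v * z, 0)
  exp(-t^(-1 / v))
}

# Evaluate the three types at the same points
x <- c(-1, 0, 1, 2)
pgev_sketch(x, v = 0)     # Gumbel
pgev_sketch(x, v = -0.3)  # Weibull type (bounded above)
pgev_sketch(x, v = 0.3)   # Frechet type (bounded below, heavy upper tail)
```

At x = u all three types return exp(-1), which is a quick sanity check on any implementation.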

Generalized Extreme Value Variants of the GEV Shape parameter v defines several distributions: Gumbel: v = 0 Weibull: v < 0 Fréchet: v > 0

Generalized Extreme Value Shapes of GEV Weibull Gumbel Fréchet

Generalized Extreme Value Applicability For almost all common continuous distributions, the normalized maxima converge to H(x) for some value of v (the distribution lies in the domain of attraction of that GEV type)
- Weibull: beta
- Gumbel: normal, lognormal, hyperbolic, gamma, chi-squared
- Fréchet: Pareto, inverse gamma, Student t, loggamma

Generalized Extreme Value Minima What about minima? min(X1, …, Xn) = - max(-X1, … ,-Xn) If H(x) is the limiting distribution for maxima, then 1 – H(-x) is the limiting distribution for minima, so can also be handled
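The identity for minima is easy to check numerically, and it suggests a practical recipe: negate the data, treat block minima as block maxima of the negated series, then flip the sign back. The sketch below uses base R only, and the data are simulated purely for illustration.

```r
set.seed(1)                       # simulated data, for illustration only
x <- rnorm(50)

# min(X1, ..., Xn) = -max(-X1, ..., -Xn)
lhs <- min(x)
rhs <- -max(-x)
isTRUE(all.equal(lhs, rhs))       # TRUE

# Block minima via negation: 5 blocks of 10 observations each
blocks <- matrix(x, nrow = 10)            # one block per column
block_min <- -apply(-blocks, 2, max)      # same values as apply(blocks, 2, min)
```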

Generalized Extreme Value Estimation
- Obtain data from an unknown distribution F
- Assume that there is an extreme value distribution Hv for some value of v
- The true distribution of the n-block maximum Mn can be approximated, for large enough n, by a GEV distribution H(x)
- Fit the model to repeated observations of an n-block maximum, i.e., m blocks of size n
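The "m blocks of size n" step can be sketched in base R as follows. `block_maxima` is a hypothetical helper written for this example, and the "daily" series is simulated, so nothing here comes from the lecture's data.

```r
# Hypothetical helper: split a series into consecutive blocks of size n
# and return the maximum of each complete block.
block_maxima <- function(x, n) {
  m <- length(x) %/% n                 # number of complete blocks
  stopifnot(m >= 1)
  x <- x[seq_len(m * n)]               # drop any incomplete trailing block
  apply(matrix(x, nrow = n), 2, max)   # one block per column
}

# Simulated "daily" values: 20 blocks ("years") of 365 observations
set.seed(42)
daily <- rexp(20 * 365)
annual_max <- block_maxima(daily, n = 365)
length(annual_max)   # 20 block maxima, ready to pass to a GEV fitting routine
```

The choice of block size is a trade-off: larger blocks make the GEV approximation more credible but leave fewer maxima to fit.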

Generalized Extreme Value Example - Data Annual maximum sea level at Dover, Britain, between 1912 and 1992

Generalized Extreme Value R package evd
> require(evd)
> data(sealevel)
> sl.no <- na.omit(sealevel[,1])
> fgev(sl.no)

Call: fgev(x = sl.no)
Deviance: -5.022368

Estimates
     loc     scale     shape
 3.59252   0.20195  -0.02107

Standard Errors
     loc     scale     shape
 0.02642   0.01874   0.07730

Generalized Extreme Value Diagnostics

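Generalized Extreme Value Return Level Plot Return period: "how long to wait, on average, until we see another event equal to or more extreme". The return level is the level associated with a given return period. If H is the distribution of the n-block maximum, the k-block return level is the 1 – 1/k quantile of H, i.e., the level exceeded on average once every k blocks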
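To make the quantile definition concrete, the sketch below inverts the GEV distribution function to get the k-block return level, then plugs in the point estimates reported in the evd output for the Dover series (loc 3.59252, scale 0.20195, shape -0.02107). `gev_return_level` is a name invented here. Parameter uncertainty is ignored, so the result is a point estimate only.

```r
# Hypothetical helper: k-block return level = (1 - 1/k) quantile of the GEV.
gev_return_level <- function(k, u, s, v) {
  stopifnot(k > 1, s > 0)
  y <- -log(1 - 1 / k)
  if (abs(v) < 1e-8) {
    u - s * log(y)                 # Gumbel case
  } else {
    u - (s / v) * (1 - y^(-v))     # general case
  }
}

# Point estimates from the fgev() output for the Dover annual maxima
z100 <- gev_return_level(100, u = 3.59252, s = 0.20195, v = -0.02107)
z100   # roughly 4.48, in the units of the data
```

With annual blocks, k = 100 corresponds to the familiar "100-year" level.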

Generalized Extreme Value Profile likelihood of parameters

Generalized Extreme Value Limitations Limitations of the GEV:
- Used for block maxima, e.g., annual precipitation, annual flow
- Only 1 exceedance per block
- May ignore some important observations
- Some go so far as to say it is a wasteful method! (McNeil et al. 2005, Quantitative Risk Management, Princeton)

Generalized Pareto Distribution GEV has largely been surpassed by another method for extremes over a threshold Pickands (1975) developed a model for excesses y over threshold a Pickands 1975 Annals of Stats 3:119

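Generalized Pareto Distribution Distribution function of an excess y = x – a ≥ 0 over the threshold:
$$G(y) = 1 - \left(1 + \frac{v\,y}{b}\right)^{-1/v}, \quad v \neq 0$$
$$G(y) = 1 - \exp\left(-\frac{y}{b}\right), \quad v = 0$$
a – threshold, b – scale (b > 0), v – shape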

Generalized Pareto Distribution Shapes of GPD Positive shape = limitless loss

Generalized Pareto Distribution Applicability For any continuous distributions that converge on H(x) for some value of v, which was most of the continuous distributions of interest The same distributions will converge on G(x) as an excess distribution as the threshold a is raised

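Generalized Pareto Distribution Estimation Obtain data from an unknown distribution F. Calculate the excesses Yj = Xj – a for the Na observations that exceed threshold a, then maximize the log-likelihood (for v ≠ 0):
$$\ell(b, v) = -N_a \ln b - \left(1 + \frac{1}{v}\right) \sum_{j=1}^{N_a} \ln\left(1 + \frac{v\,Y_j}{b}\right)$$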
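A minimal sketch of this maximum likelihood step in base R, assuming the standard GPD parameterisation with scale b and shape v. The function `gpd_nll` and the simulated data are illustrative assumptions, not the lecture's code. In practice a package routine such as `fitgpd` in POT would be used.

```r
# Hypothetical helper: negative GPD log-likelihood for excesses y over a threshold.
# par = c(b, v) with b = scale (> 0) and v = shape.
gpd_nll <- function(par, y) {
  b <- par[1]; v <- par[2]
  if (b <= 0) return(1e10)                      # invalid scale
  if (abs(v) < 1e-8) {
    return(length(y) * log(b) + sum(y) / b)     # exponential limit, v -> 0
  }
  t <- 1 + v * y / b
  if (any(t <= 0)) return(1e10)                 # outside the support
  length(y) * log(b) + (1 + 1 / v) * sum(log(t))
}

# Simulated example: draw GPD excesses by inverting the distribution function
set.seed(123)
b_true <- 2; v_true <- 0.1
a <- 10                                                    # threshold
x <- a + (b_true / v_true) * (runif(2000)^(-v_true) - 1)   # values above a
y <- x[x > a] - a                                          # Yj = Xj - a

fit <- optim(c(mean(y), 0.1), gpd_nll, y = y)   # Nelder-Mead by default
fit$par                                          # estimates of (b, v)
```

Returning a large penalty outside the parameter space is a crude way to keep the optimizer inside the support. A production fit would also report standard errors from the Hessian.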

Generalized Pareto Distribution Threshold Estimation We have an interesting problem:
- The threshold a must be high enough to satisfy the theoretical assumptions
- There must be enough data above the threshold a that the parameters are well estimated
- Use a sample mean residual life plot to help identify a reasonable threshold value a

Generalized Pareto Distribution Sample Mean Residual Life Plot Let Y = X – a0. At threshold a0, if Y is GPD with parameters b0 and v, then E(Y) = b0/(1 – v), v < 1. The same holds for all thresholds ai > a0, but the scale parameter bi must be appropriate to the threshold ai: bi = b0 + v*(ai – a0), so
$$E(X - a_i \mid X > a_i) = \frac{b_i}{1 - v} = \frac{b_0 + v\,(a_i - a_0)}{1 - v}$$
Thus E(X – a | X > a) is a linear function of a wherever the GPD is appropriate, so we can plot the sample mean of the excesses x – ai (where x are our observed data above ai) versus ai. This is the sample mean residual life plot. Confidence intervals are added by assuming the sample mean excesses are approximately normally distributed
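The quantity plotted in a mean residual life plot is just the average excess over each candidate threshold. A base-R sketch follows. `mean_excess` is a hypothetical helper, and the simulated series is only there to show the expected behaviour.

```r
# Hypothetical helper: sample mean excess over each candidate threshold.
mean_excess <- function(x, thresholds) {
  sapply(thresholds, function(a) {
    exc <- x[x > a] - a
    if (length(exc) == 0) NA_real_ else mean(exc)
  })
}

# Tiny worked example: above 2 the excesses are 1, 2, 3; above 3 they are 1, 2
x <- c(1, 2, 3, 4, 5)
mean_excess(x, c(2, 3))   # 2.0 1.5

# Simulated data: exponential excesses correspond to a GPD with v = 0,
# so the mean excess should stay roughly flat as the threshold increases
set.seed(7)
sim <- rexp(5000, rate = 1 / 3)
a_grid <- seq(0, 6, by = 1)
round(mean_excess(sim, a_grid), 2)
```

A positive slope in these values points to v > 0 and a negative slope to v < 0. The threshold is chosen where the relationship becomes approximately linear.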

Generalized Pareto Distribution Example - Data Quantifying strike rates of whales in southeast Alaska

Generalized Pareto Distribution Distances to Whales Small distances (D close to 0) are where losses occur, so transform distance D into a positive loss metric in which a value of 100 equates to D = 0

Generalized Pareto Distribution Whale Distance Metric

Generalized Pareto Distribution Threshold determination Looking for discontinuities in the mean excess, E(x-ai), at different threshold values ai Identified value of 70 as the threshold (equates to a distance of 300m between whales and ships)

Generalized Pareto Distribution Threshold determination
library(POT)
mrlplot(w.metric, xlim = c(50, 90))
tcplot(w.metric, u.range = c(50, 90))
- Mean residual life plot (previous slide) indicates a = 70
- Discontinuity in scale and shape estimates when threshold a > 70

Generalized Pareto Distribution Estimation
> fitgpd(w.metric, thresh = 70, est = "mle")
Estimator: MLE
Deviance: 974.4418
AIC: 978.4418

Varying Threshold: FALSE

Threshold Call: 70
Number Above: 151
Proportion Above: 0.1946

Estimates
   scale     shape
 14.8380   -0.4706

Standard Error Type: observed

Standard Errors
   scale     shape
 1.53542   0.07452

Asymptotic Variance Covariance
           scale      shape
scale   2.357530  -0.106864
shape  -0.106864   0.005553

Optimization Information
Convergence: successful

Generalized Pareto Distribution Diagnostics

Generalized Pareto Distribution Likelihood profiles

Generalized Pareto Distribution Likelihood profiles with different thresholds Relative log likelihood: likelihood relative to the maximum for that threshold value

Generalized Pareto Distribution Empirical and Estimated
- Comparison of empirical (no observed strikes) and GPD model estimates for a = 70
- Since 2000, 2 confirmed strikes
- GPD provides a better characterization of risk
[Figure panels: Empirical, GPD]

Generalized Pareto Distribution Return Level Return level: how many encounters in which whales are closer than 300m occur, on average, until a strike?
- Conditional return level of approx. 500
- Absolute return level of approx. 2500 (about 1 in 5 encounters is closer than 300m)
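The two figures are linked by the proportion of encounters that exceed the threshold. A back-of-the-envelope check, using the proportion above the threshold reported in the POT output (0.1946) and the rounded conditional return level from the slide:

```r
conditional_rl <- 500      # close encounters (< 300 m) per strike, rounded, from the slide
p_exceed <- 0.1946         # proportion of encounters above the threshold a = 70

# Encounters of any kind per strike
absolute_rl <- conditional_rl / p_exceed
absolute_rl                # about 2570, i.e. "approx. 2500" after rounding

1 / p_exceed               # about 5.1: roughly 1 in 5 encounters is closer than 300 m
```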

Summary: GEV and EVT
Generalized Extreme Value (GEV) distribution
- Used for block maxima, e.g., maximum sea level per year
- Data loss because only block maxima are used
Generalized Pareto Distribution (GPD)
- Used for points over a threshold
- All exceedances above some limit are used
- Open question of how to select a threshold value

Additional Resources Books and Papers
- Coles, S. 2001. An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. London.
- McNeil, A. J., Frey, R., & Embrechts, P. 2005. Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton University Press.
- Embrechts, P., Klüppelberg, C., & Mikosch, T. 1997. Modelling Extremal Events: for Insurance and Finance (Vol. 33). Springer.
Bayesian GPD Modeling
- Coles, S. and L. Pericchi. 2003. Anticipating catastrophes through extreme value modeling. Applied Statistics 52(4): 405–416.
- Jagger, T. H. and J. B. Elsner. 2006. Climatology models for extreme hurricane winds near the United States. Journal of Climate 19: 3220–3236.

Additional Resources Fitting models in R and BUGS
A few R packages
- Points over Threshold (POT)
- Extreme Value Distributions (evd)
- extRemes
- Quantitative Risk Management (QRM)
- evdbayes
BUGS
- OpenBUGS: GEV and GPD
- WinBUGS/JAGS: GPD with the 1's trick