Verification for forecasts of rare events

Presentation transcript:

Verification for forecasts of rare events
Chris Ferro, Mathematics Research Institute, University of Exeter, UK
15 mins + 5 mins questions
11th International Meeting on Statistical Climatology, Edinburgh, 14 July 2010

Forecast verification
Describe and understand forecast performance. We consider challenges raised by deterministic forecasts for the occurrence of large, rare values. Read more next year in the expanded second edition of ‘the book’!

Daily rainfall in mid-Wales
Observations: radar measurements.
Forecasts: direct output from the old 12 km mesoscale version of the Met Office Unified Model.
Period: 1 Jan 05 – 11 Nov 06.
Data courtesy of Marion Mittermaier.

Threshold exceedances
              Observed yes   Observed no
Forecast yes            13             9
Forecast no             27           600
Hit rate = 13 / (13 + 27) = 0.325

Threshold exceedances
              Observed yes   Observed no
Forecast yes             7             3
Forecast no             12           627
Hit rate = 7 / (7 + 12) = 0.368

Threshold exceedances
              Observed yes   Observed no
Forecast yes             3             3
Forecast no              5           638
Hit rate = 3 / (3 + 5) = 0.375

Threshold exceedances
              Observed yes   Observed no
Forecast yes             1             2
Forecast no              4           642
Hit rate = 1 / (1 + 4) = 0.2

Threshold exceedances
              Observed yes   Observed no
Forecast yes             0             1
Forecast no              3           645
Hit rate = 0 / (0 + 3) = 0

Threshold exceedances
              Observed yes   Observed no
Forecast yes             0             1
Forecast no              0           648
Hit rate = 0 / (0 + 0) = NaN
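The six hit rates above can be recomputed in a few lines. This is a minimal sketch, not code from the talk: the hits (a) and misses (c) are read off the hit-rate formulas on the slides, and the name `hit_rate` is illustrative.

```python
import math

# (hits a, misses c) taken from the hit-rate formulas on the six
# threshold-exceedance slides, in the order they are presented.
TABLES = [(13, 27), (7, 12), (3, 5), (1, 4), (0, 3), (0, 0)]


def hit_rate(a: int, c: int) -> float:
    """Hit rate a / (a + c); NaN when no events were observed."""
    observed = a + c
    return a / observed if observed > 0 else math.nan


rates = [hit_rate(a, c) for a, c in TABLES]
for (a, c), h in zip(TABLES, rates):
    print(f"a={a:2d}  c={c:2d}  hit rate={h:.3f}")
```

Returning NaN for the last table mirrors the 0 / 0 case on the final slide: once no events are observed, the hit rate is undefined rather than zero.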

Hit rate decreases to 0; sampling variation increases.
[Figure: hit rate with 95% confidence intervals]
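The slide does not say how its 95% confidence intervals were computed. As one possible illustration, the sketch below uses the Wilson score interval for a binomial proportion, which is an assumption on my part, and applies it to the hit counts from the threshold tables to show the intervals widening as events become rarer.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    if trials == 0:
        return (math.nan, math.nan)
    phat = successes / trials
    denom = 1 + z * z / trials
    centre = (phat + z * z / (2 * trials)) / denom
    spread = math.sqrt(phat * (1 - phat) / trials + z * z / (4 * trials * trials))
    half = z * spread / denom
    return (centre - half, centre + half)


# (hits a, observed events a + c) from the threshold tables.
for a, events in [(13, 40), (7, 19), (3, 8), (1, 5), (0, 3)]:
    lo, hi = wilson_interval(a, events)
    print(f"{a}/{events}: interval ({lo:.3f}, {hi:.3f}), width {hi - lo:.3f}")
```

The interval treats the observed events as independent trials, which may not hold for daily rainfall.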

Proportion correct tends to 1; sampling variation decreases.
[Figure: proportion correct with 95% confidence intervals]
(a + d) / n ≥ 1 – 2p since (b + c) / n ≤ 2p, so the constraint reduces absolute variation. Relative variation increases.
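The bound (a + d) / n ≥ 1 – 2p can be checked numerically on the tables whose four entries are all given on the earlier slides. A small sketch, with illustrative function names:

```python
# Full tables (a, b, c, d) from three of the threshold-exceedance slides:
# a = hits, b = false alarms, c = misses, d = correct rejections.
TABLES = [(13, 9, 27, 600), (7, 3, 12, 627), (1, 2, 4, 642)]


def proportion_correct(a, b, c, d):
    return (a + d) / (a + b + c + d)


def base_rate(a, b, c, d):
    return (a + c) / (a + b + c + d)


results = []
for table in TABLES:
    p = base_rate(*table)
    pc = proportion_correct(*table)
    results.append((p, pc, 1 - 2 * p))
    print(f"p={p:.4f}  proportion correct={pc:.4f}  1-2p={1 - 2 * p:.4f}")
```

As p shrinks, the lower bound 1 – 2p squeezes the proportion correct towards 1, which is why this measure is easy to maintain for rare events.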

Lessons
As we move to rarer events: forecast performance degenerates; sampling variation (uncertainty) can increase; some aspects of performance are hard to maintain, e.g. hit rate; other aspects of performance are easy to maintain, e.g. proportion correct.

Solutions
Use verification measures that do not degenerate, e.g. the Extreme Dependency Score, which measures the decay rate.
Reduce sampling variation by imposing a parametric form on how the entries in the contingency table change with the threshold, estimating the parameters with moderate events, and using the fitted model to extrapolate to rare events.
Ferro (2007), Stephenson et al. (2008), Ferro and Stephenson (2010)
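For reference, the Extreme Dependency Score of Stephenson et al. (2008) is EDS = 2 log((a + c) / n) / log(a / n) – 1. The sketch below evaluates it for the thresholds that have at least one hit, taking n = 649 as the total of each table shown earlier. The function name and the NaN convention for undefined cases are my own choices.

```python
import math


def extreme_dependency_score(a: int, c: int, n: int) -> float:
    """EDS = 2 log((a + c) / n) / log(a / n) - 1."""
    if a <= 0 or a >= n:
        return math.nan  # the ratio of logarithms is undefined in these cases
    return 2 * math.log((a + c) / n) / math.log(a / n) - 1


N = 649  # number of forecast-observation pairs in each table
scores = [extreme_dependency_score(a, c, N)
          for a, c in [(13, 27), (7, 12), (3, 5), (1, 4)]]
print([round(s, 3) for s in scores])
```

If a / n behaves like α p^β for small p, the score approaches 2 / β – 1, so it depends only on the rate at which the joint frequency decays.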

Table of frequencies
              Observed yes   Observed no   Total
Forecast yes       a              b        a + b
Forecast no        c              d        c + d
Total            a + c          b + d        n
Define the base rate to be p = (a + c) / n.

Table of relative frequencies
              Observed yes   Observed no   Total
Forecast yes     a / n          b / n        q
Forecast no      c / n          d / n      1 – q
Total              p            1 – p        1
There is no theory for how q changes as p → 0, so recalibrate the forecasts to force q = p and model the bias, q / p, separately.

Recalibration
Use unequal thresholds to equalise the numbers of observed and forecasted events, removing bias.
              Observed yes   Observed no
Forecast yes             9            10
Forecast no             10           620
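One way to choose the unequal thresholds is to set the forecast threshold at the forecast value whose rank matches the number of observed events. The sketch below does this on synthetic data. The data, the function name, and the tie-free assumption are all illustrative and are not taken from the talk.

```python
import random


def recalibrated_table(obs, fcst, obs_threshold):
    """Contingency table (a, b, c, d) with equal numbers of forecast and observed events.

    An observed event is obs > obs_threshold. With k observed events, a forecast
    event is fcst >= the k-th largest forecast value. Assumes no tied forecasts.
    """
    k = sum(x > obs_threshold for x in obs)
    if k == 0:
        return (0, 0, 0, len(obs))
    fcst_threshold = sorted(fcst, reverse=True)[k - 1]
    a = b = c = d = 0
    for x, y in zip(obs, fcst):
        event_obs, event_fcst = x > obs_threshold, y >= fcst_threshold
        if event_fcst and event_obs:
            a += 1
        elif event_fcst:
            b += 1
        elif event_obs:
            c += 1
        else:
            d += 1
    return (a, b, c, d)


# Synthetic data: forecasts are biased, noisy copies of the observations.
rng = random.Random(1)
obs = [rng.expovariate(1.0) for _ in range(649)]
fcst = [1.5 * x + rng.gauss(0.0, 0.5) for x in obs]
a, b, c, d = recalibrated_table(obs, fcst, obs_threshold=2.5)
print(a, b, c, d)
```

After recalibration a + b = a + c, so the false alarms and misses are equal in number, as in the table above.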

Parametric model for the table
              Observed yes   Observed no   Total
Forecast yes     a / n            •          p
Forecast no        •              •        1 – p
Total              p            1 – p        1
For a wide class of (regularly varying) probability distributions, a / n ~ α p^β as p → 0, where α > 0 and β ≥ 1. For random forecasts, α = 1 and β = 2.
Ledford and Tawn (1996), Heffernan (2000)

Parametric model for the table
              Observed yes   Observed no   Total
Forecast yes     α p^β            •          p
Forecast no        •              •        1 – p
Total              p            1 – p        1
Given estimates of α and β, we can derive the behaviour of verification measures for small p, e.g. hit rate ~ α p^β / p = α p^(β–1).
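To make the derivation concrete, the sketch below fills in the remaining cells of the recalibrated table from its margins and returns the hit rate and proportion correct implied by a / n = α p^β. The function name is illustrative. The check uses the random-forecast values α = 1 and β = 2, for which the hit rate should equal p.

```python
def model_measures(p: float, alpha: float, beta: float):
    """Hit rate and proportion correct implied by a/n = alpha * p**beta with q = p."""
    hits = alpha * p ** beta               # a / n
    misses = p - hits                      # c / n, because (a + c) / n = p
    false_alarms = p - hits                # b / n, because (a + b) / n = q = p
    correct_rejections = 1 - hits - misses - false_alarms  # d / n
    return hits / p, hits + correct_rejections


for p in (0.1, 0.01, 0.001):
    h, pc = model_measures(p, alpha=1.0, beta=2.0)
    print(f"p={p}: hit rate={h:.4f}, proportion correct={pc:.6f}")
```

Under the model the proportion correct is 1 – 2p + 2α p^β, which tends to 1 as p → 0 whatever the values of α and β.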

Check model adequacy
If the model fits then log(a / n) ~ log α + β log p for small p, and the graph of log(a / n) against log p will approximate a straight line for small enough p.
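The talk checks linearity graphically. As a rough numerical companion, a least-squares line through the points (log p, log(a / n)) gives an intercept near log α and a slope near β when the model fits. The sketch below is a diagnostic of my own, not the estimation method used in the talk, and it is tested on points that follow the power law exactly.

```python
import math


def fit_loglog(base_rates, joint_freqs):
    """Least-squares line through (log p, log(a/n)); returns (alpha, beta).

    Requires every joint frequency a/n to be positive.
    """
    xs = [math.log(p) for p in base_rates]
    ys = [math.log(f) for f in joint_freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta


# Illustrative inputs that satisfy a/n = 0.9 * p**1.3 exactly.
ps = [0.1, 0.05, 0.02, 0.01, 0.005]
freqs = [0.9 * p ** 1.3 for p in ps]
alpha_hat, beta_hat = fit_loglog(ps, freqs)
print(round(alpha_hat, 3), round(beta_hat, 3))
```

With real tables the points are noisy and strongly dependent across thresholds, so the line is best treated as a visual aid.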

Parameter estimation
If the model holds for p < p0 then the observations and forecasts can be transformed into a sample from a distribution for which Pr(Z > z) ≈ α exp(–βz) for z > –log p0. Choose p0, then estimate α and β by maximum likelihood using those transformed data that exceed –log p0.
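The slide does not spell out the transformation or the likelihood. The sketch below is one reading and should be taken as an assumption: each margin is mapped to a unit exponential scale through its empirical ranks, Z is the smaller of the two transformed values, and values at or below –log p0 are treated as censored. Under that likelihood the estimates have closed forms: β is the number of exceedances divided by the sum of their excesses over –log p0 (held at 1 if smaller), and α is the exceedance proportion times exp(β × (–log p0)).

```python
import math


def to_exponential_scale(obs, fcst):
    """Z_i = min(-log(1 - F(x_i)), -log(1 - G(y_i))) with rank-based F and G."""
    n = len(obs)

    def neg_log_survival(values):
        order = sorted(range(n), key=lambda i: values[i])
        out = [0.0] * n
        for rank, i in enumerate(order, start=1):
            out[i] = -math.log(1 - rank / (n + 1))  # ties broken by sort order
        return out

    return [min(u, v) for u, v in zip(neg_log_survival(obs), neg_log_survival(fcst))]


def fit_tail(z, p0):
    """Maximum-likelihood (alpha, beta) from the values of z that exceed -log(p0)."""
    z0 = -math.log(p0)
    excesses = [zi - z0 for zi in z if zi > z0]
    m = len(excesses)
    if m == 0:
        raise ValueError("no values exceed -log(p0)")
    beta = max(1.0, m / sum(excesses))          # the model requires beta >= 1
    alpha = (m / len(z)) * math.exp(beta * z0)  # matches Pr(Z > z0) to m / n
    return alpha, beta


# Tiny hand-made sample on the transformed scale, for illustration only.
z_demo = [0.1, 0.5, 1.0, 2.4, 2.8, 3.0, 1.5, 0.3, 0.7, 1.9]
alpha_demo, beta_demo = fit_tail(z_demo, p0=math.exp(-2.0))
print(round(alpha_demo, 3), round(beta_demo, 3))
```

In practice `fit_tail(to_exponential_scale(obs, fcst), p0)` would be applied to the paired series, with p0 chosen using the straight-line check described on the previous slide.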

Daily rainfall in mid-Wales
              Observed yes   Observed no   Total
Forecast yes   0.97 p^1.25        •          p
Forecast no        •              •        1 – p
Total              p            1 – p        1
The model is a good fit for base rates p < 0.135 (not shown), with estimates α = 0.97 (0.71, 1.60) and β = 1.25 (1.11, 1.47). Values in brackets are 90% confidence intervals.

Standard (black) and model-based (red) estimates of the hit rate with bootstrap 90% confidence intervals
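The talk does not state which bootstrap scheme produced these intervals. As an illustration only, the sketch below applies an ordinary percentile bootstrap to the standard hit-rate estimate, resampling forecast-observation pairs with replacement. The pairs are rebuilt from the first threshold table (a = 13, b = 9, c = 27, d = 600), and the number of resamples and the seed are arbitrary.

```python
import math
import random


def hit_rate(pairs):
    """pairs: list of (observed_event, forecast_event) booleans."""
    observed = sum(1 for o, _ in pairs if o)
    hits = sum(1 for o, f in pairs if o and f)
    return hits / observed if observed else math.nan


def bootstrap_interval(pairs, level=0.90, n_boot=2000, seed=0):
    """Percentile bootstrap interval for the hit rate, resampling whole pairs."""
    rng = random.Random(seed)
    n = len(pairs)
    stats = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        h = hit_rate(sample)
        if not math.isnan(h):  # skip resamples with no observed events
            stats.append(h)
    stats.sort()
    lo_idx = int((1 - level) / 2 * (len(stats) - 1))
    hi_idx = int((1 + level) / 2 * (len(stats) - 1))
    return stats[lo_idx], stats[hi_idx]


pairs = ([(True, True)] * 13 + [(False, True)] * 9
         + [(True, False)] * 27 + [(False, False)] * 600)
lo, hi = bootstrap_interval(pairs)
print(f"hit rate {hit_rate(pairs):.3f}, 90% interval ({lo:.3f}, {hi:.3f})")
```

Resampling individual days treats them as independent. A block bootstrap may suit serially dependent daily rainfall better.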

Standard (black) and model-based (red) estimates of the proportion correct with bootstrap 90% confidence intervals

Summary
Forecast performance degenerates for rare events (some aspects are hard to maintain, others are easy to maintain) and uncertainty can increase. Probability models based on extreme-value theory can help to estimate performance more precisely and to extrapolate performance to rarer events.

Discussion
A model for how the bias changes with base rate could be included to estimate the performance of uncalibrated forecasts. Similar ideas can be used for other types of forecasts and observations, e.g. probabilistic forecasts, although different models are required. Other related issues include observation errors, location/timing errors, sensitivity to outliers, etc.

References
Ferro CAT (2007) A probability model for verifying deterministic forecasts of extreme events. Weather and Forecasting, 22, 1089-1100.
Ferro CAT, Stephenson DB (2010) Improved verification measures for deterministic forecasts of rare, binary events. In preparation.
Heffernan JE (2000) A directory of coefficients of tail dependence. Extremes, 3, 279-290.
Ledford AW, Tawn JA (1996) Statistics for near independence in multivariate extreme values. Biometrika, 83, 169-187.
Stephenson DB, Casati B, Ferro CAT, Wilson CA (2008) The extreme dependency score: a non-vanishing verification score for deterministic forecasts of rare events. Meteorological Applications, 15, 41-50.
Contact: c.a.t.ferro@ex.ac.uk