Verifying and interpreting ensemble products
What do we mean by “calibration” or “post-processing”?
[Figure: raw and post-processed forecast PDFs compared with the observation; the two panels illustrate the “bias” and the “spread” or “dispersion” corrections; x-axis: Temperature [K], y-axis: Probability]
Post-processing has corrected the “on average” bias as well as the under-representation of the 2nd moment of the empirical forecast PDF (i.e. corrected its “dispersion” or “spread”).
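To make the idea concrete, below is a minimal sketch, not the method behind the figure, of a simple two-moment calibration in Python: remove an estimated mean bias and inflate the ensemble anomalies so the spread better matches typical forecast error. The function name, the bias, and the inflation factor are illustrative assumptions.

```python
import numpy as np

def two_moment_calibration(ensemble, bias, inflation):
    """Shift each member by an estimated mean bias and inflate the spread.

    ensemble  : (n_cases, n_members) raw ensemble forecasts
    bias      : estimated mean(forecast - observation) from a training period
    inflation : factor applied to anomalies about the ensemble mean so the
                spread better matches typical forecast error
    """
    debiased = ensemble - bias
    mean = debiased.mean(axis=1, keepdims=True)
    return mean + inflation * (debiased - mean)

# Toy example (hypothetical numbers): warm bias of 1.5 K, spread inflated by 1.4
rng = np.random.default_rng(0)
raw = rng.normal(loc=289.0, scale=0.8, size=(5, 10))   # Temperature [K]
calibrated = two_moment_calibration(raw, bias=1.5, inflation=1.4)
```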
Specific Benefits of Post-Processing
Improvements in:
- statistical accuracy, bias, and reliability: correcting basic forecast statistics (increasing user “trust”)
- discrimination and sharpness: increasing “information content”; in many cases, gains equivalent to years of NWP model development!
- relatively inexpensive!
Statistical post-processing can improve not only the statistical (unconditional) accuracy of forecasts (as measured by reliability diagrams, rank histograms, etc.), but also factors more related to “conditional” forecast behavior (as measured by, say, RPS, skill-spread relations, etc.). Last point: inexpensive, statistically-derived skill improvements can equate to significant (and expensive) NWP developments.
Benefits of Post-Processing (cont.)
Essential for tailoring to local applications: NWP provides spatially- and temporally-averaged gridded forecast output => applying gridded forecasts to point locations requires location-specific calibration to account for spatial and temporal variability (=> increasing ensemble dispersion).
Raw versus Calibrated PDFs
Blue is the “raw” ensemble; black is the calibrated ensemble; red is the observed value.
Notice the significant change in both the “bias” and the dispersion of the final PDF (also notice the PDF asymmetries).
Verifying ensemble (probabilistic) forecasts
Overview:
- Rank histogram
- Mean square error (MSE)
- Brier score
- Ranked Probability Score (RPS)
- Reliability diagram
- Relative Operating Characteristic (ROC) curve
- Skill score
Example: January T, Before vs. After Calibration
[Figure: January temperature traces before (left) and after (right) calibration; the black curve shows observations, colors are ensemble members]
Note the under-bias of the forecasts in the left-hand plot. In the right-hand plot, the observations are now generally embedded within the ensemble bundle. Comparing the left/right ovals, notice the significant variation in the PDFs, both in their non-Gaussian structure and in the change in ensemble spread.
Rank Histograms: measuring the reliability of an ensemble forecast
A rank histogram tallies, over many cases, the rank of the verifying observation within the sorted ensemble; a flat histogram indicates a statistically consistent ensemble.
You cannot verify an ensemble forecast with a single observation. The more data you have for verification (as is true in general for other statistical measures), the more certain you are. Rare (low-probability) events require more data to verify, as do systems with many ensemble members.
From Barb Brown
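A minimal sketch of how one might tally a rank histogram with NumPy; the function name, array shapes, and tie handling are assumptions, and the data are synthetic.

```python
import numpy as np

def rank_histogram(ensemble, obs):
    """Count the rank of each observation within its sorted ensemble.

    ensemble : array of shape (n_cases, n_members)
    obs      : array of shape (n_cases,)
    Returns counts for the n_members + 1 possible ranks.
    """
    n_cases, n_members = ensemble.shape
    # Rank = number of members below the observation (ties are not treated
    # specially here; real applications often break ties randomly).
    ranks = (ensemble < obs[:, None]).sum(axis=1)
    return np.bincount(ranks, minlength=n_members + 1)

# Toy example: 1000 cases, 10 members drawn from the same distribution as obs,
# so the histogram should come out roughly flat.
rng = np.random.default_rng(0)
ens = rng.normal(size=(1000, 10))
ob = rng.normal(size=1000)
print(rank_histogram(ens, ob))
```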
[Figure slide, from Tom Hamill]
Troubled Rank Histograms
[Figure: two example rank histograms, counts vs. ensemble bin 1-10, illustrating non-uniform shapes]
Slide from Matt Pocernic
Example: July T, Before vs. After Calibration
Before calibration: a U-shaped rank histogram (underdispersive), with a larger under- than over-bias. After calibration: the observation falls within the ensemble bundle (the rank histogram is still not perfectly “uniform” due to the small sample size). After quantile regression, the rank histograms are near-uniform.
NOTE: for our current application, improving the information content of the ensemble spread (i.e. as a representation of potential forecast error) is the most significant gain from our calibration.
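As an illustration only, the sketch below uses statsmodels' QuantReg to derive calibrated forecast quantiles from the ensemble mean. The single-predictor setup, the chosen quantiles, and the synthetic training data are assumptions, not the configuration used in the example above.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical training data: ensemble-mean forecasts and matching observations.
rng = np.random.default_rng(1)
ens_mean_train = rng.normal(15.0, 3.0, size=500)                  # degrees C
obs_train = 0.8 * ens_mean_train + 2.0 + rng.normal(0, 1.5, size=500)

X = sm.add_constant(ens_mean_train)                               # intercept + predictor

# Fit one quantile-regression model per desired quantile of the
# calibrated forecast distribution.
quantiles = [0.1, 0.25, 0.5, 0.75, 0.9]
models = {q: sm.QuantReg(obs_train, X).fit(q=q) for q in quantiles}

# Calibrated quantiles for a new forecast (ensemble mean = 18 C).
x_new = sm.add_constant(np.array([18.0]), has_constant='add')
calibrated = {q: m.predict(x_new)[0] for q, m in models.items()}
print(calibrated)
```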
Continuous scores: MSE
Attribute: measures accuracy.
MSE = (1/n) Σ_i (f_i - o_i)², the average of the squared errors:
- it measures the magnitude of the error, weighted by the squares of the errors;
- it does not indicate the direction of the error.
Quadratic rule, therefore large weight on large errors:
- good if you wish to penalize large errors;
- sensitive to large values (e.g. precipitation) and outliers; sensitive to large variance (high-resolution models); encourages conservative forecasts (e.g. climatology).
=> For an ensemble forecast, use the ensemble mean.
Slide from Barbara Casati
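A minimal sketch of the ensemble-mean MSE in Python; the function name and array shapes are assumptions.

```python
import numpy as np

def ensemble_mean_mse(ensemble, obs):
    """MSE of the ensemble-mean forecast: mean of (f_i - o_i)^2.

    ensemble : (n_cases, n_members), obs : (n_cases,)
    """
    f = ensemble.mean(axis=1)          # reduce the ensemble to its mean
    return np.mean((f - obs) ** 2)
```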
Brier Score
Does the forecast correctly detect temperatures above 18 degrees?
BS = (1/n) Σ_{i=1..n} (y_i - o_i)²
where y_i is the forecast probability of the event, o_i is the observed occurrence (0 or 1), and i indexes the n samples.
=> Note the similarity to MSE.
Slide from Barbara Casati
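A minimal sketch, assuming the forecast probability is taken as the fraction of ensemble members exceeding the 18-degree threshold; the function name and threshold default are illustrative.

```python
import numpy as np

def brier_score(ensemble, obs, threshold=18.0):
    """Brier score for the event 'temperature above `threshold`'.

    ensemble : (n_cases, n_members) forecast temperatures
    obs      : (n_cases,) verifying temperatures
    """
    y = (ensemble > threshold).mean(axis=1)   # forecast probability per case
    o = (obs > threshold).astype(float)       # observed occurrence (0 or 1)
    return np.mean((y - o) ** 2)
```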
Ranked Probability Score (RPS), for multi-categorical or continuous variables: the sum over ordered categories of the squared differences between the forecast and observed cumulative probabilities (its continuous limit is the CRPS).
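A minimal per-forecast RPS sketch; the category probabilities are illustrative, and the unnormalized convention is an assumption (some texts divide by K-1).

```python
import numpy as np

def rps(forecast_probs, obs_category):
    """Ranked Probability Score for one forecast over K ordered categories.

    forecast_probs : length-K array of category probabilities (sums to 1)
    obs_category   : index of the observed category (0..K-1)
    """
    K = len(forecast_probs)
    F = np.cumsum(forecast_probs)                        # forecast CDF
    O = (np.arange(K) >= obs_category).astype(float)     # observed CDF (step)
    return np.sum((F - O) ** 2)

# Example: 3 categories (below / near / above normal), observation in category 2
print(rps(np.array([0.2, 0.3, 0.5]), 2))   # 0.29
```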
Conditional Distributions: conditional histogram and conditional box-plot.
Slide from Barbara Casati
Reliability (or Attributes) Diagram
For each forecast-probability bin, the observed relative frequency is plotted against the mean forecast probability; a reliable forecast lies along the diagonal.
Slide from Matt Pocernic
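A minimal sketch of how the points of a reliability diagram might be computed; the function name, binning scheme, and return values are assumptions.

```python
import numpy as np

def reliability_curve(forecast_probs, outcomes, n_bins=10):
    """Points for a reliability diagram.

    forecast_probs : forecast probabilities in [0, 1]
    outcomes       : observed occurrences (0 or 1)
    Returns mean forecast probability, observed frequency, and count per bin.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bins = np.clip(np.digitize(forecast_probs, edges) - 1, 0, n_bins - 1)
    mean_fcst, obs_freq, counts = [], [], []
    for b in range(n_bins):
        mask = bins == b
        if mask.any():                       # skip empty bins
            mean_fcst.append(forecast_probs[mask].mean())
            obs_freq.append(outcomes[mask].mean())
            counts.append(int(mask.sum()))
    return np.array(mean_fcst), np.array(obs_freq), np.array(counts)
```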
Scatter-plot and Contingency Table
Does the forecast correctly detect temperatures above 18 degrees? Does it correctly detect temperatures below 10 degrees?
Slide from Barbara Casati
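A minimal sketch of building the 2x2 contingency table for the "above 18 degrees" event; the function name and threshold default are illustrative.

```python
import numpy as np

def contingency_table(forecast, obs, threshold=18.0):
    """2x2 contingency table for the event 'value above `threshold`'."""
    f = forecast > threshold
    o = obs > threshold
    hits              = int(np.sum(f & o))
    false_alarms      = int(np.sum(f & ~o))
    misses            = int(np.sum(~f & o))
    correct_negatives = int(np.sum(~f & ~o))
    return hits, false_alarms, misses, correct_negatives
```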
Discrimination Plot
[Figure: forecast distributions conditioned on outcome = No and outcome = Yes; a decision threshold separates hits from false alarms]
Slide from Matt Pocernic
Receiver Operating Characteristic (ROC) Curve
Hit rate versus false-alarm rate as the decision (probability) threshold is varied.
Slide from Matt Pocernic
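A minimal sketch of computing ROC points from forecast probabilities; the function name, threshold set, and the guard against empty categories are assumptions.

```python
import numpy as np

def roc_points(forecast_probs, outcomes, thresholds=np.linspace(0.0, 1.0, 11)):
    """Hit rate (POD) and false-alarm rate (POFD) at each probability threshold.

    forecast_probs : forecast probabilities in [0, 1]
    outcomes       : observed occurrences (0 or 1)
    """
    outcomes = outcomes.astype(bool)
    pod, pofd = [], []
    for t in thresholds:
        warn = forecast_probs >= t                    # issue a "yes" forecast
        hits         = np.sum(warn & outcomes)
        misses       = np.sum(~warn & outcomes)
        false_alarms = np.sum(warn & ~outcomes)
        corr_neg     = np.sum(~warn & ~outcomes)
        pod.append(hits / max(hits + misses, 1))
        pofd.append(false_alarms / max(false_alarms + corr_neg, 1))
    return np.array(pofd), np.array(pod)              # plot POD against POFD
```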
Skill Scores
A single value to summarize performance relative to a reference forecast, the best naive guess (e.g. persistence or climatology):
SS = (score_forecast - score_reference) / (score_perfect - score_reference)
A perfect forecast implies that the forecast quantity can be perfectly observed.
Positively oriented: positive is good.
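A minimal sketch of the generic skill-score formula above; the function name and the example numbers are illustrative.

```python
def skill_score(score_forecast, score_reference, score_perfect=0.0):
    """Generic skill score: 1 = perfect skill, 0 = no improvement over reference.

    For negatively oriented scores such as MSE, Brier score, or RPS the
    perfect score is 0, so SS = 1 - score_forecast / score_reference.
    """
    return (score_forecast - score_reference) / (score_perfect - score_reference)

# Example (hypothetical numbers): Brier skill score relative to climatology
print(skill_score(score_forecast=0.08, score_reference=0.12))   # ~0.33
```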
References:
Jolliffe, I.T. and D.B. Stephenson (2003): Forecast Verification: A Practitioner’s Guide in Atmospheric Science. Wiley & Sons, 240 pp.
Wilks, D.S. (2005): Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.
Stanski, Burrows, and Wilson (1989): Survey of Common Verification Methods in Meteorology.
http://www.eumetcal.org.uk/eumetcal/verification/www/english/courses/msgcrs/index.htm
http://www.bom.gov.au/bmrc/wefor/staff/eee/verif/verif_web_page.html