Richard (Rick) Jones, Regional Training Workshop on Severe Weather Forecasting, Macau, April 8-13, 2013

Verification: the WMO-sponsored Joint Working Group on Forecast Verification Research (JWGFVR).

Resources: The EUMETCAL training site on verification (computer-aided learning): www/english/courses/msgcrs/index.htm. The website of the Joint Working Group on Forecast Verification Research. WMO/TD 1083: Guidelines on Performance Assessment of Public Weather Services.

Why? "You can't know where you're going until you know where you've been" (proverb); or George Santayana: "Those who are unaware of history are destined to repeat it." Quality management: "Plan, Do, Check, Act" (Deming). How to verify: "Begin with the end in mind" (Covey).

Verification as a measure of forecast quality: to monitor forecast quality, to improve forecast quality, and to compare the quality of different forecast systems.

Introduction. "Verification activity has value only if the information generated leads to a decision about the forecast or system being verified" (A. Murphy). "User-oriented verification": verification methods designed with the needs of a specific user in mind. "Users" are those who are interested in verification results and who will take action based on them; forecasters and modelers are users too.

SWFDP Goals
Progress against SWFDP goals: to improve the ability of NMSs to forecast severe weather events; to improve the lead time of alerting for these events; to improve the interaction of NMSs with Disaster Management and Civil Protection authorities before, during and after severe weather events; to identify gaps and areas for improvement; to improve the skill of products from Global Centres through feedback from NMSs.
Evaluation of weather warnings: feedback from the public; feedback from the DMCPA, including comments on the timeliness and usefulness of the warnings; feedback from the media; warning verification by the NMCs.

Introduction. Importance of verification: any group can, with relative ease, put out a forecast for anywhere in the world. How good are they? Product differentiation.

Goals of Verification: Scientific. To identify the strengths and weaknesses of a forecast product in sufficient detail that actions can be specified that will lead to improvements in the product, i.e. to provide information to direct R&D. Demands more detail in the verification methodology ("diagnostic verification"). SWFDP: both administrative and scientific goals.

Forecast "goodness". What makes a forecast good? QUALITY: how well it corresponds with the actual weather, as revealed by observations (verification). VALUE: the increase or decrease in economic or other value to a user, attributable to their use of the forecast (satisfaction). Value requires information from the user to assess, in addition to verification, and can be assessed by methods of decision theory (cost-loss models, etc.).

Principles of (Objective) Verification. Verification activity has value only if the information generated leads to a decision about the forecast or system being verified. The user of the information must be identified. The purpose of the verification must be known in advance. No single verification measure provides complete information about the quality of a forecast product.

The Contingency Table
                Observed: Yes        Observed: No
Forecast: Yes   hits                 false alarms
Forecast: No    misses               correct negatives

Preparation of the table. Start with matched forecasts and observations. The forecast event is precipitation > 50 mm / 24 h for the next day, at the medium-risk threshold. Count up the number of hits, false alarms, misses and correct negatives over the whole sample, and enter them into the corresponding four boxes of the table. (The slide shows a worked example: a table of days 1-9 with columns "Day", "Fcst to occur?" and "Observed?".)
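
The counting step is easy to automate. Below is a minimal Python sketch, not part of the original workshop material, that tallies the four cells of the table from matched yes/no pairs; the nine forecast/observation pairs are invented for illustration.

def contingency_counts(forecasts, observations):
    # Tally (hits, false alarms, misses, correct negatives) from paired
    # yes/no (True/False) forecasts and observations.
    hits = false_alarms = misses = correct_negatives = 0
    for f, o in zip(forecasts, observations):
        if f and o:
            hits += 1
        elif f and not o:
            false_alarms += 1
        elif o:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives

# Nine hypothetical days of "precipitation > 50 mm / 24 h" forecasts vs. observations
fcst = [True, False, False, True, False, True, False, False, False]
obs  = [True, True,  False, False, False, True, False, True,  False]
print(contingency_counts(fcst, obs))   # -> (2, 1, 2, 4)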

Exercise: Mozambique contingency table. Review on Thursday 11 April.

Outline. Introduction: purposes and principles of verification. Some relevant verification measures: the contingency table and scores. Verification of products from the SWFDP. Verification of probability forecasts. Exercise results and interpretation.

Forecast "goodness". Evaluation of the forecast system (forecast goodness) versus evaluation of the delivery system: timeliness (are forecasts issued in time to be useful?), relevance (are forecasts delivered to intended users in a form they can understand and use?), robustness (level of errors or failures in the delivery of forecasts).

Principles of (Objective) Verification. The forecast must be stated in such a way that it can be verified. What about subjective verification? With care, it is OK; if subjective, it should not be done by anyone directly connected with the forecast. Sometimes necessary due to a lack of objective information.

Goals of Verification: Administrative. Justify the cost of provision of weather services; justify additional or new equipment; monitor the quality of forecasts and track changes. Usually means summarizing the verification into a few numbers (scoring). Impact: dollars and injuries.

Verification Procedure. Start with a dataset of matched observations and forecasts; data preparation is the major part of the effort of verification. Establish the purpose: scientific vs. administrative. Pose the question to be answered, for a specific user or set of users. Stratify the dataset on the basis of user requirements (seasonal, extremes, etc.), taking care to maintain a sufficient sample size.

Verification Procedure. Nature of the variable being verified. Continuous: forecasts of a specific value at a specified time and place. Categorical: forecast of an "event", defined by a range of values, for a specific time period and place or area. Probabilistic: same as categorical, but the uncertainty is estimated. SWFDP: the predicted variables are categorical (extreme events, where "extreme" is defined by thresholds of precipitation and wind); some probabilistic forecasts are available too.
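
For categorical verification, a continuous predicted or observed value is first turned into a yes/no event by applying the chosen threshold. A minimal illustrative sketch, not from the workshop material; the 50 mm / 24 h threshold matches the example used elsewhere in the deck, and the amounts are invented.

THRESHOLD_MM = 50.0   # severe 24-h precipitation threshold (as in the worked example)

def is_event(precip_24h_mm):
    # True if the 24-h precipitation total reaches or exceeds the threshold
    return precip_24h_mm >= THRESHOLD_MM

print(is_event(63.2))   # True
print(is_event(12.5))   # False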

What is the Event? For categorical and probabilistic forecasts, one must be clear about the "event" being forecast: the location or area for which the forecast is valid, the time range over which it is valid, and the definition of the category. Example?

What is the Event? And now, what is defined as a correct forecast, a "hit"? The event is forecast and is observed: anywhere in the area? Over some percentage of the area? Scaling considerations. Discussion.

Events for the SWFDP. Best if "events" are defined for similar time periods and similar-sized areas: one day (24 h); fixed areas, which should correspond to forecast areas and have at least one reporting station. The smaller the areas, the more useful the forecast, potentially, BUT predictability is lower for smaller areas and missed event/false alarm pairs become more likely.

Events for the SWFDP. Correct negatives are a problem; data density is a problem. It is best to avoid verification where there is no data.

The Contingency Table
                Observed: Yes        Observed: No
Forecast: Yes   hits                 false alarms
Forecast: No    misses               correct negatives

Contingency tables. PoD ("prefigurance", "probability of detection", or "hit rate"): range 0 to 1, best score = 1. Sensitive only to missed events, not false alarms; can always be increased by overforecasting rare events. FAR ("false alarm ratio"): range 0 to 1, best score = 0. Sensitive only to false alarms, not missed events; can always be improved by underforecasting rare events.

Contingency tables. PAG ("post agreement"): range 0 to 1, best score = 1. PAG = (1 - FAR) and has the same characteristics. Bias: best score = 1. This is the frequency bias; it indicates whether the forecast distribution is similar to the observed distribution of the categories (reliability).
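
With the usual cell notation a = hits, b = false alarms, c = misses and d = correct negatives, the scores on the last two slides can be computed directly. A minimal sketch, using the invented counts from the earlier tally rather than real data:

def pod(a, b, c, d):
    # Probability of detection (hit rate): fraction of observed events that were forecast
    return a / (a + c)

def far(a, b, c, d):
    # False alarm ratio: fraction of forecast events that did not occur
    return b / (a + b)

def pag(a, b, c, d):
    # Post agreement = 1 - FAR
    return a / (a + b)

def frequency_bias(a, b, c, d):
    # Frequency bias: number of "yes" forecasts divided by number of "yes" observations
    return (a + b) / (a + c)

a, b, c, d = 2, 1, 2, 4
print(pod(a, b, c, d), far(a, b, c, d), pag(a, b, c, d), frequency_bias(a, b, c, d))
# 0.5  0.333...  0.666...  0.75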

Contingency tables. CSI (critical success index), better known as the Threat Score: range 0 to 1, best score = 1. Sensitive to both false alarms and missed events; a more balanced measure than either PoD or FAR.

Contingency tables. Heidke skill score against chance (as shown): range from negative values to 1, best score = 1. It is easy to show positive values against chance; it is better to use climatology or persistence as the reference, which needs another table.
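
The same four counts give the threat score and the Heidke skill score against a chance reference. A minimal sketch with the same illustrative counts:

def csi(a, b, c, d):
    # Critical success index (threat score): ignores correct negatives
    return a / (a + b + c)

def heidke_vs_chance(a, b, c, d):
    # Heidke skill score with random chance as the reference forecast
    n = a + b + c + d
    expected_correct = ((a + c) * (a + b) + (b + d) * (c + d)) / n
    return (a + d - expected_correct) / (n - expected_correct)

a, b, c, d = 2, 1, 2, 4
print(csi(a, b, c, d))               # 0.4
print(heidke_vs_chance(a, b, c, d))  # ~0.31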

Contingency tables. Hit Rate (HR): range 0 to 1, best score = 1; it is the same as the PoD and has the same characteristics. False alarm RATE: range 0 to 1, best score = 0; this is different from the false alarm ratio. These two are used together in the Hanssen-Kuipers score and in the ROC, and are best used in comparison.

Contingency tables. Extreme dependency score (EDS): range -1 to 1, best score = 1. The score can be improved by incurring more false alarms. Considered useful for extremes because it does not converge to 0 as the base rate (observed frequency of events) decreases. A relatively new score, not yet widely used.
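
The hit rate H, the false alarm rate F and the EDS can be sketched the same way (illustrative counts again; the EDS is written here in its usual form as a function of the hits and the base rate):

import math

def pofd(a, b, c, d):
    # False alarm RATE (probability of false detection): false alarms / observed "no" cases
    return b / (b + d)

def hanssen_kuipers(a, b, c, d):
    # Hanssen-Kuipers (Peirce) skill score: H - F
    return a / (a + c) - b / (b + d)

def eds(a, b, c, d):
    # Extreme dependency score: depends only on the hits and the base rate
    n = a + b + c + d
    return 2.0 * math.log((a + c) / n) / math.log(a / n) - 1.0

a, b, c, d = 2, 1, 2, 4
print(pofd(a, b, c, d))             # 0.2
print(hanssen_kuipers(a, b, c, d))  # 0.3
print(eds(a, b, c, d))              # ~0.08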

Extreme weather scores: Extreme Dependency Score (EDS), Extreme Dependency Index (EDI), Symmetric Extremal Dependency Score (SEDS), Symmetric Extremal Dependency Index (SEDI).

EDS, EDI, SEDS, SEDI: novel categorical measures for the verification of extreme, high-impact weather. Standard scores tend to zero for rare events; these measures aim for base-rate independence and are functions of H and F. Extremal Dependency Index (EDI) and Symmetric Extremal Dependency Index (SEDI): Ferro and Stephenson, 2011, "Improved verification measures for deterministic forecasts of rare, binary events", Weather and Forecasting.
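
Since EDI and SEDI are functions of H and F only, they are easy to compute once a contingency table is in hand. A sketch following the formulas given by Ferro and Stephenson, with H and F taken from the illustrative counts above:

import math

def edi(h, f):
    # Extremal dependency index, from hit rate h and false alarm rate f
    return (math.log(f) - math.log(h)) / (math.log(f) + math.log(h))

def sedi(h, f):
    # Symmetric extremal dependency index
    num = math.log(f) - math.log(h) - math.log(1 - f) + math.log(1 - h)
    den = math.log(f) + math.log(h) + math.log(1 - f) + math.log(1 - h)
    return num / den

print(edi(0.5, 0.2))    # ~0.40
print(sedi(0.5, 0.2))   # ~0.43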

Weather Warning Index (Canada)

Example - Madagascar. Separate contingency tables (rows: Fcst yes, Fcst no, Totals; columns: Obs yes, Obs no, Totals) are built assuming the low, medium and high risk levels in turn as the warning threshold; the counts for each table are shown on the slide. One can then plot the hit rate vs. the false alarm RATE ( = false alarms / total observed "no") for each threshold.

Example (contd)
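
A sketch of the calculation behind this kind of example: one 2x2 table per risk level, each giving a (false alarm rate, hit rate) point that can be plotted as in a ROC diagram. The counts below are invented for illustration; they are not the Madagascar data.

thresholds = {
    "low":    {"a": 30, "b": 40, "c": 5,  "d": 125},
    "medium": {"a": 20, "b": 15, "c": 15, "d": 150},
    "high":   {"a": 8,  "b": 4,  "c": 27, "d": 161},
}

for name, t in thresholds.items():
    hit_rate = t["a"] / (t["a"] + t["c"])   # H = hits / observed "yes"
    fa_rate = t["b"] / (t["b"] + t["d"])    # F = false alarms / observed "no"
    print(f"{name:>6}: H = {hit_rate:.2f}, F = {fa_rate:.2f}")

Lowering the warning threshold moves the point up and to the right (more hits but more false alarms), which is the trade-off a ROC-style plot makes visible.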

Discrimination. User perspective: does the model or forecast tend to give higher values of precipitation when heavy precipitation occurs than when it does not? (Similarly for temperature.)

How do we verify this?

Contingency Table for spatial data. Possible interpretation for spatially defined threat areas: put a grid of equal-area boxes over the overlaid observations and forecasts; the table entries are just the number of boxes covered by the respective areas (hits, misses, false alarms, as shown in the diagram). Correct negatives are problematic, but could be limited to the total forecast domain. Likely to result in an overforecasting bias (a different interpretation?). Can be done only where spatially continuous observations and forecasts are available (hydro-estimator?).
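
A minimal sketch of this gridded interpretation: lay the forecast and observed threat areas on the same grid of boxes and count the boxes in each category. The two 0/1 masks below are invented for illustration.

forecast_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
observed_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

hits = misses = false_alarms = correct_negatives = 0
for f_row, o_row in zip(forecast_mask, observed_mask):
    for f, o in zip(f_row, o_row):
        if f and o:
            hits += 1
        elif o:
            misses += 1
        elif f:
            false_alarms += 1
        else:
            correct_negatives += 1   # only meaningful within the chosen domain

print(hits, misses, false_alarms, correct_negatives)   # 4 1 3 4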

Summary - Verification of SWFDP products
Product: NMC severe weather warnings. Who should verify: NMC. General method: contingency tables and scores.
Product: RSMC severe weather guidance charts. Who should verify: RSMC. General method: graphical contingency table.
Product: Global centre deterministic models. Who should verify: Global centres. General method: continuous scores (temperature); contingency tables (precipitation, wind).
Product: Global EPS. Who should verify: Global centres. General method: scores for ensemble pdfs; scores for probability forecasts with respect to relevant categories.

Probability forecast verification - Reliability tables. Reliability: the level of agreement between the forecast probability and the observed frequency of an event. Usually displayed graphically. Measures the bias in a probability forecast: is there a tendency to overforecast or underforecast? Cannot be evaluated on a single forecast.
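
A minimal sketch of how a reliability table is built: bin the issued probabilities and compare the mean forecast probability in each bin with the frequency at which the event was actually observed. The probability/outcome pairs below are invented for illustration.

probs    = [0.05, 0.10, 0.20, 0.30, 0.30, 0.50, 0.60, 0.70, 0.80, 0.90]
outcomes = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]   # 1 = event observed

bins = [(0.0, 0.2), (0.2, 0.4), (0.4, 0.6), (0.6, 0.8), (0.8, 1.01)]

for lo, hi in bins:
    pairs = [(p, o) for p, o in zip(probs, outcomes) if lo <= p < hi]
    if not pairs:
        continue   # skip empty probability bins
    mean_fcst = sum(p for p, _ in pairs) / len(pairs)
    obs_freq = sum(o for _, o in pairs) / len(pairs)
    print(f"[{lo:.1f}, {hi:.1f}): mean forecast {mean_fcst:.2f}, "
          f"observed frequency {obs_freq:.2f}, n = {len(pairs)}")

Plotted on a reliability diagram, points near the diagonal indicate good reliability; forecast probabilities consistently above the observed frequencies indicate overforecasting.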

Reliability

Reliability - Summer 2008, Europe, 114 h

Summary - NMS products. Warnings issued by NMSs: contingency tables as above, if enough data are gathered. For a warning it is important to determine the lead time, so the issue time of the warning and the occurrence time of the event must both be archived. Data problems: verify the "reporting of the event".

Summary and discussion. Keep the data! Be clear about all forecasts! Know why you are verifying and for whom! Keep the verification simple but relevant! Just do it! Case studies: post-mortems.

SWFDP verification Thank you