6th WMO tutorial Verification, Martin Göber: Continuous

Presentation transcript:

Slide 1: Good afternoon! नमस्कार Guten Tag! Buenos dias! до́брый день! Qwertzuiop asdfghjkl! Bom dia! Bonjour! Please, verify! Good afternoon! नमस्कार Guten Tag! Buenos dias! до́брый день! Qwertyuiop asdfghjkl! Bom dia! Bonjour!

Slide 2: Verification of continuous variables. Martin Göber, Deutscher Wetterdienst (DWD), Hans-Ertel-Centre for Weather Research (HErZ). Acknowledgements: Thanks to Barb Brown and Barbara Casati!

Slide 3: Types of forecasts, observations
- Continuous
  - Temperature
  - Rainfall amount
  - 500 hPa geopotential height
- Categorical
  - Dichotomous
    - Rain vs. no rain
    - Thresholding of continuous variables
    - Strong winds vs. no strong winds
    - Often formulated as Yes/No (YY, NY, YN, NN)
  - Multi-category
    - Cloud amount category
    - Precipitation type
Except when it is meaningful, forecasts should not be degraded to categorical, due to the resulting loss of information.

Slide 4: The joint probability distribution p(f,o) of forecast f and observation o. Figure: joint frequency distribution, road surface temperature, winter 2011. (961 classes) × (100 stations) × (2 days) × (5 kinds of forecasts) = 961,000, i.e. about 1 million numbers to analyse: the "curse of dimensionality". Boil down to a few numbers → (little?) loss of information.

Slide 5: Continuous verification. Normally distributed errors.

Slide 6: Normally distributed errors. If errors are normally distributed, then 2 parameters (mean and standard deviation of the error) are enough to answer all questions approximately. If the systematic error ("bias") is small, then sqrt(MSE) ≈ standard deviation of the error ("standard error").

Slide 7: Bias. Mean error ME, ideally = 0.
- "Systematic error": "on average, something goes wrong in one direction", e.g. model physics wrongly tuned, missing processes, wrong interpretation of guidance
- Tells us nothing about the pairwise match of forecasts and observations
- Large in the past, rather small nowadays on average, but maybe large e.g. for certain weather types
- Misleading for multi-modal error distributions → take the Mean Absolute Error MAE

Slide 8: ME and MAE. Q: If the |ME| is similar to the MAE, performing the bias correction is safe; if MAE >> |ME|, performing the bias correction is dangerous: why? A: If MAE >> |ME|, it means that positive and negative errors cancel out in the bias evaluation …
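The question above can be illustrated with a small sketch. The temperatures are invented for illustration, and the function names are hypothetical, not taken from the slides. The first forecast set is uniformly 2 degrees too warm, so ME equals MAE and subtracting the bias removes the error. The second has errors of alternating sign, so ME is 0 while MAE is 3, and a bias correction changes nothing.

```python
from statistics import mean


def mean_error(forecasts, observations):
    """ME (bias): average of forecast minus observation."""
    return mean(f - o for f, o in zip(forecasts, observations))


def mean_absolute_error(forecasts, observations):
    """MAE: average magnitude of forecast minus observation."""
    return mean(abs(f - o) for f, o in zip(forecasts, observations))


# Hypothetical temperatures in degrees C (illustrative values only).
obs = [10.0, 12.0, 14.0, 16.0]
fc_shifted = [12.0, 14.0, 16.0, 18.0]  # every forecast 2 degrees too warm
fc_mixed = [13.0, 9.0, 17.0, 13.0]     # errors +3, -3, +3, -3

me_shifted = mean_error(fc_shifted, obs)
mae_shifted = mean_absolute_error(fc_shifted, obs)
me_mixed = mean_error(fc_mixed, obs)
mae_mixed = mean_absolute_error(fc_mixed, obs)

# Bias correction: subtract the ME from every forecast.
corrected_shifted = [f - me_shifted for f in fc_shifted]
corrected_mixed = [f - me_mixed for f in fc_mixed]
mae_shifted_corrected = mean_absolute_error(corrected_shifted, obs)
mae_mixed_corrected = mean_absolute_error(corrected_mixed, obs)

print(me_shifted, mae_shifted, mae_shifted_corrected)
print(me_mixed, mae_mixed, mae_mixed_corrected)
```

In the second case the errors are still there after correction; the ME simply could not see them.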

Slide 9: RMSE. Mean squared error MSE or root mean square error RMSE. Accuracy measure: determines the distance between individual forecasts and observations. Ideally RMSE = 0. "It might be useful on average, but when it's really important it's not good!" ???? NOT necessarily, e.g.: 1 five-degree error is penalised like 25 one-degree errors; 1 ten-degree error is penalised like 100 one-degree errors.
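The squared penalty can be checked with a short sketch. The error lists are made up for illustration: one sample has a single 5-degree error among 25 forecasts, the other has 25 one-degree errors. Both give the same RMSE, while the MAE differs by a factor of five.

```python
import math


def rmse(errors):
    """Root mean square of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(errors):
    """Mean absolute value of a list of forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical error samples (forecast minus observation, in degrees).
one_big = [5.0] + [0.0] * 24   # one 5-degree error, 24 perfect forecasts
many_small = [1.0] * 25        # twenty-five 1-degree errors

rmse_big = rmse(one_big)
rmse_small = rmse(many_small)
mae_big = mae(one_big)
mae_small = mae(many_small)

print(rmse_big, rmse_small)  # equal: 5**2 == 25 * 1**2
print(mae_big, mae_small)    # MAE weights both kinds of error linearly
```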

Slide 10: Interpretation of RMSE. If errors are normally distributed, then [equation not preserved].
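The statement on this slide is not preserved in the transcript. One common reading, assumed here, is that for unbiased, normally distributed errors the RMSE equals the standard deviation of the error. In that case about 68% of errors fall within ±RMSE and about 95% within ±2 RMSE. The sketch below computes these fractions from the normal distribution; the function name is hypothetical.

```python
import math


def fraction_within(k):
    """P(|error| <= k * sigma) for a zero-mean normal error with std. dev. sigma."""
    return math.erf(k / math.sqrt(2.0))


within_1_rmse = fraction_within(1.0)
within_2_rmse = fraction_within(2.0)

print(round(within_1_rmse, 3), round(within_2_rmse, 3))
```

With a noticeable bias the RMSE exceeds the standard deviation of the error, and these fractions no longer apply to ±RMSE around zero.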

Slide 11: Decomposition of the MSE [equation not preserved]. Consequence: smooth forecasts verify better. Bias can be subtracted!
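The slide's equation is missing from the transcript. A standard decomposition, assumed here to be the one intended, expresses the MSE through the bias, the standard deviations of forecasts and observations, and their correlation. The symbols s_f, s_o and r are introduced for illustration, with variances normalised by 1/n.

```latex
\begin{align}
\mathrm{MSE} &= \frac{1}{n}\sum_{i=1}^{n} (f_i - o_i)^2
  = (\bar{f} - \bar{o})^2 + s_f^2 + s_o^2 - 2\, s_f\, s_o\, r, \\
\frac{\partial\, \mathrm{MSE}}{\partial s_f} &= 2 s_f - 2 s_o r = 0
  \quad\Longrightarrow\quad s_f = r\, s_o, \\
\mathrm{MSE}_{\min} &= (\bar{f} - \bar{o})^2 + s_o^2 \,(1 - r^2).
\end{align}
```

For a fixed correlation 0 < r < 1, the MSE is smallest when the forecasts vary less than the observations (s_f = r s_o < s_o). This is one way to read "smooth forecasts verify better". The first term is the squared mean error, which disappears once the bias is subtracted from the forecasts.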

Slide 12: Correlation coefficient
- Measures the level of "association" between the forecasts and observations
- Related to the "phase error" of the harmonic decomposition of the forecast
- Is familiar and relatively easy to interpret
- Has a nonparametric analog based on ranks

Slide 13: Correlation coefficient

Slide 14: Correlation coefficient

Slide 15: Correlation coefficient. What is wrong with the correlation coefficient as a measure of performance? It doesn't take into account biases and amplitude, so it can inflate the performance estimate. It is more appropriate as a measure of "potential" performance.
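A small sketch of this weakness, with invented numbers: the forecast below has twice the observed amplitude and a constant offset of +3. Its Pearson correlation with the observations is still 1, although its RMSE is large. The helper names are hypothetical.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(forecasts, observations):
    """Root mean square error of forecasts against observations."""
    n = len(observations)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / n)


# Hypothetical data: doubled amplitude plus a constant offset.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
fc = [2.0 * o + 3.0 for o in obs]

r = pearson_r(fc, obs)
err = rmse(fc, obs)

print(round(r, 6), round(err, 3))
```

The correlation only says how good the forecast could become after the bias and amplitude are corrected, which matches the "potential" performance reading above.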

Slide 16: Comparative verification
- Generic skill score definition: SS = (M - M_ref) / (M_perf - M_ref), where M is the verification measure for the forecasts, M_ref is the measure for the reference forecasts, and M_perf is the measure for perfect forecasts
- Measures percent improvement of the forecast over the reference
- Positively oriented (larger is better)
- Choice of the standard matters (a lot!)
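A minimal sketch of the generic skill score in its usual form, SS = (M - M_ref) / (M_perf - M_ref). The function name and all numbers are hypothetical. It shows that the score is positively oriented whether the underlying measure is negatively oriented (MSE, perfect value 0) or positively oriented (a hit rate, perfect value 1).

```python
def skill_score(m, m_ref, m_perf):
    """Generic skill score: fraction of the possible improvement over the reference."""
    if m_perf == m_ref:
        raise ValueError("reference is already perfect; skill score undefined")
    return (m - m_ref) / (m_perf - m_ref)


# Negatively oriented measure (e.g. MSE, perfect value 0.0).
ss_better = skill_score(m=2.0, m_ref=4.0, m_perf=0.0)   # half the reference MSE
ss_perfect = skill_score(m=0.0, m_ref=4.0, m_perf=0.0)  # perfect forecast
ss_same = skill_score(m=4.0, m_ref=4.0, m_perf=0.0)     # no better than reference
ss_worse = skill_score(m=6.0, m_ref=4.0, m_perf=0.0)    # worse than reference

# Positively oriented measure (e.g. a hit rate, perfect value 1.0).
ss_hit_rate = skill_score(m=0.8, m_ref=0.6, m_perf=1.0)

print(ss_better, ss_perfect, ss_same, ss_worse, ss_hit_rate)
```

Multiplying by 100 gives the percent improvement over the reference; negative values mean the forecast is worse than the reference.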

Slide 17: Comparative verification: skill scores
- A skill score is a measure of relative performance
  - Ex: How much more accurate are my temperature predictions than climatology? How much more accurate are they than the model's temperature predictions?
  - Provides a comparison to a standard
- Standard of comparison can be
  - Chance (easy?)
  - Long-term climatology (more difficult)
  - Sample climatology (difficult)
  - Competitor model / forecast (most difficult)
  - Persistence (hard or easy)

Slide 18: Skill scores. General skill score definition: [equation not preserved]. Reduction of error variance (also often called "skill score" SS): [equation not preserved].
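The equations on this slide are not preserved. The reduction of variance is commonly written as RV = 1 - MSE(forecast) / MSE(reference), which is the generic skill score with MSE as the measure and 0 as the perfect value. The sketch below assumes that form and uses persistence as the reference; the data and names are invented for illustration.

```python
def mse(forecasts, observations):
    """Mean squared error of forecasts against observations."""
    n = len(observations)
    return sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / n


def reduction_of_variance(forecasts, reference, observations):
    """RV = 1 - MSE(forecast) / MSE(reference); 1 is perfect, 0 is no skill."""
    return 1.0 - mse(forecasts, observations) / mse(reference, observations)


# Hypothetical wind speeds in m/s.
obs = [3.0, 5.0, 4.0, 6.0]
persistence = [2.0, 3.0, 5.0, 4.0]  # previous observation (2.0 precedes the series)
forecast = [3.5, 4.5, 4.5, 5.5]

mse_persistence = mse(persistence, obs)
mse_forecast = mse(forecast, obs)
rv = reduction_of_variance(forecast, persistence, obs)

print(mse_persistence, mse_forecast, round(rv, 3))
```

The same forecast MSE gives a different RV against a different reference, so the choice of reference has to be reported along with the score.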

Slide 19: Accuracy vs skill. Figure: 24h mean wind forecast, showing MSE(persistence), MSE(forecast) and reduced variance; annotations "Higher skill" and "Lower accuracy".

Slide 20: "hits" and RMSE. "Hits" = percentage of "acceptable" forecast errors (e.g. ICAO: dd ±30°, ff ±5 kt up to 25 kt, etc.). Figure axes: forecast error in K; "hits" in %.

Slide 21: "hits" and RMSE. Reduction of error "mass": through reduction of large errors. Figure axes: forecast error in K; "hits" in %.
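One way to read the contrast between "hits" and RMSE is sketched below, under stated assumptions: the ±2 K tolerance and the error samples are invented, and the function names are hypothetical. Shrinking one large error from 9 K to 4 K lowers the RMSE considerably, but the percentage of "hits" stays the same because the error is still outside the tolerance.

```python
import math


def hit_percentage(errors, tolerance):
    """Percentage of forecast errors whose magnitude is within the tolerance."""
    hits = sum(1 for e in errors if abs(e) <= tolerance)
    return 100.0 * hits / len(errors)


def rmse(errors):
    """Root mean square of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical temperature errors in K, before and after an improvement
# that only reduces the largest error.
errors_before = [1.0, -1.0, 1.0, 3.0, 9.0]
errors_after = [1.0, -1.0, 1.0, 3.0, 4.0]
tolerance_k = 2.0  # assumed "acceptable" error

hits_before = hit_percentage(errors_before, tolerance_k)
hits_after = hit_percentage(errors_after, tolerance_k)
rmse_before = rmse(errors_before)
rmse_after = rmse(errors_after)

print(hits_before, hits_after)
print(round(rmse_before, 3), round(rmse_after, 3))
```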

Slide 22: Long term trends. Maximum temperature, Potsdam: "hit rate" (errors within ±2 K) in %. Every 10 years one day better.

Slide 23: Linear Error in Probability Space
- LEPS is an MAE evaluated by using the cumulative frequencies of the observation
- Errors in the tail of the distribution are penalized less than errors in the centre of the distribution
Figure label: q 0.75
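A minimal sketch of LEPS as described above, assuming it is the mean absolute difference between the empirical cumulative frequencies of forecast and observation. The climatological sample and the function names are hypothetical. With a skewed climatology, a 1-unit error in the centre of the distribution costs more than a 1-unit error in the sparse tail.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return a function giving the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


def leps(forecasts, observations, climatology):
    """Mean absolute error measured in cumulative-frequency (probability) space."""
    cdf = empirical_cdf(climatology)
    diffs = [abs(cdf(f) - cdf(o)) for f, o in zip(forecasts, observations)]
    return sum(diffs) / len(diffs)


# Hypothetical skewed climatology (e.g. rainfall amounts in mm).
climatology = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]

leps_centre = leps([4.0], [5.0], climatology)   # 1 mm error near the median
leps_tail = leps([12.0], [13.0], climatology)   # 1 mm error in the tail

print(round(leps_centre, 3), round(leps_tail, 3))
```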

Slide 24: Summary
- Verification is a high-dimensional problem → it can be boiled down to a lower-dimensional one under certain assumptions or interests
- If forecast errors are normally distributed, continuous verification allows usage of only a few numbers, like bias and RMSE
- Accuracy and skill are different things