Deutscher Wetterdienst Bootstrapping – using different methods to estimate statistical differences between model errors Ulrich Damrath COSMO GM Rome 2011.



Ulrich Damrath: Bootstrapping – using different methods to estimate statistical differences between model errors, COSMO GM Rome September 2011

Some typical situations occurring during operational verification:

Questions:
1. Question: Are the differences between scores due to noise, or are they statistically significant?
2. Question: Are there significant differences between the quality of different models? (Of interest to users of forecasts)
3. Question: Are there significant differences between the quality of models for different situations? (Of interest to developers of models)
Problem: BIASes may be normally distributed, but what about RMSEs?
A possible solution: Application of bootstrap techniques to get confidence intervals or quantiles of the distribution.
1. Question concerning the bootstrap method: How many replications are necessary to get stable statistical results?
2. Question concerning the bootstrap method: How should the sample data be grouped in order to avoid autocorrelation effects?

The principle of bootstrapping for a sample with 10 elements

Realisations 1 to 10: each realisation is the mean value over 10 elements drawn at random, with replacement, from the sample (the element lists shown on the slide are not recoverable).

The mean value of all realisations (replications) gives the bootstrap mean. The standard deviation of all mean values gives the bootstrap standard deviation.
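The principle above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the presentation: the function name `bootstrap_mean_std`, the seed and the example sample are assumptions. The standard deviation is computed with the denominator B − 1 (B = number of replications), which is the usual convention in Efron and Tibshirani (1993); the exact formula shown on the slide is not recoverable.

```python
import random
import statistics


def bootstrap_mean_std(sample, n_replications=500, seed=0):
    """Bootstrap mean and standard deviation of the sample mean.

    Each replication draws len(sample) elements with replacement
    and stores their mean value.
    """
    rng = random.Random(seed)
    n = len(sample)
    replicate_means = []
    for _ in range(n_replications):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicate_means.append(statistics.fmean(resample))
    boot_mean = statistics.fmean(replicate_means)
    # statistics.stdev uses the (B - 1) denominator.
    boot_std = statistics.stdev(replicate_means)
    return boot_mean, boot_std, replicate_means


# Hypothetical sample with 10 elements (made-up values).
sample = [2.1, -0.4, 1.3, 0.8, -1.2, 0.5, 1.9, -0.7, 0.2, 1.1]
boot_mean, boot_std, _ = bootstrap_mean_std(sample)
print(f"bootstrap mean = {boot_mean:.3f}, bootstrap std = {boot_std:.3f}")
```

The same loop works for any score: replacing the mean of the resample by another statistic gives the bootstrap distribution of that statistic.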

Bootstrap properties for three analytical cases Number of sample values: 31

Bootstrap properties for three analytical cases Number of sample values: 310

Bootstrap properties for three analytical cases Number of sample values: 3100

Bootstrap properties for three analytical cases Number of sample values: 31000

Bootstrap properties for three analytical cases Number of sample values:

Conclusion concerning the convergence of the method: A number of ~500 replications seems to be appropriate to get a stable value for the bootstrap variance.

Setting the sample characteristics: Treating each pair of observations and forecasts as a single sample member leads to large sample sizes with relatively high autocorrelation. Therefore, values are grouped into blocks of one, two and four days. Additionally, a block size was constructed using the optimal block length LOPT, which can be estimated from 'a', a function of the autocorrelation, and the sample size N.
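A block bootstrap with a data-driven block length might look like the sketch below. Two assumptions are made. First, the formula for LOPT did not survive on the slide; the code uses the form L = NINT{[√6 · a / (1 − a²)]^(2/3) · N^(1/3)} given in the climate time series literature (for example Mudelsee 2010), with 'a' taken as the lag-1 autocorrelation, and this may differ from what was actually shown. Second, a moving-block scheme (overlapping blocks drawn at random) is used, whereas the presentation does not state which block scheme was applied. All function names and the synthetic series are illustrative.

```python
import math
import random
import statistics


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation coefficient."""
    mean = statistics.fmean(series)
    num = sum((x - mean) * (y - mean) for x, y in zip(series[:-1], series[1:]))
    den = sum((x - mean) ** 2 for x in series)
    return num / den


def optimal_block_length(a, n):
    """Assumed form: NINT{[sqrt(6) a / (1 - a^2)]^(2/3) n^(1/3)}, at least 1."""
    if a <= 0.0:
        return 1  # no positive persistence: single elements suffice
    a = min(a, 0.99)  # guard against division by zero for a close to 1
    length = (math.sqrt(6.0) * a / (1.0 - a * a)) ** (2.0 / 3.0) * n ** (1.0 / 3.0)
    return max(1, round(length))


def moving_block_bootstrap_means(series, block_length, n_replications=500, seed=0):
    """Means of resamples built from randomly chosen contiguous blocks."""
    rng = random.Random(seed)
    n = len(series)
    n_starts = n - block_length + 1
    means = []
    for _ in range(n_replications):
        resample = []
        while len(resample) < n:
            start = rng.randrange(n_starts)
            resample.extend(series[start:start + block_length])
        means.append(statistics.fmean(resample[:n]))  # trim to original length
    return means


# Synthetic autocorrelated errors (made-up AR(1) series, not real verification data).
data_rng = random.Random(7)
series = [0.0]
for _ in range(399):
    series.append(0.7 * series[-1] + data_rng.gauss(0.0, 1.0))

a_hat = lag1_autocorrelation(series)
l_opt = optimal_block_length(a_hat, len(series))
iid_std = statistics.stdev(moving_block_bootstrap_means(series, 1))
block_std = statistics.stdev(moving_block_bootstrap_means(series, l_opt))
print(f"a = {a_hat:.2f}, block length = {l_opt}")
print(f"bootstrap std, single elements: {iid_std:.3f}; blocks: {block_std:.3f}")
```

For positively autocorrelated data the block version typically gives a larger bootstrap standard deviation than resampling single elements, which is the effect the grouping is meant to capture.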

The real world: Dependence of bootstrap standard deviation and bootstrap confidence intervals on the number of replications. 2m-temperature forecasts during summer 2010 and 10m-wind speed during winter 2010/2011. BIASes for different periods, models and weather elements.

The real world: Dependence of bootstrap standard deviation and bootstrap confidence intervals on the number of replications. 2m-temperature forecasts during summer 2010 and 10m-wind speed during winter 2010/2011. RMSEs for different periods, weather elements and types of mean wind direction over Germany (700 hPa).
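The kind of experiment summarised on these two slides can be imitated with synthetic data. The sketch below computes the bootstrap standard deviation of BIAS and RMSE for an increasing number of replications; the error values, replication counts and function names are assumptions chosen for illustration and do not reproduce the figures of the presentation.

```python
import math
import random
import statistics


def bias(errors):
    """Mean of forecast-minus-observation errors."""
    return statistics.fmean(errors)


def rmse(errors):
    """Root mean square of forecast-minus-observation errors."""
    return math.sqrt(statistics.fmean([e * e for e in errors]))


def bootstrap_std(errors, score, n_replications, rng):
    """Standard deviation of a score over bootstrap resamples."""
    n = len(errors)
    replicates = [score(rng.choices(errors, k=n)) for _ in range(n_replications)]
    return statistics.stdev(replicates)


# Synthetic errors (made-up values standing in for forecast minus observation).
data_rng = random.Random(42)
errors = [data_rng.gauss(0.5, 2.0) for _ in range(300)]

results = {}
for n_rep in (50, 100, 200, 500, 1000):
    rng = random.Random(n_rep)  # separate, reproducible stream per experiment
    results[n_rep] = (
        bootstrap_std(errors, bias, n_rep, rng),
        bootstrap_std(errors, rmse, n_rep, rng),
    )
    print(f"{n_rep:5d} replications: std(BIAS) = {results[n_rep][0]:.4f}, "
          f"std(RMSE) = {results[n_rep][1]:.4f}")
```

With few replications the estimates scatter noticeably from run to run; towards several hundred replications they settle, in line with the ~500 replications suggested earlier.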

Quantiles 10% and 90% for different bootstrap types, Period –, COSMO-EU (solid), COSMO-DE (dotted), Element: Temperature 2m
Top: Median and quantiles (green: overlapping quantiles, red: non-overlapping quantiles)
Bottom: another visualisation of the overlapping intervals (bluish: overlapping intervals, deep red: non-overlapping intervals)

Quantiles 10% and 90% for different bootstrap types, Period –, COSMO-EU (solid), COSMO-DE (dotted), Element: Wind speed 10m
Top: Median and quantiles (green: overlapping quantiles, red: non-overlapping quantiles)
Bottom: another visualisation of the overlapping intervals (bluish: overlapping intervals, deep red: non-overlapping intervals)
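The overlap criterion used in these plots can be written down compactly. The sketch below computes the 10% quantile, median and 90% quantile of bootstrap RMSE replicates for two models and checks whether the 10% to 90% intervals overlap. Model names, error values and function names are hypothetical, and each model is resampled independently here; whether the presentation resampled both models jointly is not stated.

```python
import math
import random
import statistics


def rmse(errors):
    """Root mean square of forecast-minus-observation errors."""
    return math.sqrt(statistics.fmean([e * e for e in errors]))


def bootstrap_quantiles(errors, n_replications=500, seed=0):
    """Return (10% quantile, median, 90% quantile) of bootstrap RMSE replicates."""
    rng = random.Random(seed)
    n = len(errors)
    replicates = [rmse(rng.choices(errors, k=n)) for _ in range(n_replications)]
    deciles = statistics.quantiles(replicates, n=10)  # 9 cut points: 10%, ..., 90%
    return deciles[0], deciles[4], deciles[8]


def intervals_overlap(low_a, high_a, low_b, high_b):
    """True if the closed intervals [low_a, high_a] and [low_b, high_b] intersect."""
    return low_a <= high_b and low_b <= high_a


# Synthetic errors for two hypothetical models (made-up values).
data_rng = random.Random(3)
errors_model_a = [data_rng.gauss(0.0, 1.5) for _ in range(300)]
errors_model_b = [data_rng.gauss(0.0, 2.5) for _ in range(300)]

q_a = bootstrap_quantiles(errors_model_a, seed=1)
q_b = bootstrap_quantiles(errors_model_b, seed=2)
overlap = intervals_overlap(q_a[0], q_a[2], q_b[0], q_b[2])

print(f"model A: 10% = {q_a[0]:.2f}, median = {q_a[1]:.2f}, 90% = {q_a[2]:.2f}")
print(f"model B: 10% = {q_b[0]:.2f}, median = {q_b[1]:.2f}, 90% = {q_b[2]:.2f}")
print("overlapping quantiles" if overlap else "non-overlapping quantiles")
```

Non-overlapping intervals are a conservative indication of a difference between the models; overlapping intervals alone do not show that the models are equally good.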

Comparison of overlapping quantile intervals for different wind directions. NW: north-westerly flow, SW: south-westerly flow, NO: north-easterly flow, SO: south-easterly flow

Some typical situations occurring during operational verification in 2009, 2010 and 2011: Modification of turbulent mixing length May 2009:

Conclusions:
Different types of grouping the samples lead to different results concerning the statistical significance of the model errors.
Block methods give more or less equivalent results.
The results for the comparison of different models may help users decide which model should be used.
The results for different weather types (flow directions) may give developers some hints concerning the development of model physics.

References:
Efron, B., Tibshirani, R. J. (1993): An Introduction to the Bootstrap. Chapman & Hall/CRC Monographs on Statistics & Applied Probability.
Mudelsee, M. (2010): Climate Time Series Analysis – Classical Statistical and Bootstrap Methods. Springer, Dordrecht, Heidelberg, London, New York.