Verification of probabilistic forecasts: comparing proper scoring rules Thordis L. Thorarinsdottir and Nina Schuhen 11.04.2018.

Presentation transcript:

Introduction. Proper scoring rules measure the accuracy of a forecast and assign a numerical penalty. They are often used to rank different models or forecasters, in both deterministic and probabilistic verification. Propriety: the expected score is optimized when the forecast is the true distribution.
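
Propriety can be stated compactly as an inequality. The notation below is an assumption and does not come from the slides: $S(F, y)$ is a negatively oriented score (lower is better) for forecast distribution $F$ and observation $y$, and $G$ is the true distribution of $Y$.

```latex
\[
\mathbb{E}_{Y \sim G}\, S(G, Y) \;\le\; \mathbb{E}_{Y \sim G}\, S(F, Y)
\qquad \text{for all forecast distributions } F .
\]
```

The rule is called strictly proper if equality holds only for $F = G$.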

Real-life forecast scenario. Which proper scoring rule should I use? What if they give conflicting results? How should I report results? Is my data set sufficient? Which should I use for model parameter optimization? In short: how to use proper scores in practice!

Proper scoring rules. Squared error (SE); absolute error (AE); ignorance score (IGN); continuous ranked probability score (CRPS).
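
For reference, these four scores are commonly defined as follows. The notation is an assumption rather than the slide's own: $F$ is the forecast distribution with density $f$, $y$ is the verifying observation, and $X, X'$ are independent draws from $F$.

```latex
\begin{align*}
\mathrm{SE}(F, y)   &= \bigl(\operatorname{mean}(F) - y\bigr)^2 \\
\mathrm{AE}(F, y)   &= \bigl|\operatorname{med}(F) - y\bigr| \\
\mathrm{IGN}(F, y)  &= -\log f(y) \\
\mathrm{CRPS}(F, y) &= \mathbb{E}_F\,|X - y| \;-\; \tfrac{1}{2}\,\mathbb{E}_F\,|X - X'|
\end{align*}
```

In this form all four are negatively oriented: a lower score indicates a better forecast.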

Scores behave differently…

Simulation study: concept. Draw random data from a «true» distribution: 1000 data points as verifying observations and 300 data points of training data for each observation. Estimate forecast distributions from the training data (method of moments). Make forecasts from the estimated distributions (50 members). Evaluate against observations.
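
The steps of this study can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' setup: the true distribution is taken to be a standard normal, only a normal forecaster is fitted, and the score is the sample CRPS of the 50 members. The names `crps_ensemble` and `run_study` are hypothetical.

```python
import random
import statistics


def crps_ensemble(members, obs):
    """Sample CRPS: mean |x_i - y| minus half the mean |x_i - x_j|."""
    m = len(members)
    spread_to_obs = sum(abs(x - obs) for x in members) / m
    spread_within = sum(abs(a - b) for a in members for b in members) / (m * m)
    return spread_to_obs - 0.5 * spread_within


def run_study(n_obs=1000, n_train=300, n_members=50, seed=1):
    """Return the mean CRPS of a moment-fitted normal forecaster."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_obs):
        # Assumed truth: N(0, 1) for the observation and its training data.
        obs = rng.gauss(0.0, 1.0)
        train = [rng.gauss(0.0, 1.0) for _ in range(n_train)]
        # Method of moments: match the sample mean and standard deviation.
        mu_hat = statistics.fmean(train)
        sd_hat = statistics.stdev(train)
        # The forecast is 50 members drawn from the estimated distribution.
        members = [rng.gauss(mu_hat, sd_hat) for _ in range(n_members)]
        scores.append(crps_ensemble(members, obs))
    return statistics.fmean(scores)


mean_crps = run_study()
print(f"mean CRPS over 1000 forecasts: {mean_crps:.3f}")
```

Other forecasters would be added by replacing the fitting and sampling steps while keeping the same observations and training data.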

Forecasting distributions (table of expected value and variance for each): normal, non-central t, log-normal, Gumbel, plus the true distribution. γ = Euler-Mascheroni constant.
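
To illustrate how a method-of-moments fit turns a sample mean and variance into distribution parameters, here is a sketch for two of the listed families. The parameterizations are assumptions (Gumbel with location and scale, log-normal with the mean and standard deviation of the logarithm) and may differ from those on the slide. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_from_moments(mean, var):
    """Gumbel(loc, scale): mean = loc + gamma * scale, var = (pi * scale)^2 / 6."""
    scale = math.sqrt(6.0 * var) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def lognormal_from_moments(mean, var):
    """Log-normal(mu, sigma): mean = exp(mu + sigma^2 / 2),
    var = (exp(sigma^2) - 1) * mean^2. Requires mean > 0."""
    sigma2 = math.log(1.0 + var / mean**2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


loc, scale = gumbel_from_moments(10.0, 4.0)
mu, sigma = lognormal_from_moments(10.0, 4.0)
print(f"Gumbel: loc={loc:.3f}, scale={scale:.3f}")
print(f"log-normal: mu={mu:.3f}, sigma={sigma:.3f}")
```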

Example forecast: scores vs. observations. IGN has a different minimum due to the skewness of the Gumbel distribution. All scores are minimized at the same value => proper.
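
A small numerical check of propriety can be sketched as follows. The setup is an assumption for illustration: the truth is a standard normal, the forecasts are normals with unit variance and a shifted mean, and the mean score is computed over a fixed sample. Both the ignorance score and the closed-form CRPS of a normal forecast should then be minimized at the true mean.

```python
import math
import random


def ign_normal(mu, sigma, y):
    """Ignorance score (negative log density) of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


def crps_normal(mu, sigma, y):
    """Closed-form CRPS of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


rng = random.Random(0)
obs = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # assumed truth: N(0, 1)
grid = [round(-1.0 + 0.1 * k, 1) for k in range(21)]  # candidate forecast means


def best_mean(score):
    """Forecast mean on the grid with the lowest mean score."""
    return min(grid, key=lambda mu: sum(score(mu, 1.0, y) for y in obs))


best_ign = best_mean(ign_normal)
best_crps = best_mean(crps_normal)
print("best mean under IGN:", best_ign)
print("best mean under CRPS:", best_crps)
```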

Mean scores and bootstrap intervals. With 1000 forecasts: only IGN shows a large difference between the Gumbel and the other forecasters; the log-normal is best if the truth is unknown. With 10^6 forecasts: the true distribution always has the lowest score, with the same ranking for all scores.
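
One way to attach a bootstrap interval to a mean score is sketched below. The slides do not say which bootstrap variant was used, so this assumes a percentile bootstrap with independent resampling of the individual scores. The function name and the exponential toy scores are hypothetical.

```python
import random
import statistics


def bootstrap_interval(scores, n_boot=1000, level=0.95, seed=0):
    """Mean score with a percentile bootstrap interval (lower, upper)."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample the scores with replacement and record each resampled mean.
    means = sorted(
        statistics.fmean(rng.choices(scores, k=n)) for _ in range(n_boot)
    )
    lower = means[int((1.0 - level) / 2.0 * n_boot)]
    upper = means[int((1.0 + level) / 2.0 * n_boot) - 1]
    return statistics.fmean(scores), lower, upper


toy_rng = random.Random(123)
toy_scores = [toy_rng.expovariate(1.0) for _ in range(1000)]  # placeholder scores
mean_score, lower, upper = bootstrap_interval(toy_scores)
print(f"mean score {mean_score:.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

When two forecasters are scored on the same observations, resampling the paired score differences instead of each forecaster's scores separately is a common refinement.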

PIT histograms (normal sample size)

PIT histograms (huge sample size)
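
The probability integral transform (PIT) value of an observation is the forecast CDF evaluated at that observation. For a calibrated forecast the PIT values are uniform, so the histogram is flat. The sketch below assumes normal forecasts and a standard normal truth purely for illustration, with one calibrated and one overdispersed forecaster.

```python
import math
import random


def norm_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def pit_histogram(pit_values, n_bins=10):
    """Counts of PIT values in equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for u in pit_values:
        # A PIT value of exactly 1.0 goes into the last bin.
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    return counts


rng = random.Random(42)
obs = [rng.gauss(0.0, 1.0) for _ in range(10000)]  # assumed truth: N(0, 1)

calibrated = pit_histogram([norm_cdf(y, 0.0, 1.0) for y in obs])
overdispersed = pit_histogram([norm_cdf(y, 0.0, 2.0) for y in obs])
print("calibrated:   ", calibrated)
print("overdispersed:", overdispersed)
```

The overdispersed forecaster produces a hump-shaped histogram, because observations rarely fall in the tails of a forecast that is too wide.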

Variation: Gumbel true distribution. With 1000 forecasts: the estimated Gumbel has a lower mean score than the true distribution. With 10^6 forecasts: the true distribution always has the lowest score, with the same ranking for all scores.

Summary. For a huge sample size, all proper scores give the same result. For more realistic sample sizes, they differ widely. The best model doesn't always get the best score. AE and CRPS have trouble identifying appropriate distributions. The ignorance score is sensitive to the shape of the distributions. => There is no «best» scoring rule.

Summary. For robust results: use error bars and a combination of scores. CRPS is very useful if the distribution is unknown or cannot be easily specified. Minimum score estimation: CRPS or maximum likelihood? No clear answer; it depends on the forecast situation and model choice.
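
The two estimation options can be compared on a toy example. The sketch assumes a normal forecast family, for which maximum likelihood has a closed form, and minimizes the mean CRPS by plain gradient descent. The optimizer, step size and data are illustrative choices and are not taken from the presentation.

```python
import math
import random
import statistics


def crps_gradient(mu, sigma, y):
    """Partial derivatives of the normal CRPS with respect to mu and sigma."""
    z = (y - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return 1.0 - 2.0 * cdf, 2.0 * pdf - 1.0 / math.sqrt(math.pi)


def fit_ml(data):
    """Maximum likelihood for a normal: sample mean and population sd."""
    return statistics.fmean(data), statistics.pstdev(data)


def fit_min_crps(data, steps=300, rate=0.5):
    """Minimize the mean CRPS by gradient descent, starting from the ML fit."""
    mu, sigma = fit_ml(data)
    for _ in range(steps):
        grads = [crps_gradient(mu, sigma, y) for y in data]
        mu -= rate * statistics.fmean(g[0] for g in grads)
        sigma -= rate * statistics.fmean(g[1] for g in grads)
    return mu, sigma


rng = random.Random(7)
data = [rng.gauss(1.0, 2.0) for _ in range(1000)]  # assumed training sample

mu_ml, sigma_ml = fit_ml(data)
mu_crps, sigma_crps = fit_min_crps(data)
print(f"ML:       mu={mu_ml:.3f}, sigma={sigma_ml:.3f}")
print(f"min CRPS: mu={mu_crps:.3f}, sigma={sigma_crps:.3f}")
```

When the assumed family matches the data, as here, the two estimates should be close. Differences are more likely to show up when the family is misspecified.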

Read more in: Statistical Postprocessing of Ensemble Forecasts. Editors: Stéphane Vannitsem, Daniel S. Wilks, Jakob W. Messner. Elsevier, ISBN 978-0-12-812372-0. Planned publication: September 2018.