Seasonal Forecasting Using the Climate Predictability Tool


Seasonal Forecasting Using the Climate Predictability Tool: Verification of probabilistic forecasts. Simon Mason (simon@iri.columbia.edu)

Who is this? Park Jae-sang, aka “Psy”

Probabilistic Forecasts Why do we issue forecasts probabilistically? We cannot be certain what is going to happen. The probabilities try to give an indication of how confident we are that the specified outcome will occur. Odds are an alternative to probabilities; they indicate how much more (or less) likely we think the specified outcome is to occur than not to occur. If the probabilities are “correct”, or, equivalently, the odds are fair, then we can calculate how frequently the specified outcome should occur ...

What makes a good forecast? Forecast: The odds of Prince William and Kate’s baby being called Psy were 5000 to 1 against. Verification: Were those odds fair?
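As a quick arithmetic check of what such odds imply (a worked example added here, not part of the original slide), odds of 5000 to 1 against correspond to a probability of

$$ p = \frac{1}{1 + 5000} \approx 0.0002 \quad (\text{about } 0.02\%), $$

so if the odds were fair, the named outcome should occur on roughly 1 in every 5001 such forecasts.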

Reliability Give 90% confidence limits for the mean annual rainfall in New Delhi. (It is 706 mm, or 27.8 inches.) How many of these confidence intervals contain the observed annual rainfall? The difference between how many of these confidence intervals do contain the observed value and how many should contain it is what questions of reliability are about.

Forecast “goodness” What makes a “good” forecast? Consistency Quality Value Murphy AH 1993; Wea. Forecasting 8, 281

Consistent forecasts? All regions have highest probability on the normal category. Did we genuinely think that normal was the most likely category everywhere, or did we think it was the safest forecast everywhere? 70 – 80% of all the African RCOF forecasts have highest probability on normal. Are we really forecasting what we think, or are we playing safe?

Unconditional bias Are probabilities consistently too high or too low?

Forecast “goodness” What makes a “good” forecast? Consistency Quality Value Murphy AH 1993; Wea. Forecasting 8, 281

What is “skill”?

Skill Is one set of forecasts better than another? Skillful forecasts are not necessarily good; both sets of forecasts may be really bad. Unskillful forecasts are not necessarily bad: both sets of forecasts may be really good. “Skill” is poorly defined. What do we mean by “better”?

Skill Imagine a set of forecasts that indicates probabilities of rainfall (which has a climatological probability of 30%): 60% for 01 – 05 May and 10% for 06 – 10 May. Suppose that rainfall occurs on 40% of the 60% forecasts, and on 20% of the 10% forecasts. The forecasts correctly indicate times with increased and decreased chances of rainfall, but do so over-confidently. The Brier skill score is -7%.
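A minimal sketch (not CPT code) reproducing the Brier skill score for this example. The exact outcome sequence is assumed from the counts quoted above: rain on 2 of the five 60% days and on 1 of the five 10% days, with a climatological probability of 30%.

```python
# Minimal sketch (not CPT code): Brier skill score for the 01 - 10 May example.
# Assumed outcomes: rain on 2 of the five 60% days and on 1 of the five 10% days.
forecasts = [0.6] * 5 + [0.1] * 5           # forecast probabilities of rain
outcomes  = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]  # 1 = rain occurred, 0 = no rain

def brier(probs, obs):
    """Mean squared error between forecast probabilities and binary outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(obs)

bs = brier(forecasts, outcomes)                    # 0.225
bs_clim = brier([0.3] * len(outcomes), outcomes)   # 0.210 (always-climatology forecasts)
bss = 1.0 - bs / bs_clim                           # about -0.07, i.e. -7%
print(f"BS = {bs:.3f}, BS_clim = {bs_clim:.3f}, BSS = {bss:.1%}")
```

The score is negative because the strict reliability requirement penalises the over-confidence, even though the forecasts point in the right direction.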

Reliability When we say 60% chance of above-normal, we expect a forecast of above-normal to be correct 60% of the time. If we take all our forecasts for a location when we said 60% chance of above-normal, 60% of them (not more or less) should be above-normal.

Resolution Does the outcome change when the forecast changes? Using the same forecasts as before (60% for 01 – 05 May, 10% for 06 – 10 May): when the forecast is 10%, rain occurs 20% of the time; when the forecast is 60%, rain occurs 40% of the time. Rain becomes more frequent when the forecast probability increases; there is resolution.
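A small sketch, assuming the same outcome sequence as in the skill example, that tallies how often rain occurs for each distinct forecast value; the increase from 20% to 40% is the resolution referred to above.

```python
# Sketch: conditional frequency of rain for each distinct forecast probability.
# Assumed data: 60% forecasts for 01 - 05 May, 10% for 06 - 10 May, with rain
# on 2 of the 60% days and 1 of the 10% days.
forecasts = [0.6] * 5 + [0.1] * 5
outcomes  = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]

for p in sorted(set(forecasts)):
    obs = [o for f, o in zip(forecasts, outcomes) if f == p]
    print(f"forecast {p:.0%}: rain occurred {sum(obs) / len(obs):.0%} of the time")
# forecast 10%: rain occurred 20% of the time
# forecast 60%: rain occurred 40% of the time
```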

Resolution Does the outcome change when the forecast changes? Example: does above-normal rainfall become more frequent when its probability increases? Resolution is the crucial attribute of a good forecast. If the outcome differs depending on the forecast then the forecasts have useful information. If the outcome is the same regardless of the forecast the forecaster can be ignored.

Discrimination Does the forecast change when the outcome changes? Using the same forecasts again (60% for 01 – 05 May, 10% for 06 – 10 May): when rain occurs, the average forecast probability is 43%; when it is dry, the average forecast probability is 31%. The forecast probability for rain is higher when it does rain; there is some discrimination.
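The complementary calculation, using the same assumed data, conditions on the outcome rather than on the forecast: the average forecast probability on rain days versus dry days.

```python
# Sketch: average forecast probability conditioned on the outcome.
forecasts = [0.6] * 5 + [0.1] * 5
outcomes  = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]

wet = [f for f, o in zip(forecasts, outcomes) if o == 1]
dry = [f for f, o in zip(forecasts, outcomes) if o == 0]
print(f"average probability when it rained:  {sum(wet) / len(wet):.0%}")  # about 43%
print(f"average probability when it was dry: {sum(dry) / len(dry):.0%}")  # about 31%
```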

Discrimination Does the forecast differ when the outcome differs? Example: is the probability on above-normal rainfall higher when above-normal rainfall occurs compared to when rainfall is normal or below-normal? Discrimination is an alternative perspective to resolution. If the forecast differs given different outcomes then the forecasts have useful information. If the forecast is the same regardless of the outcome the forecaster can be ignored.

What makes a “good” probabilistic forecast? Reliability: the event occurs as frequently as implied by the forecast. Sharpness: the forecasts frequently have probabilities that differ considerably from climatology. Resolution: the outcome differs when the forecast differs. Discrimination: the forecasts differ when the outcome differs.

Verification in CPT In CPT, “verification” relates to the assessment of probabilistic predictions: As retroactive predictions in CCA, PCR, MLR or GCM; As inputs in PFV. CPT does not produce cross-validated probabilistic predictions because it needs an independent estimate of the prediction error variance.

Retroactive forecasting 1981 Training period (1961 – 1980) Predict 1981 Omit 1982+ 1982 Training period (1961 – 1981) Predict 1982 1983+ 1983 Training period (1961 – 1982) Predict 1983 Omit 1984+ 1984 Training period (1961 – 1983) Predict 1984 Omit 1985+ 1985 Training period (1961 – 1984) Predict 1985 Given data for 1961 – date, it is possible to calculate a retroactive set of probabilistic forecasts. CPT will use an initial training period to cross-validate a model and make predictions for the subsequent year(s), then update the training period and predict additional years, repeating until all possible years have been predicted.

Probabilistic forecast input files INDEX and STATION files containing: cpt:ncats (the number of categories; must be 3); cpt:C (start with category 1, i.e. below-normal, then repeat for category 2, i.e. normal, and complete for all 3 categories, making sure the probabilities add to 100); the date (the period for which the forecast applies, not the date the forecast was made); cpt:clim_prob (indicating the climatological probability of each category).

Verification of probabilistic forecasts Attributes Diagrams: graphs showing reliability, resolution, and sharpness; ROC Diagrams: graphs showing discrimination; Scores: a table of scores for probabilistic forecasts; Skill Maps: maps of scores for probabilistic forecasts; Tendency Diagram: graphs showing unconditional biases; Ranked Hits Diagram: graphs showing frequencies of observed categories having the highest probability; Weather Roulette: graphs showing estimates of forecast value.

Attributes diagrams The histograms show the sharpness. The vertical and horizontal lines show the observed climatology and indicate the forecast bias. The diagonal lines show reliability and “skill”. The coloured line shows the reliability and resolution of the forecasts. The dashed line shows a smoothed fit.

ROC diagrams ROC areas: do we issue a higher probability when the category occurs? Graph bottom left: when the probabilities are high, does the category occur? Graph top right: when the probabilities are low, does the category not occur? The ROC area indicates the probability that the forecasts successfully discriminate an observation in the category of interest from one not in that category. For example, given two observations, one of which is above-normal and the other is not, ideally the forecast probability for above-normal should be higher for the observation that was above-normal than for the observation that was not. In the example, the forecast probability for above-normal was higher in about 71% of all such possible comparisons. The bottom left of the graph indicates how good the forecasts are when the probability for the respective category is high: if the graph here is steep and above the diagonal then the forecasts do discriminate that category well when the forecast probabilities are high (i.e., the forecasts give a good indication that the category will occur when the forecast probabilities are high). The top right of the graph indicates how good the forecasts are when the probability for the respective category is low: if the graph here is shallow and above the diagonal then the forecasts do discriminate that category well when the forecast probabilities are low (i.e., the forecasts give a good indication that the category will not occur when the forecast probabilities are low). Retroactive forecasts of MAM 1986 – 2010 Thailand rainfall using February Pacific SSTs.
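A minimal sketch of the pairwise interpretation described above (not CPT's calculation): count the proportion of (event, non-event) pairs in which the event case received the higher forecast probability, with ties counted as half. The probabilities and outcomes in the example are made up.

```python
# Sketch: ROC area as the proportion of (event, non-event) pairs in which
# the forecast probability for the event case is the higher one (ties = 0.5).
def roc_area(probs, occurred):
    events     = [p for p, o in zip(probs, occurred) if o]
    non_events = [p for p, o in zip(probs, occurred) if not o]
    score = 0.0
    for pe in events:
        for pn in non_events:
            score += 1.0 if pe > pn else 0.5 if pe == pn else 0.0
    return score / (len(events) * len(non_events))

# Made-up probabilities for above-normal and whether above-normal occurred:
probs    = [0.50, 0.20, 0.45, 0.35, 0.25]
occurred = [True, False, False, True, False]
print(f"ROC area = {roc_area(probs, occurred):.2f}")  # 0.83 for this toy example
```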

Tendency diagrams Retroactive forecasts of MAM 1986 – 2010 Thailand rainfall using February Pacific SSTs. Shift towards above-normal was successfully predicted.

Ranked Hits diagrams (legend: highest probability, second highest probability, lowest probability). Retroactive forecasts of MAM 1986 – 2010 Thailand rainfall using February Pacific SSTs. The category with the highest probability is occurring most frequently.

Probabilistic scores Scores per category Brier score: mean squared error in probability (assuming that the probability should be 100% if the category occurs and 0% if it does not occur) Brier skill score: % improvement over Brier score using climatology forecasts (often pessimistic because of strict requirement for reliability) ROC area: probability of successfully discriminating the category (i.e., how frequently the forecast probability for that category is higher when it occurs than when it does not occur) Resolution slope: % increase in frequency for each 1% increase in forecast probability
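For reference, the usual definitions of the first two per-category scores, with $p_t$ the forecast probability for the category in year $t$, $o_t = 1$ if the category occurs and $0$ otherwise, and $\text{BS}_{\text{clim}}$ the Brier score of forecasts that always issue the climatological probability:

$$ \text{BS} = \frac{1}{n}\sum_{t=1}^{n}\left(p_t - o_t\right)^2, \qquad \text{BSS} = 1 - \frac{\text{BS}}{\text{BS}_{\text{clim}}}. $$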

Probabilistic scores Overall scores Ranked prob score: mean squared error in cumulative probabilities RPSS: % improvement over RPS using climatology forecasts (often pessimistic because of strict requirement for reliability) 2AFC score: probability of successfully discriminating the wetter or warmer category Resolution slope: % increase in frequency for each 1% increase in forecast probability Effective interest: % return given fair odds Linear prob score: average probability on the category that occurs Hit score (rank n): how often the category with the nth highest probability occurs
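A commonly used form of the ranked probability score for $K$ categories is given below (normalisations vary, and CPT's scaling may differ slightly); $p_{t,j}$ is the forecast probability and $o_{t,j}$ the observed indicator for category $j$ in year $t$, so the squared errors are taken on cumulative probabilities:

$$ \text{RPS} = \frac{1}{n}\sum_{t=1}^{n} \frac{1}{K-1}\sum_{k=1}^{K}\left(\sum_{j=1}^{k} p_{t,j} - \sum_{j=1}^{k} o_{t,j}\right)^{2}, \qquad \text{RPSS} = 1 - \frac{\text{RPS}}{\text{RPS}_{\text{clim}}}. $$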

Forecast “goodness” What makes a “good” forecast? Consistency Quality Value Murphy AH 1993; Wea. Forecasting 8, 281

Weather roulette – profits diagram Given fair odds: profit = 1 ÷ odds. Multiply the investment by the profit (or loss) to indicate how much money would be made (or lost), and average over all locations. The weather roulette results provide an estimate of the potential value of the forecasts. The underlying assumption is that an investment on the forecasts apportioned according to the probabilities receives fair odds. For example, given an initial $100 and a forecast probability of 50% on above-normal, $50 would be invested on above-normal, and if that category occurs then a pay-out of twice the investment plus the initial $50 would be made, i.e., $150 in total. A profit of 50% would therefore be made. Note that climatological probabilities would result in the investor breaking even. In the profits diagram the profits or losses are calculated assuming that the same total amount is invested at the beginning of each forecast.
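A minimal sketch of the worked example above (not CPT's implementation): the stake is split according to the forecast probabilities, and the verifying category pays fair odds based on its climatological probability. The slide only states the 50% probability on above-normal, so the remaining probabilities are assumed here to be 25% each, and the climatological probabilities are assumed to be equal terciles.

```python
# Sketch: weather roulette pay-out for a single forecast, assuming fair odds
# set by the climatological probabilities.
def roulette_payout(stake, forecast_probs, clim_probs, verifying_category):
    """Stake is apportioned by forecast probability; the amount on the
    verifying category is multiplied by 1 / climatological probability."""
    invested = stake * forecast_probs[verifying_category]
    return invested / clim_probs[verifying_category]

# Example from the slide: $100, 50% on above-normal (other categories assumed
# 25% each), equal climatological terciles, above-normal (index 2) verifies.
payout = roulette_payout(100.0, [0.25, 0.25, 0.50], [1/3, 1/3, 1/3], 2)
print(f"pay-out = ${payout:.0f}, profit = {payout / 100.0 - 1:.0%}")  # $150, 50%
```

Issuing the climatological probabilities would always return exactly the stake, consistent with the break-even statement above.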

Weather roulette – cumulative profits diagram Multiply the initial investment by the profit (or loss) carried over each year to indicate how much money would be made (or lost). In the cumulative profits diagram the profits or losses in the first forecast are carried over to the next (but profits made at one location are not redistributed to other locations). Strictly speaking, the cumulative profits can only be calculated if the outcome of the first forecast is known before the second forecast commences (and so on for subsequent forecasts), which will not be the case if the forecasts are for overlapping seasons. However, CPT does not check for the overlap. Because the cumulative profits and losses are multiplicative rather than additive, the cumulative profits graph tends to increase exponentially if the forecasts have some skill, and so care should be taken not to assume that the rapid increase in profits (as suggested in the example above) is because the latest forecasts are better than the earliest ones. The profits graph gives a clearer indication of which forecasts have been more successful.

Weather roulette – effective interest rate diagram Multiply the initial investment by the profit (or loss) carried over each year, and calculate the effective interest rate. As for the cumulative profits, the effective interest rate can only be calculated properly if the outcome of the first forecast is known before the second forecast commences (and so on for subsequent forecasts), which will not be the case if the forecasts are for overlapping seasons.

Exercises Generate some retroactive forecasts. How do the retroactive validation results compare with the cross-validated results? Do the forecasts perform as well as you might have expected given the cross-validated skill measures? Explore the various verification options.

CPT Help Desk web: iri.columbia.edu/cpt/ email: cpt@iri.columbia.edu @climatesociety …/climatesociety