Seasonal Forecasting Using the Climate Predictability Tool


1 Seasonal Forecasting Using the Climate Predictability Tool
Verification of probabilistic forecasts
Simon Mason

2 Who is this? Park Jae-sang, aka “Psy”

3 Probabilistic Forecasts
Why do we issue forecasts probabilistically? We cannot be certain what is going to happen. The probabilities give an indication of how confident we are that the specified outcome will occur. Odds are an alternative to probabilities; they indicate how much more (or less) likely we think the specified outcome is to occur than not to occur. If the probabilities are “correct”, or, equivalently, the odds are fair, then we can calculate how frequently the specified outcome should occur ...
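As a small illustration of the relationship between probabilities and odds (a Python sketch; the 5000-to-1 figure anticipates the example on the next slide):

```python
# Odds "against" an outcome with probability p are (1 - p) / p to 1.
p = 1 / 5001                  # probability implied by odds of 5000 to 1 against
odds_against = (1 - p) / p    # recovers 5000
print(f"p = {p:.6f}; odds against = {odds_against:.0f} to 1")
```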

4 What makes a good forecast?
Forecast: The odds of Prince William and Kate’s baby being called Psy were 5000 to 1 against. Verification: Were those odds fair?

5 Reliability Give 90% confidence limits for the mean annual rainfall in New Delhi (706 mm, or 27.8 inches). How many of these confidence intervals contain the observed annual rainfall? The difference between how many of the intervals do contain the observation and how many should contain it is a question of reliability.

6 Forecast “goodness”
What makes a “good” forecast?
Consistency
Quality
Value
Murphy AH 1993; Wea. Forecasting 8, 281

7 Consistent forecasts? All regions have their highest probability on the normal category. Did we genuinely think that normal was the most likely category everywhere, or did we think it was the safest forecast everywhere? 70–80% of all the African RCOF forecasts have their highest probability on normal. Are we really forecasting what we think, or are we playing safe?

8 Unconditional bias Are probabilities consistently too high or too low?

9 Forecast “goodness”
What makes a “good” forecast?
Consistency
Quality
Value
Murphy AH 1993; Wea. Forecasting 8, 281

10 What is “skill”?

11 Skill Is one set of forecasts better than another?
Skillful forecasts are not necessarily good; both sets of forecasts may be really bad. Unskillful forecasts are not necessarily bad; both sets of forecasts may be really good. “Skill” is poorly defined. What do we mean by “better”?

12 Skill Imagine a set of forecasts that indicates probabilities of rainfall (which has a climatological probability of 30%):
01–05 May: 60% (green forecasts)
06–10 May: 10% (brown forecasts)
Suppose that rainfall occurs on 40% of the green forecasts, and 20% of the brown. The forecasts correctly indicate times with increased and decreased chances of rainfall, but do so over-confidently. The Brier skill score is −7%.
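A minimal sketch in Python (not CPT output) that reproduces the −7% Brier skill score; the exact sequence of rainy days is an assumption chosen to match the stated frequencies:

```python
import numpy as np

# The ten toy forecasts: 60% for 01-05 May, 10% for 06-10 May.
forecasts = np.array([0.6] * 5 + [0.1] * 5)
# Assumed outcomes matching the slide: rain on 2 of the 5 "green" days (40%)
# and on 1 of the 5 "brown" days (20%).
outcomes = np.array([1, 1, 0, 0, 0, 1, 0, 0, 0, 0])

brier = np.mean((forecasts - outcomes) ** 2)   # 0.225
brier_clim = np.mean((0.3 - outcomes) ** 2)    # 0.210 for constant 30% forecasts
bss = 1.0 - brier / brier_clim                 # about -0.07, i.e. -7%
print(f"Brier = {brier:.3f}, climatology = {brier_clim:.3f}, BSS = {bss:.1%}")
```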

13 Reliability When we say 60% chance of above-normal, we expect a forecast of above-normal to be correct 60% of the time. If we take all our forecasts for a location when we said 60% chance of above-normal, 60% of them (not more or less) should be above-normal.

14 Resolution Does the outcome change when the forecast changes?
Forecasts as in the skill example: 01–05 May: 60%; 06–10 May: 10%.
When the forecast is 10%, rain occurs 20% of the time. When the forecast is 60%, rain occurs 40% of the time. Rain becomes more frequent when the forecast probability increases – there is resolution.
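Continuing the toy example in Python, resolution can be checked by conditioning the observed frequency on the forecast value (the outcomes are the same assumed sequence as above):

```python
import numpy as np

forecasts = np.array([0.6] * 5 + [0.1] * 5)
outcomes = np.array([1, 1, 0, 0, 0, 1, 0, 0, 0, 0])

# Relative frequency of rain for each distinct forecast probability.
for p in np.unique(forecasts):
    freq = outcomes[forecasts == p].mean()
    print(f"forecast {p:.0%}: rain observed {freq:.0%} of the time")
```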

15 Resolution Does the outcome change when the forecast changes?
Example: does above-normal rainfall become more frequent when its probability increases? Resolution is the crucial attribute of a good forecast. If the outcome differs depending on the forecast then the forecasts have useful information. If the outcome is the same regardless of the forecast the forecaster can be ignored.

16 Discrimination Does the forecast change when the outcome changes?
Forecasts as above: 01–05 May: 60%; 06–10 May: 10%.
When rain occurs, the average forecast probability is 43%. When it is dry, the average forecast probability is 31%. The forecast probability for rain is higher when it does rain – there is some discrimination.
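Discrimination conditions the other way around: in the same toy example, average the forecast probabilities given what happened (a Python sketch with the same assumed outcomes):

```python
import numpy as np

forecasts = np.array([0.6] * 5 + [0.1] * 5)
outcomes = np.array([1, 1, 0, 0, 0, 1, 0, 0, 0, 0])

print(f"mean forecast when it rained: {forecasts[outcomes == 1].mean():.0%}")  # 43%
print(f"mean forecast when it was dry: {forecasts[outcomes == 0].mean():.0%}")  # 31%
```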

17 Discrimination Does the forecast differ when the outcome differs?
Example: is the probability on above-normal rainfall higher when above-normal rainfall occurs compared to when rainfall is normal or below-normal? Discrimination is an alternative perspective to resolution. If the forecast differs given different outcomes then the forecasts have useful information. If the forecast is the same regardless of the outcome the forecaster can be ignored.

18 What makes a “good” probabilistic forecast?
Reliability: the event occurs as frequently as the forecast implies
Sharpness: the forecast probabilities frequently differ considerably from climatology
Resolution: the outcome differs when the forecast differs
Discrimination: the forecasts differ when the outcome differs

19 Verification in CPT In CPT, “verification” relates to the assessment of probabilistic predictions:
as retroactive predictions in CCA, PCR, MLR or GCM;
as inputs in PFV.
CPT does not produce cross-validated probabilistic predictions because it needs an independent estimate of the prediction error variance.

20 Retroactive forecasting
Given data for 1961 to date, it is possible to calculate a retroactive set of probabilistic forecasts. CPT will use an initial training period to cross-validate a model and make predictions for the subsequent year(s), then update the training period and predict additional years, repeating until all possible years have been predicted. For example:
1981: training period 1961–1980; predict 1981; omit 1982 onwards
1982: training period 1961–1981; predict 1982; omit 1983 onwards
1983: training period 1961–1982; predict 1983; omit 1984 onwards
1984: training period 1961–1983; predict 1984; omit 1985 onwards
1985: training period 1961–1984; predict 1985
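A schematic of the retroactive loop in Python-style pseudocode; fit_model, predict, and select_years are hypothetical placeholders, not CPT functions:

```python
def retroactive_forecasts(data, first_year=1961, initial_train_end=1980, last_year=2010):
    """Train on an expanding window, predict the next year, then grow the window."""
    forecasts = {}
    for target in range(initial_train_end + 1, last_year + 1):
        training = data.select_years(first_year, target - 1)  # hypothetical helper
        model = fit_model(training)          # hypothetical: e.g. cross-validated CCA/PCR
        forecasts[target] = predict(model, target)  # probabilistic forecast for 'target'
    return forecasts
```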

21 Probabilistic forecast input files
INDEX and STATION files:
cpt:ncats (the number of categories; must be 3)
cpt:C (start with category 1, i.e. below-normal; then repeat for category 2, i.e. normal; complete for all 3 categories, but make sure the probabilities add to 100)
Date (the period for which the forecast applies, not the date on which the forecast was made)
cpt:clim_prob (the climatological probability of each category)
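A schematic of how these tags might be arranged in a file, using only the tags named above; this is an illustrative sketch, not the exact CPT file syntax (the line layout and any additional required tags are assumptions):

```
cpt:ncats=3
cpt:C=1, cpt:clim_prob=0.33, Date=...   (below-normal probabilities per station/index)
cpt:C=2, cpt:clim_prob=0.33, Date=...   (normal probabilities)
cpt:C=3, cpt:clim_prob=0.33, Date=...   (above-normal; the three must sum to 100)
```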

22 Verification of probabilistic forecasts
Attributes Diagrams: graphs showing reliability, resolution, and sharpness
ROC Diagrams: graphs showing discrimination
Scores: a table of scores for probabilistic forecasts
Skill Maps: maps of scores for probabilistic forecasts
Tendency Diagram: graphs showing unconditional biases
Ranked Hits Diagram: graphs showing how frequently the observed categories had the highest probability
Weather Roulette: graphs showing estimates of forecast value

23 Attributes diagrams
The histograms show the sharpness. The vertical and horizontal lines show the observed climatology and indicate the forecast bias. The diagonal lines show reliability and “skill”. The coloured line shows the reliability and resolution of the forecasts. The dashed line shows a smoothed fit.

24 ROC diagrams ROC areas: do we issue a higher probability when the category occurs? Graph bottom left: when the probabilities are high, does the category occur? Graph top right: when the probabilities are low, does the category not occur?
The ROC area indicates the probability that the forecasts successfully discriminate an observation in the category of interest from one not in that category. For example, given two observations, one of which is above-normal and the other is not, ideally the forecast probability for above-normal should be higher for the observation that was above-normal than for the one that was not. In the example, the forecast probability for above-normal was higher in about 71% of all such possible comparisons.
The bottom left of the graph indicates how good the forecasts are when the probability for the respective category is high: if the graph here is steep and above the diagonal, then the forecasts discriminate that category well when the forecast probabilities are high (i.e., the forecasts give a good indication that the category will occur). The top right of the graph indicates how good the forecasts are when the probability for the respective category is low: if the graph here is shallow and above the diagonal, then the forecasts discriminate that category well when the forecast probabilities are low (i.e., the forecasts give a good indication that the category will not occur).
Retroactive forecasts of MAM 1986 – 2010 Thailand rainfall using February Pacific SSTs.
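The pairwise interpretation of the ROC area can be made concrete with the toy data from the skill slide (a Python sketch; it yields about 0.62 for the toy data, not the 0.71 of the Thailand example):

```python
import numpy as np

forecasts = np.array([0.6] * 5 + [0.1] * 5)
events = np.array([1, 1, 0, 0, 0, 1, 0, 0, 0, 0], dtype=bool)

# Compare every event/non-event pair; count wins (higher probability on the
# event) and half-credit ties, following the two-alternative forced-choice idea.
p_event, p_none = forecasts[events], forecasts[~events]
wins = (p_event[:, None] > p_none[None, :]).sum()
ties = (p_event[:, None] == p_none[None, :]).sum()
roc_area = (wins + 0.5 * ties) / (p_event.size * p_none.size)
print(f"ROC area = {roc_area:.2f}")
```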

25 Tendency diagrams Retroactive forecasts of MAM 1986 – 2010 Thailand rainfall using February Pacific SSTs. Shift towards above-normal was successfully predicted.

26 Ranked Hits diagrams
Legend: highest probability; second highest probability; lowest probability.
Retroactive forecasts of MAM 1986 – 2010 Thailand rainfall using February Pacific SSTs. The category with the highest probability is occurring most frequently.

27 Probabilistic scores Scores per category:
Brier score: mean squared error in probability (assuming that the probability should be 100% if the category occurs and 0% if it does not)
Brier skill score: % improvement over the Brier score of climatology forecasts (often pessimistic because of the strict requirement for reliability)
ROC area: probability of successfully discriminating the category (i.e., how frequently the forecast probability for that category is higher when it occurs than when it does not occur)
Resolution slope: % increase in observed frequency for each 1% increase in forecast probability

28 Probabilistic scores Overall scores:
Ranked probability score (RPS): mean squared error in cumulative probabilities
RPSS: % improvement over the RPS of climatology forecasts (often pessimistic because of the strict requirement for reliability)
2AFC score: probability of successfully discriminating the wetter or warmer category
Resolution slope: % increase in observed frequency for each 1% increase in forecast probability
Effective interest: % return given fair odds
Linear probability score: average probability on the category that occurs
Hit score (rank n): how often the category with the nth highest probability occurs
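A sketch of the ranked probability score and RPSS in Python; the two three-category forecasts and outcomes are hypothetical, and no normalization is applied to the sum (whether CPT divides by the number of categories minus one is not assumed here):

```python
import numpy as np

def rps(probs, obs_cat):
    """Ranked probability score: squared error of cumulative category
    probabilities, averaged over forecasts (no normalization applied)."""
    cum_f = np.cumsum(probs, axis=1)
    cum_o = np.cumsum(np.eye(probs.shape[1])[obs_cat], axis=1)
    return np.mean(np.sum((cum_f - cum_o) ** 2, axis=1))

# Hypothetical three-category (below/normal/above) forecasts and outcomes.
probs = np.array([[0.5, 0.3, 0.2],
                  [0.2, 0.3, 0.5]])
obs = np.array([0, 2])                         # below-normal, then above-normal
clim = np.full((2, 3), 1 / 3)                  # climatological reference forecasts
rpss = 1.0 - rps(probs, obs) / rps(clim, obs)
print(f"RPS = {rps(probs, obs):.3f}, RPSS = {rpss:.1%}")
```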

29 Forecast “goodness”
What makes a “good” forecast?
Consistency
Quality
Value
Murphy AH 1993; Wea. Forecasting 8, 281

30 Weather roulette – profits diagram
Given fair odds (pay-out per unit staked = 1 ÷ climatological probability), multiply the investment by the profit (or loss) to indicate how much money would be made (or lost); average over all locations.
The weather roulette results provide an estimate of the potential value of the forecasts. The underlying assumption is that an investment on the forecasts, apportioned according to the probabilities, receives fair odds. For example, given an initial $100 and a forecast probability of 50% on above-normal, $50 would be invested on above-normal; if that category occurs, the pay-out would be three times the stake, i.e. $150 in total, for a profit of 50%. Note that climatological probabilities would result in the investor breaking even. In the profits diagram, the profits or losses are calculated assuming that the same total amount is invested at the beginning of each forecast.
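A Python sketch of the pay-out arithmetic for a single forecast at one location, assuming three equiprobable climatological categories (so fair odds pay three times the stake):

```python
import numpy as np

capital = 100.0
forecast = np.array([0.2, 0.3, 0.5])   # below, normal, above (hypothetical)
observed = 2                           # above-normal occurred

stakes = capital * forecast            # apportion the investment by probability
payout = stakes[observed] / (1 / 3)    # fair odds: stake divided by climatological prob
profit = (payout - capital) / capital
print(f"pay-out = ${payout:.0f}, profit = {profit:.0%}")   # $150, 50%
```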

31 Weather roulette – cumulative profits diagram
Multiply the initial investment by the profit (or loss) carried over each year to indicate how much money would be made (or lost). In the cumulative profits diagram, the profits or losses from the first forecast are carried over to the next (but profits made at one location are not redistributed to other locations). Strictly speaking, the cumulative profits can only be calculated if the outcome of the first forecast is known before the second forecast commences (and so on for subsequent forecasts), which will not be the case if the forecasts are for overlapping seasons; however, CPT does not check for the overlap. Because the cumulative profits and losses are multiplicative rather than additive, the cumulative profits graph tends to increase exponentially if the forecasts have some skill, so care should be taken not to assume that a rapid increase in profits (as suggested in the example above) means the latest forecasts are better than the earliest ones. The profits graph gives a clearer indication of which forecasts have been more successful.

32 Weather roulette – effective interest rate diagram
Multiply the initial investment by the profit (or loss) carried over each year, and calculate the effective interest rate. As for the cumulative profits, the effective interest rate can only be calculated properly if the outcome of the first forecast is known before the second forecast commences (and so on for subsequent forecasts), which will not be the case if the forecasts are for overlapping seasons.
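One plausible way to express the compounded returns as an annual rate (a sketch; the exact definition CPT uses is an assumption here):

```python
# If wealth grows from W0 to Wn over n yearly forecasts, the equivalent
# constant interest rate r satisfies (1 + r)**n = Wn / W0.
n, wealth_ratio = 25, 3.0              # e.g. wealth tripled over 25 forecasts
r = wealth_ratio ** (1.0 / n) - 1.0
print(f"effective interest rate: {r:.1%} per year")   # about 4.5%
```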

33 Exercises Generate some retroactive forecasts.
How do the retroactive validation results compare to the cross-validated ones? Do the forecasts perform as well as you might have expected given the cross-validated skill measures? Explore the various verification options.

34 CPT Help Desk
web: iri.columbia.edu/cpt/
@climatesociety
…/climatesociety

