Potential predictability, ensemble forecasts and tercile probabilities
Michael Tippett, Tony Barnston, Andy Robertson (IRI); Richard Kleeman and Youmin Tang (CIMS, NYU)
Predictability (all potential). How much information is in the ensemble, and how is it best extracted for tercile probability forecasts? How much information is there in addition to the mean? The answer depends on ensemble size. The focus is on perfect model results. Perfect model means that forecast error is given by the ensemble spread and that nature behaves like one member of the ensemble. The claim is not that what works in the perfect model setting works in general. Rather, if something does not work in the perfect model setting, it will be even worse in general.
Predictability. Climatology: pdf of seasonal precipitation. Forecast: probability distribution based on additional information: initial conditions; boundary conditions (SST, soil); ENSO state. If the two distributions are the same, there is no additional information in the forecast. The greater the difference, the more information in the forecast. Predictability is often defined in relative terms. With no information, the best forecast is climatology. A perfect forecast pdf reflects the true change in distribution due to the additional information. Two measures of predictability: relative entropy (RE), and the departure of tercile probabilities from equal odds.
Measuring predictability. Relative entropy measures the difference between the forecast pdf p and the climatology pdf q. It measures changes in the mean and in higher order moments. Nice properties: it is invariant under invertible nonlinear transformations, so taking the log or square root does not change R, which makes it useful for non-Gaussian pdfs. Relative entropy is an information measure that arises naturally when comparing two distributions. The forecast pdf is not known, hence the perfect model setting. The invariance is also useful computationally. Gaussian approximation.
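For two Gaussian pdfs the relative entropy has a closed form, which makes the separate roles of a mean shift and a variance change explicit. The sketch below assumes the standard definition R = integral of p ln(p/q) and Gaussian p and q; the function name and the example numbers are illustrative and do not come from the slides.

```python
import math

def gaussian_relative_entropy(mu_p, var_p, mu_q, var_q):
    """Relative entropy (in nats) of forecast N(mu_p, var_p) with respect to
    climatology N(mu_q, var_q), using the closed form for two Gaussians."""
    # Contribution from the shift of the forecast mean away from climatology.
    signal = 0.5 * (mu_p - mu_q) ** 2 / var_q
    # Contribution from the change in variance (zero when var_p == var_q).
    dispersion = 0.5 * (math.log(var_q / var_p) + var_p / var_q - 1.0)
    return signal + dispersion

# Identical forecast and climatology: no information.
r_none = gaussian_relative_entropy(0.0, 1.0, 0.0, 1.0)
# A shift of one climatological standard deviation, variance unchanged.
r_shift = gaussian_relative_entropy(1.0, 1.0, 0.0, 1.0)
# Rescaling both pdfs (a change of units, here an arbitrary factor) leaves R unchanged.
scale = 25.4
r_a = gaussian_relative_entropy(2.0, 0.5, 0.0, 1.0)
r_b = gaussian_relative_entropy(2.0 * scale, 0.5 * scale ** 2, 0.0, scale ** 2)
```

The rescaling check only exercises a linear change of variable; the invariance under nonlinear transformations stated on the slide concerns the exact pdfs, not a Gaussian approximation applied after the transformation.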
Example: measure predictability by variance. Does predictability (variance) depend on the mean? The mean and variance of a Gaussian are uncorrelated. Raise the variable to a power, and the mean and variance become correlated. Is RE really useful? Relative entropy is uncorrelated with the mean (and the correlation is insignificant).
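The dependence of a variance-based measure on the choice of variable can be checked with a small Monte Carlo experiment. The sketch below is an illustration under assumed settings (500 synthetic "years", 24 members, ensemble means drawn uniformly between 5 and 10, squaring as the power transform); none of these numbers come from the slides.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)          # fixed seed so the sketch is reproducible
n_years, n_members = 500, 24    # assumed sizes, for illustration only

means, variances = [], []       # statistics of the Gaussian ensemble
means_sq, variances_sq = [], [] # statistics of the squared ensemble
for _ in range(n_years):
    mu = rng.uniform(5.0, 10.0)  # year-to-year change in the forecast mean
    ens = [rng.gauss(mu, 1.0) for _ in range(n_members)]
    ens_sq = [x ** 2 for x in ens]
    means.append(statistics.fmean(ens))
    variances.append(statistics.variance(ens))
    means_sq.append(statistics.fmean(ens_sq))
    variances_sq.append(statistics.variance(ens_sq))

# Gaussian variable: ensemble mean and variance are essentially uncorrelated.
r_gauss = pearson(means, variances)
# Squared variable: ensemble variance now tracks the ensemble mean.
r_power = pearson(means_sq, variances_sq)
```

In this setup the spread of the original variable is constant, so any "predictability" read off the variance of the squared variable is an artifact of the transformation.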
Relative entropy calculations. Measure relative entropy in two GCM simulations forced by observed SST. "Perfect model" potential predictability. Time series in three locations. JFM North America precipitation. How does relative entropy depend on the ensemble mean? On the ensemble variance? Perfect model in the sense that the distributions are computed from the GCM ensemble. The three locations are ones where SST forcing is important. Relative entropy is used to quantify the role of different measures of the pdf. This is complementary to previous work that has looked at changes in the pdf depending on ENSO state.
South Florida DJF. Black is the climatology pdf; gray is the forecast pdf with the largest RE, associated with the 1982-83 El Niño, a strong shift to wet conditions. Pluses mark RE values using the Gaussian approximation. The line is the 95% significance level (which depends on ensemble size). The GCMs are similar. Strong correlation between RE and the square of the ensemble mean. Weak correlation between RE and variance.
Kenya OND. The dashed line marks significant values of RE. One GCM has strong shifts, and all years are significant. The other has fewer years where RE is significant. Strong correlation with the ensemble mean. One GCM is noisier than the other.
NE Brazil MAM Similar behavior between the two GCMs in NE Brazil.
RE between varying and constant ensemble Top to bottom, Florida, GHA, NE Brazil. RE between time-varying ensemble and constant ensemble.
North America JFM precipitation. JFM precipitation totals. Average RE shows information in the usual places, especially the SE. Modest correlation with shift. Care is needed in interpretation because this looks like the correlation between the ensemble mean and its square. Is the relation with shift due to the relation with shift^2? This requires partial correlation. The partial correlation of shift, controlling for shift^2, is not significant. Correlation between RE and spread occurs mostly where there is no skill. Some of it is also due to the variance being correlated with the square of the mean.
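The partial-correlation check described here can be sketched with the standard first-order formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The data below are synthetic and purely illustrative: RE is constructed to depend only on shift^2 plus noise, and the shift is given a nonzero mean so that shift and shift^2 are correlated. All names and numbers are assumptions, not values from the GCM experiments.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def partial_correlation(x, y, z):
    """Correlation of x and y after linearly controlling both for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / ((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2)) ** 0.5

rng = random.Random(1)
# Synthetic ensemble-mean shifts with a nonzero mean (assumed values).
shift = [rng.gauss(0.3, 1.0) for _ in range(2000)]
shift_sq = [s ** 2 for s in shift]
# Synthetic RE that depends only on shift^2, plus noise.
rel_ent = [0.5 * s2 + rng.gauss(0.0, 0.2) for s2 in shift_sq]

r_raw = pearson(rel_ent, shift)                           # looks like a real relation
r_partial = partial_correlation(rel_ent, shift, shift_sq) # close to zero
```

The raw correlation with shift is clearly positive even though RE was built without any direct dependence on shift, which is the interpretation hazard the slide warns about.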
North America JFM precipitation. There are modest correlations between RE and the sign of NINO3.4. However, care is needed with interpretation. The correlation between NINO3.4 and its square is 0.3. Semi-partial correlation suggests that some of the correlation with NINO3.4 is due to the correlation with NINO3.4^2.
Summary. Relative entropy measures the difference between forecast and climatology pdfs: changes in mean, variance and higher order moments. For seasonal precipitation totals, RE is more closely related to changes in the mean than to changes in the variance. There is model dependence. Future questions: differing utility of predictions during warm/cold events.
Estimating tercile probabilities. Two approaches: count frequencies, or fit a PDF (Wilks (2002) for NWP; Kharin & Zwiers (2003)). The aim is to reduce sampling error in tercile probabilities. 2-tier seasonal forecasts: force GCMs with predicted SST, compute tercile probabilities from frequencies, post-process. Parameterize PDF = parameterize predictability. Are changes in probabilities related to the ensemble mean? The ensemble variance? Both? The general topic of this work is computing tercile probabilities from fitted distributions. There are two motivations. First, seasonal forecasts use model-based tercile probabilities as a key input. Second, departures of tercile probabilities from equally likely indicate predictability. Parametric descriptions allow us to associate those changes with changes in the pdf.
Outline
Quantify sampling error: analytical estimates (counting, Gaussian fit); sub-sampling from a large ensemble (perfect model; ECHAM 4.5 T42, 79 members; DJF precipitation over North America).
Fit parametric forecast PDFs: Gaussian (constant variance vs. variable variance); generalized linear model (ensemble mean vs. ensemble mean and variance).
Compare to observations.
Two main results. First, sampling error is quantified analytically and by sub-sampling from a large ensemble. Second, two fitting methods are used. Is sampling error reduced? Which parameters are useful to characterize changes in tercile probabilities?
Sampling error. How to measure sampling error? One way is to compare the sample (ensemble size N) tercile probability with the true tercile probability. Problem: the true probability is not known. Another way is to compare two independent samples. The variance of their difference is twice the true error variance. Do this in a Monte Carlo fashion, averaging over many samples.
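The two-sample idea can be verified in a setting where the true probability is known. The sketch below is an illustration under assumed settings: a standard normal forecast pdf with no signal, so the true below-normal probability is exactly 1/3, with 24 members and 4000 Monte Carlo trials. For counting, the error variance is the binomial value p(1 - p)/N.

```python
import random
from statistics import NormalDist

rng = random.Random(2)
n_members, n_trials = 24, 4000          # assumed sizes, for illustration only
true_p = 1.0 / 3.0
threshold = NormalDist().inv_cdf(true_p)  # lower tercile boundary of N(0, 1)

def counted_probability():
    """Below-normal probability estimated by counting members of one ensemble."""
    below = sum(1 for _ in range(n_members) if rng.gauss(0.0, 1.0) < threshold)
    return below / n_members

err_sq = diff_sq = 0.0
for _ in range(n_trials):
    p1, p2 = counted_probability(), counted_probability()
    err_sq += (p1 - true_p) ** 2   # error against the known truth
    diff_sq += (p1 - p2) ** 2      # difference between two independent samples

err_var = err_sq / n_trials
diff_var = diff_sq / n_trials      # expected to be about 2 * err_var
binomial_var = true_p * (1.0 - true_p) / n_members
```

Halving `diff_var` therefore gives an estimate of the counting error variance without reference to the true probability, which is what makes the approach usable when sub-sampling a large ensemble.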
S = signal-to-noise ratio (model dependent). N = ensemble size. DJF North America precipitation. Sub-sampling from an ensemble of size 79. This is the sampling error when tercile probabilities are calculated by counting. The error decreases with increasing ensemble size and depends on the signal-to-noise ratio, which is model dependent.
Fitting with a Gaussian. Fit only the mean, or fit the mean and variance. Compare the error of counting with that of fitting a Gaussian. There is also error when fitting a Gaussian, from two sources. First, the real PDF is not Gaussian; this is problem dependent. Second, there is sampling error in estimating the mean and variance; this is treated analytically. There is an expression for the no-signal case, with additional terms when there is a signal. Conclusion: if the PDF is really Gaussian, fit for better tercile probabilities. Error(Gaussian fit, N=24) = Error(Counting, N=40).
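The comparison between counting and Gaussian fitting can be illustrated with a Monte Carlo sketch in which the forecast pdf really is Gaussian. The settings are assumptions chosen for illustration: tercile boundaries from a standard normal climatology, a forecast pdf N(0.5, 1), 24 members and 3000 trials. Keeping the climatological variance equal to the forecast variance is a simplification; in a consistent setup the climatological variance would include the signal variance.

```python
import random
import statistics
from statistics import NormalDist

rng = random.Random(3)
n_members, n_trials = 24, 3000
lower_tercile = NormalDist().inv_cdf(1.0 / 3.0)  # from assumed N(0, 1) climatology
forecast = NormalDist(mu=0.5, sigma=1.0)         # assumed true forecast pdf
true_p = forecast.cdf(lower_tercile)             # true below-normal probability

count_sq = fit_sq = 0.0
for _ in range(n_trials):
    ens = [rng.gauss(forecast.mean, forecast.stdev) for _ in range(n_members)]
    # Estimate 1: fraction of members below the tercile boundary.
    p_count = sum(x < lower_tercile for x in ens) / n_members
    # Estimate 2: fit mean and standard deviation, then integrate the Gaussian.
    fitted = NormalDist(statistics.fmean(ens), statistics.stdev(ens))
    p_fit = fitted.cdf(lower_tercile)
    count_sq += (p_count - true_p) ** 2
    fit_sq += (p_fit - true_p) ** 2

rms_count = (count_sq / n_trials) ** 0.5
rms_fit = (fit_sq / n_trials) ** 0.5   # smaller than rms_count in this setting
```

The advantage of fitting shown here relies on the Gaussian assumption being true; with a non-Gaussian pdf the fit introduces a bias that the experiment above does not measure.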
Generalized Linear Model. Logistic regression. Regression between tercile probabilities and either the ensemble mean or the ensemble mean and variance. Why GLM? The relation is nonlinear, since probabilities are bounded, and the errors are not normal. Similar to Gaussian fitting if the pdf is Gaussian. An empirical approach. Prove that GLM is the same as fitting for Gaussian variables.
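A logistic regression of the below-normal probability on the ensemble mean can be sketched without external libraries by maximizing the binomial likelihood with Newton iterations. This is a minimal illustration on synthetic data, not the configuration used in the talk: the signal standard deviation of 0.7, unit noise variance, 53 years and 24 members are assumed values, and `fit_binomial_logit` is a hypothetical helper name.

```python
import math
import random
import statistics
from statistics import NormalDist

def fit_binomial_logit(x, k, n, n_iter=25):
    """Fit P = 1 / (1 + exp(-(a + b*x))) to counts k out of n by Newton's method."""
    a = b = 0.0
    for _ in range(n_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, ki in zip(x, k):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = n * p * (1.0 - p)
            g_a += ki - n * p            # gradient of the log-likelihood
            g_b += (ki - n * p) * xi
            h_aa += w                    # negative Hessian entries
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

rng = random.Random(4)
n_years, n_members, signal_sd = 53, 24, 0.7
# Climatology includes both signal and noise variance (noise variance is 1).
clim = NormalDist(0.0, math.sqrt(signal_sd ** 2 + 1.0))
lower_tercile = clim.inv_cdf(1.0 / 3.0)

ens_means, below_counts = [], []
for _ in range(n_years):
    mu = rng.gauss(0.0, signal_sd)
    ens = [rng.gauss(mu, 1.0) for _ in range(n_members)]
    ens_means.append(statistics.fmean(ens))
    below_counts.append(sum(x < lower_tercile for x in ens))

a, b = fit_binomial_logit(ens_means, below_counts, n_members)
p_below_at_zero = 1.0 / (1.0 + math.exp(-a))  # fitted probability for zero shift
```

The fitted slope `b` is negative, since a wetter ensemble mean lowers the below-normal probability. Adding the ensemble variance as a second predictor would require a three-parameter version of the same iteration.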
PDF is Gaussian.
PDF is not Gaussian. (square of Gaussian)
Estimate sampling error by sub-sampling. Randomly select 24 members. Compute DJF (1951-2003) precipitation tercile probabilities by: counting (theory predicts average rms error = 10.9); fitting a Gaussian (constant variance; interannually varying variance); GLM (ensemble mean; ensemble mean and variance). Compare with the frequency probability from the independent 55-member ensemble. Adding more parameters better fits the 24 but not the 55. Results compare counting, fitting a Gaussian and GLM. Precipitation is not Gaussian, so the square root of precipitation is used. This approach allows us to determine which parameters are useful.
Figure panels: Counting and Gaussian (square-root); below and above terciles; N=24. 9.25 = rms error of sampling with a 40-member ensemble. Gaussian fitting gives a reduction in error that is equivalent to going from 24 to 40 members. Some problems arise because the PDF is not really Gaussian. Time-varying Gaussian.
Figure panels: Counting and Regression with mean (square-root); below and above terciles; N=24. GLM results are similar, on average. There is some indication that GLM is slightly better in regions where the model is more confident and worse elsewhere. Many Monte Carlo draws were used, so the results are robust. Regression with mean and variance.
1996, N=24. Figure panels: Sample and Regression; below and above terciles. Illustrates the character of the tercile probabilities based on fitted distributions. Both use the same 24 members; the fitted probabilities are spatially smoother.
1998, N=24. Figure panels: Sample and Regression; below and above terciles. With strong SST forcing, there are still strong shifts.
Calibration. Calibrated = model probability + past performance. Bayesian: models are weighted with climatology; weights are chosen to maximize the likelihood of the observations. Gaussian: inflate the variance; the variance ("standard error") is determined by the correlation of the ensemble mean with observations. GLM (not cross-validated).
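One reading of the Gaussian calibration is a regression forecast in standardized units: the forecast mean is the correlation r times the standardized ensemble-mean anomaly, and the forecast variance is the regression standard error 1 - r^2. The sketch below follows that reading as an assumption; the slides do not spell out the formula, and `calibrated_tercile_probs` is a hypothetical name.

```python
import math
from statistics import NormalDist

def calibrated_tercile_probs(z_ens_mean, r):
    """Tercile probabilities from a Gaussian with mean r*z and variance 1 - r^2.

    z_ens_mean: standardized ensemble-mean anomaly (assumed input convention).
    r: correlation of the ensemble mean with observations, with |r| < 1.
    """
    lower = NormalDist().inv_cdf(1.0 / 3.0)  # tercile boundaries of N(0, 1)
    upper = -lower
    forecast = NormalDist(mu=r * z_ens_mean, sigma=math.sqrt(1.0 - r * r))
    below = forecast.cdf(lower)
    above = 1.0 - forecast.cdf(upper)
    return below, 1.0 - below - above, above

# No skill (r = 0): the forecast collapses to climatological equal odds.
no_skill = calibrated_tercile_probs(1.5, 0.0)
# Moderate skill and a wet ensemble mean: probability moves to the upper tercile.
wet = calibrated_tercile_probs(1.5, 0.6)
```

With this construction a model with low correlation cannot issue sharp probabilities no matter how large its ensemble-mean shift, which is the sense in which the variance is inflated by past performance.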
Figure: RPSS for N=8. Panels: Count, Fit, GLM, Bayesian, Fit + standard error.
Figure: RPSS for N=39. Panels: Count, Fit, GLM, Bayesian, Fit + standard error.
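The ranked probability skill score (RPSS) used in these comparisons can be sketched as follows, assuming the usual definition: the ranked probability score (RPS) is the sum of squared differences between cumulative forecast and cumulative observed probabilities, and RPSS = 1 - RPS_forecast / RPS_climatology with equal-odds climatology as the reference. The forecasts and observed categories below are made-up numbers for illustration, and the RPS is left unnormalized (some conventions divide by the number of categories minus one, which does not change RPSS).

```python
def rps(probs, obs_category):
    """Ranked probability score of one categorical forecast (0 is perfect)."""
    cum_forecast = cum_obs = total = 0.0
    for k, p in enumerate(probs):
        cum_forecast += p
        cum_obs += 1.0 if k == obs_category else 0.0
        total += (cum_forecast - cum_obs) ** 2
    return total

def rpss(forecasts, observations, n_categories=3):
    """Skill relative to equal-odds climatology, aggregated over all forecasts."""
    clim = [1.0 / n_categories] * n_categories
    rps_fcst = sum(rps(f, o) for f, o in zip(forecasts, observations))
    rps_clim = sum(rps(clim, o) for o in observations)
    return 1.0 - rps_fcst / rps_clim

# Categories: 0 = below normal, 1 = near normal, 2 = above normal.
observed = [0, 2, 1, 2]
sharp = [[0.6, 0.3, 0.1], [0.1, 0.3, 0.6], [0.2, 0.6, 0.2], [0.2, 0.3, 0.5]]
equal_odds = [[1.0 / 3.0] * 3 for _ in observed]

skill_sharp = rpss(sharp, observed)      # positive: better than climatology
skill_clim = rpss(equal_odds, observed)  # zero by construction
```

Because RPS penalizes overconfident misses heavily, noisy counted probabilities from a small ensemble can score below climatology even when the underlying shift is correct.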
Summary. A large ensemble was used to look at sampling error in perfect model tercile probabilities. The error is well described analytically and depends on sample size and S/N ratio. Parametric fitting reduces tercile probability sampling error. For Gaussian fitting and GLM, most of the useful information is associated with the ensemble mean. Smoothed probabilities are better input to calibration. Gaussian + standard error works.