The Expanded UW SREF System and Statistical Inference
STAT 592 Presentation
Eric Grimit

OUTLINE
1. Description of the Expanded UW SREF System (How is this thing created?)
2. Spread-error Correlation Theory, Results, and Future Work
3. Forecast Verification Issues
Core Members of the Expanded UW SREF System
- MM5 runs driven by multiple large-scale analyses / forecasts (ICs and LBCs)
- M = 7, plus CENT-MM5 (the centroid run)
- Is this enough?
Generating Additional Initial Conditions

POSSIBILITIES:
- Random perturbations: simplistic approach (no one has tried it yet)
- Breeding growing modes (BGM) and singular vectors (SV): insufficient for the short range, inferior to PO, and computationally expensive
- Perturbed obs (PO) / EnKF / EnSRF: may be the optimal approach (unproven)
- Ensembles of initializations: uses Bayesian melding (under development)
- Linear combinations*: Selected Important Linear Combinations (SILC)?

Why Linear Combinations?
- Founded on the idea of "mirroring" (Tony Eckel):
  IC* = CENT + PF * (CENT - IC) ; PF = 1.0
- Computationally inexpensive (restricts dimensionality to M = 7)
- May be extremely cost effective
- Can test the method now
- Size of the perturbations is controlled by the spread of the core members
Illustration of "mirroring"

STEP 1: Calculate the best guess for truth (the centroid) by averaging all analyses.
STEP 2: Find the error vector in model phase space between one analysis and the centroid by differencing all state variables over all grid points.
STEP 3: Make a new IC by mirroring that error about the centroid: IC* = CENT + (CENT - IC).

[Figure: sea level pressure (mb) analyses over the NE Pacific (170°W-135°W, ~1000 km scale) from eta, ngps, tcwb, gasp, avn, ukmo, and cmcg, with the centroid (cent) and the mirrored IC (cmcg*) on opposite sides of it.]
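A minimal sketch of STEPs 1-3 in code (an illustration, not the operational implementation), assuming each analysis is a NumPy array of state variables on a common grid:

import numpy as np

def mirror_ic(analyses, member, pf=1.0):
    """Mirror one analysis about the centroid: IC* = CENT + PF * (CENT - IC)."""
    cent = np.mean(analyses, axis=0)  # STEP 1: centroid = average of all analyses
    error = cent - member             # STEP 2: error vector in model phase space
    return cent + pf * error          # STEP 3: reflect the error about the centroid

# Hypothetical example: 7 analyses of one field on a 50x50 grid
analyses = np.random.randn(7, 50, 50)
cmcg_star = mirror_ic(analyses, analyses[0])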
Two groups of "important" LCs:

(x) mirrors:
    X_m* = ((1+PF)/M) * sum_{i=1..M} X_i - PF * X_m ;  m = 1, 2, ..., M  (PF = 1.0)

(+) inflated sub-centroids:
    X_mn* = ((1+PF)/M) * sum_{i=1..M} X_i - (PF/2) * (X_m + X_n) ;  m,n = 1, 2, ..., M ; m ≠ n
    with PF^2 = 2*(M-1)/(M-2), which inflates the sub-centroid perturbations to the same expected size as the mirror perturbations

- Must restrict selection of LCs to physically/dynamically "important" ones
- At the same time, try for equally likely ICs
- Sample the "cloud" as completely as possible with a finite number (i.e., fill in the holes)
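A sketch of how the two LC families could be generated (illustrative only; the function name and array shapes are assumptions):

import numpy as np
from itertools import combinations

def silc_members(analyses):
    """Build the two LC families from M analyses (array of shape (M, ...))."""
    M = len(analyses)
    cent = analyses.mean(axis=0)
    # Mirrors: X_m* = CENT + (CENT - X_m), i.e. PF = 1.0
    mirrors = [2.0 * cent - x for x in analyses]
    # Inflated sub-centroids: reflect each pair mean about the centroid,
    # with PF^2 = 2(M-1)/(M-2) so the perturbation size matches the mirrors
    pf = np.sqrt(2.0 * (M - 1) / (M - 2))
    subcents = [cent + pf * (cent - 0.5 * (analyses[m] + analyses[n]))
                for m, n in combinations(range(M), 2)]
    return mirrors, subcents

# With M = 7 this yields 7 mirrors and 21 sub-centroid members
mirrors, subcents = silc_members(np.random.randn(7, 50, 50))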
Verification: Root Mean Square Error (RMSE) by Grid Point

[Figure: RMSE of MSLP (mb) at 12 h, 24 h, 36 h, and 48 h lead times for each core member (cmcg, avn, eta, ngps, ukmo, tcwb), its mirror (*), and the centroid (cent); two panels, one for the 36-km outer domain and one for the 12-km inner domain.]
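For reference, the bulk score plotted above is just the grid-point RMSE; a minimal sketch:

import numpy as np

def rmse(forecast, analysis):
    """Root mean square error over all grid points."""
    return float(np.sqrt(np.mean((forecast - analysis) ** 2)))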
Summary of Initial Findings
- The set of 15 ICs for UW SREF is not optimal, but may be good enough to represent important features of analysis error
- The centroid may be the best-bet deterministic model run, in the big picture
- Need further evaluation:
  - How often does the ensemble fail to capture the truth?
  - How reliable are the probabilities?
  - Does the ensemble dispersion represent forecast uncertainty?

Future Work
1. Evaluate the expanded UW MM5 SREF system and investigate multimodel applications
2. Develop a mesoscale forecast skill prediction system
3. Additional work:
   - mesoscale verification
   - probability forecasts
   - deterministic-style solutions
   - additional forecast products/tools (visualization)
Spread-error Correlation Theory

Houtekamer 1993 (H93) model: log S ~ N(0, σ²), E | S ~ N(0, S²)

    Corr(S, |E|) = sqrt[ 2*(1 - exp(-σ²)) / (π - 2*exp(-σ²)) ]

"This study neglects the effects of model errors. This causes an underestimation of the forecast error. This assumption probably causes a decrease in the correlation between the observed skill and the predicted spread."

...agrees with Raftery's BMA variance formula:

    Var[Q | D] = E_k[ Var(Q | D, M_k) ] + Var_k( E[Q | D, M_k] )
                 ("avg within model variance")  ("avg between model variance")
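A quick Monte Carlo check of the H93 closed form under the stated toy model (the simulation itself is an addition here, not part of the talk):

import numpy as np

def h93_monte_carlo(sigma, n=200_000, seed=0):
    """Sample Corr(S, |E|) with log S ~ N(0, sigma^2) and E | S ~ N(0, S^2)."""
    rng = np.random.default_rng(seed)
    S = np.exp(rng.normal(0.0, sigma, n))   # lognormal spread
    E = rng.normal(0.0, 1.0, n) * S         # error with standard deviation S
    return np.corrcoef(S, np.abs(E))[0, 1]

def h93_closed_form(sigma):
    e = np.exp(-sigma**2)
    return np.sqrt(2.0 * (1.0 - e) / (np.pi - 2.0 * e))

# Both should agree (about 0.53 at sigma = 0.5); the correlation
# saturates near sqrt(2/pi) ~ 0.8 as sigma grows.
print(h93_monte_carlo(0.5), h93_closed_form(0.5))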
RESULTS: 10-m WDIR, Jan-Jun 2000 (Phase I)

Observed correlations are greater than those predicted by the H93 model.

Possible explanations:
- Artifact of the way spread and error are calculated!
- Accounting for some of the model error?
- Luck?
RESULTS: 2-m TEMP, Jan-Jun 2000 (Phase I)

What's happening here?
- Error saturation?
- Differences in ICs are not as important for surface temperature
Another Possible Predictor of Skill

Temporal short-range ensemble with the centroid runs:
- Spread of a temporal ensemble ~ forecast consistency
- Temporal ensemble = lagged forecasts all verifying at the same time (M = 4)

[Schematic: CENT-MM5 runs started at 00 UTC T-48 h, 12 UTC T-36 h, 00 UTC T-24 h, and 12 UTC T-12 h, so that the F48, F36, F24, and F12 forecasts all verify at time T against the F00* "adjusted" CENT-MM5 analysis, which does not have mesoscale features.]

BENEFITS:
- Yields mesoscale temporal spread
- Less sensitive to one synoptic-scale model's time variability
- Best forecast estimate of "truth"
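A minimal sketch of the temporal spread computation, assuming the lagged forecast fields are stacked in one array:

import numpy as np

def temporal_spread(lagged_forecasts):
    """Grid-point std dev across lagged forecasts valid at the same time.
    lagged_forecasts: array (M, ny, nx), e.g. the F48, F36, F24, F12 fields."""
    return np.std(lagged_forecasts, axis=0, ddof=1)

# Hypothetical example: M = 4 lagged MSLP forecasts on a 50x50 grid
spread = temporal_spread(np.random.randn(4, 50, 50))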
Future Investigation: Developing a Prediction System for Forecast Skill

- Are spread and skill well correlated for other parameters (e.g., wind speed and precipitation)?
  - use a sqrt or log transform so the data are closer to normally distributed
- Do spread-error correlations improve after bias removal?
- What is "high" and "low" spread?
  - need a spread climatology, i.e., a large data set
- What are the synoptic patterns associated with "high" and "low" spread cases?
  - use NCEP/NCAR reanalysis data and compositing software
- How do the answers change for the expanded UW MM5 ensemble?
- Can a better single predictor of skill be formed from the two individual predictors, IC spread and temporal spread? (one possibility is sketched below)
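One simple way to combine the two predictors (an illustration of the idea, not a method from the talk) is a least-squares regression of the observed error magnitude on both spreads:

import numpy as np

def combined_skill_predictor(ic_spread, time_spread, abs_error):
    """Fit |E| ~ a + b*S_ic + c*S_time over a training sample;
    returns the coefficients (a, b, c)."""
    X = np.column_stack([np.ones_like(ic_spread), ic_spread, time_spread])
    coef, *_ = np.linalg.lstsq(X, abs_error, rcond=None)
    return coef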
Mesoscale Verification Issues

Will verify 2 ways:
1. At the observation locations (as before)
2. Using a gridded mesoscale analysis

SIMPLE possibilities for the gridded dataset:
- "Adjusted" centroid analysis (run MM5 for < 1 h)
  - verification has the same scales as the forecasts
  - useful for creating verification rank histograms (see the sketch below)
- Bayesian combination of the "adjusted" centroid with observations (e.g., Fuentes and Raftery 2001)
  - accounts for scale differences (the change-of-support problem)
  - can correct for MM5 biases

[Diagram, after Fuentes and Raftery 2001: the true values underlie both the observations (through measurement error) and the "adjusted" CENT-MM5 output (through bias parameters and noise), with separate large-scale and small-scale structure components.]
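A minimal sketch of a verification rank histogram, assuming ensemble and verification fields flattened to 1-D samples (ties between members and the verification are ignored here):

import numpy as np

def rank_histogram(ensemble, verification):
    """Count the rank of the verifying value within each sorted M-member
    ensemble; a flat histogram suggests a well-dispersed ensemble.
    ensemble: (M, n) array; verification: (n,) array."""
    ranks = np.sum(ensemble < verification[None, :], axis=0)  # ranks 0..M
    M = ensemble.shape[0]
    return np.bincount(ranks, minlength=M + 1)

# Hypothetical example: 15-member ensemble at 1000 verification points
counts = rank_histogram(np.random.randn(15, 1000), np.random.randn(1000))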
Limitations of Traditional Bulk Error Scores
- biased toward the mean
- can get spurious zero errors by coincidence, not skill
- can also be blind to position, phase, and/or rotation errors
- this affects measurements of both spread and error!

Need to try new methods of verification...
1. Consider the gradient of a field, not just the magnitude (a sketch follows below)
   - addresses false zero errors / blindness to errors in the first derivative of a field
   - still biased toward the mean
2. Pattern recognition software
   - would penalize the mean for the absence/smoothness of features
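One way idea 1 might look in code (a sketch, assuming 2-D fields on a uniform grid with spacing dx):

import numpy as np

def gradient_rmse(forecast, analysis, dx=1.0):
    """RMSE of the horizontal gradient field: compares first derivatives,
    so a forecast no longer scores a false zero error where the two
    fields merely cross in value."""
    gf_y, gf_x = np.gradient(forecast, dx)
    ga_y, ga_x = np.gradient(analysis, dx)
    return float(np.sqrt(np.mean((gf_x - ga_x)**2 + (gf_y - ga_y)**2)))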