Slide 1: Toward Short-Range Ensemble Prediction of Mesoscale Forecast Skill
Eric P. Grimit, University of Washington
General Examination Presentation, 22 May 2003, 1:30 PM
Supported by: NWS Western Region / UCAR-COMET Student Career Experience Program (SCEP); DoD Multi-Disciplinary University Research Initiative (MURI)
Slide 2: Forecasting Forecast Skill
Like any other scientific prediction or measurement, weather forecasts should be accompanied by error bounds, or a statement of uncertainty, e.g. T2m = 3 °C ± 2 °C, equivalently P(T2m < 0 °C) = 6.7%.
Atmospheric predictability changes from day to day and depends on:
- the atmospheric flow configuration
- the magnitude and orientation of initial-state errors
- the sensitivity of the flow to the initial-state errors
- numerical model deficiencies
- the sensitivity of the flow to model errors
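The 6.7% on the slide is just the Gaussian tail probability implied by the stated error bound; a minimal check (values from the slide, scipy assumed available):

```python
from scipy.stats import norm

# P(T2m < 0 degC) for a forecast of 3 degC with a 2 degC standard deviation
p_freeze = norm.cdf(0.0, loc=3.0, scale=2.0)
print(f"{p_freeze:.4f}")  # ~0.0668, i.e. the 6.7% quoted on the slide
```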
Slide 3: Forecasting Forecast Skill
- Operational forecasters need this crucial information to know how much to trust model forecast guidance. Current uncertainty knowledge is partial and largely subjective.
- End users could greatly benefit from knowing the expected forecast reliability:
  - Sophisticated users can make optimal decisions in the face of uncertainty (economic cost-loss or utility), e.g. take protective action if P(T2m < 0 °C) > cost/loss.
  - Common users of weather forecasts: a confidence index.
[Graphic: multi-day forecast with confidence index, e.g. FRI: Showers, Low 46°F, High 54°F; SAT: Showers, Low 47°F, High 57°F]
Slide 4: Probabilistic Weather Forecasts
Ensemble weather forecasting diagnoses the sensitivity of the predicted flow to initial-state and model errors, provided those errors are well sampled.
Slide 5: Probabilistic Weather Forecasts
Agreement or disagreement among ensemble member forecasts provides information about forecast certainty or uncertainty: agreement suggests a better (more reliable) forecast, disagreement a worse one.
Traditional approach: use ensemble forecast variance as a predictor of forecast skill.
Slide 6: The Traditional Spread Approach: Spread-Skill Correlation Theory
Notation: σ = ensemble standard deviation (spread); β = temporal spread variability; E = ensemble forecast error (skill).
Under the Houtekamer (1993) model (infinite ensemble, with E taken as the ensemble-mean error):

    ρ²(σ, |E|) = [1 − exp(−β²)] / [π/2 − exp(−β²)],  where β = std(ln σ)

- The spread-skill correlation depends on the time variation of spread.
- For constant spread from day to day (β = 0): ρ = 0.
- For large spread variability (β → ∞): ρ → sqrt(2/π) ≈ 0.8.
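A small numerical rendering of the correlation expression above, handy for reproducing the two limiting cases quoted on the slide:

```python
import numpy as np

def houtekamer_corr(beta: float) -> float:
    """Spread-skill correlation rho(sigma, |E|) for lognormally
    distributed spread with beta = std(ln sigma) (Houtekamer 1993)."""
    return np.sqrt((1.0 - np.exp(-beta**2)) / (np.pi / 2.0 - np.exp(-beta**2)))

print(houtekamer_corr(1e-6))  # ~0: constant day-to-day spread, no correlation
print(houtekamer_corr(10.0))  # ~sqrt(2/pi) ~ 0.80: large-variability ceiling
```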
Slide 7: Observed Forecast Skill Predictability
- Tropical cyclone tracks [cf. Goerss 2000], SAMEX '98 SREFs [cf. Hou et al. 2001], and NCEP SREF precipitation [cf. Hamill and Colucci 1998]: highly scattered spread-skill relationships, and thus low correlations.
- Northwest MM5 SREF 10-m wind direction [cf. Grimit and Mass 2002]: a unique 5-member short-range ensemble developed in 2000 showed promise, with spread-skill correlations near 0.6, higher for cases with extreme spread.
Slide 8: Temporal (Lagged) Ensemble
- Related to dprog/dt and lagged-average forecasting (LAF) [Hoffman and Kalnay 1983; Reed et al. 1998; Palmer and Tibaldi 1988; Roebber 1990; Brundage et al. 2001; Hamill 2003].
- Palmer and Tibaldi (1988) and Roebber (1990) found lagged-forecast spread to be moderately correlated with lagged-forecast skill.
- Roebber (1990) did not look for a correlation between lagged-forecast spread and current forecast skill.
- Question: is temporal ensemble spread a useful second predictor of the current forecast skill?
Slide 9: Estimating Forecast Skill: Verification
- A choice must be made whether to compare forecasts and verifications in grid-box space or in observation space.
- Representing data at a scale other than its own inherent scale introduces an error.
- Verification schemes introduce their own error, potentially masking true forecast error, especially for fields with large small-scale variability and for low observation density (grid-based verification).
[Figure: idealized verification experiments with scattered observation sites (Grimit et al. 200x)]
Slide 10: Estimating Forecast Skill: Verification
The scoring metric is user-dependent:
- Deterministic or probabilistic?
- Categorized?
- Are timing errors important? [cf. Mass et al. 2002]
Slide 11: Limitations to Forecast Skill Prediction
- Definition of forecast skill: the traditional spread approach is inherently deterministic, while a fully probabilistic approach requires an accurately forecast PDF.
- In practice, the PDF is not well forecast:
  - under-dispersive ensemble forecasts ('U'-shaped verification rank histograms)
  - under-sampling (distribution tails not well captured)
  - unaccounted-for sources of uncertainty (e.g., sub-grid-scale processes)
- Superior ensemble generation and/or statistical post-processing is needed to accurately depict the true forecast PDF.
- To bridge the gap, find ways to extract flow-dependent uncertainty information from current (suboptimal) ensembles.
[Figure: verification rank histograms for well-calibrated, under-dispersive, and over-dispersive ensembles]
Slide 12: Project Goal
Develop a short-range forecast skill prediction system using an imperfect mesoscale ensemble (short-range = 0-48 h; imperfect = suboptimal, i.e. cannot correctly forecast the PDF).
- Estimate the upper bound of forecast skill predictability.
- Assess the sensitivity of the spread-skill relationship to different metrics.
- Use the existing MM5 SREF system, a unique resource with adequate sampling of initial-state uncertainty.
- Include a spatially and temporally dependent bias correction.
- Use temporal ensemble spread as a secondary predictor of forecast skill, if viable.
- Attempt a new method of probabilistic forecast skill prediction.
Slide 13: Simple Stochastic Spread-Skill Model
An extension of the Houtekamer (1993) model.
Slide 14: Simple Stochastic Spread-Skill Model
Stochastically simulated ensemble forecasts at a single grid point with 50,000 realizations (cases); perfect ensemble forecasts and Gaussian statistics are assumed. Varied: (1) temporal spread variability (β); (2) finite ensemble size (M); (3) spread and skill metrics.
1. Draw today's "forecast uncertainty" from a lognormal distribution (Houtekamer 1993 model): ln(σ) ~ N(ln(σ_f), β²).
2. Create synthetic ensemble forecasts by drawing M values from the "true" distribution (perfect ensemble): F_i ~ N(Z, σ²), i = 1, 2, ..., M.
3. Draw the verifying observation from the same "true" distribution: V ~ N(Z, σ²).
4. Calculate ensemble spread and skill using varying metrics.
Slide 15: Simple Model Results: Traditional Spread-Skill
- STD-AEM correlation increases with spread variability and ensemble size.
- STD-AEM correlations asymptote to the Houtekamer (1993) theoretical values.
(STD = standard deviation; AEM = absolute error of the ensemble mean)
Slide 16: What Measure of Skill?
- STD is a better predictor of the average ensemble member error than of the ensemble mean error: AEM = |mean(E)|, whereas MAE = mean(|E|).
- Different measures of ensemble variation may be required to predict other measures of skill.
Spread metric: STD = standard deviation.
Error metrics: RMS = root-mean-square error; MAE = mean absolute error; AEM = absolute error of the ensemble mean; AEC = absolute error of a control.
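For concreteness, a sketch of the spread and skill metrics as defined above (F = member forecasts for one case, v = verification); AEC would additionally require designating a control member:

```python
import numpy as np

def spread_std(F):    # ensemble standard deviation (spread)
    return np.std(F, ddof=1)

def skill_aem(F, v):  # absolute error of the ensemble mean: |mean(E)|
    return abs(np.mean(F) - v)

def skill_mae(F, v):  # mean absolute member error: mean(|E|)
    return np.mean(np.abs(F - v))

def skill_rms(F, v):  # root-mean-square member error
    return np.sqrt(np.mean((F - v) ** 2))
```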
Slide 17: [Figure: scatterplots of STD-AEM correlation versus STD-RMS correlation; annotation: "Linear?"]
Slide 18: Mesoscale Ensemble Forecast and Verification Data
Two suboptimal mesoscale short-range ensembles designed for the U.S. Pacific Northwest.
Slide 19: The Challenges for Mesoscale SREF
- Development of SREF systems lags behind that of large-scale, medium-range ensemble prediction systems.
- The limited-area domain (and the attendant need for lateral boundary conditions) may constrain mesoscale ensemble spread [Errico and Baumhefner 1987; Paegle et al. 1997; Du and Tracton 1999; Nutter 2003].
- Error growth due to model deficiency plays a significant role in the short range [Brooks and Doswell 1993; Stensrud et al. 2000; Orrell et al. 2001].
- Error growth is predominantly linear in the short range (< 24 h) [Gilmour et al. 2001].
- IC selection methodologies from medium-range ensembles do not carry over well to short-range ensembles.
- A suboptimal but highly effective approach was adopted in 2000: use multiple analyses/forecasts from major operational weather centers.
Slide 20: Grid Sources for the Multi-Analysis Approach
(Resolutions are approximate at 45°N; "computational" is the native model resolution, "distributed" the resolution of the disseminated grids.)
- avn: Global Forecast System (GFS), National Centers for Environmental Prediction. Spectral; computational T254/L64 (~55 km); distributed 1.0°/L14 (~80 km); SSI objective analysis.
- cmcg: Global Environmental Multi-scale (GEM), Canadian Meteorological Centre. Spectral; T199/L28 (~100 km); 1.25°/L11 (~100 km); 3D-Var.
- eta: Eta limited-area mesoscale model, National Centers for Environmental Prediction. Finite difference; 12 km/L45; 90 km/L37; SSI.
- gasp: Global AnalysiS and Prediction model, Australian Bureau of Meteorology. Spectral; T239/L29 (~60 km); 1.0°/L11 (~80 km); 3D-Var.
- jma: Global Spectral Model (GSM), Japan Meteorological Agency. Spectral; T106/L21 (~135 km); 1.25°/L13 (~100 km); OI.
- ngps: Navy Operational Global Atmospheric Prediction System, Fleet Numerical Meteorological & Oceanographic Center. Spectral; T239/L30 (~60 km); 1.0°/L14 (~80 km); OI.
- tcwb: Global Forecast System, Taiwan Central Weather Bureau. Spectral; T79/L18 (~180 km); 1.0°/L11 (~80 km); OI.
- ukmo: Unified Model, United Kingdom Meteorological Office. Finite difference; 5/6° × 5/9°/L30 (~60 km); same/L12; 3D-Var.
Slide 21: Multi-Analysis, Fixed Physics: ACME core
- Single limited-area mesoscale modeling system (MM5).
- 2-day (48-h) forecasts at 0000 UTC, run in real time since January 2000.
- Initial condition selection: large-scale, multi-analysis (from different operational centers).
- Lateral boundary conditions: prescribed by the corresponding large-scale forecasts.
[Figure: configurations of the MM5 short-range ensemble grid domains. (a) Outer 151 × 127 domain with 36-km horizontal grid spacing. (b) Inner 103 × 100 domain with 12-km horizontal grid spacing.]
Slide 22: Multi-Analysis, Mixed Physics: ACME core+
See Eckel (2003) for further details.
Slide 23: Temporal (Lagged) Ensemble: Using Lagged-Centroid Forecasts
Advantages:
- Run-to-run consistency of the best deterministic forecast estimate of "truth" (without any weighting).
- Less sensitive to a single member's temporal variability.
- Yields mesoscale spread (equal weighting of the lagged forecasts).
[Schematic: lagged centroid forecasts spanning the analysis region to the 48-h forecast region]
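A minimal sketch of how the lagged-centroid temporal spread could be computed, assuming centroid (ensemble-mean) forecasts from successive initializations that are all valid at the same time; the function and array layout are illustrative assumptions, not the talk's implementation:

```python
import numpy as np

def temporal_spread(centroid_forecasts):
    """Spread of a temporal (lagged) ensemble: equally weighted standard
    deviation across centroid forecasts from successive initializations,
    all valid at the same time.

    centroid_forecasts: array of shape (n_lags, ...grid/station dims...)
    """
    forecasts = np.asarray(centroid_forecasts)
    return forecasts.std(axis=0, ddof=1)

# e.g. 48-h, 36-h, 24-h, and 12-h centroid forecasts valid at the same hour:
# spread = temporal_spread([f48, f36, f24, f12])
```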
Slide 24: Verification Data: Surface Observations
- Network of surface observations drawn from many different agencies; observations are preferentially located at lower elevations and near urban centers.
- The focus in this study is on 10-m wind direction:
  - more extensive coverage and a greater number of reporting sites than SLP;
  - greatly influenced by regional orography and synoptic-scale changes;
  - MM5's systematic biases in the other near-surface variables can dominate errors originating from the ICs.
- Temperature and wind speed will also be used.
Slide 25: Key Questions
- Is there a significant spread-skill relationship in the MM5 ensemble predictions? Can it be used to form a forecast skill prediction system?
- Is the spread of a temporal ensemble a useful second predictor of forecast skill?
- Is there a significant difference between the expected spread-skill correlations indicated by a simple stochastic model and the observed MM5 ensemble spread-skill correlations?
- Do the MM5 ensemble spread-skill correlations improve after a simple bias correction is applied?
- Are probabilistic error forecasts useful for predicting short-range mesoscale forecast skill?
Slide 26: Preliminary Results
Observation-based verification of 10-m wind direction, evaluated over one cool season (2002-2003).
Slide 27: ACME core Spread-Skill Correlations
Setup: ensemble size = 8 members (AVN, CMC, ETA, GASP, JMA, NOGAPS, TCWB, UKMO); verification period Oct 2002 - Mar 2003 (130 cases); verification strategy: interpolate model to observations; variable: 10-m wind direction.
- The latest spread-skill correlations are lower than in early MM5 ensemble work.
- Observed STD-RMS correlations are higher than STD-AEM correlations.
- ACME core forecast skill predictability is comparable to the expected predictability given a perfect ensemble (with the same spread variability).
- Clear diurnal variation: affected by IC and MM5 biases?
Slide 28: ACME core+ Spread-Skill Correlations
Setup: ensemble size = 8 members (PLUS01-PLUS08); verification period Oct 2002 - Mar 2003 (130 cases); verification strategy: interpolate model to observations; variable: 10-m wind direction.
- Temporal spread variability (β) decreases.
- STD-RMS correlations are higher than, and improve more than, STD-AEM correlations.
- Exceedance of the expected and idealized correlations may be due to the simple model's assumptions or to domain averaging.
- Less diurnal variation, but still present: affected by MM5's unique biases?
Slide 29: Initial Temporal Spread-Skill Correlations
- Lagged CENT-MM5 ensemble spread has a moderate to strong correlation (r = 0.7 / 0.8) with the lagged CENT-MM5 ensemble skill.
- Relatively weak correlation with current ensemble skill.
- Weaker correlation with current mean skill, but still a useful secondary predictor.
Slide 30: Temporal Spread-Skill Correlations
- Very weak correlation with current ensemble skill: the 2002-2003 season gives different results, with much weaker correlations.
- These are preliminary results; the calculations could contain errors.
- Are model improvements a factor? A difference in component members (the added JMA-MM5)? Year-to-year variability?
Slide 31: Summary
- Forecast skill predictability depends largely on the definition of skill itself (user-dependent needs); the spread-skill correlation is sensitive to the spread and skill metrics.
- For 10-m wind direction, ACME core spread (STD) is a good predictor (r = 0.5-0.75) of ensemble forecast skill (RMS); ACME core+ STD is slightly better (r = 0.6-0.8). Larger improvements are expected for T and WSPD.
- It is unclear whether the variance of a temporal ensemble (using lagged-centroid forecasts from ACME) is a useful secondary forecast skill predictor.
Slide 32: Future Work
- Additional cool season of evaluation (2003-2004)
- Grid-based verification *
- Forecast skill predictability with bias-corrected forecasts *
- Other variables (T and WSPD)
- Categorical approach
- Probabilistic forecast skill prediction *
(* = detailed on the following slides)
Slide 33: Verification Data: Mesoscale Gridded Analysis
- If the observation-based and grid-based spread-skill relationships are qualitatively similar, concern about the impact of representativeness error on the results is reduced.
- Use the Rapid Update Cycle 20-km (RUC20) analysis as "gridded truth" for MM5 ensemble verification and calibration; smooth the 12-km MM5 ensemble forecasts to the RUC20 grid.
- The 12-km Eta or 2.5-km ADAS (U. Utah) analyses could be used in the future.
34
Training Period Bias-corrected Forecast Period Training Period Bias-corrected Forecast Period Training Period Bias-corrected Forecast Period N number of forecast cases f i,j,t forecast at location (i, j ) and lead time (t) o i,j verification 1) Calculate bias at every location and lead time using previous forecasts/verifications 2) Post-process current forecast using calculated bias: f i,j,t bias-corrected forecast at location (i, j ) and lead time (t) * November December January February March Simple Bias Correction Overall goal is to correct the majority of the bias in each member forecast, while using shortest possible training period Will be performed separately using both observations and the RUC20 analysis as verifications
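A sketch of the two-step correction using the slide's definitions; the mean (f − o) form of the bias is the natural reading of step 1, and the array layout is an assumption:

```python
import numpy as np

def bias_correct(train_forecasts, train_obs, current_forecast):
    """Simple bias correction per location (i, j) and lead time t.

    train_forecasts:  (N, t, i, j) -- previous N training-period forecasts
    train_obs:        (N, i, j)    -- matching verifications
    current_forecast: (t, i, j)    -- forecast to be post-processed
    """
    # 1) bias_{i,j,t} = (1/N) * sum_n (f^n_{i,j,t} - o^n_{i,j})
    bias = (train_forecasts - train_obs[:, None, :, :]).mean(axis=0)
    # 2) f*_{i,j,t} = f_{i,j,t} - bias_{i,j,t}
    return current_forecast - bias
```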
Slide 35: Probabilistic (2nd-order) Forecast Skill Prediction
- Even for perfect ensemble forecasts, there is scatter in the spread-skill relationship; error is a multi-valued function of spread.
- Additional information about the range of forecast errors associated with each spread value could be passed on to the user.
- In effect, include error bounds with the error bounds: T2m = 3 °C ± (1.5-2.5) °C.
[Figure: scatterplot of AEM versus STD]
Slide 36: Probabilistic (2nd-order) Forecast Skill Prediction
- Ensemble forecast errors (RMS in this case) are divided into categories by spread amount.
- A gamma distribution is fit to the empirical forecast errors in each spread bin to form a probabilistic error forecast.
- The skill of the probabilistic error forecasts is evaluated using a cross-validation approach and the CRPS.
- Forecast skill predictability can be defined as a CRPS skill score: SS = (CRPS_cli − CRPS) / CRPS_cli.
[Figure: scatterplot of RMS error versus STD with spread bins]
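A sketch of the binned gamma fit and a numerical CRPS for the resulting error forecasts; binning by spread quantiles and the integration grid are illustrative choices, not the talk's stated method:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import gamma

def crps_gamma(shape, scale, y, n=2000):
    """CRPS of a gamma error forecast against the realized error y,
    via numerical integration of (F(x) - H(x - y))^2 dx."""
    x_hi = 1.5 * max(y, gamma.ppf(0.999, shape, scale=scale))
    x = np.linspace(0.0, x_hi, n)
    F = gamma.cdf(x, shape, scale=scale)
    H = (x >= y).astype(float)  # Heaviside step at the realized error
    return trapezoid((F - H) ** 2, x)

def fit_error_distributions(spread, errors, n_bins=10):
    """Fit a gamma distribution (loc fixed at 0) to the forecast errors
    falling in each spread-quantile bin."""
    edges = np.quantile(spread, np.linspace(0.0, 1.0, n_bins + 1))
    fits = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        e = errors[(spread >= lo) & (spread <= hi)]
        a, _, b = gamma.fit(e, floc=0.0)
        fits.append((a, b))
    return edges, fits

# Skill score against a climatological error forecast fit to all errors:
#   SS = (CRPS_cli - CRPS) / CRPS_cli
```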
Slide 37: Contributions
- Development of a mesoscale forecast skill prediction system: users of the Northwest MM5 predictions will gain useful information on forecast reliability that they do not have now. This has not been previously accomplished, only suggested.
  - Probabilistic predictions of deterministic forecast errors
  - Probabilistic predictions of average ensemble member errors
  - Incorporation of a simple bias-correction procedure
- Temporal ensemble spread approach with lagged-centroid forecasts.
- Extension of a simple stochastic spread-skill model to include sampling effects and non-traditional measures.
- Idealized verification experiments that may provide useful guidance on how mesoscale forecast verification should be conducted.
Slide 38: "No forecast is complete without a forecast of forecast skill!" (H. Tennekes, 1987)
QUESTIONS?
Slide 39: An Alternative Categorical Approach
- The ensemble mode population is the predictor (Toth et al. 2001): the largest fraction of ensemble members falling into one bin, with bins determined by climatologically equally likely classes.
- Skill is measured by the success rate: a success occurs when the verification falls into the ensemble mode bin.
- Mode population and statistical entropy (ENT) are better predictors of the success rate than STD (Ziehmann 2001).
- The key here is the classification of the forecast and observed data.
[Figure: 500-hPa height, NH extratropics; cf. Toth et al. 2001, Fig. 2]
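A sketch of the two categorical predictors (MOD, ENT) and the success criterion described above, assuming bin edges chosen to be climatologically equally likely:

```python
import numpy as np

def categorical_predictors(members, bin_edges):
    """Mode population (MOD) and statistical entropy (ENT) of an
    ensemble classified into climatologically equally likely bins."""
    counts, _ = np.histogram(members, bins=bin_edges)
    p = counts / counts.sum()
    mod = p.max()                      # largest bin fraction (mode population)
    nz = p[p > 0]
    ent = -(nz * np.log(nz)).sum()     # Shannon entropy of the bin populations
    return mod, ent

def success(members, verification, bin_edges):
    """1 if the verification falls into the ensemble mode bin, else 0."""
    counts, _ = np.histogram(members, bins=bin_edges)
    v_bin = np.digitize(verification, bin_edges) - 1
    return int(counts.argmax() == v_bin)
```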
Slide 40:
Spread metrics: STD = standard deviation; ENT* = statistical entropy; MOD* = mode population.
Error metrics: AEM = absolute error of the ensemble mean; MAE = mean absolute error; IGN* = ignorance.
(* = binned quantity)
Slide 42:
Spread metrics: STD = standard deviation; ENT* = statistical entropy; MOD* = mode population.
Skill metric: Success = 0 / 1.
(* = binned quantity)
Slide 43: Multiple (Combined) Spread-Skill Correlations
- Early results suggested that temporal spread would be a useful secondary predictor; the latest results suggest otherwise.
- The combined correlations are at and below the minimum useful correlation.
Slide 44: Multiple (Combined) Spread-Skill Correlations [figure only]
Slide 45: Simple Stochastic Model with Forecast Bias [figure only]
Slide 46: Spread-Skill Correlations for Temperature [figure only]