
1 A discussion of the use of reforecasts. Tom Hamill and Jeff Whitaker, NOAA/ESRL Physical Sciences Division, NOAA Earth System Research Laboratory. Prepared for the 2006 NCEP Predictability Meeting, 14 Nov 2006.

2 Motivation “Hydrometeorological services in the United States are an Enterprise effort. Therefore, effective incorporation of uncertainty information will require a fundamental and coordinated shift by all sectors of the Enterprise. Furthermore, it will take time and perseverance to successfully make this shift. As the Nation’s public weather service, NWS has the responsibility to take a leading role in the transition to widespread, effective incorporation of uncertainty information into hydrometeorological prediction.” – From finding 1 of NRC report “Completing the Forecast”

3 The problem with using raw ensemble forecasts
Probabilistic forecasts from raw ensembles are not very reliable, due to deficiencies in the forecast model and in the ensemble methods. Users want forecasts that are both "sharp" and "reliable," so statistical adjustment is necessary.

4 References for reforecasting
Hamill, T. M., J. S. Whitaker, and X. Wei, 2003: Ensemble re-forecasting: improving medium-range forecast skill using retrospective forecasts. Mon. Wea. Rev., 132.
Hamill, T. M., J. S. Whitaker, and S. L. Mullen, 2005: Reforecasts, an important dataset for improving weather predictions. Bull. Amer. Meteor. Soc., 87.
Whitaker, J. S., F. Vitart, and X. Wei, 2006: Improving week two forecasts with multi-model re-forecast ensembles. Mon. Wea. Rev., 134.
Hamill, T. M., and J. S. Whitaker, 2006: Probabilistic quantitative precipitation forecasts based on reforecast analogs: theory and application. Mon. Wea. Rev., in press.
Hamill, T. M., and J. Juras, 2006: Measuring forecast skill: is it real skill or is it the varying climatology? Quart. J. Roy. Meteor. Soc., in press.
Wilks, D. S., and T. M. Hamill, 2006: Comparison of ensemble-MOS methods using GFS reforecasts. Mon. Wea. Rev., in press.
Hamill, T. M., and J. S. Whitaker, 2006: White paper, "Producing high-skill probabilistic forecasts using reforecasts: implementing the National Research Council vision."
Hamill, T. M., and J. S. Whitaker, 2006: Ensemble calibration of 500-hPa geopotential height and 850-hPa and 2-meter temperatures using reforecasts. Mon. Wea. Rev., submitted.

5 Outline
- Addressing some of the concerns NCEP has about reforecast calibration efficacy.
- Configuration of real-time reforecasts: the compromises between ideal and practical.
- Moving toward operational implementation:
  - Archiving and accessing reforecasts
  - What can we do to hit the ground running?
  - What can various organizations contribute?
Will emphasize results that I don't think we've already shown many times before.

6 Questions based on differences in NCEP / ESRL calibration results
Is a short time series of the most recent forecasts really best for short-range forecast calibration? Is the bulk of the benefit from correction of model bias, or from other deficiencies in the ensemble system?

7 Concerns/issues: are short training data sets better for short-lead forecasts? From Wilks and Hamill (2006): an intercomparison of logistic regression, nonhomogeneous Gaussian regression, and ordinary least-squares regression for 2-m temperature forecasts. Blue lines show skill with 40-day training.

8 (figure)

9 T2M skill at Day 1 was so small that I repeated the experiment myself…
Procedure:
- Determine the ensemble-mean forecast deviation from climatology and the observed deviation from climatology.
- Regress to predict the observed deviation.
- Form a Gaussian pdf from the regression.
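The three-step procedure above can be sketched as follows. This is a minimal NumPy illustration under assumed array shapes; the function name and arguments are invented for the example and are not the ESRL code.

```python
import numpy as np

def regression_calibration(fcst_mean, obs, fcst_today, clim_fcst, clim_obs):
    """Sketch of the slide's procedure: regress observed deviations from
    climatology on ensemble-mean forecast deviations, then return a
    Gaussian predictive pdf (mean, standard deviation) for today's case.
    fcst_mean, obs: 1-D training arrays of past cases for one site/lead.
    clim_fcst, clim_obs: scalar forecast and observed climatologies."""
    x = fcst_mean - clim_fcst              # forecast deviations from climatology
    y = obs - clim_obs                     # observed deviations from climatology
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    sigma = resid.std(ddof=2)              # residual spread -> Gaussian width
    mu = clim_obs + intercept + slope * (fcst_today - clim_fcst)
    return mu, sigma
```

The returned (mu, sigma) define the calibrated Gaussian pdf; probabilities of exceeding any threshold follow directly from the normal CDF.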

10 Daily Max Temp CRPSS. A consistent one-day improvement in lead time from fully using reforecasts (we didn't trust the Tmax data before 1987).

11 A discrepancy…
Prior NCEP results: better skill for short-lead forecasts using just the most recent forecasts as training data. ESRL results: short training data sets provide far less benefit than long ones at all leads. Hypothesis: the NCEP results may have used a different analysis system for verification than for the reforecasts, with strong biases between them.

12  =.995, difference in climatology, Operational (  T62) - CDAS, January 2004 Very large differences, due to land- surface treatment and terrain differences in models with different resolutions. Both will be effects that will have to be dealt with if CDAS reanalyses are used with newer version of NCEP GFS. This may explain NCEP’s test results showing better performance using short training data set (since your short training data set used same analysis system as operational)

13 How does precipitation forecast skill change with training sample size? Here, a two-step analog approach is used; the colors of the dots indicate which analog-ensemble size provided the largest amount of skill.

14 Does the impact of calibration depend upon the variable? We decided to look systematically at the differences in calibration among Z500, T850, and T2m, all with long training data sets. Was Z500 "easier"? How much of the benefit comes from bias correction vs. full-pdf calibration? Since NCEP did their calculations over the Northern Hemisphere, we repeated the analysis over that area.

15 Calibration of Z500, T850, T2m. The ensemble was perturbed: Z500 by 12.0 m, T850 by 0.6°C, T2m by 1.5°C; solid lines are the rank histograms after bias correction.

16 Another piece of evidence for why T2m is so much harder to calibrate than Z500: the T2m bias is large relative to the intrinsic climatological variability.

17 Calibration techniques
Uncalibrated: pdf from the raw ensemble.
Gross bias correction:
- (1) Calculate the mean bias B = F - O over the training data.
- (2) Corrected ensemble = raw ensemble - B.
Analog method:
- Similar to the method for precipitation, but forecast analogs are found using only the current grid point's data; 50 members.
- Wilks and Hamill (2006, MWR, to appear) found that many other calibration methods (e.g., logistic regression, nonhomogeneous Gaussian regression) performed similarly.
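The gross bias correction, steps (1) and (2), is simple enough to show directly. A sketch assuming matched NumPy arrays of training forecasts and observations; names are illustrative.

```python
import numpy as np

def gross_bias_correction(train_fcst, train_obs, raw_ensemble):
    """Step (1): mean bias B = mean(F - O) over the reforecast training
    sample. Step (2): subtract B from every raw ensemble member."""
    B = np.mean(train_fcst - train_obs)
    return raw_ensemble - B
```

A single scalar B per grid point, variable, and lead time is the whole correction; the ensemble spread is left unchanged.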

18 Verification of Z500, T850, T2m
- Northern Hemisphere (Z500, T850); 00Z North American surface obs with a > 97% complete record (T2m).
- Continuous ranked probability skill score (CRPSS; 0 = no skill, 1 = perfect), calculated with the method of Hamill and Juras (2006, Oct. QJRMS) to avoid overestimating skill when the climatology varies.
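For reference, the CRPS for a single ensemble forecast can be estimated with the standard kernel form. This sketch is only the basic score plus the skill-score ratio; it does not implement the Hamill and Juras (2006) adjustment for varying climatology, and the function names are illustrative.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Continuous ranked probability score for one ensemble forecast,
    kernel form: CRPS = E|X - y| - 0.5 E|X - X'| (lower is better)."""
    members = np.asarray(members, dtype=float)
    term1 = np.mean(np.abs(members - obs))
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term1 - term2

def crpss(crps_forecast, crps_climatology):
    """Skill score relative to a reference (e.g., climatology):
    0 = no skill, 1 = perfect."""
    return 1.0 - crps_forecast / crps_climatology
```

In practice both CRPS values are averaged over many cases before forming the skill score.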

19 Z 500 CRPSS

20 T 850 CRPSS

21 T2m CRPSS. The impact of calibration is largest for surface variables.

22 Configuration for reforecasts: between ideal and practical
Reanalyses: why redoing them is an important part of the reforecast process.
- Forecast skill is improving partly due to better initial conditions; reforecasts will leverage that.
- Different analysis systems have different biases, especially near the surface. Reforecasts initialized from old reanalyses thus start with biased initial conditions relative to real-time forecasts.
- Reforecasts may be a way of highlighting the importance of reanalysis: doing reanalyses improves skill.
Is a large ensemble of reforecasts necessary?

23 Forecasts based on new vs. old analysis systems
- We want homogeneous forecast characteristics: skill the same for 1980s forecasts as for 2006 forecasts.
- Part of the better skill of current forecasts comes from better initial conditions.
- Reanalysis would improve the skill of the old forecasts.
- Reanalyses should use the same or a similar model as used in the reforecasts.

24 Biases in reanalyses
Especially near the surface, reanalyses can have substantial bias, reflecting the bias of the forecast model.
- Analysis systems don't use much near-surface data, so much of the information comes from the first guess.
- Near-surface temperatures reflect the specific choice of boundary-layer and land-surface parameterizations.
- We should expect near-surface temperature forecasts from reanalyses to differ from those from the current analysis, especially for short-lead forecasts, before equilibration.

25  =.995, difference in climatology, Operational (  T62) - CDAS, January 2004 Very large differences, due to land- surface treatment and terrain differences in models with different resolutions. Both will be effects that will have to be dealt with if CDAS reanalyses are used with newer version of NCEP GFS. This may explain NCEP’s test results showing better performance using short training data set (since your short training data set used same analysis system as operational)

26 Would reanalyses be that difficult? NASA is doing MERRA, presumably saving the QC'ed observations, which is a large part of the effort. Can NCEP use MERRA's QC'ed obs for a quick reanalysis?

27 Our strong recommendation
Do reanalyses as a companion to reforecasts right from the start. We cannot assume that a reforecast system without reanalysis will give large calibration benefits.
- We strongly encourage your own testing (current GFS with CDAS reanalyses) before committing to a specific plan.

28 Issue: How many members are needed in reforecast? Ideally, reforecast ensemble is as large as the real-time ensemble. But can we develop calibration techniques that are effective with fewer members?

29 Analog high-resolution precipitation forecast calibration technique (actually run with 10 to 75 analogs)

30 Precipitation forecasts: how many members are needed? The analog reforecast process is repeated, as in the prior cartoon, but now today's control forecast is matched to past control forecasts rather than matching the ensemble-mean pattern. The grey area measures the degradation relative to the baseline using the ensemble mean. There is not much degradation in skill, especially at short leads (and you don't even have to run an ensemble to get a probabilistic forecast).
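The control-forecast analog matching described above can be sketched as follows, assuming gridded NumPy arrays. The RMS-difference metric and the function name are illustrative choices for the example, not necessarily the exact metric used in the study.

```python
import numpy as np

def analog_forecast(today_fcst, reforecasts, analog_obs, n_analogs=50):
    """Pick the n reforecast dates whose forecast pattern is closest
    (in RMS difference) to today's forecast, and return the paired
    observations/analyses as an analog ensemble.
    today_fcst: (npts,); reforecasts: (ndates, npts);
    analog_obs: (ndates, ...) observations matched by date."""
    rmse = np.sqrt(np.mean((reforecasts - today_fcst) ** 2, axis=1))
    best = np.argsort(rmse)[:n_analogs]
    return analog_obs[best]
```

Event probabilities then come straight from the analog ensemble, e.g. np.mean(analog_ens > threshold), which is why even a single control reforecast can yield a probabilistic forecast.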

31 Temperature forecasts: how many members? Statistical correction of the control forecast using the control reforecast is nearly as skillful at short leads, but less skillful at longer leads. (We may not have fully exploited the value of the control reforecast, however.)

32 Moving toward operational implementation at NCEP: issues
- Is a companion reanalysis possible? (We'll keep harping on this…)
- Is NCDC ready for reforecast storage?
- Does NOMADS have the bandwidth to receive/transmit reforecasts?
- Will reforecast access from ESRL and MDL be simple and convenient?
- Are all reforecasts at all levels guaranteed to be saved, or just selected subsets? Which subsets?
- Would NCEP like any ESRL library routines, e.g., for analog calibration?

33 The bigger picture
NOAA needs a coordinated probabilistic forecast program. What is the role of reforecast-based ensemble products?
- Bias correction of NAEFS ensemble products? [simple; NCEP can handle]
- A suite of multiple, statistically adjusted probabilistic products pumped out to WFOs via NDGD? [complex, cross-NOAA; needs a coordinated plan involving NWS/OST, NCEP, MDL, ESRL, etc.] Canada?

34 Conclusions
- Reforecasts will dramatically help the NWS meet NRC guidelines for calibrated, skillful probabilistic forecast products.
- Reanalyses are a necessary part of the reforecast procedure.
- An end-to-end procedure for widely disseminating probabilistic products is a big endeavor, and all parts of NOAA should participate in a coordinated plan.

35 Bias correction using forecast and observed CDFs?
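One common way to do the CDF-based bias correction this slide asks about is empirical quantile mapping: map today's forecast value to its percentile in the training-forecast CDF, then read off the observed value at that same percentile. A minimal sketch, assuming 1-D training samples; the function name is illustrative.

```python
import numpy as np

def quantile_map(value, train_fcst, train_obs):
    """Empirical quantile mapping for one forecast value:
    forecast value -> percentile in the forecast CDF
                   -> observed value at the same percentile."""
    f_sorted = np.sort(np.asarray(train_fcst, dtype=float))
    o_sorted = np.sort(np.asarray(train_obs, dtype=float))
    pf = np.linspace(0.0, 1.0, f_sorted.size)
    po = np.linspace(0.0, 1.0, o_sorted.size)
    p = np.interp(value, f_sorted, pf)   # percentile in the forecast CDF
    return np.interp(p, po, o_sorted)    # observed value at that percentile
```

Unlike the gross (mean) bias correction, this corrects the whole distribution, so it can remove biases that differ between, say, warm and cold extremes.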

36 T 2m CRPSS, low and high climatological spread

37 850-hPa temperature bias for a grid point in the central U.S. Spread of yearly bias estimates from a 31-day running-mean F - O. Note the spread is often larger than the bias, especially for long leads.
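A 31-day running-mean F - O bias estimate of the kind plotted here can be sketched as follows, assuming daily forecast and observation time series at one grid point and lead time; names are illustrative.

```python
import numpy as np

def running_mean_bias(fcst, obs, window=31):
    """Centered running mean of daily forecast-minus-observation errors.
    Returns one bias estimate per day for which a full window of daily
    errors is available (mode='valid'), avoiding edge artifacts."""
    err = np.asarray(fcst, dtype=float) - np.asarray(obs, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(err, kernel, mode="valid")
```

Computing this separately for each year gives the spread of yearly bias estimates shown on the slide.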

38 Effects of short / long training, T850. This was a quick-and-dirty study using T850 data over CONUS, with the NCEP-NCAR reanalysis for verification. There is not as much benefit from many years of data as was seen with Tsfc.