DET: Testing and Evaluation Plan
Barbara Brown 1, Ed Tollerud 2, Tara Jensen 1, and Wally Clark
1 NCAR/RAL, Boulder, CO and DTC
2 NOAA/GSD, Boulder, CO and DTC

DTC and DET Testing and Evaluation
- T&E is one of the most important activities undertaken by the DTC.
- DTC testing has involved WRF core comparisons, boundary layer schemes, and other aspects of NWP.
- The DTC has created "Reference Configurations" (RCs) that are to be re-tested in conjunction with model changes.
- DET infrastructure is being developed to allow testing, evaluation, and intercomparison of ensemble systems and system components.

Major categories of testing
- Forecasting system comparisons: compare forecasts based on one model configuration with forecasts based on a different configuration. Examples: two types of model initialization; two or more methods of statistical post-processing. (A minimal comparison sketch follows this slide.)
- Individual reference configuration: the model "setup" is evaluated, and the setup is re-evaluated when model changes are implemented. Reference configurations may be defined by operational centers or by users; RCs may also be community-contributed.
- Forecasts contributed by a modeling group. Example: forecasts evaluated in HWT and HMT projects.
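As a hedged illustration of how such a head-to-head comparison can be scored, the sketch below applies a paired bootstrap to per-case RMSE differences between two configurations. The scores and sample size are invented, and this is a minimal sketch, not a prescribed DTC or MET procedure.

    """Minimal sketch: paired bootstrap on per-case RMSE differences
    between two model configurations. All values are invented."""
    import numpy as np

    rng = np.random.default_rng(42)

    # Hypothetical per-case RMSE for the same verification cases under
    # two configurations (e.g., two initialization methods).
    rmse_a = np.array([1.8, 2.1, 1.6, 2.4, 1.9, 2.2, 1.7, 2.0])
    rmse_b = np.array([1.6, 2.0, 1.7, 2.1, 1.8, 2.0, 1.5, 1.9])

    diffs = rmse_a - rmse_b  # pairing by case controls for case difficulty

    # Bootstrap the mean difference to get a 95% confidence interval.
    boot = [rng.choice(diffs, size=diffs.size, replace=True).mean()
            for _ in range(10000)]
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"mean RMSE diff (A - B): {diffs.mean():.3f}, "
          f"95% CI [{lo:.3f}, {hi:.3f}]")
    # If the CI excludes zero, the configurations differ significantly.

Pairing by case removes much of the case-to-case variability, which is why paired designs can detect configuration differences with fewer cases than unpaired ones.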

DTC Testing and Evaluation Principles
- A formal test plan is developed, defining all of the important aspects of the testing and evaluation. The developer may have a role in helping to create the test plan, but execution of the test is independent of the developer.
- The focus of the test depends on the questions of interest: the module being used and the variables of interest.
- Many cases are evaluated for statistical significance: not just a few case studies, but multiple seasons, times of day, etc.
- Meaningful stratifications: location/region, season, and other user-based criteria (see the stratification sketch after this slide).
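As a small illustration of meaningful stratification (assuming Python with pandas; the scores are wholly invented), the sketch below summarizes a verification measure per region/season cell. The per-cell case counts flag strata too small for significance testing.

    """Sketch of stratified aggregation of a verification score.
    Data are invented; region/season labels are illustrative."""
    import pandas as pd

    scores = pd.DataFrame({
        "region": ["west", "west", "east", "east", "west", "east"],
        "season": ["DJF", "JJA", "DJF", "JJA", "DJF", "JJA"],
        "rmse":   [2.1,    1.4,   1.8,   1.2,   2.3,   1.1],
    })

    # Stratified summary: mean score and case count per (region, season) cell.
    summary = scores.groupby(["region", "season"])["rmse"].agg(["mean", "count"])
    print(summary)
    # Cells with small counts mark strata where significance cannot be assessed.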

Components of a test plan (example)
- Goals
- Experiment design
- Codes: specification of the codes that will be run as part of the test
- Model output: what kinds of output will be produced? Forecast periods
- Post-processing
- Verification: statistical methods and measures
- Graphics generation and display
- Data archival and dissemination of results
- Computer resources
- Deliverables
Example from the QNSE evaluation (surface T and wind); a hypothetical machine-readable skeleton follows this slide.
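One way to keep these components explicit is to record the plan as a structured object. The skeleton below is hypothetical: the field names mirror the list above, but the values are illustrative and do not reflect an actual DTC or MET schema.

    """Hypothetical test plan skeleton mirroring the components above.
    Field names and values are illustrative, not a DTC/MET schema."""
    test_plan = {
        "goals": "Compare surface T and wind skill for the QNSE scheme",
        "experiment_design": {"baseline": "RC", "experimental": "RC + QNSE"},
        "codes": {"model": "WRF", "verification": "MET"},
        "model_output": {"variables": ["T2", "U10", "V10"],
                         "forecast_periods_h": [12, 24, 36, 48]},
        "post_processing": ["post-processor TBD"],
        "verification": {"measures": ["bias", "rmse"], "ci": "bootstrap"},
        "graphics": "time series and box plots per region",
        "archival": "statistics database plus web display",
        "computer_resources": "batch allocation TBD",
        "deliverables": ["final report", "statistics files"],
    }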

Questions to address when developing a test plan
- Which aspect(s) (or modules) of the ensemble system will be evaluated?
- What performance aspects are we trying to compare or evaluate?
- Who are the "users"?
- What are the variables of interest?
Answers to these questions will lead to determination of the other aspects of the plan.

Considerations for ensemble T&E
- The number of cases will likely need to be increased (over non-ensemble evaluations): many probabilistic and ensemble verification scores (e.g., reliability) require relatively large subsamples, and subsamples must be large enough to assess statistical significance. But sampling must also be focused enough for representativeness (see the binning sketch after this slide).
- Verification approaches and metrics are somewhat unique.
- Computer resources may be a limitation.
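The sketch below uses synthetic, perfectly calibrated probabilities (not output from any DET experiment) to show why reliability statistics need large samples: forecast probabilities are binned, per-bin observed frequencies are compared with the forecast probabilities, and the per-bin counts control how noisy those frequencies are.

    """Sketch of Brier score and reliability-style binning on
    synthetic data; shrink n to see the per-bin noise grow."""
    import numpy as np

    rng = np.random.default_rng(0)
    n = 5000                                # try n = 200 for noisy bins
    p_fcst = rng.uniform(0, 1, n)           # ensemble-derived probabilities
    obs = rng.uniform(0, 1, n) < p_fcst     # synthetic, calibrated outcomes

    brier = np.mean((p_fcst - obs) ** 2)
    print(f"Brier score: {brier:.3f}")

    bins = np.linspace(0, 1, 11)
    idx = np.digitize(p_fcst, bins) - 1
    for k in range(10):
        sel = idx == k
        if sel.sum() == 0:
            continue
        print(f"bin {bins[k]:.1f}-{bins[k+1]:.1f}: n={sel.sum():4d}, "
              f"fcst={p_fcst[sel].mean():.2f}, obs={obs[sel].mean():.2f}")
    # With small n, per-bin counts shrink and observed frequencies get
    # noisy, which is why ensemble T&E typically needs more cases.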

Other considerations
- Real-time vs. post-analysis: DTC intensive tests are generally done in post-analysis, but real-time demonstrations also have many benefits (e.g., HMT, HWT).
- Subjective evaluations: should these be considered for DET T&E?
- How much rigorous end-to-end testing is required vs. evaluation of individual components?
Example from the HMT evaluation, winter 2010.