1 Malaquias Peña and Huug van den Dool Consolidation methods for SST monthly forecasts for MME Acknowledgments: Suru Saha retrieved and organized the data,

Slides:



Advertisements
Similar presentations
Multiple Linear Regression uses 2 or more predictors General form: Let us take simplest multiple regression case--two predictors: Here, the b’s are not.
Advertisements

Ridge Regression Population Characteristics and Carbon Emissions in China ( ) Q. Zhu and X. Peng (2012). “The Impacts of Population Change on Carbon.
Model Assessment, Selection and Averaging
The adjustment of the observations
Model assessment and cross-validation - overview
Details for Today: DATE:3 rd February 2005 BY:Mark Cresswell FOLLOWED BY:Assignment 2 briefing Evaluation of Model Performance 69EG3137 – Impacts & Models.
Yard. Doç. Dr. Tarkan Erdik Regression analysis - Week 12 1.
Linear Methods for Regression Dept. Computer Science & Engineering, Shanghai Jiao Tong University.
Statistics for Managers Using Microsoft® Excel 5th Edition
Lecture Notes for CMPUT 466/551 Nilanjan Ray
Simple Linear Regression
Statistics for Managers Using Microsoft® Excel 5th Edition
Data mining and statistical learning, lecture 5 Outline  Summary of regressions on correlated inputs  Ridge regression  PCR (principal components regression)
Statistical Treatment of Data Significant Figures : number of digits know with certainty + the first in doubt. Rounding off: use the same number of significant.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 15-1 Chapter 15 Multiple Regression Model Building Basic Business Statistics 11 th Edition.
A Regression Model for Ensemble Forecasts David Unger Climate Prediction Center.
Lecture II-2: Probability Review
Copyright ©2011 Pearson Education 15-1 Chapter 15 Multiple Regression Model Building Statistics for Managers using Microsoft Excel 6 th Global Edition.
Principles of the Global Positioning System Lecture 11 Prof. Thomas Herring Room A;
Objectives of Multiple Regression
Multi-Model Ensembling for Seasonal-to-Interannual Prediction: From Simple to Complex Lisa Goddard and Simon Mason International Research Institute for.
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 15-1 Chapter 15 Multiple Regression Model Building Statistics for Managers using Microsoft.
沈致远. Test error(generalization error): the expected prediction error over an independent test sample Training error: the average loss over the training.
Oceanography 569 Oceanographic Data Analysis Laboratory Kathie Kelly Applied Physics Laboratory 515 Ben Hall IR Bldg class web site: faculty.washington.edu/kellyapl/classes/ocean569_.
Portfolio Management-Learning Objective
A Neural Network MonteCarlo approach to nucleon Form Factors parametrization Paris, ° CLAS12 Europen Workshop In collaboration with: A. Bacchetta.
Intraseasonal TC prediction in the southern hemisphere Matthew Wheeler and John McBride Centre for Australia Weather and Climate Research A partnership.
Educational Research: Competencies for Analysis and Application, 9 th edition. Gay, Mills, & Airasian © 2009 Pearson Education, Inc. All rights reserved.
EUROBRISA Workshop – Beyond seasonal forecastingBarcelona, 14 December 2010 INSTITUT CATALÀ DE CIÈNCIES DEL CLIMA Beyond seasonal forecasting F. J. Doblas-Reyes,
Heidke Skill Score (for deterministic categorical forecasts) Heidke score = Example: Suppose for OND 1997, rainfall forecasts are made for 15 stations.
Statistical analysis Outline that error bars are a graphical representation of the variability of data. The knowledge that any individual measurement.
Model validation Simon Mason Seasonal Forecasting Using the Climate Predictability Tool Bangkok, Thailand, 12 – 16 January 2015.
ECE 8443 – Pattern Recognition ECE 8423 – Adaptive Signal Processing Objectives: Definitions Random Signal Analysis (Review) Discrete Random Signals Random.
Chapter 4 Linear Regression 1. Introduction Managerial decisions are often based on the relationship between two or more variables. For example, after.
Verification of IRI Forecasts Tony Barnston and Shuhua Li.
1 Climate Test Bed Seminar Series 24 June 2009 Bias Correction & Forecast Skill of NCEP GFS Ensemble Week 1 & Week 2 Precipitation & Soil Moisture Forecasts.
ELEC 303 – Random Signals Lecture 18 – Classical Statistical Inference, Dr. Farinaz Koushanfar ECE Dept., Rice University Nov 4, 2010.
Theoretical Review of Seasonal Predictability In-Sik Kang Theoretical Review of Seasonal Predictability In-Sik Kang.
CTB Science Plan For Multi Model Ensembles (MME) Suru Saha Environmental Modeling Centre NCEP/NWS/NOAA.
Development of Precipitation Outlooks for the Global Tropics Keyed to the MJO Cycle Jon Gottschalck 1, Qin Zhang 1, Michelle L’Heureux 1, Peitao Peng 1,
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Multiple Regression Model Building Statistics for Managers.
“Comparison of model data based ENSO composites and the actual prediction by these models for winter 2015/16.” Model composites (method etc) 6 slides Comparison.
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc. Chap 15-1 Chapter 15 Multiple Regression Model Building Basic Business Statistics 10 th Edition.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 14-1 Chapter 14 Multiple Regression Model Building Statistics for Managers.
1 Malaquias Peña and Huug van den Dool Consolidation of Multi Method Forecasts Application to monthly predictions of Pacific SST NCEP Climate Meeting,
Module 4 Forecasting Multiple Variables from their own Histories EC 827.
Huug van den Dool / Dave Unger Consolidation of Multi-Method Seasonal Forecasts at CPC. Part I.
Educational Research: Data analysis and interpretation – 1 Descriptive statistics EDU 8603 Educational Research Richard M. Jacobs, OSA, Ph.D.
1 How Does NCEP/CPC Make Operational Monthly and Seasonal Forecasts? Huug van den Dool (CPC) ESSIC, February, 23, 2011.
Two Consolidation Projects: Towards an International MME: CFS+EUROSIP(UKMO,ECMWF,METF) 11 slides Towards a National MME: CFS and GFDL 18 slides.
Multi Model Ensembles CTB Transition Project Team Report Suranjana Saha, EMC (chair) Huug van den Dool, CPC Arun Kumar, CPC February 2007.
Huug van den Dool and Suranjana Saha Prediction Skill and Predictability in CFS.
Huug van den Dool and Steve Lord International Multi Model Ensemble.
1 Summary of CFS ENSO Forecast September 2010 update Mingyue Chen, Wanqiu Wang and Arun Kumar Climate Prediction Center 1.Latest forecast of Nino3.4 index.
Demand Management and Forecasting Chapter 11 Portions Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin.
1 Summary of CFS ENSO Forecast August 2010 update Mingyue Chen, Wanqiu Wang and Arun Kumar Climate Prediction Center 1.Latest forecast of Nino3.4 index.
1/39 Seasonal Prediction of Asian Monsoon: Predictability Issues and Limitations Arun Kumar Climate Prediction Center
Chapter 11 – With Woodruff Modications Demand Management and Forecasting Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.
Ch 1. Introduction Pattern Recognition and Machine Learning, C. M. Bishop, Updated by J.-H. Eom (2 nd round revision) Summarized by K.-I.
Chapter 15 Multiple Regression Model Building
Statistical analysis.
Statistical analysis.
CSE 4705 Artificial Intelligence
Alan Qi Thomas P. Minka Rosalind W. Picard Zoubin Ghahramani
With special thanks to Prof. V. Moron (U
Predictability of Indian monsoon rainfall variability
Fig. 1 The temporal correlation coefficient (TCC) skill for one-month lead DJF prediction of 2m air temperature obtained from 13 coupled models and.
GloSea4: the Met Office Seasonal Forecasting System
Seasonal Forecasting Using the Climate Predictability Tool
Forecast system development activities
Presentation transcript:

1 Malaquias Peña and Huug van den Dool Consolidation methods for SST monthly forecasts for MME Acknowledgments: Suru Saha retrieved and organized the data, Dave Unger and Peitao Peng provided discussion to the subject 3 rd CTB Science Advisory Board Meeting, August 28-29, 2007

2 Making the best single forecast out of a number of forecast inputs Necessary as large supply of forecasts available Expressed as a linear combination of participant models: Consolidation Data: Nine ensemble prediction systems (DEMETER+CFS+CA)  At least 9 ensemble members per model  Hindcast length: Twenty-one years ( )  Monthly mean forecasts; Leads 0 to 5 months  Four initial month: Feb, May, Aug, Nov K number of participating models ζ input forecast at a particular initial month and lead time Task: Finding K optimal weights, α k, corresponding to each input model

Tropical Pacific SST Region of appreciable skill Multi-Model Ensemble Average (MMA) more skillful than any single ensemble model average Can sophisticated consolidation methods be better? Model Ensemble average MMA Pattern Anomaly Correlation. Average over all leads and months. Full data Anomaly Correlation gridpointwise of MMA 12.5ºN 12.5ºS 82.5ºW140ºE

4 Issues HINDCAST should be sufficiently long to –Remove Systematic Errors –Carry out weight optimization procedures (consolidation) –Allow cross-validation assessment A CV-1 produce degeneracy; a CV-3yrs out will leave only 18 data points. OVERFITTING: Due to short training dataset compared to the number of input forecast models. –Rigorous cross-validation procedure to measure realistic skill COLLINEARITY: A large number of participating models may lead to problems when at least one of the models is not independent from the rest. –Even with plentiful data –Covariance matrix ill-conditioned: Regression coefficients not accurately computed –Regularization methods required

5 Ridging Methods RIDGING RIDGING w/ MMA constraint RIDGING w/ weighted mean constraint Climatology 1/K Optimized weights for the 9 model ensemble averages as a function of the ridging amount: λ Weights approach to skill (weighted) of corresponding model

6 Consolidation Methods Assessed AcronymMethodCharacteristics MMA Multi-method ensemble average Simple average of all the input forecasts UR Unconstrained regression Typical multiple linear regression. Assumes no collinearity among models. Weights sometimes negative and too large COR Correlation coefficient Skill weighted method. Collinearity among models is not considered FRE Frequency of the best Gives weights depending on how many times the model has been the best in the training period. RID Ridging Considers both skill and collinearity among models RI2 Double pass Ridging First pass to identify unskillful and/or redundant models. Then, set their corresponding weights to zero and perform a second pass with the reduced set of models. RIM Ridging with MMA penalty Ridging with a penalty term for weights departing from MMA RIW Ridging with COR penalty Ridging with a penalty term for weights departing from COR

7 Pattern Anomaly Correlation Average over all leads and initial months Consolidation methods

8 INCREASING EFFECTIVE SAMPLE SIZE 1. Selection-combination (double pass strategy) Objective procedure to remove or set to zero weights of bad or redundant models First pass: Ridging identifies negative weights. Set these to zero Second pass: Ridging is carried out on the models with positive weight 2. Mixing data from neighboring gridpoints Previous studies show a gain in weight stability Reduces flexibility in cases where the raking of the model skill changes from region to region If hindcasts allowed, mixing neighboring lags 3. Mixing information from individual ensemble members Most studies have used the ensemble average for each model Each member is a unique realization of model Constrain weights of ensemble members to be same within each model APPROACHES

9 INCREASING EFFECTIVE SAMPLE SIZE 1.Gridpoint by gridpoint 2.3x3 box that includes the point of analysis plus the 8 closest gridpoints 3.9x9 box with the 80 closest neighboring gridpoints 4.All the grid points in the domain Approaches USING ENSEMBLE AVERAGES ONLY USING ALL ENSEMBLE MEMBERS MMA

10 INCREASING EFFECTIVE SAMPLE Consistency: Percentage number of cases that a sophisticated consolidation method outperforms MM Consolidation method Percentage Only ensemble means and gridpoint by gridpoint All ensemble members and closest neighboring gridpoint All ensemble members and all gridpoints in the domain Sophisticated consolidation methods more frequently outperform MMA for larger effective sample

Performance Ridging AC improves considerably over western Pacific AC for MMA. All initial months includedAC: RID minus MMA (%). All initial months included

12 Probabilistic assessment PDF formation from optimized weights Method1: Weights used as factors that multiply corresponding forecast values. Width of PDF becomes too narrow when weights are small. Optimization to inflate PDF required Method2: Weights used to determine stacks given to corresponding model without changing the value of the forecast Both methods produce similar results with the latter more strightforward Relative Operating Characteristic Curve: To measure the ability of method to predict occurrence or non- occurrence that observation will fall in the “upper”, “middle” or “lower” tercile. Class limits defined by the observed SST during the training period

13 PDF formation observation 3/8111/81 67/ /81 59/ Forecasts Illustration for a particular gridpoint, lead and initial month p = fraction of ensemble members Observed climate Bars’ height depends on the weights Method 1: weights are multiplicative corrections to forecasts Method 2: weights determine stacks Equal Weights Method Weights are:

14 Area below ROC curve Regions where RID outperforms MMA are shaded Lower tercile Middle tercile Upper tercile MMARID minus MMA

15 Summary of results Eight consolidation methods assessed under rigorous cross- validation procedure, CV-3RE, to combine up to 81 monthly forecasts of tropical Pacific SST. The simple multi-methods ensemble average (MMA) shows large and consistent skill improvement over individual participant models as measured by AC. When the sampling size is small sophisticated consolidation methods are as skillful as (or, in the case of UR, worse than) MMA Increasing the effective sampling size produces more stable weights and affects positively the skill of sophisticated consolidation In the western tropical Pacific, sophisticated consolidation methods improve significantly over MMA. This is true both using a deterministic (AC) and Probabilistic (ROC) assessment.

16 Remarks Hindcast MMA requires long hindcast to remove SE Sophisticated consolidation methods need sufficiently long hindcast not only to remove SE but to optimize weights Cross-validation Rigorous CV-3RE shows artificial skill prevalent in most of the consolidation methods Probability Density Function Efficient treatment is needed to adjust posterior PDF in ridge regression methods. Gaussian Kernel, Bayesian Methods SE of the standard deviation and other higher moments of the PDF: hard to obtain given shortness of the hindcasts Application dependent Results for SST monthly predictions (high skill and large collinearity) may not apply to consolidation of other variables