Huug van den Dool and Steve Lord International Multi Model Ensemble
Two Consolidation Projects:
- Towards an International MME: CFS + EUROSIP (UKMO, ECMWF, METFR), 11 slides
- Towards a National MME: CFS, GFDL, NCAR/CCM3.0/3.5 and NASA/GSFC, 18 slides
Does the NCEP CFS add to the skill of the European DEMETER-3 to produce a viable International Multi Model Ensemble (IMME)?
Huug van den Dool, Climate Prediction Center, NCEP/NWS/NOAA
Suranjana Saha and Åke Johansson, Environmental Modeling Center, NCEP/NWS/NOAA
August 2007
DATA and DEFINITIONS USED
- DEMETER-3 (DEM3) = ECMWF + METFR + UKMO
- CFS
- IMME = DEM3 + CFS
- 1981 –
- Initial condition months: Feb, May, Aug and Nov
- Leads 1-5
- Monthly means
DATA/Definitions USED (cont)
- Deterministic: Anomaly Correlation
- Probabilistic: Brier Score (BS) and Rank Probability Score (RPS)
- Ensemble Mean and PDF
- T2m and Prate
- Europe and United States
“NO (fancy) consolidation, equal weights, NO cross-validation”
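The three scores named above can be sketched in a few lines. This is a minimal illustration, not the verification code used for these slides: the function names are hypothetical, the anomaly correlation is taken as the uncentered form, and the RPS is left unnormalized (some definitions divide by the number of categories minus one).

```python
import numpy as np

def anomaly_correlation(forecast_anom, observed_anom):
    """Uncentered correlation between forecast and observed anomalies."""
    f = np.asarray(forecast_anom, dtype=float)
    o = np.asarray(observed_anom, dtype=float)
    return float(np.sum(f * o) / np.sqrt(np.sum(f * f) * np.sum(o * o)))

def brier_score(prob, occurred):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    p = np.asarray(prob, dtype=float)
    y = np.asarray(occurred, dtype=float)
    return float(np.mean((p - y) ** 2))

def ranked_probability_score(prob_by_category, observed_category):
    """Squared distance between cumulative forecast and observed distributions.

    prob_by_category: forecast probabilities over ordered categories
    observed_category: index of the category that verified
    """
    p = np.asarray(prob_by_category, dtype=float)
    obs = np.zeros_like(p)
    obs[observed_category] = 1.0
    return float(np.sum((np.cumsum(p) - np.cumsum(obs)) ** 2))
```

For the Brier Score and RPS lower is better, while for the anomaly correlation higher is better, so "improves upon" has opposite signs depending on the score.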
Number of times IMME improves upon DEM-3, out of 20 cases (4 ICs x 5 leads). “The bottom line”
Regions: EUROPE, USA
Variables (per region): T2m, Prate
Anomaly Correlation 914
Brier Score
RPS
Frequency of being the best model in 20 cases, in terms of Anomaly Correlation of the Ensemble Mean. “Another bottom line”

              CFS  ECMWF  METFR  UKMO
T2m USA         4      5      5     6
T2m EUROPE      3      5      6     5
Prate USA       7      3      3     6
Prate EUROPE   11      0      0     5
Frequency of being the best model in 20 cases, in terms of Ranked Probability Score (RPS) of the PDF. “Another bottom line”

              CFS  ECMWF  METFR  UKMO
T2m USA         9      4      1     6
T2m EUROPE      9      3      4     3
Prate USA      19      0      0     1
Prate EUROPE   18      0      0     1
CONCLUSIONS Overall, NCEP CFS contributes to the skill of IMME (relative to DEM3) for equal weights. This is especially so in terms of the probabilistic Brier Score and for Precipitation
CONCLUSIONS (Cont)
In comparison to ECMWF, METFR and UKMO, the CFS as an individual model does:
- well in deterministic scoring (AC) for Prate
- very well in probability scoring (BS) for Prate and T2m
over both USA and EUROPEAN domains
International Multi-Model Ensemble (IMME) Status
S. Lord, S. Saha, H. van den Dool
Status
- Goal: produce operational ensemble products from CFS and EUROSIP seasonal climate products
- EUROSIP
  - ECMWF
  - Met Office
  - Meteo France
- Proposal will be submitted to EUROSIP Council
  - Covers licensing and product distribution; commercial interest and revenue sharing (none for US)
  - Consistent with EUROSIP general provisions
- Formal Memorandum of Understanding has been drafted
  - Covers IMME products
- Decision expected by end of calendar 2008
Status (2)
Some tenets of a potential agreement
- NCEP and E-partners will coordinate distribution of IMME products to their users on a regular monthly schedule
- Product delivery will not compromise any organization’s operational delivery schedules and commitments
- NCEP wishes to join the EUROSIP Steering Group as associate partner (non-voting member) and asks to participate in future meetings
- Associated research program possible for product improvement
We do work on methods!
- M. Peña Mendez and H. van den Dool, 2008: Consolidation of Multi-Method Forecasts at CPC. Accepted, JCLIM 2008.
- Unger, D., H. van den Dool, E. O’Lenic and D. Collins, 2008: Ensemble Regression. Accepted, MWR.
- Wanqiu Wang: Pdf mapping methods
- Apply to soil moisture analyses
Huug van den Dool, Yun Fan and Malaquias Pena The Multi-Model Ensemble Approach for Soil Moisture Analyses in the Absence of Verification Data.
Suppose we want to do MME with EIGHT MODELS
1: R1
2: R2
3: NA(RR)
4: ERA40
5: LB Climate Divisions
6: LB Global 0.5 degree
7: Noah retroactive
8: VIC retroactive
Common Period
Monthly mean total column soil moisture data on a 0.5 by 0.5 grid over the US.
We know how to take the mean, but how about a weighted mean?
Upfront we forgive models for:
- Error in the mean (most models much too dry in Illinois)
- Wildly different standard deviations
CON is applied to standardized anomalies
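Standardizing removes both forgiven errors before any weighting is done. The sketch below assumes a hypothetical array layout (time along rows, one column per model, a single grid point) and standardizes over the whole record; standardizing separately for each calendar month would be a natural refinement that is not shown here.

```python
import numpy as np

def standardized_anomalies(w):
    """Remove each model's own mean and divide by its own standard deviation.

    w: array of shape (n_times, n_models) with soil moisture at one grid point.
    """
    w = np.asarray(w, dtype=float)
    mean = w.mean(axis=0)   # per-model climatological mean
    std = w.std(axis=0)     # per-model standard deviation
    return (w - mean) / std

# Two made-up models with very different means and amplitudes
raw = np.array([[100.0, 10.0],
                [200.0, 12.0],
                [300.0, 14.0]])
z = standardized_anomalies(raw)
```

In this toy case the two columns of `z` come out identical, since the models differ only in mean and amplitude.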
What is a consolidation (CON)?
$$\mathrm{CON} = \sum_{k=1}^{K} \alpha_k w_k \qquad (1)$$
i.e. a weighted mean over K model estimates of standardized soil moisture anomalies. One finds the K alphas, the weights, typically by minimizing the distance between CON and observed w for a number of cases.
If we had observations for soil moisture we would first do a Classic or Unconstrained Regression (UR). The general problem of consolidation consists of finding a vector of weights, α, that minimizes the Sum of Square Errors, SSE, given by the following expression:
$$\mathrm{SSE} = (W\alpha - o)^T (W\alpha - o) \qquad (2)$$
Setting the derivative of SSE with respect to α to zero leads to
$$W^T W \alpha = W^T o$$
So the weights are formally given by
$$\alpha = A^{-1} b \qquad (3)$$
where $A = W^T W$ is the covariance matrix, $b = W^T o$ and the superscript $-1$ denotes the inverse operation. Equation (3) is the solution for the ordinary (Unconstrained) linear Regression (UR).
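As an illustration of Eq. (3), the weights can be obtained by solving the normal equations directly. This is a sketch on synthetic data; the function name, array shapes and the noise-free "observations" are assumptions made only for the example.

```python
import numpy as np

def unconstrained_weights(W, o):
    """Solve the normal equations A alpha = b with A = W^T W and b = W^T o.

    W: (n_cases, n_models) standardized model anomalies
    o: (n_cases,) verifying observations
    """
    W = np.asarray(W, dtype=float)
    o = np.asarray(o, dtype=float)
    A = W.T @ W
    b = W.T @ o
    # Solving the linear system avoids forming the inverse of A explicitly
    return np.linalg.solve(A, b)

# Synthetic check: observations built from known weights, without noise
rng = np.random.default_rng(0)
W = rng.standard_normal((200, 3))
true_alpha = np.array([0.5, 0.3, 0.2])
o = W @ true_alpha
alpha = unconstrained_weights(W, o)
```

With noise-free data the known weights are recovered. When the models are strongly co-linear, A becomes nearly singular and the UR weights can become large and unstable, which is the motivation for ridging.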
Essentially, ridging is a multiple linear regression with an additional penalty term to constrain the size of the squared weights in the minimization of SSE (2):
$$J = (W\alpha - o)^T (W\alpha - o) + \lambda\, \alpha^T \alpha \qquad (4)$$
Minimization of J leads to
$$\alpha = (A + \lambda I)^{-1} b \qquad (5)$$
where $I$ is the identity matrix and $\lambda$, the regularization (or ridging) parameter, indicates the relative weight of the penalty term. Similarities between the ridging and Bayesian approaches for determining the weights have been discussed by Hsiang (1976) and DelSole (2007). In the Bayesian view, (5) represents the posterior mean of α, based on a normal a priori parameter distribution with mean zero and variance matrix $(\sigma^2/\lambda) I$, where $\sigma^2 I$ is the variance matrix of the regression residual, assumed to be normal with mean zero.
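The effect of the penalty term in Eq. (5) can be seen on a small synthetic example with three nearly co-linear "models". This is only a sketch: the function name, the data and the value of λ are arbitrary choices for illustration and are not the settings used in the slides.

```python
import numpy as np

def ridge_weights(W, o, lam):
    """Ridge solution alpha = (A + lam * I)^-1 b, with A = W^T W and b = W^T o."""
    W = np.asarray(W, dtype=float)
    o = np.asarray(o, dtype=float)
    A = W.T @ W
    b = W.T @ o
    return np.linalg.solve(A + lam * np.eye(A.shape[0]), b)

# Three models that share one signal and differ only by small noise
rng = np.random.default_rng(1)
signal = rng.standard_normal(300)
W = np.column_stack([signal + 0.1 * rng.standard_normal(300) for _ in range(3)])
o = signal + 0.5 * rng.standard_normal(300)

alpha_ur = ridge_weights(W, o, lam=0.0)      # lam = 0 reduces to Eq. (3)
alpha_ridge = ridge_weights(W, o, lam=50.0)  # penalized weights
```

Setting `lam=0.0` reproduces the unconstrained solution, and any positive `lam` shrinks the norm of the weight vector, which is what stabilizes the solution when models are co-linear.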
Dilemma Outside Illinois we don’t have (sufficient) soil moisture observations to consider CON methods. (Equal weight is always possible of course).
Line of Attack
In the absence of soil moisture data we could use co-located Temperature data (two months later) to do a CON (at least in the ‘warm’ half of the year). This CON serves CPC’s application. In a sense we weigh models by their ability to predict co-located future temperature (April thru September only).
As an aside: We know (and hope) that soil moisture also helps in non-co-located T&P, but we cannot easily work this into a weighting scheme. The local effect on T is undisputed (e.g. dry/wet soil leads to high/low temps, thus expect negative weights!)
A hydrologist could do this against runoff obs, an agronomist against crop yields, disease (obs!) over matching years
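Pairing each model's soil moisture with the temperature two months later can be sketched as below. The function and argument names are hypothetical, the series are assumed to be consecutive monthly values at one location, and the April to September restriction is applied here to the temperature (target) month, which is one reading of the slide.

```python
import numpy as np

def lagged_pairs(w, t, months, lag=2, target_months=(4, 5, 6, 7, 8, 9)):
    """Pair soil moisture at month m with temperature at month m + lag.

    w: (n_times, n_models) standardized soil moisture anomalies
    t: (n_times,) standardized temperature anomalies
    months: (n_times,) calendar month (1..12) of each time step
    Only pairs whose temperature month is in target_months are kept.
    """
    w = np.asarray(w, dtype=float)
    t = np.asarray(t, dtype=float)
    months = np.asarray(months)
    predictors = w[:-lag]                      # soil moisture at month m
    targets = t[lag:]                          # temperature at month m + lag
    keep = np.isin(months[lag:], target_months)
    return predictors[keep], targets[keep]

# Two years of made-up monthly data with one "model"
months = np.tile(np.arange(1, 13), 2)
w = np.arange(24, dtype=float).reshape(24, 1)
t = 10.0 * np.arange(24)
X, y = lagged_pairs(w, t, months)
```

With standardized inputs, `X.T @ y / len(y)` then gives approximately the correlation of each model with future temperature, and `X` and `y` can take the roles of W and o in the regression.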
[Figure: the vector b × 100 in Eq. (3), $\alpha = A^{-1} b$, which is also the correlation between each model’s soil moisture and the temperature two months later. Models shown: LBcd, LBgl, R1, VIC, ERA40, Noah, R2, RR.]
Conclusions
1) All models’ w correlate negatively with future T. Good!
2) Some models (the 2 LBs, VIC, Noah ….) correlate a little better (with future T) than others (over )
3) A skill-based weighting scheme without consideration of co-linearity would give the highest weights to these models (CPC ‘standard’)
4) Correlations (even -0.15) are modest, even if highly significant. Remember: this is an aggregate for all of the US and 6 warm months (April-Sept) combined
[Figure: the weights α calculated from Eq. (3), $\alpha = A^{-1} b$, with minimal ridging (Ridge = 0.75). Models shown: LBcd, LBgl, R1, VIC, ERA40, Noah, R2, RR.]
Conclusions
1) In the co-linear mix the Leaky Buckets carry most of the weight, followed by Noah and VIC etc. The remaining models speak for portions of the variance that, for the most part, are already accounted for by the leading models.
2) 75% ridging makes for a stable solution (all weights <=~0.)
Question:
1) How much better is the weighted average than an equal weight (-1/8th) mean, and how much better than the best individual model?
Skill as measured by correlation × 100.

CON                 15.9
Equal Weight        14.8
Best single Model   14.9

Conclusions
1) Equal weight MME is NOT better than the best single model because it gives too much importance to poorly performing models.
2) Weighted MME is the best, although the margin of gain may disappoint some of us.
[Diagram labels: UR, MMA, COR, RI, RIM, RIW, Climo; Classic; + DelSole equal weight limit; + CPC skill limit]