Exploring sample size issues for 6-10 day forecasts using ECMWF’s reforecast data set
Model: 2005 version of ECMWF model; T255 resolution.
Initial conditions: 15 members, ERA-40 analysis + singular vectors.
Dates of reforecasts: 1982-2001, once-weekly reforecasts from 01 Sep to 01 Dec, 14 total. So, 20*14 ensemble reforecasts = 280 samples.
Data sent to NOAA/ESRL: T2m and precipitation ensemble over most of North America, excluding Alaska. Saved on 1-degree lat/lon grid. Forecasts to 10 days lead.
Tom Hamill and Jeff Whitaker, NOAA/ESRL
Data courtesy of Renate Hagedorn & ECMWF

What we did
Considered 6-10 day forecasts of T2m and precipitation (the longest possible lead from this data set).
Relevance to week 2, 3, and 4 forecasts? Your guess is as good as mine (I think probably some relevance, less for week 4 than week 2).
Experiments:
–N-member reforecast, N members real time (N = 1, 3, 5, 7, 9, 11, 13, 15)
–N-member reforecast, 15 members real time
–Use established statistical calibration procedures

Observation locations for 2-meter temperature calibration
Uses stations from NCAR’s DS472.0 database that have more than 96% of the yearly records available and that overlap with the domain that ECMWF sent us.
(Note: precipitation calibration is based on NARR over CONUS.)

T2m calibration procedure: “NGR”, “Non-homogeneous Gaussian Regression”
Reference: Gneiting et al., MWR, 133, p. 1098
Predictors: ensemble mean and ensemble spread
Output: mean and spread of the calibrated normal distribution, N(a + b·(ensemble mean), c + d·(ensemble variance)) in the notation of Gneiting et al.
Advantage: leverages a possible spread/skill relationship appropriately. Large spread/skill relationship: c ≈ 0.0, d ≈ 1.0. Small: d ≈ 0.0.
Disadvantage: iterative method, slow; no reason to bother (relative to using simple linear regression) if there’s little or no spread/skill relationship.
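To make the role of c and d concrete, here is a minimal Python sketch. It assumes the NGR form from Gneiting et al., a normal distribution with mean a + b·(ensemble mean) and variance c + d·(ensemble variance), and it uses the closed-form CRPS of a Gaussian, the score that paper minimizes when fitting. The coefficient values and function names are made up for illustration; they are not fitted values from this study.

```python
import math

def ngr_distribution(ens_mean, ens_var, a, b, c, d):
    """Mean and standard deviation of the calibrated normal distribution."""
    mu = a + b * ens_mean
    sigma = math.sqrt(c + d * ens_var)
    return mu, sigma

def crps_normal(mu, sigma, y):
    """Closed-form CRPS of N(mu, sigma^2) for the observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# Hypothetical coefficients for the two limiting cases described on the slide.
strong = dict(a=0.0, b=1.0, c=0.0, d=1.0)  # spread fully trusted
weak = dict(a=0.0, b=1.0, c=4.0, d=0.0)    # spread ignored, fixed variance

for ens_var in (1.0, 9.0):
    for name, coef in (("strong", strong), ("weak", weak)):
        mu, sigma = ngr_distribution(10.0, ens_var, **coef)
        print(name, "ens_var =", ens_var, "-> sigma =", sigma)
```

With d ≈ 1 the calibrated spread follows the ensemble spread from case to case; with d ≈ 0 it stays fixed, which is what plain linear regression already gives.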

Also: T2m calibration procedure: linear regression
Predictors: ensemble mean of lowest sigma-layer temp
Output: predicted mean and standard deviation σ, where σ is determined from the observations y and the training sample size S (the defining equation did not survive in this transcript)
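The regression step can be sketched as follows. Since the slide's equation for the standard deviation is missing from the transcript, this sketch assumes the usual residual standard error of a one-predictor least-squares fit, with S − 2 in the denominator; the authors' exact formula may differ. The function name and sample numbers are hypothetical.

```python
import math

def fit_linear_regression(x, y):
    """Least-squares fit y ~ a + b*x; returns (a, b, sigma).

    sigma is the residual standard error with S - 2 degrees of freedom,
    which is an assumption, not the formula from the slide.
    """
    S = len(x)
    xbar = sum(x) / S
    ybar = sum(y) / S
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(sse / (S - 2))
    return a, b, sigma

# Made-up training pairs: ensemble-mean temperature and observed temperature.
ens_mean = [10.0, 12.0, 15.0, 18.0, 20.0]
observed = [11.0, 12.5, 16.0, 18.5, 21.0]
a, b, sigma = fit_linear_regression(ens_mean, observed)
print("predicted mean for ens_mean=14:", a + b * 14.0, "sigma:", sigma)
```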

Training data for Non-homogeneous Gaussian Regression (all cross validated)
Forecast date: training dates
01 Sep: 01 Sep, 08 Sep, 15 Sep
08 Sep: 01 Sep, 08 Sep, 15 Sep, 22 Sep
15 Sep: 01 Sep, 08 Sep, 15 Sep, 22 Sep, 29 Sep
17 Nov: 03 Nov, 10 Nov, 17 Nov, 24 Nov, 01 Dec
24 Nov: 10 Nov, 17 Nov, 24 Nov, 01 Dec
01 Dec: 17 Nov, 24 Nov, 01 Dec
Use a centered training data set for weeks 3-12, uncentered for weeks 1, 2, 13, and 14.
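The windowing above can be reproduced with a short Python sketch. The half-width of two weeks is inferred from the rows listed on the slide, and the names are hypothetical. Cross-validation (leaving out the year being forecast) is not shown here.

```python
from datetime import date, timedelta

# The 14 once-weekly reforecast dates, 01 Sep through 01 Dec.
# The year is arbitrary; only month and day matter for the window.
FORECAST_DATES = [date(1982, 9, 1) + timedelta(weeks=k) for k in range(14)]

def training_dates(forecast_date, half_width=2):
    """Dates within +/- half_width weeks, clipped at the season's ends."""
    i = FORECAST_DATES.index(forecast_date)
    lo = max(0, i - half_width)
    hi = min(len(FORECAST_DATES), i + half_width + 1)
    return FORECAST_DATES[lo:hi]

def label(d):
    return d.strftime("%d %b")

for d in FORECAST_DATES:
    print(label(d) + ": " + ", ".join(label(t) for t in training_dates(d)))
```

Weeks 3-12 get a centered five-date window; the first and last two weeks get the clipped, uncentered windows.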

T2m results, same ensemble size for reforecast as for real-time forecast
Notes:
(1) Some sensitivity to ensemble size; more members clearly better, most of the benefit by 11 members.
(2) Linear regression slightly better for small ensemble sizes, NGR slightly better for large ensemble sizes.

T2m results, smaller reforecast than 15-member real-time forecast
Notes:
(1) NGR line replicated from previous plot for sake of comparison.
(2) Linear regression with coefficients developed from a 3-member reforecast and applied to 15 members in real time provides almost all of the benefit of the full 15-member reforecast.

Precipitation forecast calibration: logistic regression
Given predictors x1, …, xN (such as the ensemble mean), find regression coefficients β0, β1, …, βN for the equation
P = 1 / (1 + exp[−(β0 + β1 x1 + … + βN xN)])
where P is the forecast probability. This generates an S-shaped curve (here for one predictor).
(Figure: probability versus predictor value)
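For reference, a small Python sketch of the S-shaped curve, assuming the standard logistic form P = 1 / (1 + exp[−(β0 + β1 x1 + … + βN xN)]). The function name and the coefficient values are illustrative only; fitting the coefficients is not shown.

```python
import math

def logistic_probability(predictors, betas):
    """Probability from a logistic regression.

    betas[0] is the intercept; betas[1:] pair up with the predictors.
    The two branches avoid overflow in exp() for large |z|.
    """
    z = betas[0] + sum(b * x for b, x in zip(betas[1:], predictors))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# One predictor, hypothetical coefficients: print the curve at a few values.
betas = [-3.0, 1.5]
for x in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(x, round(logistic_probability([x], betas), 3))
```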

Precipitation calibration training procedure
Cross-validate (for example, 1983 forecasts use 1982, 1984-2001).
Use all fall season data together, unlike temperature (1 Sep forecasts use 1 Sep - 1 Dec training data). [Seasonal biases assumed less important than training sample size.]
Sole predictor: (ens. mean precip)^0.5
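The cross-validation split and the predictor can be sketched in Python as below. This assumes that the trailing "0.5" on the slide is an exponent, so the predictor is the square root of the ensemble-mean precipitation. The function names and sample values are hypothetical.

```python
import math

REFORECAST_YEARS = list(range(1982, 2002))  # 1982-2001, 20 years

def training_years(forecast_year):
    """Leave-one-year-out: every reforecast year except the one forecast."""
    return [y for y in REFORECAST_YEARS if y != forecast_year]

def precip_predictor(member_precip):
    """Square root of the ensemble-mean precipitation (assumed exponent 0.5)."""
    ens_mean = sum(member_precip) / len(member_precip)
    return math.sqrt(ens_mean)

print(training_years(1983))
print(precip_predictor([0.0, 4.0, 8.0]))
```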

Increasing logistic regression sample size by compositing data from different locations
Big dot: location at which to perform the logistic regression.
Small dots: grid points with similar observed climatologies, used to augment the training sample at the big dot’s location. Constrained so that the analog composite locations can’t be too near to each other.
Sub-optimal: what if forecast climatologies differ? What if forecast/observed correlations differ? These are not accounted for in choosing analog locations.
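One way such a selection might look is sketched below in Python. The slide does not say how climatological similarity or nearness was measured, so everything here is an assumption: similarity is the absolute difference in a single climatological value, nearness is great-circle distance, and the selection is greedy. All names and numbers are hypothetical.

```python
import math

def great_circle_km(p, q):
    """Haversine distance in km between (lat, lon, ...) tuples in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * 6371.0 * math.asin(math.sqrt(h))

def pick_analog_locations(target, candidates, n_analogs, min_sep_km):
    """Greedy choice of grid points with climatology closest to the target's.

    Each point is (lat, lon, climatology). A candidate is skipped if it lies
    within min_sep_km of the target or of any point already chosen.
    """
    ranked = sorted(candidates, key=lambda c: abs(c[2] - target[2]))
    chosen = [target]
    for c in ranked:
        if len(chosen) - 1 == n_analogs:
            break
        if all(great_circle_km(c, s) >= min_sep_km for s in chosen):
            chosen.append(c)
    return chosen[1:]

target = (40.0, -100.0, 2.0)
candidates = [(40.5, -100.0, 2.0), (42.0, -100.0, 2.1),
              (42.2, -100.0, 2.05), (35.0, -90.0, 3.0)]
print(pick_analog_locations(target, candidates, n_analogs=2, min_sep_km=100.0))
```

As the slide notes, a scheme like this looks only at observed climatology; it does not check whether forecast climatologies or forecast/observed correlations match.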

Precipitation calibration example

Reliability diagrams
Panels: (1) 15-member reforecast / 15-member real-time, calibrated; (2) 15-member, from raw ECMWF ensemble
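For readers unfamiliar with how the points on a reliability diagram are computed, here is a minimal Python sketch: forecast probabilities are grouped into bins, and each bin's mean forecast probability is paired with the observed event frequency in that bin. The equal-width binning and the names are assumptions; the binning used for the figures on this slide is not stated.

```python
def reliability_curve(probs, outcomes, n_bins=10):
    """Return (mean forecast prob, observed frequency, count) per non-empty bin.

    probs are forecast probabilities in [0, 1]; outcomes are 0 or 1.
    """
    counts = [0] * n_bins
    prob_sum = [0.0] * n_bins
    event_sum = [0] * n_bins
    for p, o in zip(probs, outcomes):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        counts[k] += 1
        prob_sum[k] += p
        event_sum[k] += o
    return [(prob_sum[k] / counts[k], event_sum[k] / counts[k], counts[k])
            for k in range(n_bins) if counts[k] > 0]

# Made-up forecasts and outcomes, two bins for brevity.
print(reliability_curve([0.1, 0.1, 0.9, 0.9, 1.0], [0, 0, 1, 1, 1], n_bins=2))
```

A perfectly reliable forecast puts every point on the diagonal, where observed frequency equals forecast probability.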

Precipitation Brier skill scores
Again, fewer members are needed in the reforecast, as long as the real-time ensemble is larger. Most of the benefit is achieved with 5-7 members in the reforecast (more than the 3 members needed for temperature calibration).
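As a reminder of how the score is defined, a minimal Python sketch follows. The Brier skill score is 1 − BS/BS_ref. The reference forecast used for the slide's results is not stated, so the sample climatological frequency used as the default here is an assumption.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def brier_skill_score(probs, outcomes, reference_probs=None):
    """1 - BS/BS_ref; the reference defaults to the sample climatology.

    Undefined (division by zero) if the reference forecast is perfect.
    """
    if reference_probs is None:
        clim = sum(outcomes) / len(outcomes)
        reference_probs = [clim] * len(outcomes)
    return 1.0 - brier_score(probs, outcomes) / brier_score(reference_probs, outcomes)

outcomes = [1, 0, 0, 1]
print(brier_skill_score([0.8, 0.2, 0.1, 0.7], outcomes))  # positive: beats climatology
print(brier_skill_score([0.5, 0.5, 0.5, 0.5], outcomes))  # zero: same as climatology
```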

Comments from Renate Hagedorn, ECMWF “The results itself are pretty much consistent with my results on the importance of the number of ensemble members in the training data set. I've also seen that 5 members are already quite sufficient and increasing the number to 15 doesn't give much benefit. In contrast to that, the number of years seems to be more important. Since increasing the reforecasts from 5 to 15 members is obviously very expensive (and doesn't seem to be justified very much) we'll go for a 5-member reforecast ensemble in our new system.” “Why you don't use ECMWF monthly forecast / reforecast data if you are interested in week 2,3,4 aspects? This could help with the problem/question of relevance of the 6-10 day forecasts.”

Reconfiguration of CFS? (intended as a starting point for discussion)
Real-time:
–Planned: 4x/day to 9 months (= 36 months/day)
–Reconfigured: 2x/day, 10 members out to 1 month, then single member to 9 months (2*(10+8) = 36 months/day)
Reforecasts:
–Planned: 1 run/day to 9 months (9 months/day)
–Reconfigured: 10-member ensemble to 1 month every 2nd day (alternating 00Z, 12Z) = 5 months/day; 1 member extending for months 2-9 every other day = 4 months/day. Total = 5 + 4 = 9 months/day
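The cost bookkeeping above can be checked with a few lines of Python. Cost is counted in model-months integrated per day, as on the slide; the helper name is hypothetical.

```python
def months_per_day(runs_per_day, members, months):
    """Model-months integrated per day for one configuration segment."""
    return runs_per_day * members * months

# Real-time
planned_rt = months_per_day(4, 1, 9)
reconfigured_rt = months_per_day(2, 10, 1) + months_per_day(2, 1, 8)

# Reforecasts ("every 2nd day" = 0.5 runs per day)
planned_rf = months_per_day(1, 1, 9)
reconfigured_rf = months_per_day(0.5, 10, 1) + months_per_day(0.5, 1, 8)

print("real-time:  planned", planned_rt, "reconfigured", reconfigured_rt)
print("reforecast: planned", planned_rf, "reconfigured", reconfigured_rf)
```

Both reconfigurations cost the same as the planned setup; they trade long-lead runs for a larger ensemble in the first month.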

Conclusions
Assuming 15-member real-time forecast:
–3-member reforecast sufficient for calibration of 6-10 day temperatures
–5-7 member reforecast sufficient for calibration of 6-10 day precipitation
Relevance to calibration of weeks 2, 3, and 4? (Perhaps could explore ECMWF’s monthly data set for greater relevance.)
Reconfiguration/supplementation of CFS to improve sub-seasonal forecasts should be discussed.
