1 It Never Rains But It Pours: Modeling Mixed Discrete-Continuous Weather Phenomena. J. McLean Sloughter. This work was supported by the DoD Multidisciplinary University Research Initiative (MURI) program administered by the Office of Naval Research under Grant N00014-01-10745. Based on research being conducted under Adrian E. Raftery and Tilmann Gneiting.

2 Ensemble Forecasting: A single forecast model is run multiple times with different initial conditions. The ensemble mean tends to outperform individual members. Spread-skill relationship: the spread of the forecasts tends to be correlated with the magnitude of the error. The model is underdispersive (not calibrated).

3 Spread/Skill Plot: Spread = max forecast - min forecast. Skill = abs(forecast mean - observed).
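
The two quantities plotted on this slide can be computed per forecast case as in the following minimal sketch. The function name and the eight member values are hypothetical, chosen only to illustrate the definitions above; they are not data from the presentation.

```python
from statistics import mean

def spread_and_skill(member_forecasts, observed):
    """Return (spread, skill) for one forecast case, using the slide's definitions."""
    spread = max(member_forecasts) - min(member_forecasts)  # max forecast - min forecast
    skill = abs(mean(member_forecasts) - observed)          # abs(forecast mean - observed)
    return spread, skill

# Hypothetical eight-member ensemble and observation (made-up numbers).
members = [4.0, 5.5, 6.0, 4.5, 5.0, 7.0, 6.5, 5.5]
spread, skill = spread_and_skill(members, observed=8.0)
```

Plotting skill against spread over many cases would give the kind of scatter the slide refers to.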

4 Ensemble Member Forecasts: 48-hour forecasts for precipitation at 5pm, Oct 20, 2003. From http://www.atmos.washington.edu/~emm5rt/ensemble.cgi

5 Bayesian Model Averaging: A weighted average of multiple models. The weights are determined by the posterior probabilities of the models. The posterior probabilities are given by how well each member fits the training data. The weights, then, give an indication of the relative usefulness of the ensemble members.

6 BMA for ensembles: Picture taken from Raftery, Balabdaoui, Gneiting, and Polakowski (2003), “Calibrated Mesoscale Short-Range Ensemble Forecasting Using Bayesian Model Averaging.” The BMA predictive density is $p(y \mid f_1, \dots, f_K) = \sum_{i=1}^{K} w_i \, g_i(y \mid f_i)$, where $f_i$ is the forecast from member $i$, $w_i$ is the weight associated with member $i$, and $g_i(y \mid f_i)$ is the estimated distribution function for $Y$ given member $i$.
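
As a rough illustration of the weighted-mixture idea on this slide, the sketch below evaluates a BMA-style predictive density. It assumes normal component densities centered on each member's forecast; the function names and the numbers in the example are hypothetical.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def bma_density(y, forecasts, weights, sigmas):
    """Weighted sum of per-member densities: sum over i of w_i * g_i(y | f_i)."""
    return sum(w * normal_pdf(y, f, s)
               for f, w, s in zip(forecasts, weights, sigmas))

# Hypothetical two-member example; the weights must sum to one.
forecasts, weights, sigmas = [10.0, 12.0], [0.6, 0.4], [1.0, 2.0]
density_at_11 = bma_density(11.0, forecasts, weights, sigmas)
```

Because the weights sum to one and each component is a proper density, the mixture integrates to one as well.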

7 The Trouble With Our Models: Forecasts never predict zero, an artifact of the differential equations used to create the forecasts. Observed wind speed is often zero. Wind speed, even ignoring zeroes, is not normally distributed.

8 What Wind Speed Looks Like: [Figure: three wind speed histograms.] Full histogram: several exceptionally high values make it harder to see clearly. Histogram truncated to only go up to fifty: there is a spike at zero. Histogram without zeroes.

9 What Forecasts Look Like: Forecast histogram on the left (all eight forecasts have similar histograms) and observed histogram on the right. Even after removing zeroes from the observed histogram, the shape is still not quite right: the observed distribution is more sharply skewed.

10 The Problem With Reality: As we saw, the histograms for the forecasts do not match the histogram for the observations very well. The maximal observed value is 124.000. Maximal values for each model:

Model   Maximum
AVN     44.591
CMCG    54.566
Eta     44.722
GASP    51.732
JMA     51.829
NGPS    45.383
TCWB    45.125
UKMO    49.243

11 More Trouble With Reality: Pairwise correlations. Y is the observed value; the others are the various forecasts.

        AVN     CMCG    Eta     GASP    JMA     NGPS    TCWB    UKMO    Y
AVN     1.000   0.826   0.845   0.797   0.795   0.783   0.731   0.840   0.417
CMCG    0.826   1.000   0.822   0.807   0.819   0.797   0.770   0.805   0.402
Eta     0.845   0.822   1.000   0.789   0.793   0.781   0.747   0.800   0.406
GASP    0.797   0.807   0.789   1.000   0.788   0.779   0.747   0.786   0.394
JMA     0.795   0.819   0.793   0.788   1.000   0.788   0.756   0.783   0.400
NGPS    0.783   0.797   0.781   0.779   0.788   1.000   0.757   0.779   0.388
TCWB    0.731   0.770   0.747   0.747   0.756   0.757   1.000   0.721   0.384
UKMO    0.840   0.805   0.800   0.786   0.783   0.779   0.721   1.000   0.415
Y       0.417   0.402   0.406   0.394   0.400   0.388   0.384   0.415   1.000

12

13 Time Trends: Left, average observed wind speed per day. Right, the same, but smoothed to average over a 3-week interval.

14 Time Trend Troubles: Winds are higher in summer and lower in winter (note that this appears to be an odd trait of the Northwest). We need the model to reflect seasonal patterns. We would still like to have just a simple model based on the forecasts.

15 A Recap Of All The Things That Make Life Interesting and Miserable: The distribution is not normal. Time. Forecasts are not very highly correlated with observations. Zeroes.

16 What to do about distributions? We can model using another distribution: gammas and Weibulls are popular models for wind speed. Or we can apply a transformation to the data. Left: root of forecast wind speed from model 1. Right: root of observed wind speed (excluding zeroes for easier visualization).

17 What to do about time? Rather than using all available data as the training set, train only on recent data. There is a trade-off between the lower variance of estimates obtained with more data and the better picture of current trends obtained with less data. Previous research (on temperature and pressure) indicates that a 40-day window of training data seems to give a good balance.
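
The sliding training window described on this slide could be selected as in the sketch below. It assumes days are represented as sorted integer indices; the function name and that representation are hypothetical conveniences, not the presentation's actual data handling.

```python
def training_window(days, forecast_day, window=40):
    """Return the `window` most recent days strictly before `forecast_day`.

    `days` is assumed to be sorted in increasing order.
    """
    earlier = [d for d in days if d < forecast_day]
    return earlier[-window:]

# Hypothetical record of 100 consecutive days, forecasting day 60.
days = list(range(100))
train_days = training_window(days, forecast_day=60)
```

Early in the record, when fewer than 40 days are available, the sketch simply returns whatever earlier days exist.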

18 What to do about the forecasts? We’re not making the forecasts, just using them. We can apply a bias correction by performing a linear regression of the observations on the forecasts (this is commonly done in forecasting already). We can see from our weight terms which models perform better and which perform worse, and report that to the folks making the forecasts. We can hope that the science of meteorology continues to move forward as it has thus far.
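
A bias correction of the kind mentioned on this slide can be sketched as an ordinary least-squares fit of observed values on one member's forecasts. The function names and the toy numbers are hypothetical, and a single-predictor regression per member is an assumption about how the correction would be set up.

```python
def fit_bias_correction(forecasts, observations):
    """Least-squares fit of observed = a + b * forecast; returns (a, b)."""
    n = len(forecasts)
    mean_f = sum(forecasts) / n
    mean_y = sum(observations) / n
    sxx = sum((f - mean_f) ** 2 for f in forecasts)
    sxy = sum((f - mean_f) * (y - mean_y)
              for f, y in zip(forecasts, observations))
    b = sxy / sxx
    a = mean_y - b * mean_f
    return a, b

def bias_corrected(forecast, a, b):
    """Apply the fitted correction to a new forecast."""
    return a + b * forecast

# Made-up training pairs lying exactly on observed = 2 + 0.5 * forecast.
train_f = [1.0, 2.0, 3.0, 4.0]
train_y = [2.5, 3.0, 3.5, 4.0]
a, b = fit_bias_correction(train_f, train_y)
```

The corrected forecast can then serve as the center of that member's component distribution.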

19 What to do about zeroes? We need a model that includes a point mass at zero. There are two main possibilities: (1) we could model a weighted average of eight distributions, each of which is a normal plus a point mass at zero; or (2) we could first model the probability of zero versus non-zero and then, conditioned on non-zero, the weighted average of eight normals. We will pursue the second option for now.
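
One way to write down the second option is sketched below. The symbols are assumptions introduced here for illustration: $\pi_0$ is the modeled probability of a zero, $w_i$ are the mixture weights, $\phi$ is the normal density, and $\mu_i$ and $\sigma_i^2$ are the mean and variance of the component tied to forecast $f_i$.

```latex
P(Y = 0 \mid f_1, \dots, f_8) = \pi_0(f_1, \dots, f_8),
\qquad
p(y \mid Y \neq 0,\, f_1, \dots, f_8) = \sum_{i=1}^{8} w_i \, \phi\!\left(y;\, \mu_i, \sigma_i^2\right),
\qquad
\sum_{i=1}^{8} w_i = 1 .
```

Under this form the zero probability and the mixture for non-zero values can be estimated in two separate steps.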

20 So, let’s get to it then: The probability of zero can be modeled by a logistic regression on the eight forecasts. Then the weighted average of normals can be determined by the EM algorithm. Assume each normal has the bias-corrected forecast as its mean and has a constant variance. Alternate between predicting membership based on the weights and variances, and estimating the weights and variances based on membership. Make sure to also include the probability of being non-zero when evaluating our functions.
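
The alternation described on this slide can be sketched as follows for the non-zero observations only. This is a minimal illustration, not the presentation's implementation: the function names are hypothetical, the component means are treated as fixed inputs standing in for bias-corrected forecasts, each component gets its own standard deviation, and the check at the end uses made-up two-component data.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_normal_mixture(y, means, max_iter=200, tol=1e-8):
    """Fit weights and standard deviations of a K-component normal mixture.

    y:     non-zero observations, length n
    means: for each observation, a sequence of K fixed component means
    Returns (weights, sigmas, iterations_used).
    """
    n, k = len(y), len(means[0])
    weights = [1.0 / k] * k
    # Start each sd at the root-mean-square residual about that component's mean.
    sigmas = [math.sqrt(sum((yi - m[j]) ** 2 for yi, m in zip(y, means)) / n)
              for j in range(k)]
    for iteration in range(1, max_iter + 1):
        # E-step: membership probabilities given the current weights and sds.
        resp = []
        for yi, m in zip(y, means):
            dens = [weights[j] * normal_pdf(yi, m[j], sigmas[j]) for j in range(k)]
            total = sum(dens)
            # Fall back to equal membership if every component underflows.
            resp.append([d / total for d in dens] if total > 0.0 else [1.0 / k] * k)
        # M-step: weights and sds given the membership probabilities.
        mass = [sum(r[j] for r in resp) for j in range(k)]
        new_weights = [mj / n for mj in mass]
        new_sigmas = [
            math.sqrt(sum(r[j] * (yi - m[j]) ** 2
                          for r, yi, m in zip(resp, y, means)) / mass[j])
            for j in range(k)
        ]
        change = max(abs(p - q) for p, q in
                     zip(weights + sigmas, new_weights + new_sigmas))
        weights, sigmas = new_weights, new_sigmas
        if change < tol:
            break
    return weights, sigmas, iteration

# Made-up check: true weights 0.6 / 0.4 and sds 1.0 / 3.0.
rng = random.Random(0)
means = [(rng.uniform(30, 50), rng.uniform(10, 20)) for _ in range(2000)]
y = [rng.gauss(m1, 1.0) if rng.random() < 0.6 else rng.gauss(m2, 3.0)
     for m1, m2 in means]
weights, sigmas, n_iter = em_normal_mixture(y, means)
```

The sketch does not guard against a component's weight or standard deviation collapsing to zero, which a production version would need to handle.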

21 Let’s try out a simple test case first: Generate a sample of 100,000 ordered triples (x1, x2, y), with x1 uniform over 30 to 50 and x2 uniform over 10 to 20, and logistic regression coefficients of a=10, b1=-0.2, b2=-0.6. With probability determined by the logistic regression, y=0. Otherwise, with probability 0.6, y is normal around x1 with sd of 1, and with probability 0.4, y is normal around x2 with sd of 3.14.
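
The test-case recipe on this slide could be generated as in the sketch below. It assumes the logistic regression gives the probability that y is zero, with linear predictor a + b1*x1 + b2*x2; the function name, the seed, and the smaller sample size are hypothetical choices made to keep the example quick.

```python
import math
import random

def simulate_triples(n, a=10.0, b1=-0.2, b2=-0.6, sd1=1.0, sd2=3.14, seed=0):
    """Generate n triples (x1, x2, y) following the slide's recipe."""
    rng = random.Random(seed)
    triples = []
    for _ in range(n):
        x1 = rng.uniform(30.0, 50.0)
        x2 = rng.uniform(10.0, 20.0)
        # Assumed convention: the logistic model gives P(y = 0).
        p_zero = 1.0 / (1.0 + math.exp(-(a + b1 * x1 + b2 * x2)))
        if rng.random() < p_zero:
            y = 0.0
        elif rng.random() < 0.6:
            y = rng.gauss(x1, sd1)  # first component, centered on x1
        else:
            y = rng.gauss(x2, sd2)  # second component, centered on x2
        triples.append((x1, x2, y))
    return triples

sample = simulate_triples(20000)
zero_fraction = sum(1 for _, _, y in sample if y == 0.0) / len(sample)
```

With these coefficients, zeros are rare under the assumed convention, so a large sample is needed to estimate the logistic part well.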

22 How did we do? Predicted logistic coefficients of a=10.257, b1=-0.206, b2=-0.600; weights of 0.598 and 0.402; sds of 1.000 and 3.414. The approach seems to be able to model pretty well under ideal artificial conditions. So now let’s try the real thing.

23 How do we do? RMSE from creating forecasts for 33 days, using 40-day training periods. Black is without modeling zeroes; red is with modeling zeroes.

24 Iterations to convergence: Again, black is without modeling zeroes; red is with.

25 How do we feel about this? Including modeling of zeroes doesn’t appear to help our error much (p=0.4734), which is somewhat disappointing. However, we get our model much faster (p=2.104e-08), which matters when we have to fit a lot of these.

26 What We Would Have Done Next Had We Not Been Distracted By More Pressing Matters: Consider fitting different distributions rather than using a transformation. Fit a model with point masses at zero for each individual component. Try additional bias corrections. Compare results for different training windows. Investigate the importance of starting values in the EM algorithm. Evaluate the performance of prediction intervals rather than just prediction means.

27 Three Months Later… In the interim, while we were distracted, precipitation data became available. Precipitation forecasts tend to be better than wind forecasts. Weather people tend to be more interested in precipitation forecasts. And so we have abandoned (for now) wind speed and are looking at precipitation instead.

28 How is rain different? The distribution doesn’t look normal, even under transformations. The models do predict zero, but we still see point masses at zero.

29 Conditional Histograms: [Figure panels: observed given forecast from 1.5 to 4; observed given forecast from 55 to 80 (in units of 0.01”).]

30 Fitting Gammas: [Figure panels: Shape Parameter; Rate Parameter; Coef. of Variation.]
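
For readers unfamiliar with the three quantities named on this slide, the sketch below fits a gamma distribution by the method of moments. The slides do not say which fitting method was used, so method of moments is an assumption made here for simplicity; the function names and the simulated sample are hypothetical.

```python
import math
import random
from statistics import mean, pvariance

def fit_gamma_moments(values):
    """Method-of-moments gamma fit; returns (shape, rate)."""
    m = mean(values)
    v = pvariance(values)
    shape = m * m / v  # gamma mean is shape / rate
    rate = m / v       # gamma variance is shape / rate**2
    return shape, rate

def coefficient_of_variation(shape):
    """For a gamma distribution, sd / mean equals 1 / sqrt(shape)."""
    return 1.0 / math.sqrt(shape)

# Made-up sample with true shape 2.0 and rate 0.5 (gammavariate takes a scale).
rng = random.Random(0)
sample = [rng.gammavariate(2.0, 1.0 / 0.5) for _ in range(20000)]
shape, rate = fit_gamma_moments(sample)
cv = coefficient_of_variation(shape)
```

By construction, a method-of-moments fit reproduces the sample mean exactly (shape / rate equals the mean), which is not true of every fitting method.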

31 Something’s Not Quite Right: [Figure labels: Sample Mean; Estimated Mean.]

32 At least something worked out nicely: [Figure: Proportion of Zeroes.]

33 And Now? First, find out what’s going funny with our gamma fitting. Then, try to come up with a way to do some sort of gamma fitting in the EM algorithm. Then, look at all those things we wanted to look at before.

