Stochastic Nonparametric Techniques for Ensemble Streamflow Forecast: Applications to Truckee/Carson and Thailand Streamflows. Balaji Rajagopalan, Katrina Grantz, Nkrintra Singhrattna, Department of Civil and Environmental Engg., University of Colorado, Boulder, CO; Edith Zagona, CADSWES / Dept. of Civil and Env. Engg., University of Colorado, Boulder, CO; Martyn Clark, CIRES, University of Colorado. GAPP / PI Meeting – Summer 2003
Hydrologic Forecasting: Conditional statistics of the future state, given the current state. Current State: $D_t = (x_t, x_{t-\tau}, x_{t-2\tau}, \dots, x_{t-d_1\tau}, y_t, y_{t-\tau}, y_{t-2\tau}, \dots, y_{t-d_2\tau})$. Future State: $x_{t+T}$. Forecast: $g(x_{t+T}) = f(D_t)$ – where g(.) is a function of the future state, e.g., mean or pdf – and f(.) is a mapping of the dynamics represented by $D_t$ to g(.). Challenges – Composition of $D_t$ – Identify g(.) given $D_t$ and model structure – For nonlinear f(.), nonparametric function estimation methods are used: K-nearest neighbor, Local Regression, Regression Splines, Neural Networks
The Problem: Ensemble forecasting, stochastic simulation, and scenario generation are all conditional probability density function problems. Estimate the conditional PDF and simulate (Monte Carlo or bootstrap).
Parametric Models: Periodic Auto Regressive model (PAR) – Linear lag(1) model – Stochastic Analysis, Modeling, and Simulation (SAMS) (Salas, 1992). Data must fit a Gaussian distribution. Expected to preserve – mean, standard deviation, lag(1) correlation – skew dependent on transformation – Gaussian probability density function
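As a rough illustration of the lag(1) periodic model named on this slide, the sketch below simulates a standardized PAR(1) series. It is not the SAMS code. The function name `simulate_par1`, its arguments, and the standardized form z_m = phi_m z_(m-1) + sqrt(1 - phi_m^2) e are assumptions made for the example; flows are generated in Gaussian space, so any transformation (e.g., logarithmic) would be applied separately.

```python
import math
import random


def simulate_par1(means, sds, phis, n_years, seed=0):
    """Simulate a periodic AR(1) series in Gaussian space (illustrative sketch).

    means[m], sds[m]: mean and standard deviation for season m.
    phis[m]: lag-1 correlation between season m and the preceding season.
    """
    rng = random.Random(seed)
    n_seasons = len(means)
    z_prev = rng.gauss(0.0, 1.0)  # standardized flow of the preceding season
    flows = []
    for _ in range(n_years):
        for m in range(n_seasons):
            # The innovation is scaled so that z keeps unit variance.
            noise = math.sqrt(1.0 - phis[m] ** 2) * rng.gauss(0.0, 1.0)
            z = phis[m] * z_prev + noise
            flows.append(means[m] + sds[m] * z)
            z_prev = z
    return flows


# Hypothetical two-season example (values are made up for illustration).
example = simulate_par1(means=[100.0, 300.0], sds=[20.0, 60.0],
                        phis=[0.4, 0.7], n_years=5)
```

By construction, this form preserves the seasonal mean, standard deviation, and lag(1) correlation, which are the statistics the slide lists.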
Parametric Models - Drawbacks: Model selection / parameter estimation issues – select a model (PDFs or time series models) from candidate models, then estimate parameters. Limited ability to reproduce nonlinearity and non-Gaussian features: the parametric probability distributions in common use are ‘unimodal’, and the parametric time series models in common use are ‘linear’. Outliers have undue influence on the fit. Not portable across sites.
Nonparametric Methods: Any functional (probability density, regression, etc.) estimator is nonparametric if: it is “local” – the estimate at a point depends only on a few neighbors around it (the effect of outliers is removed); and there is no prior assumption of the underlying functional form – data driven. Examples: Kernel Estimators (properties well studied); Splines, Multivariate Adaptive Regression Splines (MARS); K-Nearest Neighbor (K-NN) Bootstrap Estimators; Locally Weighted Polynomials (K-NN Polynomials)
K-NN Philosophy: Find the K nearest neighbors to the desired point x. Resample the K historical neighbors (with high probability for the nearest neighbor and low probability for the farthest) to obtain ensembles. Take the weighted average of the neighbors to obtain the mean forecast. Fit a polynomial to the neighbors – Weighted Least Squares – and use the fit to estimate the function at the desired point x (i.e., local regression). The number of neighbors K and the order of the polynomial p are obtained using GCV (Generalized Cross Validation) – K = N and p = 1 recovers the linear modeling framework. The residuals within the neighborhood can be resampled to provide uncertainty estimates / ensembles.
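The resampling and weighted-average steps on this slide can be sketched as follows for a single predictor. The slide does not state the weight function; the 1/j rank weighting used here is one common choice and is an assumption, as are the function names `knn_weights`, `knn_ensemble`, and `knn_mean`. K is passed in directly rather than chosen by GCV.

```python
import random


def knn_weights(k):
    """Rank-based weights: neighbor j (1 = nearest) gets weight proportional to 1/j."""
    raw = [1.0 / j for j in range(1, k + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def nearest_indices(x_hist, x_new, k):
    """Indices of the k historical points closest to x_new, nearest first."""
    order = sorted(range(len(x_hist)), key=lambda i: abs(x_hist[i] - x_new))
    return order[:k]


def knn_ensemble(x_hist, y_hist, x_new, k, n_members, seed=0):
    """Resample the successors of the k nearest neighbors to form an ensemble."""
    rng = random.Random(seed)
    neighbors = nearest_indices(x_hist, x_new, k)
    picks = rng.choices(neighbors, weights=knn_weights(k), k=n_members)
    return [y_hist[i] for i in picks]


def knn_mean(x_hist, y_hist, x_new, k):
    """Weighted average of the neighbors' responses (the mean forecast)."""
    neighbors = nearest_indices(x_hist, x_new, k)
    return sum(w * y_hist[i] for w, i in zip(knn_weights(k), neighbors))
```

Because members are drawn only from observed values, the ensemble never leaves the historical range; the local polynomial with residual resampling described on the same slide relaxes that limitation.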
Applications to date: Monthly Streamflow Simulation; Space and time disaggregation of monthly to daily streamflow; Monte Carlo Sampling of Spatial Random Fields; Probabilistic Sampling of Soil Stratigraphy from Cores; Hurricane Track Simulation; Multivariate, Daily Weather Simulation; Downscaling of Climate Models; Ensemble Forecasting of Hydroclimatic Time Series; Biological and Economic Time Series; Exploration of Properties of Dynamical Systems; Extension to Nearest Neighbor Block Bootstrapping (Yao and Tong)
K-NN Local Polynomial
K-NN Algorithm (figure: $y_t^*$ plotted against $y_{t-1}$)
Residual Resampling: $y_t = y_t^* + e_t^*$ (figure: $y_t^*$ plotted against $y_{t-1}$, with resampled residual $e_t^*$)
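A minimal sketch of the residual-resampling idea, y_t = y_t* + e_t*, for one predictor and a local linear fit (p = 1). The bisquare weighting, the uniform resampling of neighborhood residuals, and the names `local_linear_fit` and `residual_ensemble` are assumptions for illustration, not the exact scheme used in the study.

```python
import random


def local_linear_fit(x_hist, y_hist, x_new, k):
    """Weighted least-squares line through the k nearest neighbors of x_new.

    Returns (estimate at x_new, residuals of the k neighbors about the fit).
    """
    order = sorted(range(len(x_hist)), key=lambda i: abs(x_hist[i] - x_new))[:k]
    xs = [x_hist[i] for i in order]
    ys = [y_hist[i] for i in order]
    d_max = max(abs(x - x_new) for x in xs) or 1.0
    # Bisquare weights (an assumed choice): nearer neighbors count more.
    # The 1.001 factor keeps the farthest neighbor's weight above zero.
    ws = [(1.0 - (abs(x - x_new) / (1.001 * d_max)) ** 2) ** 2 for x in xs]
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept + slope * x_new, residuals


def residual_ensemble(x_hist, y_hist, x_new, k, n_members, seed=0):
    """Ensemble members y* + e*, with e* drawn from the neighborhood residuals."""
    rng = random.Random(seed)
    y_star, residuals = local_linear_fit(x_hist, y_hist, x_new, k)
    return [y_star + rng.choice(residuals) for _ in range(n_members)]
```

Unlike direct resampling of historical values, this construction can produce ensemble members that were never observed, since the local fit can extrapolate and the residuals are added to it.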
Applications: Local-Polynomial + K-NN residual bootstrap – Ensemble streamflow forecasting, Truckee-Carson basin, NV. Ensemble forecast from categorical probabilistic forecast – Thailand streamflows
Study Area (map labels: Truckee Canal, Farad, Ft. Churchill)
Motivation: USBR needs good seasonal forecasts on the Truckee and Carson Rivers. Forecasts determine how storage targets will be met on Lahontan Reservoir to supply the Newlands Project. (Figure label: Truckee Canal)
Outline of Approach: Climate Diagnostics – to identify large-scale features correlated to Spring flow in the Truckee and Carson Rivers. Ensemble Forecast – stochastic models conditioned on climate indicators (parametric and nonparametric). Application – demonstrate utility of improved forecast to water management
Data – monthly averages: Streamflow at Ft. Churchill and Farad; Precipitation (regional); Geopotential Height 500mb (regional); Sea Surface Temperature (regional)
Annual Cycle of Flows
Fall Climate Correlations with Carson Spring Flow (panels: 500 mb Geopotential Height, Sea Surface Temperature)
Winter Climate Correlations with Carson Spring Flow (panels: 500 mb Geopotential Height, Sea Surface Temperature)
Climate Correlations with Truckee Spring Flow (panels: 500 mb Geopotential Height, Sea Surface Temperature)
High-Low Flow Climate Composites (panels: Sea Surface Temperature, Vector Winds)
Precipitation Correlation
Geopotential Height Correlation
SST Correlation
Flow - NINO3 / Geopotential Height Relationship
The Forecasting Model: Forecast spring runoff in the Truckee and Carson Rivers using winter precipitation and climate data indices (geopotential height index and SST index). Modified K-NN Method: – Uses local polynomial for the mean forecast – Bootstraps the residuals for the ensemble
Wet Years: Overprediction w/o Climate (1995, 1996) – might release water for flood control, then be stuck in spring without enough water. Underprediction w/o Climate (1998). (Panels: Precipitation; Precipitation and Climate)
Dry Years: Overprediction w/o Climate (1998, 1991) – might not implement necessary drought precautions in sufficient time. (Panels: Precipitation; Precipitation and Climate)
Fall Prediction w/ Climate: Fall climate forecast captures whether the season will be above or below average. Results comparable to winter forecast w/o climate. (Panels: Wet Years, Dry Years)
Simple Water Balance: $S_t = S_{t-1} + I_t - R_t$, where $S_{t-1}$ is the storage at time t-1, $I_t$ is the inflow at time t, and $R_t$ is the release at time t. Method to test the utility of the model: pass ensemble forecasts (scenarios) for $I_t$. Gives water managers a quick look at how much storage they will have available at the end of the season – to evaluate decision strategies. For this demonstration, assume $S_{t-1} = 0$ and $R_t = \tfrac{1}{2}\,\bar{I}$, where $\bar{I}$ is the average historical inflow.
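The water balance on this slide is simple enough to sketch directly. The function name `end_of_season_storage` and the numbers in the usage example are hypothetical; the defaults follow the demonstration assumptions stated on the slide (zero initial storage, release equal to half the average historical inflow).

```python
def end_of_season_storage(inflow_ensemble, historical_inflows, initial_storage=0.0):
    """Apply S_t = S_(t-1) + I_t - R_t to every ensemble member.

    The release R_t is fixed at half the mean historical inflow.
    """
    release = 0.5 * sum(historical_inflows) / len(historical_inflows)
    return [initial_storage + inflow - release for inflow in inflow_ensemble]


# Hypothetical numbers, for illustration only.
storage = end_of_season_storage(
    inflow_ensemble=[150.0, 250.0],
    historical_inflows=[100.0, 200.0, 300.0],
)
print(storage)  # [50.0, 150.0]
```

Passing every ensemble member through the balance gives a distribution of end-of-season storage rather than a single value, which is what allows it to be compared against the historical distribution.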
Water Balance 1995 (figure: K-NN Ensemble PDF and Historical PDF of 1995 Storage)
Future Work: Stochastic Model for Timing of the Runoff – disaggregate Spring flows to monthly flows. Statistical Physical Model – couple PRMS with a stochastic weather generator (conditioned on climate info.). Test the utility of these approaches to water management using the USBR operations model in RiverWare
Region / Data: 6 rainfall stations – Nakhon Sawan, Suphan Buri, Lop Buri, Kanchana Buri, Bangkok, and Don Muang; 3 streamflow stations (Chao Phraya basin) – Nakhon Sawan, Chai Nat, Ang-Thong; 5 temperature stations – Nakhon Sawan, Lop Buri, Kanchana Buri, Bangkok, Don Muang; Large Scale Climate Variables – NCEP-NCAR Re-analysis data (
Composite Maps of High Rainfall (panels: Pre-1980, Post-1980)
Composite Maps of Low Rainfall (panels: Pre-1980, Post-1980)
Example Forecast for 1997: Conditional probabilities from historical data (categories are at quantiles); categorical ENSO forecast; conditional flow probabilities using the Total Probability Theorem
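The Total Probability Theorem step combines the two inputs on this slide as P(flow = f) = sum over ENSO categories e of P(flow = f | ENSO = e) P(ENSO = e). The sketch below shows the arithmetic. The function name, the category labels, and every probability value are hypothetical and are not taken from the 1997 forecast.

```python
def flow_probabilities(enso_forecast, flow_given_enso):
    """Combine a categorical ENSO forecast with conditional flow probabilities.

    enso_forecast: {enso_category: P(ENSO category)}
    flow_given_enso: {enso_category: {flow_category: P(flow | ENSO category)}}
    """
    flow_categories = list(next(iter(flow_given_enso.values())))
    return {
        f: sum(enso_forecast[e] * flow_given_enso[e][f] for e in enso_forecast)
        for f in flow_categories
    }


# Hypothetical inputs, for illustration only.
enso = {"el_nino": 0.6, "neutral": 0.3, "la_nina": 0.1}
conditional = {
    "el_nino": {"low": 0.5, "mid": 0.3, "high": 0.2},
    "neutral": {"low": 0.3, "mid": 0.4, "high": 0.3},
    "la_nina": {"low": 0.2, "mid": 0.3, "high": 0.5},
}
probs = flow_probabilities(enso, conditional)
```

With these made-up inputs the flow probabilities come out to roughly 0.41, 0.33, and 0.26 for the low, middle, and high categories, and they sum to one because each conditional row does.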
Ensemble Forecast from Categorical Probabilistic Forecasts: If the categorical probabilistic forecasts are P1, P2 and P3, then – Choose a category with the above probabilities – Randomly select a historical observation from the chosen category – Repeat this a number of times to generate ensemble forecasts
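The three steps above can be sketched as a short sampling loop. The function name `categorical_ensemble`, the category labels, and the flow values in the usage example are hypothetical.

```python
import random


def categorical_ensemble(category_probs, obs_by_category, n_members, seed=0):
    """Build an ensemble from categorical forecast probabilities.

    category_probs: {category: forecast probability} (e.g., P1, P2, P3)
    obs_by_category: {category: historical observations falling in that category}
    """
    rng = random.Random(seed)
    categories = list(category_probs)
    weights = [category_probs[c] for c in categories]
    members = []
    for _ in range(n_members):
        # Step 1: choose a category with the forecast probabilities.
        chosen = rng.choices(categories, weights=weights, k=1)[0]
        # Step 2: pick one historical observation from that category.
        members.append(rng.choice(obs_by_category[chosen]))
    return members


# Hypothetical flows grouped by tercile, for illustration only.
history = {"low": [310.0, 355.0], "mid": [420.0, 450.0], "high": [610.0, 700.0]}
ensemble = categorical_ensemble({"low": 0.5, "mid": 0.3, "high": 0.2},
                                history, n_members=100)
```

Each member is an observed flow, so the ensemble reflects the forecast probabilities across categories while keeping the historical variability within each category.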
Ensemble Forecast of Thailand Streamflows – 1997
Summary: Nonparametric techniques (the K-NN framework in particular) provide a flexible alternative to parametric methods for ensemble forecasting/downscaling. Easy to implement, with a parsimonious extension to multivariate situations. Water managers can utilize the improved forecasts in operations and seasonal planning. No prior assumption about the functional form is needed. Can capture nonlinear/non-Gaussian features readily.