A Brief Introduction to Statistical Forecasting Kevin Werner.


1 A Brief Introduction to Statistical Forecasting Kevin Werner

2 Outline Principal Component Theory Applications Z Score VIPER

3 Basic Forecast Methods Statistical regression: [Figure: May 1 snowpack (% avg) vs. Apr-Jul streamflow (% avg), S Fork Rio Grande, Colo] Simulation modeling: [Figure: schematic labeled Snow, Rainfall, Heat, Snow pack, Soil water, Runoff] Credit: Tom Pagano

4 The General Linear Regression Model
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$$
where: Y = dependent variable; X_i = independent variables; b_i = regression coefficients; n = number of independent variables Credit: Dave Garen

5 The Problem If X’s are intercorrelated, they contain redundant information, and the b’s cannot be meaningfully estimated. However, we don’t want to have to throw out most of the X’s but prefer to retain them for robustness. Credit: Dave Garen

6 Example Streamflow = b0 + b1 * (Snotel A) + b2 * (Snotel B) -> Snotel sites are very well correlated -> An optimal b1 and b2 will be difficult to determine since the correlation is so strong
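The effect can be illustrated with synthetic data. The sketch below is not from the slides: the two "stations", the true coefficients, the noise level, and the function name `coefficient_spread` are all hypothetical. It refits the two-predictor regression many times and reports how much the estimate of b1 scatters from fit to fit.

```python
import numpy as np

def coefficient_spread(rho, n_years=30, n_trials=500, seed=0):
    """Std. dev. of the fitted b1 over repeated synthetic samples.

    rho is the correlation between the two (hypothetical) predictors.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    b1_estimates = []
    for _ in range(n_trials):
        # Two standardized predictors with the requested intercorrelation
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n_years)
        # Synthetic "streamflow": both predictors matter equally, plus noise
        y = 1.0 + 0.5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.5, n_years)
        A = np.column_stack([np.ones(n_years), X])
        coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
        b1_estimates.append(coeffs[1])
    return float(np.std(b1_estimates))

spread_independent = coefficient_spread(rho=0.0)
spread_correlated = coefficient_spread(rho=0.98)
print(spread_independent, spread_correlated)
```

With strongly correlated predictors the b1 estimates scatter far more widely, even though the combined prediction stays about as good. That is the sense in which b1 and b2 "cannot be meaningfully estimated".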

7 The Solution Possibilities: 1) Pre-combine X’s into composite index(es), e.g., Z-score method 2) Principal components regression These are similar in concept but differ in the mathematics. Credit: Dave Garen

8 Principal Components Analysis Principal components regression is just like standard regression except the independent variables are principal components rather than the original X variables. Principal components are linear combinations of the X’s. Credit: Dave Garen

9 Principal Components Analysis Each principal component is a weighted sum of all the X’s:... Credit: Dave Garen

10 Principal Components Analysis The e’s are called eigenvectors, derived from a matrix equation whose input is the correlation matrix of all the X’s with each other. Principal components are new variables that are not correlated with each other. The principal components transformation is equivalent to a rotation of axes. Credit: Dave Garen
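In standard PCA notation, the weighted sum and the eigenvector equation can be sketched as follows. The symbols z_i, e_ij, lambda_j and R are assumptions of this note rather than the slides' own notation; the X's are standardized because the eigenvectors come from the correlation matrix R.

```latex
\begin{aligned}
PC_j &= e_{1j} z_1 + e_{2j} z_2 + \dots + e_{nj} z_n,
\qquad z_i = \frac{X_i - \bar{X}_i}{s_i} \\
R\,\mathbf{e}_j &= \lambda_j\,\mathbf{e}_j
\end{aligned}
```

Since the trace of R equals n, the ratio lambda_j / n is the fraction of total variance carried by the j-th principal component.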

11 Principal Components Analysis Credit: Dave Garen

12 Principal Components Analysis The eigenvectors (weights) are based solely on the intercorrelations among the X’s and have no knowledge of Y (in contrast to Z-score, for which the opposite is true). Principal components can be used for purely descriptive purposes, but we want to use them as independent variables in a regression. Credit: Dave Garen

13 Credit: Dennis Hartmann

14 Principal Components Analysis -- Example
Independent Variables:
X1 – X5: Snow water equivalent at 5 stations
X6 – X10: Water year to date precipitation at 5 stations
X11: Antecedent streamflow
X12: Climate teleconnection index
Credit: Dave Garen

15 Correlation Matrix
      X1   X2   X3   X4   X5   X6   X7   X8   X9   X10  X11  X12  Y
X1    1.0  .72  .67  .76  .81  .54  .31  .54  .38  .50  .18  .64  .65
X2         1.0  .67  .45  .80  .62  .45  .47  .31  .49  .14  .39  .60
X3              1.0  .49  .72  .84  .76  .86  .68  .85  .48  .56  .80
X4                   1.0  .62  .42  .26  .36  .56  .38  .28  .59  .68
X5                        1.0  .62  .49  .51  .44  .62  .32  .59  .73
X6                             1.0  .93  .87  .83  .90  .63  .43  .85
X7                                  1.0  .82  .85  .90  .67  .32  .76
X8                                       1.0  .74  .84  .64  .39  .70
X9                                            1.0  .80  .70  .49  .84
X10                                                1.0  .64  .46  .79
X11                                                     1.0  .36  .51
X12                                                          1.0  .64
Credit: Dave Garen

16 First Five Eigenvectors
          PC 1    PC 2    PC 3    PC 4    PC 5
X1       0.265   0.444   0.004   0.074  -0.104
X2       0.249   0.325  -0.483  -0.030   0.315
X3       0.335   0.016  -0.178   0.149  -0.314
X4       0.229   0.353   0.456  -0.595  -0.009
X5       0.287   0.332  -0.148   0.120   0.412
X6       0.339  -0.168  -0.162  -0.106  -0.040
X7       0.308  -0.329  -0.150  -0.058  -0.015
X8       0.317  -0.197  -0.114   0.027  -0.261
X9       0.304  -0.240   0.299  -0.313  -0.103
X10      0.330  -0.197           0.072  -0.129
X11      0.235  -0.349   0.351   0.168   0.692
X12      0.232   0.262   0.473   0.675  -0.212
% var.    62.7    15.8     7.8     3.8     3.2
Credit: Dave Garen
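Eigenvectors and percent-of-variance figures of this kind can be computed from any predictor matrix. The following is a minimal sketch on synthetic data, not the data behind the slide; the function name `principal_components` and the four made-up predictors are assumptions for illustration.

```python
import numpy as np

def principal_components(X):
    """Eigenvectors, % variance, and PC time series from the correlation matrix of X.

    X has one row per year and one column per predictor.
    """
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)          # eigh: R is symmetric
    order = np.argsort(eigvals)[::-1]             # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    pct_var = 100.0 * eigvals / eigvals.sum()
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized predictors
    pcs = Z @ eigvecs                             # each PC is a weighted sum of all X's
    return eigvecs, pct_var, pcs

# Synthetic predictors that share a common signal, so they are intercorrelated
rng = np.random.default_rng(0)
shared = rng.normal(size=(50, 1))
X = shared + 0.4 * rng.normal(size=(50, 4))

eigvecs, pct_var, pcs = principal_components(X)
pc_corr = np.corrcoef(pcs, rowvar=False)
print(np.round(pct_var, 1))
print(np.round(pc_corr, 3))   # off-diagonal entries are zero: PCs are uncorrelated
```

The sign of each eigenvector returned by `np.linalg.eigh` is arbitrary, so a column may come out with all signs flipped relative to another tool's output.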

17 Principal Components Regression Procedure
1) Try the PC’s in order
2) Test for regression coefficient significance (t-test)
3) Stop at first insignificant component
4) Transform regression coefficients to be in terms of original variables
5) Sign test – coefficient signs must be same as correlation with Y
Credit: Dave Garen
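The procedure above can be sketched as follows. This is an illustrative outline on synthetic data, not the VIPER implementation: the function name `pc_regression`, the fixed threshold `t_crit=2.0` (a rough stand-in for a proper t-distribution critical value), and the test data are all assumptions.

```python
import numpy as np

def pc_regression(X, y, t_crit=2.0):
    """Principal components regression with in-order PC selection."""
    n, p = X.shape
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]   # PC 1 first
    pcs = Z @ eigvecs

    def fit(k):
        # Ordinary least squares of y on an intercept and the first k PCs
        A = np.column_stack([np.ones(n), pcs[:, :k]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return A, coef

    n_pcs = 0
    for k in range(1, min(p, n - 2) + 1):             # steps 1-3
        A, coef = fit(k)
        resid = y - A @ coef
        sigma2 = resid @ resid / (n - k - 1)
        std_err = np.sqrt(sigma2 * np.linalg.inv(A.T @ A)[k, k])
        if abs(coef[k] / std_err) < t_crit:           # newest PC not significant
            break
        n_pcs = k

    A, coef = fit(n_pcs)
    # Step 4: express the model in terms of the original X variables
    b = (eigvecs[:, :n_pcs] @ coef[1:]) / std
    b0 = coef[0] - mean @ b
    # Step 5: each coefficient must share the sign of that X's correlation with y
    corr_with_y = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
    sign_ok = bool(np.all(np.sign(b) == np.sign(corr_with_y)))
    return {"n_pcs": n_pcs, "b0": b0, "b": b, "sign_ok": sign_ok, "fitted": A @ coef}

# Synthetic example: four intercorrelated predictors driven by one common signal
rng = np.random.default_rng(1)
common = rng.normal(size=40)
X = common[:, None] + 0.3 * rng.normal(size=(40, 4))
y = 10.0 + 2.0 * common + 0.5 * rng.normal(size=40)

result = pc_regression(X, y)
print(result["n_pcs"], result["b0"], result["b"], result["sign_ok"])
```

The slides do not say what to do when the sign test fails, so the sketch only reports the outcome in `sign_ok`.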

18 Summary
Principal components analysis is a standard multivariate statistical procedure
Can be used for descriptive purposes to reduce the dimensionality of correlated variables
Can be taken a step further to provide new, non-correlated independent variables for regression
PC’s taken in order, subject to t-test and sign test
Final model is expressed in terms of original X variables
Credit: Dave Garen

19 Soil Moisture at the interannual timescale
Another example demonstrating importance of land surface processes in the climate system: Werner, 1999:
– GCM run with and without active land surface model in South America to explore the importance of land surface processes in the climate system variability in the Nordeste region.
– Both simulations include full atmospheric model, slab ocean model (no ocean dynamics), and dynamic land surface model everywhere except tropical South America in the Data Land simulation.

20 Soil Moisture at the interannual timescale: Modeled variability
– Full dynamic land surface model simulation contains variability resembling observed variability with connection between NH and SH SSTs.
– Fixed land surface model shows no connected variability between NH and SH SSTs.

21 Resources: Dave Garen VIPER slides; Dennis Hartmann lecture notes (http://www.atmos.washington.edu/~dennis/)

22 What does z-score regression do? 1. Combines predictors into weighted indices, emphasizing good stations, minimizing bad ones. 2. Compensates for missing data with remaining data. 3. Regresses index against target predictand Credit: Tom Pagano
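A rough sketch of these three steps follows. The slides do not give the weighting formula, so the squared correlation of each station with the target is used here purely as an assumption; the function name `zscore_index` and the small data set are hypothetical as well.

```python
import numpy as np

def zscore_index(X, y):
    """Composite index per year: weighted mean of the available station z-scores.

    X may contain NaN for missing station-years.
    Weights are squared correlations with y (an assumed weighting, not VIPER's).
    """
    Z = (X - np.nanmean(X, axis=0)) / np.nanstd(X, axis=0, ddof=1)
    weights = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        ok = ~np.isnan(X[:, j])
        weights[j] = np.corrcoef(X[ok, j], y[ok])[0, 1] ** 2
    available = ~np.isnan(Z)
    weighted_sum = (np.where(available, Z, 0.0) * weights).sum(axis=1)
    weight_sum = (available * weights).sum(axis=1)
    # Missing stations drop out; remaining weights are renormalized for that year
    return weighted_sum / weight_sum

# Hypothetical data: 6 years, 2 stations, one missing value
X = np.array([[10.0, 20.0], [12.0, np.nan], [15.0, 30.0],
              [9.0, 18.0], [14.0, 27.0], [11.0, 22.0]])
y = np.array([100.0, 120.0, 150.0, 90.0, 140.0, 110.0])

index = zscore_index(X, y)
slope, intercept = np.polyfit(index, y, 1)   # step 3: regress target on the index
print(np.round(index, 2), slope, intercept)
```

Because the index is a single variable, the final regression has one predictor regardless of how many stations went into it.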

23-26 What is a z-score? A z-score is a “normalized anomaly”: Z = (value - average) / standard deviation. Example with avg = 60 and stdev = 15: Z = (90 - 60)/15 = +2 Credit: Tom Pagano
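The same arithmetic in code, as a minimal sketch (the function name `z_score` is just an illustrative choice):

```python
def z_score(value, average, stdev):
    """Normalized anomaly: how many standard deviations a value lies from the average."""
    return (value - average) / stdev

z = z_score(90, 60, 15)
print(z)  # 2.0
```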

27 How good are the results? Under conditions of serially complete data and relatively “normal” conditions, PCA and Z-Score are effectively indistinguishable*. Skill and behavior are similar to the official published outlooks**. However… Any tool is a weapon if you hold it right. (aka “A fool with a tool is still a tool”) * Viper technical note - 1 basin ** Pagano dissertation - 29 basins Credit: Tom Pagano

28 Super Quick Primer on VIPER

29-35 The Viper Main Interface: Layout and interpretation
Selecting predictors and predictands
Predictors quality, availability
Probability bounds
Forecast vs observed time series
Station availability, weights
Fcst vs obs scatterplot
Helper variable Scatterplot/Forecast progression
Settings
Global month changes
Historical statistics
There’s more if you scroll right: Relate any variable to another
Credit: Tom Pagano

