Published by Margaret Elliott. Modified over 9 years ago.
1
A Brief Introduction to Statistical Forecasting. Kevin Werner
2
Outline: Principal Component Theory, Applications, Z-Score, VIPER
3
Basic Forecast Methods. Statistical regression [figure: May 1 snowpack (% avg) vs. Apr-Jul streamflow (% avg), S Fork Rio Grande, Colo]. Simulation modeling [diagram labels: snow pack, soil water, snow, rainfall, runoff, heat]. Credit: Tom Pagano
4
The General Linear Regression Model: $Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$, where: $Y$ = dependent variable; $X_i$ = independent variables; $b_i$ = regression coefficients; $n$ = number of independent variables. Credit: Dave Garen
5
The Problem If X’s are intercorrelated, they contain redundant information, and the b’s cannot be meaningfully estimated. However, we don’t want to have to throw out most of the X’s but prefer to retain them for robustness. Credit: Dave Garen
6
Example: Streamflow = b0 + b1 * (Snotel A) + b2 * (Snotel B). The two Snotel sites are very well correlated, so an optimal b1 and b2 will be difficult to determine.
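A minimal sketch of this problem on synthetic data (the station values, noise levels, and the function name are hypothetical, not taken from the slides). Two nearly identical stations are refit many times: b1 and b2 individually swing widely, while their sum stays stable.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_two_station_model(n_years=30):
    """Fit Streamflow = b0 + b1*A + b2*B on synthetic, highly correlated stations."""
    snotel_a = rng.normal(100.0, 20.0, n_years)
    # Assumption: station B tracks station A closely, mimicking well-correlated sites.
    snotel_b = snotel_a + rng.normal(0.0, 1.0, n_years)
    streamflow = 10.0 + 0.5 * snotel_a + 0.5 * snotel_b + rng.normal(0.0, 5.0, n_years)
    design = np.column_stack([np.ones(n_years), snotel_a, snotel_b])
    coeffs, *_ = np.linalg.lstsq(design, streamflow, rcond=None)
    return coeffs  # b0, b1, b2

# Refit on 200 independent synthetic records and compare the spread of the estimates.
fits = np.array([fit_two_station_model() for _ in range(200)])
b1, b2 = fits[:, 1], fits[:, 2]
print("std of b1:", b1.std(), " std of b2:", b2.std())
print("std of b1 + b2:", (b1 + b2).std())
```

The data constrain the combined effect of the two stations well, but not how that effect is split between them.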
7
The Solution Possibilities: 1) Pre-combine X’s into composite index(es), e.g., Z-score method 2) Principal components regression These are similar in concept but differ in the mathematics. Credit: Dave Garen
8
Principal Components Analysis Principal components regression is just like standard regression except the independent variables are principal components rather than the original X variables. Principal components are linear combinations of the X’s. Credit: Dave Garen
9
Principal Components Analysis Each principal component is a weighted sum of all the X’s:... Credit: Dave Garen
10
Principal Components Analysis The e’s are called eigenvectors, derived from a matrix equation whose input is the correlation matrix of all the X’s with each other. Principal components are new variables that are not correlated with each other. The principal components transformation is equivalent to a rotation of axes. Credit: Dave Garen
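A minimal sketch of the transformation described above, assuming NumPy and synthetic predictors (the function name and data are hypothetical). It takes the eigenvectors of the correlation matrix, forms each principal component as a weighted sum of the standardized X's, and checks that the resulting components are uncorrelated.

```python
import numpy as np

def principal_components(x):
    """Return eigenvalues, eigenvectors, and PC scores from the correlation matrix of x.

    x has shape (n_samples, n_variables).
    """
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)   # standardize each X
    corr = np.corrcoef(x, rowvar=False)                 # correlation matrix of the X's
    eigvals, eigvecs = np.linalg.eigh(corr)             # symmetric eigenproblem
    order = np.argsort(eigvals)[::-1]                   # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = z @ eigvecs                                # each PC is a weighted sum of the X's
    return eigvals, eigvecs, scores

rng = np.random.default_rng(1)
common = rng.normal(size=(40, 1))
x = common + 0.3 * rng.normal(size=(40, 3))   # three synthetic, intercorrelated predictors

eigvals, eigvecs, scores = principal_components(x)
pc_corr = np.corrcoef(scores, rowvar=False)
print("percent variance:", 100 * eigvals / eigvals.sum())
print("max off-diagonal PC correlation:", np.abs(pc_corr - np.eye(3)).max())
```

Because the eigenvector matrix is orthogonal, multiplying by it is the rotation of axes mentioned above.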
11
Principal Components Analysis Credit: Dave Garen
12
Principal Components Analysis The eigenvectors (weights) are based solely on the intercorrelations among the X’s and have no knowledge of Y (in contrast to Z-score, for which the opposite is true). Principal components can be used for purely descriptive purposes, but we want to use them as independent variables in a regression. Credit: Dave Garen
13
Credit: Dennis Hartmann
14
Principal Components Analysis -- Example. Independent variables: X1-X5 = snow water equivalent at 5 stations; X6-X10 = water year to date precipitation at 5 stations; X11 = antecedent streamflow; X12 = climate teleconnection index. Credit: Dave Garen
15
Correlation Matrix (upper triangle)
      X1   X2   X3   X4   X5   X6   X7   X8   X9  X10  X11  X12    Y
X1   1.0  .72  .67  .76  .81  .54  .31  .54  .38  .50  .18  .64  .65
X2        1.0  .67  .45  .80  .62  .45  .47  .31  .49  .14  .39  .60
X3             1.0  .49  .72  .84  .76  .86  .68  .85  .48  .56  .80
X4                  1.0  .62  .42  .26  .36  .56  .38  .28  .59  .68
X5                       1.0  .62  .49  .51  .44  .62  .32  .59  .73
X6                            1.0  .93  .87  .83  .90  .63  .43  .85
X7                                 1.0  .82  .85  .90  .67  .32  .76
X8                                      1.0  .74  .84  .64  .39  .70
X9                                           1.0  .80  .70  .49  .84
X10                                               1.0  .64  .46  .79
X11                                                    1.0  .36  .51
X12                                                         1.0  .64
Credit: Dave Garen
16
First Five Eigenvectors
           PC 1    PC 2    PC 3    PC 4    PC 5
X1        0.265   0.444   0.004   0.074  -0.104
X2        0.249   0.325  -0.483  -0.030   0.315
X3        0.335   0.016  -0.178   0.149  -0.314
X4        0.229   0.353   0.456  -0.595  -0.009
X5        0.287   0.332  -0.148   0.120   0.412
X6        0.339  -0.168  -0.162  -0.106  -0.040
X7        0.308  -0.329  -0.150  -0.058  -0.015
X8        0.317  -0.197  -0.114   0.027  -0.261
X9        0.304  -0.240   0.299  -0.313  -0.103
X10       0.330  -0.197   [gap]   0.072  -0.129
X11       0.235  -0.349   0.351   0.168   0.692
X12       0.232   0.262   0.473   0.675  -0.212
% var.     62.7    15.8     7.8     3.8     3.2
Credit: Dave Garen
17
Principal Components Regression Procedure: 1) Try the PCs in order. 2) Test each regression coefficient for significance (t-test). 3) Stop at the first insignificant component. 4) Transform the regression coefficients to be in terms of the original variables. 5) Sign test: each coefficient's sign must be the same as the sign of that variable's correlation with Y. Credit: Dave Garen
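The procedure above can be sketched as follows. This is an illustration under stated assumptions, not the VIPER implementation: the function name `pcr`, the synthetic data, and the fixed cutoff |t| >= 2 (a rough stand-in for a proper t-test at about the 5% level) are all hypothetical, and the sign test is only reported, since the slide does not say what to do when it fails.

```python
import numpy as np

def pcr(x, y, t_crit=2.0):
    """Principal components regression; returns b0, b, components kept, sign-test result."""
    n, p = x.shape
    mean, std = x.mean(axis=0), x.std(axis=0, ddof=1)
    z = (x - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(x, rowvar=False))
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]     # PCs in order of variance
    pcs = z @ eigvecs

    kept, gamma = 0, np.array([y.mean()])
    for k in range(1, p + 1):                           # try the PCs in order
        design = np.column_stack([np.ones(n), pcs[:, :k]])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        sigma2 = resid @ resid / (n - k - 1)
        cov = sigma2 * np.linalg.inv(design.T @ design)
        t_stat = coef[k] / np.sqrt(cov[k, k])           # t statistic of the newest PC
        if abs(t_stat) < t_crit:                        # stop at first insignificant PC
            break
        kept, gamma = k, coef

    # Transform PC coefficients back to the original (unstandardized) variables.
    b = (eigvecs[:, :kept] @ gamma[1:]) / std
    b0 = gamma[0] - b @ mean
    # Sign test: each coefficient must share the sign of its variable's correlation with Y.
    r_xy = np.array([np.corrcoef(x[:, i], y)[0, 1] for i in range(p)])
    signs_ok = bool(np.all(np.sign(b) == np.sign(r_xy)))
    return b0, b, kept, signs_ok

rng = np.random.default_rng(2)
signal = rng.normal(size=40)
x = signal[:, None] + 0.3 * rng.normal(size=(40, 4))    # four correlated predictors
y = 2.0 * signal + 0.5 * rng.normal(size=40)
b0, b, kept, signs_ok = pcr(x, y)
print("components kept:", kept)
print("b0:", b0, "b:", b)
print("sign test passed:", signs_ok)
```

Because the back-transformed coefficients reproduce the fitted values of the PC regression exactly, the final model can be applied directly to the original X variables.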
18
Summary: Principal components analysis is a standard multivariate statistical procedure. It can be used for descriptive purposes to reduce the dimensionality of correlated variables. It can be taken a step further to provide new, uncorrelated independent variables for regression. PCs are taken in order, subject to a t-test and a sign test. The final model is expressed in terms of the original X variables. Credit: Dave Garen
19
Soil Moisture at the interannual timescale. Another example demonstrating the importance of land surface processes in the climate system: Werner, 1999: – GCM run with and without an active land surface model in South America to explore the importance of land surface processes for climate system variability in the Nordeste region. – Both simulations include a full atmospheric model, a slab ocean model (no ocean dynamics), and a dynamic land surface model everywhere except tropical South America in the Data Land simulation.
20
Soil Moisture at the interannual timescale. Modeled variability: – The full dynamic land surface model simulation contains variability resembling observed variability, with a connection between NH and SH SSTs. – The fixed land surface model shows no connected variability between NH and SH SSTs.
21
Resources: Dave Garen VIPER slides; Dennis Hartmann lecture notes (http://www.atmos.washington.edu/~dennis/)
22
What does z-score regression do? 1. Combines predictors into weighted indices, emphasizing good stations, minimizing bad ones. 2. Compensates for missing data with remaining data. 3. Regresses index against target predictand Credit: Tom Pagano
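A minimal sketch of these three steps, assuming NumPy and synthetic data. The slides do not give the weighting formula, so the squared correlation of each station with the target is used here purely as an assumed stand-in for "emphasizing good stations"; it also assumes every predictor is positively related to the target. The function name `z_score_index` is hypothetical.

```python
import numpy as np

def z_score_index(x, y):
    """Weighted composite of station z-scores; NaN marks missing data."""
    z = (x - np.nanmean(x, axis=0)) / np.nanstd(x, axis=0, ddof=1)
    weights = np.empty(x.shape[1])
    for i in range(x.shape[1]):
        ok = ~np.isnan(x[:, i])
        # Assumed weighting: squared correlation between station i and the target.
        weights[i] = np.corrcoef(x[ok, i], y[ok])[0, 1] ** 2
    available = ~np.isnan(z)
    weighted = np.where(available, z * weights, 0.0)
    # Renormalize by the weights of the stations actually present in each year.
    index = weighted.sum(axis=1) / (available * weights).sum(axis=1)
    return index, weights

rng = np.random.default_rng(3)
signal = rng.normal(size=30)
x = 50.0 + 10.0 * (signal[:, None] + 0.4 * rng.normal(size=(30, 3)))
x[2, 0] = np.nan                      # two missing station-years
x[7, 1] = np.nan
y = 100.0 + 20.0 * signal + 5.0 * rng.normal(size=30)

index, weights = z_score_index(x, y)
slope, intercept = np.polyfit(index, y, 1)   # regress the target on the index
print("weights:", weights)
print("forecast equation: y =", intercept, "+", slope, "* index")
```

Years with a missing station still get an index value, because the remaining stations' weights are renormalized to sum to one.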
23
What is a z-score? A z-score is a “normalized anomaly”: $Z = (\text{value} - \text{average}) / \text{standard deviation}$. Credit: Tom Pagano
25
What is a z-score? A z-score is a “normalized anomaly”: $Z = (\text{value} - \text{average}) / \text{standard deviation}$. [Figure labels: 60, 135; avg, stdev; 30, 15] Credit: Tom Pagano
26
What is a z-score? A z-score is a “normalized anomaly”: $Z = (\text{value} - \text{average}) / \text{standard deviation}$. [Figure labels: 60, 135; avg, stdev; 30, 15] Example: $Z = (90 - 60)/15 = +2$. Credit: Tom Pagano
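The worked example can be reproduced in a few lines. The three-value historical record below is hypothetical, chosen only because its average is 60 and its sample standard deviation is 15, matching the numbers on the slide.

```python
import statistics

def z_score(value, average, stdev):
    """Normalized anomaly: distance from the average in standard deviations."""
    return (value - average) / stdev

record = [45, 60, 75]                 # hypothetical historical values
average = statistics.mean(record)     # 60
stdev = statistics.stdev(record)      # 15.0 (sample standard deviation)

z = z_score(90, average, stdev)
print(z)  # 2.0
```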
27
How good are the results? Under conditions of serially complete data and relatively “normal” conditions, PCA and Z-Score are effectively indistinguishable*. Skill and behavior are similar to the official published outlooks**. However… Any tool is a weapon if you hold it right. (aka “A fool with a tool is still a tool”) *Viper technical note: 1 basin. **Pagano dissertation: 29 basins. Credit: Tom Pagano
28
Super Quick Primer on VIPER
29
The Viper Main Interface: layout and interpretation. Credit: Tom Pagano
30
The Viper Main Interface: layout and interpretation. Selecting predictors and predictands; Global month changes. Credit: Tom Pagano
31
The Viper Main Interface: layout and interpretation. Selecting predictors and predictands; Predictor quality, availability; Global month changes; Historical statistics. Credit: Tom Pagano
32
The Viper Main Interface: layout and interpretation. Selecting predictors and predictands; Predictor quality, availability; Forecast vs observed time series; Station availability, weights; Global month changes; Historical statistics. Credit: Tom Pagano
33
The Viper Main Interface: layout and interpretation. Selecting predictors and predictands; Predictor quality, availability; Forecast vs observed time series; Station availability, weights; Fcst vs obs scatterplot; Helper variable Scatterplot/Forecast progression; Global month changes; Historical statistics. Credit: Tom Pagano
34
The Viper Main Interface: layout and interpretation. Selecting predictors and predictands; Predictor quality, availability; Probability bounds; Forecast vs observed time series; Station availability, weights; Fcst vs obs scatterplot; Helper variable Scatterplot/Forecast progression; Settings; Global month changes; Historical statistics. Credit: Tom Pagano
35
The Viper Main Interface: layout and interpretation. Selecting predictors and predictands; Predictor quality, availability; Probability bounds; Forecast vs observed time series; Station availability, weights; Fcst vs obs scatterplot; Helper variable Scatterplot/Forecast progression; Settings; Global month changes; Historical statistics. There’s more if you scroll right: Relate any variable to another. Credit: Tom Pagano