Helper Variables Why do we want them? How do we create them? What to avoid with them? Tom Pagano 503-414-3010.

Presentation transcript:

Helper Variables Why do we want them? How do we create them? What to avoid with them? Tom Pagano

Why helper variables? The target (i.e. predictand) time series may have holes in important years or a short period of record. If that data is easily estimated, filling the gaps may lead to a better, or at least more honest, forecast.

Sargents is missing during a hydrologically interesting period. This is also the period covered by most of our predictors (i.e. SNOTEL). Gunnison could be used to fill in the gaps. Why? The strength of the correlation is very good.
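As a rough illustration of filling target gaps from a helper gage, the sketch below fits an ordinary least-squares line on the years where both records exist and uses it to estimate the missing target years. The numbers are made up, the function names are hypothetical, and the 0.9 threshold is only an assumption borrowed from the rule of thumb given later in this presentation; this is not necessarily how VIPER does it.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b, r2)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    r2 = sxy ** 2 / (sxx * syy)
    return a, b, r2


def fill_gaps(target, helper, min_r2=0.9):
    """Estimate missing (None) target values from the helper.

    The fit uses only years where both series are present. If the fit is
    weaker than min_r2 (an assumed threshold), the target is returned unchanged.
    """
    pairs = [(h, t) for h, t in zip(helper, target)
             if h is not None and t is not None]
    a, b, r2 = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    if r2 < min_r2:
        return list(target), r2
    filled = []
    for t, h in zip(target, helper):
        if t is not None:
            filled.append(t)            # keep observed values as they are
        elif h is not None:
            filled.append(a + b * h)    # estimate from the helper
        else:
            filled.append(None)         # nothing to estimate from
    return filled, r2


# Hypothetical seasonal volumes; None marks a missing year.
helper = [10, 20, 30, 40, 50]       # e.g. the downstream gage
target = [5, None, 15, 20, None]    # e.g. the gage with holes
filled, r2 = fill_gaps(target, helper)
```

Observed target values are never overwritten; only the holes are estimated, which keeps the number of estimated years easy to count.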

Another example… seasonally operated gages

Correlation between Mar-Sep and Apr-Sep = … There is no point in throwing away years where only March is missing.
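To make the seasonal-gage case concrete, the sketch below builds Mar-Sep and Apr-Sep totals from monthly volumes and correlates them over the years where every month is present. All values are invented for illustration, and the data layout (a dict of year to a Mar..Sep list) is an assumption, not a VIPER format.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical monthly volumes listed Mar..Sep; None = gage not operating.
monthly = {
    2001: [10, 30, 80, 120, 60, 30, 20],
    2002: [None, 20, 50, 70, 40, 20, 15],
    2003: [15, 40, 100, 150, 80, 40, 25],
    2004: [8, 25, 60, 90, 50, 25, 18],
    2005: [None, 35, 90, 130, 70, 35, 22],
}

# Only years with every month present can supply both seasonal totals.
complete = {yr: m for yr, m in monthly.items() if None not in m}
mar_sep = [sum(m) for m in complete.values()]
apr_sep = [sum(m[1:]) for m in complete.values()]   # drop March

r = pearson_r(mar_sep, apr_sep)
```

When r is this close to 1, the years missing only March (2002 and 2005 here) are good candidates for estimation from the Apr-Sep helper rather than being discarded.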

Helper variable interface: neat stuff here, but don’t touch if you don’t know what you’re doing. Default is unchecked.

Main ways to use helper variables: (1) Different station, same months (upstream vs downstream): estimating one gage from another. (2) Same station, different months (May-Jul vs Apr-Jul): estimating a longer time period from a shorter one. (3) Same station and months, different sources (USGS vs AWDB): estimating natural flow from observed.

Helper not used

Helper vs target scatterplot. Helper used: wider range of years, more stable relationship, more consistent with nearby forecasts.

Dangers of helper variables: Statistically, we do not include the imperfect relationship between helper and original target in the final forecast error bounds, so we are increasing our chances of overconfident forecasts. Therefore, it is best to estimate only a few years, and only if the relationship is very good (e.g. r² > 0.9). Consider too whether the relationship between the helper and the original target is stable over time. For example, if observed flow is used as a helper to estimate natural flow, have the regulations changed over time?
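One rough way to check whether the helper-target relationship is stable over time is to fit the regression separately on the early and late halves of the overlapping record and compare the slopes. The sketch below does that in plain Python; the years, flows, and the function name slope_drift are hypothetical, and how much drift is too much is a judgment call the presentation does not specify.

```python
def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def slope_drift(years, helper, target):
    """Compare regression slopes in the early and late halves of the overlap.

    Returns (early_slope, late_slope, relative_change).
    """
    rows = sorted((yr, h, t) for yr, h, t in zip(years, helper, target)
                  if h is not None and t is not None)
    mid = len(rows) // 2
    early, late = rows[:mid], rows[mid:]
    b_early = slope([r[1] for r in early], [r[2] for r in early])
    b_late = slope([r[1] for r in late], [r[2] for r in late])
    return b_early, b_late, (b_late - b_early) / b_early


# Hypothetical record: observed flow (helper) vs natural flow (target),
# with the relationship changing halfway through, as it might if
# reservoir regulation changed.
years = [1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988]
observed = [100, 200, 300, 400, 100, 200, 300, 400]
natural = [110, 210, 310, 410, 160, 310, 460, 610]

b_early, b_late, drift = slope_drift(years, observed, natural)
```

A large relative change between the two halves suggests the helper relationship fitted on one period should not be trusted to fill gaps in the other.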