Jery R. Stedinger Cornell University Research with G. Tasker, E. Martins, D. Reis, A. Gruber, V. Griffis, D.I. Jeong and Y.O. Kim SAMSI Workshop 23 January.


Regionalization of Statistics Describing the Distribution of Hydrologic Extremes
Jery R. Stedinger, Cornell University
Research with G. Tasker, E. Martins, D. Reis, A. Gruber, V. Griffis, D.I. Jeong and Y.O. Kim
SAMSI Workshop, 23 January 2008

Extreme Value Theory & Hydrology
The annual maximum flood may be a daily maximum or an instantaneous maximum. The annual maximum 24-hour rainfall may be a daily maximum or the maximum of 1440-minute values.
Annual maximums are not the maximum of an i.i.d. series: years have definite “wet” and “dry” seasons; daily values are correlated; and because of El Niño and atmospheric patterns, some years are prone to extreme events while others are not.
Partial duration series (PDS), also called peaks over threshold, are another alternative.

Outline
Summarizing data: moments and L-moments
Parameter estimation for GEV
– Use of a prior on $\kappa$
– PDS versus AMS with GMLEs
Bayesian GLS regression for regionalization
Concluding observations

Definitions: Product-Moments
Mean, measure of location: $\mu_x = E[X]$
Variance, measure of spread: $\sigma_x^2 = E[(X - \mu_x)^2]$
Coefficient of skewness, measure of asymmetry: $\gamma_x = E[(X - \mu_x)^3] / \sigma_x^3$

Conventional Moment Ratios
Conventional descriptions of shape are:
Coefficient of variation, CV: $\sigma / \mu$
Coefficient of skewness, $\gamma$: $E[(X - \mu)^3] / \sigma^3$
Coefficient of kurtosis: $E[(X - \mu)^4] / \sigma^4$

Samples drawn from a Gumbel distribution.

L-Moments An alternative to product moments now widely used in hydrology.

L-Moments: an alternative
L-moments can summarize data as conventional moments do, using linear combinations of the ordered observations. Because L-moments avoid squaring and cubing the data, their ratios do not suffer from the severe bias problems encountered with product moments. They are estimated using order statistics.

L-Moments: an alternative
Let $X_{(i|n)}$ be the ith smallest observation in a sample of size n, so that $X_{(n|n)}$ is the largest.
Measure of scale, based on the expected difference between the larger and smaller observations in a sample of size 2: $\lambda_2 = \frac{1}{2} E[X_{(2|2)} - X_{(1|2)}]$
Measure of asymmetry: $\lambda_3 = \frac{1}{3} E[X_{(3|3)} - 2X_{(2|3)} + X_{(1|3)}]$, where $\lambda_3 > 0$ for positively skewed distributions.

L-Moments: an alternative
Measure of kurtosis: $\lambda_4 = \frac{1}{4} E[X_{(4|4)} - 3X_{(3|4)} + 3X_{(2|4)} - X_{(1|4)}]$
For highly kurtotic distributions, $\lambda_4$ is large. For the uniform distribution, $\lambda_4 = 0$.

Dimensionless L-moment ratios
L-moment coefficient of variation (L-CV): $\tau_2 = \lambda_2 / \mu$
L-moment coefficient of skewness (L-skewness): $\tau_3 = \lambda_3 / \lambda_2$
L-moment coefficient of kurtosis (L-kurtosis): $\tau_4 = \lambda_4 / \lambda_2$
(Note: Hosking calls the L-CV $\tau$ instead of $\tau_2$.)
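
As a rough illustration of how the L-moments and their ratios might be estimated from a sample, the sketch below uses the unbiased probability-weighted-moment estimators. The function name `sample_lmoments` and the returned dictionary keys are illustrative choices, not taken from the slides.

```python
def sample_lmoments(data):
    """Estimate L-moments and L-moment ratios from a sample.

    Uses unbiased probability-weighted moments b0..b3 computed from the
    observations sorted in ascending order (a common textbook approach).
    """
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")
    b0 = b1 = b2 = b3 = 0.0
    for j, xj in enumerate(x):  # j = 0 is the smallest observation
        b0 += xj
        b1 += xj * j / (n - 1)
        b2 += xj * j * (j - 1) / ((n - 1) * (n - 2))
        b3 += xj * j * (j - 1) * (j - 2) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n
    l1 = b0                                   # mean
    l2 = 2 * b1 - b0                          # L-scale
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return {"l1": l1, "l2": l2, "l_cv": l2 / l1,
            "l_skew": l3 / l2, "l_kurt": l4 / l2}


# Evenly spaced (hypothetical) data: symmetric, so L-skewness is zero
print(sample_lmoments([1, 2, 3, 4, 5]))
```

Because the estimators are linear in the ordered observations, a single very large flood has far less leverage on the ratios than it would on the product-moment skewness.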

Samples drawn from a Gumbel distribution.

Generalized Extreme Value (GEV) distribution
Gumbel's Type I, II and III extreme value distributions:
$F(x) = \exp\{ -[1 - (\kappa/\alpha)(x - \xi)]^{1/\kappa} \}$ for $\kappa \ne 0$
$\kappa$ = shape, $\alpha$ = scale, $\xi$ = location. Mostly $-0.3 < \kappa \le 0$. [Others use the opposite sign convention for the shape parameter.]

GEV Prob. Density Function

GEV Prob. Density Function large x

Simple GEV L-Moment Estimators
Using L-moments (Hosking, Wallis & Wood, 1985):
$c = 2/(\tau_3 + 3) - \ln(2)/\ln(3)$, with $\tau_3 = \lambda_3 / \lambda_2$
$\kappa = 7.8590\,c + 2.9554\,c^2$, for $|\tau_3| \le 0.5$
$\alpha = \kappa \lambda_2 / [\Gamma(1+\kappa)(1 - 2^{-\kappa})]$
$\xi = \lambda_1 + \alpha[\Gamma(1+\kappa) - 1]/\kappa$
Quantiles: $x_p = \xi + (\alpha/\kappa)\{1 - [-\ln(p)]^{\kappa}\}$
The method of L-moments is simple and attractive.
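
The estimators on this slide translate into a few lines of code. The sketch below is one possible rendering: the function names are illustrative, the polynomial coefficients 7.8590 and 2.9554 are taken from Hosking, Wallis & Wood (1985) rather than from the slide text, and the Gumbel branch for $\kappa$ near zero is an added convenience.

```python
import math

LN2_OVER_LN3 = math.log(2.0) / math.log(3.0)
EULER_GAMMA = 0.5772156649015329


def gev_from_lmoments(l1, l2, t3):
    """Return (xi, alpha, kappa) from lambda_1, lambda_2 and L-skewness tau_3."""
    c = 2.0 / (t3 + 3.0) - LN2_OVER_LN3
    kappa = 7.8590 * c + 2.9554 * c * c      # approximation for |tau_3| <= 0.5
    if abs(kappa) < 1e-8:                     # Gumbel limit (added assumption)
        alpha = l2 / math.log(2.0)
        return l1 - EULER_GAMMA * alpha, alpha, 0.0
    g = math.gamma(1.0 + kappa)
    alpha = kappa * l2 / (g * (1.0 - 2.0 ** (-kappa)))
    xi = l1 + alpha * (g - 1.0) / kappa
    return xi, alpha, kappa


def gev_quantile(p, xi, alpha, kappa):
    """Quantile x_p with non-exceedance probability p."""
    y = -math.log(p)
    if abs(kappa) < 1e-8:
        return xi - alpha * math.log(y)
    return xi + (alpha / kappa) * (1.0 - y ** kappa)


# Hypothetical L-moments, for illustration only
xi, alpha, kappa = gev_from_lmoments(l1=100.0, l2=20.0, t3=0.25)
print(kappa, gev_quantile(0.99, xi, alpha, kappa))
```

Feeding the function the exact L-moments of a GEV with $\kappa$ = -0.1 should return a shape estimate within about 0.001 of the true value, which is the accuracy usually quoted for the polynomial approximation.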

Index Flood Methodology Research has demonstrated potential advantages of index flood procedures for combining regional and at-site data to improve the estimators at individual sites.

Hosking and Wallis (1997) Development of L-moments for regional flood frequency analysis. Research done in the period. J.R.M. Hosking and J.R. Wallis, Regional Frequency Analysis: An Approach Based on L- moments, Cambridge University Press, 1997.

Compute the regional average L-CV and L-CS, which yield the regional dimensionless quantile $y_p$.

Index Flood Methodology Use data from hydrologically "similar" basins to estimate a dimensionless flood distribution which is scaled by at-site sample mean. "Substitutes Space for Time" by using regional information to compensate for relatively short records at each site. Most of these studies have used the GEV distribution and L-moments or equivalent.
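
A minimal sketch of the index flood idea follows. It assumes record-length weighting for the regional averages (a common convention, not stated on the slide) and takes the dimensionless regional quantile $y_p$ as given; all names and numbers are hypothetical.

```python
def regional_average(values, record_lengths):
    """Record-length-weighted average of an at-site statistic (e.g. L-CV)."""
    total_years = sum(record_lengths)
    return sum(v * n for v, n in zip(values, record_lengths)) / total_years


def index_flood_quantile(site_mean, regional_growth_factor):
    """Scale the dimensionless regional quantile y_p by the at-site mean."""
    return site_mean * regional_growth_factor


# Hypothetical three-site region: at-site L-CVs and record lengths in years
regional_lcv = regional_average([0.30, 0.26, 0.34], [20, 35, 15])

# Hypothetical regional growth factor y_0.99 = 2.5 and at-site mean of 400
x_99 = index_flood_quantile(400.0, 2.5)
print(regional_lcv, x_99)
```

The regional L-CV and L-CS would normally be passed to a GEV fit of the dimensionless distribution to obtain $y_p$; that step is omitted here to keep the sketch short.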

Trouble with MLEs for GEV
Case: N = 15, X ~ GEV($\xi$ = 0, $\alpha$ = 1, $\kappa$ = –0.20)
MLE solution: $x_{0.999}$ = 14.9 (true) versus $\hat{x}_{0.999}$ = 6,000,000 (est.)

Parameter Estimators for 3-parameter GEV distribution
1. Maximum Likelihood (ML)
2. Method of Moments (MOM)
3. Method of L-moments (LM)
4. Generalized Maximum Likelihood (GML): introduces a prior distribution for $\kappa$ that ensures the estimator lies within (-0.5, +0.5) and encourages values within (-0.3, +0.1).
Martins, E.S., and J.R. Stedinger, Generalized Maximum Likelihood GEV quantile estimators for hydrologic data, Water Resour. Res., 36(3), 2000.
Or one can use a penalty to enforce the constraint that $\kappa$ > -1:
Coles, S.G., and M.J. Dixon, Likelihood-Based Inference for Extreme Value Models, Extremes 2:1, 5-23, 1999.

Prior distribution on GEV $\kappa$
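
To make the GML idea concrete, the sketch below adds the log of a prior on $\kappa$ to the GEV log-likelihood. The Beta(6, 9) prior rescaled to (-0.5, +0.5) is an assumption taken from the Martins & Stedinger paper cited above, not from the slide text; the function names and the data are hypothetical, and the maximization step (any numerical optimizer) is left out.

```python
import math

P, Q = 6.0, 9.0  # assumed Beta shape parameters (Martins & Stedinger, 2000)


def log_prior_kappa(kappa):
    """Log density of a Beta(P, Q) prior for kappa rescaled to (-0.5, 0.5)."""
    u = kappa + 0.5
    if not 0.0 < u < 1.0:
        return -math.inf
    log_beta = math.lgamma(P) + math.lgamma(Q) - math.lgamma(P + Q)
    return (P - 1) * math.log(u) + (Q - 1) * math.log(1 - u) - log_beta


def gev_log_likelihood(data, xi, alpha, kappa):
    """GEV log-likelihood, F(x) = exp{-[1 - kappa (x - xi)/alpha]^(1/kappa)}."""
    if alpha <= 0:
        return -math.inf
    total = -len(data) * math.log(alpha)
    for x in data:
        z = (x - xi) / alpha
        if abs(kappa) < 1e-8:                 # Gumbel limit
            y = z
        else:
            w = 1.0 - kappa * z
            if w <= 0:                        # observation outside the support
                return -math.inf
            y = -math.log(w) / kappa
        total += -(1.0 - kappa) * y - math.exp(-y)
    return total


def generalized_log_likelihood(data, xi, alpha, kappa):
    """Objective maximized by GML: log-likelihood plus log prior on kappa."""
    lp = log_prior_kappa(kappa)
    if lp == -math.inf:
        return -math.inf
    return gev_log_likelihood(data, xi, alpha, kappa) + lp


floods = [1.2, 0.4, 2.5, 0.9, 1.7]  # hypothetical annual maxima
print(generalized_log_likelihood(floods, 1.0, 1.0, -0.1))
```

Under this assumed prior the mean of $\kappa$ is -0.10, and any trial value outside (-0.5, +0.5) is rejected outright, which is what keeps absurd shape estimates from small samples out of the solution.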

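Performance of Alternative Estimators of $x_{0.99}$ for GEV distribution, n =
[Figure: RMSE of the ML, LM, MOM and GML estimators versus shape parameter $\kappa$]

Performance of Alternative Estimators of $x_{0.99}$ for GEV distribution, n = 100
[Figure: RMSE of the ML, LM, MOM and GML estimators versus shape parameter $\kappa$]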

GEV Estimators
In 1985, when Hosking, Wallis and Wood introduced L-moment (PWM) estimators for the GEV, they were much better than MLEs and quantile estimators.
In 1998, Madsen and Rosbjerg demonstrated that MOM estimators were not so bad, perhaps better than L-moments.
Finally, in 2000, Martins & Stedinger demonstrated that adding realistic control of the GEV shape parameter $\kappa$ yielded estimators that dominated the competition. The prior is a distribution providing a regional description of the shape parameter of modest accuracy.

Partial Duration or Annual Maximum Series: by seeing more little floods, do we know more about big floods?

Partial Duration Series (PDS) Peaks over threshold (POT)

Poisson/Pareto model for PDS
$\lambda$ = arrival rate for floods > $x_0$, which follow a Poisson process.
$G(x) = \Pr[X \le x]$ for peaks over threshold $x > x_0$ is a Generalized Pareto distribution: $G(x) = 1 - \{1 - \kappa[(x - x_0)/\alpha]\}^{1/\kappa}$
Then annual maximums have a Generalized Extreme Value distribution: $F(x) = \exp\{ -(1 - \kappa[(x - \xi)/\alpha'])^{1/\kappa} \}$
with $\xi = x_0 + \alpha(1 - \lambda^{-\kappa})/\kappa$, $\alpha' = \alpha\lambda^{-\kappa}$, and the same $\kappa$.
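
The link between the two models can be checked numerically. Under Poisson arrivals the annual-maximum distribution is $\exp\{-\lambda[1 - G(x)]\}$, and the sketch below confirms that this matches a GEV with location $x_0 + \alpha(1 - \lambda^{-\kappa})/\kappa$, scale $\alpha\lambda^{-\kappa}$ and the same shape. Function names and parameter values are hypothetical, and $\kappa \ne 0$ is assumed.

```python
import math


def gp_cdf(x, x0, alpha, kappa):
    """Generalized Pareto CDF for peaks over the threshold x0 (kappa != 0)."""
    return 1.0 - (1.0 - kappa * (x - x0) / alpha) ** (1.0 / kappa)


def gev_cdf(x, xi, alpha, kappa):
    """GEV CDF in the same sign convention (kappa != 0)."""
    return math.exp(-(1.0 - kappa * (x - xi) / alpha) ** (1.0 / kappa))


def pds_to_ams(lam, x0, alpha, kappa):
    """Map Poisson/GP parameters to the GEV parameters of annual maxima."""
    xi = x0 + alpha * (1.0 - lam ** (-kappa)) / kappa
    alpha_ams = alpha * lam ** (-kappa)
    return xi, alpha_ams, kappa


lam, x0, alpha, kappa = 5.0, 10.0, 2.0, -0.2   # hypothetical values
xi, alpha_ams, _ = pds_to_ams(lam, x0, alpha, kappa)
for x in (11.0, 15.0, 30.0):
    annual_max = math.exp(-lam * (1.0 - gp_cdf(x, x0, alpha, kappa)))
    print(x, annual_max, gev_cdf(x, xi, alpha_ams, kappa))
```

The two printed probabilities agree to rounding error for every x above the threshold.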

Which is more precise: AMS or PDS?
Consider the case where only 2 parameters are estimated. Fix $\kappa$ = 0, corresponding to Poisson arrivals with exponential exceedances: the Shane & Lynn (1964) model for flood risk.

Poisson Arrivals with Exponential Exceedances ($\kappa$ = 0)
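
For this two-parameter case the annual maximum follows a Gumbel distribution, $F(x) = \exp\{-\lambda \exp[-(x - x_0)/\alpha]\}$, so the T-year flood has a closed form. The sketch below assumes a known arrival rate, threshold and mean exceedance; the names and values are hypothetical.

```python
import math


def t_year_flood(T, lam, x0, alpha):
    """T-year flood for Poisson arrivals (rate lam per year) with
    exponentially distributed exceedances (mean alpha) above threshold x0."""
    p = 1.0 - 1.0 / T                       # annual non-exceedance probability
    xi = x0 + alpha * math.log(lam)         # Gumbel location of annual maxima
    return xi - alpha * math.log(-math.log(p))


def annual_max_cdf(x, lam, x0, alpha):
    """CDF of the annual maximum under the same model."""
    return math.exp(-lam * math.exp(-(x - x0) / alpha))


x100 = t_year_flood(100, lam=5.0, x0=10.0, alpha=2.0)
print(x100, annual_max_cdf(x100, 5.0, 10.0, 2.0))
```

The second printed value should equal 0.99, confirming that the quantile and the distribution function are consistent.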

Which is more precise: AMS (GEV) or PDS (GP)?
Now estimate 3 parameters using PDS data, employing XXX = MOM, L-moments (LM) and GML with the Generalized Pareto distribution, and compare the RMSE of PDS-XXX to the RMSE of the AMS-GMLE GEV estimator:
RMSE-ratio = RMSE(PDS-XXX) / RMSE(AMS-GMLE)

RMSE of 3 PDS estimators vs AMS-GML, $\lambda$ = 5 events/year
[Figure: RMSE ratio PDS/AMS-GMLE versus shape parameter $\kappa$]

RMSE of 3 PDS estimators vs AMS-GML, $\kappa$ = –0.30
[Figure: RMSE ratio PDS/AMS-GMLE versus $\lambda$, events per year]

Conclusions: PDS versus AMS
For $\kappa$ < 0, with PDS data, GML quantile estimators are again generally better than MOM, LM and ML.
The precision of GML quantile estimators is insensitive to $\lambda$.
A year of PDS data is generally worth a year of AMS data for estimating the 100-year flood when employing the GMLE estimators of GP and GEV parameters: more little floods do not tell us about the distribution of large floods.

GLS Regression for Regional Analyses
GOAL: Obtain efficient estimators of the mean, standard deviation, T-yr flood, or GEV parameters as a function of physiographic basin characteristics, and provide the precision of that estimator.
MODEL: log[Statistic-of-interest] = $\beta_0 + \beta_1 \log(\text{Area}) + \beta_2 \log(\text{Slope})$ + Error

GLS Analysis: Complications
With available records, we only obtain sample estimates of the Statistic-of-Interest, denoted $\hat{y}_i$.
Total error $\varepsilon_i$ is a combination of:
(i) time-sampling error $\eta_i$ in the sample estimators $\hat{y}_i$, which are often cross-correlated, and
(ii) underlying model error $\delta_i$ (true lack of fit).
The variance of those errors about the prediction $X\beta$ depends on the statistics-of-interest at each site.
$\hat{y} = X\beta + \varepsilon$, with total error $\varepsilon = \eta + \delta$ (sampling error plus model error).

GLS for Regionalization
Use available record lengths $n_i$, concurrent record lengths $m_{ij}$, regional estimates of standard deviations $\sigma_i$ (or $\tau_{2i}$, $\tau_{3i}$), and cross-correlations $\rho_{ij}$ of floods to estimate the variances and cross-correlations of $\eta$ describing errors in $\hat{y}_i$.
With true model error variance $\sigma_\delta^2$, determine the covariance matrix $\Lambda(\sigma_\delta^2)$ of residual errors:
$\Lambda(\sigma_\delta^2) = \sigma_\delta^2 I + \Sigma(\hat{y})$
where $\Sigma(\hat{y})$ is the covariance matrix of the estimator $\hat{y}$.

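GLS Analysis: Solution
GLS regression model (Stedinger & Tasker, 1985, 1989): $\hat{y} = X\beta + \varepsilon$, with parameter estimator b for $\beta$:
$\{X^T \Lambda(\sigma_\delta^2)^{-1} X\}\, b = X^T \Lambda(\sigma_\delta^2)^{-1} \hat{y}$
Can estimate the model-error variance $\sigma_\delta^2$ using moments:
$(\hat{y} - Xb)^T \Lambda(\sigma_\delta^2)^{-1} (\hat{y} - Xb) = n - k$, with $\Lambda(\sigma_\delta^2) = \sigma_\delta^2 I + \Sigma(\hat{y})$
n = dimension of y; k = dimension of b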
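
The two equations on this slide can be solved together: for a trial model-error variance, solve the normal equations for b, then adjust the variance until the weighted residual sum of squares equals n - k. The sketch below does this with a small hand-written linear solver and bisection so that it needs no external libraries; the function names are illustrative, and a zero estimate is returned when even a zero model-error variance leaves the quadratic form below n - k.

```python
def solve(A, B):
    """Solve A X = B (lists of rows) by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [[v / M[i][i] for v in M[i][n:]] for i in range(n)]


def gls_fit(X, y, Sigma, s2):
    """Return (b, Q): GLS coefficients and the residual quadratic form
    for Lambda = s2 * I + Sigma."""
    n, k = len(X), len(X[0])
    Lam = [[Sigma[i][j] + (s2 if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    W = solve(Lam, [list(X[i]) + [y[i]] for i in range(n)])  # Lam^-1 [X | y]
    A = [[sum(X[i][p] * W[i][q] for i in range(n)) for q in range(k)]
         for p in range(k)]
    c = [[sum(X[i][p] * W[i][k] for i in range(n))] for p in range(k)]
    b = [row[0] for row in solve(A, c)]
    r = [y[i] - sum(X[i][p] * b[p] for p in range(k)) for i in range(n)]
    Lr = solve(Lam, [[ri] for ri in r])
    return b, sum(r[i] * Lr[i][0] for i in range(n))


def mom_model_error_variance(X, y, Sigma):
    """Method-of-moments estimate of the model-error variance (bisection)."""
    target = len(X) - len(X[0])               # n - k
    if gls_fit(X, y, Sigma, 0.0)[1] <= target:
        return 0.0                            # the zero-estimate case
    lo, hi = 0.0, 1.0
    while gls_fit(X, y, Sigma, hi)[1] > target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gls_fit(X, y, Sigma, mid)[1] > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Here `Sigma` must be positive definite, and the first column of `X` is expected to be a column of ones if a constant term is wanted. In practice a matrix library would replace the hand-written solver.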

Likelihood function for model error variance $\sigma_\delta^2$ (Tibagi River, Brazil, n = 17)
The maximum of the likelihood may be at zero, but larger values are very probable. Zero is clearly not in the middle of the likely range of values. The method of moments has the same problem: a zero estimate.

Advantages of Bayesian Analysis
Provides the posterior distribution of the parameters $\beta$ and the model error variance $\sigma_\delta^2$, and a predictive distribution for the dependent variable.
The Bayesian approach is a natural solution to the problem.

Bayesian GLS Model
Prior distribution on ($\beta$, $\sigma_\delta^2$):
- Parameters $\beta$ are multivariate normal
- Model error variance $\sigma_\delta^2$: Exponential dist. ( ); E[ $\sigma_\delta^2$ ] = = 24
Likelihood function: assume the data are multivariate N[ $X\beta$, $\Lambda$ ]

Quasi-Analytic Bayesian GLS
Joint posterior distribution of ($\beta$, $\sigma_\delta^2$)
Marginal posterior of $\sigma_\delta^2$, where the normal likelihood and prior are integrated analytically over $\beta$ to determine f in closed form.
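
The closed-form integration can be written out under one simplifying assumption that is not taken from the slides: a flat (noninformative) prior on $\beta$ in place of the multivariate normal prior. With $\pi(\sigma_\delta^2)$ standing for the prior on the model error variance (notation chosen here for illustration), a sketch of the result is:

```latex
\begin{align*}
f(\beta, \sigma_\delta^2 \mid \hat{y})
  &\propto \pi(\sigma_\delta^2)\, |\Lambda|^{-1/2}
     \exp\!\left\{ -\tfrac{1}{2} (\hat{y} - X\beta)^T \Lambda^{-1} (\hat{y} - X\beta) \right\},
     \qquad \Lambda = \sigma_\delta^2 I + \Sigma(\hat{y}) \\[4pt]
(\hat{y} - X\beta)^T \Lambda^{-1} (\hat{y} - X\beta)
  &= (\hat{y} - Xb)^T \Lambda^{-1} (\hat{y} - Xb)
     + (\beta - b)^T X^T \Lambda^{-1} X\, (\beta - b),
     \qquad b = (X^T \Lambda^{-1} X)^{-1} X^T \Lambda^{-1} \hat{y} \\[4pt]
f(\sigma_\delta^2 \mid \hat{y})
  &\propto \pi(\sigma_\delta^2)\, |\Lambda|^{-1/2}\, |X^T \Lambda^{-1} X|^{-1/2}
     \exp\!\left\{ -\tfrac{1}{2} (\hat{y} - Xb)^T \Lambda^{-1} (\hat{y} - Xb) \right\}
\end{align*}
```

The second line is the usual completion of the square; integrating the Gaussian term in $\beta$ contributes the factor $|X^T \Lambda^{-1} X|^{-1/2}$. What remains is a function of the single variable $\sigma_\delta^2$, which is why only one-dimensional numerical integration is needed for the normalizing constant and the moments.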

Example of a posterior of $\sigma_\delta^2$ (Model 1, Tibagi, Brazil, n = 17)
[Figure: posterior density versus model error variance $\sigma_\delta^2$, marking the MM-GLS, MLE-GLS and Bayesian GLS estimates of $\sigma_\delta^2$]

Quasi-Analytic Result
From the joint posterior distribution, the marginal posterior of $\beta$ and its moments can be computed by 1-dimensional numerical integrations.

Bayesian GLS for Regionalization of Flood Characteristics in Korea
Dae Il Jeong, Post-doctoral Researcher, Cornell University
Jery R. Stedinger, Professor, Cornell University
Young-Oh Kim, Associate Professor, Seoul National University
Jang Hyun Sung, Graduate Student, Seoul National University

Korean River basins
Land area: 120,000 km²
Major river basins: Han, Nakdong, Geum
Total annual precipitation (TAP) = 1283 mm
Two thirds of TAP occurs during the 3-month flood season (Jul~Sep)
Available sites: 31
Average record length: 22 years
[Map labels: Han River Basin, Nakdong River Basin, Geum River Basin]

Korean Application
Regional estimators of L-CV $\tau_2$ and L-CS $\tau_3$ for flood frequency analysis using the GEV distribution.
6 explanatory variables:
- 2 indicators (Han-Nakdong-Geum basins)
- logs of drainage area
- logs of channel slope
- mean precipitation
- SD of annual maximum precipitation

Cross-correlation concurrent maxima

Monte Carlo results for cross-correlation of L-CS estimators, GEV+, when $\kappa$ = -0.3 and $\tau_2$ = 0.3
[Figure axes: $\rho_{xy}$, cross-correlation of annual maxima; cross-correlation of L-CS estimators]

Regression Results: L-CV
Columns: Model Name; Const.; Ln(Area); Mean Ppt; Model Error Var. $\sigma_\delta^2$; Avg Sampling Var.; AVP GLS; Pseudo R² (%); ERL (years)
B-GLS: (0.0306), (0.0033)
B-GLS: (0.0285), (0.0116), (0.0522), (0.0021); [0.1 %], [1.3 %]
Standard error in parentheses ( - ); p-value in brackets [ - ].

Performance Measures
Average Variance of Prediction (AVP): how well the model estimates the true value of the quantity of interest, on average across sites.
Pseudo R²: improvement of GLS(k) versus GLS(0).
Effective Record Length (ERL): relative uncertainty of the regional estimate compared to an at-site estimator.

Regression Results: L-CS $\tau_3$
Columns: Model Name; Const.; Ln(Area); Model Error Var. $\sigma_\delta^2$; Avg Sampling Var.; AVP; Pseudo R² (%); ERL (years)
B-GLS: (0.0538), (0.0056)
B-GLS: (0.0489), (0.0183), (0.0044); [0.6 %]
Standard error in parentheses ( - ); p-value in brackets [ - ].

Model Diagnostic Measures
Pseudo ANOVA table:
- Variation explained by the regional model
- Residual variation due to model errors
- Residual variation due to sampling errors
- Represents a partition of the TOTAL variation

Pseudo ANOVA Table for L-CV and L-CS
Source | Degrees of freedom | Sum of squares
Model | k = 1 or 2 | $n[\sigma_\delta^2(0) - \sigma_\delta^2(k)]$
Model error $\delta$ | n - k - 1 | $n\,\sigma_\delta^2(k)$
Sampling error $\eta$ | n |
Total | 2n - 1 |
Pseudo R²: 45 % (L-CV), 37 % (L-CS)
ERL (years): 21 (L-CV), 51 (L-CS)
We need GLS regression analysis.
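
The partition in this table can be computed directly once the model-error variances with and without explanatory variables are known. In the sketch below the sampling-error sum of squares is taken to be the trace of the sampling covariance matrix, an assumption borrowed from Griffis and Stedinger (2007) because the slide's own expression did not survive; the function name and all input values are hypothetical.

```python
def pseudo_anova(n, s2_null, s2_model, sampling_trace):
    """Pseudo ANOVA partition for a GLS regional regression.

    n               number of sites
    s2_null         model-error variance with a constant only, sigma_delta^2(0)
    s2_model        model-error variance with k explanatory variables
    sampling_trace  trace of the sampling covariance matrix (assumed form
                    of the sampling-error sum of squares)
    """
    model = n * (s2_null - s2_model)
    model_error = n * s2_model
    return {
        "model": model,
        "model_error": model_error,
        "sampling_error": sampling_trace,
        "total": model + model_error + sampling_trace,
        "pseudo_r2": 1.0 - s2_model / s2_null,
    }


# Hypothetical inputs, not the Korean results
table = pseudo_anova(n=31, s2_null=0.010, s2_model=0.006, sampling_trace=0.155)
print(table)
```

The pseudo R² uses only the model-error variances, so it measures how much of the explainable variation the regression captures rather than penalizing the model for sampling error it cannot explain.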

Conclusion: Value in Korea
The regional estimator for the L-coefficient of variation should be combined with its at-site estimator: ERL($\tau_2$) = 21 years ≈ average record length (22 yrs).
The regional estimator for L-skewness was more precise than at-site estimators: ERL($\tau_3$) = 51 years > average record length (22 yrs).
It is clearly advantageous to use BOTH regional and at-site information in the analysis of annual maxima.

Diagnostic Statistics
Statistics for evaluating data concerns, precision of predicted values, sources of variation, and model adequacy:
- Leverage and Influence
- Measures of Prediction Precision
- Pseudo R² and ANOVA
- Modeling Diagnostics: EVR & MBV
- Bayesian Plausibility Level

Bayesian Hierarchical Model: Solve whole problem at once?
Assume values for each site i, for i = 1, …, K:
$X_{it} \sim \text{GEV}(\xi_i, \alpha_i, \kappa_i)$, t = 1, …, $n_i$
where for the parameters we have
$\xi_i \sim N(\mu_\xi, \sigma_\xi^2)$
$\alpha_i \sim N(\mu_\alpha, \sigma_\alpha^2)$, where perhaps $\alpha_i$ is replaced by the ratio $\alpha_i/\xi_i$ or a coefficient of variation
$\kappa_i \sim N(\mu_\kappa, \sigma_\kappa^2)$
with priors on ($\mu_\xi$, $\sigma_\xi^2$); ($\mu_\alpha$, $\sigma_\alpha^2$); ($\mu_\kappa$, $\sigma_\kappa^2$), whose values for each site i may depend on at-site physiographic characteristics of that site.
Ignores cross-correlations: need a multivariate model for K variates? Beware of special cases and lack of fit.

Concluding Remarks
The GEV distribution is used by many water agencies and countries to describe the distribution of extremes.
L-moments provide simple estimators, but they are not efficient.
Generalized Maximum Likelihood Estimators [GMLEs] (modest prior on $\kappa$) solve problems with MLEs and were the most precise.
PDS (GPD-Poisson) is no better than AMS (GEV) when estimating three parameters with GMLE.

Final Comments
Regional regression procedures should account for the precision of at-site estimators and their cross-correlations, as can be done with Generalized Least Squares regression. Otherwise estimates of model accuracy and of the precision of parameter estimates will be in error.
When the model error variance is small relative to errors in estimated hydrologic statistics, the Bayesian model error variance estimator is particularly attractive.

Hosking and Wallis (1997)
We can do better than simple index flood procedures that everywhere use regional average L-CV $\tau_2$ and L-CS $\tau_3$ values.

Conclusion: Applicability of GLS
Developed a Bayesian Generalized Least Squares modeling framework to analyze regional information addressing distribution parameters, recognizing
– sampling error in at-site estimators as a function of record length, cross-correlation of concurrent events, and concurrent record lengths, and
– regional model error (true precision of the regional model).
Developed regression models for L-CV and L-CS for Korean annual maximum floods using B-GLS analysis.

Flood Frequency References
Background reading:
Stedinger, J.R., Flood Frequency Analysis and Statistical Estimation of Flood Risk, Chapter 12, Inland Flood Hazards: Human, Riparian and Aquatic Communities, E.E. Wohl (ed.), Cambridge University Press, Stanford, United Kingdom.
References:
Hosking, J.R.M., L-Moments: Analysis and Estimation of Distributions Using Linear Combinations of Order Statistics, J. of Royal Statistical Society, B, 52(2).
Hosking, J.R.M., and J.R. Wallis, Regional Frequency Analysis: An Approach Based on L-moments, Cambridge University Press, 1997.
Martins, E.S., and J.R. Stedinger, Generalized Maximum Likelihood GEV quantile estimators for hydrologic data, Water Resources Research, 36(3), 2000.
Martins, E.S., and J.R. Stedinger, Generalized Maximum Likelihood Pareto-Poisson Flood Risk Analysis for Partial Duration Series, Water Resources Research, 37(10).
Stedinger, J.R., and L. Lu, Appraisal of Regional and Index Flood Quantile Estimators, Stochastic Hydrology and Hydraulics, 9(1), 49-75.

GLS References
Griffis, V.W., and J.R. Stedinger, The Use of GLS Regression in Regional Hydrologic Analyses, J. of Hydrology, 344(1-2), 82-95, 2007 [doi: /j.jhydrol ].
Gruber, Andrea M., Dirceu S. Reis Jr., and Jery R. Stedinger, Models of Regional Skew Based on Bayesian GLS Regression, World Environ. & Water Resour. Conf. - Restoring our Natural Habitat, K.C. Kabbes editor, Tampa, FL, May 15-18.
Jeong, Dae Il, Jery R. Stedinger, Young-Oh Kim, and Jang Hyun Sung, Bayesian GLS for Regionalization of Flood Characteristics in Korea, World Environ. & Water Resour. Conf. - Restoring our Natural Habitat, Tampa, FL, May 15-18.
Martins, E.S., and J.R. Stedinger, Cross-correlation among estimators of shape, Water Resources Research, 38(11), doi: /2002WR001589, 26 November.
Reis, D.S., Jr., J.R. Stedinger, and E.S. Martins, Bayesian generalized least squares regression with application to log Pearson type 3 regional skew estimation, Water Resour. Res., 41, W10419, doi: /2004WR003445.
Stedinger, J.R., and G.D. Tasker, Regional Hydrologic Analysis, 1. Ordinary, Weighted and Generalized Least Squares Compared, Water Resour. Res., 21(9), 1985.
Tasker, G.D., and J.R. Stedinger, Estimating Generalized Skew With Weighted Least Squares Regression, J. of Water Resources Planning and Management, 112(2).
Tasker, G.D., and J.R. Stedinger, An Operational GLS Model for Hydrologic Regression, J. of Hydrology, 111(1-4), 1989.

Pseudo R² for GLS
We are not interested in the total error $\varepsilon$, which includes sampling error $\eta$ that the model cannot explain.
Traditional adjusted R²: how much of the critical model error $\delta$ can we explain, where Var[$\delta$] = $\sigma_\delta^2(k)$ for a model with k parameters?
Consider the GLS model:

Pseudo ANOVA Table
Source | Degrees of Freedom | Estimator
Model | k |
Model Error $\delta$ | n - k - 1 |
Sampling Error $\eta$ | n |
Total | 2n - 1 |

Modeling Diagnostics
To evaluate whether OLS might be sufficient, consider the Error Variance Ratio (EVR). If EVR > 20%, then the sampling error $\eta$ in the estimators of y is potentially an important fraction of the observed total error $\varepsilon = \eta + \delta$.
Do we need WLS or GLS to correctly analyze this data?
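
A short sketch of the EVR check follows. The ratio is assumed here to be the sampling-error sum of squares (trace of the sampling covariance matrix) divided by the model-error sum of squares, $n\,\sigma_\delta^2(k)$, following Griffis and Stedinger (2007); the slide itself does not show the formula, and the names and numbers are hypothetical.

```python
def error_variance_ratio(sampling_variances, s2_model):
    """EVR: total sampling variance relative to total model-error variance.

    sampling_variances  at-site sampling variances (diagonal of Sigma)
    s2_model            estimated model-error variance sigma_delta^2(k)
    """
    n = len(sampling_variances)
    return sum(sampling_variances) / (n * s2_model)


def ols_may_suffice(sampling_variances, s2_model, threshold=0.20):
    """Apply the 20% rule of thumb from the slide."""
    return error_variance_ratio(sampling_variances, s2_model) <= threshold


# Hypothetical five-site example
variances = [0.004, 0.006, 0.005, 0.003, 0.007]
print(error_variance_ratio(variances, s2_model=0.006))   # about 0.83
print(ols_may_suffice(variances, s2_model=0.006))         # False
```

An EVR well above 0.20, as in this made-up example, indicates that sampling error is too large a share of the total to ignore, so a weighted or generalized analysis would be needed.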

Modeling Diagnostics
EVR > 20% suggests a need for WLS or GLS. But when is cross-correlation so large that a GLS analysis is needed?
Misrepresentation of Beta Variance (MBV): describes the error made by WLS in its evaluation of the precision of the estimator $b_0$ of the constant term.

OLS, WLS and GLS for L-CS
Columns: Model Name; Const.; Ln(Area); Model Error Var.; Average Sampling Var.; AVP new; Pseudo R² (%); ERL (years)
OLS: (0.0267), (0.0181)
B-WLS: (0.0261), (0.0206), (0.0047)
B-GLS: (0.0489), (0.0188), (0.0044)
Standard error in parentheses ( - ).