Ridge Regression: Population Characteristics and Carbon Emissions in China (1978-2008)

Presentation transcript:

Ridge Regression: Population Characteristics and Carbon Emissions in China (1978-2008). Q. Zhu and X. Peng (2012). “The Impacts of Population Change on Carbon Emissions in China During 1978-2008,” Environmental Impact Assessment Review, Vol. 36, pp. 1-8.

Data Description/Model

Data Years: 1978-2008 (n = 31 years)
Dependent Variable: Carbon Emissions (million tons)
Independent Variables:
- Population (10,000s)
- Urbanization Rate (%)
- Percentage of Population of Working Age (%)
- Average Household Size (persons/household)
- Per Capita Expenditures (adjusted to year 2000)

Correlation Transformation
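The transformation's formulas did not come through in the transcript; the standard correlation transformation, which rescales the variables so that the cross-product matrices become correlation matrices, is:

    Y_i^* = \frac{1}{\sqrt{n-1}} \cdot \frac{Y_i - \bar{Y}}{s_Y}, \qquad
    X_{ik}^* = \frac{1}{\sqrt{n-1}} \cdot \frac{X_{ik} - \bar{X}_k}{s_k},

so that X^{*\prime}X^* = r_{XX} (the correlation matrix of the predictors) and X^{*\prime}Y^* = r_{YX} (the correlations between the predictors and the response).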

Data: the X-variables are very highly correlated, which causes problems when X*'X* is inverted to obtain the least squares estimate of β and its variance-covariance matrix. [The slide reports the eigenvalues of X*'X*, the elements of (X*'X*)^-1, and VIF_1 through VIF_5; the numerical values are not reproduced here.]
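As a rough illustration of these diagnostics (not the authors' computation; the data below are simulated stand-ins for the five correlated predictors), the eigenvalues and OLS VIFs can be read off the predictor correlation matrix:

    import numpy as np

    # Simulated stand-in for five highly correlated predictors (the actual
    # China series are not reproduced in this transcript).
    rng = np.random.default_rng(0)
    n = 31
    t = np.linspace(0.0, 1.0, n)
    X = np.column_stack([t + 0.02 * rng.standard_normal(n) for _ in range(5)])

    r_xx = np.corrcoef(X, rowvar=False)   # equals X*'X* after the correlation transformation
    print("eigenvalues of X*'X*:", np.linalg.eigvalsh(r_xx))   # near-zero values flag collinearity
    print("OLS VIFs:", np.diag(np.linalg.inv(r_xx)))           # diagonal of (X*'X*)^-1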

Ridge Regression: a method of producing a biased estimator of β that has a smaller mean square error than OLS. Mean square error of an estimator = Variance + Bias². The ridge estimator trades off a small amount of bias for a large reduction in variance when the predictor variables are highly correlated. Problem: choosing the shrinkage parameter c. We work with the standardized regression model based on the correlation-transformed variables, then "back-transform" the regression coefficients to the original scale.
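Written out for the full coefficient vector, the decomposition referenced above is the standard one (assuming the usual total-MSE definition used when motivating ridge regression):

    \mathrm{MSE}(\tilde{\beta}) = E\left[\|\tilde{\beta} - \beta\|^2\right]
      = \sum_j \mathrm{Var}(\tilde{\beta}_j) + \sum_j \left[\mathrm{Bias}(\tilde{\beta}_j)\right]^2.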

Ridge Estimator (Standardized X, Y). Note the unconventional use of β to denote the standardized regression coefficient vector.
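The slide's equation is not reproduced above; the standard ridge estimator for the correlation-transformed model is:

    \mathbf{b}^{R} = \left(\mathbf{X}^{*\prime}\mathbf{X}^{*} + c\,\mathbf{I}\right)^{-1}\mathbf{X}^{*\prime}\mathbf{Y}^{*}
                   = \left(\mathbf{r}_{XX} + c\,\mathbf{I}\right)^{-1}\mathbf{r}_{YX}, \qquad c \ge 0,

where c = 0 reproduces OLS and larger values of c shrink the coefficients toward zero.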

China Carbon Emissions Data (c = 0.20): relative to OLS, the estimated regression coefficients change by large amounts, and the coefficients for Population and Urbanization Rate change sign.

Back-Transforming Coefficients to Original Scale
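The slide's formulas are not reproduced above; the usual back-transformation from the correlation-transformed model to the original scale is:

    b_k = \frac{s_Y}{s_k}\, b_k^{R}, \quad k = 1, \ldots, p-1, \qquad
    b_0 = \bar{Y} - \sum_{k=1}^{p-1} b_k \bar{X}_k,

where s_Y and s_k are the sample standard deviations of Y and X_k.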

Choosing the Shrinkage Parameter, c

- Ridge Trace: plot the standardized ridge regression coefficients versus c and observe where they flatten out (see the sketch following this list)
- C_c Statistic: similar to C_p used in regression model selection
- PRESS Statistic extended to ridge regression: cross-validation sum of squares for "left-out" residuals
- Generalized Cross-Validation: similar to PRESS, based on prediction error
- VIF Plot: plot the VIFs versus c and observe where they all fall below 10
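As a rough illustration of how the ridge trace and VIF plot are computed (a minimal sketch on simulated data, not the authors' China series), one can evaluate the standardized ridge coefficients and ridge VIFs over a grid of c values:

    import numpy as np

    # Simulated, highly correlated predictors as a stand-in for the real data.
    rng = np.random.default_rng(1)
    n, p = 31, 5
    t = np.linspace(0.0, 1.0, n)
    X = np.column_stack([t + 0.02 * rng.standard_normal(n) for _ in range(p)])
    y = 2.0 * t + 0.1 * rng.standard_normal(n)

    # Correlation transformation, so that Xs.T @ Xs = r_XX and Xs.T @ ys = r_YX.
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0, ddof=1) * np.sqrt(n - 1))
    ys = (y - y.mean()) / (y.std(ddof=1) * np.sqrt(n - 1))
    r_xx, r_yx = Xs.T @ Xs, Xs.T @ ys

    for c in [0.0, 0.01, 0.05, 0.10, 0.20]:
        inv = np.linalg.inv(r_xx + c * np.eye(p))
        b_ridge = inv @ r_yx                    # standardized ridge coefficients at this c
        vif = np.diag(inv @ r_xx @ inv)         # ridge VIFs at this c
        print(f"c={c:4.2f}  b={np.round(b_ridge, 3)}  max VIF={vif.max():10.1f}")

Plotting the coefficient vectors against c gives the ridge trace; one looks for the smallest c at which the coefficients, and the VIFs, have stabilized.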

C_c Statistic. Goal: choose c to minimize C_c.
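The slide's definition is not reproduced above; a Mallows-type form along the following lines is commonly used (the exact constants on the original slide may differ):

    C_c = \frac{SSE_c}{MSE_{OLS}} - n + 2\,\mathrm{tr}(\mathbf{H}_c), \qquad
    \mathbf{H}_c = \mathbf{X}^{*}\left(\mathbf{X}^{*\prime}\mathbf{X}^{*} + c\,\mathbf{I}\right)^{-1}\mathbf{X}^{*\prime},

where SSE_c is the residual sum of squares of the ridge fit and tr(H_c) plays the role of the effective number of parameters.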

“PRESS” Statistic. Goal: choose c to minimize PR_Ridge.
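The slide's formula is not shown above; the usual PRESS extension to ridge regression sums the squared "left-out" residuals obtained from the ridge hat matrix:

    PR_{Ridge}(c) = \sum_{i=1}^{n} \left( \frac{e_{i,c}}{1 - h_{ii,c}} \right)^2,

where e_{i,c} is the i-th residual of the ridge fit and h_{ii,c} is the i-th diagonal element of H_c.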

Generalized Cross-Validation. Goal: choose c to minimize GCV.
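The slide's formula is not shown above; the standard GCV criterion replaces the individual leverages in PRESS with their average:

    GCV(c) = \frac{SSE_c / n}{\left[ 1 - \mathrm{tr}(\mathbf{H}_c)/n \right]^2}.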

C_c, PRESS, and GCV for the China Carbon Data: all of these methods select very low values of c. The graphical methods tend to choose larger values, based on stabilization of the regression coefficients and the VIFs.

Variance Inflation Factors
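The slide's formula is not reproduced above; the VIFs for the ridge estimator are commonly computed as the diagonal elements of

    VIF_k(c) = \left[ \left(\mathbf{r}_{XX} + c\,\mathbf{I}\right)^{-1} \mathbf{r}_{XX} \left(\mathbf{r}_{XX} + c\,\mathbf{I}\right)^{-1} \right]_{kk},

which reduces to the usual OLS VIF, [\mathbf{r}_{XX}^{-1}]_{kk}, at c = 0.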

Final Model – Estimated Regression Coefficients

- The residual-based measures C_c, PRESS, and GCV suggest very small values of c
- The ridge trace suggests a larger value, with the coefficients stabilizing above roughly c = 0.15
- The VIF plot suggests values above c = 0.03, where all VIFs fall below 10
- The authors used c = 0.20, based on the ridge trace