Slide 1
統計學 (Statistics), Spring 2004
Instructor: 余清祥, Department of Statistics
Date: May 11, 2004
Week 12: Building Regression Models

Slide 2
Chapter 16 Regression Analysis: Model Building
- General Linear Model
- Determining When to Add or Delete Variables
- Analysis of a Larger Problem
- Variable-Selection Procedures
- Residual Analysis
- Multiple Regression Approach to Analysis of Variance and Experimental Design

Slide 3
General Linear Model
Models in which the parameters (β0, β1, ..., βp) all have exponents of one are called linear models.
- First-Order Model with One Predictor Variable
- Second-Order Model with One Predictor Variable
- Second-Order Model with Two Predictor Variables with Interaction
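The model equations themselves did not survive the transcript. As a sketch, the standard forms of these three models are:

```latex
% Standard forms (reconstructed; the original slide equations were images)
% First-order model with one predictor variable:
\[ E(y) = \beta_0 + \beta_1 x_1 \]
% Second-order model with one predictor variable:
\[ E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 \]
% Second-order model with two predictor variables with interaction:
\[ E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 \]
```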

Slide 4
General Linear Model
Often the problem of nonconstant variance can be corrected by transforming the dependent variable to a different scale.
- Logarithmic Transformations: Most statistical packages provide the ability to apply logarithmic transformations using either base 10 (common log) or base e ≈ 2.71828 (natural log).
- Reciprocal Transformation: Use 1/y as the dependent variable instead of y.
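As a sketch, the two transformations correspond to fitting models of the following forms (standard renderings, with a single predictor x used for illustration):

```latex
% Logarithmic transformation: use log(y) (common or natural) as the response.
\[ \ln y = \beta_0 + \beta_1 x + \varepsilon \]
% Reciprocal transformation: use 1/y as the response.
\[ \frac{1}{y} = \beta_0 + \beta_1 x + \varepsilon \]
```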

Slide 5
General Linear Model
Models in which the parameters (β0, β1, ..., βp) have exponents other than one are called nonlinear models. In some cases we can perform a transformation of variables that will enable us to use regression analysis with the general linear model.
- Exponential Model: The exponential model involves the regression equation sketched below. We can transform this nonlinear model to a linear model by taking the logarithm of both sides.
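The regression equation itself was lost in the transcript; the standard exponential form and its log transformation are:

```latex
% Exponential model (standard form, reconstructed):
\[ E(y) = \beta_0 \, \beta_1^{\,x} \]
% Taking the logarithm of both sides yields a model that is linear in x:
\[ \log E(y) = \log \beta_0 + x \log \beta_1 \]
```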

Slide 6
Determining When to Add or Delete Variables
- F Test: To test whether the addition of x2 to a model involving x1 (or the deletion of x2 from a model involving x1 and x2) is statistically significant. The test statistic is sketched below.
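The slide's test statistic did not survive the transcript. A standard form of this partial F statistic, for adding x2 to a model that already contains x1 with n observations, is:

```latex
% Partial F test for adding x2 to a model already containing x1:
\[ F = \frac{\bigl(\mathrm{SSE}(x_1) - \mathrm{SSE}(x_1, x_2)\bigr)/1}
           {\mathrm{SSE}(x_1, x_2)/(n - 3)} \]
% Numerator df = number of terms added (here 1); the denominator is the MSE
% of the larger model. Compare F with the F distribution on (1, n - 3) df.
```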

Slide 7
Variable-Selection Procedures
- Stepwise Regression
  - At each iteration, the first consideration is whether the least significant variable currently in the model can be removed, because its F value, FMIN, is less than the user-specified or default F value, FREMOVE.
  - If no variable can be removed, the procedure checks whether the most significant variable not in the model can be added, because its F value, FMAX, is greater than the user-specified or default F value, FENTER.
  - If no variable can be removed and no variable can be added, the procedure stops.

Slide 8
Variable-Selection Procedures
- Forward Selection
  - This procedure is similar to stepwise regression but does not permit a variable to be deleted.
  - The forward-selection procedure starts with no independent variables.
  - It adds variables one at a time as long as a significant reduction in the error sum of squares (SSE) can be achieved.
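A minimal sketch of this procedure, assuming pandas and statsmodels are available and that the data sit in a hypothetical pga.csv with the column names used later in this chapter; the F-to-enter threshold of 4.0 is illustrative rather than the slide's default:

```python
# Sketch of forward selection: add the most significant variable at each
# step as long as its partial F statistic exceeds an F-to-enter threshold.
import pandas as pd
import statsmodels.api as sm

def forward_select(df, response, candidates, f_enter=4.0):
    selected, remaining = [], list(candidates)
    y = df[response]
    while remaining:
        # SSE of the current (reduced) model; with no predictors it is SST.
        if selected:
            sse_reduced = sm.OLS(y, sm.add_constant(df[selected])).fit().ssr
        else:
            sse_reduced = ((y - y.mean()) ** 2).sum()
        best_var, best_f = None, 0.0
        for var in remaining:
            fit = sm.OLS(y, sm.add_constant(df[selected + [var]])).fit()
            f_stat = (sse_reduced - fit.ssr) / fit.mse_resid  # partial F, 1 df
            if f_stat > best_f:
                best_var, best_f = var, f_stat
        if best_var is None or best_f < f_enter:
            break  # no candidate achieves a significant reduction in SSE
        selected.append(best_var)
        remaining.remove(best_var)
    return selected

df = pd.read_csv("pga.csv")  # hypothetical file with the PGA columns
print(forward_select(df, "Score", ["Drive", "Fair", "Green", "Putt", "Sand"]))
```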

Slide 9
Variable-Selection Procedures
- Backward Elimination
  - This procedure begins with a model that includes all the independent variables the modeler wants considered.
  - It then attempts to delete one variable at a time by determining whether the least significant variable currently in the model can be removed, because its F value, FMIN, is less than the user-specified or default F value, FREMOVE.
  - Once a variable has been removed from the model, it cannot reenter at a subsequent step.

Slide 10
Variable-Selection Procedures
- Best-Subsets Regression
  - The three preceding procedures are one-variable-at-a-time methods and offer no guarantee that the best model for a given number of variables will be found.
  - Some software packages include best-subsets regression, which enables the user to find, for a specified number of independent variables, the best regression model.
  - Minitab output identifies the two best one-variable estimated regression equations, the two best two-variable equations, and so on.
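A minimal best-subsets sketch under the same assumptions as the earlier sketch (pandas, statsmodels, hypothetical pga.csv), enumerating every subset and keeping the best model of each size by adjusted R-squared, roughly in the spirit of Minitab's best-subsets display:

```python
# Sketch of best-subsets regression: fit every subset of the candidate
# predictors and keep the best model (highest adjusted R-squared) per size.
from itertools import combinations

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("pga.csv")  # hypothetical file with the PGA columns
candidates = ["Drive", "Fair", "Green", "Putt", "Sand"]
y = df["Score"]

best = {}  # subset size -> (subset, R-sq, adjusted R-sq)
for k in range(1, len(candidates) + 1):
    for subset in combinations(candidates, k):
        fit = sm.OLS(y, sm.add_constant(df[list(subset)])).fit()
        if k not in best or fit.rsquared_adj > best[k][2]:
            best[k] = (subset, fit.rsquared, fit.rsquared_adj)

for k, (subset, r2, r2_adj) in sorted(best.items()):
    print(f"{k} vars: {', '.join(subset)}  R-sq={r2:.3f}  R-sq(adj)={r2_adj:.3f}")
```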

Slide 11
Example: PGA Tour Data
The Professional Golfers Association keeps a variety of statistics regarding performance measures. Data include the average driving distance, percentage of drives that land in the fairway, percentage of greens hit in regulation, average number of putts, percentage of sand saves, and average score. The variable names and definitions are shown on the next slide.

Slide 12
Example: PGA Tour Data
- Variable Names and Definitions
  - Drive: average length of a drive in yards
  - Fair: percentage of drives that land in the fairway
  - Green: percentage of greens hit in regulation (a par-3 green is "hit in regulation" if the player's first shot lands on the green)
  - Putt: average number of putts for greens that have been hit in regulation
  - Sand: percentage of sand saves (landing in a sand trap and still scoring par or better)
  - Score: average score for an 18-hole round

Slide 13
Example: PGA Tour Data
- Sample Data (columns: Drive, Fair, Green, Putt, Sand, Score; the data values did not survive the transcript)

Slide 14
Example: PGA Tour Data
- Sample Data, continued (columns: Drive, Fair, Green, Putt, Sand, Score; values not reproduced in the transcript)

Slide 15
Example: PGA Tour Data
- Sample Data, continued (columns: Drive, Fair, Green, Putt, Sand, Score; values not reproduced in the transcript)

Slide 16
Example: PGA Tour Data
- Sample Correlation Coefficients (rows: Drive, Fair, Green, Putt, Sand; columns: Score, Drive, Fair, Green, Putt; the coefficient values did not survive the transcript)

Slide 17
Example: PGA Tour Data
- Best Subsets Regression of SCORE (Minitab output with columns Vars, R-sq, R-sq(adj), C-p, s, and indicator columns D, F, G, P, S marking which variables enter each model; the numerical values did not survive the transcript)

Slide 18
Example: PGA Tour Data
- Minitab Output
  - The regression equation: Score = b0 + b1(Drive) + b2(Fair) + b3(Green) + b4(Putt); the estimated coefficient values did not survive the transcript.
  - Predictor table (Coef, Stdev, t-ratio, p for Constant, Drive, Fair, Green, Putt; values not reproduced)
  - s = .2691   R-sq = 72.4%   R-sq(adj) = 66.8%
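For readers following along in software, a short sketch of the corresponding fit in statsmodels, under the same hypothetical data assumptions as the earlier sketches:

```python
# Sketch: refit the four-variable model reported in the Minitab output.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("pga.csv")  # hypothetical file with the PGA columns
X = sm.add_constant(df[["Drive", "Fair", "Green", "Putt"]])
fit = sm.OLS(df["Score"], X).fit()
print(fit.summary())  # coefficient table with t-ratios, R-sq, R-sq(adj), overall F
```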

Slide 19
Example: PGA Tour Data
- Minitab Output: Analysis of Variance (columns SOURCE, DF, SS, MS, F, P for Regression, Error, and Total; the numerical values did not survive the transcript)
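For reference, the quantities in this table satisfy the standard identities below (not reproduced from the slide):

```latex
% Sum-of-squares decomposition underlying the ANOVA table:
\[ \mathrm{SST} = \mathrm{SSR} + \mathrm{SSE} \]
% Overall F test for the regression, with p independent variables and
% n observations:
\[ F = \frac{\mathrm{MSR}}{\mathrm{MSE}}
     = \frac{\mathrm{SSR}/p}{\mathrm{SSE}/(n - p - 1)} \]
```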

Slide 20
Residual Analysis: Autocorrelation
- Durbin-Watson Test for Autocorrelation
  - Statistic (the form is sketched below)
  - The statistic ranges in value from zero to four.
  - If successive values of the residuals are close together (positive autocorrelation), the statistic will be small.
  - If successive values are far apart (negative autocorrelation), the statistic will be large.
  - A value of two indicates no autocorrelation.
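The formula for the statistic was lost in the transcript; the standard Durbin-Watson form, computed from the residuals e_1, ..., e_n, is:

```latex
% Durbin-Watson statistic:
\[ d = \frac{\sum_{t=2}^{n} \left(e_t - e_{t-1}\right)^2}{\sum_{t=1}^{n} e_t^2} \]
```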

Slide 21
End of Chapter 16