Multivariate Models Analysis of Variance and Regression Using Dummy Variables.



Models
A Model: A statement of the relationship between a phenomenon to be explained and the factors, or variables, which explain it.
Steps in the Process of Quantitative Analysis:
– Specification of the model
– Estimation of the model
– Evaluation of the model

Model of Housing Values and Building Size
Historian A hypothesizes that there is a linear relationship among housing value, building size and the number of families in the dwelling.
Building Size = Square Feet/1000
Housing Value = 1905 Property Assessment in 2002 dollars/1000
Families = Number of families in the dwelling
Housing Value = a + b1(Building Size) + b2(Families)

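The Model of Determinants of Housing Value
Dep Var: NEWVAL   N: 467
[Regression output; numeric values not preserved in this transcript.]
Fit statistics: Multiple R, Squared multiple R, Adjusted squared multiple R, Standard error of estimate
Coefficient table columns: Effect, Coefficient, Std Error, Std Coef, Tolerance, t, P(2 Tail)
Coefficient table rows: CONSTANT, NEWSIZE, FAMILIES
Analysis of Variance columns: Source, Sum-of-Squares, df, Mean-Square, F-ratio, P
Analysis of Variance rows: Regression, Residual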

New Questions…
Historian B suggests that there will be a neighborhood effect on housing values: even taking size and number of families into consideration, values will differ among the north side, south side and east side.
Historian B poses the problem to Historian A.

New Possibility: Analysis of Variance
Compares the levels of an interval-level dependent variable across the categories of a categorical (nominal) independent variable.
Are the property values different in the three neighborhoods: East, NW and South?
Take a look first at the mean differences.

Value by Neighborhood

But… Are the results statistically significant? What is the strength of the relationship? How would we integrate this information into the earlier regression model?

Concepts
We partition the total variation or variance into two components:
– (1) variance which is a function of the group membership, that is, the differences between the groups; and
– (2) variance within the groups.
More formally: Total Sum of Squares = Between Groups Sum of Squares + Within Groups Sum of Squares

Equation
Total Sum of Squares = Within Groups Sum of Squares + Between Groups Sum of Squares
TSS = SSW + SSB
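The partition can be checked numerically. The sketch below is an illustration only: the housing values are made up (they are not the slide data), and the function name sum_of_squares_partition is hypothetical.

```python
from statistics import mean

# Hypothetical housing values (in $1000s) by neighborhood; NOT the slide data.
groups = {
    "East Side": [20.0, 24.0, 22.0],
    "Northwest": [40.0, 44.0, 42.0, 38.0],
    "South Side": [30.0, 28.0, 32.0],
}

def sum_of_squares_partition(groups):
    """Return (TSS, SSB, SSW) for a dict mapping group name -> list of values."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)
    # Total: squared deviations of every case from the grand mean.
    tss = sum((v - grand_mean) ** 2 for v in all_values)
    # Between: group size times squared deviation of the group mean from the grand mean.
    ssb = sum(len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values())
    # Within: squared deviations of each case from its own group mean.
    ssw = sum((v - mean(vals)) ** 2 for vals in groups.values() for v in vals)
    return tss, ssb, ssw

tss, ssb, ssw = sum_of_squares_partition(groups)
print(f"TSS={tss:.2f}  SSB={ssb:.2f}  SSW={ssw:.2f}")
```

For any grouping, TSS should equal SSW + SSB up to floating-point rounding.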

Calculations

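LET SSBETWEEN = N * (MEAN - grand mean) * (MEAN - grand mean)
[The numeric grand mean used in the original command is not preserved in this transcript.]
Table columns: Case, VAR00001$, MEAN, N, SD, VARIANCE, SSBETWEEN
Table rows: EASTSIDE, NW, SOUTHSID, Total (values not preserved)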

Anova Table
DF between = k - 1
DF within = N - k
(k = number of groups, N = total number of cases)

Degrees of Freedom
DF between = k - 1
DF within = N - k
Website for F Table: http:// ection3/eda3673.htm#ONE
Eta Squared = SSBetween/Total SS = .345 (equivalent to R Square)
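As a rough illustration of how the degrees of freedom, the F ratio and eta squared fit together, here is a minimal sketch. The function name anova_summary and the input sums of squares are hypothetical; they are not the values behind the .345 reported on the slide.

```python
def anova_summary(ssb, ssw, k, n):
    """One-way ANOVA quantities from the between and within sums of squares.

    k is the number of groups, n the total number of cases.
    """
    df_between = k - 1
    df_within = n - k
    ms_between = ssb / df_between   # mean square between groups
    ms_within = ssw / df_within     # mean square within groups
    return {
        "df_between": df_between,
        "df_within": df_within,
        "F": ms_between / ms_within,
        "eta_squared": ssb / (ssb + ssw),  # SSBetween / Total SS
    }

# Hypothetical inputs, for illustration only.
summary = anova_summary(ssb=636.0, ssw=36.0, k=3, n=10)
print(summary)
```

The F value would then be compared against an F table at (df_between, df_within) degrees of freedom.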

So, now what… We know that the neighborhood affects the value of the house. How do we integrate that knowledge into a regression model?

A Dilemma…
Regression requires interval level measurement. One cannot include categorical variables directly in the equation.
Historian A proposes testing separate models for the three neighborhoods.

Results
Regression Models for the Three Wards: Determinants of Housing Value
             Northwest   East Side   South Side
Newsize      11.99*      41.49*      14.88*
[Rows Constant, Families, N and R Squared are incomplete in this transcript; a Constant of 5.90* survives, but its column cannot be determined.]
*Statistically significant at the .05 level.

Is there another way?
Can we develop one model instead of three?
Answer: Yes, by remeasuring the neighborhood at the interval level.
How? By conceiving of new variables identifying the presence or the absence of the neighborhood, that is, a set of binary variables, called dummy variables.

Illustration of Dummy Variables
Neighborhood      East Side   South Side   Northwest Side
East Side         1           0            0
South Side        0           1            0
Northwest Side    0           0            1

Illustration continued…
Two new binary variables provide all the information needed for the three categories.
Rule: Create k - 1 dummy variables for the original categorical variable.
The omitted category represents the value of the equation when the other dummy variables = 0.

New variables: Northwest Side as the Omitted Category
Variable: Eastside. Codes: Yes=1; No=0
Variable: South. Codes: Yes=1; No=0
By implication:
– For a household on the Eastside, Eastside=1 and South=0
– For a household on the Southside, Eastside=0 and South=1
– For a household in the Northwest Side, Eastside=0 and South=0
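The coding scheme above can be sketched as a small lookup. This is an illustration only: the function name neighborhood_dummies and the exact category labels are assumptions, not names from the original data file.

```python
def neighborhood_dummies(neighborhood):
    """Return (eastside, south) with Northwest Side as the omitted category.

    Three categories need only k - 1 = 2 dummy variables; the omitted
    category is the one where both dummies are 0.
    """
    codes = {
        "East Side": (1, 0),
        "South Side": (0, 1),
        "Northwest Side": (0, 0),  # omitted (reference) category
    }
    return codes[neighborhood]

for name in ("East Side", "South Side", "Northwest Side"):
    eastside, south = neighborhood_dummies(name)
    print(f"{name}: Eastside={eastside}, South={south}")
```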

Results
Newval = a + b1(Newsize) + b2(Families) + b3(Eastside) + b4(South)
Dep Var: NEWVAL   N: 467
Multiple R: 0.75
Squared multiple R: 0.56
Adjusted squared multiple R: 0.55
Standard error of estimate: [value not preserved in this transcript]
Coefficient table columns: Effect, Coefficient, Std Error, Std Coef, Tolerance, t, P(2 Tail)
Coefficient table rows: CONSTANT, NEWSIZE, FAMILIES, EASTSIDE, SOUTH (values not preserved)
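To show what estimating such an equation involves, here is a minimal pure-Python least-squares sketch using the normal equations. Everything in it is an assumption for illustration: the records and the "true" coefficients are made up (the slide's estimates are not preserved), and the helpers solve_linear_system and ols are hypothetical names, not the statistical package used in the presentation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical records: (newsize, families, eastside, south); NOT the slide data.
records = [
    (1.0, 1, 0, 0), (2.0, 2, 0, 0), (3.0, 1, 0, 0),   # Northwest (omitted category)
    (1.5, 1, 1, 0), (2.5, 2, 1, 0), (3.5, 3, 1, 0),   # East Side
    (1.0, 2, 0, 1), (2.0, 1, 0, 1), (4.0, 2, 0, 1),   # South Side
]
true_b = [10.0, 12.0, -2.0, 5.0, -3.0]  # a, b1, b2, b3, b4 (made up)

X = [[1.0, s, f, e, so] for s, f, e, so in records]   # leading 1.0 is the constant
y = [sum(b * x for b, x in zip(true_b, row)) for row in X]

coefs = ols(X, y)
print([round(c, 4) for c in coefs])
```

Because the made-up y values contain no error term, the fitted coefficients should reproduce true_b up to rounding; b3 and b4 are the shifts in the intercept for Eastside and South relative to the Northwest Side.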

Implications
1. Separate regressions for each neighborhood imply that the other coefficients in the equation vary by ward.
2. Regression with dummy variables implies that the neighborhood effect is a movement of the Y intercept.
There may be interactions between the slope coefficients and the dummy variables, i.e., both 1 and 2 may be the case.
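One common way to let both the intercept and a slope vary by neighborhood is to add products of a predictor and the dummy variables to the equation. The sketch below builds one row of such a design; it is an assumption for illustration (the slides do not show an interaction model), it interacts only Newsize with the dummies, and the function name design_row is hypothetical.

```python
def design_row(newsize, families, eastside, south, interactions=False):
    """Build one row of predictors, starting with 1.0 for the constant.

    With interactions=True, the Newsize slope may differ by neighborhood:
    the extra terms are Newsize*Eastside and Newsize*South.
    """
    row = [1.0, newsize, families, eastside, south]
    if interactions:
        row += [newsize * eastside, newsize * south]
    return row

# An East Side house of size 2.0 (thousand square feet) with one family.
print(design_row(2.0, 1, 1, 0))                     # intercept shift only
print(design_row(2.0, 1, 1, 0, interactions=True))  # intercept and slope shift
```

In the interaction version, the coefficient on Newsize*Eastside would be the difference between the East Side slope and the Northwest Side slope.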