1 Chapter 8 Indicator Variable Ray-Bing Chen Institute of Statistics National University of Kaohsiung.

Presentation transcript:


2 8.1 The General Concept of Indicator Variables
The variables in regression analysis:
– Quantitative variables: have a well-defined scale of measurement. Examples: temperature, distance, income.
– Qualitative (categorical) variables: usually have no natural scale of measurement. Examples: operators, employment status (employed or unemployed), shifts (day, evening, or night), and sex (male or female).

3 Assign a set of levels to a qualitative variable to account for the effect that variable may have on the response. A variable coded this way is called an indicator variable (or dummy variable). For example: the effective life of a cutting tool ($y$) vs. the lathe speed ($x_1$) and the type of cutting tool ($x_2$).
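The coding idea can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own computation: the data, the tool-type labels "A" and "B", and the coefficient values are all hypothetical, and the response is built without noise so that least squares recovers the assumed coefficients exactly.

```python
import numpy as np

# Hypothetical data (not from the slides): lathe speed and tool type.
speed = np.array([500.0, 600.0, 700.0, 800.0, 500.0, 600.0, 700.0, 800.0])
tool_type = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

# Indicator variable: 0 for tool type A (baseline), 1 for tool type B.
x2 = (tool_type == "B").astype(float)

# Noise-free response built from assumed coefficients so the fit is exact.
beta_true = np.array([40.0, -0.03, 15.0])  # intercept, slope, type-B shift
X = np.column_stack([np.ones_like(speed), speed, x2])
life = X @ beta_true

# Ordinary least squares on the design matrix [1, x1, x2].
beta_hat, *_ = np.linalg.lstsq(X, life, rcond=None)
print(np.round(beta_hat, 4))
```

The coefficient on the indicator column is the vertical shift between the two parallel lines, which is why a single 0/1 column is enough for a two-level factor.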

7 Example 8.1 Tool Life Data
The scatter diagram is in Figure 8.2. It shows two different regression lines, one for each tool type.

12 Two separate straight-line models vs. a single model with an indicator variable:
– Prefer the single-model approach, which gives a simpler practical result.
– Because both lines are assumed to have the same slope, it makes sense to combine the data from both tool types to produce a single estimate of this common parameter.
– The single model also gives one estimate of the common error variance $\sigma^2$, with more residual degrees of freedom.
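The gain in residual degrees of freedom can be made concrete with a small count. This is a sketch assuming a common-slope model with three parameters ($\beta_0$, $\beta_1$, $\beta_2$) and $n_A$ and $n_B$ observations for the two tool types; these symbols and the sample sizes are introduced here for illustration and are not taken from the slides.

```latex
\begin{align*}
\text{single model } (\beta_0, \beta_1, \beta_2):\quad & \mathrm{df}_E = n_A + n_B - 3 \\
\text{separate lines:}\quad & \mathrm{df}_E = n_A - 2 \ \text{and}\ n_B - 2 \\
\text{e.g. } n_A = n_B = 10:\quad & 17 \ \text{versus}\ 8 \ \text{per line}
\end{align*}
```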

13 Difference in both intercept and slope:
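One standard way to let both the intercept and the slope depend on a two-level factor is to add an interaction term. The sketch below uses the usual $\beta$ notation; the coefficient names are assumptions and may not match the slide's own labelling.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon \\
x_2 = 0:\quad y &= \beta_0 + \beta_1 x_1 + \varepsilon \\
x_2 = 1:\quad y &= (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\, x_1 + \varepsilon
\end{align*}
```

In this form $\beta_2$ is the change in intercept and $\beta_3$ the change in slope between the two levels of $x_2$.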

15 Example 8.2 The Tool Life Data:

18 Example 8.3 An Indicator Variable with More Than Two Levels
Total electricity consumption ($y$) vs. the size of the house ($x_1$) and the type of air conditioning system. Four types of air conditioning systems:

19 $\beta_3 - \beta_4$: relative efficiency of a heat pump compared to central air conditioning. Assume the error variance does not depend on the type of air conditioning system.
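A factor with four levels needs three indicator columns, with one level left as the baseline. The following Python sketch shows one way to build them. The helper `indicator_columns` and the level names "none" and "window" are hypothetical, chosen only for illustration; the slides name only the heat pump and central air conditioning.

```python
import numpy as np

def indicator_columns(values, levels, baseline):
    """Return one 0/1 column per non-baseline level (a - 1 columns for a levels)."""
    kept = [lev for lev in levels if lev != baseline]
    values = np.asarray(values)
    return kept, np.column_stack([(values == lev).astype(float) for lev in kept])

# Hypothetical level names and observations (assumptions, not from the slides).
levels = ["none", "window", "heat_pump", "central"]
systems = ["none", "central", "heat_pump", "window", "central"]

kept, D = indicator_columns(systems, levels, baseline="none")
print(kept)
print(D)
```

Each row of `D` has at most one nonzero entry, and the baseline level is the row of all zeros, so differences between coefficients compare two non-baseline levels directly.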

21 Example 8.4 More Than One Indicator Variable
Add the type of cutting oil used to the model of Example 8.1.

27 Comments on the Use of Indicator Variables
Indicator Variables versus Regression on Allocated Codes
Another approach to representing the levels of a qualitative variable is to use an allocated code. In Example 8.3,

29 The allocated codes impose a particular metric on the levels of the qualitative factor. Indicator variables are more informative because they do not force any particular metric on those levels. Searle and Udell (1970): regression using indicator variables always leads to a larger $R^2$ than does regression on allocated codes.
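The $R^2$ comparison can be checked numerically. The sketch below uses simulated data with hypothetical level means that are deliberately not linear in the code 1 to 4; the helper `r_squared` is introduced here for illustration. Because the allocated-code column is a linear combination of the intercept and the indicator columns, the indicator fit can never have a smaller $R^2$.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
level = np.repeat([1, 2, 3, 4], 10)             # allocated codes 1..4
level_means = np.array([5.0, 9.0, 6.0, 12.0])   # hypothetical, not linear in the code
y = level_means[level - 1] + rng.normal(scale=1.0, size=level.size)

ones = np.ones(level.size)
# Regression on the allocated code: intercept plus one numeric column.
X_code = np.column_stack([ones, level.astype(float)])
# Regression on indicators: intercept plus a - 1 = 3 columns (level 1 is baseline).
X_ind = np.column_stack([ones] + [(level == k).astype(float) for k in (2, 3, 4)])

r2_code = r_squared(X_code, y)
r2_ind = r_squared(X_ind, y)
print(f"allocated codes: {r2_code:.3f}, indicators: {r2_ind:.3f}")
```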

30 Indicator Variables as a Substitute for a Quantitative Regressor
A quantitative regressor can also be represented by indicator variables. In Example 8.3, for the income factor: use four indicator variables to represent the factor “income”.

31 Disadvantages:
– More parameters are required to represent the information content of the quantitative factor ($a - 1$ vs. 1), which increases the complexity of the model.
– The degrees of freedom for error are reduced.
Advantage: it does not require the analyst to make any prior assumptions about the functional form of the relationship between the response and the regressor variable.
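Replacing a quantitative regressor by indicators amounts to binning it into classes and coding the classes. A minimal Python sketch follows; the income values and class boundaries are hypothetical, and the choice of five classes is inferred from the use of four indicator variables.

```python
import numpy as np

# Hypothetical incomes and class boundaries (assumptions, not from the slides).
income = np.array([12_000, 27_500, 41_000, 58_000, 95_000, 33_000])
edges = np.array([20_000, 40_000, 60_000, 80_000])  # 4 cut points -> 5 classes

# Class index 0..4 for each observation.
income_class = np.digitize(income, edges)

# a = 5 classes need a - 1 = 4 indicator columns; class 0 is the baseline.
D = np.column_stack([(income_class == k).astype(float) for k in range(1, 5)])
print(income_class)
print(D)
```

The single numeric column has become four columns, which is the extra model complexity and loss of error degrees of freedom noted above. In exchange, no functional form is imposed on how the response changes with income.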

32 Regression Approach to Analysis of Variance
The analysis of variance (ANOVA) is a technique frequently used to analyze data from planned or designed experiments. Any ANOVA problem can be treated as a linear regression problem. Ordinarily we do not recommend that regression methods be used for ANOVA, because the specialized computing techniques are usually quite efficient.

33 However, there are some ANOVA situations, particularly those involving unbalanced designs, where the regression approach is helpful. Essentially, any ANOVA problem can be treated as a regression problem in which all of the regressors are indicator variables.

34 Define the treatment effects in the balanced case (an equal number of observations per treatment) as $\tau_1 + \tau_2 + \dots + \tau_k = 0$.
$\mu_i = \mu + \tau_i$ is the mean of the $i$th treatment.
Test $H_0: \tau_1 = \tau_2 = \dots = \tau_k = 0$ vs. $H_1: \tau_i \neq 0$ for at least one $i$.

36 Example: 3 treatments
Model: $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, $i = 1, 2, 3$, $j = 1, 2, \dots, n$
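The equivalence between one-way ANOVA and regression on indicator variables can be verified on a small example. The sketch below uses hypothetical balanced data with 3 treatments and $n = 4$ observations each, and takes treatment 3 as the baseline level; both choices are assumptions made for illustration. It computes the classical ANOVA $F$ statistic and the regression $F$ statistic and prints them side by side.

```python
import numpy as np

# Hypothetical balanced data: k = 3 treatments (rows), n = 4 observations each.
y = np.array([[10.0, 12.0, 11.0, 13.0],
              [15.0, 14.0, 16.0, 17.0],
              [ 9.0,  8.0, 10.0,  7.0]])
k, n = y.shape

# Classical one-way ANOVA F statistic.
grand = y.mean()
ss_treat = n * np.sum((y.mean(axis=1) - grand) ** 2)
ss_error = np.sum((y - y.mean(axis=1, keepdims=True)) ** 2)
f_anova = (ss_treat / (k - 1)) / (ss_error / (k * (n - 1)))

# Regression version: intercept plus k - 1 indicators (treatment 3 as baseline).
treatment = np.repeat(np.arange(k), n)
X = np.column_stack([np.ones(k * n), treatment == 0, treatment == 1]).astype(float)
yv = y.ravel()
beta, *_ = np.linalg.lstsq(X, yv, rcond=None)
ss_res = np.sum((yv - X @ beta) ** 2)
ss_reg = np.sum((yv - yv.mean()) ** 2) - ss_res
f_reg = (ss_reg / (k - 1)) / (ss_res / (k * n - k))

print(f_anova, f_reg)
```

The two statistics agree because the fitted values of the indicator regression are the treatment means, so the regression and residual sums of squares coincide with the treatment and error sums of squares.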
