Linear Regression Analysis 5E, Montgomery, Peck & Vining. Chapter 8: Indicator Variables


1 Linear Regression Analysis 5E, Montgomery, Peck & Vining. Chapter 8: Indicator Variables

2 8.1 The General Concept of Indicator Variables. Qualitative variables, also known as categorical variables, do not have a scale of measurement. An indicator variable (also known as a dummy variable) assigns numerical levels to the qualitative variable.

3 8.1 The General Concept of Indicator Variables. Example: relate the effective life of a cutting tool ($y$) used on a lathe to the lathe speed in revolutions per minute ($x_1$) and the type of cutting tool used. Tool type is qualitative and can be represented as $x_2 = 0$ if the observation is from tool type A and $x_2 = 1$ if it is from tool type B. If a first-order model is appropriate: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$.

4 8.1 The General Concept of Indicator Variables. Example: for tool type A ($x_2 = 0$) this model becomes $y = \beta_0 + \beta_1 x_1 + \varepsilon$. For tool type B ($x_2 = 1$) this model becomes $y = (\beta_0 + \beta_2) + \beta_1 x_1 + \varepsilon$. Changing from A to B induces a change in the intercept; the slope is the same for both tool types. We assume that the error variance is equal for all levels of the qualitative variable.
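The intercept shift can be illustrated with a small least squares fit. This is a minimal sketch on made-up, noise-free numbers (not the textbook's tool life data); the variable names and the true coefficients 40, -0.03, and 15 are assumptions chosen only so the result is easy to check.

```python
import numpy as np

# Hypothetical, noise-free data (NOT the textbook's tool life data):
# generated from y = 40 - 0.03*x1 + 15*x2.
speed = np.array([500.0, 600.0, 700.0, 800.0, 500.0, 600.0, 700.0, 800.0])  # x1, rpm
tool_b = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])  # x2: 0 = type A, 1 = type B
life = 40.0 - 0.03 * speed + 15.0 * tool_b

# Design matrix with columns [1, x1, x2].
X = np.column_stack([np.ones_like(speed), speed, tool_b])
beta, *_ = np.linalg.lstsq(X, life, rcond=None)
b0, b1, b2 = beta

# Type A line: intercept b0, slope b1.
# Type B line: intercept b0 + b2, same slope b1.
intercept_a = b0
intercept_b = b0 + b2
print(f"A: y = {intercept_a:.3f} + {b1:.3f} x1")
print(f"B: y = {intercept_b:.3f} + {b1:.3f} x1")
```

Both printed lines share the slope estimate; only the intercept differs, by the coefficient on the indicator.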

5 8.1 The General Concept of Indicator Variables

6 8.1 The General Concept of Indicator Variables. For a qualitative variable with $a$ levels, we need $a - 1$ indicator variables. For example, if there were three tool types, A, B, and C, then two indicator variables (called $x_2$ and $x_3$) would be needed:
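One possible coding can be sketched as follows. The helper `indicator_columns` is hypothetical, and treating type A as the baseline level (all indicators zero) is an assumption made for illustration; any one level could serve as the baseline.

```python
import numpy as np

def indicator_columns(labels, baseline):
    """Return (levels, matrix) with one 0/1 column per non-baseline level.

    A variable with a levels therefore yields a - 1 columns.
    """
    levels = [lev for lev in sorted(set(labels)) if lev != baseline]
    matrix = np.array(
        [[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels]
    )
    return levels, matrix

# Hypothetical observations on three tool types; type A is the assumed baseline.
tools = ["A", "B", "C", "B", "A", "C"]
levels, D = indicator_columns(tools, baseline="A")

# Columns of D play the roles of x2 (type B) and x3 (type C):
#   A -> (0, 0),  B -> (1, 0),  C -> (0, 1)
for tool, row in zip(tools, D):
    print(tool, row)
```

Using all three columns plus an intercept would make the design matrix rank deficient, which is why one level is absorbed into the intercept.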

7 Example 8.1 Tool Life Data

8 Example 8.1 Tool Life Data

9 Example 8.1 Tool Life Data. The model to be fit is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$, where $x_2 = 0$ indicates tool type A and $x_2 = 1$ indicates tool type B. The least squares fit is

10 Example 8.1 Tool Life Data

11 Example 8.1 Tool Life Data. Residual analysis: see the plot of residuals versus fitted values in Fig. 8.3 (slide 8) and the normal probability plot below:

12 8.1 The General Concept of Indicator Variables. Two separate models could have been fit to the data. However, the single-model approach is preferred because the analyst has only one final equation to work with instead of two, a much simpler practical result. Furthermore, since both straight lines are assumed to have the same slope, it makes sense to combine the data from both tool types to produce a single estimate of this common parameter.

13 8.1 The General Concept of Indicator Variables. Difference in slope: if we expect the slopes to differ, we can model this by including an interaction term between the variables. Consider the tool life data again, and suppose we believe the two tool types may have different slopes. The model we can fit to account for the change in slope is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$ (8.4).

14 8.1 The General Concept of Indicator Variables. Difference in slope: if tool type A is used ($x_2 = 0$), $y = \beta_0 + \beta_1 x_1 + \varepsilon$. If tool type B is used ($x_2 = 1$), $y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3) x_1 + \varepsilon$. Thus, both the intercept and the slope have shifted. $\beta_2$ is the change in the intercept caused by changing from type A to type B; $\beta_3$ is the change in the slope caused by changing from type A to type B.
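A short sketch of fitting the interaction model follows. The data are hypothetical and noise-free, generated so that type A follows y = 30 - 0.02 x1 and type B follows y = 50 - 0.04 x1; those numbers are assumptions for illustration, not values from the textbook.

```python
import numpy as np

# Hypothetical, noise-free data with a different line for each tool type.
speed = np.array([500.0, 600.0, 700.0, 800.0] * 2)  # x1, rpm
tool_b = np.array([0.0] * 4 + [1.0] * 4)            # x2: 0 = type A, 1 = type B
life = np.where(tool_b == 1.0, 50.0 - 0.04 * speed, 30.0 - 0.02 * speed)

# Design matrix with columns [1, x1, x2, x1*x2]; the last column is the interaction.
X = np.column_stack([np.ones_like(speed), speed, tool_b, speed * tool_b])
beta, *_ = np.linalg.lstsq(X, life, rcond=None)
b0, b1, b2, b3 = beta

# Recover the two separate lines from the single fitted equation.
line_a = (b0, b1)            # (intercept, slope) for type A
line_b = (b0 + b2, b1 + b3)  # (intercept, slope) for type B
print("type A:", line_a)
print("type B:", line_b)
```

Here the coefficient on the indicator recovers the intercept difference (20) and the coefficient on the product term recovers the slope difference (-0.02).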

16 8.1 The General Concept of Indicator Variables. What we really have are two regression equations, one for tool type A and one for tool type B. To test whether these two equations are the same, we can use the extra-sum-of-squares method to test $H_0\colon \beta_2 = \beta_3 = 0$ versus $H_1\colon \beta_2 \neq 0$ and/or $\beta_3 \neq 0$. The test statistic would be $F_0 = \dfrac{SS_R(\beta_2, \beta_3 \mid \beta_1, \beta_0)/2}{MS_{Res}}$.
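The extra-sum-of-squares test can be sketched by fitting a full and a reduced model and comparing their residual sums of squares. The helper names `residual_ss` and `extra_ss_f` are hypothetical, and the data are simulated with an assumed seed and noise level, so the numerical value of the statistic is illustrative only.

```python
import numpy as np

def residual_ss(X, y):
    """Residual sum of squares from an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def extra_ss_f(X_full, X_reduced, y):
    """F statistic for the regressors in X_full that are absent from X_reduced."""
    n, p_full = X_full.shape
    r = p_full - X_reduced.shape[1]      # number of coefficients being tested
    ss_full = residual_ss(X_full, y)
    ss_reduced = residual_ss(X_reduced, y)
    ms_res = ss_full / (n - p_full)      # residual mean square of the full model
    f0 = ((ss_reduced - ss_full) / r) / ms_res
    return f0, r, n - p_full

# Simulated (hypothetical) data: common slope, intercept shift of 15 for type B.
rng = np.random.default_rng(0)
speed = np.tile(np.linspace(500.0, 1000.0, 10), 2)
tool_b = np.repeat([0.0, 1.0], 10)
life = 35.0 - 0.025 * speed + 15.0 * tool_b + rng.normal(0.0, 3.0, size=20)

ones = np.ones_like(speed)
X_full = np.column_stack([ones, speed, tool_b, speed * tool_b])
X_reduced = np.column_stack([ones, speed])  # model under H0: one line for both types

f0, df_num, df_den = extra_ss_f(X_full, X_reduced, life)
print(f"F0 = {f0:.2f} on ({df_num}, {df_den}) degrees of freedom")
```

The statistic is compared with the upper tail of the F distribution on the printed degrees of freedom; a large value suggests the two tool types do not share a single regression line.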

17 Example 8.2 The Tool Life Data

18 Example 8.2 The Tool Life Data

19 Example 8.3 An Indicator Variable with More Than Two Levels. An electric utility is investigating the effect of the size of a single-family house and the type of air conditioning used in the house on the total electricity consumption during warm weather months.

20 Example 8.3 An Indicator Variable with More Than Two Levels

21 Example 8.3 An Indicator Variable with More Than Two Levels. In this problem it would seem unrealistic to assume that the slope of the regression function relating mean electricity consumption to the size of the house does not depend on the type of air conditioning system. For example, we would expect the mean electricity consumption to increase with the size of the house, but the rate of increase should be different for a central air conditioning system than for window units because central air conditioning should be more efficient than window units for larger houses.

22 Example 8.3 An Indicator Variable with More Than Two Levels. There should be an interaction between the size of the house and the type of air conditioning system.

23 Example 8.3 An Indicator Variable with More Than Two Levels. The four regression models corresponding to the four types of air conditioning systems are as follows:

24 Example 8.4 More than Two Indicator Variables. Suppose that in Example 8.1 a second qualitative factor, the type of cutting oil used, must be considered. Assuming that this factor has two levels, we may define a second indicator variable, $x_3$, as follows:

25 Example 8.4 More than Two Indicator Variables

26 Example 8.4 More than Two Indicator Variables

27 Example 8.4 More than Two Indicator Variables

28 Example 8.5 Comparing Regression Models

29 Example 8.5 Comparing Regression Models

30 Example 8.5 Comparing Regression Models

31 Example 8.5 Comparing Regression Models

32 Example 8.5 Comparing Regression Models

33 8.2 Some Comments on Indicator Variables. Try to avoid using a specific metric for the levels of the qualitative variable (avoid allocated codes). That is, it is dangerous to assign values such as 1, 2, 3, and 4 to the levels. Why? Such codes force the change in mean response between successive levels to be the same. Indicator variables can also be substituted for quantitative regressors. This is useful if accurate data cannot be readily obtained: group the data into classes or intervals and then assign indicator variables. Drawback: information is lost by not using the actual values. Quantitative information is often more useful than qualitative.
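The risk of allocated codes can be seen by comparing the two codings on the same data. In this minimal sketch the responses are hypothetical, with level means of 10, 30, and 20, so the mean does not change by a constant amount from one level to the next.

```python
import numpy as np

def residual_ss(X, y):
    """Residual sum of squares from an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Hypothetical responses: three observations at each of three levels.
level = np.repeat([1.0, 2.0, 3.0], 3)
y = np.array([9.0, 10.0, 11.0, 29.0, 30.0, 31.0, 19.0, 20.0, 21.0])
ones = np.ones_like(y)

# Allocated codes: a single regressor taking the values 1, 2, 3.
X_codes = np.column_stack([ones, level])

# Indicator variables: two 0/1 columns, with level 1 as the baseline.
X_ind = np.column_stack(
    [ones, (level == 2.0).astype(float), (level == 3.0).astype(float)]
)

ss_codes = residual_ss(X_codes, y)
ss_ind = residual_ss(X_ind, y)
print("residual SS with allocated codes:", ss_codes)
print("residual SS with indicators:     ", ss_ind)
```

The indicator model fits each level mean separately, leaving only within-level scatter, while the coded model must place the three means on a straight line and leaves a far larger residual sum of squares.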

34 8.3 Regression Approach to Analysis of Variance. The analysis of variance is a technique frequently used to analyze data from planned or designed experiments. Essentially, any analysis-of-variance problem can be treated as a regression problem in which all of the regressors are indicator variables.
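The equivalence can be checked numerically for a one-way layout. This sketch uses hypothetical data for three treatments and assumes the first treatment is the baseline for the indicator coding; the textbook may use a different but equivalent coding.

```python
import numpy as np

# Hypothetical data for k = 3 treatments, three observations each.
treatments = [
    np.array([9.0, 10.0, 11.0]),
    np.array([29.0, 30.0, 31.0]),
    np.array([19.0, 20.0, 21.0]),
]
k = len(treatments)
y = np.concatenate(treatments)
n = y.size
grand_mean = y.mean()

# Classical one-way analysis of variance.
ss_treatments = sum(t.size * (t.mean() - grand_mean) ** 2 for t in treatments)
ss_error = sum(((t - t.mean()) ** 2).sum() for t in treatments)
f_anova = (ss_treatments / (k - 1)) / (ss_error / (n - k))

# The same test as a regression on k - 1 indicator variables.
labels = np.repeat(np.arange(k), [t.size for t in treatments])
X = np.column_stack(
    [np.ones(n)] + [(labels == j).astype(float) for j in range(1, k)]
)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
ss_res = float(resid @ resid)
ss_reg = float(((y - grand_mean) ** 2).sum()) - ss_res
f_regression = (ss_reg / (k - 1)) / (ss_res / (n - k))

print("ANOVA F:     ", f_anova)
print("regression F:", f_regression)
```

The regression sum of squares equals the treatment sum of squares and the residual sum of squares equals the error sum of squares, so the two F statistics agree. With this coding the fitted intercept is the baseline treatment mean and each indicator coefficient is a difference from that mean.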

35 8.3 Regression Approach to Analysis of Variance

36 8.3 Regression Approach to Analysis of Variance. In the fixed-effects or model I case, the analysis of variance is used to test the hypothesis that all $k$ population means are equal, or equivalently,

37 8.3 Regression Approach to Analysis of Variance

38 8.3 Regression Approach to Analysis of Variance

39 8.3 Regression Approach to Analysis of Variance

40 8.3 Regression Approach to Analysis of Variance

41 8.3 Regression Approach to Analysis of Variance. The analysis of variance is a technique frequently used to analyze data from planned or designed experiments. Essentially, any analysis-of-variance problem can be treated as a regression problem in which all of the regressors are indicator variables.

