Outline
–When X's are dummy variables
  –Example 1: Used cars
  –Example 2: Restaurant location
–Modeling a quadratic relationship
  –Restaurant example
Qualitative Independent Variables
In many real-life situations one or more independent variables are qualitative. Qualitative variables are included in a regression model via indicator variables. An indicator variable (I) can assume one of two values, 0 or 1:
I = 1 if the first of two conditions is met, 0 if the second condition is met
Examples:
–I = 1 if data were collected before, 0 if data were collected after
–I = 1 if the temperature was below 50°, 0 if the temperature was 50° or more
–I = 1 if a degree earned is in Finance, 0 if a degree earned is not in Finance
Example 1
The dealer believes that color is a variable that affects a car's price. Three color categories are considered:
–White
–Silver
–Other colors
Note: Color is a qualitative variable.
$I_1$ = 1 if the color is white, 0 if the color is not white
$I_2$ = 1 if the color is silver, 0 if the color is not silver
And what about "Other colors"? Set $I_1 = 0$ and $I_2 = 0$.
Solution
–The proposed model is $y = \beta_0 + \beta_1(\text{Odometer}) + \beta_2 I_1 + \beta_3 I_2 + \varepsilon$
–The data [Data table: rows labeled White car, Other color, Silver color]
To represent a qualitative variable that has m possible categories (levels), we must create m-1 indicator variables.
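The m-1 rule can be sketched in a few lines of Python. This is only an illustration: the function name `encode_color` and the sample list of colors are hypothetical and are not taken from the dealer's data. With m = 3 color categories, two indicators are enough, and "Other colors" is the baseline coded as (0, 0).

```python
def encode_color(color):
    """Return (I1, I2) for the three-category color variable.

    I1 = 1 for white, I2 = 1 for silver; any other color is the
    baseline category and is coded (0, 0).
    """
    c = color.strip().lower()
    i1 = 1 if c == "white" else 0
    i2 = 1 if c == "silver" else 0
    return i1, i2


# Hypothetical sample of car colors (not the dealer's data).
colors = ["White", "Silver", "Red", "Blue", "white"]
rows = [encode_color(c) for c in colors]

for color, (i1, i2) in zip(colors, rows):
    print(f"{color:>6}: I1={i1}, I2={i2}")
```

A third indicator for "Other colors" is not needed, because that category is already identified by both indicators being 0.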
There is insufficient evidence to infer that a white car and a car of the "Other color" category sell for different auction prices. There is sufficient evidence to infer that a silver car sells for a higher price than a car of the "Other color" category.
From Excel we get the regression equation
$\widehat{\text{PRICE}} = b_0 - 0.0278(\text{ODOMETER}) + 45.2 I_1 + 148 I_2$, where $b_0$ is the estimated intercept.
–For one additional mile the auction price decreases by 2.78 cents.
–A white car sells, on average, for $45.2 more than a car of the "Other color" category.
–A silver car sells, on average, for $148 more than a car of the "Other color" category.
The equation for a car of the "Other color" category: $\text{Price} = b_0 - 0.0278(\text{Odometer}) + 45.2(0) + 148(0)$
The equation for a car of white color: $\text{Price} = b_0 - 0.0278(\text{Odometer}) + 45.2(1) + 148(0)$
The equation for a car of silver color: $\text{Price} = b_0 - 0.0278(\text{Odometer}) + 45.2(0) + 148(1)$
[Figure: Price plotted against Odometer for the three color categories]
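The three equations differ only in the indicator terms, which a short Python sketch makes concrete. The slope (2.78 cents per mile) and the color coefficients (45.2 and 148) come from the text. The intercept value `b0` and the mileage used below are placeholders chosen for illustration, not the Excel estimates.

```python
SLOPE_ODOMETER = -0.0278  # dollars per mile (2.78 cents per mile, from the text)
COEF_WHITE = 45.2         # coefficient of I1, from the text
COEF_SILVER = 148.0       # coefficient of I2, from the text


def predicted_price(intercept, odometer, i1, i2):
    """Evaluate the fitted equation for a given intercept and inputs."""
    return (intercept
            + SLOPE_ODOMETER * odometer
            + COEF_WHITE * i1
            + COEF_SILVER * i2)


b0 = 6000.0    # PLACEHOLDER intercept (assumption, not the estimated value)
miles = 40000  # hypothetical odometer reading

other = predicted_price(b0, miles, 0, 0)
white = predicted_price(b0, miles, 1, 0)
silver = predicted_price(b0, miles, 0, 1)

# The gaps between categories equal the indicator coefficients,
# whatever the intercept or the odometer reading.
print("white - other :", round(white - other, 2))
print("silver - other:", round(silver - other, 2))
```

Because the indicators only shift the intercept, the three fitted lines have the same slope and differ by a constant amount at every odometer reading.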
Example 2
Location for a new restaurant
–A fast food restaurant chain tries to identify new locations that are likely to be profitable.
–The primary market for such restaurants is middle-income adults and their children (between the ages of 5 and 12).
–Which regression model should be proposed to predict the profitability of new locations?
Solution
–The dependent variable will be Gross Revenue.
–There are quadratic relationships between Revenue and each predictor variable. Why?
  –Members of middle-class families are more likely to visit a fast food restaurant than members of poor or wealthy families.
    [Figure: Revenue vs. Income (Low, Middle, High)]
  –Families with very young or older kids will not visit the restaurant as frequently as families with kids in the mid-range ages.
    [Figure: Revenue vs. Age (Low, Middle, High)]
$\text{Revenue} = \beta_0 + \beta_1\,\text{Income} + \beta_2\,\text{Age} + \beta_3\,\text{Income}^2 + \beta_4\,\text{Age}^2 + \beta_5(\text{Income})(\text{Age}) + \varepsilon$
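As a rough sketch of how the second-order model is assembled, the Python snippet below builds the six columns of the model (intercept, Income, Age, Income², Age², Income × Age) for one location. The coefficient values in `beta` are hypothetical, chosen only to show the "low, middle, high" shape in income. They are not estimates from the restaurant data, and the income units are an assumption.

```python
def design_row(income, age):
    """Columns of the model: 1, Income, Age, Income^2, Age^2, Income*Age."""
    return [1.0, income, age, income ** 2, age ** 2, income * age]


def revenue(beta, income, age):
    """Evaluate the model (without the error term) for given coefficients."""
    return sum(b * x for b, x in zip(beta, design_row(income, age)))


# HYPOTHETICAL coefficients (beta_0 .. beta_5), not regression estimates.
# A negative coefficient on Income^2 produces the inverted-U shape.
beta = [0.0, 10.0, 0.0, -0.2, 0.0, 0.0]

# Hypothetical low, middle and high incomes (assumed units: $1000s); age fixed at 8.
low, mid, high = (revenue(beta, inc, 8) for inc in (10, 25, 40))
print(low, mid, high)
```

With a negative coefficient on the squared term, predicted revenue is highest at the middle income and lower at both extremes, which is the pattern the slide describes.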
Example 2
–To verify the validity of the model proposed in Example 19.1, 25 areas with fast food restaurants were randomly selected.
–Data collected included (see Xm19-02.xls):
  –Previous year's annual gross sales.
  –Mean annual household income.
  –Mean age of children.
The model provides a good fit
The model can be used to make predictions. However, do not interpret the coefficients or test them. Multicollinearity is a problem! In Excel: Tools > Data Analysis > Correlation
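The same correlation check can be sketched outside Excel. The Python snippet below computes the Pearson correlation between a predictor and its own square. The income values are hypothetical, not the contents of Xm19-02.xls, and `pearson` is a helper written for this illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL mean household incomes (assumed units: $1000s).
income = [18.0, 21.5, 24.0, 27.5, 30.0, 33.5, 36.0]
income_sq = [v ** 2 for v in income]

r = pearson(income, income_sq)
print(f"corr(Income, Income^2) = {r:.3f}")
```

When all values of a predictor are positive, the predictor and its square tend to be very highly correlated, which is one reason a second-order model can show multicollinearity.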
Regression results of the modified model
Multicollinearity is not a problem anymore.