Published byStephen Tracy Wilson Modified over 8 years ago
1
HW 19 Key
2
22:37 Diamond Rings. The data table contains the listed prices and weights of the diamonds in 48 rings offered for sale in The Singapore Times. The prices are in Singapore dollars, with the weights in carats. Use price as the response and weight as the explanatory variable. These rings hold relatively small diamonds, with weights less than ½ carat. The Hope Diamond weighs in at 45.52 carats. Its history and fame make it impossible to assign a price, and smaller stones of its quality have gone for $600,000 per carat. Let’s say 45.52 carats × $750,000/carat = $34,140,000 and call it $35 million. For the exchange rate, assume that 1 US dollar is worth about 1.6 Singapore dollars.
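The arithmetic behind the price of the imaginary 49th ring can be checked with a short sketch. All figures come from the problem statement; the variable names are only illustrative.

```python
# Figures taken from the problem statement (variable names are illustrative).
carats = 45.52
usd_per_carat = 750_000        # per-carat price assumed in the problem
sgd_per_usd = 1.6              # exchange rate assumed in the problem

exact_usd = carats * usd_per_carat           # about 34,140,000 US dollars
rounded_usd = 35_000_000                     # "call it $35 million"
hope_price_sgd = rounded_usd * sgd_per_usd   # about 56,000,000 Singapore dollars

print(f"Exact price:   ${exact_usd:,.0f} US")
print(f"Rounded price: ${rounded_usd:,.0f} US")
print(f"In Singapore dollars: {hope_price_sgd:,.0f}")
```

So the added case is a ring weighing 45.52 carats with a price of roughly 56 million Singapore dollars.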
3
22:37 a. Add an imaginary ring with the weight and price of the Hope Diamond (in Singapore dollars) to the data set as a 49th case. How does the addition of the Hope Diamond to these other rings change the appearance of the plot? How many points can you see?
4
22:37 a. Without the Hope Diamond, you can see each ring individually. With the Hope Diamond, you can see only two points: the Hope Diamond and the other 48 rings, which merge into one.
5
22:37 b. How does the fitted equation of the SRM to these data change with the addition of this one case? Without the Hope Diamond | With the Hope Diamond
6
22:37 c. Explain how it can be that both R squared and se increase with the addition of this point. The residuals grow because the fitted line is pulled toward an extreme outlier, so se increases. R squared measures the share of the variation in price that weight explains, so it’s natural that it also rises: with the outlier included, most of the variation in price is the difference between the Hope Diamond and the much smaller rings.
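A small numerical sketch can show R squared and se rising together. The 48 weights and prices below are HYPOTHETICAL stand-ins (evenly spaced weights under ½ carat, an arbitrary linear trend, alternating ±40 noise), not the actual Singapore Times data; only the Hope Diamond case (45.52 carats, about 56 million Singapore dollars) follows the problem statement. The helper `fit_srm` is likewise just an illustrative name.

```python
import math

def fit_srm(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, r2, se)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sse / sst               # share of variation explained
    se = math.sqrt(sse / (n - 2))    # standard deviation of the residuals
    return b0, b1, r2, se

# HYPOTHETICAL rings: not the real data set.
weights = [0.12 + 0.23 * k / 47 for k in range(48)]
prices = [-250 + 3700 * w + 40 * (-1) ** k for k, w in enumerate(weights)]

# Add the imaginary 49th case from the problem statement.
weights_hope = weights + [45.52]
prices_hope = prices + [56_000_000.0]

_, _, r2_without, se_without = fit_srm(weights, prices)
_, _, r2_with, se_with = fit_srm(weights_hope, prices_hope)

print(f"Without Hope Diamond: R^2 = {r2_without:.4f}, se = {se_without:,.0f}")
print(f"With Hope Diamond:    R^2 = {r2_with:.4f}, se = {se_with:,.0f}")
```

In this made-up example the total variation in price explodes once the outlier is added, so the explained share rises even though the typical residual is far larger.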
7
22:37 d. Why does the addition of one point, making up only 2% of the data, have so much influence on the fitted model? It’s because it is such an extreme outlier. Its weight is far from the weights of the other rings, so the Hope Diamond is highly leveraged and the regression must fit it; the model adjusts drastically to account for the new value. Without the outlier, the data are a better match to the SRM, and the fit is more reliable.
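Leverage can be made concrete with the standard formula for simple regression, h_i = 1/n + (x_i − x̄)² / Σ(x_j − x̄)². The sketch below applies it to HYPOTHETICAL ring weights (evenly spaced below ½ carat, not the actual data) plus the Hope Diamond's 45.52 carats; the function name `leverages` is illustrative.

```python
def leverages(x):
    """Leverage of each case in a simple regression on x."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# HYPOTHETICAL weights for the 48 rings: not the real data set.
ring_weights = [0.12 + 0.23 * k / 47 for k in range(48)]
all_weights = ring_weights + [45.52]   # Hope Diamond as the 49th case

h = leverages(all_weights)
h_hope = h[-1]
h_max_ring = max(h[:-1])

print(f"Leverage of the Hope Diamond: {h_hope:.4f}")
print(f"Largest leverage among rings: {h_max_ring:.4f}")
print(f"Sum of leverages:             {sum(h):.4f}")  # equals 2 (two coefficients)
```

A leverage close to 1 means the fitted line is forced to pass almost exactly through that case, which is why a single point out of 49 can dominate the fit.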
8
22:38 Convenience Shopping. These data describe the sales over time at a franchise outlet of a major US oil company. This particular station sells gas, and it also has a convenience store and a car wash. Each row summarizes sales for one day at this location. The column labeled Sales gives the dollar sales of the convenience store, and the column Volume gives the number of gallons of gas sold. Formulate the regression model with dollar sales as the response and number of gallons sold as the predictor.
9
22:38 a. These data are a time series, with five or six measurements per week. (The initial data collection did not monitor sales on Saturday.) Does the sequence plot of residuals from the fitted equation indicate the presence of dependence? No, the residuals seem fine. There are a few outliers, but they seem random with no pattern.
10
22:38 a.
11
22:38 b. Calculate the Durbin-Watson statistic D. (Ignore the fact that the data over the weekend are not adjacent.) Does the value of D indicate the presence of dependence? Does it agree with your impression in part a? It indicates dependence, which does not match my impression from part a, but there could easily have been a pattern in the plot that I couldn’t see.
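For reference, the Durbin-Watson statistic is D = Σ(e_t − e_{t−1})² / Σ e_t², computed from the residuals in time order. The sketch below implements that definition; the function name and the two short residual sequences are made up for illustration and are not the convenience store residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Illustrative residuals only (not from the homework data).
d_runs = durbin_watson([1, 1, -1, -1])          # neighbors tend to agree
d_alternating = durbin_watson([1, -1, 1, -1])   # neighbors tend to disagree

print(d_runs)         # below 2: suggests positive autocorrelation
print(d_alternating)  # above 2: suggests negative autocorrelation
```

Values of D near 2 are consistent with uncorrelated residuals, while values well below 2 point to positive dependence between adjacent days.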
12
22:38 c. The residual for row 14 is rather large and positive. How does this outlier affect the fit of the regression of sales on gallons? With Outlier | Without Outlier
13
22:38 c. With Outlier | Without Outlier
14
22:38 d. Should the outlier be removed from the fit? No, it doesn’t have that large an impact.