Presentation is loading. Please wait.

Presentation is loading. Please wait.

Day 15 Agenda: DG7 --- 15 minutes.

Similar presentations


Presentation on theme: "Day 15 Agenda: DG7 --- 15 minutes."— Presentation transcript:

1 Day 15 Agenda: DG minutes

2 AP STAT Section 3.3: Correlation and Regression Wisdom
EQ: What are influential points and lurking variables and how do they impact the association between two variables?

3 Outliers in Univariate Data Set
Recall: Outliers--- points that are well removed from the trend that the other points seem to follow. Outliers in Univariate Data Set value in a set of data that does not fit with the rest of the data more than 3 standard deviations from the mean lies outside the 1.5(IQR) fences

4 Outliers in Bivariate Data
extreme with respect to other y values in regression, a point that has an unusually large residual

5 Influential Point in Bivariate Data
when removed the regression line changes leverage on the regression coefficient (aka known as slope) normally outliers in x direction, but are not always outliers in terms of regression (i.e. residual not large)

6

7

8 The original data set is graphed at the right.
Classify the new point as a possible outlier and/or an influential point. State whether its presence increases or decreases the strength of the association of the variables. Possible Outlier and Influential Decreases strength of association Possible Influential Increases strength of association Possible Outlier Decreases strength of association

9 Go over graphs on pp 235 – 236

10

11

12

13 Possible Outliers in Bivariate Data

14 Red-Dot Included Red-Dot Not Included Presence of point does not impact predicted values or slope very much, therefore NOT INFLUENTIAL. Still an OUTLIER in terms of y-values. See S, the standard deviation of the residuals.

15 Possible Influential Point in Bivariate Data

16 Red-Dot Included Red-Dot Not Included Presence of point does not impact predicted values or slope very much, therefore NOT INFLUENTIAL. NOT OUTLIER in terms of y-values. See S, standard deviation of the residuals.

17 Possible Outlier and Influential Point in Bivariate Data

18 Presence of point does impacted predicted values and slope, therefore
Red-Dot Included Red-Dot Not Included Presence of point does impacted predicted values and slope, therefore INFLUENTIAL. Also OUTLIER in terms of y-values. See S, standard deviation of the residuals.

19 Important Notes: Influential points are almost always outliers, but not vice-versa. Outliers in y-direction may influence y-intercept, but not the slope of the regression line. Test for influential point is “what happens to regression line” when point is removed

20 No RULE for determining outliers and influential points
No RULE for determining outliers and influential points. Be able to explain what happens to correlation, slope, y-intercept, and coefficient of determination when these points are added to or removed from a scatterplot. Assignment: p. 238 – 239 #59 – 62

21

22 Create a scatterplot for this data
Create a scatterplot for this data. Analyze both your graph and the summary statistics in a few sentences.

23 Conclusion? The weight increase of a puppy in New York is CAUSING the price of snowshoes in Alaska to increase or vice-versa. OF COURSE NOT!! Be sure your relationship makes sense. Scatterplots and correlation do not demonstrate causation.

24 Causation follows from linear regression only.
Association does not imply causation! Causation follows from linear regression only.

25 Lurking Variables --- has an important effect and yet is not included among the predictor variables under consideration. Perhaps its existence is unknown or its effect unsuspected. What could be a lurking variable in these examples? Relate it back to both the explanatory and the response variable.

26 a. There is a strong positive correlation between the foot length of K-12 students and reading scores. Address the association that appears to exist : Although there appears to be a strong positive association between a student’s foot length and his reading score, Address that association does imply causation: Although there is a strong positive association between a student’s foot length and his reading score, a longer foot does not cause one to have higher reading scores. State a possible lurking variable. A possible lurking variable could be age of the student. Address how the lurking variable impacts the explanatory and/or response variable(s): An older student will be more likely to have a longer foot and higher reading scores than a younger student.

27 b. A survey shows a strong positive correlation between the percentage of a country's inhabitants that use cell phones and the life expectancy in that country. Address the association that appears to exist : Although there might be a strong positive association between cell phone usage and life expectancy in a particular country, Address that association does imply causation: Although there might be a strong positive association between cell phone usage and life expectancy in a particular country, using a cell phone does not cause one to live longer. State a possible lurking variable. A possible lurking variable could be the wealth of the inhabitants of the country.

28 b. A survey shows a strong positive correlation between the percentage of a country's inhabitants that use cell phones and the life expectancy in that country. Address how the lurking variable impacts the explanatory and/or response variable(s): A country where many people have access to cell phones will more than likely have more doctors, hospitals, and access to better health care than a country whose inhabitants don’t have access to cell phones.

29 Ex. A group of college students believes that herbal tea has remarkable powers. To test this belief, they make weekly visits to a local nursing home, where they visit with the residents and serve them herbal tea. The nursing home staff reports that after several months many of the residents are more cheerful and healthy. A skeptical sociologist commends the students for their good deeds but scoffs at the idea that herbal tea helped the residents. Do you agree with the student’s conclusion or the sociologist’s? Explain. The explanatory variable in this study is supplying the nursing home residents with herbal tea to drink. The response variable is the overall mood, outlook , etc of the residents. Although there appears to be a strong positive association between the residents drinking the herbal tea and their outlook on life, the consumption of herbal tea does not cause the residents to be cheerful and happy. A lurking variable could be the personal interaction the students give the residents during the visits. The more the students visit and spend time with the residents, the better the residents will feel.

30 KEY IDEAS TO FOCUS ON: Univariate Data Bivariate Data

31 More Stat Humor:

32 Assignment: p. 242 – 243 #63, 64, 66, 67 p. 244 – 247 #69, 70, 73


Download ppt "Day 15 Agenda: DG7 --- 15 minutes."

Similar presentations


Ads by Google