
Presentation on theme: "Outliers and Influential points Imagine a scatter plot of the heights and weights of adult men….would it be a positive or negative association? Should."— Presentation transcript:

1 Outliers and Influential points Imagine a scatter plot of the heights and weights of adult men. Would the association be positive or negative? Should there be a correlation?

2 Vocabulary Review Remember – Association refers to the visual pattern in a scatter plot (not necessarily linear) – Correlation is a calculated value, r, that measures the direction (positive or negative) and strength of linear association, with -1 ≤ r ≤ 1
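For reference, one common way to write the correlation calculation is shown here. The slides do not display the formula, so the notation is an assumption: n is the number of points, x-bar and y-bar are the means, and s_x and s_y are the sample standard deviations.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each point contributes the product of its standardized x and y values, so a point whose x equals x-bar contributes zero to the sum.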

3 Shaquille O’Neal As a general rule, the taller a person is, the greater the weight (on average). Now add Shaquille O'Neal (7'1", 325 lbs) to the plot. He's an outlier in height and an outlier in weight, but not an outlier with respect to the bivariate relationship – rather, he's an example of it. His point probably lies pretty close to the regression line determined by the others, so with or without him the line probably looks pretty much the same. And that means he's not influential.
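A minimal sketch of this idea, assuming a made-up sample of five men (the heights and weights below are hypothetical, chosen only for illustration, and `fit_line` is a helper written for this example). It fits the least-squares line without Shaq, checks how far his point is from that line, then refits with him included.

```python
# Hypothetical data: heights (inches) and weights (pounds) of five adult men.
heights = [66, 68, 70, 72, 74]
weights = [140, 164, 178, 201, 217]


def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Line determined by the other men only.
slope_without, intercept_without = fit_line(heights, weights)

# Shaq: 7'1" is 85 inches; 325 lbs (figures from the slide).
shaq_height, shaq_weight = 85, 325
shaq_residual = shaq_weight - (intercept_without + slope_without * shaq_height)

# Refit with Shaq included and compare slopes.
slope_with, intercept_with = fit_line(heights + [shaq_height], weights + [shaq_weight])

print(f"slope without Shaq: {slope_without:.2f} lb/in")
print(f"slope with Shaq:    {slope_with:.2f} lb/in")
print(f"Shaq's residual from the original line: {shaq_residual:.2f} lb")
```

With this particular sample the two slopes differ only slightly, which is what "not influential" means here. A different sample could behave differently if its line did not already pass near Shaq's point.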

4 Sumo Wrestler Instead, imagine adding a sumo wrestler, average in height but very high in weight. The sumo wrestler is an outlier because his point doesn't fit the pattern well. The evidence is a large residual: his point is far from the line. Since his point is directly above the mean point [(x-bar, y-bar), through which the line always passes], his residual stays the same regardless of the slope of the line. That means the other points determine the slope, and the sumo wrestler isn't influential. {Recall the correlation formula: for him x – x-bar = 0, so his point adds nothing to the sum in the numerator of r. His extra spread in weight does increase s_y, however, which weakens r.}
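The sumo wrestler case can be checked numerically. This is a sketch under stated assumptions: the five men are hypothetical data, the wrestler's 400 lb weight is invented for illustration, and his height is set to exactly the sample mean (70 inches). The helper `summarize` is written for this example.

```python
import math

# Hypothetical data: heights (inches) and weights (pounds) of five adult men.
heights = [66, 68, 70, 72, 74]
weights = [140, 164, 178, 201, 217]


def summarize(xs, ys):
    """Return (slope, intercept, r) for the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy / math.sqrt(sxx * syy)


slope_without, intercept_without, r_without = summarize(heights, weights)

# Assumed sumo wrestler: exactly average height, very high weight.
sumo_height, sumo_weight = 70, 400
slope_with, intercept_with, r_with = summarize(
    heights + [sumo_height], weights + [sumo_weight]
)

# His residual from the refitted line.
sumo_residual = sumo_weight - (intercept_with + slope_with * sumo_height)

print(f"slope without sumo: {slope_without:.2f}, with sumo: {slope_with:.2f}")
print(f"r without sumo: {r_without:.3f}, with sumo: {r_with:.3f}")
print(f"sumo residual: {sumo_residual:.1f} lb")
```

The slope is unchanged because his x-deviation is zero. The line shifts upward slightly (the intercept changes), and r drops because the weights are now more spread out.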

5 Manute Bol Finally, imagine adding Manute Bol, perhaps the skinniest center in NBA history at 7'6" and only 200 pounds. He is very tall but surprisingly light, so he doesn't fit the pattern: he's an outlier. He'd be very far from the line determined by the rest of the data (potentially a huge residual), but if we include him when doing the regression, his presence will tip the line down in order to minimize the sum of squared residuals. That leverage to make the line's slope change a lot makes Manute an influential point.
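A rough numerical illustration of that leverage, again assuming a hypothetical sample of five men (`fit_line` is a helper written for this example). It compares Manute's residual from the line fitted without him to his residual from the line fitted with him.

```python
# Hypothetical data: heights (inches) and weights (pounds) of five adult men.
heights = [66, 68, 70, 72, 74]
weights = [140, 164, 178, 201, 217]


def fit_line(xs, ys):
    """Return the least-squares (slope, intercept) for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Manute Bol: 7'6" is 90 inches; 200 lbs (figures from the slide).
bol_height, bol_weight = 90, 200

# Line from the rest of the data, and his residual from it.
slope_without, intercept_without = fit_line(heights, weights)
residual_from_others = bol_weight - (intercept_without + slope_without * bol_height)

# Line refitted with him included, and his residual from that line.
slope_with, intercept_with = fit_line(heights + [bol_height], weights + [bol_weight])
residual_included = bol_weight - (intercept_with + slope_with * bol_height)

print(f"slope without Bol: {slope_without:.2f}, with Bol: {slope_with:.2f}")
print(f"residual from the others' line: {residual_from_others:.1f} lb")
print(f"residual once he is included:   {residual_included:.1f} lb")
```

In this sample the slope falls sharply once he is included, and his own residual shrinks because the line has been pulled toward him. That is why an influential point does not always show up as a large residual in the final fit.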

