Sit in your permanent seat


1 Sit in your permanent seat
QM222 Class 6, Section A1. More on why correlation doesn’t tell us what causes what. QM222 Fall 2017 Section A1

2 Today’s Objectives
Review scatter diagrams and correlation coefficients as visual and numeric ways to measure the relationship between two variables. Examples of when correlation does not show causality. Selection bias: one reason correlation does not show causality.

3 Scatterplots can tell us
The direction (sign) of the relationship between two variables (is the slope positive or negative?). The form of the relationship: linear vs. curved. The strength of the relationship. Whether there are outliers.

4 Correlation coefficients tell us
(1) The direction of association: when X goes up, does Y go up or down? (2) The strength of the association: how closely related are Y and X? When r = 1 there is perfect positive correlation; dots in a scatter diagram would all lie exactly on an upward-sloping line. When r = -1 there is perfect negative correlation; dots in a scatter diagram would all lie exactly on a downward-sloping line. When r = 0 there is no linear correlation; dots in a scatter diagram show no linear pattern. Most correlations are in between.
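To make the three cases above concrete, here is a minimal sketch that computes the correlation coefficient r by hand. The helper name `pearson_r` and the small data sets are made up for illustration; they are not from the course materials.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (under the square root in the denominator).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # dots exactly on an upward line
r_down = pearson_r(xs, [10 - 3 * x for x in xs])  # dots exactly on a downward line
r_mixed = pearson_r(xs, [2, 1, 4, 3, 5])          # upward trend with some scatter

print(r_up, r_down, r_mixed)
```

The first two values are the perfect-correlation extremes; the third falls in between, as most real correlations do.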

5 Correlation v. relation
The correlation coefficient measures the strength of the linear relationship. A low value is not enough to conclude that there is no strong link between the two variables. This picture has a near-zero correlation … The two variables are strongly related, but not along a line with a single slope.
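Since the picture did not survive in this transcript, here is a small stand-in example (an assumption, not the slide's actual data): Y is determined entirely by X through a U-shaped curve, yet the correlation coefficient comes out at zero. The helper `pearson_r` is a hypothetical name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# X runs symmetrically from -5 to 5; Y is exactly X squared (a U shape).
xs = list(range(-5, 6))
ys = [x * x for x in xs]

r_curve = pearson_r(xs, ys)
print(r_curve)  # the downward half and the upward half cancel out
```

Knowing X tells you Y exactly here, but because the slope is negative on the left and positive on the right, the linear measure cancels to zero.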

6 Correlations

7 Today’s Objectives
Review scatter diagrams and correlation coefficients as visual and numeric ways to measure the relationship between two variables. Examples of when correlation does not show causality. Selection bias: one reason correlation does not show causality.

8 Why correlation does not imply causation
Possible explanations for correlation between X and Y:
X causes Y: a change in X will change Y.
Y causes X: a change in Y will change X.
X causes Y AND Y causes X: this is known as simultaneity.
Another variable (or variables) causes both X and Y: this is called a confounding factor.
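The confounding case can be sketched with simulated data. In the hypothetical setup below, a third variable Z drives both X and Y, and X and Y have no direct effect on each other. All names and numbers are assumptions chosen for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

random.seed(0)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]   # the confounding factor
x = [zi + random.gauss(0, 1) for zi in z]    # X depends on Z only
y = [zi + random.gauss(0, 1) for zi in z]    # Y depends on Z only

r_xy = pearson_r(x, y)  # clearly positive, although X never affects Y

# Remove Z's contribution. We can subtract it exactly only because we
# built the data ourselves; with real data this step needs regression.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = pearson_r(x_resid, y_resid)  # close to zero

print(round(r_xy, 2), round(r_resid, 2))
```

Once Z is accounted for, the apparent link between X and Y disappears, which is the signature of a confounding factor.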

9 Let’s go through the examples in the video… which is it?
A. X causes Y
B. Y causes X
C. X causes Y AND Y causes X (simultaneity)
D. Another variable (or variables) causes both X and Y (confounding factor)
Ice cream (X) causes drownings (Y). Married men live longer than single men. Infants who sleep with the lights on tend to grow up short-sighted. Self-esteem causes good grades.

10 Examples from our book (p.41)
Inhaler use (Y) is associated with shorter lives (X). Does this mean that inhalers kill people? In-class exercise a, b, c.

11 Why do you think this correlation occurs?

12 Today’s Objectives
Review scatter diagrams and correlation coefficients as visual and numeric ways to measure the relationship between two variables. Examples of when correlation does not show causality. Selection bias: one reason correlation does not show causality.

13 Selection
Selection is the general term for cases where the group that you are studying is not representative of the population as a whole. We’ll talk about general cases of selection and also two special cases:
Self-selection: when people select into the sample.
Survivorship bias: when only survivors are observed.

14 Cases of general selection
From the video: Married men live longer. From our exercise: Kids in schools with smaller classes are more likely to go to college.

15 Another example
Suppose that I want to analyze whether going to TA office hours improves students’ performance. I compare the final grades of those who regularly attended the TA office hours vs. those who didn’t, and find that those who attended have lower grades. Should I fire the TA? What kind of selection is this? Self-selection: people choosing their behavior. What would be a better way of testing this?

16 The key in all these selection biases
Terminology: When you are measuring the effect of something in an experiment, you give this “treatment” to some of the people/rats/etc. (the “treatment group”) and do not give it to the others (the “control group”). Selection will bias our measure of the effect if the treatment group and control group are likely to be different for reasons unrelated to the treatment that could be creating the outcome. Self-selection is when people choose which group they are in.
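As a rough illustration of how self-selection distorts a treatment/control comparison, the sketch below simulates a TA-office-hours scenario. Every number is a made-up assumption: attending is assumed to add 3 points to the final grade, and weaker students are assumed to be much more likely to attend.

```python
import random

random.seed(0)
TRUE_EFFECT = 3.0  # hypothetical: points gained by attending office hours

def final_grade(ability, attends):
    """Grade = underlying ability + treatment effect (if any) + luck."""
    return ability + (TRUE_EFFECT if attends else 0.0) + random.gauss(0, 5)

def mean(values):
    return sum(values) / len(values)

def gap(students):
    """Mean grade of attenders minus mean grade of non-attenders."""
    treated = [g for attends, g in students if attends]
    control = [g for attends, g in students if not attends]
    return mean(treated) - mean(control)

abilities = [random.gauss(75, 10) for _ in range(20000)]

# Self-selection: struggling students (ability < 70) mostly choose to attend.
self_selected = []
for a in abilities:
    attends = random.random() < (0.8 if a < 70 else 0.2)
    self_selected.append((attends, final_grade(a, attends)))

# Random assignment: a coin flip, unrelated to ability, decides who attends.
randomized = []
for a in abilities:
    attends = random.random() < 0.5
    randomized.append((attends, final_grade(a, attends)))

naive_gap = gap(self_selected)    # negative: attenders look worse
experimental_gap = gap(randomized)  # close to the assumed +3 points

print(round(naive_gap, 2), round(experimental_gap, 2))
```

Under these assumptions the self-selected comparison makes a helpful treatment look harmful, while random assignment makes the two groups comparable and recovers an estimate near the assumed effect.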

17 Another example of this: Is this a good way to estimate the average number of children per family in the US?
Suppose that I want to estimate the average number of children in families in the US and use you (the class) as a sample. I ask each of you how many children are in your family (including you). What are the problems with this estimation? Do you think it would be an overestimate or an underestimate of the average number of children? Why?
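One way to think about the question is to work through a small made-up population. The family counts below are purely hypothetical; the point is only the mechanics of who ends up in the sample when you ask children rather than families.

```python
# Hypothetical population: number of children -> number of families.
families = {0: 40, 1: 20, 2: 25, 3: 10, 4: 5}

total_families = sum(families.values())
total_children = sum(k * n for k, n in families.items())

# True average: every family counts once, including childless ones.
true_mean = total_children / total_families

# Asking children instead: a family with k children is reported k times
# (once per child), and a family with no children is never reported.
reported_sum = sum(k * (k * n) for k, n in families.items())
student_mean = reported_sum / total_children

print(true_mean, round(student_mean, 2))
```

In this toy population the child-based average is about twice the family-based one, because large families are over-represented and childless families cannot appear at all.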

18 Another example: Do you trust reviews on Amazon/Yelp/Airbnb/eBay?
If not, why? How are they biased? What’s the selection problem?

19 Another example from WWII
During World War II, some of the most important mathematicians acted as secret agents of the US armed forces, in a group called the Applied Mathematics Panel. When a commander stumbled into a problem that might be related to statistics, he’d ask this Panel. During World War II, the chance of a member of a bomber crew making it through a tour of duty was 50%. How, the Air Force asked, could they improve the odds of a bomber making it home? The military looked at the bombers that had returned from enemy territory, recording where those planes had taken the most damage.

20 Discussion: Why did statisticians say the commanders were completely wrong?
They saw that the bullet holes tended to accumulate along the wings, around the tail gunner, and down the center of the body. The commanders wanted to put thicker protection where they could clearly see the most damage, where the holes clustered. But the statistician (Abraham Wald) said, “No, it is the OPPOSITE. Where these bombers are unharmed is where these bombers are most vulnerable. Put protection THERE!” Why did he say this? Where should they add protection? ANSWER part d!
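A toy simulation can help with the discussion. It assumes, purely for illustration, that hits land evenly across four sections of the plane and that a hit to one section (labeled "engine" here, a hypothetical choice) is far more likely to bring the plane down. None of these probabilities come from the historical data.

```python
import random

random.seed(2)

# Hypothetical chance that a single hit in each section downs the plane.
LOSS_PROB = {"wings": 0.05, "tail": 0.05, "fuselage": 0.05, "engine": 0.60}
SECTIONS = list(LOSS_PROB)

hits_all = dict.fromkeys(SECTIONS, 0)       # holes on every plane sent out
hits_returned = dict.fromkeys(SECTIONS, 0)  # holes the commanders could see

for _ in range(10000):
    # Each plane takes three hits, spread evenly over the sections.
    hits = [random.choice(SECTIONS) for _ in range(3)]
    # The plane returns only if it survives every one of its hits.
    survived = all(random.random() > LOSS_PROB[s] for s in hits)
    for s in hits:
        hits_all[s] += 1
        if survived:
            hits_returned[s] += 1

engine_share_all = hits_all["engine"] / sum(hits_all.values())
engine_share_returned = hits_returned["engine"] / sum(hits_returned.values())

print(round(engine_share_all, 3), round(engine_share_returned, 3))
```

Across all planes, about a quarter of the hits land on the vulnerable section, but among returning planes that section shows far fewer holes, because planes hit there tended not to come back.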

21 This special kind of selection bias is called Survivorship Bias
Survivorship bias occurs when those who survive are different from those who don’t, but you only measure the survivors. Another example: if you look at the 10-year % return of mutual funds, the funds you see are the ones that survived, and they will be the ones that got the highest % return (even if returns were random across funds). So don’t expect this % return if you invest in mutual funds for 10 years.
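The mutual fund point can be sketched with purely random returns. The return distribution and the closure rule below are invented assumptions: every fund draws from the same distribution, and a fund is shut down as soon as its value falls below 80% of where it started.

```python
import random

random.seed(1)

N_FUNDS, YEARS = 2000, 10
MEAN_RETURN, VOLATILITY = 0.05, 0.15  # hypothetical yearly return and spread
CLOSE_BELOW = 0.8                     # hypothetical closure threshold

all_final = []       # what investors in every fund ended up with
survivor_final = []  # what a 10-year performance table would show

for _ in range(N_FUNDS):
    value = 1.0
    survived = True
    for year in range(YEARS):
        value *= 1 + random.gauss(MEAN_RETURN, VOLATILITY)
        if value < CLOSE_BELOW:
            survived = False  # fund closes; investors keep what is left
            break
    all_final.append(value)
    if survived:
        survivor_final.append(value)

all_mean = sum(all_final) / len(all_final)
survivor_mean = sum(survivor_final) / len(survivor_final)

print(len(survivor_final), round(all_mean, 3), round(survivor_mean, 3))
```

No fund here has any skill, yet the surviving funds show a higher average final value than the full set, simply because the unlucky ones were removed from the table.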

22 What we learned today
To spot cases where the correlation can be due to more than one of these possibilities:
A. X causes Y
B. Y causes X
C. X causes Y AND Y causes X (simultaneity)
D. Another variable (or variables) causes both X and Y (confounding factor)
To spot these cases by looking for non-representative samples due to selection.

