Section 4.4: Contingency Tables and Association


1 Section 4.4: Contingency Tables and Association
Contingency table
– What and why a contingency table
– Marginal distribution
– Conditional distribution
Simpson’s Paradox
– What is it?
– What causes it?

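2 Contingency tables are for summarizing bivariate (or multivariate) qualitative data.

| sex | height | shoe | eyes | hair | hand |
|---|---|---|---|---|---|
| male | 70 | 9 | brown | brown | right |
| male | 71 | 11 | blue | blond | left |
| male | 73 | 11.5 | blue | blond | right |
| female | 64 | 7 | brown | black | right |
| male | 66 | 7.5 | brown | lightbrown | right |
| female | 63 | 6.5 | brown | black | right |
| female | 64 | 6.5 | blue | red | right |
| male | 72 | 10 | brown | blond | left |
| male | 66 | 8.5 | green | lightbrown | right |
| female | 67 | 8 | brown | lightbrown | right |
| male | 74 | 11.5 | brown | brown | left |
| male | 72 | 12 | blue | brown | right |
| female | 68 | 8.5 | blue | lightbrown | right |
| male | 78 | 12 | blue | blond | right |
| male | 70 | 12 | green | blond | right |
| female | 68 | 8 | blue | red | both |
| female | 68 | 9.5 | green | brown | left |
| female | 66 | 7 | blue | blond | right |
| male | 66 | 10 | brown | brown | right |
| ... | ... | ... | ... | ... | ... |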

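3 Contingency table results: Rows: eyes Columns: hair

| | black | blond | brown | lightbrown | red | Total |
|---|---|---|---|---|---|---|
| blue | 0 | 19 | 3 | 2 | 3 | 27 |
| brown | 5 | 2 | 15 | 4 | 1 | 27 |
| green | 1 | 10 | 2 | 2 | 0 | 15 |
| Total | 6 | 31 | 20 | 8 | 4 | 69 |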
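A contingency table of this kind is a tally of (eyes, hair) pairs. As a minimal sketch, the code below counts the pairs for only the 19 rows visible on slide 2; the full data set has 69 rows, so these counts are smaller than the ones in the slide's table. The helper name `crosstab` is made up for this illustration.

```python
from collections import Counter

# (eyes, hair) for the 19 rows shown on slide 2. The remaining rows of the
# 69-row data set are not shown, so this is only a partial tally.
records = [
    ("brown", "brown"), ("blue", "blond"), ("blue", "blond"),
    ("brown", "black"), ("brown", "lightbrown"), ("brown", "black"),
    ("blue", "red"), ("brown", "blond"), ("green", "lightbrown"),
    ("brown", "lightbrown"), ("brown", "brown"), ("blue", "brown"),
    ("blue", "lightbrown"), ("blue", "blond"), ("green", "blond"),
    ("blue", "red"), ("green", "brown"), ("blue", "blond"),
    ("brown", "brown"),
]


def crosstab(pairs):
    """Tally (row, column) pairs into a nested dict of counts."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for pairs that never occur, so empty cells are filled.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}


table = crosstab(records)
for eye, row in table.items():
    print(eye, row, "Total:", sum(row.values()))
```

Swapping the order of each pair (hair first, eyes second) would produce the same counts with rows and columns exchanged.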

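4 Contingency table results: Rows: hair Columns: eyes

| | blue | brown | green | Total |
|---|---|---|---|---|
| black | 0 | 5 | 1 | 6 |
| blond | 19 | 2 | 10 | 31 |
| brown | 3 | 15 | 2 | 20 |
| lightbrown | 2 | 4 | 2 | 8 |
| red | 3 | 1 | 0 | 4 |
| Total | 27 | 27 | 15 | 69 |

Often it is arbitrary which variable gets to be the row variable.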

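5 Displaying three variables (sex, eye color, hair color). We will focus on two variables.

Contingency table results for sex=female: Rows: eyes Columns: hair

| | black | blond | brown | lightbrown | red | Total |
|---|---|---|---|---|---|---|
| blue | 0 | 10 | 0 | 1 | 2 | 13 |
| brown | 3 | 1 | 8 | 2 | 1 | 15 |
| green | 0 | 4 | 1 | 1 | 0 | 6 |
| Total | 3 | 15 | 9 | 4 | 3 | 34 |

Contingency table results for sex=male: Rows: eyes Columns: hair

| | black | blond | brown | lightbrown | red | Total |
|---|---|---|---|---|---|---|
| blue | 0 | 9 | 3 | 1 | 1 | 14 |
| brown | 2 | 1 | 7 | 2 | 0 | 12 |
| green | 1 | 6 | 1 | 1 | 0 | 9 |
| Total | 3 | 16 | 11 | 4 | 1 | 35 |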
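Splitting by a third variable should not lose any observations: adding the two per-sex tables cell by cell should give back the overall eyes-by-hair table from slide 3. The sketch below checks this, assuming the first table on the slide is the female one (the sum is the same either way). The variable names are chosen here for illustration.

```python
hair = ["black", "blond", "brown", "lightbrown", "red"]

# Cell counts per eye color, in the hair-color order given above.
female = {"blue": [0, 10, 0, 1, 2], "brown": [3, 1, 8, 2, 1], "green": [0, 4, 1, 1, 0]}
male = {"blue": [0, 9, 3, 1, 1], "brown": [2, 1, 7, 2, 0], "green": [1, 6, 1, 1, 0]}

# Overall table from slide 3 (both sexes together).
overall = {"blue": [0, 19, 3, 2, 3], "brown": [5, 2, 15, 4, 1], "green": [1, 10, 2, 2, 0]}

# Add the two strata cell by cell.
combined = {
    eye: [f + m for f, m in zip(female[eye], male[eye])]
    for eye in female
}

for eye in combined:
    print(eye, dict(zip(hair, combined[eye])))
print("matches overall table:", combined == overall)
```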

6 Survival of the 793 adult male Titanic passengers, by fare class (1st, 2nd, and 3rd): http://www.encyclopedia-titanica.org/titanic-statistics.html

| Status \ Class | 1st Class | 2nd Class | 3rd Class | Total |
|---|---|---|---|---|
| Saved | 58 | 13 | 60 | 131 |
| Lost | 118 | 154 | 390 | 662 |
| Total | 176 | 167 | 450 | 793 |

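7 Relative frequencies and marginal distributions (in parentheses)

| Status \ Class | 1st Class | 2nd Class | 3rd Class | Total |
|---|---|---|---|---|
| Saved | 58 (7.31%) | 13 (1.64%) | 60 (7.57%) | 131 (16.52%) |
| Lost | 118 (14.88%) | 154 (19.42%) | 390 (49.18%) | 662 (83.48%) |
| Total | 176 (22.19%) | 167 (21.06%) | 450 (56.75%) | 793 (100%) |

Every percentage is a fraction of the grand total of 793. The margins (the Total row and the Total column) show the relative amount in each column or row; these are the marginal distributions. Each marginal distribution adds to one (100%).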
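The percentages on this slide can be reproduced by dividing every count by the grand total and then summing across rows or columns. This is a minimal sketch using the counts from the slide; the variable names are chosen here for illustration.

```python
# Counts of adult male passengers from the slide.
counts = {
    "Saved": {"1st": 58, "2nd": 13, "3rd": 60},
    "Lost": {"1st": 118, "2nd": 154, "3rd": 390},
}
classes = ["1st", "2nd", "3rd"]

grand_total = sum(sum(row.values()) for row in counts.values())

# Joint relative frequencies: each cell divided by the grand total.
joint = {
    status: {c: n / grand_total for c, n in row.items()}
    for status, row in counts.items()
}

# Marginal distribution of status: row totals divided by the grand total.
status_marginal = {s: sum(row.values()) / grand_total for s, row in counts.items()}

# Marginal distribution of class: column totals divided by the grand total.
class_marginal = {
    c: sum(counts[s][c] for s in counts) / grand_total for c in classes
}

for s in counts:
    print(s, {c: f"{joint[s][c]:.2%}" for c in classes}, f"{status_marginal[s]:.2%}")
print("Total", {c: f"{class_marginal[c]:.2%}" for c in classes})
```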

8 Conditional distribution: either the rows or the columns add to one (100%).

Percentages conditioned on survival status (each of the Saved and Lost rows adds to 100%):

| Status \ Class | 1st Class | 2nd Class | 3rd Class | Total |
|---|---|---|---|---|
| Saved | 58 (44.27%) | 13 (9.92%) | 60 (45.8%) | 131 (100%) |
| Lost | 118 (17.82%) | 154 (23.26%) | 390 (58.91%) | 662 (100%) |
| Total | 176 (22.19%) | 167 (21.06%) | 450 (56.75%) | 793 (100%) |

9 Percentages conditioned on passenger class (each class column adds to 100%):

| Status \ Class | 1st Class | 2nd Class | 3rd Class | Total |
|---|---|---|---|---|
| Saved | 58 (32.95%) | 13 (7.78%) | 60 (13.33%) | 131 (16.52%) |
| Lost | 118 (67.05%) | 154 (92.22%) | 390 (86.67%) | 662 (83.48%) |
| Total | 176 (100%) | 167 (100%) | 450 (100%) | 793 (100%) |
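The two kinds of conditioning differ only in the divisor: a row total when conditioning on survival status, a column total when conditioning on class. The following sketch computes both from the slide's counts; the function names are hypothetical.

```python
counts = {
    "Saved": {"1st": 58, "2nd": 13, "3rd": 60},
    "Lost": {"1st": 118, "2nd": 154, "3rd": 390},
}


def condition_on_rows(table):
    """Divide each cell by its row total, so every row sums to 1."""
    return {
        r: {c: n / sum(row.values()) for c, n in row.items()}
        for r, row in table.items()
    }


def condition_on_columns(table):
    """Divide each cell by its column total, so every column sums to 1."""
    col_totals = {}
    for row in table.values():
        for c, n in row.items():
            col_totals[c] = col_totals.get(c, 0) + n
    return {
        r: {c: n / col_totals[c] for c, n in row.items()}
        for r, row in table.items()
    }


given_status = condition_on_rows(counts)     # distribution of class within each status
given_class = condition_on_columns(counts)   # distribution of status within each class

print(f"Share of saved men who were 1st class: {given_status['Saved']['1st']:.2%}")
print(f"Share of 1st class men who were saved: {given_class['Saved']['1st']:.2%}")
```

The two printed figures use the same count (58) but answer different questions, which is why the direction of conditioning should always be stated.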

10
| | Women & children | Men | Total |
|---|---|---|---|
| Saved | 369 | 131 | 500 |
| Lost | 155 | 662 | 817 |
| Total | 524 | 793 | 1317 |

1. What proportion of passengers were women & children?
2. What proportion of the passengers were lost?
3. What proportion of the women & children were lost?
4. Of the passengers who were lost, what proportion were women and children?
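Each of the four questions picks a different numerator and denominator from the same table: questions 1 and 2 are marginal proportions, while questions 3 and 4 are conditional proportions with different conditioning groups. One possible way to work them out, with variable names chosen here for illustration:

```python
# Counts from the slide: rows are outcome, columns are passenger group.
saved = {"women_children": 369, "men": 131}
lost = {"women_children": 155, "men": 662}

total_wc = saved["women_children"] + lost["women_children"]   # all women & children
total_lost = lost["women_children"] + lost["men"]              # all passengers lost
grand_total = sum(saved.values()) + sum(lost.values())         # all passengers

# 1. Marginal: women & children out of all passengers.
q1 = total_wc / grand_total
# 2. Marginal: lost out of all passengers.
q2 = total_lost / grand_total
# 3. Conditional on group: lost among women & children.
q3 = lost["women_children"] / total_wc
# 4. Conditional on outcome: women & children among the lost.
q4 = lost["women_children"] / total_lost

for label, value in [("1", q1), ("2", q2), ("3", q3), ("4", q4)]:
    print(f"Question {label}: {value:.3f}")
```

Questions 3 and 4 share the numerator 155 but divide by 524 and 817 respectively, so their answers differ.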

11 Simpson’s Paradox: Example

Hypothetical graduate school acceptance data:

| | Accepted | Rejected | Total | % Accepted |
|---|---|---|---|---|
| Male | 620 | 380 | 1000 | 62 |
| Female | 480 | 520 | 1000 | 48 |

Men do better.

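12 But if a third variable is accounted for, the story changes…

Major A

| | Accepted | Rejected | Total | % Accepted |
|---|---|---|---|---|
| Male | 600 | 300 | 900 | 66.67 |
| Female | 160 | 40 | 200 | 80 |

Major B

| | Accepted | Rejected | Total | % Accepted |
|---|---|---|---|---|
| Male | 20 | 80 | 100 | 20 |
| Female | 320 | 480 | 800 | 40 |

Women actually do better.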
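The reversal can be checked directly from the hypothetical counts on the slides: compute the acceptance rate within each major, then pool the majors and compute it again. A minimal sketch, with names chosen here for illustration:

```python
# (accepted, rejected) counts from the hypothetical admissions data.
data = {
    "A": {"Male": (600, 300), "Female": (160, 40)},
    "B": {"Male": (20, 80), "Female": (320, 480)},
}


def rate(accepted, rejected):
    """Acceptance rate as a fraction of all applicants."""
    return accepted / (accepted + rejected)


# Acceptance rate within each major.
by_major = {
    major: {sex: rate(*cell) for sex, cell in groups.items()}
    for major, groups in data.items()
}

# Acceptance rate after pooling both majors.
pooled = {}
for sex in ("Male", "Female"):
    accepted = sum(data[m][sex][0] for m in data)
    rejected = sum(data[m][sex][1] for m in data)
    pooled[sex] = rate(accepted, rejected)

for major, rates in by_major.items():
    print(f"Major {major}:", {s: f"{r:.1%}" for s, r in rates.items()})
print("Pooled:", {s: f"{r:.1%}" for s, r in pooled.items()})
```

Within each major the female rate is higher, yet the pooled male rate is higher.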

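13
| Group | Sex | Accepted | Rejected | Total | % Accepted |
|---|---|---|---|---|---|
| Both majors | Male | 620 | 380 | 1000 | 62 |
| Both majors | Female | 480 | 520 | 1000 | 48 |
| Major A | Male | 600 | 300 | 900 | 66.67 |
| Major A | Female | 160 | 40 | 200 | 80 |
| Major B | Male | 20 | 80 | 100 | 20 |
| Major B | Female | 320 | 480 | 800 | 40 |

Why the change?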
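One way to see why: each overall acceptance rate is a weighted average of the per-major rates, weighted by the share of that group's applications going to each major. In this hypothetical data 90% of the men applied to Major A, which accepts a larger share of applicants of either sex, while 80% of the women applied to Major B, which accepts a smaller share. The sketch below makes the weights explicit; the variable names are illustrative.

```python
# Applicants and acceptance rates per sex and major, from the slide's data.
applicants = {
    "Male": {"A": 900, "B": 100},
    "Female": {"A": 200, "B": 800},
}
accept_rate = {
    "Male": {"A": 600 / 900, "B": 20 / 100},
    "Female": {"A": 160 / 200, "B": 320 / 800},
}

share = {}    # fraction of each sex's applications going to each major
overall = {}  # overall acceptance rate, rebuilt as a weighted average

for sex, by_major in applicants.items():
    n = sum(by_major.values())
    share[sex] = {major: count / n for major, count in by_major.items()}
    overall[sex] = sum(
        share[sex][major] * accept_rate[sex][major] for major in by_major
    )
    print(sex, "application shares:", share[sex], "overall rate:", round(overall[sex], 2))
```

The per-major rates favor women, but the weights differ so much between the sexes that the weighted averages come out the other way.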

14 Simpson’s Paradox represents a situation in which an association between two variables inverts or goes away when a third variable is introduced to the analysis. See: http://users.humboldt.edu/rizzardi/Handouts.dir/SimpsonParadoxExample.xlsx

