BPS Chapter 6: Two-Way Tables
Association
  To study associations between quantitative variables: correlation and regression (Ch 4 & Ch 5)
  To study associations between categorical variables: cross-tabulate frequencies and calculate conditional percents (this chapter)
Example: Age and Education
  Callouts on the table: Variables; Marginal distributions
  “Age groups” is the categorical explanatory variable
  “Education level” is the categorical response variable
Example: Marginal Totals
  Callouts on the table: Variables; Marginal totals
  Column (age group) totals: 37,786   81,435   56,008
  Row (education level) totals: 27,858   58,077   44,465   44,828
Marginal Distributions
  Marginal distributions are used as background information only.
  They do not address association.
Marginal Distribution, Row Variable
  % not completed HS = 27,859 / 175,230 × 100% = 15.9%
  % graduated HS = 58,077 / 175,230 × 100% = 33.1%
  % finished 1–3 yrs college = 44,465 / 175,230 × 100% = 25.4%
  % finished ≥4 yrs college = 44,828 / 175,230 × 100% = 25.6%
Marginal Distribution, Column Variable
  % age 25–34 = 37,786 / 175,230 × 100% = 21.6%
  % age 35–54 = 81,435 / 175,230 × 100% = 46.5%
  % age 55 and over = 56,008 / 175,230 × 100% = 32.0%
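Each marginal percent above is a marginal total divided by the table total. A minimal Python sketch of that arithmetic, assuming the totals as printed on the two marginal-distribution slides; the function name `marginal_percents` and the dictionary labels are illustrative, not from the slides:

```python
def marginal_percents(totals, grand_total):
    """Return each marginal total as a percent of the grand total (1 decimal)."""
    return {label: round(100 * count / grand_total, 1)
            for label, count in totals.items()}

GRAND_TOTAL = 175_230  # table total used in the slide calculations

# Row (education) totals as used in the percent calculations.
# Note: the first value is 27,859 here; the marginal-totals slide prints 27,858.
education_totals = {
    "not completed HS": 27_859,
    "graduated HS": 58_077,
    "1-3 yrs college": 44_465,
    ">=4 yrs college": 44_828,
}

# Column (age group) totals.
age_totals = {"25-34": 37_786, "35-54": 81_435, "55 and over": 56_008}

education_pct = marginal_percents(education_totals, GRAND_TOTAL)
age_pct = marginal_percents(age_totals, GRAND_TOTAL)

for label, pct in {**education_pct, **age_pct}.items():
    print(f"{label}: {pct}%")
```

The rounded results agree with the percents shown on the slides.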
Association
  To determine associations, calculate conditional distributions (conditional percents)
  Two types of conditional distributions:
    – Conditioned on the row variable
    – Conditioned on the column variable
Association
  If the explanatory variable is in rows:
    – calculate row percents
    – analyze the row conditional distributions
Association
  If the explanatory variable is in columns:
    – calculate column percents
    – analyze the column conditional distributions
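Row and column percents differ only in which total goes in the denominator. The sketch below is one way to compute both in Python. For data it borrows the Business School counts from the Simpson's Paradox Illustration slide later in the deck (18 of 120 men and 24 of 120 women succeeded). The failure counts are computed here as applicants minus successes, and the function names are illustrative assumptions:

```python
def row_percents(table):
    """Condition on the row variable: each cell as a percent of its row total."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: round(100 * n / row_total, 1)
                       for col, n in cells.items()}
    return result

def column_percents(table):
    """Condition on the column variable: each cell as a percent of its column total."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {row: {col: round(100 * n / col_totals[col], 1)
                  for col, n in cells.items()}
            for row, cells in table.items()}

# Business School applicants; Failure = applicants - successes (derived).
business = {
    "Male": {"Success": 18, "Failure": 120 - 18},
    "Female": {"Success": 24, "Failure": 120 - 24},
}

by_row = row_percents(business)      # gender is the explanatory variable, in rows
by_col = column_percents(business)   # shown only for contrast

print(by_row)
print(by_col)
```

Because gender (the explanatory variable) is in the rows of this table, the row percents are the ones to compare: 15.0% of men versus 20.0% of women succeeded.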
Example: Column Percents
  Is AGE associated with EDUCATION?
  AGE is the explanatory variable, so use column percents
Example: Association
  Percent completing college, by age
    Age 25–34: 29.3%
    Age 35–54: 28.4%
    Age 55 and over: 18.9%
  As age goes up, the percent completing college goes down
  NEGATIVE association between age and education
Association
  No association: conditional percents are nearly equal at all levels of the explanatory variable
  Positive association: as the explanatory variable rises, conditional percents increase
  Negative association: as the explanatory variable rises, conditional percents decrease
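These three rules can be written as a small check on a list of conditional percents ordered by increasing level of the explanatory variable. This is only a sketch: the slides say "nearly equal" without giving a cutoff, so the `tolerance` of one percentage point is an assumption, as are the function name and the extra "no consistent direction" outcome for percents that rise and fall.

```python
def association_direction(percents, tolerance=1.0):
    """Classify conditional percents listed in increasing order of the
    explanatory variable. `tolerance` (percentage points) is an assumed
    cutoff for "nearly equal"; the slides do not specify one."""
    if max(percents) - min(percents) <= tolerance:
        return "none"
    steps = [later - earlier for earlier, later in zip(percents, percents[1:])]
    if all(step >= 0 for step in steps):
        return "positive"
    if all(step <= 0 for step in steps):
        return "negative"
    return "no consistent direction"

# Percent completing college for the three age groups, youngest to oldest.
college_by_age = [29.3, 28.4, 18.9]
direction = association_direction(college_by_age)
print(direction)
```

For the age and education example, the percents fall at every step, so the check reports a negative association, matching the slide.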
Example 2: Row Percents
  Statement of problem: Is ACCEPTANCE into a graduate program (response variable) predicted by GENDER (explanatory variable)?
  Table layout: rows Male, Female, Total; columns Accepted, Not accepted, Total
  The explanatory variable (gender) is in rows, so use row percents
Example 2
  Statement of problem: Is ACCEPTANCE associated with GENDER?
  Table layout: rows Male, Female, Total; columns Accepted, Not accepted, Total
  The explanatory variable is in rows, so use row percents
  A higher percent of men than of women were accepted
  Therefore: positive association with “maleness”
Simpson’s Paradox
  In Example 2, consider the lurking variable “major”
    – Business School (240 applicants)
    – Art School (320 applicants)
  Does this lurking variable explain the association?
  To address this potential problem, subdivide the data according to the lurking variable
  Lurking variables can change or even reverse the direction of an association
Simpson’s Paradox Illustration
  All Applicants
    Table layout: rows Male, Female, Total; columns Success, Failure, Total
  Business School Applicants
    Male proportion = 18 / 120 = 0.15
    Female proportion = 24 / 120 = 0.20
    Negative association with “maleness”
  Art School Applicants
    Male proportion = 180 / 240 = 0.75
    Female proportion = 64 / 80 = 0.80
    Negative association with “maleness”
Simpson’s Paradox Illustration
  Overall: a higher proportion of men than of women were accepted
  Within majors: a higher proportion of women than of men were accepted
  Reason: men applied to the easier major
  The initial association was an artifact of the lurking variable “MAJOR applied to”
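The reversal can be checked numerically from the within-school counts given in the illustration. The Python sketch below pools the two schools by adding their counts. Treating the All Applicants table as this sum is an assumption, although the school sizes it implies (240 and 320 applicants) match the figures stated earlier. Variable names are illustrative.

```python
# (accepted, applicants) by school and gender, from the illustration slide.
schools = {
    "Business": {"Male": (18, 120), "Female": (24, 120)},
    "Art": {"Male": (180, 240), "Female": (64, 80)},
}

# Acceptance proportion within each school.
within = {
    school: {gender: accepted / applicants
             for gender, (accepted, applicants) in groups.items()}
    for school, groups in schools.items()
}

# Pooled counts: add accepted and applicants across schools (assumed to
# reproduce the All Applicants table).
pooled = {}
for gender in ("Male", "Female"):
    accepted = sum(groups[gender][0] for groups in schools.values())
    applicants = sum(groups[gender][1] for groups in schools.values())
    pooled[gender] = (accepted, applicants, accepted / applicants)

for school, rates in within.items():
    print(school, {g: round(r, 2) for g, r in rates.items()})
for gender, (accepted, applicants, rate) in pooled.items():
    print(f"All applicants, {gender}: {accepted} / {applicants} = {rate:.2f}")
```

Under that assumption the pooled proportions are 198 / 360 = 0.55 for men and 88 / 200 = 0.44 for women, even though women have the higher proportion within each school. Most men (240 of 360) applied to the Art School, where acceptance was far more likely, while most women (120 of 200) applied to the Business School.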