Cross-Tabulations
Cross-Tabs The level of measurement used for cross-tabulations is mostly nominal. Even when continuous variables (such as age and income) are used, they are first converted to categorical variables. That conversion discards important information (variation).
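The conversion of a continuous variable to a categorical one can be sketched in Python. This is only an illustration: it assumes the three age brackets used on the student survey form later in this deck (< 18, 18 - 26, > 26), and the function name `age_category` and the sample ages are hypothetical.

```python
def age_category(age):
    """Map a continuous age (in years) to one of three brackets."""
    if age < 18:
        return "<18"
    if age <= 26:
        return "18-26"
    return ">26"


# Hypothetical ages, for illustration only.
ages = [17, 19, 25, 26, 31, 48]
categories = [age_category(a) for a in ages]
print(categories)

# 31 and 48 both land in ">26": the 17-year gap between them is no longer
# visible, which is the loss of variation described above.
```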
Data Types Prentice-Hall 3
Categorical Data Categorical random variables yield responses that classify. Example: gender (female, male). Measurement reflects the number in each category. Nominal or ordinal scale. Examples: Did you attend a community college? Do you live on-campus or off-campus?
Why Concerned about Categorical Random Variables? Survey data tends to be categorical: hot/comfortable/cold, sunny/cloudy/fog/rain, yes/no, ... Know the limitations (nature of relationship, causality). Widely used in marketing for decision-making.
Cross-Tabs The chi-square (χ²) statistic is used to test the null hypothesis. [Unfortunately, chi-square, like many other statistics that indicate statistical significance, tells us nothing about the magnitude of the relation.]
χ² Test of Independence Shows whether a relationship exists between two categorical variables. One sample is drawn. Does not show the nature of the relationship. Does not show causality. Used widely in marketing. Uses a contingency table.
Critical Value What is the critical χ² value if the table has 2 rows and 3 columns and α = .05? df = (2 - 1)(3 - 1) = 2, so from the χ² table the critical value is 5.991. If fo = fe, χ² = 0: do not reject H0.
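The critical value for this case can be checked without a printed table. The sketch below relies on the fact that a chi-square distribution with 2 degrees of freedom is an exponential distribution with mean 2, so its upper-tail probability is exp(-x/2). The function names are hypothetical, and the shortcut applies only to df = 2.

```python
import math


def chi2_sf_df2(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 2 df."""
    return math.exp(-x / 2)


def chi2_critical_df2(alpha):
    """Value x with P(X > x) = alpha when df = 2 (inverse of chi2_sf_df2)."""
    return -2 * math.log(alpha)


rows, cols = 2, 3
df = (rows - 1) * (cols - 1)
critical = chi2_critical_df2(0.05)
print(df, round(critical, 3))  # prints: 2 5.991
```

A test statistic above this value would lead to rejecting H0 at the .05 level; a statistic of 0 (fo = fe in every cell) never does.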
χ² Test of Independence Hypotheses & Statistic H0: Variables are not dependent. H1: Variables are dependent (related). Test statistic: χ² = Σ (fo - fe)² / fe, summed over all cells, where fo is the observed frequency and fe is the expected frequency. Degrees of freedom: (r - 1)(c - 1) for a table with r rows and c columns.
χ² Test of Independence Expected Frequencies Statistical independence means the joint probability equals the product of the marginal probabilities: P(A and B) = P(A)·P(B). Compute the marginal probabilities. Multiply them to get the joint probability. The expected frequency is the sample size times the joint probability.
χ² Test of Independence An Example You’re a marketing research analyst. You ask a random sample of 286 consumers if they purchase Diet Pepsi or Diet Coke. At the 0.05 level of significance, is there evidence of a relationship?
Expected Frequencies
Expected Frequencies fe ≥ 1 in all cells. Each expected frequency is the product of the two marginal totals divided by the sample size: 132·116/286, 154·116/286, 132·170/286, 154·170/286.
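The expected frequencies for the Diet Pepsi / Diet Coke example can be computed from the marginal totals that appear on this slide (132 and 154 on one margin, 116 and 170 on the other, n = 286). Which margin belongs to rows and which to columns is an assumption here, and the function name is hypothetical.

```python
def expected_frequencies(totals_a, totals_b):
    """Expected cell counts under independence: (total_a * total_b) / n."""
    n = sum(totals_a)
    if n != sum(totals_b):
        raise ValueError("both sets of marginal totals must sum to n")
    return [[a * b / n for b in totals_b] for a in totals_a]


# Marginal totals from the slide; assigning them to rows/columns is assumed.
totals_a = [116, 170]
totals_b = [132, 154]

expected = expected_frequencies(totals_a, totals_b)
for row in expected:
    print([round(fe, 2) for fe in row])
# prints:
# [53.54, 62.46]
# [78.46, 91.54]

# The slide's condition: every expected frequency is at least 1.
all_at_least_one = all(fe >= 1 for row in expected for fe in row)
print(all_at_least_one)  # prints: True
```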
χ² Test of Independence
χ² Test of Independence H0: Not dependent. H1: Dependent. α = .05, df = (2 - 1)(2 - 1) = 1. Critical value: 3.841. Decision: Reject H0 at α = .05. Conclusion: There is evidence of a relationship.
Cross-Tabs Please provide the requested information by checking one response in each category. What is your:
age: ____ < 18  ____ 18 - 26  ____ > 26
gender: ____ male  ____ female
course load: ____ < 6 units  ____ 6 - 12 units  ____ > 12 units
gpa: ____ < 2.0  ____ 2.0 - 2.5  ____ 2.6 - 3.0  ____ 3.1 - 3.5  ____ > 3.5
annual income: ____ < $15k  ____ $15k - $40k  ____ > $40k
Cross-Tabs The information is coded and entered in the file student.sf by letting the first response be recorded as a 1, the second as a 2, etc.
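The coding rule (first response = 1, second = 2, and so on) can be sketched as follows. The variable names, the option labels, and the sample respondent are hypothetical. The actual layout of student.sf is not shown in the slides, so this only illustrates the numbering scheme.

```python
# Response options in the order they appear on the survey form.
OPTIONS = {
    "age": ["<18", "18-26", ">26"],
    "gender": ["male", "female"],
    "units": ["<6", "6-12", ">12"],
    "gpa": ["<2.0", "2.0-2.5", "2.6-3.0", "3.1-3.5", ">3.5"],
    "income": ["<$15k", "$15k-$40k", ">$40k"],
}


def encode(response):
    """Replace each answer with its 1-based position among the options."""
    return {q: OPTIONS[q].index(answer) + 1 for q, answer in response.items()}


# Hypothetical respondent, for illustration only.
student = {
    "age": "18-26",
    "gender": "female",
    "units": ">12",
    "gpa": "3.1-3.5",
    "income": "<$15k",
}
coded = encode(student)
print(coded)
# prints: {'age': 2, 'gender': 2, 'units': 3, 'gpa': 4, 'income': 1}
```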
Cross-Tabs The hypothesis test is generally referred to as a test of dependence. The researcher wishes to determine whether the variables are dependent, that is, whether they exhibit a relationship.
Cross-Tabs Let’s investigate whether a relationship exists between a student’s GPA and units attempted. H0: GPA and UNITS are not dependent. H1: GPA and UNITS are dependent.
Cross-Tabs
Chi-Square Test
------------------------------------------
Chi-Square     Df     P-Value
      3.67      8      0.8853
Cross-Tabs p-value = 0.8853 > 0.05, so retain H0: GPA and UNITS are not shown to be dependent. [Based on our data, there is no evidence to support the claim that a relationship exists between GPA and units attempted.]
Cross-Tabs Let’s investigate whether a relationship exists between a student’s age and units attempted. H0: AGE and UNITS are not dependent. H1: AGE and UNITS are dependent.
Cross-Tabs
Chi-Square Test
------------------------------------------
Chi-Square     Df     P-Value
      9.89      4      0.0423
Cross-Tabs p-value = 0.0423 < 0.05, so reject H0: AGE and UNITS are dependent. [Based on our data, there is sufficient evidence to support the claim that a relationship exists between age and units attempted.]
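The two p-values reported above can be approximately reproduced from the chi-square statistics and degrees of freedom. The sketch below uses the closed-form upper-tail probability that holds when df is even (both tests here have even df); it is not a general chi-square routine, and the function names are hypothetical. The 0.05 level is the one used throughout these slides.

```python
import math


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of df.

    Uses exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    term = 1.0   # k = 0 term
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k      # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


def decide(p_value, alpha=0.05):
    """Reject H0 when the p-value falls below the significance level."""
    return "reject H0" if p_value < alpha else "retain H0"


results = {}
for name, stat, df in [("GPA by UNITS", 3.67, 8), ("AGE by UNITS", 9.89, 4)]:
    p = chi2_sf_even_df(stat, df)
    results[name] = (p, decide(p))
    print(name, round(p, 3), decide(p))
```

Because the statistics on the slides are rounded to two decimals, the recomputed p-values agree with the reported ones only to about three decimal places.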
Cross-Tabs
Frequency Table for age by units

                            UNITS
AGE          |    <6    |   6-12   |   >12    | AGE Total
------------------------------------------------------------
<18          |    10    |    19    |    17    |     46
             |  17.24%  |  20.88%  |  33.33%  |  23.00%
18-26        |    24    |    22    |    16    |     62
             |  41.38%  |  24.18%  |  31.37%  |  31.00%
>26          |    24    |    50    |    18    |     92
             |  41.38%  |  54.95%  |  35.29%  |  46.00%
------------------------------------------------------------
UNITS Total  |    58    |    91    |    51    |    200
             |  29.00%  |  45.50%  |  25.50%  | 100.00%
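As a check, the chi-square statistic for age by units can be recomputed from the observed counts in this table, using χ² = Σ (fo - fe)² / fe with fe = (row total · column total) / n. This is a sketch in plain Python; the function name `chi_square` is hypothetical, and the statistical package used for the slides may compute the statistic differently in detail.

```python
# Observed counts from the age-by-units table (rows: <18, 18-26, >26;
# columns: <6, 6-12, >12 units).
observed = [
    [10, 19, 17],
    [24, 22, 16],
    [24, 50, 18],
]


def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected under independence
            stat += (fo - fe) ** 2 / fe
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


stat, df = chi_square(observed)
print(round(stat, 2), df)  # prints: 9.89 4
```

The result matches the chi-square value and degrees of freedom reported for the age and units test.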
Questions?
ANOVA