
1 If we can reduce our desire,
then all worries that bother us will disappear.

2 Statistical Package Usage
Topic: Basic Categorical Data Analysis
By Prof. Kelly Fan, Cal. State Univ., East Bay

3 Outline
Only categorical variables are discussed here.
Verify the hypothesized distribution
  One-sample chi-square test
Test the independence between two categorical variables
  Chi-square test for two-way contingency table
  McNemar's test for paired data
  Measure the dependence (phi and kappa coefficients)
  Odds ratios and relative risk
Test the trend of a binary response
  Chi-square test for trend
Meta-analysis

4 Example: Hair Color Distribution
From a random sample of 246 children:

            Fair    Red     Medium  Dark    Black
Frequency   76      19      83      65      3
%           30.89   7.72    33.74   26.42   1.22
Test %      30      12      30      25      3

5 One-sample Chi-Square Test
Must be a random sample
The sample size must be large enough that expected frequencies are greater than or equal to 5 for 80% or more of the categories

6 One-sample Chi-Square Test
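Test statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - e_i)^2}{e_i}$$
O_i = the observed frequency of the i-th category
e_i = the expected frequency of the i-th category
k = the number of categories; the statistic has k - 1 degrees of freedom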

7 Chi-Square Test for Specified Proportions
SAS Output
Chi-Square Test for Specified Proportions
Chi-Square    7.7602
DF            4
Pr > ChiSq    0.1008
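The one-sample test can be checked by hand. The sketch below is not the SAS procedure itself; it recomputes the statistic from the hair color frequencies using only the Python standard library. The hypothesized percentages (30, 12, 30, 25, 3) are an assumption: they are the values that sum to 100 and reproduce the reported chi-square of 7.7602. The function names are illustrative.

```python
import math

def chi_square_gof(observed, expected_pct):
    """Pearson goodness-of-fit statistic against hypothesized percentages."""
    n = sum(observed)
    expected = [n * p / 100.0 for p in expected_pct]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1, expected

def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

observed = [76, 19, 83, 65, 3]      # Fair, Red, Medium, Dark, Black
test_pct = [30, 12, 30, 25, 3]      # assumed hypothesized percentages

stat, df, expected = chi_square_gof(observed, test_pct)
p_value = chi2_upper_tail_even_df(stat, df)

# Rule of thumb from the slides: expected count >= 5 in at least 80% of categories
share_large = sum(e >= 5 for e in expected) / len(expected)

print(f"chi-square = {stat:.4f}, df = {df}, p = {p_value:.4f}")
print(f"share of categories with expected count >= 5: {share_large:.0%}")
```

With a p-value near 0.10, the hypothesized distribution is not rejected at the 5% level.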

8 Two-way Contingency Tables
Report frequencies on two variables
Such tables are also called crosstabs.

9 Contingency Tables (Crosstabs)
1991 General Social Survey
Frequency of party identification by race:

Race     Democrat   Independent   Republican
White    341        105           405
Black    103        15            11

10 Crosstabs Analysis (SAS: p.88-90; SPSS: p.369-371)
Chi-square test for the independence of two variables. Under independence:
For a fixed column, the distribution of frequencies over rows stays the same regardless of the column
For a fixed row, the distribution of frequencies over columns stays the same regardless of the row

11 Crosstabs Analysis
The phi coefficient measures the association between two categorical variables
-1 <= phi <= 1 for a 2x2 table
|phi| indicates the strength of the association
If the two variables are both ordinal, then the sign of phi indicates the direction of the association

12 SAS Output
Statistic                      DF   Value     Prob
Chi-Square                     2    79.4310   <.0001
Likelihood Ratio Chi-Square                   <.0001
Mantel-Haenszel Chi-Square                    <.0001
Phi Coefficient
Contingency Coefficient
Cramer's V
Sample Size = 980
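As a cross-check on the output above, here is a minimal sketch that recomputes the Pearson chi-square for the race by party identification table and derives the association measures from it. It uses plain Python rather than SAS. The shortcut p-value formula is valid only because this table has 2 degrees of freedom. For a table larger than 2x2, phi is computed as sqrt(chi-square / n) and is therefore nonnegative.

```python
import math

def pearson_chi_square(table):
    """Pearson chi-square for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, n

# Rows: White, Black; columns: Democrat, Independent, Republican
table = [[341, 105, 405],
         [103, 15, 11]]

chi2, df, n = pearson_chi_square(table)
p_value = math.exp(-chi2 / 2)          # exact chi-square upper tail, df = 2 only

phi = math.sqrt(chi2 / n)
contingency_coef = math.sqrt(chi2 / (chi2 + n))
cramers_v = math.sqrt(chi2 / (n * min(len(table) - 1, len(table[0]) - 1)))

print(f"chi-square = {chi2:.4f}, df = {df}, p = {p_value:.3g}")
print(f"phi = {phi:.4f}, contingency coefficient = {contingency_coef:.4f}, "
      f"Cramer's V = {cramers_v:.4f}")
```

Because one dimension of the table has only two levels, Cramer's V coincides with phi here.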

13 Fisher’s Exact Test for Independence
The chi-square tests are for large samples
The sample size must be large enough that expected frequencies are greater than or equal to 5 for 80% or more of the categories

14 SAS Output
Fisher's Exact Test
Table Probability (P)   3.823E-22
Pr <= P                 E-20
Sample Size = 980
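The table probability in this output is the hypergeometric probability of the observed table given its margins. The exact p-value adds up the probabilities of all tables with the same margins that are no more likely than the observed one. The sketch below illustrates that definition for a 2x3 table by brute-force enumeration. It is an assumption-laden illustration, not the algorithm SAS uses, and the function names are made up for this example.

```python
import math

def log_factorial(k):
    return math.lgamma(k + 1)

def fisher_exact_2x3(table):
    """Table probability and exact p-value for a 2x3 table (brute-force sketch)."""
    rows = [sum(r) for r in table]
    c1, c2, c3 = (sum(c) for c in zip(*table))
    n = sum(rows)
    # Part of the log-probability that depends only on the fixed margins
    log_const = (sum(log_factorial(r) for r in rows)
                 + sum(log_factorial(c) for c in (c1, c2, c3))
                 - log_factorial(n))

    def log_prob(cells):
        return log_const - sum(log_factorial(x) for x in cells)

    log_observed = log_prob(table[0] + table[1])
    p_value = 0.0
    r1 = rows[0]
    # Enumerate every first row compatible with the margins
    for a in range(min(r1, c1) + 1):
        for b in range(min(r1 - a, c2) + 1):
            c = r1 - a - b
            if c > c3:
                continue
            lp = log_prob([a, b, c, c1 - a, c2 - b, c3 - c])
            if lp <= log_observed + 1e-7:   # small tolerance for float ties
                p_value += math.exp(lp)
    return math.exp(log_observed), p_value

table = [[341, 105, 405],
         [103, 15, 11]]
table_prob, exact_p = fisher_exact_2x3(table)
print(f"table probability = {table_prob:.3e}")
print(f"exact p-value (Pr <= P) = {exact_p:.3e}")
```

Enumeration is feasible here because a 2x3 table with fixed margins has only two free cells; larger tables need a smarter algorithm.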

15 Matched-pair Data
Comparing categorical responses for two “paired” samples, when either
  each sample has the same subjects (that is, subjects are measured twice), or
  a natural pairing exists between each subject in one sample and a subject from the other sample (e.g., twins)

16 Example: Rating for Prime Minister
                Second Survey
First Survey    Approve   Disapprove
Approve         794       150
Disapprove      86        570

17 Marginal Homogeneity
The probabilities of “success” are identical for both samples
E.g., the probability of approval is the same at the first and second surveys

18 McNemar Test (for 2x2 Tables only)
See SAS textbook Section 3.L
Ho: marginal homogeneity
Ha: no marginal homogeneity
Exact p-value
Approximate p-value (when n12 + n21 > 10)

19 SAS Output
McNemar's Test
Statistic (S)        17.3559
DF                   1
Asymptotic Pr > S    <.0001
Exact Pr >= S        E-05
Simple Kappa Coefficient (level of agreement)
Kappa
ASE
95% Lower Conf Limit
95% Upper Conf Limit
Sample Size = 1600
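McNemar's statistic depends only on the two discordant cells, S = (n12 - n21)^2 / (n12 + n21). The sketch below recomputes it for the prime minister ratings, together with an asymptotic p-value, an exact binomial p-value, and the simple kappa coefficient. The two-sided exact p-value is computed here as twice the smaller binomial tail, which is an assumption about how the exact test is defined. Requires Python 3.8+ for `math.comb`.

```python
import math

def mcnemar(table):
    """McNemar statistic with asymptotic and exact (binomial) p-values."""
    n12, n21 = table[0][1], table[1][0]
    m = n12 + n21
    stat = (n12 - n21) ** 2 / m
    # Upper tail of a chi-square with 1 df
    asymptotic_p = math.erfc(math.sqrt(stat / 2))
    # Under Ho, n12 ~ Binomial(m, 1/2); double the smaller tail, cap at 1
    tail = sum(math.comb(m, i) for i in range(min(n12, n21) + 1))
    exact_p = min(1.0, 2 * tail / 2 ** m)
    return stat, asymptotic_p, exact_p

def simple_kappa(table):
    """Cohen's kappa: agreement beyond what the margins alone would give."""
    n = sum(sum(row) for row in table)
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    p_expected = sum(r * c for r, c in zip(rows, cols)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Rows: first survey (approve, disapprove); columns: second survey
table = [[794, 150],
         [86, 570]]

stat, asymptotic_p, exact_p = mcnemar(table)
kappa = simple_kappa(table)
print(f"S = {stat:.4f}, asymptotic p = {asymptotic_p:.3g}, exact p = {exact_p:.3g}")
print(f"kappa = {kappa:.4f}")
```

Both p-values lead to rejecting marginal homogeneity, while kappa shows that individual answers still agree substantially across the two surveys.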

20 Comparing Proportions in 2x2 Tables
Difference of proportions: pi1 - pi2
Relative risk: pi1 / pi2
Odds ratio: odds1 / odds2, where odds1 = pi1 / (1 - pi1) and odds2 = pi2 / (1 - pi2)

21 Example: Aspirin vs. Heart Attack
Prospective sampling; row totals were fixed

Frequency   Heart attack   No heart attack
Placebo     189            10845
Aspirin     104            10933
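The three comparison measures can be computed directly from the aspirin table. This is a minimal sketch in plain Python, with row 1 (placebo) taken as group 1 and "heart attack" as the event; the function name is illustrative.

```python
def compare_proportions(table):
    """Difference of proportions, relative risk, and odds ratio for a 2x2 table.

    Rows are groups; the first column counts the event of interest.
    """
    (a, b), (c, d) = table
    p1 = a / (a + b)
    p2 = c / (c + d)
    odds1 = p1 / (1 - p1)
    odds2 = p2 / (1 - p2)
    return {
        "p1": p1,
        "p2": p2,
        "difference": p1 - p2,
        "relative_risk": p1 / p2,
        "odds_ratio": odds1 / odds2,
    }

# Rows: placebo, aspirin; columns: heart attack, no heart attack
table = [[189, 10845],
         [104, 10933]]

result = compare_proportions(table)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Because heart attacks are rare in both groups, the odds ratio and the relative risk are close to each other.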

22 Chi-square Test for Trend
Situation: a binary response (success, failure) + an ordinal explanatory variable
Question: Is there a trend? Are the proportions of success across the levels of the explanatory variable increasing or decreasing in a linear fashion?

23 Example: Shoulder Harness Usage
Use?   Large Cars   Medium Cars   Small Cars
No     226          165           175
Yes    83           70            71

Question: Is the proportion of shoulder harness usage increasing or decreasing linearly as the car size gets larger?

24 SAS Output
Statistics for Table of response by car_size
Statistic   DF   Value   Prob
Chi-Square
Likelihood Ratio Chi-Square
Mantel-Haenszel Chi-Square
Phi Coefficient
Contingency Coefficient
Cramer's V
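The statistic relevant to a linear trend in this output is the Mantel-Haenszel chi-square, which equals (n - 1) r^2, where r is the Pearson correlation between the row and column scores, and has 1 degree of freedom. The sketch below computes it for the shoulder harness data. The equally spaced scores 1, 2, 3 for large, medium, and small cars and the 0/1 coding of the response are assumptions; other scores would give a different value.

```python
import math

def mantel_haenszel_trend(successes, totals, scores):
    """Mantel-Haenszel chi-square (n - 1) * r^2 for a binary response by ordinal group."""
    n = sum(totals)
    total_successes = sum(successes)
    mean_x = sum(s * t for s, t in zip(scores, totals)) / n
    mean_y = total_successes / n
    # Sums of squares and cross-products over all n subjects, grouped by level
    sxx = sum(t * (s - mean_x) ** 2 for s, t in zip(scores, totals))
    syy = total_successes * (1 - mean_y) ** 2 + (n - total_successes) * mean_y ** 2
    sxy = sum((s - mean_x) * (y - t * mean_y)
              for s, y, t in zip(scores, successes, totals))
    r_squared = sxy ** 2 / (sxx * syy)
    stat = (n - 1) * r_squared
    p_value = math.erfc(math.sqrt(stat / 2))   # chi-square upper tail, 1 df
    return stat, p_value

# Columns: large, medium, small cars
used_harness = [83, 70, 71]
not_used = [226, 165, 175]
totals = [y + n for y, n in zip(used_harness, not_used)]
scores = [1, 2, 3]                              # assumed equally spaced scores

trend_stat, trend_p = mantel_haenszel_trend(used_harness, totals, scores)
print(f"Mantel-Haenszel chi-square = {trend_stat:.4f}, p = {trend_p:.4f}")
```

With these scores the data show no evidence of a linear trend in harness usage across car sizes.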

25 Meta Analysis
Also known as the Mantel-Haenszel test or stratified analysis
Situation: another variable (the strata) may “pollute” the effect of a categorical explanatory variable on a categorical response
Goal: study the effect of the explanatory variable while controlling for the stratification variable

26 Example: Respiratory Improvement
Center   Treatment   Yes   No   Total
1        Test        29    16   45
         Placebo     14    31   45
         Total       43    47   90
2        Test        37    9
         Placebo     24    21
         Total       61

27 SAS Output
Statistics for Table 1 of trtmnt by response
Controlling for center=1
Statistic   DF   Value   Prob
Chi-Square
Likelihood Ratio Chi-Square
Continuity Adj. Chi-Square
Mantel-Haenszel Chi-Square
Phi Coefficient
Contingency Coefficient
Cramer's V
Estimates of the Relative Risk (Row1/Row2)
Type of Study   Value   % Confidence Limits
Case-Control (Odds Ratio)
Cohort (Col1 Risk)
Cohort (Col2 Risk)
Sample Size = 90

28 SAS Output
Summary Statistics for trtmnt by response
Controlling for center
Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic   Alternative Hypothesis    DF   Value   Prob
            Nonzero Correlation                    <.0001
            Row Mean Scores Differ                 <.0001
            General Association                    <.0001

29 SAS Output
Estimates of the Common Relative Risk (Row1/Row2)
Type of Study               Method            Value   % Confidence Limits
Case-Control (Odds Ratio)   Mantel-Haenszel
                            Logit
Cohort (Col1 Risk)          Mantel-Haenszel
                            Logit
Cohort (Col2 Risk)          Mantel-Haenszel
                            Logit
Breslow-Day Test for Homogeneity of the Odds Ratios
Chi-Square
DF
Pr > ChiSq
Total Sample Size = 180
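For a set of 2x2 tables, the Cochran-Mantel-Haenszel statistic pools, across strata, the difference between the observed count in the first cell and its expectation under no association. The Mantel-Haenszel common odds ratio pools the cross-products in the same way. The sketch below is a generic illustration without continuity correction, not SAS code. It is run on the center 1 table only, because the center 2 counts are not fully legible in this transcript; the complete list of strata should be supplied to reproduce a stratified result.

```python
import math

def cmh_2x2(strata):
    """CMH chi-square (1 df) and Mantel-Haenszel common odds ratio.

    `strata` is a list of 2x2 tables [[a, b], [c, d]], one per stratum.
    """
    diff_sum = 0.0      # sum over strata of a - E(a)
    var_sum = 0.0       # sum over strata of Var(a)
    or_num = 0.0        # sum of a*d/n
    or_den = 0.0        # sum of b*c/n
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        diff_sum += a - row1 * col1 / n
        var_sum += row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    stat = diff_sum ** 2 / var_sum
    p_value = math.erfc(math.sqrt(stat / 2))   # chi-square upper tail, 1 df
    return stat, p_value, or_num / or_den

# Center 1 only: rows test, placebo; columns improved yes, no
center_1 = [[29, 16],
            [14, 31]]

cmh_stat, cmh_p, mh_odds_ratio = cmh_2x2([center_1])
print(f"CMH chi-square = {cmh_stat:.4f}, p = {cmh_p:.4f}")
print(f"Mantel-Haenszel odds ratio = {mh_odds_ratio:.4f}")
```

With a single stratum, the statistic reduces to (n - 1)/n times the ordinary Pearson chi-square for that table, and the pooled odds ratio is just the table's own odds ratio.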

