Data analysis: cross-tabulation GAP Toolkit 5 Training in basic drug abuse data management and analysis Training session 11.


1 Data analysis: cross-tabulation GAP Toolkit 5 Training in basic drug abuse data management and analysis Training session 11

2 Objectives
–To introduce cross-tabulation as a method of investigating the relationship between two categorical variables
–To describe the SPSS facilities for cross-tabulation
–To discuss a range of simple statistics to describe the relationship between two categorical variables
–To reinforce the range of SPSS skills learnt to date

3 Bivariate analysis
The relationship between two variables
A two-way table:
–Rows: categories of one variable
–Columns: categories of the second variable
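
As a rough illustration of how a two-way table is built from raw records, the sketch below counts joint frequencies with Python's standard library. The records and the function name cross_tabulate are hypothetical and are not part of the training data set or of SPSS.

```python
from collections import Counter

# Hypothetical records: one (mode of ingestion, gender) pair per respondent.
records = [
    ("Swallow", "Male"), ("Smoke", "Male"), ("Swallow", "Female"),
    ("Smoke", "Male"), ("Inject", "Female"), ("Swallow", "Male"),
]

def cross_tabulate(pairs):
    """Count joint frequencies; rows come from the first variable,
    columns from the second."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return table, rows, cols

table, rows, cols = cross_tabulate(records)
for r in rows:
    print(r, [table[r][c] for c in cols])
```

Each row of the printed output corresponds to one category of the row variable, with one count per category of the column variable.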

4 Gender
                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Male               1251      79.6            79.9                 79.9
          Female              314      20.0            20.1                100.0
          Total              1565      99.6           100.0
Missing   System                6        .4
Total                        1571     100.0

5 Mode of ingestion Drug 1
                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Swallow             794      50.5            51.0                 51.0
          Smoke               634      40.4            40.7                 91.7
          Snort                62       3.9             4.0                 95.6
          Inject               30       1.9             1.9                 97.6
          12.00                 2        .1              .1                 97.7
          15.00                 1        .1              .1                 97.8
          23.00                10        .6              .6                 98.4
          24.00                11        .7              .7                 99.1
          25.00                 5        .3              .3                 99.4
          34.00                 4        .3              .3                 99.7
          234.00                5        .3              .3                100.0
          Total              1558      99.2           100.0
Missing   System               13        .8
Total                        1571     100.0
Out-of-range values: 12.00 to 234.00 (note that none of the digits are > 5)

6 Cleaning Mode1
–Save a copy of the original
–Recode the out-of-range values into a new value (for example, 12, 15, 23, 24, 25, 34 and 234 into the value 8)
–Set the new value as a user-defined missing value (for example, 8 is declared a missing value and given the label “Out-of-range”)
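
The cleaning steps can be sketched outside SPSS as follows. This is only an illustration: the numeric codes 1 to 4 for the four valid modes, the sample values and the function name clean_mode are assumptions, and None stands in for a system-missing value.

```python
# Assumed coding (not confirmed by the slides): 1-4 are the valid modes.
VALID_CODES = {1: "Swallow", 2: "Smoke", 3: "Snort", 4: "Inject"}
OUT_OF_RANGE = 8  # new value, to be treated as user-defined missing

def clean_mode(value):
    """Recode any value outside the valid codes to OUT_OF_RANGE."""
    if value is None:
        return None            # system-missing stays missing
    if value in VALID_CODES:
        return value
    return OUT_OF_RANGE        # e.g. 12, 15, 23, 24, 25, 34, 234

# Hypothetical raw values; the original list is kept untouched (the "copy").
mode1 = [1, 2, 12, 4, None, 234, 3, 25]
mode1_clean = [clean_mode(v) for v in mode1]

# Valid cases exclude both kinds of missing value.
valid = [v for v in mode1_clean if v is not None and v != OUT_OF_RANGE]
print(mode1_clean)
print(len(valid), "valid cases")
```

Writing the recoded values to a new list rather than overwriting mode1 mirrors the advice to save a copy of the original.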

7 Mode of ingestion Drug 1
                             Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Swallow                  794      50.5            52.2                 52.2
          Smoke                    634      40.4            41.7                 93.9
          Snort                     62       3.9             4.1                 98.0
          Inject                    30       1.9             2.0                100.0
          Total                   1520      96.8           100.0
Missing   Out-of-range              38       2.4
          System                    13        .8
          Total                     51       3.2
Total                             1571     100.0

8

9 Mode of ingestion Drug1 * Gender cross-tabulation
Count
                                Gender
Mode of ingestion Drug1     Male   Female    Total
Swallow                      600      194      794
Smoke                        553       77      630
Snort                         44       17       61
Inject                        20       10       30
Total                       1217      298     1515
Inner cells: joint frequencies. Total column: row totals. Total row: column totals. Bottom-right cell: grand total.

10 Percentages
The difference in sample size for men and women makes comparison of raw numbers difficult
Percentages facilitate comparison by standardizing the scale
There are three options for the denominator of the percentage:
–Grand total
–Row total
–Column total
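
To make the three denominators concrete, the sketch below computes all three percentages for every cell of the Mode of ingestion Drug1 * Gender count table. The counts are taken from that table; the variable and function names are illustrative only.

```python
# Joint frequencies from the Mode of ingestion Drug1 * Gender count table.
counts = {
    "Swallow": {"Male": 600, "Female": 194},
    "Smoke":   {"Male": 553, "Female": 77},
    "Snort":   {"Male": 44,  "Female": 17},
    "Inject":  {"Male": 20,  "Female": 10},
}
genders = ["Male", "Female"]

row_totals = {mode: sum(cells.values()) for mode, cells in counts.items()}
col_totals = {g: sum(cells[g] for cells in counts.values()) for g in genders}
grand_total = sum(row_totals.values())

def percentages(mode, gender):
    """Return the cell as % of grand total, % of row total, % of column total."""
    n = counts[mode][gender]
    return (
        100 * n / grand_total,
        100 * n / row_totals[mode],
        100 * n / col_totals[gender],
    )

print(f"{'Mode':8s}{'Gender':8s}{'%total':>8s}{'%row':>8s}{'%col':>8s}")
for mode in counts:
    for g in genders:
        of_total, of_row, of_col = percentages(mode, g)
        print(f"{mode:8s}{g:8s}{of_total:8.1f}{of_row:8.1f}{of_col:8.1f}")
```

Row percentages describe gender within each mode of ingestion, while column percentages describe mode of ingestion within each gender, so the choice of denominator changes the question being answered.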

11 Mode of ingestion Drug1 * Gender cross-tabulation
                                       Gender
Mode of ingestion Drug1            Male    Female     Total
Swallow   Count                     600       194       794
          % of Total              39.6%     12.8%     52.4%
Smoke     Count                     553        77       630
          % of Total              36.5%      5.1%     41.6%
Snort     Count                      44        17        61
          % of Total               2.9%      1.1%      4.0%
Inject    Count                      20        10        30
          % of Total               1.3%       .7%      2.0%
Total     Count                    1217       298      1515
          % of Total              80.3%     19.7%    100.0%
Inner cells: joint distribution of Mode1 and Gender. Total column: marginal distribution of Mode1. Total row: marginal distribution of Gender.

12 Mode of ingestion Drug1 * Gender cross-tabulation
                                                      Gender
Mode of ingestion Drug1                           Male    Female     Total
Swallow   Count                                    600       194       794
          % within Mode of ingestion Drug1       75.6%     24.4%    100.0%
Smoke     Count                                    553        77       630
          % within Mode of ingestion Drug1       87.8%     12.2%    100.0%
Snort     Count                                     44        17        61
          % within Mode of ingestion Drug1       72.1%     27.9%    100.0%
Inject    Count                                     20        10        30
          % within Mode of ingestion Drug1       66.7%     33.3%    100.0%
Total     Count                                   1217       298      1515
          % within Mode of ingestion Drug1       80.3%     19.7%    100.0%
Each row of percentages is the distribution of Gender conditional on Mode1.

13 Mode of ingestion Drug1 * Gender cross-tabulation
                                       Gender
Mode of ingestion Drug1            Male    Female     Total
Swallow   Count                     600       194       794
          % within Gender         49.3%     65.1%     52.4%
Smoke     Count                     553        77       630
          % within Gender         45.4%     25.8%     41.6%
Snort     Count                      44        17        61
          % within Gender          3.6%      5.7%      4.0%
Inject    Count                      20        10        30
          % within Gender          1.6%      3.4%      2.0%
Total     Count                    1217       298      1515
          % within Gender        100.0%    100.0%    100.0%
Each column of percentages is the distribution of Mode1 conditional on Gender.

14 Choosing percentages
“Construct the proportions so that they sum to one within the categories of the explanatory variable.”
Source: C. Marsh, Exploring Data: An Introduction to Data Analysis for Social Scientists (Cambridge, Polity Press, 1988), p. 143.

15

16

17

18

19 Dimensions Definitions of vertical and horizontal variables

20 Two-by-two tables
–Tables with two rows and two columns
–A range of simple descriptive statistics can be applied to two-by-two tables
–It is possible to collapse larger tables to these dimensions

21 Gender * White pipe cross-tabulation
                                    White pipe
Gender                              Yes        No     Total
Male      Count                     290       961      1251
          % within Gender         23.2%     76.8%    100.0%
Female    Count                      22       292       314
          % within Gender          7.0%     93.0%    100.0%
Total     Count                     312      1253      1565
          % within Gender         19.9%     80.1%    100.0%

22 Proportions within Gender
                    White pipe
Gender              Yes        No
Male             0.2318    0.7682
Female           0.0701    0.9299

23 Relative risk
Divide the probabilities for “success”:
–For example:
  P(Whitpipe=Yes|Gender=Male)=0.2318
  P(Whitpipe=Yes|Gender=Female)=0.0701
Relative risk is 0.2318/0.0701=3.309 (using the unrounded proportions; the rounded figures give about 3.307)
The proportion of males using white pipe was over three times that of females
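
The relative risk can be checked directly from the counts in the Gender * White pipe table. The function below is a minimal sketch with illustrative names, not an SPSS facility.

```python
def relative_risk(events_a, total_a, events_b, total_b):
    """Risk of the event in group A divided by the risk in group B."""
    return (events_a / total_a) / (events_b / total_b)

# From the Gender * White pipe table: 290 of 1251 males and
# 22 of 314 females reported white pipe use.
rr = relative_risk(290, 1251, 22, 314)
print(round(rr, 3))
```

Working from the counts avoids the small rounding error that appears when the four-decimal proportions are divided instead.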

24 Odds
The odds of “success” are the ratio of the probability of “success” to the probability of “failure”
For example:
–For males the odds of “success” are 0.2318/0.7682=0.302
–For females the odds of “success” are 0.0701/0.9299=0.075

25 Odds ratio
Divide the odds of success for males by the odds of success for females
For example: 0.302/0.075=4.005 (using the unrounded odds; the rounded figures give about 4.03)
The odds of taking white pipe as a male are four times those for a female
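
The odds and the odds ratio can likewise be computed from the counts in the Gender * White pipe table. Because the probabilities share the same denominator, the odds reduce to the count of "Yes" divided by the count of "No". The names below are illustrative.

```python
def odds(yes, no):
    """Odds of 'success': P(yes) / P(no), which simplifies to yes / no."""
    return yes / no

# Counts from the Gender * White pipe table.
odds_male = odds(290, 961)     # males: 290 Yes, 961 No
odds_female = odds(22, 292)    # females: 22 Yes, 292 No
odds_ratio = odds_male / odds_female

print(round(odds_male, 3), round(odds_female, 3), round(odds_ratio, 3))
```

The odds ratio is larger than the relative risk for the same table, since the two statistics coincide only when the event is rare in both groups.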

26

27 Risk estimate
                                                      95% Confidence interval
                                            Value       Lower       Upper
Odds ratio for Gender (Male / Female)       4.005       2.547       6.299
For cohort white pipe = Yes                 3.309       2.184       5.012
For cohort white pipe = No                   .826        .791        .862
N of valid cases                             1565
First row: odds ratio M/F. Second row: relative risk of “success”. Third row: relative risk of “failure”.
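
The intervals in the risk estimate table can be approximated with the usual large-sample method, which builds the interval on the log scale and then exponentiates. The sketch below assumes that method and a multiplier of 1.96; whether SPSS computes its intervals in exactly this way is an assumption, although the results land very close to the tabulated values. Function names are illustrative.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for the 2x2 table [[a, b], [c, d]]."""
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(odds ratio)
    log_est = math.log(estimate)
    return estimate, math.exp(log_est - z * se), math.exp(log_est + z * se)

def relative_risk_ci(a, b, c, d, z=1.96):
    """Relative risk of the first column and approximate 95% CI."""
    n1, n2 = a + b, c + d
    estimate = (a / n1) / (c / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)  # SE of log(relative risk)
    log_est = math.log(estimate)
    return estimate, math.exp(log_est - z * se), math.exp(log_est + z * se)

# Rows: Male, Female. Columns: white pipe Yes, No.
or_est, or_lo, or_hi = odds_ratio_ci(290, 961, 22, 292)
rr_yes = relative_risk_ci(290, 961, 22, 292)   # cohort white pipe = Yes
rr_no = relative_risk_ci(961, 290, 292, 22)    # cohort white pipe = No

print([round(x, 3) for x in (or_est, or_lo, or_hi)])
print([round(x, 3) for x in rr_yes])
print([round(x, 3) for x in rr_no])
```

None of the three intervals contains 1, which is consistent with a difference between males and females in white pipe use.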

28 Exercise 1: cross-tabulations
Create and comment on the following cross-tabulations:
–Age vs Gender
–Race vs Gender
–Education vs Gender
–Primary drugs vs Mode of ingestion
Suggest other cross-tabulations that would be useful

29 Exercise 2: cross-tabulation
–Construct a dichotomous variable for age: Up to 24 years and Above 24 years
–Construct a dichotomous variable for the primary drug of use: Alcohol and Not Alcohol
–Create a cross-tabulation of the two new variables and interpret
–Generate Relative Risks and Odds Ratios and interpret
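
As a hint for the first three steps, the sketch below shows one way to dichotomize two variables and cross-tabulate the result. The records, drug names other than Alcohol, and function names are hypothetical; the exercise itself should be carried out on the training data set.

```python
from collections import Counter

# Hypothetical (age, primary drug) records for illustration only.
clients = [(19, "Alcohol"), (31, "Cannabis"), (24, "Alcohol"),
           (45, "Alcohol"), (22, "Heroin")]

def age_group(age):
    """Dichotomize age at 24 years."""
    return "Up to 24 years" if age <= 24 else "Above 24 years"

def drug_group(drug):
    """Dichotomize primary drug into Alcohol versus everything else."""
    return "Alcohol" if drug == "Alcohol" else "Not Alcohol"

# Joint frequencies of the two new dichotomous variables.
table = Counter((age_group(a), drug_group(d)) for a, d in clients)

for age_cat in ("Up to 24 years", "Above 24 years"):
    print(age_cat, [table[(age_cat, d)] for d in ("Alcohol", "Not Alcohol")])
```

The resulting two-by-two table is the input needed for the relative risk and odds ratio in the final step.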

30 Summary
–Cross-tabulations
–Joint frequencies
–Marginal frequencies
–Row/Column/Total percentages
–Relative risk
–Odds
–Odds ratios

