Analysis of Discrete Variables


1 Analysis of Discrete Variables
Gender (x1 = male, x2 = female)
Education level (x1 = low, x2 = middle, x3 = high)
5-point scales (x1 = 1, x2 = 2, ..., x5 = 5)
Diagnosis (x1 = Neurosis, x2 = Schizophrenia, ...)

2 Distribution of Discrete Variables: The General Case
Values: x1, x2, x3, ..., xk
Probabilities: p1, p2, p3, ..., pk

3 An Example of Discrete Distribution
xi: 1 2 3
pi: 0.20 0.35 0.40 0.05

4 Analysis of 1 discrete variable in 1 population

5 Distribution Fitting
Hypothetical example: Which do you like best: Coke (x1), Pepsi (x2), or Fanta (x3)?
Null hypothesis: H0: P(x1) = P(x2) = P(x3) = 1/3
Obtained frequencies (ni): Out of 150 Ss, n1 = 80, n2 = 50, n3 = 20
Expected frequencies (νi): If H0 were true, one would expect 150/3 = 50 for each νi.
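Under H0, each expected frequency is the sample size multiplied by the hypothesized probability of that category. A minimal Python sketch for the soft-drink example follows; the function name `expected_frequencies` is illustrative and does not come from the slides.

```python
from fractions import Fraction

def expected_frequencies(n_total, probs):
    """Expected count per category under H0: N * P(x_i)."""
    return [n_total * p for p in probs]

# H0 from the slide: the three drinks are equally preferred.
h0_probs = [Fraction(1, 3)] * 3
observed = [80, 50, 20]  # Coke, Pepsi, Fanta (150 subjects)

expected = expected_frequencies(sum(observed), h0_probs)
print([int(e) for e in expected])  # [50, 50, 50]
```

Exact fractions are used here only to avoid floating-point rounding in 150 × 1/3.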

6 Distribution Fitting with 2-test
The greater the difference between obtained (ni) and expected (ni) frequencies, the greater the likelihood that H0 is false. A possible measure of the difference: c2 = (n1 - n1)2/n1 + (n2 - n2)2/n (ng - ng)2/ng If H0 is true, the distribution of c2 is approximately chi-square, with df = g - 1.

7 Calculations
Obtained (ni): 80 50 20 Σ = 150
Expected (νi): 50 50 50 Σ = 150
χ² = (80 - 50)²/50 + (50 - 50)²/50 + (20 - 50)²/50 = 18 + 0 + 18 = 36 > 9.210 = χ²0.01 (df = 2)
Thus we reject H0 and say: ‘The three proportions differ significantly.’
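The calculation on this slide can be reproduced in a few lines of Python. This is a sketch rather than the presenter's own code: the helper name `chi_square_gof` is illustrative, and the critical value 9.210 is the df = 2, α = 0.01 value quoted later in the deck.

```python
import math

def chi_square_gof(observed, expected):
    """Pearson chi-square statistic: sum of (obs - exp)^2 / exp."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [80, 50, 20]
expected = [50, 50, 50]

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1
CRITICAL_01_DF2 = 9.210  # chi-square critical value, alpha = 0.01, df = 2

print(chi2, df, chi2 > CRITICAL_01_DF2)  # 36.0 2 True

# For df = 2 only, the upper-tail probability has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(p_value < 0.01)  # True
```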

8 χ²-test
H0: P(x1) = p1, P(x2) = p2, ..., P(xg) = pg
HA: For at least one i: P(xi) ≠ pi
Condition: νi ≥ 5
X-sample: compute χ² (df = g - 1)
χ² < χ²0.05: Keep H0
χ² ≥ χ²0.05: Reject H0
[Figure: density curve of the χ² distribution; 0.95 of the area lies below χ²0.05 and 0.05 above it]
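The whole procedure on this slide (hypothesized probabilities, sample-size condition, decision rule) could be wrapped in one function. The sketch below assumes that the condition "at least 5" refers to the expected frequencies; `gof_test` and its parameters are hypothetical names, and the critical value must be supplied by the caller.

```python
def gof_test(observed, probs, critical_value):
    """Return (chi2, df, decision) for H0: P(x_i) = p_i."""
    n_total = sum(observed)
    expected = [n_total * p for p in probs]
    # Assumed reading of the slide's condition: every expected count >= 5.
    if min(expected) < 5:
        raise ValueError("every expected frequency must be at least 5")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    decision = "reject H0" if chi2 >= critical_value else "keep H0"
    return chi2, df, decision

# Soft-drink data from the earlier slides, tested at the 5% level (df = 2).
chi2, df, decision = gof_test([80, 50, 20], [1 / 3] * 3, critical_value=5.991)
print(round(chi2, 3), df, decision)
```

Raising an error when the condition fails is one possible design choice; returning a warning flag would work equally well.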

9 Comparing 2 Populations by means of 1 Discrete Variable
Example: Is there a difference between males and females with respect to education level (EL)?
H0: The distribution of EL is the same among males and females
P(xi|Males) = P(xi|Females), (i = 1, 2, 3)
x1 = Low, x2 = Middle, x3 = High

10 Two-Way Frequency Table
          Low   Middle   High   Total
Male       16     32      32    n1 = 80
Female     18     45      27    n2 = 90
Total      34     77      59    N = 170

11 Two-Way Frequency Table: Row Percentages
          Low   Middle   High    Total
Male      20%    40%     40%     100%
Female    20%    50%     30%     100%
Total     20%    45.3%   34.7%   100%
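Row percentages divide each cell by its row total. A small sketch using the male/female counts from the frequency table; the variable and function names are illustrative.

```python
levels = ["Low", "Middle", "High"]
counts = {"Male": [16, 32, 32], "Female": [18, 45, 27]}

def row_percentages(row):
    """Convert one row of counts to percentages of the row total."""
    total = sum(row)
    return [100 * c / total for c in row]

# Column totals give the "Total" row of the table.
counts["Total"] = [sum(col) for col in zip(*counts.values())]

for group, row in counts.items():
    cells = ", ".join(
        f"{lvl}: {p:.1f}%" for lvl, p in zip(levels, row_percentages(row))
    )
    print(f"{group:<7}{cells}")
```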

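12 χ²-test for Comparing Groups
If H0 is true, then χ² = Σi Σj (nij - νij)²/νij, where νij is the expected frequency of cell (i, j), approximately follows a χ²-distribution with df = (g-1)·(h-1).
Decision:
χ² < χ²0.05: Keep H0 (p > .05, n. s.).
χ² ≥ χ²0.05: Reject H0 (p < .05, significant).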

13 Comparison of Males and Females
Number of rows: g = 2
Number of columns: h = 3
Degrees of freedom: df = (2-1)×(3-1) = 2
Critical values: χ²0.1 = 4.605; χ²0.05 = 5.991; χ²0.01 = 9.210
Computed chi-square value: χ² = 2.155
Decision: Keep H0 (p > .10, n. s.).
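The computed value can be checked directly from the two-way frequency table. The sketch below derives each expected cell count as (row total × column total) / N and sums the Pearson terms; `chi_square_two_way` is an illustrative name, not part of the slides.

```python
def chi_square_two_way(table):
    """Pearson chi-square and df for a g x h table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count of cell (i, j)
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: Male, Female. Columns: Low, Middle, High.
chi2, df = chi_square_two_way([[16, 32, 32], [18, 45, 27]])
print(round(chi2, 3), df)  # 2.155 2
print("keep H0" if chi2 < 4.605 else "reject H0 at the 10% level")
```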

14 General Case
Samples     X=x1   X=x2   X=x3   ...   Total
Sample 1    n11    n12    n13    ...   n1
Sample 2    n21    n22    n23    ...   n2
...
Total       m1     m2     m3     ...   N
Expected frequencies: νij = (ni×mj)/N
df = (g-1)×(h-1)
Condition: νij ≥ 5

15 Comparing the Distribution of 2 Variables in 1 Population
Example: Lecture about the disadvantages of smoking.
Outcome: 8 of 36 students give up smoking, 3 start smoking. Any effect?
H0: The proportion of smokers does not change.
Indicator of change: x1 = positive change, x2 = negative change
H0: P(x1) = P(x2)

16 Computation: McNemar’s test
Frequency table (Smoking):
              Time 2: No   Time 2: Yes
Time 1: No    a            b = 8
Time 1: Yes   c = 3        d
Condition: (b+c)/2 ≥ 5, that is, b+c ≥ 10
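The slide's formula did not survive in the transcript. As a sketch, the usual uncorrected McNemar statistic is (b - c)²/(b + c) with df = 1, and that form is assumed here; the slides may have used a continuity-corrected variant. The critical value 3.841 is the standard α = 0.05 value for df = 1.

```python
def mcnemar_chi2(b, c):
    """Uncorrected McNemar statistic from the two 'changer' cells (df = 1)."""
    if b + c < 10:
        raise ValueError("condition b + c >= 10 is not met")
    return (b - c) ** 2 / (b + c)

# Smoking example: 8 students changed in one direction, 3 in the other.
chi2 = mcnemar_chi2(b=8, c=3)
CRITICAL_05_DF1 = 3.841  # chi-square critical value, alpha = 0.05, df = 1

print(round(chi2, 3))  # 2.273
print("reject H0" if chi2 >= CRITICAL_05_DF1 else "keep H0")  # keep H0
```

Only the discordant cells b and c enter the statistic; students whose smoking status did not change carry no information about the direction of change.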

17 More General Cases
X is arbitrary, two related samples: Bowker’s test
X is dichotomous, h related samples: Cochran’s Q test

18 Relationship of 2 Discrete Variables
[Table: Girls at 15, Makes friends easily]
Test of independence = Comparisons

19 Table of Row Percentages
[Table: Girls at 15, Makes friends easily]

20 Table of Column Percentages
[Table: Girls at 15, Makes friends easily]

21 Strength of Relationship
Cramér’s contingency coefficient
Ordinally scaled variables: Kendall’s Γ
Dichotomous variables: Γ = Yule’s Q
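The formulas for these coefficients are not preserved in the transcript. The sketch below uses the usual textbook definitions as an assumption: Cramér's coefficient as sqrt(χ² / (N · (min(g, h) - 1))) and Yule's Q as (ad - bc)/(ad + bc) for a 2×2 table with cells a, b, c, d. The 2×2 counts passed to `yules_q` are hypothetical, chosen only to show the call.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's coefficient from a chi-square value and the table size."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

def yules_q(a, b, c, d):
    """Yule's Q for a 2x2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / (a * d + b * c)

# Male/female by education level: chi-square = 2.155, N = 170, 2 x 3 table.
v = cramers_v(chi2=2.155, n=170, n_rows=2, n_cols=3)
print(round(v, 3))  # 0.113

# Hypothetical 2x2 counts, for illustration only.
print(yules_q(30, 10, 10, 30))  # 0.8
```

A value near 0.11 is in line with the non-significant test result for the same table.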

