PY 427 Statistics 1, Fall 2006. Kin Ching Kong, Ph.D. Lecture 12. Chicago School of Professional Psychology.

1 PY 427 Statistics 1, Fall 2006. Kin Ching Kong, Ph.D. Lecture 12. Chicago School of Professional Psychology

2 Agenda
- Correlation & Regression (continued; see Lecture 11 outline)
  - Interpreting Correlations
  - The Point-Biserial Correlation
  - Introduction to Regression
- Chi-Square (χ²)
  - Parametric & Nonparametric Tests
  - Chi-Square Test for Goodness of Fit
  - The Chi-Square Distribution & df
  - The Chi-Square Test for Independence
  - Measuring Effect Size for the χ² Test of Independence
  - Assumptions & Restrictions for Chi-Square Tests

3 Parametric & Nonparametric Tests
Parametric tests:
- Test hypotheses about population parameters (e.g. μ, μ₁ – μ₂).
- Make assumptions about the shape of the population distribution and about population parameters (e.g. ANOVA requires normal population distributions and homogeneity of variance).
- Require data measured on interval or ratio scales.
Nonparametric tests:
- Are distribution-free: they make few (if any) assumptions about the population distribution.
- Work with nominal or ordinal scale data, for which a mean and variance cannot be calculated.
- Often use frequencies as their data.
NOTE: Nonparametric tests are less sensitive (i.e. less powerful) than parametric tests. Use a parametric test whenever you can.

4 Chi-Square (χ²) Test for Goodness of Fit
The chi-square (χ²) test for goodness of fit is a hypothesis-testing procedure that uses sample proportions to test hypotheses about the shape or proportions of a population distribution.
Examples:
- What are the proportions of male and female psychologists?
- Of two brands of O.J., which is preferred by most Americans?
- To what extent are ethnic minority groups represented in post-graduate education?
The test evaluates how well the obtained sample proportions fit the population proportions specified by the null hypothesis.

5 The Null Hypothesis for the Goodness-of-Fit Test
The null hypothesis specifies the proportion (or percentage) of the population in each measurement category, e.g. 90% of lawyers are men and 10% are women.

| | Men | Women |
|---|---|---|
| H₀ | 90% | 10% |

General types of null hypotheses:
- No preference among the categories, e.g. no preference among 3 brands of soda.

| | Brand X | Brand Y | Brand Z |
|---|---|---|---|
| H₀ | 1/3 | 1/3 | 1/3 |

- No difference from a known population, e.g. 60% of Americans oppose the president's foreign policy and 40% favor it. A researcher wonders whether the same pattern exists among Europeans.

| | Favor | Oppose |
|---|---|---|
| H₀ | 40% | 60% |

6 The Data for the Goodness-of-Fit Test
Data for the goodness-of-fit chi-square: just count how many individuals in your sample fall in each category. This count is the observed frequency, f_o.
e.g. observed frequencies for a sample of n = 40 classified into three personality types:

| | Category A | Category B | Category C |
|---|---|---|---|
| f_o | 15 | 19 | 6 |

n = 40, and Σf_o = n.

7 The Expected Frequencies
The expected frequency, f_e, for each category is the frequency value predicted from the null hypothesis and the sample size (n):

$$f_e = pn$$

e.g.

| | Category A | Category B | Category C |
|---|---|---|---|
| H₀ | 25% | 50% | 25% |

According to this null hypothesis, how would a random sample of n = 40 distribute across the three categories?
- 25% of 40 = .25(40) = 10 in Category A
- 50% of 40 = .50(40) = 20 in Category B
- 25% of 40 = .25(40) = 10 in Category C
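
The rule f_e = pn can be sketched in a few lines of Python. This is only an illustration: the function name `expected_frequencies` and the dictionary layout are assumptions made for this example, not part of the lecture.

```python
def expected_frequencies(proportions, n):
    """Return f_e = p * n for each category under the null hypothesis.

    `proportions` maps a category label to its H0 proportion (hypothetical layout).
    """
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("null-hypothesis proportions must sum to 1")
    return {category: p * n for category, p in proportions.items()}


# H0 from the slide: 25% / 50% / 25% across Categories A, B, C, with n = 40.
h0 = {"A": 0.25, "B": 0.50, "C": 0.25}
f_e = expected_frequencies(h0, n=40)
print(f_e)
```

Note that expected frequencies need not be whole numbers; they are hypothetical values, not counts of real people.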

8 The Chi-Square Statistic
The χ² test for goodness of fit measures how well the data (the observed frequencies) fit the distribution defined by the null hypothesis (the expected frequencies).

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

- The numerator of χ² simply measures how much difference there is between the data and the null hypothesis for each category.
- This difference is divided by the expected frequency to determine the relative size of the obtained discrepancy between f_o and f_e.
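
As a rough sketch of the formula, the snippet below applies it to the observed counts from slide 6 (15, 19, 6) and the expected counts from slide 7 (10, 20, 10). Pairing those two slides and the helper name `chi_square` are assumptions made for illustration; the lecture itself does not compute this value.

```python
def chi_square(observed, expected):
    """Sum (f_o - f_e)^2 / f_e over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [15, 19, 6]    # f_o for Categories A, B, C (n = 40)
expected = [10, 20, 10]   # f_e under H0: 25% / 50% / 25%

# Per-category contributions show which categories drive the misfit.
contributions = [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]
statistic = chi_square(observed, expected)
print(contributions)
print(round(statistic, 2))
```

Dividing by f_e is what makes a discrepancy of 5 count for more when only 10 were expected than when 1,000 were expected.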

9 The Chi-Square Distribution & df
The typical chi-square distribution:
- All χ² values are ≥ 0.
- When H₀ is true, the data (f_o) should be close to the hypothesis (f_e), so chi-square should be small (close to 0).
- The chi-square distribution is positively skewed. See Figure 16.2.
Degrees of freedom:
- There is a whole family of chi-square distributions; the exact shape is determined by the df.
- For the goodness-of-fit test, df = C – 1, where C = number of categories.
Locating the critical region: see Table B.7.
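
The table lookup can be mimicked with a tiny dictionary. The sketch below hard-codes a few α = .05 critical values as they appear in standard chi-square tables; the names `CRITICAL_05`, `goodness_of_fit_df`, and `in_critical_region` are hypothetical, and a real analysis would consult the full table or a statistics library.

```python
# A few alpha = .05 critical values, keyed by df, as in standard chi-square tables.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49, 5: 11.07}


def goodness_of_fit_df(num_categories):
    """df = C - 1 for the goodness-of-fit test."""
    if num_categories < 2:
        raise ValueError("need at least two categories")
    return num_categories - 1


def in_critical_region(statistic, df, table=CRITICAL_05):
    """True when the statistic exceeds the tabled critical value for this df."""
    return statistic > table[df]


df = goodness_of_fit_df(4)              # four categories -> df = 3
print(df, CRITICAL_05[df])
print(in_critical_region(8.08, df))     # a statistic of 8.08 with df = 3
```

Because the distribution is positively skewed and bounded at 0, the critical region is always in the right-hand tail.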

10 The Chi-Square Test for Goodness of Fit, Example
Experiment: a psychologist presents an abstract painting with no obvious top or bottom to a sample of n = 50 participants. Each participant was asked to hang the painting in whatever orientation looked best.
The research question: are there any preferences among the four possible orientations?
The data:

| | Top up | Bottom up | Left side up | Right side up |
|---|---|---|---|---|
| f_o | 18 | 17 | 7 | 8 |

11 The Chi-Square Test for Goodness of Fit, Example
Step 1: State the hypotheses.
H₀: In the general population, there is no preference for any specific orientation.

| | Top up | Bottom up | Left side up | Right side up |
|---|---|---|---|---|
| H₀ | 25% | 25% | 25% | 25% |

H₁: In the general population, one or more of the orientations is preferred over the others.
Step 2: Locate the critical region.
df = C – 1 = 4 – 1 = 3
For df = 3 and α = .05, the critical value for χ² is 7.81.

12 The Chi-Square Test for Goodness of Fit, Example
Step 3: Calculate the statistic.
Compute the expected frequencies (f_e): f_e = pn = ¼(50) = 12.5

| | Top up | Bottom up | Left side up | Right side up |
|---|---|---|---|---|
| Observed frequencies (f_o) | 18 | 17 | 7 | 8 |
| Expected frequencies (f_e) | 12.5 | 12.5 | 12.5 | 12.5 |

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(18 - 12.5)^2}{12.5} + \frac{(17 - 12.5)^2}{12.5} + \frac{(7 - 12.5)^2}{12.5} + \frac{(8 - 12.5)^2}{12.5} = 8.08$$

Step 4: Make a decision.
Since 8.08 is in the critical region, reject H₀ and conclude that the 4 orientations are not equally likely to be preferred.
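
The four steps of the painting example can be reproduced with a short script. This is a minimal sketch using plain Python; the function name `goodness_of_fit` and its return layout are assumptions for illustration, and the critical value 7.81 is taken from the slide rather than computed.

```python
def goodness_of_fit(observed, proportions, critical_value):
    """Run a chi-square goodness-of-fit test against H0 proportions."""
    n = sum(observed)
    expected = [p * n for p in proportions]               # f_e = p * n
    statistic = sum((fo - fe) ** 2 / fe
                    for fo, fe in zip(observed, expected))
    return {
        "expected": expected,
        "chi_square": statistic,
        "df": len(observed) - 1,                          # df = C - 1
        "reject_h0": statistic > critical_value,
    }


# Top up, bottom up, left side up, right side up (n = 50).
observed = [18, 17, 7, 8]
no_preference = [0.25, 0.25, 0.25, 0.25]

result = goodness_of_fit(observed, no_preference, critical_value=7.81)
print(result["expected"])
print(round(result["chi_square"], 2), result["df"], result["reject_h0"])
```

The script arrives at the same decision as the slide: the statistic exceeds 7.81, so H₀ is rejected.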

13 Chi-Square (χ²) Test for Independence
The chi-square (χ²) test for independence tests whether or not there is a relationship between two categorical variables.
Example: is there a relationship between personality and color preference?
Each individual is measured/classified on the two variables.
The data (the observed frequencies):

| | Red | Yellow | Green | Blue | Total |
|---|---|---|---|---|---|
| Introvert | 10 | 3 | 15 | 22 | 50 |
| Extrovert | 90 | 17 | 25 | 18 | 150 |
| Total | 100 | 20 | 40 | 40 | 200 |

14 Chi-Square (χ²) Test for Independence, the Null Hypothesis
The general null hypothesis: the two variables being measured are independent (i.e. for each individual, the value obtained for one variable is not influenced by the value for the second variable).
Two different forms of the null hypothesis:
- H₀ version 1 (similar to correlation): the data are viewed as a single sample with each individual measured on two variables.
  H₀: For the general population, there is no relationship between color preference and personality.
- H₀ version 2 (similar to t test/ANOVA): the data are viewed as two (or more) separate samples representing two (or more) separate populations.
  H₀: In the population, there is no difference between the distribution of color preferences for introverts and the distribution of color preferences for extroverts (i.e. the distributions have the same proportions).
The two versions of H₀ are equivalent.

15 The Observed & Expected Frequencies
- The observed frequencies, f_o: the frequencies in the sample distribution.
- The expected frequencies, f_e: define an ideal hypothetical distribution in perfect agreement with the null hypothesis.

16 Calculating the Expected Frequencies, Step 1
H₀: the frequency distribution of color preference has the same shape (same proportions) for both categories of personality.
Step 1: determine the overall distribution of color preference, irrespective of personality type.

| | Red | Yellow | Green | Blue | Total |
|---|---|---|---|---|---|
| Introvert | | | | | 50 |
| Extrovert | | | | | 150 |
| Total | 100 | 20 | 40 | 40 | 200 |

- Proportion preferring red: 100 out of 200 = 50%
- Proportion preferring yellow: 20 out of 200 = 10%
- Proportion preferring green: 40 out of 200 = 20%
- Proportion preferring blue: 40 out of 200 = 20%

17 Calculating the Expected Frequencies, Step 2
Step 2: apply this distribution of color preference to both categories of personality.
For the sample of 50 introverts:
- 50% choose red: f_e = .50(50) = 25
- 10% choose yellow: f_e = .10(50) = 5
- 20% choose green: f_e = .20(50) = 10
- 20% choose blue: f_e = .20(50) = 10
For the sample of 150 extroverts:
- 50% choose red: f_e = .50(150) = 75
- 10% choose yellow: f_e = .10(150) = 15
- 20% choose green: f_e = .20(150) = 30
- 20% choose blue: f_e = .20(150) = 30

18 Calculating the Expected Frequencies
The complete set of expected frequencies:

| | Red | Yellow | Green | Blue | Total |
|---|---|---|---|---|---|
| Introvert | 25 | 5 | 10 | 10 | 50 |
| Extrovert | 75 | 15 | 30 | 30 | 150 |
| Total | 100 | 20 | 40 | 40 | 200 |

This is the distribution predicted by the null hypothesis.
A simple formula for f_e:

$$f_e = \frac{f_c f_r}{n}$$

- f_c = frequency total for the column (column total)
- f_r = frequency total for the row (row total)
- n = total number of individuals in the study
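
The formula f_e = f_c f_r / n needs only the row and column totals, which makes it easy to sketch in code. The function name `expected_table` is a hypothetical helper introduced for this illustration.

```python
def expected_table(row_totals, col_totals):
    """Build the table of f_e = (column total * row total) / n."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row totals and column totals must sum to the same n")
    return [[fr * fc / n for fc in col_totals] for fr in row_totals]


row_totals = [50, 150]            # Introvert, Extrovert
col_totals = [100, 20, 40, 40]    # Red, Yellow, Green, Blue

expected = expected_table(row_totals, col_totals)
for label, row in zip(["Introvert", "Extrovert"], expected):
    print(label, row)
```

Each expected row keeps the same 50% / 10% / 20% / 20% color proportions, which is exactly what the null hypothesis of independence predicts.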

19 The Chi-Square Statistic
The χ² test for independence uses the same formula as the test for goodness of fit:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

df = (R – 1)(C – 1)
- R = the number of rows
- C = the number of columns

20 Chi-Square Test for Independence, An Example
A researcher is interested in the relationship between academic performance and self-esteem.
Sample: n = 150 ten-year-old children.
The data (observed frequencies):

| Academic performance | High self-esteem | Medium self-esteem | Low self-esteem | Total |
|---|---|---|---|---|
| High | 17 | 32 | 11 | 60 |
| Low | 13 | 43 | 34 | 90 |
| Total | 30 | 75 | 45 | 150 |

21 The Chi-Square Test for Independence, An Example
Step 1: State the hypotheses.
Version 1:
- H₀: In the general population, there is no relationship between academic performance and self-esteem.
- H₁: There is a consistent, predictable relationship between academic performance and self-esteem.
Version 2:
- H₀: In the general population, the distribution of self-esteem is the same for high and low academic performers.
- H₁: The distribution of self-esteem for high academic performers is different from the distribution for low academic performers.
Step 2: Locate the critical region.
df = (R – 1)(C – 1) = (2 – 1)(3 – 1) = 2
For df = 2 and α = .05, the critical value for χ² is 5.99.

22 The Chi-Square Test for Independence, An Example
Step 3: Calculate the statistic.
Compute the expected frequencies (f_e):

$$f_e = \frac{f_c f_r}{n}$$

| Academic performance | High self-esteem | Medium self-esteem | Low self-esteem | Total |
|---|---|---|---|---|
| High | 12 | 30 | 18 | 60 |
| Low | 18 | 45 | 27 | 90 |
| Total | 30 | 75 | 45 | n = 150 |

23 The Chi-Square Test for Independence, An Example
Step 3: Calculate the statistic.
Calculate the chi-square:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(17 - 12)^2}{12} + \frac{(32 - 30)^2}{30} + \frac{(11 - 18)^2}{18} + \frac{(13 - 18)^2}{18} + \frac{(43 - 45)^2}{45} + \frac{(34 - 27)^2}{27} = 8.23$$

Step 4: Make a decision.
Since 8.23 is in the critical region, reject H₀ and conclude that there is a significant relationship between academic performance and self-esteem. Equivalently, the data show a significant difference between the distribution of self-esteem for high academic performers and the distribution for low academic performers.
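
The self-esteem example can be checked end to end with a short script. This is a sketch in plain Python under the lecture's formulas; `chi_square_independence` is a hypothetical helper name, and the critical value 5.99 is copied from the slides rather than computed.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for r, row in enumerate(table):
        for c, fo in enumerate(row):
            fe = row_totals[r] * col_totals[c] / n    # f_e = f_c * f_r / n
            statistic += (fo - fe) ** 2 / fe

    df = (len(table) - 1) * (len(table[0]) - 1)       # (R - 1)(C - 1)
    return statistic, df


# Rows: high / low academic performance. Columns: high / medium / low self-esteem.
observed = [
    [17, 32, 11],
    [13, 43, 34],
]

statistic, df = chi_square_independence(observed)
print(round(statistic, 2), df)
print(statistic > 5.99)   # compare with the critical value for df = 2, alpha = .05
```

The same function works for any R × C table of counts, since the expected frequencies are derived from the marginal totals.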

24 Effect Size for the Chi-Square Test for Independence
For a 2 × 2 matrix, the phi-coefficient:

$$\phi = \sqrt{\frac{\chi^2}{n}}$$

- φ is a correlation and measures the strength of the relationship between the two variables.
- Sometimes φ is squared to obtain the percentage of variance accounted for, exactly as with r².
For a matrix larger than 2 × 2, Cramér's V:

$$V = \sqrt{\frac{\chi^2}{n \cdot df^*}}$$

- n = number of individuals in the study
- df* is the smaller of (C – 1) and (R – 1)
Interpreting Cramér's V: see Table 16.9.
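
Both effect-size measures are one-line computations once χ² is known. The sketch below uses the standard definitions of φ and Cramér's V; the function names are hypothetical, and the value 8.23 with n = 150 is taken from the self-esteem example. The interpretation of the result is left to Table 16.9, which is not reproduced here.

```python
import math


def phi_coefficient(chi_square, n):
    """Phi for a 2 x 2 table: sqrt(chi-square / n)."""
    return math.sqrt(chi_square / n)


def cramers_v(chi_square, n, num_rows, num_cols):
    """Cramer's V: sqrt(chi-square / (n * df*)), df* = min(R - 1, C - 1)."""
    df_star = min(num_rows - 1, num_cols - 1)
    return math.sqrt(chi_square / (n * df_star))


# Self-esteem example: chi-square = 8.23, n = 150, a 2 x 3 table.
v = cramers_v(8.23, n=150, num_rows=2, num_cols=3)
print(round(v, 3))
```

For a 2 × 2 table df* equals 1, so Cramér's V reduces to φ.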

25 Assumptions & Restrictions for Chi-Square Tests
Independence of observations:
- One consequence is that each individual can be classified in only one category.
- The sum of all the cell frequencies equals n.
Size of expected frequencies:
- A chi-square test should not be performed when the expected frequency of any cell is less than 5.
- Very small f_e values can distort the chi-square statistic.
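
The expected-frequency restriction is easy to check before running a test. The following is a minimal sketch, assuming the rule of thumb stated above (no cell with f_e below 5); the function name `small_expected_cells` and the second set of totals are hypothetical, invented only to show a failing case.

```python
def small_expected_cells(row_totals, col_totals, minimum=5):
    """List (row index, column index, f_e) for every cell with f_e < minimum."""
    n = sum(row_totals)
    flagged = []
    for r, fr in enumerate(row_totals):
        for c, fc in enumerate(col_totals):
            fe = fr * fc / n
            if fe < minimum:
                flagged.append((r, c, fe))
    return flagged


# Color-preference example: the smallest f_e is exactly 5, so nothing is flagged.
ok = small_expected_cells([50, 150], [100, 20, 40, 40])
print(ok)

# Hypothetical totals (not from the lecture) where one row is too small.
too_small = small_expected_cells([8, 32], [20, 20])
print(too_small)
```

When cells are flagged, common remedies are collecting a larger sample or combining categories, although the lecture does not prescribe either.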

