Inference for Categorical Data: Chi-Square (Ch. 11)


1 Inference for Categorical Data: Chi-Square (Ch. 11)

2 Facts about Chi-Square ► Takes only nonnegative values, and the graph is skewed to the right ► Test statistic (on AP sheet): χ² = Σ (observed − expected)² / expected ► Conditions: All expected cell counts are at least 5, and the observations come from a random sample.
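The statistic and the expected-count condition from this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the demo counts are made up for the example.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts_large_enough(expected, minimum=5):
    """Check the condition that every expected count is at least `minimum`."""
    return all(e >= minimum for e in expected)


# Hypothetical counts, chosen only to show the arithmetic.
demo_observed = [18, 22, 20]
demo_expected = [20.0, 20.0, 20.0]
demo_stat = chi_square_statistic(demo_observed, demo_expected)

print(f"chi-square = {demo_stat:.2f}")
print("condition met:", expected_counts_large_enough(demo_expected))
```

Because every term is a square divided by a positive count, the statistic can never be negative, which matches the first fact on the slide.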

3 Three Types ► Goodness of Fit Test ► Test of Independence ► Test of Homogeneity

4 Goodness of Fit Test ► Is used to determine how well a set of observed values matches a set of expected values.

5 Goodness of Fit Test ► 1 Categorical Variable ► 1 Population ► df = n − 1 (n is the number of categories) ► Expected count for a category = hypothesized proportion × sample size ► A large test statistic means more evidence against the null hypothesis
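As a rough sketch of the two rules on this slide (expected count and degrees of freedom), assuming hypothetical proportions and a hypothetical sample size that do not come from the slides:

```python
def gof_expected_counts(proportions, sample_size):
    """Expected count per category = hypothesized proportion * sample size."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions must sum to 1")
    return [p * sample_size for p in proportions]


def gof_degrees_of_freedom(num_categories):
    """df = n - 1, where n is the number of categories."""
    return num_categories - 1


# Hypothetical null distribution and sample size, for illustration only.
demo_proportions = [0.5, 0.3, 0.2]
demo_expected = gof_expected_counts(demo_proportions, sample_size=200)
demo_df = gof_degrees_of_freedom(len(demo_proportions))

print([round(e, 1) for e in demo_expected], "df =", demo_df)
```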

6 Chi-Square Goodness of Fit Test (Example from 5 Book pg 279) ► The following are the approximate percentages for the different blood types among white Americans: A: 40%, B: 11%, AB: 4%, O: 45%. A random sample of 1000 black Americans yielded the following blood type data: A: 270, B: 200, AB: 40, O: 490. Does this sample provide evidence that the distribution of blood types among black Americans differs from that of white Americans, or could the sample values simply be due to sampling variation? ► One categorical variable: blood type ► One population: black Americans

7 Example Continue ► We need to compare the observed values in the sample with the expected values we would get if black Americans really had the same distribution of blood types as white Americans.
Blood Type   Observed Values   Expected Values
A            270               .40(1000) = 400
B            200               .11(1000) = 110
AB           40                .04(1000) = 40
O            490               .45(1000) = 450
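The comparison of observed and expected values can be carried through numerically. The sketch below uses the counts from the slides; the 0.05 significance level is an assumption (the slides do not state one), and 7.815 is the usual table critical value for 3 degrees of freedom at that level.

```python
# Counts from the slides: a sample of 1000 and the reference percentages.
blood_types = ["A", "B", "AB", "O"]
observed = [270, 200, 40, 490]
proportions = [0.40, 0.11, 0.04, 0.45]

n = sum(observed)
expected = [p * n for p in proportions]

# Contribution of each category to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)
df = len(blood_types) - 1

# Assumption: alpha = 0.05, which is not stated on the slide.
CRITICAL_VALUE_DF3_ALPHA_05 = 7.815
reject_null = chi_square > CRITICAL_VALUE_DF3_ALPHA_05

for name, c in zip(blood_types, contributions):
    print(f"{name}: contribution {c:.2f}")
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
```

Under these assumptions the statistic is roughly 119, far beyond the critical value, with types A and B contributing most of it.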

8 Another example (yellow workbook pg 145) ► A Philadelphia newspaper report claims that 24.1% of 18- to 24-year-olds who attend a local college are from Delaware, 15.4% are from New Jersey, 50.7% are from Pennsylvania, and the remaining 9.8% are from other states in the region. Suppose that a random sample (size 150) of 18- to 24-year-olds is taken at the college and the number from each state/region is recorded.

9 Continue ► Suppose that a random sample (size 150) of 18- to 24-year-olds is taken at the college and the number from each state/region is recorded. The following are our observed values:
State          Number of Students
Delaware       30
New Jersey     39
Pennsylvania   71
Other          10

10 Continue ► Do these data provide evidence at the α = .05 level against the distribution claimed in the newspaper report? ► (Answer in workbook pg 146-147)
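One way to work this example is sketched below. It is not the workbook's solution, only a direct application of the goodness of fit formula to the counts given above; the helper `chi_square_sf_df3` is a hypothetical name for the closed-form upper-tail probability of a chi-square distribution with 3 degrees of freedom.

```python
import math

# Observed counts and claimed proportions from the slides.
observed = {"Delaware": 30, "New Jersey": 39, "Pennsylvania": 71, "Other": 10}
claimed = {"Delaware": 0.241, "New Jersey": 0.154,
           "Pennsylvania": 0.507, "Other": 0.098}

n = sum(observed.values())
expected = {state: claimed[state] * n for state in observed}

chi_square = sum((observed[s] - expected[s]) ** 2 / expected[s] for s in observed)
df = len(observed) - 1


def chi_square_sf_df3(x):
    """P(X > x) for a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi_square_sf_df3(chi_square)
alpha = 0.05  # level given on the slide

print(f"smallest expected count = {min(expected.values()):.2f}")
print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

With these numbers every expected count is above 5, and the p-value falls well below .05, so the sample is hard to reconcile with the claimed percentages.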

11 Test of Independence ► 1 Population ► 2 Categorical Variables ► df = (r − 1)(c − 1), where r and c are the numbers of rows and columns; use a matrix ► Null hypothesis: The two variables are independent in the population (not related) ► Alternate hypothesis: They are not independent in the population (are related)
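For a two-way table, the expected count in each cell under independence is (row total × column total) / grand total. A minimal sketch, storing the table as a list of rows; the function names and the 2 × 2 demo table are made up for illustration:

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """df = (r - 1)(c - 1) for a table with r rows and c columns."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical 2 x 2 table, for illustration only.
demo_table = [[10, 20],
              [30, 40]]
demo_expected = expected_counts(demo_table)
demo_df = degrees_of_freedom(demo_table)

print(demo_expected, "df =", demo_df)
```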

12 Example of Test of Independence (5 Book pg 284) ► A random sample of 400 residents of a large western city is polled to determine their attitudes concerning the affirmative action admissions policy of the local university. The residents are classified according to ethnicity (white, black, Asian) and whether or not they favor the affirmative action policy. The results are presented in the following table.

13 Attitude Toward Affirmative Action
         Favor   Do Not Favor   Total
White    130     120            250
Black    75      35             110
Asian    28      12             40
Total    233     167            400

14 Attitude towards Affirmative Action ► We are interested in whether or not, in the population from which these 400 residents were sampled, ethnicity and attitude towards affirmative action are related (we have 1 population and two categorical variables)
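The affirmative action example can be carried through as follows. This is a sketch rather than the book's worked solution: the 0.05 level is an assumption, and the p-value uses the fact that for 2 degrees of freedom the chi-square upper-tail probability is exactly exp(−x/2).

```python
import math

# Rows: White, Black, Asian. Columns: Favor, Do Not Favor (from the slides).
table = [[130, 120],
         [75, 35],
         [28, 12]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n  # expected under independence
        chi_square += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (len(table[0]) - 1)

# For df = 2 the upper-tail probability has the closed form exp(-x / 2).
p_value = math.exp(-chi_square / 2)
alpha = 0.05  # assumption: the slides do not give a level for this example

print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Under these assumptions the statistic is about 10.75 on 2 degrees of freedom, which points to a relationship between ethnicity and attitude.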

15 Another Example Test of Indep. (yellow workbook pg 150) ► A survey was taken to determine if there is a relationship between students having computers in their homes and their school division (elementary, middle, secondary). A random sample of size 250 produced the following results:

16 Continue: Computer in Home
Division     Yes   No
Elementary   14    61
Middle       50    25
Secondary    86    14

17 ► Is there evidence of a relationship between school division and having a home computer (that is, evidence that the two are not independent)? ► Use a .05 level of significance.
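A sketch of how this question might be approached, using the table from the slides. Besides the overall statistic, it records each cell's contribution, which shows where the departure from independence is concentrated. The value 5.991 is the usual table critical value for 2 degrees of freedom at the .05 level; the variable names are illustrative.

```python
divisions = ["Elementary", "Middle", "Secondary"]
answers = ["Yes", "No"]
table = [[14, 61],
         [50, 25],
         [86, 14]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = {}
for i, division in enumerate(divisions):
    for j, answer in enumerate(answers):
        exp = row_totals[i] * col_totals[j] / n
        contributions[(division, answer)] = (table[i][j] - exp) ** 2 / exp

chi_square = sum(contributions.values())
df = (len(divisions) - 1) * (len(answers) - 1)

CRITICAL_VALUE_DF2_ALPHA_05 = 5.991
reject_null = chi_square > CRITICAL_VALUE_DF2_ALPHA_05
largest_cell = max(contributions, key=contributions.get)

print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_null}")
print("largest contribution:", largest_cell)
```

Here the elementary rows dominate the statistic: far fewer elementary students report a home computer than independence would predict.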

18 Test of Homogeneity of Proportions or Populations ► 1 Categorical Variable and 2 or more populations ► Degrees of freedom: (r − 1)(c − 1), same as the test of independence ► Null Hypothesis: p1 = p2 = … (the category proportions are the same in every population) ► Alternate Hypothesis: not all of the proportions are equal

19 Example of Homogeneity (5 Book pg 288) ► We have a random sample of 20 males from the population of males in the school and another independent, random sample of 16 females from the population of females in the school. Within each sample we classify the students as Democrat, Republican, and Independent. The results are presented in the following table.

20 Continue
         Democrat   Republican   Independent   Total
Male     11         7            2             20
Female   7          8            1             16
Total    18         15           3             36

21 Continue ► We are asking if the proportions of Democrats, Republicans, and Independents are the same within the populations of Males and Females.
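Before running the test on this table, it is worth checking the condition stated earlier in the deck that all expected counts be at least 5. The sketch below computes the expected counts from the table and lists any cells that fall short; the variable names are illustrative.

```python
groups = ["Male", "Female"]
parties = ["Democrat", "Republican", "Independent"]
table = [[11, 7, 2],
         [7, 8, 1]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Expected counts if party proportions were the same for males and females.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Collect the cells whose expected count is below the usual minimum of 5.
small_cells = [
    (groups[i], parties[j], round(expected[i][j], 2))
    for i in range(len(groups))
    for j in range(len(parties))
    if expected[i][j] < 5
]

for row_name, row in zip(groups, expected):
    print(row_name, [round(e, 2) for e in row])
print("cells with expected count below 5:", small_cells)
```

With only three Independents in the combined samples, both Independent cells have expected counts below 5, so a chi-square result from this table should be read with caution.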

22 Another example: Test of Homogeneity (yellow wb pg 148) ► The table shows the number of Central High School students who passed the AP Calculus AB exam. Has the distribution of scores changed over the past 3 years? Give appropriate statistical evidence to support your answer.

23
Score   Year 1   Year 2   Year 3
5       18       15       11
4       13       12       11
3       12       14       13

24 ► Has there been a change in the distribution of passing grades on the AP Calculus AB exam over these three years?
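A possible way to answer this with the table above is sketched here; it is not the workbook's solution. The 0.05 level is an assumption, and the p-value uses the closed form for 4 degrees of freedom, P(X > x) = exp(−x/2)(1 + x/2).

```python
import math

# Rows: scores 5, 4, 3. Columns: Year 1, Year 2, Year 3 (from the slides).
table = [[18, 15, 11],
         [13, 12, 11],
         [12, 14, 13]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

chi_square = 0.0
smallest_expected = float("inf")
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        smallest_expected = min(smallest_expected, exp)
        chi_square += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (len(table[0]) - 1)

# Closed-form upper-tail probability for a chi-square variable with 4 df.
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)
alpha = 0.05  # assumption: no level is given on the slides

print(f"smallest expected count = {smallest_expected:.2f}")
print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Under these assumptions the statistic is small relative to its degrees of freedom, so the data give little evidence that the score distribution changed across the three years.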

