A random sample of 300 doctoral degree

Presentation transcript:

In recent years, a national effort has been made to give more members of minority groups increased educational opportunities. You want to know whether the policy of “affirmative action” and similar initiatives have had any effect in this regard. You obtain information on the ethnicity distribution of holders of the highest academic degree, the doctor of philosophy degree, for 1981:

Race/Ethnicity                     Percent
White, Non-Hispanic                78.9
Black, Non-Hispanic                3.9
Hispanic                           1.4
Asian/Pac. Islander                2.7
American Indian/Alaskan Native     0.4
Nonresident Alien                  12.8

A random sample of 300 doctoral degree recipients in 2004 showed the following frequency distribution:

Race/Ethnicity                     Count
White, Non-Hispanic                189
Black, Non-Hispanic                10
Hispanic                           6
Asian/Pac. Islander                14
American Indian/Alaskan Native     1
Nonresident Alien                  80

Perform a goodness-of-fit test to determine whether the distribution of doctoral degrees in 2004 is significantly different from the distribution in 1981. In which categories have the greatest changes occurred, and in what direction?
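One way to set up this goodness-of-fit calculation is sketched below in plain Python. The variable names are illustrative, and the sketch assumes the 1981 percents are used exactly as printed (they sum to 100.1 because of rounding, so the expected counts sum to slightly more than 300).

```python
# Goodness-of-fit sketch: 1981 percents as the null distribution,
# 2004 sample counts as the observed data. Names are illustrative.
categories = [
    "White, Non-Hispanic",
    "Black, Non-Hispanic",
    "Hispanic",
    "Asian/Pac. Islander",
    "American Indian/Alaskan Native",
    "Nonresident Alien",
]
percent_1981 = [78.9, 3.9, 1.4, 2.7, 0.4, 12.8]
observed_2004 = [189, 10, 6, 14, 1, 80]

n = sum(observed_2004)  # sample size, 300

# Expected count under Ho: sample size times the 1981 proportion.
# Assumption: percents are used as printed, without rescaling to sum to 100.
expected = [n * p / 100 for p in percent_1981]

# One term per category: (observed - expected)^2 / expected.
terms = [(o - e) ** 2 / e for o, e in zip(observed_2004, expected)]
chi_square = sum(terms)
df = len(categories) - 1

# Category with the largest contribution, and the direction of its change.
largest = max(range(len(categories)), key=lambda i: terms[i])
direction = "more" if observed_2004[largest] > expected[largest] else "fewer"

for name, o, e, t in zip(categories, observed_2004, expected, terms):
    print(f"{name:32s} observed={o:4d} expected={e:7.1f} term={t:7.3f}")
print(f"chi-square = {chi_square:.2f}, df = {df}")
print(f"largest term: {categories[largest]} ({direction} than expected)")
```

Two of the expected counts in this sketch (Hispanic and American Indian/Alaskan Native) fall below 5, so the large-sample condition deserves a look before a P-value is trusted.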

Inference for Two-Way Tables
For comparing groups using a two-way table: the data are CATEGORICAL!
Recall: A 2-way table describes relationships between 2 categorical variables.

2 types of Chi-Squared Tests
1) Test of Homogeneity of Populations (14.1)
Ho: The distribution of the response variable is the same in all populations
Ha: The distributions are not all the same
2) Independence/Association Test (14.2)
Ho: There is no association between two categorical variables (i.e., the categorical variables are independent)
Ha: There is an association between two categorical variables (i.e., the categorical variables are not independent)
(Single sample from a single population; individuals classified according to two categorical variables)

Statistical Methods for many comparisons (2 parts)
1) An overall test to see if there is good evidence of any differences among the parameters that we want to compare.
2) A detailed follow-up analysis to decide which of the parameters differ and to estimate how large the differences are.

Example
Chronic users of cocaine need the drug to feel pleasure. A 3-year study compared an antidepressant called desipramine with lithium and a placebo. The subjects were 72 chronic users of cocaine who wanted to break their drug habit. 24 of the subjects were randomly assigned to each treatment. The counts (and proportions) of subjects who avoided relapse were: desipramine 14 (0.583), lithium 6 (0.250), placebo 4 (0.167). Are these data good evidence that the proportions of successes for the 3 treatments differ in the population of all cocaine users?

“Fix” the table (if it needs it!)
Arrange the data in a 2-way table with complete categories to show the relationship between 2 categorical variables (relapse: yes/no; drug: desipramine, lithium, placebo).
Calculate marginal totals.
Describe/look at the data prior to jumping into the hypothesis test (preliminary analysis); comment on any relationship you see.
r x c table: r = rows, c = columns
Each of the 6 counts occupies a cell of the table.

Hypotheses (Part I: Test)
Ho: There is no association between treatment method and rate of relapse (i.e., treatment method and rate of relapse are independent).
Ha: There is an association between treatment method and rate of relapse (i.e., treatment method and rate of relapse are dependent).

Expected Counts
The expected count in any cell of a two-way table when Ho is true is:
expected count = (row total × column total) / table total
If we have n independent tries and the probability of a success on each try is p, we expect np successes.
Find the expected count for the cell in row 1 (desipramine) and column 2 (relapse).
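The expected-count rule can be sketched in a few lines of Python. The function name `expected_counts` and the column order (no relapse first, relapse second) are assumptions made for this illustration; the counts are the ones used in the cocaine example.

```python
# Observed counts for the cocaine study.
# Assumed column order: [no relapse, relapse].
observed = [
    [14, 10],  # desipramine
    [6, 18],   # lithium
    [4, 20],   # placebo
]

def expected_counts(table):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)

# Row 1 (desipramine), column 2 (relapse): 24 * 48 / 72.
desipramine_relapse = expected[0][1]
print(desipramine_relapse)
```

Because every treatment group has 24 subjects, each row of `expected` is the same: one third of 24 in the no-relapse column and two thirds of 24 in the relapse column.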

Table Observations
Because 2/3 of all subjects relapsed, we expect 2/3 of the 24 subjects in each group to relapse (if there are no differences among the treatments).
Desipramine has fewer relapses (10) and more successes (14) than expected.
Placebo has fewer successes (4) and more relapses (20) than expected.
Overall: Desipramine does the best, followed by lithium, and then placebo.

The chi-square statistic
A measure of how far the observed counts in a two-way table are from the expected counts.
A sum of terms, one for each cell in the table (in our example, we have 6).
Large chi-square values are evidence against Ho because they say that the observed counts are far from what we would expect if Ho were true.
Always 0 or positive.
Only 0 when the observed counts are exactly equal to the expected counts.
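As a sketch of how the statistic is assembled, the Python below computes one term per cell, (observed - expected)^2 / expected, and sums them for the cocaine table. The helper name `chi_square_terms` and the column order (no relapse, relapse) are assumptions for this example.

```python
# Observed counts; assumed column order: [no relapse, relapse].
observed = [
    [14, 10],  # desipramine
    [6, 18],   # lithium
    [4, 20],   # placebo
]

def chi_square_terms(table):
    """Return one term per cell: (observed - expected)**2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    terms = []
    for i, row in enumerate(table):
        term_row = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / table_total
            term_row.append((obs - expected) ** 2 / expected)
        terms.append(term_row)
    return terms

terms = chi_square_terms(observed)
chi_square = sum(sum(row) for row in terms)

for label, row in zip(["desipramine", "lithium", "placebo"], terms):
    print(label, [round(t, 2) for t in row])
print("chi-square =", round(chi_square, 2))
```

Printing the individual terms alongside the total makes it easy to see which cells drive the statistic, which is the starting point for a follow-up analysis.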

Degrees of Freedom/P-Value
r = rows, c = columns, df = (r-1)(c-1)
P-value (using the table): the area to the right of the chi-square statistic under the chi-square density curve.
It is the probability that chi-square would take a value as large as or larger than 10.5 if Ho were really true.
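For readers without a chi-square table at hand, the upper-tail area can be sketched in pure Python. The function `chi_square_sf` is a hypothetical helper written for this illustration (it is not a library call); it uses closed forms of the chi-square tail area that hold for whole-number degrees of freedom.

```python
import math

def chi_square_sf(x, df):
    """Area to the right of x under the chi-square curve (integer df >= 1)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # tail area for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # tail area for df = 1
    # Step up two degrees of freedom at a time until a equals df / 2.
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

rows, cols = 3, 2                  # cocaine study: 3 treatments, 2 outcomes
df = (rows - 1) * (cols - 1)
p_value = chi_square_sf(10.5, df)
print(df, round(p_value, 4))
```

With df = 2 the tail area reduces to exp(-x/2), so the P-value for 10.5 is small (roughly half a percent), which agrees with the strong evidence against Ho described for this example.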

Chi-Square with Technology (note on p-value)

TI-83
Matrix (2nd x^-1)/Edit/1:A; set the dimensions to 3 x 2
Matrix entries (by row): 14 10 / 6 18 / 4 20
Stat/Tests/C:Chi-Squared Test (Calc., then Draw)
For expected counts, look in Matrix B (Matrix/2 (B), Enter)
Verify that this agrees with the Minitab results.

Part II: Follow-up Analysis: Comment on observed and expected counts
The study found significant differences among the proportions of successes for the three treatments for cocaine addiction.
Examine the sample proportions: the major difference between the proportions is that desipramine had a higher success rate than lithium or placebo.
Comparing observed and expected counts in Minitab:
Desipramine has more successes and fewer failures than we would expect if all 3 treatments had the same success rate in the population.
Lithium and placebo had fewer successes and more failures than expected.
Largest component contributing to the overall chi-square: desipramine (4.50) had more successes than expected.

The chi-square test of association/independence
The cocaine study is an experiment that assigned 24 addicts to each of 3 groups.
Each group was a sample from a separate population corresponding to a separate treatment.
The study design fixed the size of each sample (24) in advance, and the data recorded which of two outcomes occurred for each subject (relapse, no relapse).
Ho: “no difference” took the form of “equal proportions of successes” in the 3 populations (but we can have different settings…)

Is high blood pressure dangerous?
Medical researchers classified each of a group of men as “high” or “low” blood pressure, then watched them for 5 years. (Men with systolic blood pressure 140 mm Hg or higher were “high”; the others, “low.”) The following two-way table gives the results of the study.

             Died    Survived
Low bp         21        2655
High bp        55        3283

1. Is high blood pressure associated with a higher death rate? Calculate and compare percents to answer this question. Make an appropriate graph to display the association.
2. Write null and alternative hypotheses for a chi-square analysis of these data.
3. State and verify the conditions for performing the inference procedure.
4. The chi-square statistic is 8.86. Complete the chi-square analysis of the blood pressure data and write your conclusion(s).
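A short Python sketch can support questions 1 and 4: it compares the death rates in the two groups and recomputes the chi-square statistic from the table. Variable names are illustrative. The P-value line relies on the fact that, for 1 degree of freedom, the chi-square tail area equals erfc(sqrt(x/2)).

```python
import math

# Observed counts; columns are [died, survived].
observed = [
    [21, 2655],  # low blood pressure
    [55, 3283],  # high blood pressure
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# Question 1: percent who died in each blood pressure group.
death_percent = [100 * row[0] / total for row, total in zip(observed, row_totals)]

# Question 4: chi-square statistic, df, and P-value.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / table_total
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = math.erfc(math.sqrt(chi_square / 2))  # valid because df == 1

print([round(p, 2) for p in death_percent])
print(round(chi_square, 2), df, round(p_value, 4))
```

The recomputed statistic rounds to the 8.86 quoted in the exercise, and the death percent is higher in the high blood pressure group than in the low blood pressure group.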

A study of the career plans of young women and men sent questionnaires to all 722 members of the senior class in the College of Business Administration at the University of Illinois. One question asked which major within the business program the student had chosen. Here are the data from the students who responded:

                  Female    Male
Accounting            68      56
Administration        91      40
Economics              5       6
Finance               61      59

This is an example of a single sample classified according to two categorical variables (gender and major).
1. Describe the differences between the distributions of majors for women and men with percents, with a bar graph, and in words.
2. Verify that the expected cell counts satisfy the requirement for use of chi-square. Test the null hypothesis that there is no relationship between the gender of students and their choice of major. Give a P-value.
3. Which two cells have the largest terms of the chi-square statistic? How do the observed and expected counts differ in these cells? (This should strengthen your conclusions in #1.)
4. What % of the students did not respond to the questionnaire? Why does this nonresponse weaken conclusions drawn from these data?
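The bookkeeping for the expected-count check and the nonresponse question can be sketched in Python. The guideline coded here (no more than 20% of expected counts below 5, and all at least 1) is a common textbook rule that this transcript does not state, so treat it as an assumption; the variable names are illustrative.

```python
# Observed counts; columns are [female, male].
majors = ["Accounting", "Administration", "Economics", "Finance"]
observed = [
    [68, 56],
    [91, 40],
    [5, 6],
    [61, 59],
]
class_size = 722  # questionnaires sent

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
responded = sum(row_totals)

# Expected counts under Ho: row total * column total / table total.
expected = [[r * c / responded for c in col_totals] for r in row_totals]
flat = [e for row in expected for e in row]

# Assumed guideline: at most 20% of expected counts < 5, and all >= 1.
small_cells = sum(1 for e in flat if e < 5)
condition_ok = small_cells / len(flat) <= 0.20 and min(flat) >= 1

# Percent of the senior class that did not respond.
nonresponse_percent = 100 * (class_size - responded) / class_size

for major, row in zip(majors, expected):
    print(f"{major:15s}", [round(e, 2) for e in row])
print(small_cells, condition_ok, round(nonresponse_percent, 1))
```

Under this assumed guideline the single small expected count (in the Economics row) does not rule out the chi-square test, while the nonresponse percent shows that a large share of the class is missing from the table.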

The Chi-Square Test of Association/Independence Use the chi-square test of association/independence to test Ho: There is no relationship between two categorical variables when you have a 2-way table from a single SRS, with each individual classified according to both of two categorical variables.

Smoking and SES In a study of heart disease in male federal employees, researchers classified 356 volunteer subjects according to their SES (socioeconomic status) and their smoking habits. There were 3 categories of SES: high, middle, and low. Individuals were asked whether they were current smokers, former smokers, or had never smoked, producing three categories for smoking habits as well. We want to see if there is an association between SES and smoking habits.

C.S. Test of Association/Indep.!
Here is a 2-way table that summarizes the data: This table does not compare several populations; it classifies observations from a single population in two ways (by smoking habits and SES). Both of these variables have 3 levels, so Ho:

Comparing the distribution of smoking habits among male federal employees having a specific SES