Making Inferences for Associations Between Categorical Variables: Chi Square Chapter 12 Reading Assignment pp. 463-482; 485.

Presentation transcript:


Elements of a Test of Hypotheses
Hypothesis testing: the process for finding out whether we can generalize about an association from a sample to a population.
– Null hypothesis (H_0): represents the status quo to the party performing the sampling experiment, i.e., it will be retained unless the data provide convincing evidence that it is false.
– Research hypothesis (H_1), also known as the alternative hypothesis: will be accepted only if the data provide convincing evidence of its truth.
Homework: Skills 1, p. 464

Process of Hypothesis Testing
Step 1: Specify a research hypothesis and a null hypothesis.
Step 2: Compute the value of a test statistic for the relationship.
Step 3: Calculate the degrees of freedom for the variables involved.
Step 4: Look up the distribution for the test statistic to find its critical value at a specified level of probability (to determine the likelihood that a test statistic of a particular value could have occurred by chance alone).
Step 5: Decide whether to reject the null hypothesis.
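
The five steps can be sketched in Python for a one-variable (goodness-of-fit) case. This is a minimal illustration, not the textbook's own procedure: the die counts are hypothetical, the function name `chi_square_decision` is invented for the example, and the critical value 11.07 (df = 5, 0.05 level) is the standard table value that the caller looks up and passes in.

```python
def chi_square_decision(observed, expected, critical_value):
    """Steps 2-5 for counts in k categories; returns (chi_square, df, reject_null)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    # Step 2: test statistic, the sum of (O - E)^2 / E over all categories
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Step 3: degrees of freedom for k categories with known expected counts
    df = len(observed) - 1
    # Steps 4-5: compare with the critical value looked up in a chi-square table
    reject_null = chi_square > critical_value
    return chi_square, df, reject_null


# Step 1: H_0 says the die is fair; H_1 says it is not.
# Hypothetical counts for faces 1-6 in 60 rolls (not data from the chapter).
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
chi_square, df, reject_null = chi_square_decision(observed, expected, critical_value=11.07)
print(chi_square, df, reject_null)
```

With these made-up counts the statistic is far below the critical value, so the sketch fails to reject H_0. Passing the critical value in as an argument mirrors Step 4, where the value comes from a printed table rather than from the data.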

Null Hypothesis
Null hypothesis (H_0): speculates that there is no association between the two variables.
Examples:
– H_0: Men are no different from women in their political affiliations.
– H_0: There is no relationship between a respondent’s educational level and that of his or her parents.
– H_0: Older people are no more likely to be happy than younger people.
This is the only hypothesis that can actually be tested: we either reject or fail to reject the null hypothesis.
Example: H_0: There is no association between age and happiness among American adults.
Homework: read p. 466

Statistical Independence
– Two variables are statistically independent when changes in one variable (e.g., age of respondents) have nothing to do with changes in a second (e.g., happiness); i.e., they vary independently of one another.
– Conversely, when two variables are statistically dependent, changes in one variable are associated with changes in the other; e.g., greater age (older respondents) is associated with higher levels of happiness.

Statistical Independence and Hypothesis Testing
Example null hypothesis: Age is statistically independent of happiness, i.e., differences among respondents in age are unrelated to any differences in their levels of reported happiness.
– Hypothesis testing assesses the likelihood that the degree of statistical dependence found in the sample is due to chance.
– If the dependence found in the sample is not likely to be due to chance, the null hypothesis is rejected.
– If it is likely to be due to chance, we fail to reject the null hypothesis.

Type I and Type II Errors
“Mistakes” that arise because a given sample may or may not be representative of the population.
– Type I error: the null hypothesis assumes there is no association between two variables, and we reject it even though there is in fact no association; i.e., we call someone a liar when he is telling the truth.
– Type II error: the null hypothesis assumes there is no association between two variables, and we fail to reject it even though there is an association; i.e., we say someone is truthful when he is lying.

Type I and Type II Errors
(rows: conclusion drawn from the test; columns: true state)

Conclusion    H_0 true            H_1 true
H_0 true      Correct decision    Type II error
H_1 true      Type I error        Correct decision

Elements of a Test of Hypothesis
– Null hypothesis (H_0): a theory about one of the population parameters. It generally represents the status quo, which is assumed to hold until the data show it to be false.
– Research hypothesis (H_1): a theory that contradicts the null hypothesis. It generally represents the claim that will be accepted only if there is evidence for it.
– Test statistic: the sample statistic used to decide whether to reject the null hypothesis.

Elements of a Test of Hypothesis (cont.)
– Rejection region: the numerical values of the test statistic for which the null hypothesis will be rejected. The rejection region is chosen so that the probability is α that it will contain the test statistic when the null hypothesis is true, thereby leading to a Type I error. The value of α chosen is usually small (e.g., 0.01, 0.05, or 0.1) and is referred to as the level of significance of the test. A 0.05 (or 5%) level of significance means there is a 5% chance of rejecting the null hypothesis when it is in fact true; put differently, when the null hypothesis is true the test reaches the correct decision 95% of the time.
– Assumptions: clear statement(s) of any assumptions made about the population(s) being sampled.
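
The link between the rejection region and the significance level (α) can be checked by direct enumeration. The sketch below assumes a hypothetical setup, a fair coin tossed 200 times, and uses the standard df = 1 critical values 3.84 (0.05 level) and 6.63 (0.01 level); the function names are invented for the illustration. It adds up the exact binomial probability of every outcome whose chi-square statistic falls in the rejection region when H_0 is true.

```python
from math import comb


def coin_chi_square(heads, tosses):
    """Chi-square statistic for a fair-coin null hypothesis (df = 1)."""
    expected = tosses / 2
    tails = tosses - heads
    return (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected


def type_i_error_rate(tosses, critical_value):
    """Probability, under H_0 (fair coin), that the statistic lands in the rejection region."""
    total = 2 ** tosses  # number of equally likely toss sequences
    rejecting = sum(
        comb(tosses, heads)
        for heads in range(tosses + 1)
        if coin_chi_square(heads, tosses) > critical_value
    )
    return rejecting / total


rate_05 = type_i_error_rate(200, 3.84)  # rejection region for the 0.05 level
rate_01 = type_i_error_rate(200, 6.63)  # rejection region for the 0.01 level
print(round(rate_05, 3), round(rate_01, 3))
```

Because head counts are discrete and the chi-square distribution is only an approximation to them, the computed rates should come out close to, but not exactly equal to, the nominal 0.05 and 0.01.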

Elements of a Test of Hypothesis (cont.)
– Experiment and calculation of the test statistic.
– Conclusion:
  – If the numerical value of the test statistic falls in the rejection region, we reject the null hypothesis and conclude that the research hypothesis is true. We know that hypothesis testing will lead to this conclusion incorrectly (Type I error) 100α% of the time when H_0 is true.
  – If the test statistic does not fall in the rejection region, we do not reject H_0. Thus we reserve judgment about which hypothesis is true. We do not conclude that the null hypothesis is true because we do not, in general, know the probability that our test procedure will lead to an incorrect failure to reject H_0 (Type II error).

Chi-Square
– Formula 12.1: observed vs. expected frequencies. Example: roll a die 6 times and get three 3’s (observed); the expected number of 3’s is one.
– Pp skills: filling in the table of expected values.
– Skills 3, 4: Excel.
– Generally, the greater the value of chi-square, the greater the statistical dependence between the two variables.
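
The formula itself does not appear in the transcript. Assuming Formula 12.1 is the usual Pearson chi-square, which matches the arithmetic in the worked examples later in the transcript, it can be written as below, where O_i and E_i (symbols chosen here) are the observed and expected frequencies in cell i. For the die illustration only the cell for face 3 is given, so only its contribution is shown.

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i},
\qquad
\text{face 3: } \frac{(3 - 1)^2}{1} = 4
```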

Chi-Square: Degrees of Freedom
– We are using observations from a sample as well as certain population parameters. If these parameters are unknown, they must be estimated from the sample.
– Degrees of freedom (df): the number N of independent observations in the sample (i.e., the sample size) minus the number k of population parameters that must be estimated from the sample observations: df = N - k.
– When working with a contingency table, df = (r-1)(c-1), where r and c are the numbers of rows and columns, respectively, in the contingency table.
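
For a contingency table, the expected count in each cell is conventionally the row total times the column total divided by the grand total. The sketch below applies that rule, the chi-square sum, and df = (r-1)(c-1) to a made-up 2 x 3 table of age by happiness. The counts and function names are hypothetical and are not taken from the chapter.

```python
def expected_frequencies(table):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over every cell of the contingency table."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """df = (r - 1)(c - 1) for a table with r rows and c columns."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical counts (not from the chapter): rows = younger, older;
# columns = not too happy, pretty happy, very happy.
age_by_happiness = [
    [20, 50, 30],
    [10, 40, 50],
]
expected = expected_frequencies(age_by_happiness)
chi_square = chi_square_statistic(age_by_happiness)
df = degrees_of_freedom(age_by_happiness)
print(expected, round(chi_square, 2), df)
```

Both rows have the same total in this example, so the expected counts are identical across rows; any difference between the observed rows is what the statistic picks up.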


Chi-Squared Example: Generate Random Digits 250 Times
Question: Does the observed frequency distribution differ from the expected distribution in a significant way?
Table columns: digit, observed frequency, expected frequency (25 for each digit).

Chi-Square: Random Digit Example
χ^2 = (17-25)^2/25 + (31-25)^2/25 + (29-25)^2/25 + … + (36-25)^2/25 = 23.3 [computed in Excel]
Degrees of freedom: 10 - 1 = 9
Table, p. 545: χ^2 at .99 is 21.7; 23.3 > 21.7, so the observed frequencies differ from the expected frequencies at the 0.01 level of significance, and the table of “random” numbers is somewhat doubtful.

Chi-Square: Question
200 tosses of a coin give 115 heads and 85 tails. Test the hypothesis that the coin is fair using (a) the 0.05 and (b) the 0.01 level of significance.
Answer:
df = 2 - 1 = 1 (2 categories: H, T)
O1 = 115, O2 = 85; E1 = E2 = 100
χ^2 = (115-100)^2/100 + (85-100)^2/100 = 4.5
(a) The χ^2 table value for .95 is 3.84; 4.5 > 3.84, so we reject the hypothesis that the coin is fair at the 0.05 level of significance.
(b) The χ^2 table value for .99 is 6.63; 4.5 < 6.63, so we cannot reject the hypothesis that the coin is fair at the 0.01 level of significance.
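
The coin arithmetic can be reproduced in a few lines of Python. The counts and the critical values 3.84 and 6.63 come from the slide; the p-value line is an addition that relies on the closed form for the upper tail of a chi-square distribution with one degree of freedom, so it applies to df = 1 only.

```python
from math import erfc, sqrt

heads, tails = 115, 85
expected = (heads + tails) / 2  # 100 of each under H_0: the coin is fair

chi_square = (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected

# Critical values for df = 1 quoted on the slide
reject_at_05 = chi_square > 3.84
reject_at_01 = chi_square > 6.63

# For df = 1 only, the upper-tail probability has the closed form erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi_square / 2))
print(chi_square, reject_at_05, reject_at_01, round(p_value, 4))
```

The p-value falls between 0.01 and 0.05, which is why the same data lead to rejection at one level and not at the other.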

Interpreting Chi Square
– When hypothesizing about an association between two variables, chi-square tells us how likely it is that the degree of statistical dependence observed is simply the luck of the draw.
– A p value of 0.05 means that, if the null hypothesis were true, a chi-square at least this large would occur by chance in no more than 5 samples out of 100. Because such a result is unlikely when there is no association between the variables, the null hypothesis is rejected.
– The lower the level of significance we require (e.g., 0.01 rather than 0.05), the less likely we are to make a Type I error.


Interpreting Chi Square
– P: Table 12.4 (p. 472) has χ^2 = …, df = 6.
– The higher the χ^2 value, the less likely it is that the value obtained is due to chance (read Table 12.9, p. 481).
– Rule of thumb: reject the null hypothesis when the probability associated with χ^2 is 0.05 or less, i.e., when a χ^2 this large would arise by chance alone no more than 5 times in 100.
– Skills 7, p. 481; Skills 8, p. 485 (following their example, p. 484).

Homework: p. 492: 1, 3; p. 494: SPSS 1, 2