4 normal probability plots at once


4 normal probability plots at once

par(mfrow=c(2,2))
for(i in 1:4) {
  qqnorm(dataframe[,1][dataframe[,2]==i], ylab="Data quantiles")
  title(paste("yourchoice", i, sep=""))
}

These plots can be produced by going to "File", then "New", then "Script File". Paste the commands into the script file window, press F10, and the four plots are produced automatically.

4 histograms all at once

Same as above, but use hist instead of qqnorm, and you only need one column rather than dataframe columns 1 and 2. Also, don’t forget to change your label.
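The histogram variant described above might look like the following. This is only a sketch under one reading of the slide: it assumes a data frame with four numeric columns, one per panel. The name `dataframe`, the simulated values, and the axis label are placeholders rather than part of the original lab.

```r
# Hypothetical example data: four numeric columns, one per histogram.
# (Simulated values; replace with your own data frame.)
set.seed(1)
dataframe <- data.frame(a = rnorm(20), b = rnorm(20),
                        c = rnorm(20), d = rnorm(20))

old_par <- par(mfrow = c(2, 2))   # 2 x 2 grid of panels
hists <- vector("list", 4)        # keep the histogram objects for inspection
for (i in 1:4) {
  # One column per panel; the label is on the x axis for a histogram
  hists[[i]] <- hist(dataframe[, i],
                     xlab = "Data values",
                     main = paste("yourchoice", i, sep = ""))
}
par(old_par)                      # restore the previous plot layout
```

If the data are instead stored as one measurement column plus a grouping column, the subsetting expression from the qqnorm loop can be passed to hist in the same way.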

Lab: Chi-Squared Test (X²) Lack of Fit
November 10, 2000

History
- Invented in 1900
- Oldest inference procedure still used in its original form
- English statistician Karl Pearson

The X² Test
- Used when you have data values for two categorical variables
- The data are arranged in what is called a two-way table
- For example: men/women and NSOE track; regenerated seaweed (yes/no) and access level (limpet only, limpet and fish, etc.)

Example: Why do Men and Women Participate in Sports?
- Desire to win or do better than others
  – called social comparison
- Desire to improve one’s skills or to do one’s best
  – called mastery

Data
- Collected from 67 male and 67 female undergraduate students at a large university
- A survey asked about the students’ sports goals
- Each student was categorized as high or low on each of the two goals:
  – high or low social comparison
  – high or low mastery
Duda, Joan L., Leisure Sciences, 10 (1988), pp

Groups
- This leads to four groups:
  – High social comparison, high mastery (HSC-HM)
  – High social comparison, low mastery (HSC-LM)
  – Low social comparison, high mastery (LSC-HM)
  – Low social comparison, low mastery (LSC-LM)
- We want to compare these groups for men and women.

1. Add Totals
- Column: in this case, which population the observation comes from
- Row: categorical response variable
- Grand total

A Cell
A table with r rows and c columns contains r × c cells.

X² is really an analysis of 5 things in this table:
- Frequency (actual count)
- Percent of overall total
- Percent of row
- Percent of column
- Expected count

Frequency: Just the cell count

Overall Percent: Cell count divided by grand total
14/134 = 0.104
That is, about 10.4% of all those studied were HSC-HM and female.

Row Percent: Cell count divided by row total
14/45 = 0.311
That is, of all those students reporting HSC-HM, 31% were female.

Column Percent: Cell count divided by column total
14/67 = 0.209
That is, of all female student participants, 21% were HSC-HM.
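The three percents for the female HSC-HM cell can be checked with a few lines of R. This is a minimal sketch that uses only the counts quoted on the slides (cell count 14, row total 45, column total 67, grand total 134). The variable names are illustrative.

```r
# Counts quoted on the slides for the female HSC-HM cell
cell_count  <- 14    # females who are HSC-HM
row_total   <- 45    # all HSC-HM students
col_total   <- 67    # all female students
grand_total <- 134   # all students in the study

overall_pct <- 100 * cell_count / grand_total  # share of everyone studied
row_pct     <- 100 * cell_count / row_total    # share of the HSC-HM row
col_pct     <- 100 * cell_count / col_total    # share of the female column

print(round(c(overall = overall_pct, row = row_pct, column = col_pct), 1))
```

For a complete two-way table stored as a matrix, `prop.table()` with `margin = 1` or `margin = 2` gives the row or column proportions for every cell at once.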

Expected Count
- Coming later to a slide near you...

These percents are useful in graphical analysis.
- Overall, row, and column percents can be calculated for each cell
- Then questions of interest can be asked
- We are interested in the effect of sex on sports goals
- In this case, we would examine the column percents

Column percents for sports goals

Surprise, surprise: we want to ask whether these apparently obvious differences are significant.
- Can these differences be attributed to chance?
- Calculate the chi-square statistic and compare it to a chi-square distribution
- Determine the p-value
- A low p-value means we reject our null hypothesis (sound familiar?)

The hypotheses: Null
- No association exists between our row and our column variables
  – No association exists between sex and sports goals
  – The distributions of sports goals in the male and female populations are the same

The hypotheses: Alternative
- An association exists between the row and column variables
  – No particular direction (not one- or two-sided)
  – The distributions of sports goals in the male and female populations are not all the same
  – Includes many kinds of possible associations, for example “Men rate social comparison higher as a goal than do women”

OK: Now back to the Expected Count
- If the null hypothesis were true, what would the count in each cell be?
- For women in the HSC-HM cell, it would work like this:
  – 33.6% of all respondents are HSC-HM
  – We have 67 women
  – So, if no sex difference exists (our null), we would expect 33.6% of our 67 women to be HSC-HM, that is, 22.5 women

Expected Count
1. 45/134 = 33.6% of all respondents are HSC-HM.
2. 33.6% of 67 women is 22.5.
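The two steps above amount to multiplying the row total by the column total and dividing by the grand total. A short R sketch, using only the totals quoted on the slides (the variable names are illustrative):

```r
row_total   <- 45    # all HSC-HM respondents
col_total   <- 67    # all women
grand_total <- 134   # all respondents

# Step 1: proportion of all respondents who are HSC-HM
prop_hsc_hm <- row_total / grand_total

# Step 2: apply that proportion to the 67 women
expected_count <- prop_hsc_hm * col_total

# Equivalent one-line form: row total * column total / grand total
expected_direct <- row_total * col_total / grand_total

print(c(proportion = prop_hsc_hm, expected = expected_count))
```

Because the study has equally many men and women, the expected count for men in the same row works out to the same value.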

Finally: The Chi-Squared Statistic Itself
- Compare the entire set of observed counts with the set of expected counts
- Take the difference in each cell between observed and expected
- Square each difference
- Normalize these (divide by the expected count)
- Sum over all cells

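The Formula:
$X^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$, summed over all cells
- Large values of X² provide evidence against the null hypothesis
- A chi-square distribution is used to obtain the p-value
- Degrees of freedom are (r-1)(c-1)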
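The whole calculation can be sketched in R as follows. Only part of the table survives in this transcript: the HSC-HM row (14 females, 45 - 14 = 31 males) follows directly from the slides, and the HSC-LM row (7 and 18) is back-calculated from the quoted 10% and 27% of 67. The two LSC rows are assumed values chosen to match the column totals of 67, so the resulting statistic is illustrative and should not be read as the lab's reported result.

```r
# Observed counts (rows: goal group, columns: sex).
# LSC-HM and LSC-LM rows are ASSUMED, not taken from the transcript.
observed <- matrix(c(14, 31,
                      7, 18,
                     21,  5,
                     25, 13),
                   ncol = 2, byrow = TRUE,
                   dimnames = list(goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                   sex  = c("Female", "Male")))

# Expected count under the null: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Sum over all cells of (observed - expected)^2 / expected
x2 <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (r - 1)(c - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail area of the chi-square distribution gives the p-value
p_value <- pchisq(x2, df = df, lower.tail = FALSE)

# Cross-check against the built-in test
builtin <- chisq.test(observed)
print(c(by_hand = x2, builtin = unname(builtin$statistic), df = df))
```

Computing the statistic by hand and comparing it with `chisq.test()` is a quick way to confirm that the expected counts were built correctly.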

In this case...
- Chi-squared = on 3 df.
- The p-value is less than
- The probability of obtaining a chi-squared value greater than or equal to this by chance alone is very small
- Clear evidence against the null hypothesis
- Strong evidence that female and male students have different distributions of sports goals

Is that all you can say?
- No, you can and should combine the test with a description that shows the relationship.
  – Percents in our earlier table and our graph
  – Summary comments: the percent of males in each of the HSC goal classes is more than twice the percent of females
  – The HSC-HM group contains 46% of the males, but only 21% of the females
  – The HSC-LM group contains 27% of the males and only 10% of the females
  – We conclude that males are more likely to be motivated by social comparison goals and females are more likely to be motivated by mastery goals

Important to remember:
- The chi-square distribution is only an approximation to the distribution of our statistic; the approximation becomes more accurate as the cell counts increase.
- For 2 x 2 tables, the expected count in each of the 4 cells must be 5 or higher.
- For tables larger than 2 x 2, the average of the expected counts must be 5 or higher, and the smallest expected count must be 1 or more.
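These guidelines are easy to check before running the test. The helper below is a sketch only: the function name `expected_counts_ok` and the small example tables are made up for illustration and are not part of the lab.

```r
# Returns TRUE when the expected counts satisfy the guideline above.
expected_counts_ok <- function(observed) {
  # Expected counts: row total * column total / grand total
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  if (all(dim(observed) == c(2, 2))) {
    all(expected >= 5)                         # 2 x 2: every cell at least 5
  } else {
    mean(expected) >= 5 && min(expected) >= 1  # larger tables
  }
}

# Hypothetical tables, for illustration only
small_2x2 <- matrix(c(3, 2, 4, 6), nrow = 2)
large_2x2 <- matrix(c(20, 30, 25, 25), nrow = 2)
table_3x2 <- matrix(c(10, 12, 9, 11, 8, 10), nrow = 3)

print(c(small = expected_counts_ok(small_2x2),
        large = expected_counts_ok(large_2x2),
        three_by_two = expected_counts_ok(table_3x2)))
```

The check uses expected counts rather than observed counts, since the guideline is stated in terms of what the null hypothesis predicts.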

Important to remember:
- This is sometimes called the chi-squared test for homogeneity or the chi-squared test of independence.
- Although this is one of the most widely used statistical tools, it is also one of the least informative.
  – The only thing you produce is a p-value, and there is no associated parameter to describe the degree of dependence
  – The alternative hypothesis is very general (that rows and columns are not independent)