Chi-square test
A chi-squared test, also written as χ² test, is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true. Chi-squared tests are often constructed from a sum of squared errors, or through the sample variance. A chi-squared test can be used to attempt rejection of the null hypothesis that the data are independent. As with other tests, the purpose is to evaluate how likely the observations would be if the null hypothesis were true.
The chi-squared distribution is used in the following common procedures: (1) testing the goodness of fit of an observed distribution to a theoretical one; (2) testing the independence of two criteria of classification of qualitative data; (3) estimating a confidence interval for the population standard deviation of a normal distribution from a sample standard deviation.
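As a rough illustration of the goodness-of-fit use, the sketch below runs such a test in R with chisq.test(). The die-roll counts and the names rolls, fair and gof are hypothetical and are not taken from any data in this document.

```r
# Hypothetical counts of each face in 60 rolls of a die (illustrative only)
rolls <- c(8, 12, 9, 11, 10, 10)

# Null hypothesis: the die is fair, so each face has probability 1/6
fair <- rep(1 / 6, 6)

# Goodness-of-fit test; degrees of freedom = number of categories - 1
gof <- chisq.test(rolls, p = fair)

gof$statistic   # sum of (O - E)^2 / E, with E = 10 for every face
gof$parameter   # degrees of freedom
gof$p.value     # p-value of the test
```

A large p-value here would mean the hypothetical counts give no evidence against a fair die.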
Chi-Square Test of Independence
Our question of interest is: are the two variables independent? This question is set up using the following hypothesis statements:
Null hypothesis: the two categorical variables are independent.
Alternative hypothesis: the two categorical variables are dependent.
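We can summarize two categorical variables in a two-way table, also called an r × c contingency table, where r is the number of rows and c is the number of columns. The chi-square test statistic is calculated using the formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where the sum runs over all cells of the table, O represents the observed frequency, and E is the expected frequency under the null hypothesis, computed for each cell by:
$$E = \frac{\text{row total} \times \text{column total}}{\text{sample size}}$$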
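The calculation can be sketched by hand in R. This is a minimal illustration, assuming a small hypothetical 2 × 2 table. The counts and the names observed, expected, chi_sq and df are invented for the example.

```r
# Hypothetical 2 x 2 table of observed counts (illustrative only)
observed <- matrix(c(20, 30,
                     30, 20), nrow = 2, byrow = TRUE)

# Expected count for each cell: (row total * column total) / sample size
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Test statistic: sum over all cells of (O - E)^2 / E
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (r - 1)(c - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

expected   # every cell is 25 for this table
chi_sq     # 4 * (5^2 / 25) = 4
df         # 1
```

outer() multiplies every row total by every column total, so the whole table of expected counts is produced in one step rather than cell by cell.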
We compare the value of the test statistic to the critical value $\chi^2_\alpha$ with (r − 1)(c − 1) degrees of freedom, and reject the null hypothesis if $\chi^2 > \chi^2_\alpha$.
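The critical value does not have to be read from a printed table. A minimal sketch in R, assuming a 5% significance level and a table with 2 rows and 4 columns (both chosen for illustration), uses the quantile function qchisq(). The names alpha, n_rows, n_cols, critical and reject_null are hypothetical.

```r
alpha  <- 0.05   # significance level (assumed for illustration)
n_rows <- 2      # r, number of rows in the contingency table
n_cols <- 4      # c, number of columns in the contingency table

# Degrees of freedom: (r - 1)(c - 1)
df <- (n_rows - 1) * (n_cols - 1)

# Upper-tail critical value: the (1 - alpha) quantile of the chi-squared distribution
critical <- qchisq(1 - alpha, df = df)
critical

# Decision rule: reject the null hypothesis when the statistic exceeds the critical value
reject_null <- function(statistic) statistic > critical
```

With 3 degrees of freedom, critical rounds to 7.815.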
Example 1. Is gender independent of education level? A random sample of 395 people was surveyed, and each person was asked to report the highest education level they had obtained. The data from the survey are summarized in the following table:

          High School   Bachelors   Masters   Ph.D.   Total
Female         60           54         46       41     201
Male           40           44         53       57     194
Total         100           98         99       98     395
Are gender and education level dependent at the 5% level of significance? In other words, given the data collected above, is there a relationship between the gender of an individual and the level of education that they have obtained? Here is the table of expected counts, each computed as (row total × column total) / 395:

          High School   Bachelors   Masters   Ph.D.
Female       50.886       49.868     50.377   49.868
Male         49.114       48.132     48.623   48.132

Working this out, $\chi^2 = (60 - 50.886)^2/50.886 + \cdots + (57 - 48.132)^2/48.132 = 8.006$. The critical value of χ² with (2 − 1)(4 − 1) = 3 degrees of freedom is 7.815. Since 8.006 > 7.815, we reject the null hypothesis and conclude that education level depends on gender at the 5% level of significance.
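An equivalent way to reach the same decision is to compare a p-value with the significance level instead of comparing the statistic with the critical value. The sketch below assumes the statistic 8.006 and the 3 degrees of freedom reported for Example 1. The names statistic, df and p_value are hypothetical.

```r
statistic <- 8.006   # test statistic reported in Example 1
df <- 3              # (2 - 1) * (4 - 1)

# p-value: probability of a chi-squared value at least this large
# when the null hypothesis is true
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

p_value          # just under 0.05
p_value < 0.05   # TRUE means the null hypothesis is rejected at the 5% level
```

The two approaches always agree: the statistic exceeds the critical value exactly when the p-value falls below the significance level.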
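Solution in R
library(MASS)  # load the MASS package (optional: chisq.test() is in base R's stats package)
# Observed counts, filled row by row: first row female, second row male
data <- matrix(c(60, 54, 46, 41, 40, 44, 53, 57), ncol = 4, byrow = TRUE)
rownames(data) <- c("female", "male")
colnames(data) <- c("HighSchool", "Bachelors", "Masters", "Phd")
data <- as.table(data)
data              # print the contingency table
chisq.test(data)  # Pearson's chi-squared test of independence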
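The object returned by chisq.test() can also be stored and inspected piece by piece, which makes it easy to check the expected counts and the statistic against the hand calculation. This is a minimal sketch using the counts from Example 1. The names observed and result are hypothetical.

```r
# Observed counts from Example 1 (rows: female, male)
observed <- matrix(c(60, 54, 46, 41, 40, 44, 53, 57), ncol = 4, byrow = TRUE,
                   dimnames = list(c("female", "male"),
                                   c("HighSchool", "Bachelors", "Masters", "Phd")))

result <- chisq.test(observed)

result$expected    # expected counts under independence
result$statistic   # chi-squared test statistic
result$parameter   # degrees of freedom
result$p.value     # p-value of the test
```

Because this table is not 2 × 2, chisq.test() applies no continuity correction, so the statistic should match the hand-computed value of about 8.006.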
Chi Square Interpretation