Material taken from: Mathematics for the International Student: Mathematical Studies SL, by Mal Coad, Glen Whiffen, John Owen, Robert Haese, Sandra Haese and Mark Bruce (Haese and Haese Publications, 2004), and Mathematical Studies Standard Level, by Peter Blythe, Jim Fensom, Jane Forrest and Paula Waldman de Tokman (Oxford University Press, 2012).
The Chi-Squared (χ²) Test
Consider this table: How many people are in the sample? – How many males? – How many females? This is called a 2 x 2 contingency table.
Gender vs. Regular Exercise: Does a person’s gender influence whether they exercise regularly?
Gender vs. Regular Exercise
The variables may be dependent: – Females may be more likely to exercise regularly than males.
The variables may be independent: – Gender has no effect on whether a person exercises regularly.
A chi-squared test is used to determine whether two variables from the same sample are independent.
Notice: we are testing only whether the variables are independent, not how they are related.
Example: Suppose you collect data on the favorite t-shirt color for men and women. You want to find out if shirt color and gender are independent or not. Perform a χ² test. To perform this test there are four main steps.
Example: Step 1: Write the null (H₀) and alternative (H₁) hypotheses.
H₀ states that the data sets are independent. H₁ states that the data sets are not independent.
Write them using this form:
H₀: _______ is independent of _______.
H₁: _______ is not independent of _______.
Thus:
H₀: Color of t-shirts is independent of gender.
H₁: Color of t-shirts is not independent of gender.
Example: Step 2: Calculate the chi-squared (χ²) statistic.
First: put the data into a contingency table that shows the frequency of the two variables.
– The elements in the table are the observed data.
– It is good to include an extra row and column for the ‘totals.’
(Observed-data table: rows Male, Female, Totals; columns Black, White, Red, Blue, Totals.)
This is a 2 x 4 matrix formed only from the main entries (not the totals).
Example: Step 2: Calculate the chi-squared (χ²) statistic.
Second: from the observed data, calculate the expected frequencies and place these into an expected values table. Each expected frequency is (row total x column total) / grand total.

          Column 1       Column 2       Totals
Row 1                                   Sum Row 1
Row 2                                   Sum Row 2
Totals    Sum Column 1   Sum Column 2   Total

(Expected-values table: rows Male, Female, Totals; columns Black, White, Red, Blue, Totals.)
Note: No expected value may be less than 1, and each expected value should be 5 or higher. If any entries are between 1 and 5, combine table rows or columns.
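The expected-frequency step can be sketched in a few lines of Python. This is only an illustration: the function name `expected_frequencies` and the 2 x 2 counts are hypothetical (they are not the t-shirt data from the slides), and the rule used is the usual (row total x column total) / grand total.

```python
def expected_frequencies(observed):
    """Return the table of expected frequencies for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Each cell: (row total x column total) / grand total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts, chosen only for illustration.
observed = [[30, 20],
            [10, 40]]

expected = expected_frequencies(observed)
print(expected)

# Check the rule of thumb from the note: every expected value should be 5 or higher.
needs_combining = any(value < 5 for row in expected for value in row)
print("Combine rows or columns?", needs_combining)
```

Here both row totals are 50 and the column totals are 40 and 60, so each row of the expected table is 20 and 30, and no combining is needed.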
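Example: Step 2: Calculate the chi-squared (χ²) statistic.
Third: to calculate the χ² value, use the following formula:
$\chi^2_{calc} = \sum \frac{(f_o - f_e)^2}{f_e}$, where $f_o$ is an observed frequency and $f_e$ is the corresponding expected frequency.
… = 33.8
You will not need to do this calculation by hand on exams: use your GDC!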
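For project work, where the statistic is computed from the formula rather than on a GDC, the sum over all cells might look like the sketch below. The counts are hypothetical (not the slide data that gives 33.8), the function name is an assumption, and no continuity correction is applied.

```python
def chi_squared_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    return sum(
        (f_o - f_e) ** 2 / f_e
        for obs_row, exp_row in zip(observed, expected)
        for f_o, f_e in zip(obs_row, exp_row)
    )


# Hypothetical observed counts and their expected values
# (row total x column total / grand total).
observed = [[30, 20],
            [10, 40]]
expected = [[20.0, 30.0],
            [20.0, 30.0]]

chi2_calc = chi_squared_statistic(observed, expected)
print(round(chi2_calc, 2))
```

The four cells contribute 5, 3.33, 5 and 3.33, so the statistic for these made-up counts is about 16.67.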
Example: Step 3: Calculate the critical value.
The critical value can be determined from the level of significance and the degrees of freedom.
The most common levels of significance are 1%, 5% and 10%.
Degrees of freedom = (number of rows – 1) x (number of columns – 1)
On exams the critical value will be given!
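As a rough illustration, the degrees-of-freedom rule and a critical-value lookup can be written as follows. The function name and the small `CRITICAL_VALUES` dictionary are assumptions made for this sketch; the numbers in it are the standard chi-squared table values for 1 to 4 degrees of freedom, not values taken from the slides.

```python
def degrees_of_freedom(n_rows, n_cols):
    """(number of rows - 1) x (number of columns - 1), totals excluded."""
    return (n_rows - 1) * (n_cols - 1)


# Standard chi-squared table values, keyed by (degrees of freedom, significance level).
CRITICAL_VALUES = {
    (1, 0.10): 2.706, (1, 0.05): 3.841, (1, 0.01): 6.635,
    (2, 0.10): 4.605, (2, 0.05): 5.991, (2, 0.01): 9.210,
    (3, 0.10): 6.251, (3, 0.05): 7.815, (3, 0.01): 11.345,
    (4, 0.10): 7.779, (4, 0.05): 9.488, (4, 0.01): 13.277,
}

# The t-shirt table has 2 rows (gender) and 4 columns (colors).
df = degrees_of_freedom(2, 4)
critical_value = CRITICAL_VALUES[(df, 0.05)]
print(df, critical_value)
```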
Example: Step 4: Compare χ² calc with the critical value.
If χ² calc < critical value, then do not reject the null hypothesis.
If χ² calc > critical value, then reject the null hypothesis.
In our example the table is 2 x 4, so there are (2 – 1) x (4 – 1) = 3 degrees of freedom, and the critical value at the 5% level is 7.815. Since 33.8 > 7.815, we reject the null hypothesis that t-shirt color and gender are independent of each other. Therefore, at the 5% significance level, there is evidence that gender and t-shirt color are dependent on each other.
Example: Step 4 (alternative): Compare the p-value with the significance level.
Write the significance level as a decimal: 1% = 0.01, 5% = 0.05, 10% = 0.10, etc.
If p-value < significance level, then reject the null hypothesis.
If p-value > significance level, then do not reject the null hypothesis.
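The two versions of Step 4 can be sketched side by side. The function names are hypothetical. The value 33.8 is the statistic from the example. The critical value 7.815 is the standard table entry for 3 degrees of freedom at 5% and is assumed here. The p-value 0.02 is made up purely to show the comparison.

```python
def reject_by_critical_value(chi2_calc, critical_value):
    """Reject the null hypothesis when the statistic exceeds the critical value."""
    return chi2_calc > critical_value


def reject_by_p_value(p_value, significance_level):
    """Reject the null hypothesis when the p-value is below the significance level."""
    return p_value < significance_level


# Statistic from the t-shirt example; critical value assumed (3 df, 5% level).
print(reject_by_critical_value(33.8, 7.815))

# Hypothetical p-value compared against 5% written as a decimal.
print(reject_by_p_value(0.02, 0.05))

# A hypothetical p-value above the significance level: do not reject.
print(reject_by_p_value(0.30, 0.05))
```

For the same data and significance level, both comparisons lead to the same decision, so either may be used.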
How to do it:
1) Write the null hypothesis (H₀) and the alternative hypothesis (H₁).
2) Calculate χ² calc:
   a) Using your GDC (in examinations)
   b) Using the formula (in project work)
3) Determine:
   a) The p-value by using your GDC
   b) The critical value (given in examinations)
4) Compare:
   a) The p-value against the significance level
   b) χ² calc against the critical value
Practice: One hundred people were interviewed to find out which flavor of ice cream they preferred. The results are given in the table, classified by gender.
(Table: rows Male, Female, Totals; columns Strawberry, Coffee, Orange, Vanilla, Totals.)
Perform a chi-squared test at the 5% significance level to determine whether the flavor of ice cream is independent of gender.
a) State the null hypothesis and the alternative hypothesis.
b) Show that the expected frequency for female and strawberry flavor is approximately …
c) Write down the number of degrees of freedom.
d) Write down the χ² calc value for this data.
The critical value is …
e) Using the critical value or the p-value, comment on your results.
Practice: Members of a club are required to register for one of three games: billiards, snooker or darts. The number of club members of each gender choosing each game is recorded.

          Billiards   Snooker   Darts
Male      39          16        8
Female    21          14        17

Perform a chi-squared test at the 10% significance level to determine if the game chosen is independent of gender.
a) State the null hypothesis and the alternative hypothesis.
b) Show that the expected frequency for female and billiards is approximately …
c) Write down the number of degrees of freedom.
d) Write down the χ² calc value for this data.
The critical value is …
e) Using the critical value or the p-value, comment on your results.
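One way to check answers to the club question is the sketch below. It assumes the run-together table digits split as Male 39, 16, 8 and Female 21, 14, 17. It also relies on the fact that for exactly 2 degrees of freedom the chi-squared p-value has the closed form exp(−χ²/2), so no statistics library is needed. The variable names are illustrative.

```python
import math

# Assumed reading of the table: columns are billiards, snooker, darts.
observed = [[39, 16, 8],    # Male
            [21, 14, 17]]   # Female

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = row total x column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2_calc = sum(
    (f_o - f_e) ** 2 / f_e
    for obs_row, exp_row in zip(observed, expected)
    for f_o, f_e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed forms valid ONLY for 2 degrees of freedom.
p_value = math.exp(-chi2_calc / 2)
critical_value = -2 * math.log(0.10)   # 10% significance level

print("Expected (female, billiards):", round(expected[1][0], 1))
print("Degrees of freedom:", df)
print("Chi-squared:", round(chi2_calc, 2))
print("Critical value at 10%:", round(critical_value, 3))
print("p-value:", round(p_value, 4))
print("Reject H0?", chi2_calc > critical_value)
```

Under these assumptions the expected frequency for female and billiards is about 27.1, the statistic is about 7.79 with 2 degrees of freedom, and both the critical-value comparison and the p-value comparison reject the null hypothesis at the 10% level.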