Is a person’s size related to whether they were bullied? You gathered data from 209 children at Springfield Elementary School. Assessed: Height (short vs. not short), Bullied (yes vs. no)
Results: 2 x 2 table of Height (short vs. not short) by Ever Bullied (yes vs. no)
Is this difference in proportion due to chance? To test this you use a Chi-Square (χ²) test. Notice you are using nominal data.
Hypothesis H1: There is a relationship between the two variables, i.e., a person’s size is related to whether they were bullied. H0: The two variables are independent of each other, i.e., there is no relationship between a person’s size and whether they were bullied.
Logic 1) Calculate an observed Chi-square 2) Find a critical value 3) See if the observed Chi-square falls in the critical area
Chi-Square χ² = Σ (O - E)² / E, where O = observed frequency and E = expected frequency
Observed Frequencies: the counts in each cell of the Height by Ever Bullied table
Expected frequencies are how many observations you would expect in each cell if the null hypothesis were true, i.e., if there were no relationship between a person’s size and whether they were bullied
Expected frequencies To calculate a cell’s expected frequency, apply this formula to each cell: E = (row total * column total) / N
Expected Frequencies (short, bullied) Row total = 92, Column total = 72, N = 209. E = (92 * 72) / 209 = 31.69
Expected Frequencies (short, not bullied) E = (92 * 137) / 209 = 60.31
Expected Frequencies (not short, bullied) E = (117 * 72) / 209 = 40.31
Expected Frequencies (not short, not bullied) E = (117 * 137) / 209 = 76.69
Expected Frequencies The expected frequencies are what you would expect if there was no relationship between the two variables!
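The expected-frequency formula can be applied to all four cells at once. The short sketch below uses the row and column totals quoted on the slides; the dictionary keys are descriptive labels chosen here for illustration, not names from the original table.

```python
# Marginal totals from the slides; the labels are illustrative names.
row_totals = {"short": 92, "not short": 117}
col_totals = {"bullied": 72, "not bullied": 137}
n = sum(row_totals.values())  # total sample size, 209

# E = (row total * column total) / N for every cell of the table
expected = {
    (row, col): row_totals[row] * col_totals[col] / n
    for row in row_totals
    for col in col_totals
}

for cell, e in expected.items():
    print(cell, round(e, 2))
```

Note that the four expected frequencies add up to N, just as the observed frequencies do.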
How do the expected frequencies work? Looking only at the row and column totals:
How do the expected frequencies work? If you randomly selected a person from these 209 people, what is the probability you would select a person who is short? 92 / 209 = .44
How do the expected frequencies work? If you randomly selected a person from these 209 people, what is the probability you would select a person who was bullied? 72 / 209 = .34
How do the expected frequencies work? If the two variables are independent, what is the probability you would select a person who was bullied and is short? (.44) (.34) = .15
How do the expected frequencies work? How many people would you expect to have been bullied and short? (.15 * 209) = 31.35 (differs from 31.69 due to rounding)
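The same reasoning can be checked without rounding the intermediate probabilities. This is a minimal sketch using only the totals from the slides; the variable names are illustrative. Multiplying the two probabilities is valid only under the assumption that height and bullying are independent, which is exactly what the null hypothesis states.

```python
n = 209
p_short = 92 / n     # probability of selecting a short child
p_bullied = 72 / n   # probability of selecting a bullied child

# Multiplication rule: holds only if the two variables are independent (H0)
p_short_and_bullied = p_short * p_bullied

# Expected number of short, bullied children under H0
expected_short_bullied = p_short_and_bullied * n
print(round(expected_short_bullied, 2))
```

Carrying full precision gives the same value as E = (92 * 72) / 209, which shows that the 31.35 versus 31.69 gap comes only from rounding .44, .34, and .15.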
Back to Chi-Square χ² = Σ (O - E)² / E, where O = observed frequency and E = expected frequency. For the bullying data, summing (O - E)² / E over the four cells gives χ² = 9.13
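The statistic is the sum of (O - E)² / E over all cells, which takes only a few lines to compute. The observed table did not survive in these notes, so the cell counts below are hypothetical: they were chosen only because they match the marginal totals quoted on the slides (92, 117, 72, 137). Treat them as an assumption, not as the original data.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# HYPOTHETICAL counts, in the order:
# short/bullied, short/not bullied, not short/bullied, not short/not bullied.
# They are consistent with the slide totals but are not taken from the source.
observed = [42, 50, 30, 87]
n = sum(observed)

row_totals = [92, 92, 117, 117]   # row total for each cell
col_totals = [72, 137, 72, 137]   # column total for each cell
expected = [r * c / n for r, c in zip(row_totals, col_totals)]

chi2 = chi_square(observed, expected)
print(round(chi2, 2))
```

With these assumed counts the result rounds to the 9.13 used in the rest of the example.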
Significance Is a χ² of 9.13 significant at the .05 level? To find out you need to know df
Degrees of Freedom To determine the degrees of freedom you use the number of rows (R) and the number of columns (C): df = (R - 1)(C - 1). Rows = 2, Columns = 2, so df = (2 - 1)(2 - 1) = 1
Significance Look on page 691 df = 1, α = .05, χ² critical = 3.84
Decision If χ² > χ² critical: reject H0 and accept H1. If χ² ≤ χ² critical: fail to reject H0.
Current Example χ² = 9.13, χ² critical = 3.84. Thus, reject H0 and accept H1.
Current Example H1: There is a relationship between the two variables. A person’s size is significantly (alpha = .05) related to whether they were bullied.
Seven Steps for Doing χ² 1) State the hypothesis 2) Create data table 3) Find χ² critical 4) Calculate the expected frequencies 5) Calculate χ² 6) Decision 7) Put answer into words
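The degrees-of-freedom formula and the decision rule can be written as two small functions. This is a sketch, not a general-purpose test: the lookup table holds only the α = .05 critical values quoted in this lecture, and the function names are illustrative.

```python
# Critical values of chi-square at alpha = .05, as quoted in this lecture
CRITICAL_05 = {1: 3.84, 2: 5.99, 4: 9.49, 5: 11.07}

def degrees_of_freedom(rows, cols):
    """df for a test of independence: (R - 1)(C - 1)."""
    return (rows - 1) * (cols - 1)

def decide(chi2, df):
    """Reject H0 only when the observed chi-square exceeds the critical value."""
    critical = CRITICAL_05[df]  # raises KeyError for a df not in the table
    return "reject H0" if chi2 > critical else "fail to reject H0"

df = degrees_of_freedom(2, 2)
print(df, decide(9.13, df))
```

For any other df or α level you would look up the critical value in a full chi-square table.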
Example With whom do you find it easiest to make friends? Subjects were either male or female. Possible responses were: “opposite sex”, “same sex”, or “no difference”. Is there a significant (.05) relationship between the gender of the subject and their response?
Results
Step 1: State the Hypothesis H1: There is a relationship between gender and with whom a person finds it easiest to make friends. H0: Gender and with whom a person finds it easiest to make friends are independent of each other.
Step 2: Create the Data Table Add “total” columns and rows
Step 3: Find χ² critical df = (R - 1)(C - 1) = (2 - 1)(3 - 1) = 2, α = .05, χ² critical = 5.99
Step 4: Calculate the Expected Frequencies Two steps: 4.1) Calculate values 4.2) Put values on your data table
Step 5: Calculate χ² χ² = Σ (O - E)² / E, where O = observed frequency and E = expected frequency. χ² = 8.5
Step 6: Decision If χ² > χ² critical: reject H0 and accept H1. If χ² ≤ χ² critical: fail to reject H0. χ² = 8.5, χ² critical = 5.99. Thus, reject H0 and accept H1.
Step 7: Put the answer into words H1: There is a relationship between gender and with whom a person finds it easiest to make friends. A person’s gender is significantly (.05) related to with whom they find it easiest to make friends.
Effect Size Chi-Square tests are null hypothesis tests. They tell you nothing about the “size” of the effect. Phi (φ) can be interpreted as a correlation coefficient.
Phi Use with 2x2 tables φ = √(χ² / N), where N = sample size
Bullied Example χ² = 9.13, N = 209, so φ = √(9.13 / 209) = .21
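For a 2x2 table, phi is usually computed as the square root of χ² divided by N. The sketch below applies that definition to the bullying example, using the χ² and N reported earlier; the function name is illustrative.

```python
import math

def phi(chi2, n):
    """Phi coefficient for a 2x2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi2 / n)

# Bullying example: chi-square = 9.13, N = 209
print(round(phi(9.13, 209), 2))
```

Because phi is read like a correlation coefficient, a value near .21 indicates a modest relationship even though the test itself was significant.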
SPSS
χ² as a test for goodness of fit But what if you have a theory or hypothesis that the frequencies should occur in a particular manner?
Example M&Ms claim that of their candies: 30% are brown, 20% are red, 20% are yellow, 10% are blue, 10% are orange, 10% are green
Example Based on genetic theory you hypothesize that in the population: 45% have brown eyes, 35% have blue eyes, 20% have another eye color
To solve you use the same basic steps as before (slightly different order) 1) State the hypothesis 2) Find χ² critical 3) Create data table 4) Calculate the expected frequencies 5) Calculate χ² 6) Decision 7) Put answer into words
Example Four 1-pound bags of plain M&Ms are purchased. Each M&M is counted and categorized according to its color. Question: Is M&M’s “theory” about the colors of M&Ms correct?
Step 1: State the Hypothesis H0: The data do fit the model, i.e., the observed data agree with M&M’s theory. H1: The data do not fit the model, i.e., the observed data do not agree with M&M’s theory. NOTE: These are backwards from what you have done before
Step 2: Find χ² critical df = number of categories - 1 = 6 - 1 = 5, α = .05, χ² critical = 11.07
Step 3: Create the data table Add the expected proportion of each category
Step 4: Calculate the Expected Frequencies Expected Frequency = (proportion)(N), with N = 2081
Brown: (.30)(2081) = 624.30
Red: (.20)(2081) = 416.20
Yellow: (.20)(2081) = 416.20
Blue, orange, green: (.10)(2081) = 208.10 each
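In a goodness-of-fit test the expected frequencies come from the claimed proportions rather than from row and column totals. A minimal sketch using the proportions and N = 2081 from the slides (variable names are illustrative):

```python
# Proportions claimed for each color, as listed in the example
claimed = {
    "brown": 0.30, "red": 0.20, "yellow": 0.20,
    "blue": 0.10, "orange": 0.10, "green": 0.10,
}
n = 2081  # total number of candies counted

# Expected Frequency = (proportion)(N)
expected = {color: p * n for color, p in claimed.items()}

for color, e in expected.items():
    print(color, round(e, 2))
```

The claimed proportions must sum to 1, so the expected frequencies sum to N.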
Step 5: Calculate χ² χ² = Σ (O - E)² / E, where O = observed frequency and E = expected frequency. χ² = 15.52
Step 6: Decision If χ² > χ² critical: reject H0 and accept H1. If χ² ≤ χ² critical: fail to reject H0. χ² = 15.52, χ² critical = 11.07. Thus, reject H0 and accept H1.
Step 7: Put the answer into words H1: The data do not fit the model. The observed colors differ significantly (.05) from M&M’s color “theory”.
Practice Among women in the general population under the age of 40: 60% are married, 23% are single, 4% are separated, 12% are divorced, 1% are widowed
Practice You sample 200 female executives under the age of 40. Question: Is marital status distributed the same way in the population of female executives as in the general population (α = .05)?
Step 1: State the Hypothesis H0: The data do fit the model, i.e., marital status is distributed the same way in the population of female executives as in the general population. H1: The data do not fit the model, i.e., marital status is not distributed the same way in the population of female executives as in the general population.
Step 2: Find χ² critical df = number of categories - 1 = 5 - 1 = 4, α = .05, χ² critical = 9.49
Step 3: Create the data table
Step 4: Calculate the Expected Frequencies
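For this practice problem the expected frequencies follow from the population percentages and the sample of 200. The helper below is a hypothetical convenience function written for this illustration; only the percentages and N come from the problem statement.

```python
def expected_counts(proportions, n):
    """Expected Frequency = (proportion)(N) for each category."""
    return {category: p * n for category, p in proportions.items()}

# General-population proportions for women under 40, from the problem
population = {
    "married": 0.60, "single": 0.23, "separated": 0.04,
    "divorced": 0.12, "widowed": 0.01,
}

expected = expected_counts(population, 200)
df = len(population) - 1  # number of categories - 1

for status, e in expected.items():
    print(status, round(e, 1))
```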
Step 5: Calculate χ² χ² = Σ (O - E)² / E, where O = observed frequency and E = expected frequency. χ² = 19.42
Step 6: Decision If χ² > χ² critical: reject H0 and accept H1. If χ² ≤ χ² critical: fail to reject H0. χ² = 19.42, χ² critical = 9.49. Thus, reject H0 and accept H1.
Step 7: Put the answer into words H1: The data do not fit the model. Marital status is not distributed the same way in the population of female executives as in the general population (α = .05)
Practice In the past you have had a 20% success rate at getting someone to accept a date from you. What is the probability that at least 2 of the next 10 people you ask out will accept?
Practice p(zero will accept) = .11, p(one will accept) = .27, p(zero OR one will accept) = .38, p(two or more will accept) = 1 - .38 = .62
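These probabilities come from the binomial distribution with 10 trials and a success probability of .20. A short sketch using only the standard library; the function name is illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.20
p_zero = binom_pmf(0, n, p)
p_one = binom_pmf(1, n, p)

# "At least 2" is the complement of "zero or one"
p_two_or_more = 1 - (p_zero + p_one)

print(round(p_zero, 2), round(p_one, 2), round(p_two_or_more, 2))
```

Working through the complement avoids summing the nine separate probabilities for 2 through 10 acceptances.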
http://www.ds.unifi.it/VL/VL_EN/bernoulli/bernoulli2.html
Practice In 1693, Samuel Pepys asked Isaac Newton whether it is more likely to get at least one ace in 6 rolls of a die or at least two aces in 12 rolls of a die. This problem is known as Pepys’ problem.
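Binomial Distribution (probability by number of aces) At least one ace in 6 rolls: p = .67
Binomial Distribution (probability by number of aces) At least two aces in 12 rolls: p = .62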
Practice Pepys’ problem: p = .67 versus p = .62. It is more likely to get at least one ace in 6 rolls of a die!
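Both probabilities in Pepys’ problem can be obtained from the binomial distribution with p = 1/6 per roll. A minimal sketch; the function and variable names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_ace = 1 / 6  # probability of an ace on one roll of a fair die

# At least one ace in 6 rolls = 1 - P(no aces)
at_least_one_in_6 = 1 - binom_pmf(0, 6, p_ace)

# At least two aces in 12 rolls = 1 - P(no aces) - P(exactly one ace)
at_least_two_in_12 = 1 - binom_pmf(0, 12, p_ace) - binom_pmf(1, 12, p_ace)

print(round(at_least_one_in_6, 2), round(at_least_two_in_12, 2))
```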
Practice Which is more likely: at least one ace with 4 throws of a fair die or at least one double ace in 24 throws of two fair dice? This is known as de Méré’s problem, named after the Chevalier de Méré. Blaise Pascal later solved this problem.
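One way to approach this problem is the complement rule: the probability of at least one success equals 1 minus the probability of no successes. The sketch below assumes fair dice and independent throws; the variable names are illustrative.

```python
# One die: probability of no ace on a single throw is 5/6
p_ace_in_4 = 1 - (5 / 6) ** 4

# Two dice: probability of no double ace on a single throw is 35/36
p_double_ace_in_24 = 1 - (35 / 36) ** 24

print(round(p_ace_in_4, 4), round(p_double_ace_in_24, 4))
print("single ace is more likely:", p_ace_in_4 > p_double_ace_in_24)
```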