Published by Godwin Oliver. Modified over 9 years ago.
1
1 The χ² test (Sections 19.1 and 19.2 of Howell). This section actually covers two totally separate tests: the goodness-of-fit test and contingency table analysis. Each has its own purpose and its own requirements. The only thing they have in common is the formula. Keep them separate in your mind!
2
2 Return to hypothesis testing. With a computer, testing statistical significance is no problem: we need p and alpha. When no computer is available, we can use tables to test statistical significance. It is a little more work, but it works just as well. This method uses the same logic as the p-value method.
3
3 Testing Ho without a PC. The strategy (new stuff is underlined):
Step 1: Set up Ho and Ha, and decide on alpha
Step 2: Calculate the statistic and df
Step 3: Get the critical value from the table
Step 4: Compare the critical value to the statistic
4
4 Step 1: Set up Ho and alpha (already known). New: Ha, the alternative hypothesis. If Ho is false, what do we believe then? (Ha.) Ha represents the opposite of Ho, e.g. if Ho: r = 0 then Ha: r ≠ 0. If we reject Ho (because it's false), then we must accept Ha as being true.
5
5 Step 2: Nothing different. Use the appropriate formulas for the statistic and df!
6
6 Step 3: Get the critical value from the table (back of Howell). Use alpha and df to look it up. Critical value: the value of your statistic at which p = alpha (the edge of the rejection region).
7
7 Step 4: Compare your statistic to the critical value. Ignore any minus signs (look only at the absolute value). If your calculated statistic is more than the critical value, then p < alpha (i.e. the test is significant!). Reject Ho and accept Ha. Pretty easy!
8
8 Example: Let's use an r value. We get r = 0.61 with df = 10 and alpha = 0.05. Is this significant? Critical value: use df and alpha on Table D2 in Howell (significant values of the correlation coefficient). For alpha = 0.05 and df = 10, the critical value is 0.576.
9
9 Example: Now we have the calculated value and the critical value. Calculated = 0.61, critical = 0.576. Check: if calculated > critical, reject Ho. 0.61 > 0.576, so we reject Ho. The result is statistically significant!
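The four-step comparison can be sketched in Python. This is only an illustration: the function name `is_significant` is made up here, and the critical value still has to be looked up by hand in the table (0.576 is the value quoted in the example).

```python
def is_significant(calculated, critical):
    """Return True when |calculated| exceeds the table's critical value."""
    # Step 4: ignore the sign of the statistic, then compare.
    return abs(calculated) > critical


# Values from the example: r = 0.61, df = 10, alpha = 0.05, critical r = 0.576.
r_calculated = 0.61
r_critical = 0.576

reject_h0 = is_significant(r_calculated, r_critical)
print("Reject Ho" if reject_h0 else "Do not reject Ho")  # prints "Reject Ho"
```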
10
10 Return to χ². Note: χ² only works with discrete data. What is the point of χ²? Goodness-of-fit: used to see if data match a hypothetical distribution. Are there the same number of men as women? Are about 25% of South Africans unemployed? Contingency table analysis (independence test): used like a correlation for discrete data (are the variables related?).
11
11 Goodness-of-fit χ². Used to test a model distribution of data. We have an idea of how the data should be distributed, e.g. there should be 40% brunettes and 60% blondes. We collect data and check whether our idea (the model) is supported by the data. Do the data fit the model? Before starting a goodness-of-fit test, always be sure of what the model is.
12
12 Creating a model. We put our expectations as percentages in a table, with one cell for each possible value of the variable. Each cell holds the percentage of observations we expect.
13
13 Example model. We expect 40% brunettes and 60% blondes, so:
Blondes    Brunettes
60%        40%
14
14 Observed scores and expected scores. Strategy: we want to see if our observations match our model. We collect some data (observed scores). We work out what the data would look like if our model were correct (expected scores). Then we compare the two: do the observed scores show the same pattern as the expected scores?
15
15 Converting the model to expected scores. We have our model as percentages. We must now convert the percentages to actual values (frequencies) using n, the number of observations. If we collected 134 observations, then:
            Blondes                  Brunettes
Model       60%                      40%
Expected    (60/100) x 134 = 80.4    (40/100) x 134 = 53.6
16
16 Converting % to frequency. To do this: (percentage / 100) x n. Keep the decimals! You cannot work with % for χ²: you must have frequencies (numbers of observations).
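The conversion can be sketched as a small Python helper. The name `expected_frequencies` and the check that the model adds up to 100% are assumptions added for illustration; the numbers are the blondes/brunettes model with n = 134.

```python
def expected_frequencies(model_percentages, n):
    """Convert a model given in percentages into expected frequencies."""
    total = sum(model_percentages.values())
    if abs(total - 100) > 1e-9:
        raise ValueError("model percentages must add up to 100")
    # (percentage / 100) x n, keeping the decimals
    return {category: (pct / 100) * n for category, pct in model_percentages.items()}


model = {"Blondes": 60, "Brunettes": 40}
expected = expected_frequencies(model, n=134)

for category, frequency in expected.items():
    print(f"{category}: {frequency:.1f}")  # Blondes: 80.4, Brunettes: 53.6
```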
17
17 Beginning the χ² analysis. To begin, we need Ho. For χ², it is always "observed data = expected data". State the model (in %). Collect the data. Create an expected frequency table (using your model and n). Calculate χ² to see if observed = expected.
18
18 χ² formula: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O = observed score and E = expected score.
19
19 χ² formula, step by step:
Step 1: for each category, take that category's O minus that category's E
Step 2: for each category, square the result of step 1
Step 3: for each category, divide the result of step 2 by that category's E
Step 4: sum all the step 3 results
20
20 Table method for χ². Use the following columns, and add up the last column:
O    E    O-E    (O-E)²    (O-E)²/E
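A minimal sketch of the table method in Python, one row per category. The function name `chi_square_table` is made up for this illustration, and the observed and expected counts below are hypothetical values chosen only to show the columns.

```python
def chi_square_table(observed, expected):
    """Build the O, E, O-E, (O-E)^2, (O-E)^2/E table and sum the last column."""
    if len(observed) != len(expected):
        raise ValueError("need one expected value per observed value")
    rows = []
    for o, e in zip(observed, expected):
        diff = o - e                # Step 1: O - E
        squared = diff ** 2         # Step 2: square it
        ratio = squared / e         # Step 3: divide by E
        rows.append((o, e, diff, squared, ratio))
    total = sum(row[4] for row in rows)  # Step 4: add up the last column
    return rows, total


observed = [10, 20]      # hypothetical observed frequencies
expected = [15.0, 15.0]  # hypothetical expected frequencies

rows, chi_square = chi_square_table(observed, expected)
for o, e, diff, squared, ratio in rows:
    print(f"{o:>6} {e:>6} {diff:>6} {squared:>8} {ratio:>10.3f}")
print(f"chi-square = {chi_square:.3f}")
```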
21
21 Degrees of freedom (df). The df for goodness-of-fit tests is easy to calculate: df = k - 1, where k is the number of possible values (categories) of your variable. Using males and females, k = 2. Using Coke, Pepsi, Sprite, k = 3. Using easy, moderate, hard, awesome, k = 4.
22
22 Worked example 1. We suspect that there is a 50%/50% gender distribution at UCT. We observed 147 people: 68 male, 79 female. Do we really have a 50%/50% distribution? Set up (step 1): Ho: distribution is 50%/50%. Ha: distribution is not 50%/50%. alpha = 0.05.
23
23 Example: work out expected scores (what would we have seen if Ho were true?). Model: males 50%, females 50%. Convert to scores with n = 147. Males expected: (50/100) x 147 = 73.5. Females expected: (50/100) x 147 = 73.5.
24
24 Example: O and E values. Now we have our values:
          O     E       O-E     (O-E)²    (O-E)²/E
Male      68    73.5
Female    79    73.5
25
25 Example: work out the columns.
          O     E       O-E     (O-E)²    (O-E)²/E
Male      68    73.5    -5.5    30.25     0.411
Female    79    73.5    5.5     30.25     0.411
26
26 Example: add up the values in the last column.
          O     E       O-E     (O-E)²    (O-E)²/E
Male      68    73.5    -5.5    30.25     0.411
Female    79    73.5    5.5     30.25     0.411
Total                                     0.823
27
27 Example: df. Now we have our χ² value: 0.823. Is it statistically significant? (Does the model explain the population?) We need the critical value for this! Degrees of freedom: k - 1. There are 2 categories (male, female), so df = 1.
28
28 Example: critical value. What is the critical value for our male/female example? df: k = 2 (male and female), so df = 1. For df = 1 and alpha = 0.05, the table says the critical value is 3.84. To be significant, our value must be more than 3.84.
29
29 Example: conclusions. Calculated < critical (0.823 < 3.84), so we do not reject Ho (this means: we keep the model "distribution is 50%/50%"). Conclusion: the data are consistent with there being as many males as females at UCT.
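Worked example 1 can be reproduced end to end with a few lines of Python. This is a sketch under the slides' assumptions: the critical value 3.84 is taken from the table (df = 1, alpha = 0.05) rather than computed, and the variable names are illustrative.

```python
observed = {"Male": 68, "Female": 79}
model = {"Male": 50, "Female": 50}  # Ho: the distribution is 50%/50%

n = sum(observed.values())  # 147 people observed
expected = {k: model[k] / 100 * n for k in observed}  # 73.5 in each category

# Sum of (O - E)^2 / E over the categories
chi_square = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

df = len(observed) - 1   # k - 1 with k = 2 categories
critical = 3.84          # table value for df = 1, alpha = 0.05
reject_h0 = chi_square > critical

print(f"chi-square = {chi_square:.3f}, df = {df}, reject Ho: {reject_h0}")
# prints "chi-square = 0.823, df = 1, reject Ho: False"
```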
30
30 Interpreting χ² findings. χ² findings are interpreted a little differently. Rejecting Ho (significance) means we cannot accept the model (the model is wrong for this population). Not rejecting Ho (non-significance) means we continue to assume that the model applies to this population. This is the case for goodness-of-fit tests.
31
31 Contingency table analysis with χ². Pearson's product-moment correlation allowed us to establish a relationship between 2 continuous variables, but it doesn't work for discrete data (categories). E.g. "is there a relationship between gender and owning a dog or cat?" (2 discrete variables). Contingency table analysis is used for this, and it can work with nominal variables.
32
32 Something old, something new. Quite similar to goodness-of-fit tests: work out the expected values, use the chi-square formula, work out df, and get a critical value from the table. Differences: a slightly different O table, a new way of working out expected values, and a new way of working out df.
33
33 Observed values. For each person, we ask 2 questions (2 variables): "are you male/female" and "do you have a dog or a cat" (let's assume we sample only pet owners). We end up with:
Subject    Gender    Pet
1          M         D
2          M         C
3          F         D
etc.
34
34 O table. We need to convert those data into a frequency table that looks like:
              GENDER
              Male    Female
PET    Dog
       Cat
35
35 Filling in the O table. Each cell has only one number in it: the number of people fitting that condition.
              GENDER
              Male    Female
PET    Dog    1       2
       Cat    3       4
In cell 1: number of people who are male AND have a dog. In cell 2: number of people who are female AND have a dog. In cell 3: number of people who are male AND have a cat, etc.
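Tallying the raw answers into the O table can be sketched with `collections.Counter`. Only the three subjects listed in the example data are used here, so the counts are tiny; the variable names are illustrative.

```python
from collections import Counter

# (gender, pet) for subjects 1 to 3: M = male, F = female, D = dog, C = cat
records = [("M", "D"), ("M", "C"), ("F", "D")]

counts = Counter(records)  # number of people per (gender, pet) combination

genders = ["M", "F"]  # columns of the O table
pets = ["D", "C"]     # rows of the O table

# One frequency per cell; a Counter returns 0 for combinations never seen.
o_table = [[counts[(gender, pet)] for gender in genders] for pet in pets]

for pet, row in zip(pets, o_table):
    print(pet, row)
# D [1, 1]
# C [1, 0]
```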
36
36 The finished O table. An O table usually looks like:
              GENDER
              Male    Female
PET    Dog    36      34
       Cat    7       32
We had 7 males with cats. We had 34 females with dogs. This table is a 2x2 table: 2 rows (pet) and 2 columns (gender).
37
37 Notes about O tables. The numbers inside the cells are frequencies (just like goodness-of-fit). You can have as many levels of a variable as you like, e.g. dog, cat, parakeet, moose, hamster, other (6 levels). BUT you can only have 2 variables, e.g. not gender, pet AND car type.
38
38 E values. Expected values are a bit more tricky. We want to finish with an E table of the same form as the O table:
Expected    Male    Female
Dog
Cat
We need to calculate a value for each cell, and we will use the O values to do this.
39
39 E values, step by step:
Step 1: work out the grand total from the O table (N)
Step 2: work out the marginal totals from the O table
Step 3: use a formula ($R_i C_j / N$) to get a value for each cell of the E table
40
40 Step 1: Grand total (N). How many people did we use? This is the same idea as the usual n, but it is called capital N (for some reason). To calculate it, add up the numbers in all of the cells. So in the gender/pet example: N = 36 + 34 + 7 + 32 = 109.
41
41 Step 2: Marginal totals. We can work out the totals for the margins of the O table:
O        Male    Female    Total
Dog      36      34        70
Cat      7       32        39
Total    43      66
The marginal totals are written on the edges of the O table.
42
42 Step 2: Calculating marginals. For each marginal, add up the numbers in that line, so:
O        Male           Female          Total
Dog      36             34              36 + 34 = 70
Cat      7              32              7 + 32 = 39
Total    36 + 7 = 43    34 + 32 = 66
Do the rows AND the columns!
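Steps 1 and 2 can be sketched in Python for the gender/pet O table. The list-of-rows layout (rows are pets, columns are genders) is just one convenient representation, not something the slides prescribe.

```python
# O table: rows are Dog, Cat; columns are Male, Female
o_table = [
    [36, 34],  # Dog
    [7, 32],   # Cat
]

row_totals = [sum(row) for row in o_table]               # Dog, Cat
column_totals = [sum(column) for column in zip(*o_table)]  # Male, Female
grand_total = sum(row_totals)                            # N

print("row totals:", row_totals)        # row totals: [70, 39]
print("column totals:", column_totals)  # column totals: [43, 66]
print("N:", grand_total)                # N: 109
```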
43
43 Step 3: Work out the E table. Write your marginals around your blank E table, in the right places!
E        Male    Female    Total
Dog                        70
Cat                        39
Total    43      66
We will now use the marginals to compute one E value for each cell. The formula for E: $E = \frac{R_i \times C_j}{N}$
44
44 Step 3: Work out a single cell. For each cell, look at the cell's row and column marginals ($R_i$ and $C_j$).
E        Male    Female    Total
Dog                        70
Cat                        39
Total    43      66
For Male/Dog: $R_i = 70$ and $C_j = 43$, so $E = \frac{70 \times 43}{109} = 27.614$. Do the same for each cell.
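Applying the same formula to every cell can be sketched as follows. The function name `expected_table` is an assumption made for this illustration.

```python
def expected_table(o_table):
    """Expected frequency for each cell: row total x column total / N."""
    row_totals = [sum(row) for row in o_table]
    column_totals = [sum(column) for column in zip(*o_table)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


# O table: rows are Dog, Cat; columns are Male, Female
e_table = expected_table([[36, 34], [7, 32]])

for label, row in zip(["Dog", "Cat"], e_table):
    print(label, [round(value, 3) for value in row])
```

The last printed digit can differ from the slide values by 0.001, since the slides appear to truncate rather than round. Note also that the E table has the same marginal totals as the O table, which is a quick check on the arithmetic.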
45
45 Ready to calculate χ². Now we have O and E, and are ready to calculate χ² (using the same formula as before).
O        Male      Female
Dog      36        34
Cat      7         32

E        Male      Female
Dog      27.614    42.385
Cat      15.385    23.614
46
46 Calculate χ². This is almost the same as for goodness-of-fit, but be careful in building your table (the O and the E columns).
O     E         O-E    (O-E)²    (O-E)²/E
36    27.614
34    42.385
7     15.385
32    23.614
47
47 Matching up the O and E columns. Be careful!! Each type of response has an O and an E, so match up the correct ones! Male/Cat has O = 7 and E = 15.385. Female/Dog has O = 34 and E = 42.385. If you get the wrong E for an O, all your results are wrong!! Do it slowly.
48
48 Working out the table. Step 1: O-E (go row by row, slowly).
O     E         O-E       (O-E)²    (O-E)²/E
36    27.614    8.385
34    42.385    -8.385
7     15.385    -8.385
32    23.614    8.385
49
49 Working out the table. Step 2: square the differences.
O     E         O-E       (O-E)²     (O-E)²/E
36    27.614    8.385     70.3136
34    42.385    -8.385    70.3136
7     15.385    -8.385    70.3136
32    23.614    8.385     70.3136
50
50 Working out the table. Step 3: divide the squares by E.
O     E         O-E       (O-E)²     (O-E)²/E
36    27.614    8.385     70.3136    2.546
34    42.385    -8.385    70.3136    1.658
7     15.385    -8.385    70.3136    4.57
32    23.614    8.385     70.3136    2.977
51
51 Working out the table. Step 4: sum the divisions to get chi-squared.
O     E         O-E       (O-E)²     (O-E)²/E
36    27.614    8.385     70.3136    2.546
34    42.385    -8.385    70.3136    1.658
7     15.385    -8.385    70.3136    4.57
32    23.614    8.385     70.3136    2.977
Total                                11.7528
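The four table steps can be checked with a short Python loop. This sketch pairs each O with its E explicitly so they cannot be mismatched; the cell labels follow the gender/pet example, and the E values are the ones shown in the slides, so the total agrees with the slide only to about two decimals.

```python
# (cell label, observed O, expected E) with each O next to its own E
cells = [
    ("Male/Dog", 36, 27.614),
    ("Female/Dog", 34, 42.385),
    ("Male/Cat", 7, 15.385),
    ("Female/Cat", 32, 23.614),
]

chi_square = 0.0
for label, o, e in cells:
    diff = o - e                  # Step 1: O - E
    squared = diff ** 2           # Step 2: square the difference
    contribution = squared / e    # Step 3: divide by E
    chi_square += contribution    # Step 4: running sum of the last column
    print(f"{label:<11} {o:>3} {e:>7.3f} {diff:>7.3f} {squared:>8.4f} {contribution:>6.3f}")

print(f"chi-square = {chi_square:.2f}")  # prints "chi-square = 11.75"
```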
52
52 Df for contingency tables. We need to check the statistical significance of our χ² value! Use df and alpha (exactly as in goodness-of-fit). df = (R-1)(C-1), i.e. (number of rows - 1) x (number of columns - 1). In our example, R = 2 (2 rows) and C = 2 (2 columns), so df = (2-1)(2-1) = (1)(1) = 1.
53
53 Testing significance. Look up the critical value in the chi-square table using alpha and df. If your calculated chi-square is more than the critical value, then there is a relationship between the variables. Remember that Ho is "no relationship": if the result is significant, then Ho is false and there is a relationship.
54
54 Example: conclusion. We calculated χ² to be 11.75, with df = 1 and alpha set to 0.05. Critical value = 3.84. Calculated > critical, so reject Ho. There is a relationship between gender and pet ownership!
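The final decision can be sketched in Python, together with an optional cross-check against the p-value method. The helper `chi_square_p_value_df1` is a hypothetical name; it relies on the fact that, for exactly 1 degree of freedom, the chi-square upper-tail probability equals `erfc(sqrt(x / 2))`, so it must not be used for other df.

```python
import math


def chi_square_p_value_df1(chi_square):
    """Upper-tail p-value for a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(chi_square / 2))


rows, columns = 2, 2               # pet (dog, cat) x gender (male, female)
df = (rows - 1) * (columns - 1)    # (R-1)(C-1)

alpha = 0.05
critical = 3.84                    # table value for df = 1, alpha = 0.05
chi_square = 11.75                 # calculated value from the example

reject_h0 = chi_square > critical
p_value = chi_square_p_value_df1(chi_square)

print(f"df = {df}, reject Ho: {reject_h0}")
print(f"p < alpha: {p_value < alpha}")
```

Both routes agree here: the calculated value is beyond the critical value, and the p-value is below alpha, which is the "same logic" the table method is built on.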