Statistics Chi-Square



Presentation on theme: "Statistics Chi-Square"— Presentation transcript:

1 Statistics Chi-Square

2 Hypothesis Tests So far we’ve tested hypotheses about means:
μ = value: use a z-test
μ1 = μ2: use a t-test
μ1 = μ2 = μ3: use an F-test (ANOVA)
? ? ?: ? ? ?

3 Hypothesis Tests What about other types of data and other hypotheses?

4 Tests for Count Data Another type of data would be counts (frequencies) in categories:

Sibling Study   Brothers   Sisters   Total
Observed #      31         21        52

5 Tests for Count Data The hypothesis you would be testing would not be about mean values…

6 Tests for Count Data The research question might be: are brothers and sisters equally likely? The alternate hypothesis would be Ha: #bro ≠ #sis

7 Tests for Count Data The hypothesis can be based on science or history or some other “educated guess”

8 Tests for Count Data A null hypothesis you might have about this data would be: H0: #bro = #sis

9 Tests for Count Data … We’ll use a test called “Chi-Square” (χ²)

10 Tests for Count Data A Chi-Square distribution is shaped like an F distribution: both are built from squared values, so both are right-skewed and never negative

11 Tests for Count Data A Chi-Square needs the original data and some “hypothesized” data

Sibling Study    Brothers   Sisters   Total
Observed #       31         21        52
Hypothesized #   26         26        52

12 Tests for Count Data The “hypothesized” data are called “expected” values

Sibling Study   Brothers   Sisters   Total
Observed #      31         21        52
Expected #      26         26        52

13 Tests for Count Data The hypothesized values must add up to the original total count

14 Tests for Count Data They will come from the null (want-to-disprove) hypothesis H0: #bro = #sis
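
As a minimal sketch of where the expected row comes from, the snippet below splits the observed total evenly across the categories, which is what H0: #bro = #sis implies. The function name `expected_equal` is hypothetical, chosen only for this illustration.

```python
# Observed counts from the sibling study on the slides.
observed = {"Brothers": 31, "Sisters": 21}


def expected_equal(observed_counts):
    """Expected counts when H0 says every category is equally likely."""
    total = sum(observed_counts.values())
    share = total / len(observed_counts)  # same expected count per category
    return {category: share for category in observed_counts}


expected = expected_equal(observed)
print(expected)  # {'Brothers': 26.0, 'Sisters': 26.0}
```

The expected counts add up to the same total (52) as the observed counts, as slide 13 requires.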

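15 Tests for Count Data To calculate the Chi-Square statistic, we use the formula: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where O is the observed count and E is the expected count in each category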

16 Chi-Square PROJECT QUESTION $\frac{(O - E)^2}{E}$: Calculate O − E

Sibling Study   Brothers   Sisters   Total
Observed #      31         21        52
Expected #      26         26        52
O − E           5          −5

17 Chi-Square PROJECT QUESTION $\frac{(O - E)^2}{E}$: Square the O − E values

Sibling Study   Brothers   Sisters   Total
Observed #      31         21        52
Expected #      26         26        52
(O − E)²        25         25

18 Chi-Square PROJECT QUESTION $\frac{(O - E)^2}{E}$: Divide them by E

Sibling Study   Brothers   Sisters   Total
Observed #      31         21        52
Expected #      26         26        52
(O − E)²/E      25/26      25/26

19 Chi-Square PROJECT QUESTION $\frac{(O - E)^2}{E}$: Add them up: 25/26 + 25/26 = 1.923076923
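
The four steps above (subtract, square, divide by E, add up) can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the slides; the function name `chi_square_statistic` is an assumption, not something from the presentation.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [31, 21]  # brothers, sisters
expected = [26, 26]  # from H0: #bro = #sis

chi_sq = chi_square_statistic(observed, expected)
print(round(chi_sq, 9))  # 1.923076923
```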

20 Chi-Square PROJECT QUESTION OK, so our Chi-Square statistic is 1.923076923. We need a probability!

21 Tests for Count Data Open the spreadsheet

22 Tests for Count Data PROJECT QUESTION In the row called “P Chi-Sq”, move your cursor to the cell next to it

23 Tests for Count Data PROJECT QUESTION Go to: “Formulas” > “More Functions” > “Statistical” > “CHISQ.TEST”

24 Tests for Count Data PROJECT QUESTION Excel calls observed data “actual” (nobody else does…)

25 Tests for Count Data PROJECT QUESTION Don’t include the total column

26 Tests for Count Data PROJECT QUESTION There’s your probability value! Do you reject H0?

27 Tests for Count Data PROJECT QUESTION Do you reject H0? Nope, the values are not different enough

28 Tests for Count Data PROJECT QUESTION Do you reject H0? We “fail to reject H0”
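
For readers without the spreadsheet, here is a rough Python sketch of the same probability. It assumes 1 degree of freedom (two categories minus one), for which the upper-tail chi-square probability can be written with the standard library's `math.erfc`. The 0.05 cutoff is an assumed alpha level; the slides do not state one for this example.

```python
import math


def chi_square_p_value_df1(chi_sq):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2))


chi_sq = 25 / 26 + 25 / 26  # statistic from the sibling study
p_value = chi_square_p_value_df1(chi_sq)

ALPHA = 0.05  # assumed significance level, not given on the slides
print(f"p = {p_value:.4f}")
print("reject H0" if p_value < ALPHA else "fail to reject H0")
```

With the sibling counts the p-value is well above the assumed 0.05, which agrees with the “fail to reject H0” conclusion on slide 28.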

29 Tests for Count Data PROJECT QUESTION What could we do?

30 Tests for Count Data PROJECT QUESTION What could we do? Increase n

31 Tests for Count Data The t-test and ANOVA F-test were designed to be powerful (reject H0 a lot) even with small sample sizes

32 Tests for Count Data A Chi-Square test is not very powerful. It only rejects the hypothesis when the data are very VERY different

33 Tests for Count Data This means it is a very conservative test – nobody is going to think you cheated if you use a Chi-square!

34 Tests for Count Data It also means we don’t have to set a level of practical significance…

35 Tests for Count Data How different do the data have to be to be “statistically significant” (allow you to reject the hypothesized data)? Add 1 to the “Brothers” and subtract 1 from the “Sisters”

36 Tests for Count Data PROJECT QUESTION 9.6% - not yet! Try again!

37 Tests for Count Data PROJECT QUESTION 5.2% - so close!!!! Try again!

38 Tests for Count Data PROJECT QUESTION 2.7% - finally !!!!

39 Tests for Count Data The data have to be this different before they are significantly different:

Sibling Study   Brothers   Sisters   Total
Observed #      34         18        52
Expected #      26         26        52
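
The “add 1, subtract 1, try again” exercise can be replayed with a short loop. This is a sketch under two assumptions: a 0.05 alpha level (not stated on the slides) and the 1-degree-of-freedom shortcut via `math.erfc`. The helper name `p_value_two_categories` is hypothetical.

```python
import math


def p_value_two_categories(first, second):
    """P-value for H0 'both categories equally likely' (1 degree of freedom)."""
    expected = (first + second) / 2
    chi_sq = ((first - expected) ** 2 + (second - expected) ** 2) / expected
    return math.erfc(math.sqrt(chi_sq / 2))


ALPHA = 0.05  # assumed significance level
brothers, sisters = 31, 21
steps = []

# Move one count at a time from "Sisters" to "Brothers" until p drops below alpha.
while p_value_two_categories(brothers, sisters) >= ALPHA:
    brothers += 1
    sisters -= 1
    steps.append((brothers, sisters, p_value_two_categories(brothers, sisters)))

for b, s, p in steps:
    print(f"{b} brothers, {s} sisters: p = {p:.2%}")
```

The loop takes three steps and stops at 34 brothers and 18 sisters, the same table shown on slide 39.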

40 Tests for Count Data What could you do to make it more likely that you would find a significant difference?

41 Tests for Count Data What could you do to make it more likely that you would find a significant difference? Use a bigger sample size “n”
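
To see why a bigger n helps, the sketch below keeps the same 31:21 ratio but multiplies both counts by a scale factor. The scaled counts are hypothetical and made up for illustration only; they are not data from the presentation.

```python
import math


def p_value_two_categories(first, second):
    """P-value for H0 'both categories equally likely' (1 degree of freedom)."""
    expected = (first + second) / 2
    chi_sq = ((first - expected) ** 2 + (second - expected) ** 2) / expected
    return math.erfc(math.sqrt(chi_sq / 2))


# Hypothetical larger samples with the same brother:sister ratio as the study.
results = []
for scale in (1, 2, 5, 10):
    brothers, sisters = 31 * scale, 21 * scale
    p = p_value_two_categories(brothers, sisters)
    results.append((brothers + sisters, p))
    print(f"n = {brothers + sisters:4d}: p = {p:.6f}")
```

The Chi-Square statistic grows in proportion to n when the ratio stays fixed, so the p-value shrinks as the sample gets larger.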

42 Questions?

43 Tests for Count Data Most Chi-Squared tests don’t have specific hypothesized values

44 Tests for Count Data Most Chi-Squared tests don’t have specific hypothesized values The expected values come from the table of observed data

45 Tests for Count Data Ha: p1 ≠ p2 H0: p1 = p2

46 Tests for Count Data To compare two ways of removing plaque clogging arteries, Dr. Eric J. Topol and colleagues conducted a study

47 Tests for Count Data They randomly assigned 1,012 heart patients to have either directional coronary atherectomy or balloon angioplasty

48 Tests for Count Data Is there evidence of a significant difference in the two approaches in the proportion of deaths or heart attacks within 6 months of treatment?  

49 Chi-Square PROJECT QUESTION What would be Dr. Topol’s alpha-level?

50 Chi-Square PROJECT QUESTION What would be his alternate hypothesis?

51 Chi-Square PROJECT QUESTION What would be his alternate hypothesis? Ha: p death or heart attack for directional atherectomy ≠ p death or heart attack for balloon angioplasty

52 Chi-Square PROJECT QUESTION What would be his null hypothesis?

53 Chi-Square PROJECT QUESTION What would be his null hypothesis? H0: p death or heart attack for directional atherectomy = p death or heart attack for balloon angioplasty

54 Chi-Square PROJECT QUESTION Here are Dr. Topol’s results:

                          Died or suffered a heart attack   Did not die or suffer a heart attack
Directional Atherectomy   44                                468
Balloon Angioplasty       23                                477

55 Chi-Square PROJECT QUESTION Do you think there is a practically significant difference?

56 Chi-Square PROJECT QUESTION How would you calculate the expected values?

57 Tests for Count Data PROJECT QUESTION First calculate row and column totals and the grand total:

58 Tests for Count Data PROJECT QUESTION First calculate row and column totals and the grand total:

59 Tests for Count Data PROJECT QUESTION Calculate expected values using: Exp = (RowTot)(ColTot)/GrandTot

60 Tests for Count Data PROJECT QUESTION Exp(DirAth/Died) = (RowTot)(ColTot)/GrandTot = (512)(67)/1012 ≈ 33.9

61 Tests for Count Data PROJECT QUESTION Fill in the other cells:

62 Tests for Count Data PROJECT QUESTION The expected values are:
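
The totals and expected values referred to on these slides can be reproduced with a small sketch. It applies Exp = (RowTot)(ColTot)/GrandTot to every cell of Dr. Topol's table; the function name `expected_counts` is hypothetical.

```python
# Rows: directional atherectomy, balloon angioplasty.
# Columns: died or had a heart attack, did not.
observed = [
    [44, 468],
    [23, 477],
]


def expected_counts(table):
    """Expected count per cell: (row total) * (column total) / (grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


row_totals = [sum(row) for row in observed]        # [512, 500]
col_totals = [sum(col) for col in zip(*observed)]  # [67, 945]
expected = expected_counts(observed)

for row in expected:
    print([round(value, 2) for value in row])
```

The first cell comes out as (512)(67)/1012, matching slide 60, and the expected counts keep the same row and column totals as the observed table.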

63 Tests for Count Data PROJECT QUESTION Now we need a Chi-Square:

64 Tests for Count Data PROJECT QUESTION Now we need a Chi-Square:

65 Tests for Count Data PROJECT QUESTION Now we need a Chi-Square:
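
As a hedged sketch of this step, the code below computes the Chi-Square statistic for Dr. Topol's 2×2 table and its p-value. It assumes (rows − 1)(columns − 1) = 1 degree of freedom and applies no continuity correction; the function names are hypothetical. Whether H0 is rejected depends on the alpha-level chosen in slide 49, so the sketch only prints the p-value.

```python
import math

observed = [
    [44, 468],  # directional atherectomy
    [23, 477],  # balloon angioplasty
]


def chi_square_2x2(table):
    """Pearson chi-square statistic, with expected counts from the table's totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (o - e) ** 2 / e
    return chi_sq


chi_sq = chi_square_2x2(observed)
p_value = math.erfc(math.sqrt(chi_sq / 2))  # valid for 1 degree of freedom only
print(f"chi-square = {chi_sq:.3f}, p = {p_value:.4f}")
```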

66 Tests for Count Data PROJECT QUESTION Can Dr. Topol reject the null hypothesis?

67 Chi-Square PROJECT QUESTION What could Dr. Topol do to make it more likely that he would find a significant difference?

68 Questions?

69 Tests for Count Data How would we tell which are different?

70 Tests for Count Data The hi-lo-close graph!

71 Tests for Count Data Or a 3D column chart

72 Tests for Count Data Note: Excel won’t handle an expected value of “0” – you must leave these out of your analysis
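
The same issue shows up outside Excel: dividing by an expected value of 0 is undefined. A minimal sketch of “leaving these out” is shown below. The counts are hypothetical, made up only to illustrate the filtering, and the helper name `drop_zero_expected` is an assumption.

```python
def drop_zero_expected(observed, expected):
    """Keep only the categories whose expected count is not zero."""
    kept = [(o, e) for o, e in zip(observed, expected) if e != 0]
    return [o for o, _ in kept], [e for _, e in kept]


# Hypothetical counts for illustration only; the last category expects 0.
observed = [12, 8, 0]
expected = [10, 10, 0]

obs_kept, exp_kept = drop_zero_expected(observed, expected)
chi_sq = sum((o - e) ** 2 / e for o, e in zip(obs_kept, exp_kept))
print(obs_kept, exp_kept, round(chi_sq, 3))  # [12, 8] [10, 10] 0.8
```

Dropping a category also reduces the number of categories used for the degrees of freedom.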

73 How Excel Should Be (but isn’t…) Antarctic Cyclones Enter the observed data in blue cells in the table:

         40-49S   50-59S   60-79S
Fall     370      526      980
Winter   452      624      1200
Spring   273      513      995
Summer   422      1059     1751

74 Tests for Count Data PROJECT QUESTION Are there significant differences in cyclone count for different latitude and season categories?
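
One possible way to check this question without the spreadsheet is sketched below. It builds expected counts from the row and column totals of the cyclone table, sums (O − E)²/E, and uses (rows − 1)(columns − 1) = 6 degrees of freedom. The p-value helper relies on a closed form that only holds for an even number of degrees of freedom; both function names are hypothetical.

```python
import math

# Rows: Fall, Winter, Spring, Summer. Columns: 40-49S, 50-59S, 60-79S.
observed = [
    [370, 526, 980],
    [452, 624, 1200],
    [273, 513, 995],
    [422, 1059, 1751],
]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df


def chi_square_sf_even_df(chi_sq, df):
    """Upper-tail probability; closed form valid only when df is even."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut needs a positive, even df")
    half = chi_sq / 2
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (chi_sq / 2)^k / k!
        total += term
    return math.exp(-half) * total


chi_sq, df = chi_square_independence(observed)
p_value = chi_square_sf_even_df(chi_sq, df)
print(f"chi-square = {chi_sq:.2f}, df = {df}, p = {p_value:.3g}")
```

For tables with an odd number of degrees of freedom the shortcut does not apply, and a statistics library or the spreadsheet's CHISQ.TEST would be the practical choice.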

75 Tests for Count Data PROJECT QUESTION

76 Questions?

