2014.3.25 Medical Statistics, Tao Yuchun, 9



Statistical Analysis of Enumeration Data
2. Statistical Inference for Enumeration Data

Sampling error of frequency
Example: Suppose the death rate is 0.2 when rats are fed a certain poison. What will happen when we do the experiment on n = 1, 2, 3 or 4 rat(s)?
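The question can be explored numerically. The following is a minimal sketch, assuming each rat dies independently with probability 0.2, so that the number of deaths follows a binomial distribution. The function name `death_distribution` is illustrative and does not come from the slides.

```python
from math import comb

def death_distribution(n, pi=0.2):
    """Return P(X = k) for k = 0..n deaths among n rats (binomial model)."""
    return [comb(n, k) * pi**k * (1 - pi)**(n - k) for k in range(n + 1)]

for n in (1, 2, 3, 4):
    probs = death_distribution(n)
    # The sample death rate can only take the values k/n
    for k, prob in enumerate(probs):
        print(f"n={n}  deaths={k}  sample rate={k / n:.2f}  probability={prob:.4f}")
```

With so few animals the observed rate can only be one of the values k/n, so it usually differs from 0.2. That difference is the sampling error of a frequency.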

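In general
Suppose the population proportion is $\pi$ and the sample size is $n$. The sample frequency $p$ is a random variable, with standard error $\sigma_p = \sqrt{\pi(1-\pi)/n}$. When $\pi$ is unknown and $n$ is big enough, $\sigma_p$ is approximately equal to $S_p = \sqrt{p(1-p)/n}$.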

Example 9-1: HBV surface antigen. 200 people were tested; 7 were positive.
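As a rough illustration, the sample proportion and its standard error for Example 9-1 can be computed as follows. This sketch assumes the usual estimate $S_p = \sqrt{p(1-p)/n}$; the variable names are illustrative.

```python
from math import sqrt

n, positives = 200, 7          # data from Example 9-1
p = positives / n              # sample proportion
s_p = sqrt(p * (1 - p) / n)    # estimated standard error of p

print(f"p = {p:.4f}, S_p = {s_p:.4f}")
```

The standard error here is about 0.013, roughly a third of the observed proportion of 0.035.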

In theory
If the sample size $n$ is big enough and the observed frequency is $p$, then we have approximately $\frac{p - \pi}{S_p} \sim N(0, 1)$.

Confidence interval of probability
If the sample size $n$ is big enough and the observed frequency is $p$, then:
95% confidence interval: $p \pm 1.96\, S_p$
99% confidence interval: $p \pm 2.58\, S_p$

Example 9-2: HBV surface antigen. 200 people were tested; 7 were positive. Calculate the confidence interval for $\pi$.
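A possible way to carry out the calculation for Example 9-2 is sketched below. It assumes the normal-approximation interval $p \pm z\,S_p$ with $z = 1.96$ (95%) or $z = 2.58$ (99%); the helper `proportion_ci` is a hypothetical name introduced only for this illustration.

```python
from math import sqrt

def proportion_ci(x, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    p = x / n
    s_p = sqrt(p * (1 - p) / n)   # estimated standard error
    return p - z * s_p, p + z * s_p

low95, high95 = proportion_ci(7, 200, z=1.96)   # 95% interval
low99, high99 = proportion_ci(7, 200, z=2.58)   # 99% interval
print(f"95% CI: ({low95:.4f}, {high95:.4f})")
print(f"99% CI: ({low99:.4f}, {high99:.4f})")
```

The 95% interval runs from roughly 1% to 6%. The 99% interval is wider, as expected for a higher confidence level.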

μ Distinguish between μ and  for sampling error and confidence interval

The hypothesis testing of proportion (Z test)
(1) Comparison of a sample proportion and a population proportion (one-sample Z test)
Example 9-3: Cerebral infarction

Method       Cases   Cure rate
New method   98      50%
Routine              30%

50% is the sample proportion, $p = 50\%$. 30% is the population proportion, $\pi_0 = 30\%$.

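Hypotheses and α: $H_0: \pi = \pi_0$, $H_1: \pi \ne \pi_0$, $\alpha = 0.05$
Statistic Z: $Z = \frac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0)/n}}$
Decision rule: If $|Z| \ge Z_\alpha$, then reject $H_0$; otherwise, there is no reason to reject $H_0$ (accept $H_0$).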
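For Example 9-3, the statistic can be sketched as below. The code assumes the standard one-sample form $Z = (p - \pi_0)/\sqrt{\pi_0(1-\pi_0)/n}$ and the two-sided critical value 1.96 for $\alpha = 0.05$; `one_sample_z` is an illustrative name.

```python
from math import sqrt

def one_sample_z(p, pi0, n):
    """Z statistic for comparing a sample proportion p with a given pi0."""
    return (p - pi0) / sqrt(pi0 * (1 - pi0) / n)

z = one_sample_z(p=0.50, pi0=0.30, n=98)   # Example 9-3
reject = abs(z) >= 1.96                    # two-sided critical value, alpha = 0.05
print(f"Z = {z:.2f}, reject H0: {reject}")
```

Note that the standard error uses $\pi_0$ rather than $p$, because the test is carried out under the assumption that $H_0$ is true.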

$Z_\alpha$ is: two sides, $Z_{0.05} = 1.96$; one side, $Z_{0.05} = 1.645$.
Since $|Z| = 4.32 > Z_{0.05} = 1.96$, reject $H_0$. The new method is better than the routine one.

(2) Comparison of two sample proportions (two-sample Z test)
Example 9-4: Carrier rate of hepatitis.
In B City: 522 people were tested, 24 carriers, $p_1 = 4.60\%$ (population carrier rate: $\pi_1$);
in Countryside: 478 people were tested, 33 carriers, $p_2 = 6.90\%$ (population carrier rate: $\pi_2$).

$H_0: \pi_1 = \pi_2$, $H_1: \pi_1 \ne \pi_2$, $\alpha = 0.05$

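Statistic Z: $Z = \frac{p_1 - p_2}{S_{p_1 - p_2}}$, with $S_{p_1 - p_2} = \sqrt{p_c (1 - p_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$ and $p_c = \frac{X_1 + X_2}{n_1 + n_2}$,
where $p_c$ is the pooled estimate of the two sample proportions, $X_1$ and $X_2$ are the numbers of carriers, and $S_{p_1 - p_2}$ is the standard error of $p_1 - p_2$.
Decision rule: If $|Z| \ge Z_\alpha$, then reject $H_0$; otherwise, there is no reason to reject $H_0$ (accept $H_0$).
Since $|Z| = 1.565 < Z_{0.05} = 1.96$, do not reject $H_0$. There is no evidence that the population carrier rates of B City and Countryside differ ($\pi_1 = \pi_2$ is not rejected).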
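The two-sample statistic for Example 9-4 can be sketched in the same way. The code assumes the pooled form of the standard error, $\sqrt{p_c(1-p_c)(1/n_1 + 1/n_2)}$; the function name is illustrative. Using the unrounded counts gives $|Z|$ of about 1.57, close to the 1.565 reported on the slide, which presumably used rounded proportions.

```python
from math import sqrt

def two_sample_z(x1, n1, x2, n2):
    """Z statistic for comparing two sample proportions with a pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)                      # pooled proportion
    se = sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))    # standard error of p1 - p2
    return (p1 - p2) / se

z = two_sample_z(24, 522, 33, 478)   # B City vs. Countryside, Example 9-4
reject = abs(z) >= 1.96              # two-sided critical value, alpha = 0.05
print(f"Z = {z:.3f}, reject H0: {reject}")
```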

Summary
The parameter estimation and hypothesis testing of a proportion are based on the normal approximation (when the sample size is big enough). How big is enough? By experience, $n\pi > 5$ and $n(1-\pi) > 5$. For a sample: $np > 5$ and $n(1-p) > 5$.
If the sample size is not big enough, the Z test cannot be used, and there is no t-test for proportions (see a more detailed textbook).

9.4 Chi-square test
The Z test can only be used for comparing $\pi$ with a given $\pi_0$ (one sample) or comparing $\pi_1$ with $\pi_2$ (two samples). If we need to compare more than two samples, the chi-square test is widely used.

(1) Basic idea of the $\chi^2$ test
Given a set of actual frequencies $A_1, A_2, A_3, \ldots$, we test whether the data follow a certain theory. If the theory is true, then we will have a set of theoretical frequencies $T_1, T_2, T_3, \ldots$ We compare $A_1, A_2, A_3, \ldots$ with $T_1, T_2, T_3, \ldots$ If they are quite different, then the theory might not be true; otherwise, the theory is acceptable.

(2) Chi-square test for 2×2 table
Example 9-5: Acute lower respiratory infection (theoretical frequencies in parentheses)

Treatment   Effect         Non-effect    Total      Effect rate
Drug A      68 (64.82) a   6 (9.18) b    74 (a+b)   91.89%
Drug B      52 (55.18) c   11 (7.82) d   63 (c+d)   82.54%
Total       120 (a+c)      17 (b+d)      137 (n)    87.59%

$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \ne \pi_2$
$\alpha = 0.05$
Here $\pi_1$ is the population effect rate for drug A, and $\pi_2$ is the population effect rate for drug B.

To calculate the theoretical frequencies: if $H_0$ is true, $\pi_1 = \pi_2 \approx 120/137$.
$T_{11} = 74 \times 120/137 = 64.82$, $T_{21} = 63 \times 120/137 = 55.18$
$T_{12} = 74 \times 17/137 = 9.18$, $T_{22} = 63 \times 17/137 = 7.82$
To compare A and T by the statistic $\chi^2$: $\chi^2 = \sum \frac{(A - T)^2}{T}$
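The theoretical frequencies and the statistic for Example 9-5 can be reproduced with a short sketch. It assumes the usual Pearson form $\chi^2 = \sum (A-T)^2/T$ and computes each $T$ as (row total × column total) / n; the function names are illustrative. With unrounded theoretical frequencies the result is about 2.738, while the slides report 2.734, which is what the rounded frequencies give.

```python
observed = [[68, 6], [52, 11]]   # Example 9-5: rows = Drug A, Drug B

def theoretical_frequencies(table):
    """T for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Sum of (A - T)^2 / T over all cells."""
    expected = theoretical_frequencies(table)
    return sum((a - t) ** 2 / t
               for row_a, row_t in zip(table, expected)
               for a, t in zip(row_a, row_t))

T = theoretical_frequencies(observed)
chi2 = pearson_chi_square(observed)
print([[round(t, 2) for t in row] for row in T])
print(round(chi2, 3))
```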

The chi-square test was invented by Karl Pearson, and is also called Pearson's chi-square test.
If $H_0$ is true, $\chi^2$ follows a chi-square distribution with $\nu = (\text{row} - 1)(\text{column} - 1)$ degrees of freedom.
If the $\chi^2$ value is big enough, we doubt $H_0$ and then reject $H_0$!

For Example 9-5: $\nu = (\text{row} - 1)(\text{column} - 1) = (2-1)(2-1) = 1$, $\chi^2_{\alpha(\nu)} = \chi^2_{0.05(1)} = 3.84$.
Now $\chi^2 = 2.734 < 3.84$, so $P > 0.05$ and $H_0$ is not rejected. We have no reason to say that the effects of the two treatments are different.
Question: What is $\chi^2_{\alpha(\nu)}$? Why does $\chi^2 < \chi^2_{0.05(1)}$ imply $P > 0.05$?

The chi-square distribution is a distribution for a continuous variable. It has one parameter, $\nu$ (degrees of freedom), which determines the shape of the $\chi^2$ curve. The area under the $\chi^2$ curve gives the probability of $\chi^2$.
[Figure: the $\chi^2$ curves for different $\nu$ ($\nu = 3, 5, 10, 30$)]

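The table for the $\chi^2$ distribution
The $\chi^2$ critical value is denoted $\chi^2_{\alpha(\nu)}$, where $\alpha$ is the probability and $\nu$ is the degrees of freedom. The area under the $\chi^2$ curve means [for $\chi^2_{0.05(1)}$]: the area to the right of 3.84 is 0.05, that is, $P(\chi^2 \ge 3.84) = 0.05$.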
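The link between a critical value and the tail area can be checked numerically. The sketch below is one possible stdlib-only implementation of the upper-tail probability of a chi-square distribution for integer degrees of freedom. It starts from the closed forms for $\nu = 1$ and $\nu = 2$ and adds one term for every two extra degrees of freedom. The function name `chi_square_tail` is an assumption of this example, not something from the slides.

```python
from math import erfc, exp, gamma, sqrt

def chi_square_tail(x, df):
    """P(chi-square with df degrees of freedom >= x), for integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, exp(-y)            # closed form for df = 2
    else:
        a, q = 0.5, erfc(sqrt(y))      # closed form for df = 1
    # Each step adds two degrees of freedom:
    # Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += y ** a * exp(-y) / gamma(a + 1.0)
        a += 1.0
    return q

print(chi_square_tail(3.84, 1))    # tail area at the critical value for df = 1
print(chi_square_tail(2.734, 1))   # P for the statistic of Example 9-5
print(chi_square_tail(7.81, 3))    # tail area at the critical value for df = 3
```

Because the tail area shrinks as $\chi^2$ grows, a statistic smaller than the critical value always has a tail area (P value) larger than $\alpha$.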

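For a 2×2 table, there is a specific formula for the chi-square calculation:
$\chi^2 = \frac{(ad - bc)^2\, n}{(a+b)(c+d)(a+c)(b+d)}$
For Example 9-5: $\chi^2 = \frac{(68 \times 11 - 6 \times 52)^2 \times 137}{74 \times 63 \times 120 \times 17} \approx 2.74$, which agrees with the value from the theoretical frequencies up to rounding.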
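The specific 2×2 formula is easy to evaluate directly. This is a minimal sketch assuming the standard shortcut $\chi^2 = (ad-bc)^2 n / [(a+b)(c+d)(a+c)(b+d)]$, with the cell labels a, b, c, d as in the table of Example 9-5; the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for a 2x2 table with cells a, b (row 1) and c, d (row 2)."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))

chi2 = chi_square_2x2(68, 6, 52, 11)   # Example 9-5
print(round(chi2, 3), chi2 < 3.84)     # compare with the critical value for df = 1
```

The shortcut needs only the four observed counts, so it avoids the rounding that comes from writing out the theoretical frequencies first.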

The chi-square test requires a large sample, because Pearson's chi-square test statistic follows the chi-square distribution only approximately.
For a 2×2 table:
(1) If $n \ge 40$ and every $T_i \ge 5$, the $\chi^2$ test is applicable;
(2) If $n < 40$ or $T_i < 1$, the $\chi^2$ test is not applicable, and you should use Fisher's Exact Test;
(3) If $n \ge 40$ and only one $1 \le T_i < 5$, the $\chi^2$ test needs adjustment.

The correction formula of the $\chi^2$ test for a 2×2 table:
$\chi^2_c = \sum \frac{(|A - T| - 0.5)^2}{T} = \frac{(|ad - bc| - n/2)^2\, n}{(a+b)(c+d)(a+c)(b+d)}$

Example 9-6: Hematosepsis (theoretical frequencies in parentheses)

Treatment   Effective    No effect   Total   Effective rate (%)
Drug A      28 (26.09)   2 (3.91)    30      93.33
Drug B      12 (13.91)   4 (2.09)    16      75.00
Total       40           6           46      86.96

Here $n = 46 > 40$, but $T_{12} = 30 \times 6/46 = 3.91 < 5$ and $T_{22} = 16 \times 6/46 = 2.09 < 5$. You should use the correction formula of the $\chi^2$ test for a 2×2 table.
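The corrected statistic for Example 9-6 can be sketched as follows. The code assumes the continuity-corrected (Yates) form $\chi^2_c = (|ad-bc| - n/2)^2 n / [(a+b)(c+d)(a+c)(b+d)]$; the function name is illustrative, and the numerical result is computed here rather than taken from the slides.

```python
def chi_square_2x2_corrected(a, b, c, d):
    """Continuity-corrected chi-square for a 2x2 table."""
    n = a + b + c + d
    return ((abs(a * d - b * c) - n / 2) ** 2 * n
            / ((a + b) * (c + d) * (a + c) * (b + d)))

chi2_c = chi_square_2x2_corrected(28, 2, 12, 4)   # Example 9-6
print(round(chi2_c, 3), chi2_c < 3.84)            # compare with the critical value for df = 1
```

Under these assumptions the corrected value is about 1.69, below 3.84, so $H_0$ would not be rejected at $\alpha = 0.05$.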

(3) Chi-square test for R×C table
Example 9-7: Leukaemia
$H_0$: The distributions of blood types in the two populations are all the same.
$H_1$: The distributions are not all the same.

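The formula of the $\chi^2$ test statistic for an R×C table:
$\chi^2 = n\left(\sum \frac{A^2}{n_R\, n_C} - 1\right)$, where $n_R$ and $n_C$ are the row total and column total of the cell with actual frequency $A$.
For Example 9-7: $\nu = (R - 1)(C - 1) = (2-1)(4-1) = 3$. From the table, $\chi^2_{0.05(3)} = 7.81$; now $\chi^2 = 1.84 < 7.81$, so $P > 0.05$ and $H_0$ is not rejected. There is no evidence that the distributions of blood types differ between the two populations.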
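A general R×C calculation might look like the sketch below. It assumes the formula $\chi^2 = n\left(\sum A^2/(n_R n_C) - 1\right)$, which is algebraically the same as $\sum (A-T)^2/T$. The counts of Example 9-7 are not available in this transcript, so the sketch is run on the 2×2 table of Example 9-5 (a special case of an R×C table). The function name is illustrative.

```python
def chi_square_rxc(table):
    """Return (chi-square, degrees of freedom) for an R x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    s = sum(a * a / (r * c)
            for row, r in zip(table, row_totals)
            for a, c in zip(row, col_totals))
    chi2 = n * (s - 1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

chi2, df = chi_square_rxc([[68, 6], [52, 11]])   # Example 9-5 as a 2x2 special case
print(round(chi2, 3), df)
```

For a table with 2 rows and 4 columns, as in Example 9-7, the same function returns 3 degrees of freedom.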

Question: Why does $\chi^2 = 1.84 < \chi^2_{0.05(3)} = 7.81$ imply $P > 0.05$? The answer is in this figure!

(4) Caution for the chi-square test
(1) Both 2×2 tables and R×C tables are called contingency tables. A 2×2 table is a special case of an R×C table.
(2) When R > 2, "$H_0$ is rejected" only means that there is a difference among some groups. It does not necessarily mean that all the groups are different.
(3) The $\chi^2$ test requires a large sample. By experience:
- The theoretical frequencies should be greater than 5 in more than 4/5 of the cells;
- The theoretical frequency in any cell should be greater than 1.
Otherwise, we cannot use the chi-square test directly.

If the above requirements are violated, what should we do?
(1) Increase the sample size.
(2) Re-organize the categories: pool some categories, or cancel some categories.
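The two rules of thumb above can be written as a small check. This is only a sketch of the stated requirements (more than 4/5 of the cells with T > 5, and every cell with T > 1); the function name is hypothetical, and the inputs are the theoretical frequencies from Examples 9-5 and 9-6.

```python
def chi_square_directly_applicable(expected):
    """Check the large-sample rules of thumb on a table of theoretical frequencies."""
    cells = [t for row in expected for t in row]
    share_above_5 = sum(t > 5 for t in cells) / len(cells)
    return share_above_5 > 4 / 5 and min(cells) > 1

ok_9_5 = chi_square_directly_applicable([[64.82, 9.18], [55.18, 7.82]])   # Example 9-5
ok_9_6 = chi_square_directly_applicable([[26.09, 3.91], [13.91, 2.09]])   # Example 9-6
print(ok_9_5, ok_9_6)
```

Example 9-5 passes the check, while Example 9-6 does not, which matches the use of the correction formula there.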

C  You should know  You should know: Chi-square test Chi-square test is a very important method of Statistical inference for enumeration data !