Bios 101 Lecture 6: Test of Significance Shankar Viswanathan, DrPH Division of Biostatistics, DEPH December 6, 2011.

In service Exam – Design related questions

Is a particular medicine more effective than another? Researchers are often interested in studies that compare groups, e.g. Treatment versus Control, or Treatment A versus Treatment B. An observed difference between groups has two sources: chance variation and effect variation.

Significance (α): How likely is it that an observed difference is due to chance when the true difference is zero? The error of rejecting the null hypothesis when it is true is known as a type I error or α error; its probability is usually referred to as the level of significance.

Power (1 − β): How likely we are to detect an effect for a given sample size, effect size and level of significance. Accepting the null hypothesis when in fact it is false is a type II error or β error.
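The relationship between power, sample size, effect size and significance level can be sketched with R's built-in power.t.test. The numbers below (12 subjects per group, a true difference of one standard deviation, α = 0.05) are hypothetical values chosen only for illustration; they do not come from the lecture.

```r
# Hypothetical inputs: 12 per group, true difference of 1 SD, alpha = 0.05
res <- power.t.test(n = 12, delta = 1, sd = 1, sig.level = 0.05,
                    type = "two.sample", alternative = "two.sided")
res$power   # probability of detecting the effect, i.e. 1 - beta

# Turn the question around: sample size per group needed for 80% power
need <- power.t.test(delta = 1, sd = 1, sig.level = 0.05, power = 0.80,
                     type = "two.sample", alternative = "two.sided")
ceiling(need$n)   # round up to whole subjects per group
```

Leaving out exactly one of n, delta, power or sig.level makes power.t.test solve for that quantity, which is why the same function serves both calls.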

Various Probabilities of Hypothesis Testing

Null hypothesis: The null hypothesis is the statement being tested; it represents what the experimenter doubts to be true.

Null hypothesis
The hypothesis of 'no difference' or 'no effect' in the population is called the null hypothesis (NH). E.g. we will develop a procedure to test whether a particular type of diet has no effect on the mean cardiac output of people living in a small town. We call this the hypothesis of no effect.

Statistical Significance
If the data are not consistent with the NH, the difference is said to be statistically significant.

Test of Significance
A significance test enables us to measure the strength of evidence which the data supply concerning some proposition of interest. We compare the magnitude of the difference in the sample means with the amount of variability that would be expected from looking within the samples.

Comparison of two independent means
The t-test is used for measured variables when comparing two means. The Student unpaired t-test compares two independent samples.

Comparison of paired means
The paired t-test compares two paired observations on the same individual or on matched individuals.

t-distribution
Similar to the normal distribution but with wider tails. The t-test assumes that the data are normally distributed and that the samples have equal variance.

Principles of significance test
1. Set up the null hypothesis and the alternative hypothesis.
2. Find the value of the test statistic.
3. Refer the test statistic to a known distribution that it would follow if the NH were true.
4. Find the P value: the probability of a test statistic as extreme as, or more extreme than, that observed, if the NH were true.
5. Conclude whether the data are consistent or inconsistent with the NH.

Comparison of 15-day mean comb weights of two lots of male chicks, one receiving sex hormone A (testosterone), the other C (dehydroandrosterone).

Test statistic for an experiment comparing two samples of equal size

    Har <- c(57, 120, 101, 137, 119, 117, 104, 73, 53, 68, 118, 106,
             89, 30, 82, 50, 39, 22, 57, 32, 96, 31, 88, 61)
    grp <- c(rep(1, 12), rep(2, 12))
    Hardata <- data.frame(Har, grp)   # data frame used by the formula interface
    t.test(Har ~ grp, data = Hardata)

or

    HA <- c(57, 120, 101, 137, 119, 117, 104, 73, 53, 68, 118, 106)
    HC <- c(89, 30, 82, 50, 39, 22, 57, 32, 96, 31, 88, 61)
    t.test(HA, HC)

    Welch Two Sample t-test
    data: HA and HC
    t = 3.7176, df = 21.95, p-value = 0.001201
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval: 18.27253 64.39414
    sample estimates: mean of x 97.75000, mean of y 56.41667

    wilcox.test(HA, HC)

    Wilcoxon rank sum test with continuity correction
    data: HA and HC
    W = 124.5, p-value = 0.002674
    alternative hypothesis: true location shift is not equal to 0
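To make the test statistic less of a black box, the sketch below recomputes the Welch t statistic, its degrees of freedom and the two-sided P value by hand for the comb-weight data and compares them with t.test. The variable names (se, t_stat, df_w, p_val) are illustrative, not part of the lecture.

```r
HA <- c(57, 120, 101, 137, 119, 117, 104, 73, 53, 68, 118, 106)
HC <- c(89, 30, 82, 50, 39, 22, 57, 32, 96, 31, 88, 61)

# Step 2: test statistic = difference in means / standard error of the difference
va <- var(HA) / length(HA)
vc <- var(HC) / length(HC)
se <- sqrt(va + vc)
t_stat <- (mean(HA) - mean(HC)) / se

# Step 3: reference distribution is t with Welch-Satterthwaite degrees of freedom
df_w <- (va + vc)^2 / (va^2 / (length(HA) - 1) + vc^2 / (length(HC) - 1))

# Step 4: two-sided P value, the chance of a value at least this extreme under NH
p_val <- 2 * pt(-abs(t_stat), df = df_w)

c(t = t_stat, df = df_w, p = p_val)
t.test(HA, HC)   # should agree with the hand computation
```

Because the two groups have the same size, the pooled-variance statistic takes the same value here; only the degrees of freedom differ between the two versions of the test.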

Gains in weights of two lots of female rats under two diets

Test statistic for an experiment comparing two samples of unequal size

    HP <- c(134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123)
    LP <- c(70, 118, 101, 85, 107, 132, 94)
    t.test(HP, LP)

    Welch Two Sample t-test
    data: HP and LP
    t = 1.9107, df = 13.082, p-value = 0.07821
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval: -2.469073 40.469073
    sample estimates: mean of x 120, mean of y 101

    wilcox.test(HP, LP)

    Wilcoxon rank sum test with continuity correction
    data: HP and LP
    W = 62.5, p-value = 0.09083
    alternative hypothesis: true location shift is not equal to 0

Test statistic for an experiment comparing two samples of unequal variance
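One way to look at the equal-variance question in R is sketched below, reusing the rat weight-gain data from the unequal-size example. The choice of var.test (an F test of the variance ratio) as the check is an assumption made for illustration; the lecture does not say which check, if any, was used.

```r
HP <- c(134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123)
LP <- c(70, 118, 101, 85, 107, 132, 94)

# F test for the ratio of the two sample variances
vt <- var.test(HP, LP)
vt$estimate   # ratio of variances; values near 1 suggest similar spread

# Classical Student t-test: pooled variance, df = n1 + n2 - 2
pooled <- t.test(HP, LP, var.equal = TRUE)

# Welch t-test (R's default): no equal-variance assumption, fractional df
welch <- t.test(HP, LP)

rbind(pooled = c(t = unname(pooled$statistic), df = unname(pooled$parameter)),
      welch  = c(t = unname(welch$statistic),  df = unname(welch$parameter)))
```

When the variances are similar the two tests give close answers; when they differ markedly, the Welch version is the safer of the two.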

Comparison of Paired Data (Correlated data)
Twelve pre-school children were given a supplement of multipurpose food for a period of four months. Their skin fold thickness (in mm) was measured before the program and at the end of the program. The question is whether there is any difference in skin fold thickness between the pre and post measurements.

Comparison of Paired Data (Correlated data)

Test statistic for an experiment comparing two related samples

    pre  <- c(6, 8, 8, 6, 5, 9, 6, 7, 6, 6, 4, 8)
    post <- c(8, 8, 10, 7, 6, 10, 9, 8, 5, 7, 4, 6)
    t.test(pre, post, paired = TRUE)

    Paired t-test
    data: pre and post
    t = -1.9149, df = 11, p-value = 0.08186
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval: -1.6120703 0.1120703
    sample estimates: mean of the differences -0.75

    wilcox.test(pre, post, paired = TRUE)

    Wilcoxon signed rank test with continuity correction
    data: pre and post
    V = 11.5, p-value = 0.1049
    alternative hypothesis: true location shift is not equal to 0
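The paired t-test is simply a one-sample t-test on the within-child differences. The sketch below shows this equivalence for the skin fold data; the names d, t_hand and one are illustrative.

```r
pre  <- c(6, 8, 8, 6, 5, 9, 6, 7, 6, 6, 4, 8)
post <- c(8, 8, 10, 7, 6, 10, 9, 8, 5, 7, 4, 6)

# One difference per child; pairing removes between-child variability
d <- pre - post

# Hand computation: mean difference over its standard error, df = n - 1
t_hand <- mean(d) / (sd(d) / sqrt(length(d)))

# One-sample t-test of the differences against zero
one <- t.test(d, mu = 0)

c(by_hand = t_hand, one_sample = unname(one$statistic))
```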

Two sided significance
Neither the null hypothesis nor the alternative hypothesis specifies a direction for the difference.

One sided significance
The alternative hypothesis specifies a direction, e.g. the active treatment is better than the placebo.
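In R the direction of the alternative is set with the alternative argument of t.test. The sketch below reuses the comb-weight data; treating "hormone A gives heavier combs than C" as the one-sided alternative is an assumption made purely for illustration.

```r
HA <- c(57, 120, 101, 137, 119, 117, 104, 73, 53, 68, 118, 106)
HC <- c(89, 30, 82, 50, 39, 22, 57, 32, 96, 31, 88, 61)

# Two-sided: the alternative is "means differ", in either direction
two <- t.test(HA, HC, alternative = "two.sided")

# One-sided: the alternative is "mean of HA is greater than mean of HC"
one <- t.test(HA, HC, alternative = "greater")

c(two_sided = two$p.value, one_sided = one$p.value)
```

When the observed difference lies in the hypothesised direction, the one-sided P value is half the two-sided one, so the direction should be fixed before the data are examined.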

Misuses of t-test
1. t-test applied to non-normal data
2. t-test applied to groups having unequal variances
3. Unpaired t-test for paired data
4. Multiple t-tests
5. t-test for repeated measures data

t-test to non-normal data
Table: In a study comparing GSH hormone levels in acutely ill patients and controls, the investigator applied the unpaired t-test to the following data.

Group      Number (n)   GSH units, mean ± SD   Range
Patients   15           4.9 ± 7.2              1.3 - 30.0
Controls   10           2.8 ± 1.7              1.3 - 6.6

Result: NS, t = 1.1
Heterogeneous data: in the patient group the SD (7.2) exceeds the mean (4.9).

Appropriate statistical procedures:
1. Nonparametric tests, reported with the median and range: t-test -> Mann-Whitney U-test (Wilcoxon rank-sum test); paired t-test -> Wilcoxon signed-rank test.
2. Make the data approximately normal by a suitable transformation (logarithmic, square root, inverse, etc.) and then apply the t-test.
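Both remedies can be sketched in R. The raw GSH values are not given in the slides, so the data below are simulated from a log-normal distribution with hypothetical parameters; only the group sizes (15 and 10) follow the table.

```r
set.seed(1)   # simulated, right-skewed data; NOT the study's measurements
patients <- rlnorm(15, meanlog = 1.2, sdlog = 0.9)
controls <- rlnorm(10, meanlog = 0.9, sdlog = 0.5)

# Remedy 1: rank-based test, reported with median and range
mw <- wilcox.test(patients, controls)
rbind(patients = c(median = median(patients), range(patients)),
      controls = c(median = median(controls), range(controls)))

# Remedy 2: log-transform to reduce the skew, then apply the t-test
tl <- t.test(log(patients), log(controls))

c(mann_whitney_p = mw$p.value, log_t_test_p = tl$p.value)
```

After a log transformation the t-test compares means on the log scale, which corresponds to comparing geometric means on the original scale.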

t-test to groups having unequal variances
Table: In a comparison of hypothyroid and normal patients, the investigator compared heart rate (part of the study) with the t-test for the following data.

Group         Number (n)   Heart rate, mean ± SD
Hypothyroid   16           61.80 ± 2.48
Normal        20           66.55 ± 9.69

t-test: t = 2.07, p < 0.05
Correct method: modified t-test. Modified t-test = 2.11; since 2.07 < 2.11, the difference was NS.

Unpaired t-test for paired data
The following table shows a study in which 11 women recorded their dietary intake for 60 consecutive days.
Table: Mean daily intake over 11 pre-menstrual and 11 post-menstrual days.

          Dietary intake (KJ)
Subject   Pre-menstrual   Post-menstrual   Difference
1         5260            3910             1350
2         5470            4220             1250
3         5640            3885             1755
4         6180            5160             1020
5         6390            5645             745
6         6515            4680             1835
7         6805            5265             1540
8         7515            5975             1540
9         7515            6790             725
10        8230            6900             1330
11        8770            7335             1435
Mean      6753.6          5433.2           1320.5
(SD)      1142.1          1216.8           366.7

For the above data set:
t unpaired = 2.6 (p < 0.05)
t paired = 11.94 (p < 0.000001)
Message: The unpaired t-test is not correct for related data, because it is valid only when the two groups are independent.
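The two statistics quoted for the dietary intake data can be reproduced in R from the tabulated values. The vector names pre_m and post_m are illustrative.

```r
# Mean daily intake (KJ) for the 11 women, taken from the table
pre_m  <- c(5260, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770)
post_m <- c(3910, 4220, 3885, 5160, 5645, 4680, 5265, 5975, 6790, 6900, 7335)

# Wrong for these data: treats the two columns as independent groups
unpaired <- t.test(pre_m, post_m, var.equal = TRUE)

# Appropriate: each woman acts as her own control
paired <- t.test(pre_m, post_m, paired = TRUE)

c(unpaired_t = unname(unpaired$statistic), paired_t = unname(paired$statistic))
```

The mean difference is the same in both analyses; the paired statistic is far larger because the standard error is based on the spread of the differences (SD 366.7) rather than on the much larger between-woman spread (SD above 1100).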

Multiple t-test
Table: Comparison of blood glucose levels (mean ± SD) in 4 different groups

Group   A              B               C              D
n=9     84.67 ± 5.29   105.78 ± 9.77   93.11 ± 3.62   88.44 ± 8.05

Comparison   Calculated   Significance   Modified LSD with
between      t value      by t-test      multiple correction
A-B          5.71         P < 0.001      P < 0.001
B-C          3.65         P < 0.01       P < 0.01
C-D          1.59         NS             NS
A-C          3.94         P < 0.01       NS
A-D          1.17         NS             NS
B-D          4.11         P < 0.001      P < 0.001

The effective p-value for 6 comparisons is 6 × 0.05 = 0.3.
Appropriate approach: ANOVA, modified LSD or Bonferroni correction, multivariate method.
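The inflation of the error rate and a Bonferroni correction can be sketched in R as follows. The P values are recomputed from the tabulated t values under the assumption of 9 subjects per group and a pooled two-sample test (16 degrees of freedom). This is a Bonferroni adjustment, not the modified LSD procedure of the table, so it need not reproduce that column.

```r
alpha <- 0.05
k <- 6   # number of pairwise comparisons among 4 groups

k * alpha              # simple upper bound on the familywise error rate
fwer <- 1 - (1 - alpha)^k
fwer                   # exact rate if the 6 tests were independent

# t values from the table; df = 9 + 9 - 2 is an assumption
t_vals <- c("A-B" = 5.71, "B-C" = 3.65, "C-D" = 1.59,
            "A-C" = 3.94, "A-D" = 1.17, "B-D" = 4.11)
p_raw  <- 2 * pt(-abs(t_vals), df = 16)

# Bonferroni: multiply each P value by the number of comparisons (capped at 1)
p_bonf <- p.adjust(p_raw, method = "bonferroni")

round(cbind(t = t_vals, p_raw, p_bonf), 4)
```

Comparing each adjusted P value with 0.05 is equivalent to comparing each raw P value with 0.05 / 6.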

t-tests to repeated measurement data

Additional misuses:
1. t-test applied to more than two groups (without correction)
2. Application of several t-tests to many variables in a single study instead of a multivariate test
3. Errors in the computation of the t-test
4. A number of t-tests applied to repeated measurement studies
5. Errors in the interpretation of results
6. One-tailed t-test used to get a significant result
7. Errors in the design of the experiment

How large is a large sample?
Inferences about the mean are reasonably safe if the sample is > 100 for a single sample, or if both samples are > 50 for two samples.
