Lab 5 Hypothesis testing and Confidence Interval
Outline One sample t-test Two sample t-test Paired t-test
Lab 5 One-sample t-test
One sample t-test The hypotheses: One sided: H0: μ ≤ μ0 against H1: μ > μ0 (or H0: μ ≥ μ0 against H1: μ < μ0) Two sided: H0: μ = μ0 against H1: μ ≠ μ0
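One sample t-test Test statistic: t = (x̄ − μ0) / (s / √n), where x̄ is the sample mean, s the sample standard deviation and n the sample size. Under H0 it follows a t distribution with n − 1 degrees of freedom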
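As a rough illustration of how the one-sample t statistic is formed, the sketch below computes the standardized distance between the sample mean and the hypothesized mean by hand and compares it with what t.test() reports. The vector x_demo and the value mu0 are made-up stand-ins, not the biotest.txt data.

```r
# Hypothetical sample of size 10 (NOT the biotest.txt data)
x_demo <- c(112, 118, 121, 109, 115, 117, 120, 113, 116, 119)
mu0 <- 115                      # hypothesized mean, chosen for illustration

n <- length(x_demo)
# t = (sample mean - mu0) / (sample sd / sqrt(n))
t_manual <- (mean(x_demo) - mu0) / (sd(x_demo) / sqrt(n))
df_manual <- n - 1              # degrees of freedom

# Same statistic as computed by t.test()
t_builtin <- unname(t.test(x_demo, mu = mu0)$statistic)

print(c(manual = t_manual, builtin = t_builtin, df = df_manual))
```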
One sample t-test Conclusion Compare the test statistic with the critical value … or compare the p-value with the level of significance α (e.g. 0.05, 0.1). Reject H0 if p-value < α (enough evidence). Cannot reject H0 if p-value ≥ α (not enough evidence)
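The two ways of reaching a conclusion can be sketched side by side for a two-sided test. This is only an illustration on made-up data (x_demo, mu0 and alpha are assumptions); qt() gives the critical value and pt() the tail probability of the t distribution.

```r
# Hypothetical sample (NOT the biotest.txt data)
x_demo <- c(112, 118, 121, 109, 115, 117, 120, 113, 116, 119)
mu0 <- 115
alpha <- 0.05

n <- length(x_demo)
t_stat <- (mean(x_demo) - mu0) / (sd(x_demo) / sqrt(n))

# Route 1: compare |t| with the two-sided critical value
t_crit <- qt(1 - alpha / 2, df = n - 1)
reject_by_critical <- abs(t_stat) > t_crit

# Route 2: compare the two-sided p-value with alpha
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
reject_by_pvalue <- p_value < alpha

# Both routes lead to the same decision
print(c(critical = reject_by_critical, pvalue = reject_by_pvalue))
```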
Example Download the biotest.txt data file. Read it into R using the function read.table(). Extract the 1st column and store it as X1. Store the 2nd column as X2
Example
> X1 = read.table("biotest.txt")[, 1]
> X2 = read.table("biotest.txt")[, 2]
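Reading the file once and then extracting both columns avoids parsing it twice. A minimal sketch, assuming a file with two whitespace-separated numeric columns and no header row; the contents below are made up and written to a temporary file so that the example runs anywhere (demo_path, X1_demo and X2_demo are hypothetical names):

```r
# Write a tiny made-up two-column file (NOT the real biotest.txt)
demo_path <- tempfile(fileext = ".txt")
writeLines(c("112 120", "118 117", "121 125"), demo_path)

# Read once; with no header row the columns are named V1 and V2
demo <- read.table(demo_path)

X1_demo <- demo[, 1]   # first column as a numeric vector
X2_demo <- demo[, 2]   # second column as a numeric vector

str(demo)
```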
Example Take X1 as the sample in this case. Test H0: μ = 115 against H1: μ ≠ 115 at significance level α = 0.05
[R] command t.test() Syntax: t.test(x = data, alternative = "less" / "greater" / "two.sided", mu = μ0), where data is a numeric vector and μ0 is the hypothesized mean
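t.test() returns a list of class "htest", so the individual pieces of the printed output can be picked out by name. A small sketch on made-up data (x_demo is hypothetical, not the lab data):

```r
# Hypothetical sample (NOT the biotest.txt data)
x_demo <- c(112, 118, 121, 109, 115, 117, 120, 113, 116, 119)

res <- t.test(x_demo, alternative = "two.sided", mu = 115)

res$statistic   # t statistic
res$parameter   # degrees of freedom
res$p.value     # p-value
res$conf.int    # confidence interval (95% by default)
res$estimate    # sample mean
```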
Example 1
> t.test(X1, alternative = "two.sided", mu = 115)
One Sample t-test
data: X1
t = , df = 9, p-value =
alternative hypothesis: true mean is not equal to 115
95 percent confidence interval:
sample estimates: mean of x 115.6
Example 1 The p-value is larger than 0.05: cannot reject H0 at the 0.05 level of significance
Example 1 μ0 = 115 lies inside the 95% confidence interval, which agrees with not rejecting H0 at the 0.05 level of significance
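The link between the two-sided test and the confidence interval can be checked directly: μ0 falls inside the 95% interval exactly when the two-sided test does not reject at α = 0.05. A sketch on made-up data (x_demo and mu0 are assumptions for illustration):

```r
# Hypothetical sample (NOT the biotest.txt data)
x_demo <- c(112, 118, 121, 109, 115, 117, 120, 113, 116, 119)
mu0 <- 115

res <- t.test(x_demo, alternative = "two.sided", mu = mu0)

# Is mu0 inside the 95% confidence interval?
inside_ci <- res$conf.int[1] <= mu0 && mu0 <= res$conf.int[2]

# Is H0 NOT rejected at the 0.05 level?
not_rejected <- res$p.value > 0.05

print(c(inside_ci = inside_ci, not_rejected = not_rejected))
```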
Example 2 Test H0: μ ≤ 108 against H1: μ > 108 at significance level α = 0.05
Example 2
> t.test(X1, alternative = "greater", mu = 108)
One Sample t-test
data: X1
t = , df = 9, p-value =
alternative hypothesis: true mean is greater than 108
95 percent confidence interval: Inf
sample estimates: mean of x 115.6
Example 2 The p-value is smaller than 0.05: reject H0 at the 0.05 level of significance
Example 2 Conclude that the population mean is significantly greater than 108
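For a one-sided alternative of the form H1: μ > μ0, the p-value is the upper-tail probability of the t statistic. A hedged sketch on made-up data (x_demo is hypothetical; only mu0 = 108 mirrors the example):

```r
# Hypothetical sample (NOT the biotest.txt data)
x_demo <- c(112, 118, 121, 109, 115, 117, 120, 113, 116, 119)
mu0 <- 108

n <- length(x_demo)
t_stat <- (mean(x_demo) - mu0) / (sd(x_demo) / sqrt(n))

# Upper-tail probability P(T > t) under H0
p_greater <- pt(t_stat, df = n - 1, lower.tail = FALSE)

res <- t.test(x_demo, alternative = "greater", mu = mu0)

# The one-sided interval has the form (lower bound, Inf)
print(res$conf.int)
print(c(manual = p_greater, builtin = res$p.value))
```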
Example 2 Statistical significance vs. practical significance: a small p-value shows that the mean is detectably greater than 108, not that the difference is large enough to matter in practice
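One way to see the gap between statistical and practical significance is a simulation in which the true mean exceeds 108 by a tiny amount but the sample is very large. Everything below (sample size, true mean, standard deviation, seed) is an assumption chosen purely for illustration:

```r
set.seed(1)

# Simulated data: true mean only 0.1 above 108, but n is very large
big_sample <- rnorm(100000, mean = 108.1, sd = 5)

res <- t.test(big_sample, alternative = "greater", mu = 108)

# Estimated size of the difference from 108
effect <- mean(big_sample) - 108

# The test rejects H0 although the estimated difference is tiny
print(c(p_value = res$p.value, effect = effect))
```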
Confidence Interval By default, the function t.test() includes a 95% confidence interval Question: Can we change the confidence level?
Confidence Interval e.g. want a 99% confidence interval
> t.test(X1, alternative = "greater", mu = 108, conf.level = 0.99)
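Changing conf.level affects only the interval, not the test statistic or p-value. A short sketch on made-up data (x_demo is hypothetical) comparing the 95% and 99% two-sided intervals:

```r
# Hypothetical sample (NOT the biotest.txt data)
x_demo <- c(112, 118, 121, 109, 115, 117, 120, 113, 116, 119)

res95 <- t.test(x_demo, mu = 115)                      # default 95%
res99 <- t.test(x_demo, mu = 115, conf.level = 0.99)   # 99%

# The 99% interval is wider than the 95% interval
width95 <- diff(res95$conf.int)
width99 <- diff(res99$conf.int)

# The confidence level is stored as an attribute of the interval
level99 <- attr(res99$conf.int, "conf.level")

print(c(width95 = width95, width99 = width99, level = level99))
```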
Lab 5 Two-sample t-test
Testing the population mean of two independent samples
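Two-sample t-test Two-sided: H0: μ1 = μ2 against H1: μ1 ≠ μ2 One-sided: H0: μ1 ≤ μ2 against H1: μ1 > μ2 (or H0: μ1 ≥ μ2 against H1: μ1 < μ2)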
Example 3 Consider the two samples X1 and X2. We want to test whether there is a significant difference between the mean of X1 and the mean of X2.
Example 3 Two-sided test: H0: μ1 = μ2 against H1: μ1 ≠ μ2 at the 0.05 level of significance, assuming equal variances
Example 3
> t.test(X1, X2, alternative = "two.sided", var.equal = TRUE)
Two Sample t-test
data: X1 and X2
t = , df = 18, p-value =
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
sample estimates: mean of x mean of y
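Example 3 Not assuming equal variances?
> t.test(X1, X2, alternative = "two.sided", var.equal = FALSE)
var.equal = FALSE is the default in t.test() and gives Welch's t-test, whose degrees of freedom are generally not a whole number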
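The difference between the two versions can be sketched on made-up data: with equal variances assumed, the statistic uses a pooled variance and n1 + n2 − 2 degrees of freedom, while the Welch version adjusts the degrees of freedom downwards. The vectors x_demo and y_demo are hypothetical, not the lab data.

```r
# Hypothetical independent samples (NOT the biotest.txt data)
x_demo <- c(112, 118, 121, 109, 115, 117, 120, 113, 116, 119)
y_demo <- c(119, 125, 111, 130, 122, 108, 127, 133, 116, 121)

pooled <- t.test(x_demo, y_demo, var.equal = TRUE)
welch  <- t.test(x_demo, y_demo, var.equal = FALSE)

# Pooled-variance t statistic computed by hand
n1 <- length(x_demo)
n2 <- length(y_demo)
sp2 <- ((n1 - 1) * var(x_demo) + (n2 - 1) * var(y_demo)) / (n1 + n2 - 2)
t_pooled_manual <- (mean(x_demo) - mean(y_demo)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Degrees of freedom: n1 + n2 - 2 for pooled, smaller for Welch here
print(c(df_pooled = unname(pooled$parameter),
        df_welch  = unname(welch$parameter)))
```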
Lab 5 Paired t-test
Two-sample problem, but the samples are no longer independent. Example: measurements taken twice, at different time points, from the same group of subjects, e.g. blood pressure before and after some treatment. We want to test the difference of the means
Paired t-test Take the difference of the two measurements for each subject: d_i = x_i − y_i, i.e. (x1, x2, x3, x4) − (y1, y2, y3, y4) = (d1, d2, d3, d4). This reduces the problem to a one-sample problem on the differences; the rest is the same as a one-sample t-test
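The reduction to a one-sample problem can be sketched by forming the differences explicitly and computing the one-sample t statistic on them. The paired vectors before_demo and after_demo are made up for illustration:

```r
# Hypothetical paired measurements on the same 10 subjects
before_demo <- c(120, 135, 110, 150, 128, 142, 115, 138, 125, 147)
after_demo  <- c(116, 129, 107, 145, 121, 138, 109, 133, 122, 141)

# Per-subject differences
d <- before_demo - after_demo
n <- length(d)

# One-sample t statistic on the differences, testing mean difference = 0
t_manual <- mean(d) / (sd(d) / sqrt(n))

# Same statistic from the paired t-test
paired <- t.test(before_demo, after_demo, paired = TRUE)

print(c(manual = t_manual,
        paired = unname(paired$statistic),
        df     = unname(paired$parameter)))
```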
Example 4 Consider again the data X1 and X2, and assume they are pairwise observations. Test the equality of the means, i.e. test if the difference in means = 0: H0: μ1 = μ2 against H1: μ1 ≠ μ2 at the 0.05 level of significance
Example 4
> t.test(X1, X2, alternative = "two.sided", paired = TRUE)
Paired t-test
data: X1 and X2
t = , df = 9, p-value =
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
sample estimates: mean of the differences -4.8
Alternatively…
> t.test(X1 - X2, alternative = "two.sided")
One Sample t-test
data: X1 - X2
t = , df = 9, p-value =
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
sample estimates: mean of x -4.8
Alternatively… Exactly the same result as the paired t-test
Final Remarks Notice that the conclusions from the two-sample t-test and the paired t-test are different even though we are looking at the same data set. Always check whether the two samples are independent or not
Final Remarks Using the wrong test leads to either a loss of sensitivity or an invalid analysis.
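The loss of sensitivity can be illustrated on made-up paired data in which subjects differ a lot from one another but each subject changes by a similar small amount. Treating the samples as independent hides the change; the paired test picks it up. The vectors below are hypothetical and chosen to make the contrast visible:

```r
# Hypothetical paired measurements on the same 10 subjects
before_demo <- c(120, 135, 110, 150, 128, 142, 115, 138, 125, 147)
after_demo  <- c(116, 129, 107, 145, 121, 138, 109, 133, 122, 141)

# Wrong here: treats the two samples as independent
two_sample <- t.test(before_demo, after_demo, var.equal = TRUE)

# Appropriate here: uses the per-subject differences
paired <- t.test(before_demo, after_demo, paired = TRUE)

# The two tests reach different conclusions at the 0.05 level
print(c(two_sample_p = two_sample$p.value, paired_p = paired$p.value))
```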