University of Ottawa - Bio 4118 – Applied Biostatistics © Antoine Morin and Scott Findlay 21/09/2015 7:46 PM

Two-sample comparisons
Underlying principles
Comparing population parameters: means, variances and medians
Paired-sample tests
Power analysis in two-sample comparisons

Concepts map

Sturgeon of Saskatchewan River

Two-sample comparisons
Appropriate when there are two groups to compare (e.g. control and treatment).
In principle, we can compare any sample statistic, e.g. group means, medians, variances, etc.
[Figure: frequency distributions of the control and treatment groups, with sample variances $s_C^2$ and $s_T^2$]

An example
Two samples (1, 2) with mean values that differ by some amount $\delta$.
What is the probability p of observing this difference under $H_0$ that the two means are in fact equal?
[Figure: frequency distributions of Sample 1 and Sample 2]

An example (cont'd)
If $H_0$ is true, the expected distribution of the test statistic t is:
[Figure: probability (p) as a function of t, shown with the frequency distributions of Sample 1 and Sample 2]

An example (cont'd)
For the two populations, suppose t = 2.01.
What is the probability of getting a value at least this large under $H_0$ that the two means are in fact equal?
Since p is small, a value of t this large would be unlikely if $H_0$ were true. Therefore, reject $H_0$.
[Figure: probability distribution of t with t = 2.01 marked, shown with the frequency distributions of Sample 1 and Sample 2]

Two-sample comparisons: independent samples
For independent-sample tests, there is no experimental correlation or "matching" between objects (observations) in the two groups.
E.g. weight at 6 months of a random sample of different piglets raised on two different diets.
[Figure: diet groups]

Two-sample comparisons: matched (paired) samples
For paired-sample tests, objects (observations) in one group are matched with objects in the other group.
E.g. weight at 6 months of 2 piglets, each from the same sow, raised on different diets.
[Figure: diet groups matched by sow]

Two-sample comparisons: control versus experiment
Two plots of corn, one (control) with no treatment, the other (treatment) with nitrogen added.
Biological prediction: nitrogen increases crop yield.
$H_0: \mu_T \le \mu_C$ (one-tailed)
[Figure: frequency distributions of yield for control and treatment]

Comparing means: the t-test
Calculate the difference between the two means.
$H_0$ (one-tailed): $\mu_T \le \mu_C$
Calculate t and the associated p with the appropriate degrees of freedom.
[Figure: frequency distributions of yield for control and treatment]

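The calculation of t can be sketched in a few lines of Python. This is a minimal illustration that assumes the pooled-variance form of the statistic; the function name `pooled_t` and the yield values are hypothetical and do not come from the course data.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(control, treatment):
    """Return (t, df) for the pooled-variance two-sample t statistic."""
    n_c, n_t = len(control), len(treatment)
    df = n_c + n_t - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    s2_p = ((n_c - 1) * variance(control) + (n_t - 1) * variance(treatment)) / df
    # Standard error of the difference between the two means.
    se_diff = sqrt(s2_p * (1 / n_c + 1 / n_t))
    t = (mean(treatment) - mean(control)) / se_diff
    return t, df

# Hypothetical yields, for illustration only.
control = [4.1, 3.8, 4.4, 4.0, 3.9]
treatment = [4.6, 4.9, 4.3, 5.0, 4.7]
t, df = pooled_t(control, treatment)
print(f"t = {t:.3f}, df = {df}")
```

The sign of t follows the order treatment minus control, so a positive t is in the direction of the biological prediction.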
What are degrees of freedom?
The degrees of freedom for any estimated quantity (e.g. the mean, variance, etc.) is the total sample size minus one…
… because if you know the quantity (e.g. the mean) and the values of n − 1 of the observations, you know the value of the nth observation.
The degrees of freedom for any statistical model is the total sample size minus the number of parameters.

Why do degrees of freedom matter?
The distribution of the test statistic is different for different degrees of freedom.
Therefore, depending on the degrees of freedom, the same difference in sample means will give different p values.
[Figure: probability distributions of t for 1 df and 8 df]

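To see how the same t gives different p values, the upper-tail probability can be computed for 1 df and 8 df. The sketch below integrates the density of Student's t numerically with Simpson's rule using only the Python standard library; the function names are hypothetical, and a statistics package would normally be used instead.

```python
from math import exp, lgamma, log, pi

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))

def upper_tail(t, df, steps=2000):
    """One-tailed P(T >= t) for t >= 0, using Simpson's rule on [0, t].

    steps must be even for Simpson's rule.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the distribution lies above 0; subtract the area from 0 to t.
    return 0.5 - total * h / 3

for df in (1, 8):
    print(f"df = {df}: one-tailed p for t = 2.01 is {upper_tail(2.01, df):.4f}")
```

With few degrees of freedom the tails are heavier, so the same t corresponds to a larger p.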
Comparing means: the Mann-Whitney U test
Want to compare yields of control and treatment, where each group has 4 replicate plots.
Calculate the rank sum ($R_C$, $R_E$) for each group.
$H_0: R_C = R_E$
Calculate U and the associated p.

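The rank sums and U can be computed as follows. This is a sketch with hypothetical names and data: ties receive the average of the ranks they span, and U is reported as the smaller of the two group statistics, which is one common convention and an assumption here. The p value is not computed.

```python
def rank_sums_and_u(control, experimental):
    """Return (R_C, R_E, U) from the ranks of the pooled observations."""
    pooled = sorted(control + experimental)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; each gets their average.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    r_c = sum(ranks[x] for x in control)
    r_e = sum(ranks[x] for x in experimental)
    n_c, n_e = len(control), len(experimental)
    u_c = r_c - n_c * (n_c + 1) / 2
    u_e = r_e - n_e * (n_e + 1) / 2
    return r_c, r_e, min(u_c, u_e)

# Hypothetical yields from 4 control and 4 treated plots.
print(rank_sums_and_u([1, 3, 5, 7], [2, 4, 6, 8]))
```

The two group statistics always sum to the product of the two sample sizes, which gives a quick check on the arithmetic.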
Comparing means: parametric (P) versus non-parametric (NP) tests
[Table not recovered; its footnote reads: *if assumptions are met]

Independence of observations
Lack of independence usually arises because observations are correlated in time or space.
E.g. measures of phosphorus concentrations upstream and downstream of a point source along a river.
[Figure: upstream site and downstream site]

Why do observations need to be independent?
If observations are not independent, then the true degrees of freedom are fewer (sometimes far fewer) than the calculated degrees of freedom…
… the distribution used to calculate p will be wrong…
… and p will be smaller than it ought to be.
[Figure: probability distributions of t for the calculated df and the true df, with the calculated t marked]

General procedure: N for each sample reasonably large (> 20)
Evaluate the independence assumption.
Test the normality assumption.
Test the homoscedasticity assumption.
If both samples are normal and have equal variances, use the t-test ("pooled variances").
If both samples are normal, but have different variances, use Welch's approximation ("separate variances").
If one or both samples are non-normal, try some transformation or use Mann-Whitney U.

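For the "separate variances" case, the sketch below computes Welch's t together with the Welch-Satterthwaite approximation to the degrees of freedom. The function name and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Return (t, df) using separate variances (Welch's approximation)."""
    n_x, n_y = len(x), len(y)
    # Squared standard error of each sample mean.
    v_x, v_y = variance(x) / n_x, variance(y) / n_y
    t = (mean(x) - mean(y)) / sqrt(v_x + v_y)
    # Welch-Satterthwaite degrees of freedom (usually not an integer).
    df = (v_x + v_y) ** 2 / (v_x ** 2 / (n_x - 1) + v_y ** 2 / (n_y - 1))
    return t, df

# Hypothetical samples with visibly different spread.
print(welch_t([10.1, 9.8, 10.3, 10.0], [12.5, 9.0, 15.2, 11.1, 13.9]))
```

When the two samples have equal sizes and equal variances, the approximate df equals the pooled-variance df; otherwise it is smaller.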
General procedures
N < 10 for each group: use Mann-Whitney U.
N > 10 and N < 20 for each group: use both the t-test (pooled or separate variances) and Mann-Whitney U…
… and hope the results lead to the same inference (i.e. to reject or accept the null)!

Comparison of average size of sturgeon at The Pas and Cumberland House
S-PLUS output from t-test:
  Standard Two-Sample t-Test
  data: x: FKLNGTH with LOCATION = Cumberland, and y: FKLNGTH with LOCATION = The_Pas
  t = , df = 183, p-value = 0.04
  alternative hypothesis: true difference in means is not equal to 0
  95 percent confidence interval:
  sample estimates: mean of x mean of y

Comparison of average size of sturgeon at The Pas and Cumberland House
Output from the Mann-Whitney U test (in S-PLUS, the Wilcoxon rank-sum test):
  Wilcoxon rank-sum test
  data: x: FKLNGTH with LOCATION = Cumberland, and y: FKLNGTH with LOCATION = The_Pas
  rank-sum normal statistic with correction Z = , p-value =
  alternative hypothesis: true mu is not equal to 0

Assessing normality
Do a normal probability plot.
If the plot appears linear to the eye, in general there is no need to go further.
If you're still concerned, run a Kolmogorov-Smirnov test (with Lilliefors correction, implicit in the S-PLUS K-S test).

The normal cumulative distribution
Areas under the normal probability density function and the cumulative normal distribution function.
[Figure: normal probability density function and cumulative normal distribution function, with areas marked at 2.28%, 50.00% and 68.27%]

Normal equivalent deviates
Transformation of cumulative percentages into normal equivalent deviates (Z-scores).
[Figure: normal equivalent deviates plotted against cumulative percent]

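The transformation into normal equivalent deviates can be sketched with the inverse of the standard normal cumulative distribution. The plotting position (i − 0.5)/n used to turn ranks into cumulative proportions is an assumption made for this example; other conventions exist, and the function name is hypothetical.

```python
from statistics import NormalDist

def normal_equivalent_deviates(values):
    """Return (value, cumulative proportion, NED) for each sorted value."""
    n = len(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    rows = []
    for i, v in enumerate(sorted(values), start=1):
        p = (i - 0.5) / n  # assumed plotting position, strictly between 0 and 1
        rows.append((v, p, std_normal.inv_cdf(p)))
    return rows

# Hypothetical measurements; plotting NED against value gives the
# normal probability plot, which is close to a line for normal data.
for value, p, ned in normal_equivalent_deviates([41.2, 38.5, 44.0, 40.1, 39.7]):
    print(f"{value:6.1f}  {100 * p:5.1f}%  {ned:+.3f}")
```

A cumulative percentage of 50% maps to an NED of 0, and percentages below 50% map to negative deviates.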
Normal probability plots
Examples of frequency distributions with their cumulative distributions plotted as normal probability plots.
A: Normal; B: Equal mixture of two distributions; C: Skewed to left; D: Skewed to right; E: Platykurtic; F: Leptokurtic.
[Figure: panels A to F; vertical axis: NED]

Example (Lab 2): sturgeon size at The Pas and Cumberland House
The normal probability plots for fklngth at The Pas and Cumberland House are:
[Figure: normal probability plots of fklngth at The Pas and Cumberland House]

Example: Sturgeon size (cont'd)
Output from S-PLUS Kolmogorov-Smirnov test: normality of fklngth at The Pas
  One sample Kolmogorov-Smirnov Test of Composite Normality
  data: FKLNGTH in SturgPas
  ks = , p-value = 0.5
  alternative hypothesis: True cdf is not the normal distn. with estimated parameters
  sample estimates: mean of x standard deviation of x

Equality of variances (homoscedasticity) using the F-ratio test
If variances are equal, then $s_C^2 = s_T^2$.
$H_0$ (F-ratio): [equation not recovered]
This test is quite sensitive to non-normality.
[Figure: frequency distributions of the control and treatment groups, with sample variances $s_C^2$ and $s_T^2$]

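A variance ratio can be computed as below. This sketch assumes the convention of placing the larger sample variance in the numerator, so that F is at least 1; the slide's own expression for the null hypothesis is not available, and the function name and data are hypothetical.

```python
from statistics import variance

def variance_ratio(control, treatment):
    """Return (F, numerator df, denominator df), larger variance on top.

    Raises ZeroDivisionError if the smaller sample variance is zero.
    """
    s2_c, s2_t = variance(control), variance(treatment)
    if s2_c >= s2_t:
        return s2_c / s2_t, len(control) - 1, len(treatment) - 1
    return s2_t / s2_c, len(treatment) - 1, len(control) - 1

# Hypothetical data: the treatment group is twice as spread out.
print(variance_ratio([1, 2, 3], [2, 4, 6]))
```

The resulting F is compared with the F distribution for the two stated degrees of freedom; that step is omitted here.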
Equality of variances (homoscedasticity) using Levene's test
If variances are equal, then $s_C^2 = s_T^2$.
$H_0$ (Levene's): [equation not recovered]
This test is less sensitive to non-normality.
[Figure: frequency distributions of the control and treatment groups, with sample variances $s_C^2$ and $s_T^2$]

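One way to compute Levene's statistic is sketched below. It assumes the version that uses absolute deviations from each group mean (some software uses the group median instead) and compares their averages across groups with a one-way ANOVA-style ratio. The function name and data are hypothetical.

```python
from statistics import mean

def levene_w(*groups):
    """Levene's W based on absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_bar_i = [mean(zi) for zi in z]
    z_bar = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zb - z_bar) ** 2 for zi, zb in zip(z, z_bar_i))
    within = sum((v - zb) ** 2 for zi, zb in zip(z, z_bar_i) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical control and treatment samples.
print(levene_w([1, 2, 3, 4], [2, 4, 6, 8]))
```

W is compared with an F distribution with k − 1 and N − k degrees of freedom; groups with identical spread give W = 0.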
Comparing medians: the median test
Calculate the median M for both samples combined.
Classify each observation as being above or below M to create a 2 × 2 table.
Do a $\chi^2$ or G test of independence.
[Figure: frequency distributions of yield for the control and experimental groups, with M marked]

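The steps of the median test can be sketched as follows. Two choices here are assumptions rather than part of the slide: observations equal to M are counted as "not above", and the chi-square statistic is computed without a continuity correction. Names and data are hypothetical.

```python
from statistics import median

def median_test_chi2(control, experimental):
    """Return (M, 2 x 2 table, chi-square statistic with 1 df)."""
    m = median(control + experimental)
    table = []
    for group in (control, experimental):
        above = sum(1 for x in group if x > m)
        table.append([above, len(group) - above])  # [above M, not above M]
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence; zero if a column is empty.
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return m, table, chi2

# Hypothetical yields with complete separation between groups.
print(median_test_chi2([1, 2, 3, 4], [5, 6, 7, 8]))
```

The function fails with a division by zero if no observation lies above M, in which case the test is not informative anyway.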
Paired-sample tests
Used when the same object is measured under different treatments (e.g. change in rat weights before and after treatment with a drug)…
… or when there is a correlation between observations in the two samples.
Use the paired t-statistic.

Paired- versus independent-sample t-tests
When correlation is present, the paired t-test is much more powerful because the standard error of the average difference between pairs is usually much smaller than the standard error of the difference between the two means.
If no correlation is present, the paired test is weaker because N is the number of pairs, not the number of observations.
$s_b^2 = 8.67$, $s_a^2 = 21.58$, $s_W^2 = 2.81$

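The gain in power from pairing can be seen by computing both statistics on the same correlated data. In this sketch the measurements are hypothetical (they are not the face-width data), and the independent-sample statistic uses pooled variances.

```python
from math import sqrt
from statistics import mean, stdev, variance

def independent_t(x, y):
    """Pooled-variance two-sample t; returns (t, df)."""
    n_x, n_y = len(x), len(y)
    df = n_x + n_y - 2
    s2_p = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / df
    return (mean(x) - mean(y)) / sqrt(s2_p * (1 / n_x + 1 / n_y)), df

def paired_t(x, y):
    """Paired t on the within-pair differences; returns (t, df)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)  # N is the number of pairs, not the number of observations
    return mean(d) / (stdev(d) / sqrt(n)), n - 1

# Hypothetical repeated measurements on the same 5 individuals.
before = [7.3, 7.5, 6.9, 7.1, 7.8]
after = [7.5, 7.8, 7.1, 7.4, 8.0]
print("independent:", independent_t(before, after))
print("paired:     ", paired_t(before, after))
```

Because individuals differ much more from each other than they change between measurements, the paired t is far larger in absolute value even though it has half the degrees of freedom.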
Paired versus independent-sample tests: changes in face width
  Standard Two-Sample t-Test
  data: x: WIDTH with AGE = 5, and y: WIDTH with AGE = 6
  t = , df = 28, p-value =
  alternative hypothesis: true difference in means is not equal to 0
  95 percent confidence interval:
  sample estimates: mean of x mean of y

  Paired t-Test
  data: x: WIDTH5 in Skulldat, and y: WIDTH6 in Skulldat
  t = , df = 14, p-value = 0
  alternative hypothesis: true mean of differences is not equal to 0
  95 percent confidence interval:
  sample estimates: mean of x - y

Two-sample tests of the mean: minimum sample size
Suppose we want to detect a difference between two sample means of at least $\delta$.
To test at the significance level $\alpha$ with power $1 - \beta$, we can calculate the minimum sample size $n_{min}$ required to detect $\delta$, given a pooled sample variance $s_p^2$.
[Figure: frequency distributions of Sample 1 and Sample 2]

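A rough version of this calculation can be sketched with the normal approximation, n ≈ 2 s_p² (z_{1−α/2} + z_{1−β})² / δ² per group. This is an assumption made for illustration and not necessarily the formula used in the course or in G*Power, which work with the t distribution and give slightly larger n. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_min(delta, s2_p, alpha=0.05, power=0.80):
    """Approximate minimum n per group for a two-tailed two-sample test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_beta = std_normal.inv_cdf(power)           # power = 1 - beta
    return ceil(2 * s2_p * (z_alpha + z_beta) ** 2 / delta ** 2)

# Difference of one standard deviation, alpha = 0.05, power = 0.80.
print(n_min(delta=1.0, s2_p=1.0))
```

Halving the difference to be detected roughly quadruples the required sample size, because δ enters the formula squared.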
Power analysis with G*Power
Test: t-test on means. Index of effect size: d. Calculation: [equation not recovered]

Comparison of average size of sturgeon at The Pas and Cumberland House
S-PLUS output from Summary statistics:
  LOCATION: Cumberland
  FKLNGTH Min: 1st Qu.: Mean: Median: 3rd Qu.: Max: Total N: NA's: Std Dev.:
  LOCATION: The_Pas
  FKLNGTH Min: 1st Qu.: Mean: Median: 3rd Qu.: Max: Total N: NA's: Std Dev.:

Minimum sample size

Two-sample tests of the mean: minimal detectable difference
What is the minimal difference $\delta_{min}$ between two sample means that can be detected at the significance level $\alpha$ with power $1 - \beta$, given an estimated pooled variance $s_p^2$?
[Figure: frequency distributions of Sample 1 and Sample 2]

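Under the normal approximation, the minimal detectable difference for n observations per group is δ_min ≈ sqrt(2 s_p² / n) (z_{1−α/2} + z_{1−β}). The sketch below uses that approximation as an assumption; an exact calculation would use the t distribution. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def minimal_detectable_difference(n, s2_p, alpha=0.05, power=0.80):
    """Approximate delta_min for a two-tailed test with n per group."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    # Standard error of the difference between two means of size n.
    se_diff = sqrt(2 * s2_p / n)
    return se_diff * (z_alpha + z_beta)

# With 16 observations per group and unit variance.
print(round(minimal_detectable_difference(n=16, s2_p=1.0), 3))
```

Larger samples shrink the standard error of the difference, so smaller differences become detectable.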
Minimal detectable difference

An example: power of a two-sample test of the mean
What is the probability of detecting a true difference of 1.01 if $\alpha(2) = 0.01$?

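A power calculation of this kind can be sketched with the normal approximation, power ≈ Φ(δ / sqrt(2 s_p² / n) − z_{1−α/2}), which ignores the negligible contribution of the opposite tail. The difference of 1.01 and α(2) = 0.01 come from the example; the sample size and pooled variance are not given in the text, so the values of `n` and `s2_p` below are hypothetical placeholders.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(delta, s2_p, n, alpha=0.05):
    """Approximate power of a two-tailed two-sample test, n per group."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    se_diff = sqrt(2 * s2_p / n)
    # Probability that the test statistic exceeds the critical value
    # when the true difference is delta.
    return std_normal.cdf(delta / se_diff - z_alpha)

# n and s2_p are placeholders, not values from the sturgeon data.
print(approx_power(delta=1.01, s2_p=4.0, n=50, alpha=0.01))
```

Replacing the placeholders with the observed sample sizes and pooled variance gives the approximate power for the example; G*Power's exact value will differ slightly because it uses the noncentral t distribution.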
Calculating power
