Introduction to Biostatistics, Harvard Extension School © Scott Evans, Ph.D. and Lynne Peeples, M.S. 1 T-tests and their Nonparametric Analogs.

Presentation transcript:

Slide 2: Decoding Terminology
- “T-test”
  - Type of hypothesis test
  - Inference using the Central Limit Theorem (CLT)
  - One-sample case last week
  - Two-sample case tonight…
- “Nonparametric Analogs”
  - Both one- and two-sample hypothesis tests using “ranks”
  - Useful for non-normal data

Slide 3: The Big Picture: Populations and Samples
- Population parameters: μ, σ, σ²
- Sample statistics: x̄, s, s²

Slide 4: What is a Parameter?
- Parameter = a characteristic of the population in which we have a particular interest
  - Examples: μ, σ, σ², ρ
- Statistical tests that assume an underlying distribution (“parametric”)
  - For Normal data, knowing just two parameters (µ and σ) allows you to fully describe a unique distribution and therefore calculate probabilities
- Statistical tests not relying on an underlying distribution (“nonparametric”)

Slide 5: Hypothesis Testing Tree
[Figure: hypothesis testing tree]
- Other considerations: one- or two-sided hypothesis?

Slide 6: Hypothesis Testing Steps
1. State H0 (i.e., what you are trying to disprove)
2. State HA
3. Determine α (at your discretion)
4. Determine the test statistic and associated p-value
5. Determine whether to reject H0 or fail to reject H0

Slide 7: One-Sample T-test Review
1. H0: μ = μ0
2. HA: μ ≠ μ0 or μ > μ0 or μ < μ0
3. Typically set α at 0.05
4. Test statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, with n − 1 degrees of freedom; obtain the p-value
5. Conclusion: reject or fail to reject H0?
Assumptions:
- Random (valid) sampling
- Data come from a population that is normally distributed
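As a rough illustration of step 4 on this slide, the one-sample t statistic can be computed with Python's standard library. This is a minimal sketch: the function name `one_sample_t` and the data are hypothetical and not taken from the course, and the p-value still has to be read from a t table or from statistical software.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for H0: mu = mu0 using the sample standard deviation."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)            # divides by n - 1
    t = (xbar - mu0) / (s / math.sqrt(n))   # distance from mu0 in standard errors
    return t, n - 1

# Hypothetical measurements, not from the slides
t, df = one_sample_t([5.1, 4.9, 5.6, 5.8, 5.2, 5.4], mu0=5.0)
print(round(t, 3), df)
```

The statistic is then compared with the t distribution on `df` degrees of freedom, one- or two-sided depending on HA.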

Slide 8: Comparing Two Samples
[Figure: two populations with means μ1 and μ2; samples of size n1 and n2 give the sampling distributions of x̄1 and x̄2; the sampling distribution of x̄1 − x̄2 is centered at μ1 − μ2]

Slide 9: Comparing Two Samples
- Are the two samples independent?
- Are the standard deviations, σ1 and σ2, significantly different?

                              Group 1   Group 2
  Population  Mean            μ1        μ2
              Std. deviation  σ1        σ2
  Sample      Mean            x̄1        x̄2
              Std. deviation  s1        s2
              Sample size     n1        n2

Slide 10: Comparing Two Independent Samples

Slide 11: Comparing Two Independent Samples
In the case of two independent samples, consider the following issues:
1. The two sets of measurements are independent because each comes from a different group (e.g., healthy children, children suffering from cystic fibrosis)
2. In contrast to the one-sample case, we are simultaneously estimating two population means instead of one
   - There are now two sources of variability instead of one (one from each sample)
   - As a result, the standard deviation of the estimate is going to be larger than in the one-sample case

Slide 12: Comparing Two Independent Samples
The following assumptions must hold:
a. The two samples must be independent of each other
b. The individual measurements must be roughly normally distributed
c. The variances in the two populations must be roughly equal
If a-c are satisfied, inference will be based on the statistic $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, where $s_p^2$ is the pooled estimate of the common variance. The statistic is distributed according to a t distribution with n1 + n2 − 2 degrees of freedom.

Slide 13: Assessment of Normality: Approximately Normal?
- Determine by plotting histograms for each sample
  - Not exact: eyeball
- Formal hypothesis test of Normality
  - e.g., Shapiro-Wilk and D’Agostino tests
  - H0: Normal vs. HA: Non-Normal

Slide 14: What if… Samples are NOT Normal?
- If either sample is not determined to be Normal…
  1. Transform the data so that they are approximately Normal and perform the analyses on the transformed data (don’t forget to transform the interpretation back to the original scale)
  2. Use nonparametric methods

Slide 15: Assessment of Variability: Approximately Equal Spread?
- Determine by plotting histograms for each sample
  - Not exact: eyeball
- Formal hypothesis test:
  - Hypotheses
    - H0: variances are equal
    - HA: variances are not equal
  - Tests
    - Levene's test
    - Brown & Forsythe's test
    - Bartlett's test

Slide 16: Comparing Two Independent Samples
STEP 1. Based on two random samples of n1 and n2 observations, compute the sample means x̄1 and x̄2 and the standard deviations s1 and s2
STEP 2. Compute the pooled estimate of the population variance: $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
STEP 3. The estimate of the standard deviation of x̄1 − x̄2 is $s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
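The three steps on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the course's own code: `pooled_t` is a hypothetical helper, the data are made up, and it assumes the equal-variance case with H0: μ1 = μ2.

```python
import math
import statistics

def pooled_t(x1, x2):
    """Equal-variance two-sample t statistic for H0: mu1 = mu2."""
    n1, n2 = len(x1), len(x2)
    # STEP 1: sample means and variances
    m1, m2 = statistics.mean(x1), statistics.mean(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    # STEP 2: pooled estimate of the common variance
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    # STEP 3: estimated standard deviation of the difference in means
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2, sp2

# Hypothetical data, not from the slides
t, df, sp2 = pooled_t([1, 2, 3], [2, 3, 4, 5])
print(round(t, 4), df, round(sp2, 4))
```

The returned `t` is compared with a t distribution on `df` = n1 + n2 − 2 degrees of freedom.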

Slide 17: Comparing Two Independent Samples
- Back to our example of serum iron levels and cystic fibrosis…

Slide 18: Comparing Two Independent Samples

Slide 19: Comparing Two Independent Samples

Slide 20: Comparing Two Independent Samples

Slide 21: Comparing Two Independent Samples
- What if the two samples had significantly different variances?
- First, calculate t: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
- Approximate the degrees of freedom with v: $v = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
- Now, compare the value of the statistic to a t distribution with v degrees of freedom
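For the unequal-variance case, a minimal sketch of the statistic and the approximate degrees of freedom might look like the following. It assumes the usual Welch-Satterthwaite approximation for v; the helper name `welch_t` and the data are hypothetical.

```python
import math
import statistics

def welch_t(x1, x2):
    """Unequal-variance t statistic and approximate degrees of freedom v."""
    n1, n2 = len(x1), len(x2)
    a = statistics.variance(x1) / n1   # s1^2 / n1
    b = statistics.variance(x2) / n2   # s2^2 / n2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(a + b)
    v = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, v

# Hypothetical data with equal sizes and equal spread
t, v = welch_t([1, 2, 3], [4, 5, 6])
print(round(t, 4), v)
```

With equal sample sizes and equal sample variances, as in this toy data, v reduces to n1 + n2 − 2; in general v is not an integer and is often rounded down before using a t table.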

Slide 22: Two-sample Confidence Intervals
- We may be interested in an estimate of the difference between two population means: (μ1 − μ2)
- E.g., the difference in cholesterol levels between treatment (μ1) and control (μ2) groups

Slide 23: Two-sample Confidence Intervals
- Hypothesis testing parallels interval estimation
- Values between the CI limits are values for which the null hypothesis would not be rejected
- Note that the critical values change depending on the t distribution (number of degrees of freedom)

Slide 24: Two-sample Confidence Intervals
- Two-sided 95% confidence interval for the difference in two means (μ1 − μ2), with 2.5% in each tail: $(\bar{x}_1 - \bar{x}_2) \pm t_{n_1 + n_2 - 2,\,0.975}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
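A two-sided interval for μ1 − μ2 under the equal-variance assumption can be sketched as below. The standard library has no t quantile function, so this sketch takes the critical value as an argument that the caller looks up; `pooled_ci`, the data, and the choice of 2.571 (the 97.5th percentile of t with 5 degrees of freedom) are illustrative assumptions.

```python
import math
import statistics

def pooled_ci(x1, x2, t_crit):
    """Two-sided CI for mu1 - mu2, assuming equal population variances.

    t_crit is the upper critical value of the t distribution with
    n1 + n2 - 2 degrees of freedom, supplied by the caller.
    """
    n1, n2 = len(x1), len(x2)
    sp2 = ((n1 - 1) * statistics.variance(x1)
           + (n2 - 1) * statistics.variance(x2)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = statistics.mean(x1) - statistics.mean(x2)
    return diff - t_crit * se, diff + t_crit * se

# Hypothetical data, not from the slides
low, high = pooled_ci([1, 2, 3], [2, 3, 4, 5], t_crit=2.571)
print(round(low, 3), round(high, 3))
```

If the interval contains 0, the two-sided test of H0: μ1 = μ2 at the matching α would not reject.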

Slide 25: Two-sample Confidence Intervals
- Back to our example…

Slide 26: Two-sample Confidence Intervals

Slide 27: Two-sample Confidence Intervals
- In our example…

Slide 28: Two-sample Confidence Intervals
- Thus, a two-sided 95% confidence interval for the difference in serum iron levels between healthy children and children who are suffering from cystic fibrosis is (1.4, 12.6), as we saw before.

Slide 29: Two-sample Confidence Intervals
- If a 95% one-sided confidence interval were required (corresponding to a one-sided hypothesis test), the computer solution would be as follows:
- The 95% lower one-sided confidence interval for the difference in mean serum iron level is then (2.4, +∞)

Slide 30: Two-sample Confidence Intervals
- Two-sided confidence interval (2.5% in each tail): (1.4, 12.6)
- One-sided confidence interval (5% in one tail): (2.4, +∞)

Slide 31: Two-sample Confidence Intervals
- If our data had significantly different variances, the CI equation would change: $(\bar{x}_1 - \bar{x}_2) \pm t_{v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
- Now the variances of both samples are taken into account
- Again, based on a t distribution with v degrees of freedom

Slide 32: CHD Example
- Effects of Hormone Replacement Therapy on Serum Lipids in Elderly Women, Annals of Internal Medicine (134: , 2001)
- Coronary heart disease (CHD) is the leading cause of death among older women
- Low levels of HDL are a risk factor for death from CHD
- Researchers claim that hormone replacement therapy can increase HDL, thus hopefully preventing CHD

Slide 33: CHD Example
- 63 women were randomized in a double-blinded fashion to one of two groups (experimental or placebo arm)
- Four women were excluded from our analysis because lipid-lowering medication dosage was not stable during the study period (2 each in the placebo and HRT groups)
- We analyzed data on 59 participants: 20 in the placebo group and 39 in the HRT group
[Figure: population (elderly women) split into HRT (n=39) and placebo (n=20)]

Slide 34: CHD Example
2-sample (independent) t-test:
- Hypotheses: H0: μ_Experimental ≤ μ_Placebo vs. HA: μ_Experimental > μ_Placebo
- Critical value: t_0.01 (using 57 df)

Slide 35: CHD Example
- Remember: [formulas]
- Test statistic: [value]
- Conclusion:
  - Reject H0
  - There is sufficient evidence to conclude that the mean increase in HDL levels of women in the treatment group is greater than the mean increase in HDL levels of women in the control group

Slide 36: Independent vs. Dependent
- Independent: one set of data tells us nothing about values in another set of data
- Dependent (paired): for each observation in group 1, there is a corresponding observation in group 2
  - Before/after measures, sets of twins, etc.
  - The key is that variables other than the one we are interested in are “matched” between samples (e.g., age, sex, …)

Slide 37: Comparing Paired Samples
The percent difference in the time to the onset of angina on the first series of tests (when breathing regular air) and the percent difference in the time to the onset of angina during the second series of tests (when breathing air mixed with CO) were compared.

Slide 38: Comparing Paired Samples

Slide 39: Comparing Paired Samples
- Hypothesis testing for paired samples:

Slide 40: Comparing Paired Samples
- Note that we can think of the paired t-test as a case of a one-sample t-test based on the differences, d
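The reduction of the paired test to a one-sample test on the differences can be sketched as follows. The helper `paired_t` and the before/after values are hypothetical; the null hypothesis is taken to be a mean difference of zero.

```python
import math
import statistics

def paired_t(first, second):
    """Paired t statistic: a one-sample t-test of the differences against 0."""
    d = [b - a for a, b in zip(first, second)]   # one difference per pair
    n = len(d)
    dbar = statistics.mean(d)
    sd = statistics.stdev(d)
    t = dbar / (sd / math.sqrt(n))
    return t, n - 1, dbar

# Hypothetical before/after measurements on four subjects
t, df, dbar = paired_t([10, 12, 9, 11], [12, 13, 11, 12])
print(round(t, 4), df, dbar)
```

Only the n differences enter the calculation, so the degrees of freedom are n − 1, not 2n − 2.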

Slide 41: Comparing Paired Samples

Slide 42: Comparing Paired Samples

Slide 43: Comparing Paired Samples
- To carry out the hypothesis test in STATA, we use the one-sample t-test command as before, noting that our data now consist of the differences of the paired observations and that the mean under the null hypothesis is zero:
- Since the one-sided p-value reported by STATA (P < t) is 0.0059 < 0.05, we reject the null hypothesis
- Patients experience angina faster (by about 6.63%) when breathing air mixed with CO than when breathing clean air

Slide 44: Confidence Intervals - Paired
- Again, this is equivalent to the case of a one-sample t-test, but now based on the mean difference, d̄, rather than x̄

Slide 45: Covered so far…

Slide 46: The World is Not Always Normal!
- What is “nonparametrics”?
  - Approach to data analysis
  - Particularly useful when data are not Normal
  - Based on ranks
- When is it best to use this technique?
  - Skewed data
  - Bimodal or ordinal data
- Examples
  - Pain/severity scales
  - Performance ratings

Slide 47: Need more Branches…
Analysis options for…
- Parametric
- Nonparametric
  - One sample: Sign Test, Wilcoxon Signed-Rank Test
  - Two sample
    - Independent: Wilcoxon Rank-Sum Test
    - Dependent (paired): Sign Test, Wilcoxon Signed-Rank Test

Slide 48: Nonparametric Analogs

  Situation                Parametric test                 Nonparametric test(s)
  1-sample                 1-sample t-test                 Sign test; Wilcoxon signed-rank test
  2-sample (independent)   2-sample t-test (independent)   Wilcoxon rank-sum test (aka Mann-Whitney)
  2-sample (dependent)     Paired t-test                   Sign test; Wilcoxon signed-rank test

- Note: nonparametric tests do not require normality (or any distributional assumption)
  - They are often called “distribution-free”
  - The two-sample tests require equal variances

Slide 49: Nonparametric Tests
- Advantages
  - Don’t need to assume the data come from a specified distribution (e.g., Normal)
  - Can be quick and easy!
- Disadvantages
  - We may not do as well as the parametric approach if the data do come from a normal population (less “power”)
  - Difficult to make quantitative statements about differences
- If the sample size is large (even if the data are not Normal), we can apply the CLT and use t-tests, etc.

Slide 50: Nonparametric Tests
- Assign, and use, ranks (or signs) instead of raw data in tests
  - e.g., test scores: 67, 58, 89, 91, 91, 96; ranked (in the same order): 2, 1, 3, 4.5, 4.5, 6
- The Sign Test only accounts for whether a data point is less than, equal to, or greater than a specified value
  - + or − assigned
- Wilcoxon tests incorporate the relative magnitude of differences through ranks
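Ranking with ties can be sketched as below: tied values share the average of the rank positions they occupy. The function name `midranks` is a hypothetical helper. Ranks are reported in the original order of the scores, so the smallest score (58) receives rank 1 even though it is listed second.

```python
def midranks(values):
    """Rank values from smallest to largest, averaging ranks within ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

print(midranks([67, 58, 89, 91, 91, 96]))  # [2.0, 1.0, 3.0, 4.5, 4.5, 6.0]
```

The two scores of 91 occupy positions 4 and 5, so each gets rank 4.5.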

Slide 51: Sign Test
- Categorize each value as above or below the hypothesized median
- Test the relative number of + and − signs
- Can also be used for paired observations (differences in two samples)

Slide 52: Sunblock Experiment
- Spread new test lotion on one arm (randomly chosen) and spread the placebo cream on the other arm
  - Ensuring proper rotation with the sun for even baking…
- After two hours, check arms and note which one is more red
- If the cream has no effect, we expect even redness on the arms
- Hypothesis test:
  - H0: Median_case = Median_control vs. HA: Median_case ≠ Median_control

Slide 53: Sunblock Experiment

             Lotion   Placebo
  Triin      −        +
  Audrey     +        −
  Tzu-Min    −        +
  Yu         −        +
  Katie      −        +
  Jeanne     +        −
  Scott      0        0

Note: + indicates the redder arm

Slide 54: Sunblock Experiment

    . signtest lotion=placebo

    Sign test

            sign |  observed  expected
        positive |     2         3
        negative |     4         3
            zero |
             all |     7         7

    Two-sided test:
      Ho: median of case - control = 0 vs.
      Ha: median of case - control != 0
          Pr(#positive >= 4 or #negative >= 4) =
             min(1, 2*Binomial(n = 6, x >= 4, p = 0.5)) =

- Note that we have a small sample size (n=6)
  - Zero not counted
- We may not be able to make normality assumptions for Z+, so exact methods are used
  - Binomial distribution (covered later in the course)
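The exact two-sided p-value in this output follows directly from the binomial expression min(1, 2*Binomial(n = 6, x >= 4, p = 0.5)). A minimal sketch, with `sign_test_p` as a hypothetical helper name:

```python
from math import comb

def sign_test_p(n_pos, n_neg):
    """Exact two-sided sign-test p-value; zero differences are excluded."""
    n = n_pos + n_neg
    k = max(n_pos, n_neg)
    # P(X >= k) for X ~ Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

print(sign_test_p(2, 4))  # 0.6875
```

With 2 positive and 4 negative signs the tail probability is 22/64, so the two-sided p-value is 0.6875 and the null hypothesis is not rejected at α = 0.05.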

Slide 55: Wilcoxon Signed-Rank Test
- The Sign Test can be very wasteful of information
- Wilcoxon tests account for differences in relative magnitude between groups, as well as signs
- In our example, burns are graded on a 0-5 scale (0 = no burn, 5 = severe burn)
- Rank the data, and add signs

Slide 56: Sunblock Experiment

             Lotion   Placebo   Abs(diff)   Rank   Sign
  Triin      1        5         4           4.5    (−)
  Audrey     3        2         1           1.5    (+)
  Tzu-Min    0        5         5           6      (−)
  Yu         2        4         2           3      (−)
  Katie                         4           4.5    (−)
  Jeanne                        1           1.5    (+)
  Scott      3        3         0                  (0)

Slide 57: Sunblock Experiment
- Hypothesis test:
  - H0: Median difference in sunburn between lotion arm and placebo arm = 0 (T+ = T−) vs. HA: Median difference in sunburn between lotion arm and placebo arm ≠ 0 (T+ ≠ T−)
- Sum up the ranks corresponding to “+” and “−” signs (ignoring zeros):
  - T+ = 3
  - T− = 18
  - Note that STATA calculates this a bit differently, by including the zero value in the rankings and later adjusting for it
- If there were no difference, we would expect equal rank sums (i.e., T+ = T−)
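The rank sums T+ and T− can be reproduced with a short sketch. The signed differences below (lotion minus placebo) are read off the sunblock table; Katie's absolute difference of 4 is inferred from her tied rank of 4.5. The helper names are hypothetical, and the exact p-value is obtained by enumerating all sign assignments, which is an illustrative approach rather than what STATA does.

```python
from itertools import product

def signed_rank(diffs):
    """Return (T_plus, T_minus, ranks), dropping zero differences."""
    nz = [d for d in diffs if d != 0]
    abs_sorted = sorted(abs(d) for d in nz)

    def rank(a):
        first = abs_sorted.index(a) + 1      # first 1-based position of a
        count = abs_sorted.count(a)          # number of tied values
        return first + (count - 1) / 2       # average rank within the tie

    ranks = [rank(abs(d)) for d in nz]
    t_plus = sum(r for r, d in zip(ranks, nz) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, nz) if d < 0)
    return t_plus, t_minus, ranks

def exact_two_sided_p(ranks, t_small):
    """Share of sign assignments whose positive rank sum is <= t_small, doubled."""
    count = 0
    for signs in product((0, 1), repeat=len(ranks)):
        if sum(r for r, s in zip(ranks, signs) if s) <= t_small:
            count += 1
    return min(1.0, 2 * count / 2 ** len(ranks))

diffs = [-4, 1, -5, -2, -4, 1, 0]   # Triin, Audrey, Tzu-Min, Yu, Katie, Jeanne, Scott
t_plus, t_minus, ranks = signed_rank(diffs)
print(t_plus, t_minus, exact_two_sided_p(ranks, min(t_plus, t_minus)))  # 3.0 18.0 0.15625
```

Five of the 64 equally likely sign assignments give a positive rank sum of 3 or less, so the two-sided exact p-value is 10/64 ≈ 0.156.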

Slide 58: Sunblock Experiment

    . signrank lotion=placebo

    Wilcoxon signed-rank test

            sign |  obs   sum ranks   expected
        positive |
        negative |
            zero |
             all |

    unadjusted variance
    adjustment for ties
    adjustment for zeros
    adjusted variance

    Ho: case = control
                 z =
        Prob > |z| =

- Again, note the small n
- Table A.6 provides exact p-values
  - Looking there: sample size = 6 and T0 = 3, so p = 2(0.0782) = 0.1564
- STATA ranks the zero, but later adjusts for it

Slide 59: Wilcoxon Rank-Sum Test
- Testing for a difference in two independent samples (aka Mann-Whitney)
  - H0: The two populations have equal medians vs. HA: The two populations have different medians
- Rank all data together, regardless of group
- Sum the ranks in one group: is the sum extreme?

Slide 60: Caffeine and Memory
- Two independent groups of ten individuals
  - 10 given a grande cup of Starbucks coffee
  - 10 given a grande cup of decaf Starbucks coffee
- How many of 20 objects in a memory test were recollected?
- H0: Median recollected equal in each group vs. HA: Median recollected different in each group

Slide 61: Caffeine and Memory
[Table: memory scores for the caffeinated and decaf groups]
- Rank all 20 scores
- Sum of caffeinated ranks = 125

Slide 62: Caffeine and Memory

    . ranksum score, by(caf)

    Two-sample Wilcoxon rank-sum (Mann-Whitney) test

             caf |  obs   rank sum   expected
                 |
                 |
        combined |

    unadjusted variance
    adjustment for ties
    adjusted variance

    Ho: score(caf==0) = score(caf==1)
                 z =
        Prob > |z| =

- Since p > 0.05, we do not reject the null hypothesis
- Based on our study, there is no significant evidence that caffeine improves short-term memory
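The normal approximation behind the rank-sum z statistic can be sketched from the quantities the slides do state: a caffeinated rank sum of 125 with ten people per group. This sketch uses the unadjusted variance, because the individual scores (and therefore the tie correction) are not in the transcript; STATA's reported z would differ slightly. `rank_sum_z` is a hypothetical helper.

```python
import math

def rank_sum_z(w, n1, n2):
    """Normal approximation for the Wilcoxon rank-sum statistic w of group 1."""
    n = n1 + n2
    expected = n1 * (n + 1) / 2            # mean of w under H0
    variance = n1 * n2 * (n + 1) / 12      # unadjusted: no tie correction
    z = (w - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided standard normal p-value
    return expected, variance, z, p

expected, variance, z, p = rank_sum_z(125, 10, 10)
print(expected, variance, round(z, 3), round(p, 3))
```

Under these assumptions the expected rank sum is 105 and z is about 1.51, which agrees with the slide's conclusion that p > 0.05.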

Slide 63: Hypothesis Tests Review
Analysis options for…
- Parametric
  - One sample: Z-Test (σ known), T-Test (σ unknown)
  - Two samples
    - Independent: 2-Sample T-test (with equal variances or with unequal variances)
    - Dependent: Paired T-test
- Nonparametric
  - One sample: Sign Test, Wilcoxon Signed-Rank Test
  - Two samples
    - Independent: Wilcoxon Rank-Sum Test
    - Dependent (paired): Sign Test, Wilcoxon Signed-Rank Test