Sihua Peng, PhD Shanghai Ocean University

Biostatistics
6. Simple hypothesis testing
Sihua Peng, PhD, Shanghai Ocean University, 2016.10

Contents
Introduction to R
Data sets
Introductory Statistical Principles
Sampling and experimental design with R
Graphical data presentation
Simple hypothesis testing
Introduction to Linear models
Correlation and simple linear regression
Single factor classification (ANOVA)
Nested ANOVA
Factorial ANOVA
Simple Frequency Analysis

6. Simple hypothesis testing
6.1 Hypothesis testing
6.2 One- and two-tailed tests
6.3 t-tests
6.4 Assumptions
6.5 Statistical decision and power
6.6 Robust tests
6.8 Key for simple hypothesis testing
6.9 Worked examples of real biological data sets

6.1 Hypothesis testing
A biological or research hypothesis is a concise statement about the predicted or theorized nature of a population or populations, and usually proposes that there is an effect of a treatment (e.g. the means of two populations are different). Logically, however, theories (and thus hypotheses) cannot be proved, only disproved (falsification). A null hypothesis (H0) is therefore formulated to represent all possibilities except the hypothesized prediction.

6.2 One- and two-tailed tests
A two-tailed test is any test of a null hypothesis that can be rejected by large deviations from the expected value in either direction. By contrast, one-tailed tests are used to test more specific null hypotheses, restricting rejection of the null hypothesis to outcomes in one direction only.
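The relationship between the two kinds of test can be sketched in R. The data below are simulated and purely hypothetical (they are not from the lecture); the point is only that the t statistic is the same in each case and the choice of `alternative` decides which tail area is reported as the P value.

```r
# Hypothetical simulated sample (not lecture data)
set.seed(1)
x <- rnorm(20, mean = 0.5, sd = 1)

two_sided <- t.test(x, mu = 0)                           # HA: mu != 0
greater   <- t.test(x, mu = 0, alternative = "greater")  # HA: mu > 0
less      <- t.test(x, mu = 0, alternative = "less")     # HA: mu < 0

# The t statistic is identical; only the tail area used for P differs
c(two_sided = two_sided$p.value,
  greater   = greater$p.value,
  less      = less$p.value)
```

The two-tailed P value is twice the smaller of the two one-tailed P values, and the two one-tailed P values sum to 1.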

[Figure: H0, HA, Reject, Accept]

6.3 t-tests
Single population t-tests
Single population t-tests are used to test null hypotheses that a population parameter is equal to a specific value (H0 : μ = θ, where θ is typically 0), and are thus useful for testing coefficients of regression and correlation or for testing whether measured differences are equal to zero.
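As a rough illustration, the single population t statistic can be computed by hand and compared with the output of `t.test()`. The sample and the value of `theta` below are hypothetical, chosen only for the sketch.

```r
# Hypothetical simulated sample (not lecture data)
set.seed(2)
x     <- rnorm(15, mean = 1, sd = 2)
theta <- 0                      # hypothesized value under H0
n     <- length(x)

# t = (sample mean - theta) / standard error of the mean
t_stat <- (mean(x) - theta) / (sd(x) / sqrt(n))
# Two-tailed P value from the t distribution with n - 1 df
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)

res <- t.test(x, mu = theta)    # same test via t.test()
c(by_hand = t_stat, t_test = unname(res$statistic))
```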

Two population t-tests
Two population t-tests are used to test null hypotheses that two independent populations are equal with respect to some parameter (typically the mean, e.g. H0 : μ1 = μ2). The separate variances t-test (Welch's test) represents an improvement on the t-test in that it more appropriately accommodates samples with modestly unequal variances.
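A minimal sketch of the difference between the pooled and separate variances tests, using hypothetical simulated samples `a` and `b`. The degrees of freedom for Welch's test are reproduced by hand using the Welch-Satterthwaite approximation, which is what `t.test()` applies when `var.equal = FALSE` (the default).

```r
# Hypothetical simulated samples with unequal spread (not lecture data)
set.seed(3)
a <- rnorm(8, mean = 10, sd = 1)
b <- rnorm(6, mean = 11, sd = 3)

pooled <- t.test(a, b, var.equal = TRUE)   # pooled variances t-test
welch  <- t.test(a, b)                     # separate variances (Welch's) t-test

# Welch-Satterthwaite degrees of freedom by hand
va <- var(a) / length(a)
vb <- var(b) / length(b)
df_welch <- (va + vb)^2 /
  (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))

c(pooled_df = unname(pooled$parameter), welch_df = df_welch)
```

The pooled test always uses n1 + n2 − 2 degrees of freedom, whereas the Welch degrees of freedom shrink as the sample variances become more unequal.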

Paired samples t-tests
When observations are collected from a population in pairs, such that two variables are measured from each sampling unit, a paired t-test can be used to test the null hypothesis that the population mean difference between paired observations is equal to zero (H0 : μd = 0). Note that this is equivalent to a single population t-test of the null hypothesis that the population parameter is equal to the specific value of zero.
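The equivalence noted above can be checked directly. In this sketch the `before` and `after` measurements are hypothetical simulated values on the same 12 sampling units.

```r
# Hypothetical paired measurements on 12 sampling units (not lecture data)
set.seed(4)
before <- rnorm(12, mean = 50, sd = 5)
after  <- before + rnorm(12, mean = 1, sd = 2)

paired     <- t.test(after, before, paired = TRUE)  # paired t-test
one_sample <- t.test(after - before, mu = 0)        # one-sample test on differences

c(paired = unname(paired$statistic), one_sample = unname(one_sample$statistic))
```

Both calls give the same t statistic, degrees of freedom and P value.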

6.4 Assumptions
t-tests assume that samples are collected from populations that are 1) normally distributed (see section 3.1.1) and 2) equally varied, and 3) that the observations are independent of one another.

6.5 Statistical decision and power
Type I error
When rejecting a null hypothesis at the 5% level, we are accepting that there is a 5% chance that we are making an error (a Type I error), that is, rejecting a null hypothesis that is actually true.
Type II error
Conversely, when a null hypothesis is not rejected (probability of 5% or greater) even though there really is a trend or effect in the population, a Type II error has been committed. Hence, a Type II error is a failure to detect an effect that really occurs. Power is the probability of detecting such an effect (1 minus the Type II error rate).
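Both error rates can be explored by simulation. The sketch below is illustrative only: the sample size (10 per group), the effect size (one standard deviation) and the number of simulations are arbitrary assumptions. It estimates the Type I error rate when the null hypothesis is true, the Type II error rate when it is false, and compares the simulated power with the value from `power.t.test()`.

```r
set.seed(6)
alpha <- 0.05
n_sim <- 2000   # number of simulated experiments (arbitrary)

# H0 true: both samples drawn from the same population
p_null <- replicate(n_sim,
  t.test(rnorm(10), rnorm(10), var.equal = TRUE)$p.value)
type1_rate <- mean(p_null < alpha)    # should be close to alpha

# H0 false: population means differ by one standard deviation
p_alt <- replicate(n_sim,
  t.test(rnorm(10), rnorm(10, mean = 1), var.equal = TRUE)$p.value)
type2_rate <- mean(p_alt >= alpha)    # proportion of missed effects

# Analytical power for the same design
pw <- power.t.test(n = 10, delta = 1, sd = 1, sig.level = alpha)$power

c(type1 = type1_rate, type2 = type2_rate,
  power_sim = 1 - type2_rate, power_calc = pw)
```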

6.6 Robust tests
There are a number of more robust (yet less powerful) alternatives to independent samples t-tests and paired t-tests.
The Mann-Whitney-Wilcoxon test is a non-parametric (rank-based) equivalent of the independent samples t-test. It uses the ranks of the observations, rather than the actual observations, to calculate the test statistic, and tests the null hypothesis that the two sampled populations have equal distributions.
Similarly, the non-parametric Wilcoxon signed-rank test uses the sums of positive and negative signed ranked differences between paired observations to test the null hypothesis that the two sets of observations come from the same population.
Randomization tests, in which the factor levels are repeatedly shuffled so as to yield a probability distribution for the relevant statistic (such as the t-statistic) specific to the sample data, do not have any distributional assumptions.
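A randomization test of the kind described above might be sketched as follows. The data, the helper function `t_stat()` and the number of shuffles (999) are all assumptions made for illustration; this is one common way to set up such a test, not necessarily the procedure used in the lecture.

```r
# Hypothetical simulated data: two groups of 10 (not lecture data)
set.seed(7)
dv  <- c(rnorm(10, mean = 10, sd = 2), rnorm(10, mean = 13, sd = 2))
grp <- rep(c("A", "B"), each = 10)

# Hypothetical helper: pooled variances t statistic for a given labelling
t_stat <- function(dv, grp) {
  unname(t.test(dv[grp == "A"], dv[grp == "B"], var.equal = TRUE)$statistic)
}

t_obs  <- t_stat(dv, grp)   # statistic for the observed labelling
n_perm <- 999
# Shuffle the factor levels repeatedly and recompute the statistic each time
t_perm <- replicate(n_perm, t_stat(dv, sample(grp)))

# Two-tailed P: proportion of statistics at least as extreme as the observed
# one, counting the observed labelling itself as one of the arrangements
p_rand <- (sum(abs(t_perm) >= abs(t_obs)) + 1) / (n_perm + 1)
p_rand
```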

6.8 Key for simple hypothesis testing
One-sample t-test
Mean of a single sample compared to a specific fixed value (such as a predicted population mean)
> with(dataset, t.test(DV, mu = 0))
where mu is the fixed value specified by the null hypothesis (0 by default)

Independent samples t-test
one-tailed (H0 : μA ≤ μB, HA : μA > μB)
> t.test(DV ~ FACTOR, dataset, alternative = "greater")
two-tailed (H0 : μA = μB)
> t.test(DV ~ FACTOR, dataset)
For pooled variances t-tests, include the var.equal = T argument.
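With the formula interface, the direction of a one-tailed test depends on the order of the factor levels: group A is the first level of FACTOR. The small data frame below is invented purely to illustrate this; `dataset`, `DV` and `FACTOR` are the placeholder names used in the key.

```r
# Hypothetical data frame using the placeholder names from the key
dataset <- data.frame(
  DV     = c(12, 14, 15, 13, 16, 9, 10, 8, 11, 10),
  FACTOR = factor(rep(c("A", "B"), each = 5))
)

levels(dataset$FACTOR)   # the first level plays the role of group A

# HA: mean of A is greater than mean of B
res_greater <- t.test(DV ~ FACTOR, dataset, alternative = "greater")
# HA: mean of A is less than mean of B
res_less    <- t.test(DV ~ FACTOR, dataset, alternative = "less")

res_greater$estimate     # group means, A first
c(greater = res_greater$p.value, less = res_less$p.value)
```

If the groups are the wrong way round for the intended hypothesis, either switch `alternative` or reorder the levels (for example with `relevel()`).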

Independent samples t-test: Example 6A
Ward and Quinn investigated differences in the fecundity (as measured by egg production) of a predatory intertidal gastropod (Lepsiella vinosa) in two different intertidal zones (mussel zone and the higher littorinid zone).
> ward <- read.table("ward.csv", header = T, sep = ",")

Example 6A
Assess assumptions of normality and homogeneity of variance for the null hypothesis that the population mean egg production is the same for both littorinid and mussel zone Lepsiella.
> boxplot(EGGS ~ ZONE, ward)

Example 6A
> with(ward, rbind(MEAN = tapply(EGGS, ZONE, mean), VAR = tapply(EGGS, ZONE, var)))
Conclusions - There was no evidence of non-normality (boxplots not grossly asymmetrical) or unequal variance (boxplots of very similar size and variances very similar). Hence, the simple, studentized (pooled variances) t-test is likely to be reliable.

Example 6A
> t.test(EGGS ~ ZONE, ward, var.equal = T)
Conclusions - Reject the null hypothesis. Egg production by predatory gastropods (Lepsiella vinosa) was significantly greater (t77 = −5.39, P < 0.001) in mussel zones than littorinid zones on rocky intertidal shores.

Independent samples t-test: Example 6B
Separate variances, Welch's t-test
Furness and Bryant (1996) measured the metabolic rates of eight male and six female breeding northern fulmars and were interested in testing the null hypothesis that there was no difference in metabolic rate between the sexes (Box 3.2 of Quinn and Keough (2002)).
> furness <- read.table("furness.csv", header = T, sep = ",")
> boxplot(METRATE ~ SEX, furness)

> with(furness, rbind(MEAN = tapply(METRATE, SEX, mean), VAR = tapply(METRATE, SEX, var)))
Conclusions - Whilst there is no evidence of non-normality (boxplots not grossly asymmetrical), variances are a little unequal (although perhaps not grossly unequal: neither boxplot is more than three times the size of the other). Hence, a separate variances t-test is more appropriate than a pooled variances t-test.

> t.test(METRATE ~ SEX, furness, var.equal = F)
Conclusions - Do not reject the null hypothesis. Metabolic rate of male breeding northern fulmars was not found to differ significantly (t = −0.773, df = , P = 0.457) from that of females.

Paired t-test
Two samples specifically paired (each of the sampling units measured under both conditions) to reduce within-group variation.
one-tailed (H0 : μA ≤ μB, HA : μA > μB)
> with(dataset, t.test(DV1, DV2, alternative = "greater", paired = T))
> #OR for long format
> t.test(DV ~ FACTOR, dataset, alternative = "greater", paired = T)
two-tailed (H0 : μA = μB)
> with(dataset, t.test(DV1, DV2, paired = T))
> #OR for long format
> t.test(DV ~ FACTOR, dataset, paired = T)
Note: recent R releases (4.4.0 and later) reject paired = T in the formula method, so the two-vector form is the safer choice.

Paired t-test: Example 6C
To investigate the effects of lighting conditions on orb-spinning spider webs, Elgar et al. measured the horizontal (width) and vertical (height) dimensions of the webs made by 17 spiders under light and dim conditions. Accepting that the webs of individual spiders vary considerably, Elgar et al. employed a paired design in which each individual spider effectively acts as its own control. A paired t-test performs a one-sample t-test on the differences between dimensions under light and dim conditions.

Example 6C
> elgar <- read.table("elgar.csv", header = T, sep = ",")
Assess whether the differences in web width (and height) in light and dim light conditions are normally distributed.
> with(elgar, boxplot(VERTLIGH - VERTDIM))
> with(elgar, boxplot(HORIZLIG - HORIZDIM))

Example 6C
Conclusions - There is no evidence of non-normality for either the difference in widths or heights of webs under light and dim ambient conditions. Therefore paired t-tests are likely to be reliable tests of the hypotheses that the mean web dimensional differences are equal to zero.
Null hypothesis: no effect of lighting on web width
> with(elgar, t.test(HORIZLIG, HORIZDIM, paired = T))
Null hypothesis: no effect of lighting on web height
> with(elgar, t.test(VERTLIGH, VERTDIM, paired = T))
Conclusions - Orb-spinning spider webs were found to be significantly wider (t = 2.148, df = 16, P = 0.047) under dim lighting conditions than light conditions, yet were not found to differ (t = 0.965, df = 16, P = 0.349) in height.

Non-parametric Mann-Whitney-Wilcoxon (rank sum) test: Example 6D
Sokal and Rohlf presented a dataset comprising the lengths of cheliceral bases (in μm) from two samples of chigger (Trombicula lipovskyi) nymphs. These data were used to illustrate two equivalent tests (Mann-Whitney U-test and Wilcoxon two-sample test) of location.
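The equivalence of the two statistics can be seen from how R reports the test. As a sketch on hypothetical simulated samples (not the chigger data), the W statistic returned by `wilcox.test()` is the rank sum of the first sample minus its smallest possible value, which is the Mann-Whitney U form of the statistic.

```r
# Hypothetical simulated samples (not the chigger nymph data)
set.seed(8)
a <- rnorm(10, mean = 10, sd = 2)
b <- rnorm(12, mean = 12, sd = 2)

# Rank all observations together, then sum the ranks of the first sample
r        <- rank(c(a, b))
rank_sum <- sum(r[seq_along(a)])

# Subtract the minimum possible rank sum, n1 * (n1 + 1) / 2
W_hand <- rank_sum - length(a) * (length(a) + 1) / 2

W_ab <- unname(wilcox.test(a, b)$statistic)   # W as reported by R
W_ba <- unname(wilcox.test(b, a)$statistic)   # samples swapped

c(by_hand = W_hand, wilcox_test = W_ab, swapped = W_ba)
```

In the absence of ties, the statistics for the two orderings of the samples sum to n1 × n2, so either one carries the same information.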

Non-parametric Mann-Whitney-Wilcoxon (rank sum) test
one-tailed (H0 : μA ≤ μB, HA : μA > μB)
> wilcox.test(DV ~ FACTOR, dataset, alternative = "greater")
two-tailed (H0 : μA = μB)
> wilcox.test(DV ~ FACTOR, dataset)

Example 6D
> nymphs <- read.table("nymphs.csv", header = T, sep = ",")
> boxplot(LENGTH ~ SAMPLE, nymphs)
> with(nymphs, rbind(MEAN = tapply(LENGTH, SAMPLE, mean), VAR = tapply(LENGTH, SAMPLE, var)))

Example 6D
Conclusions - Whilst there is no evidence of unequal variance, there is some (possible) evidence of non-normality (boxplots slightly asymmetrical). These data will therefore be analysed using a non-parametric Mann-Whitney-Wilcoxon (rank sum) test.
> wilcox.test(LENGTH ~ SAMPLE, nymphs)
Conclusions - Reject the null hypothesis. The cheliceral base is significantly longer in nymphs from sample 1 (W = 123.5, df = 24, P = 0.023) than in those from sample 2.

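Paired Wilcoxon (signed rank) test
one-tailed (H0 : μA ≤ μB, HA : μA > μB)
> with(dataset, wilcox.test(DV1, DV2, alternative = "greater", paired = T))
> #OR for long format
> wilcox.test(DV ~ FACTOR, dataset, alternative = "greater", paired = T)
two-tailed (H0 : μA = μB)
> with(dataset, wilcox.test(DV1, DV2, paired = T))
> #OR for long format
> wilcox.test(DV ~ FACTOR, dataset, paired = T)
Note: recent R releases (4.4.0 and later) reject paired = T in the formula method, so the two-vector form is the safer choice.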

One-sample Wilcoxon (signed rank) test
> with(dataset, wilcox.test(DV, mu = 0))

Independent-sample Mann-Whitney-Wilcoxon test
one-tailed (H0 : μA ≤ μB, HA : μA > μB)
> wilcox.test(DV ~ FACTOR, dataset, alternative = "greater")
two-tailed (H0 : μA = μB)
> wilcox.test(DV ~ FACTOR, dataset)

