Statistical Data Analysis - Lecture09 25/03/03

Size of a hypothesis test

The size of a hypothesis test is its Type I error rate, i.e. the proportion of times we would reject the null hypothesis when it was actually true.
E.g. if our testing procedure has a Type I error rate of 5%, or a size of 0.05, then on average we would reject the null hypothesis 1 time in 20 (1/20 = 0.05) even if it were true.
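The "1 time in 20" interpretation can be checked by simulation. The following is a minimal sketch, not part of the lecture: it assumes a one-sample t-test on n = 20 normal observations, and the critical value 2.093 (the two-sided 5% point of Student's t with 19 degrees of freedom) is hard-coded from standard tables. The function names are illustrative.

```python
import math
import random
import statistics

# ASSUMPTION: two-sided 5% critical value of Student's t with 19 degrees of
# freedom (n = 20), taken from standard t tables.
T_CRIT_19 = 2.093


def one_sample_t(sample, mu0):
    """t-statistic: (sample mean - hypothesised mean) / standard error."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - mu0) / se


def estimate_size(n_sims=5000, n=20, seed=1):
    """Fraction of simulated experiments that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # The null is true here: the data really come from a mean-0 population.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(one_sample_t(sample, mu0=0.0)) > T_CRIT_19:
            rejections += 1
    return rejections / n_sims


size = estimate_size()
print(f"estimated size: {size:.3f}")  # expected to land near 0.05
```

Because the null hypothesis is true in every simulated experiment, each rejection is a Type I error, so the rejection fraction estimates the size.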
Power of a hypothesis test

A complementary concept is the power of a hypothesis test.
Power is a function of the Type II error rate, $\beta$, i.e. the proportion of times we would accept the null even though the alternative was true.
Power is $1 - \beta$.
E.g. if a hypothesis test has power 0.9 (or 90%) then on average 10% of the time we would fail to find a significant difference when a real difference exists.
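In symbols, using the conventional notation $\alpha$ for size and $\beta$ for the Type II error rate (the $\alpha$ is an assumption here, since the slides do not name the symbol for size), the definitions can be summarised as:

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ true}) && \text{(size, Type I error rate)} \\
\beta  &= P(\text{fail to reject } H_0 \mid H_1 \text{ true}) && \text{(Type II error rate)} \\
\text{power} &= 1 - \beta = P(\text{reject } H_0 \mid H_1 \text{ true})
\end{align*}
```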
Size and power

Size and power have a "seesaw" type relationship, i.e. better size means worse power and vice versa.
This is exactly the case with the pooled and Welch tests.
With respect to size, the Welch test uniformly outperforms the pooled t-test when the variances are not equal, but it has very poor power.
The reason for this is relatively straightforward: the effect of the Welch degrees of freedom.
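The seesaw can be illustrated numerically. The sketch below does not compare the pooled and Welch tests directly; as a simpler stand-in it assumes a two-sided one-sample z-test with known standard deviation, where power has a closed form, and shows power falling as the size is made smaller. The effect size, sigma and n are made-up values.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_test_power(alpha, effect, sigma, n):
    """Power of a two-sided one-sample z-test of size alpha.

    effect is the true mean minus the hypothesised mean.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma
    # Under the alternative Z ~ N(shift, 1); we reject when |Z| > z_crit.
    return (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)


# Hypothetical scenario: true effect 0.5, sigma 1, n = 20.
alphas = [0.20, 0.10, 0.05, 0.01]
powers = [z_test_power(a, effect=0.5, sigma=1.0, n=20) for a in alphas]
for a, p in zip(alphas, powers):
    print(f"size {a:.2f} -> power {p:.3f}")
```

With the data and effect held fixed, demanding a smaller Type I error rate raises the critical value, which is what lowers the power.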
Welch's modification

The Welch modification effectively lowers the degrees of freedom to account for the variance inequality.
Lowering the df makes us less likely to reject the null hypothesis, because the critical value of the t distribution gets larger.
But similarly, we're less likely to find a difference if there is one.
This problem is magnified even more if the "approximate" degrees of freedom are used for the Welch test.
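To see how much the degrees of freedom drop, the sketch below compares three choices. The Welch-Satterthwaite formula is the standard one; treating the slide's "approximate" degrees of freedom as min(n_x, n_y) - 1 is an assumption, since the slides do not define it here. The variances and sample sizes are made up.

```python
def pooled_df(n_x, n_y):
    """Degrees of freedom for the pooled (equal-variance) t-test."""
    return n_x + n_y - 2


def welch_df(var_x, n_x, var_y, n_y):
    """Welch-Satterthwaite degrees of freedom from the sample variances."""
    a = var_x / n_x
    b = var_y / n_y
    return (a + b) ** 2 / (a ** 2 / (n_x - 1) + b ** 2 / (n_y - 1))


def conservative_df(n_x, n_y):
    """ASSUMPTION: the 'approximate' df is the bound min(n_x, n_y) - 1."""
    return min(n_x, n_y) - 1


# Hypothetical samples: equal sizes, very unequal variances.
n_x, n_y = 10, 10
var_x, var_y = 1.0, 16.0
print("pooled df:      ", pooled_df(n_x, n_y))
print("Welch df:       ", round(welch_df(var_x, n_x, var_y, n_y), 2))
print("conservative df:", conservative_df(n_x, n_y))
```

The Welch value always lies between the conservative bound and the pooled value, and it equals the pooled value when the sample sizes and variances are both equal.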
A common theme?

We've seen three different hypothesis tests, and all three have the same form for their test statistic, namely:
"The estimate minus the hypothesised value, divided by the standard error of the estimate, has a Student t distribution"
A common theme

Let's go back and look.
In a one sample t-test we are interested in a hypothesis about the population mean, so:
the parameter is $\mu$, the population mean;
the hypothesised value is $\mu_0$, a hypothesised value of the population mean;
the estimate is $\bar{x}$, the sample mean, which estimates the population mean.
Therefore our t-statistic is
$$T = \frac{\bar{x} - \mu_0}{\mathrm{se}(\bar{x})}$$
Two sample t-test

Usually in a two sample t-test we are interested in the difference between two means, so:
the parameter is $\mu_x - \mu_y$, the difference in population means;
the hypothesised value, written here as $\delta_0$, is a hypothesised difference between the population means;
the estimate is $\bar{x} - \bar{y}$, the difference between sample means, which estimates the difference between population means.
Therefore our t-statistic is
$$T = \frac{(\bar{x} - \bar{y}) - \delta_0}{\mathrm{se}(\bar{x} - \bar{y})}$$
We can easily extend this result to the difference between two proportions. If we have sample proportions $p_x$ and $p_y$ then let the estimate be $p_x - p_y$.
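The common form can be written once and reused for each test. The sketch below is illustrative only: the function names are made up, the two-sample standard error is the unpooled one, and the unpooled standard error for two proportions is an assumption, because the formula intended on the slide is not shown.

```python
import math
import statistics


def t_statistic(estimate, hypothesised, std_error):
    """Common form: (estimate - hypothesised value) / standard error."""
    return (estimate - hypothesised) / std_error


def one_sample(x, mu0):
    """One-sample test of a population mean."""
    se = statistics.stdev(x) / math.sqrt(len(x))
    return t_statistic(statistics.mean(x), mu0, se)


def two_sample(x, y, delta0=0.0):
    """Two-sample test of a difference in means (unpooled standard error)."""
    se = math.sqrt(statistics.variance(x) / len(x)
                   + statistics.variance(y) / len(y))
    return t_statistic(statistics.mean(x) - statistics.mean(y), delta0, se)


def two_proportions(p_x, n_x, p_y, n_y, delta0=0.0):
    """Test of a difference in proportions.

    ASSUMPTION: unpooled standard error; the slide's own formula is not shown.
    """
    se = math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    return t_statistic(p_x - p_y, delta0, se)


# Made-up data, for illustration only.
print(one_sample([1, 2, 3, 4, 5], mu0=2))
print(two_sample([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]))
print(two_proportions(0.5, 100, 0.4, 100))
```

Only the estimate, the hypothesised value and the standard error change from test to test; the final division is identical in all three.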
A quiz

My significance test has a size of 20%. What does this mean? Is this a good test or a bad test with respect to size? Why?
My test has 90% power.
I carry out an experiment with a control group and a treatment group. The summary statistics are:
Write down the null hypothesis of no difference due to the treatment.
Write down the alternative two-tailed hypothesis.
Construct a test statistic to test this hypothesis.
Without using or calculating a P-value, say whether there is evidence of an effect or not.
k independent samples

When k is two, we've seen the two sample t-test.
What about when k is not two?
Recall our book data: we have sentence length data from six books by Stephen King and two by Michael Crichton, giving eight groups.
This is the setting for one way analysis of variance.
One way ANOVA

We will try to develop ANOVA concepts from a data analysis perspective.
We have k groups of data. How do we compare these groups?
We could compare them on the basis of some measure of location: means or medians.
If we use medians, then we enter the field of non-parametric statistics, which is beyond the scope of this course.
How about means? Let's assume we're going to compare the groups on the basis of their means.
We can think of each observation as consisting of two parts: some average sentence length plus a random perturbation, i.e.
Data value = mean book sentence length + error
where the error represents the deviation of a single observation from the book mean.
This boils down to our general formula:
DATA = SIGNAL + NOISE
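The decomposition can be made concrete with a few numbers. In the sketch below the sentence lengths are made up for a single hypothetical book (they are not the lecture's data); the sample mean plays the role of the signal and the deviations from it play the role of the noise.

```python
import statistics


def decompose(values):
    """Split observations into a signal (their mean) and noise (deviations)."""
    signal = statistics.mean(values)
    noise = [y - signal for y in values]
    return signal, noise


# Made-up sentence lengths (words per sentence) for one hypothetical book.
lengths = [12, 15, 9, 20, 14]
signal, noise = decompose(lengths)

print("signal:", signal)
print("noise: ", noise)

# Every data value is recovered exactly as signal + noise.
rebuilt = [signal + e for e in noise]
print("rebuilt:", rebuilt)
```

The noise terms sum to zero within the group, which is a consequence of using the group mean as the signal.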
We assume that once we have extracted the signal, the noise for each group behaves in the same way.
More explicitly, we assume that the errors have the same mean of zero and the same standard deviation $\sigma$, and come from the same distribution (usually a normal distribution).
This means our model can only "describe the data up to the noise", i.e. we can only generate the signal.
We need some notation to go further.
Let $\mu_1, \mu_2, \ldots, \mu_8$ represent the "population" book means.
We want to see whether the book means differ. If they do, then we want to know how they differ.
To do this, we proceed as usual and take random samples of pages from each book.
To keep the discussion general, suppose we have $k$ books and we sample $n_i$ pages from book $i$ (in our experiment we sampled $n_i = 50$ pages from each of $k = 8$ books).
Let $N = \sum_{i=1}^{k} n_i$ be the total number of pages sampled.
Let $y_{ij}$ be the $j$th observation from the $i$th group. E.g. $y_{12}$ is the second word count from the book "Eye of the Dragon".
Then a more mathematical way of describing our data would be
$$y_{ij} = \mu_i + \varepsilon_{ij}$$
The "errors", $\varepsilon_{ij}$, are supposed to be normally distributed with mean zero and standard deviation $\sigma$.
The means $\mu_i$ represent the book averages, and the standard deviation $\sigma$ measures the fluctuation around the mean, i.e. the within group variation.
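One way to get a feel for this model is to generate data from it. The sketch below assumes three hypothetical books with made-up population means and a common standard deviation; none of these values are estimates from the lecture's book data, and `simulate_groups` is an illustrative name.

```python
import random
import statistics


def simulate_groups(means, sigma, n_per_group, seed=0):
    """Draw y_ij = mu_i + eps_ij, with eps_ij normal(0, sigma)."""
    rng = random.Random(seed)
    return [
        [mu + rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        for mu in means
    ]


# Hypothetical population values, NOT taken from the book data.
book_means = [14.0, 16.0, 21.0]
samples = simulate_groups(book_means, sigma=5.0, n_per_group=50)

# Total number of observations: the sum of the group sizes.
total_n = sum(len(group) for group in samples)
print("total observations:", total_n)

for mu, group in zip(book_means, samples):
    print(f"true mean {mu:5.1f}  "
          f"sample mean {statistics.mean(group):6.2f}  "
          f"sample sd {statistics.stdev(group):5.2f}")
```

Each sample mean should sit near its population mean, and each sample standard deviation near the common sigma, reflecting the assumption that the noise behaves the same way in every group.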
When we write down such a model along with the assumptions about the distribution of the errors, we call this a probability model, and write
$$y_{ij} = \mu_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2)$$
We can also describe the grand mean $\mu$ and effects $\alpha_i$ by
$$\mu = \frac{1}{k} \sum_{i=1}^{k} \mu_i$$
and
$$\alpha_i = \mu_i - \mu$$
This allows us to rewrite our probability model as
$$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2)$$
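The reparameterisation into a grand mean and effects is simple arithmetic. The sketch below assumes the grand mean is the unweighted average of the group means (which matches the weighted average when all group sizes are equal, as with 50 pages per book) and uses made-up group means.

```python
def grand_mean_and_effects(group_means):
    """Return (mu, effects), with mu the unweighted mean of the group means."""
    mu = sum(group_means) / len(group_means)
    effects = [m - mu for m in group_means]
    return mu, effects


# Hypothetical book means, for illustration only.
book_means = [14.0, 16.0, 21.0]
mu, alphas = grand_mean_and_effects(book_means)

print("grand mean:", mu)
print("effects:   ", alphas)

# Each group mean is recovered as grand mean + effect.
rebuilt = [mu + a for a in alphas]
print("rebuilt:   ", rebuilt)
```

Under this definition the effects sum to zero, so the k group means are described by one grand mean and k - 1 free effects.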