Slide 1: Statistical Methods in Computer Science
Hypothesis Testing II: Single-Factor Experiments
Ido Dagan
(Slides from Empirical Methods in Computer Science, © 2006-now Gal Kaminka)

Slides 2-3: Single-Factor Experiments
- A generalization of treatment experiments
- Determine the effect of the independent variable's values (nominal) on the dependent variable:
    treatment 1:  Ind_1 & Ex_1 & Ex_2 & ... & Ex_n ==> Dep_1
    treatment 2:  Ind_2 & Ex_1 & Ex_2 & ... & Ex_n ==> Dep_2
    control:      Ex_1 & Ex_2 & ... & Ex_n ==> Dep_3
  (Ind_i are values of the independent variable; Dep_i are values of the dependent variable)
- Example: compare the performance of algorithm A to B to C ...
- Control condition: optional (e.g., to establish a baseline)

Slide 4: Single-Factor Experiments: Definitions
- The independent variable is called the factor
- Its values (being tested) are called levels
- Our goal: determine whether the levels have an effect
  - Null hypothesis: there is no effect
  - Alternative hypothesis: at least one level causes an effect
- Tool: one-way ANOVA, a simple special case of the general Analysis of Variance

Slides 5-7: The case for single-factor ANOVA (one-way ANOVA)
- We have k samples (k levels of the factor), each with its own sample mean and sample standard deviation for the dependent variable:
    treatment 1:  Ind_1 & Ex_1 & Ex_2 & ... & Ex_n ==> Dep_1
    ...
    treatment k:  Ind_k & Ex_1 & Ex_2 & ... & Ex_n ==> Dep_k
    control:      Ex_1 & Ex_2 & ... & Ex_n ==> Dep_k+1
  (the values of the independent variable are the levels of the factor; the Dep_i are values of the dependent variable)
- We want to determine whether at least one sample is different:
    H_0: M_1 = M_2 = ... = M_k
    H_1: there exist i, j such that M_i ≠ M_j
- We cannot use the tests we learned so far. Why not use a t-test to compare every pair M_i, M_j? (See the next slide.)

Slide 8: Multiple paired comparisons
- Let a_c be the probability of an error in a single comparison (the comparison's alpha: the probability of incorrectly rejecting its null hypothesis)
- 1 - a_c: probability of making no error in a single comparison
- (1 - a_c)^m: probability of no error in m comparisons (the whole experiment), under the assumption that the comparisons are independent
- a_e = 1 - (1 - a_c)^m: probability of at least one error in the experiment
- a_e quickly becomes large as m increases

Slide 9: Example
- Suppose we want to contrast 15 levels of the factor: 15 groups, k = 15
- Total number of pairwise comparisons: m = 15 * (15 - 1) / 2 = 105
- Suppose a_c = 0.05
- Then a_e = 1 - (1 - a_c)^m = 1 - (1 - 0.05)^105 = 0.9954
- We are very likely to make a Type I error!
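
A quick sanity check of this arithmetic in Python (numbers taken from the slide):

```python
# Family-wise error rate for m independent pairwise comparisons.
k = 15                    # number of levels (groups)
m = k * (k - 1) // 2      # number of pairwise comparisons: 105
a_c = 0.05                # per-comparison alpha

a_e = 1 - (1 - a_c) ** m  # probability of at least one Type I error
print(m, round(a_e, 4))   # -> 105 0.9954
```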

Slide 10: Possible solutions?
- Reduce a_c until the overall a_e level is 0.05 (or as needed)
  - Risk: the per-comparison alpha target may become unattainable
- Or ignore the experiment null hypothesis and focus on the comparisons: carry out all m comparisons
  - Expected number of errors in the m comparisons: m * a_c
  - e.g., m = 105, a_c = 0.05: 5.25 expected errors. But which ones?
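
The slide does not name a method for reducing a_c; one standard choice (an addition here, not from the slides) is the Šidák correction, with Bonferroni as a cruder alternative:

```python
# Per-comparison alpha that keeps the family-wise alpha at 0.05
# across m = 105 independent comparisons.
m, a_e_target = 105, 0.05

a_c_sidak = 1 - (1 - a_e_target) ** (1 / m)  # Sidak: inverts a_e = 1-(1-a_c)^m
a_c_bonferroni = a_e_target / m              # Bonferroni: slightly stricter
print(round(a_c_sidak, 6), round(a_c_bonferroni, 6))  # ~0.000488 ~0.000476
```

Either way the per-comparison threshold becomes tiny, which is exactly the risk the slide warns about.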

Slide 11: One-way ANOVA
- A method for testing the experiment null hypothesis H_0: all levels' sample means are equal to each other
- Key idea:
  - Estimate a variance B (between groups) under the assumption that H_0 is true
  - Estimate a "real" variance W (within groups), valid regardless of H_0
  - Use an F-test to test the hypothesis that B = W
- Assumes the variance of all groups is the same

Slides 12-13: Some preliminaries
- Let x_i,j be the j-th element in sample i (for example, x_1,2 is the 2nd element of sample 1, and x_3,4 the 4th element of sample 3)
- Let M_i be the sample mean of sample i
- Let V_i be the sample variance of sample i
- Let M be the grand sample mean (all elements, all samples)
- Let V be the grand sample variance
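
These quantities are straightforward to compute; a minimal sketch (the three samples are made up for illustration, since the slides' data table did not survive the transcript):

```python
import numpy as np

# Three samples = three levels of the factor (illustrative values only).
samples = [np.array([14.9, 15.1, 14.6]),
           np.array([10.2,  9.8, 10.5]),
           np.array([ 7.1,  7.4,  6.9])]

M_i = [s.mean() for s in samples]        # per-sample means
V_i = [s.var(ddof=1) for s in samples]   # per-sample variances (unbiased)

pooled = np.concatenate(samples)
M = pooled.mean()                        # grand sample mean
V = pooled.var(ddof=1)                   # grand sample variance
print(M_i, M)
```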

Slide 14: The variance contributing to a value
- Every element x_i,j can be rewritten as x_i,j = M + e_i,j, where e_i,j is some error component
- We can focus on the error component e_i,j = x_i,j - M, which we rewrite as:
    e_i,j = (x_i,j - M_i) + (M_i - M)

Slides 15-16: Within-group and between-group
- The rewritten form of the error component has two parts:
    e_i,j = (x_i,j - M_i) + (M_i - M)
  - Within-group component (x_i,j - M_i): variance w.r.t. the group mean
  - Between-group component (M_i - M): variance w.r.t. the grand mean
- For example, in the table: x_1,1 = 14.9, M_1 = 14.86, M = 10.8, so
    e_1,1 = (14.9 - 14.86) + (14.86 - 10.8) = 0.04 + 4.06 = 4.1
- Note the two components: most of the error (variance) here is due to the between-group part!
- Can we use this in a more general fashion?
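
The decomposition is easy to verify numerically with the slide's numbers:

```python
# Within/between split of one element's error (numbers from the slide).
x_11, M_1, M = 14.9, 14.86, 10.8

within = x_11 - M_1    # deviation from the group mean:            0.04
between = M_1 - M      # deviation of group mean from grand mean:  4.06
e_11 = x_11 - M        # total error component:                    4.1

assert abs(e_11 - (within + between)) < 1e-12
print(within, between, e_11)
```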

Slide 17: No within-group variance
- One extreme: no variance within any group, in any element (every element equals its group mean), so all of the error is between-group

Slide 18: No between-group variance
- The other extreme: no variance between groups, in any group (every group mean equals the grand mean), so all of the error is within-group

Slide 19: Comparing within-group and between-group components
- The error component of a single element is:
    e_i,j = x_i,j - M = (x_i,j - M_i) + (M_i - M)
- Let us relate this to the sample and grand sums of squares. It can be shown that:
    Σ_i Σ_j (x_i,j - M)^2 = Σ_i Σ_j (x_i,j - M_i)^2 + Σ_i N_i (M_i - M)^2
  where N_i is the number of elements in sample i. Let us rewrite this as:
    SS_total = SS_within + SS_between
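
This partition of the total sum of squares can be checked numerically; a minimal sketch with random data and unequal group sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = [rng.normal(loc, 1.0, size=n)
           for loc, n in [(14.9, 5), (10.2, 6), (7.1, 4)]]

M = np.concatenate(samples).mean()   # grand mean

ss_within = sum(((s - s.mean()) ** 2).sum() for s in samples)
ss_between = sum(len(s) * (s.mean() - M) ** 2 for s in samples)
ss_total = ((np.concatenate(samples) - M) ** 2).sum()

# SS_total = SS_within + SS_between holds for any data.
assert np.isclose(ss_total, ss_within + ss_between)
```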

Slides 20-23: From Sums of Squares (SS) to variances
- We know SS_total = SS_within + SS_between
- Convert the sums of squares to Mean Squares (variance estimates) by dividing each by its degrees of freedom:
    MS_between = SS_between / df_between, where df_between = k - 1
    MS_within  = SS_within / df_within,  where df_within = N - k
  Here k is the number of levels (samples) and N is the total number of elements across all samples
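
In code, the conversion is one division per term; a minimal sketch (the SS values and sizes are made up for illustration):

```python
# Mean squares from sums of squares and degrees of freedom.
ss_between, ss_within = 90.0, 24.0   # hypothetical sums of squares
k, N = 3, 15                         # 3 levels, 15 observations in total

ms_between = ss_between / (k - 1)    # df_between = k - 1 = 2
ms_within = ss_within / (N - k)      # df_within  = N - k = 12
print(ms_between, ms_within)         # -> 45.0 2.0
```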

Slide 24: Determining the final alpha level
- MS_within is an estimate of the (inherent) population variance
  - It does not depend on the null hypothesis (M_1 = M_2 = ... = M_k)
  - Intuition: it is an "average" of the variances in the individual groups
- MS_between estimates the population variance plus the treatment effect
  - It does depend on the null hypothesis
  - Intuition: it is similar to an estimate of the variance of the sample means, where each component is multiplied by N_i
  - Recall: the variance of a sample mean is the population variance divided by N, so N times the variance of the sample means estimates the population variance
- If the null hypothesis is true, the two values estimate the same inherent variance and should be equal up to sampling variation
- So now we have two variance estimates for testing. Use the F-test:
    F = MS_between / MS_within
  Compare to the F-distribution with (df_between, df_within) degrees of freedom and determine the alpha level (significance)
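
Putting the whole procedure together; a self-contained sketch with illustrative data (scipy's f_oneway is used only to cross-check the hand computation):

```python
import numpy as np
from scipy import stats

samples = [np.array([14.9, 15.1, 14.6, 15.0, 14.7]),
           np.array([10.2,  9.8, 10.5, 10.1, 10.4]),
           np.array([ 7.1,  7.4,  6.9,  7.2,  7.0])]

k = len(samples)                       # number of levels
N = sum(len(s) for s in samples)       # total number of elements
M = np.concatenate(samples).mean()     # grand mean

ss_within = sum(((s - s.mean()) ** 2).sum() for s in samples)
ss_between = sum(len(s) * (s.mean() - M) ** 2 for s in samples)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
F = ms_between / ms_within

p = stats.f.sf(F, k - 1, N - k)        # right tail of F(df_between, df_within)
print(F, p)

F_check, p_check = stats.f_oneway(*samples)   # library cross-check
assert np.isclose(F, F_check) and np.isclose(p, p_check)
```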

Slides 25-27: Example
- (A worked numeric example; its data tables did not survive the transcript.)
- Check the F-distribution with (2, 12) degrees of freedom: significant!
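
With df_between = 2 and df_within = 12, as on the slide, the critical value is easy to look up:

```python
from scipy import stats

# Critical value of F(2, 12) at alpha = 0.05.
f_crit = stats.f.ppf(0.95, 2, 12)
print(round(f_crit, 3))  # ~3.885: any larger observed F is significant at 0.05
```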

Slide 28: Reading the results from statistics software
You can use statistics software to run a one-way ANOVA. It will print something like this:

    Source    df    SS      MS     F       p
    between    2    173.3   86.7   32.97   p < 0.001
    within    14     31.5    2.6
    total     16    204.9

You should have no problem reading this now.
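
One way to produce such a table in Python is via statsmodels (a sketch; the column names `value` and `level` and the long-format layout are assumptions for illustration):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Long format: one row per observation, `level` is the factor.
df = pd.DataFrame({
    "value": [14.9, 15.1, 14.6, 10.2, 9.8, 10.5, 7.1, 7.4, 6.9],
    "level": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
})

model = ols("value ~ C(level)", data=df).fit()
print(sm.stats.anova_lm(model))  # columns: df, sum_sq, mean_sq, F, PR(>F)
```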

Slide 29: Summary
- Treatment and single-factor experiments
  - Independent variable: categorical
  - Dependent variable: "numerical" (ratio/interval)
- Multiple comparisons are a problem for experiment-level hypotheses: run a one-way ANOVA instead
  - Assumes the samples are normal and have equal variances
- If the result is significant, run additional post-hoc tests for the details: Tukey's procedure (the T method), LSD, Scheffé, ... (see the sketch below)
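
Tukey's procedure, for instance, is available off the shelf; a minimal sketch using statsmodels (the data reuse the illustrative values from earlier, not the slides' own):

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

values = np.array([14.9, 15.1, 14.6, 10.2, 9.8, 10.5, 7.1, 7.4, 6.9])
groups = np.array(["A", "A", "A", "B", "B", "B", "C", "C", "C"])

# All pairwise comparisons, family-wise error rate held at alpha = 0.05.
result = pairwise_tukeyhsd(values, groups, alpha=0.05)
print(result)  # one line per pair: mean difference, CI, reject yes/no
```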

