Slide 1: Statistical Methods in Computer Science
Hypothesis Testing II: Single-Factor Experiments
Ido Dagan
(Slides from Empirical Methods in Computer Science, © 2006-now Gal Kaminka)

Slides 2-3: Single-Factor Experiments
- A generalization of treatment experiments
- Determine the effect of the independent variable's values (nominal) on the dependent variable:
    treatment 1:  Ind_1 & Ex_1 & Ex_2 & ... & Ex_n ==> Dep_1
    treatment 2:  Ind_2 & Ex_1 & Ex_2 & ... & Ex_n ==> Dep_2
    control:      Ex_1 & Ex_2 & ... & Ex_n ==> Dep_3
  (Ind_i are values of the independent variable; Dep_i are values of the dependent variable)
- Example: compare the performance of algorithm A to B to C ...
- Control condition: optional (e.g., to establish a baseline)

Slide 4: Single-Factor Experiments: Definitions
- The independent variable is called the factor
- Its values (being tested) are called levels
- Our goal: determine whether the levels have an effect
  - Null hypothesis: there is no effect
  - Alternative hypothesis: at least one level causes an effect
- Tool: one-way ANOVA, a simple special case of the general Analysis of Variance

Slides 5-7: The case for single-factor ANOVA (one-way ANOVA)
- We have k samples (k levels of the factor), each with its own sample mean and sample standard deviation for the dependent variable:
    treatment 1:  Ind_1 & Ex_1 & Ex_2 & ... & Ex_n ==> Dep_1
    ...
    treatment k:  Ind_k & Ex_1 & Ex_2 & ... & Ex_n ==> Dep_k
    control:      Ex_1 & Ex_2 & ... & Ex_n ==> Dep_k+1
  (the values of the independent variable are the levels of the factor; the Dep_i are values of the dependent variable)
- We want to determine whether at least one sample is different:
    H_0: M_1 = M_2 = ... = M_k
    H_1: there exist i, j such that M_i ≠ M_j
- We cannot use the tests we learned so far. Why not use a t-test to compare every pair M_i, M_j? (See the next slide.)

Slide 8: Multiple paired comparisons
- Let a_c be the probability of an error in a single comparison (the comparison's alpha: the probability of incorrectly rejecting its null hypothesis)
- 1 - a_c: probability of making no error in a single comparison
- (1 - a_c)^m: probability of no error in m comparisons (the whole experiment), under the assumption that the comparisons are independent
- a_e = 1 - (1 - a_c)^m: probability of at least one error in the experiment
- a_e quickly becomes large as m increases

Slide 9: Example
- Suppose we want to contrast 15 levels of the factor: 15 groups, k = 15
- Total number of pairwise comparisons: m = 15 * (15 - 1) / 2 = 105
- Suppose a_c = 0.05
- Then a_e = 1 - (1 - a_c)^m = 1 - (1 - 0.05)^105 = 0.9954
- We are very likely to make a Type I error!
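
A quick sanity check of this arithmetic in Python (numbers taken from the slide):

```python
# Family-wise error rate for m independent pairwise comparisons.
k = 15                    # number of levels (groups)
m = k * (k - 1) // 2      # number of pairwise comparisons: 105
a_c = 0.05                # per-comparison alpha

a_e = 1 - (1 - a_c) ** m  # probability of at least one Type I error
print(m, round(a_e, 4))   # -> 105 0.9954
```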

Slide 10: Possible solutions?
- Reduce a_c until the overall a_e level is 0.05 (or as needed)
  - Risk: the per-comparison alpha target may become unattainable
- Or ignore the experiment null hypothesis and focus on the comparisons: carry out all m comparisons
  - Expected number of errors in the m comparisons: m * a_c
  - e.g., m = 105, a_c = 0.05: 5.25 expected errors. But which ones?
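
The slide does not name a method for reducing a_c; one standard choice (an addition here, not from the slides) is the Šidák correction, with Bonferroni as a cruder alternative:

```python
# Per-comparison alpha that keeps the family-wise alpha at 0.05
# across m = 105 independent comparisons.
m, a_e_target = 105, 0.05

a_c_sidak = 1 - (1 - a_e_target) ** (1 / m)  # Sidak: inverts a_e = 1-(1-a_c)^m
a_c_bonferroni = a_e_target / m              # Bonferroni: slightly stricter
print(round(a_c_sidak, 6), round(a_c_bonferroni, 6))  # ~0.000488 ~0.000476
```

Either way the per-comparison threshold becomes tiny, which is exactly the risk the slide warns about.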

Slide 11: One-way ANOVA
- A method for testing the experiment null hypothesis H_0: all levels' sample means are equal to each other
- Key idea:
  - Estimate a variance B (between groups) under the assumption that H_0 is true
  - Estimate a "real" variance W (within groups), valid regardless of H_0
  - Use an F-test to test the hypothesis that B = W
- Assumes the variance of all groups is the same

Slides 12-13: Some preliminaries
- Let x_i,j be the j-th element in sample i (for example, x_1,2 is the 2nd element of sample 1, and x_3,4 the 4th element of sample 3)
- Let M_i be the sample mean of sample i
- Let V_i be the sample variance of sample i
- Let M be the grand sample mean (all elements, all samples)
- Let V be the grand sample variance
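
These quantities are straightforward to compute; a minimal sketch (the three samples are made up for illustration, since the slides' data table did not survive the transcript):

```python
import numpy as np

# Three samples = three levels of the factor (illustrative values only).
samples = [np.array([14.9, 15.1, 14.6]),
           np.array([10.2,  9.8, 10.5]),
           np.array([ 7.1,  7.4,  6.9])]

M_i = [s.mean() for s in samples]        # per-sample means
V_i = [s.var(ddof=1) for s in samples]   # per-sample variances (unbiased)

pooled = np.concatenate(samples)
M = pooled.mean()                        # grand sample mean
V = pooled.var(ddof=1)                   # grand sample variance
print(M_i, M)
```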

Slide 14: The variance contributing to a value
- Every element x_i,j can be rewritten as x_i,j = M + e_i,j, where e_i,j is some error component
- We can focus on the error component e_i,j = x_i,j - M, which we rewrite as:
    e_i,j = (x_i,j - M_i) + (M_i - M)

Slides 15-16: Within-group and between-group
- The rewritten form of the error component has two parts:
    e_i,j = (x_i,j - M_i) + (M_i - M)
  - Within-group component (x_i,j - M_i): variance w.r.t. the group mean
  - Between-group component (M_i - M): variance w.r.t. the grand mean
- For example, in the table: x_1,1 = 14.9, M_1 = 14.86, M = 10.8, so
    e_1,1 = (14.9 - 14.86) + (14.86 - 10.8) = 0.04 + 4.06 = 4.1
- Note the two components: most of the error (variance) here is due to the between-group part!
- Can we use this in a more general fashion?
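
The decomposition is easy to verify numerically with the slide's numbers:

```python
# Within/between split of one element's error (numbers from the slide).
x_11, M_1, M = 14.9, 14.86, 10.8

within = x_11 - M_1    # deviation from the group mean:            0.04
between = M_1 - M      # deviation of group mean from grand mean:  4.06
e_11 = x_11 - M        # total error component:                    4.1

assert abs(e_11 - (within + between)) < 1e-12
print(within, between, e_11)
```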

Slide 17: No within-group variance
- One extreme: no variance within any group, in any element (every element equals its group mean), so all of the error is between-group

Slide 18: No between-group variance
- The other extreme: no variance between groups, in any group (every group mean equals the grand mean), so all of the error is within-group

Slide 19: Comparing within-group and between-group components
- The error component of a single element is:
    e_i,j = x_i,j - M = (x_i,j - M_i) + (M_i - M)
- Let us relate this to the sample and grand sums of squares. It can be shown that:
    Σ_i Σ_j (x_i,j - M)^2 = Σ_i Σ_j (x_i,j - M_i)^2 + Σ_i N_i (M_i - M)^2
  where N_i is the number of elements in sample i. Let us rewrite this as:
    SS_total = SS_within + SS_between
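
This partition of the total sum of squares can be checked numerically; a minimal sketch with random data and unequal group sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
samples = [rng.normal(loc, 1.0, size=n)
           for loc, n in [(14.9, 5), (10.2, 6), (7.1, 4)]]

M = np.concatenate(samples).mean()   # grand mean

ss_within = sum(((s - s.mean()) ** 2).sum() for s in samples)
ss_between = sum(len(s) * (s.mean() - M) ** 2 for s in samples)
ss_total = ((np.concatenate(samples) - M) ** 2).sum()

# SS_total = SS_within + SS_between holds for any data.
assert np.isclose(ss_total, ss_within + ss_between)
```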

Slides 20-23: From Sums of Squares (SS) to variances
- We know SS_total = SS_within + SS_between
- Convert the sums of squares to Mean Squares (variance estimates) by dividing each by its degrees of freedom:
    MS_between = SS_between / df_between, where df_between = k - 1
    MS_within  = SS_within / df_within,  where df_within = N - k
  Here k is the number of levels (samples) and N is the total number of elements across all samples
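
In code, the conversion is one division per term; a minimal sketch (the SS values and sizes are made up for illustration):

```python
# Mean squares from sums of squares and degrees of freedom.
ss_between, ss_within = 90.0, 24.0   # hypothetical sums of squares
k, N = 3, 15                         # 3 levels, 15 observations in total

ms_between = ss_between / (k - 1)    # df_between = k - 1 = 2
ms_within = ss_within / (N - k)      # df_within  = N - k = 12
print(ms_between, ms_within)         # -> 45.0 2.0
```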

Slide 24: Determining the final alpha level
- MS_within is an estimate of the (inherent) population variance
  - It does not depend on the null hypothesis (M_1 = M_2 = ... = M_k)
  - Intuition: it is an "average" of the variances in the individual groups
- MS_between estimates the population variance plus the treatment effect
  - It does depend on the null hypothesis
  - Intuition: it is similar to an estimate of the variance of the sample means, where each component is multiplied by N_i
  - Recall: the variance of a sample mean is the population variance divided by N, so N times the variance of the sample means estimates the population variance
- If the null hypothesis is true, the two values estimate the same inherent variance and should be equal up to sampling variation
- So now we have two variance estimates for testing. Use the F-test:
    F = MS_between / MS_within
  Compare to the F-distribution with (df_between, df_within) degrees of freedom and determine the alpha level (significance)
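
Putting the whole procedure together; a self-contained sketch with illustrative data (scipy's f_oneway is used only to cross-check the hand computation):

```python
import numpy as np
from scipy import stats

samples = [np.array([14.9, 15.1, 14.6, 15.0, 14.7]),
           np.array([10.2,  9.8, 10.5, 10.1, 10.4]),
           np.array([ 7.1,  7.4,  6.9,  7.2,  7.0])]

k = len(samples)                       # number of levels
N = sum(len(s) for s in samples)       # total number of elements
M = np.concatenate(samples).mean()     # grand mean

ss_within = sum(((s - s.mean()) ** 2).sum() for s in samples)
ss_between = sum(len(s) * (s.mean() - M) ** 2 for s in samples)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (N - k)
F = ms_between / ms_within

p = stats.f.sf(F, k - 1, N - k)        # right tail of F(df_between, df_within)
print(F, p)

F_check, p_check = stats.f_oneway(*samples)   # library cross-check
assert np.isclose(F, F_check) and np.isclose(p, p_check)
```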

Slides 25-27: Example
- (A worked numeric example; its data tables did not survive the transcript.)
- Check the F-distribution with (2, 12) degrees of freedom: significant!
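
With df_between = 2 and df_within = 12, as on the slide, the critical value is easy to look up:

```python
from scipy import stats

# Critical value of F(2, 12) at alpha = 0.05.
f_crit = stats.f.ppf(0.95, 2, 12)
print(round(f_crit, 3))  # ~3.885: any larger observed F is significant at 0.05
```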

Slide 28: Reading the results from statistics software
You can use statistics software to run a one-way ANOVA. It will print something like this:

    Source    df    SS      MS     F       p
    between    2    173.3   86.7   32.97   p < 0.001
    within    14     31.5    2.6
    total     16    204.9

You should have no problem reading this now.
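
One way to produce such a table in Python is via statsmodels (a sketch; the column names `value` and `level` and the long-format layout are assumptions for illustration):

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Long format: one row per observation, `level` is the factor.
df = pd.DataFrame({
    "value": [14.9, 15.1, 14.6, 10.2, 9.8, 10.5, 7.1, 7.4, 6.9],
    "level": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
})

model = ols("value ~ C(level)", data=df).fit()
print(sm.stats.anova_lm(model))  # columns: df, sum_sq, mean_sq, F, PR(>F)
```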

Slide 29: Summary
- Treatment and single-factor experiments
  - Independent variable: categorical
  - Dependent variable: "numerical" (ratio/interval)
- Multiple comparisons are a problem for experiment-level hypotheses: run a one-way ANOVA instead
  - Assumes the samples are normal and have equal variances
- If the result is significant, run additional post-hoc tests for the details: Tukey's procedure (the T method), LSD, Scheffé, ... (see the sketch below)
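
Tukey's procedure, for instance, is available off the shelf; a minimal sketch using statsmodels (the data reuse the illustrative values from earlier, not the slides' own):

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

values = np.array([14.9, 15.1, 14.6, 10.2, 9.8, 10.5, 7.1, 7.4, 6.9])
groups = np.array(["A", "A", "A", "B", "B", "B", "C", "C", "C"])

# All pairwise comparisons, family-wise error rate held at alpha = 0.05.
result = pairwise_tukeyhsd(values, groups, alpha=0.05)
print(result)  # one line per pair: mean difference, CI, reject yes/no
```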

