Empirical Research Methods in Computer Science Lecture 4 November 2, 2005 Noah Smith
Today Review bootstrap estimate of se (from homework). Review sign and permutation tests for paired samples. Lots of examples of hypothesis tests.
Recall... There is a true value of the statistic, but we don’t know it. We can compute the sample statistic. We know sample means are normally distributed (as n gets big): x̄ ~ N(μ, σ²/n).
But we don’t know anything about the distribution of other sample statistics (medians, correlations, etc.)!
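The claim about sample means can be checked by simulation. A minimal Python sketch (the seed, the exponential distribution, and the sample sizes are my own illustrative choices): draw many samples from a skewed distribution and look at how the sample mean behaves.

```python
import random
import statistics

random.seed(0)

# Draw many samples from a skewed (exponential) distribution and look
# at the spread of the sample mean: by the CLT it is approximately
# normal around the true mean (1.0 for Exponential(1)), with standard
# deviation sigma/sqrt(n).
def sample_mean(n):
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean(400) for _ in range(2000)]
print(round(statistics.fmean(means), 2))   # close to the true mean, 1.0
print(round(statistics.stdev(means), 2))   # close to sigma/sqrt(n) = 1/20 = 0.05
```

The same simulation with the sample median in place of the mean has no such textbook formula for its spread, which is the gap the bootstrap fills.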
Bootstrap world [Diagram: real world vs. bootstrap world. Real world: unknown distribution F → observed random sample X → statistic of interest. Bootstrap world: empirical distribution → bootstrap random sample X* → bootstrap replication. The bootstrap replications give statistics about the estimate (e.g., standard error).]
Bootstrap estimate of se Run B bootstrap replicates, and compute the statistic each time: θ*[1], θ*[2], θ*[3], ..., θ*[B]. mean(θ*) = (1/B) Σ_b θ*[b] (mean of θ* across replications); se_boot = sqrt( (1/(B–1)) Σ_b (θ*[b] – mean(θ*))² ) (sample standard deviation of θ* across replications).
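The procedure above can be sketched in Python (the function name, the choice of statistic, and the data are illustrative):

```python
import random
import statistics

random.seed(0)

def bootstrap_se(data, stat, B=2000):
    """Bootstrap estimate of the standard error of `stat`: resample the
    data with replacement B times, recompute the statistic on each
    replicate, and take the sample standard deviation of the B values."""
    n = len(data)
    reps = [stat([random.choice(data) for _ in range(n)]) for _ in range(B)]
    return statistics.stdev(reps)

# e.g. standard error of the sample median, which has no simple
# closed form for general distributions
data = [random.gauss(0, 1) for _ in range(100)]
print(bootstrap_se(data, statistics.median))
```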
Paired-Sample Design pairs (x_i, y_i), x ~ distribution F, y ~ distribution G. How do F and G differ?
Sign Test H0: F and G have the same median: median(F) – median(G) = 0; Pr(x > y) = 0.5. sign(x – y) ~ binomial distribution; compute bin(N+, 0.5).
Sign Test nonparametric (no assumptions about the data) closed form (no random sampling)
Example: gzip speed build gzip with –O2 or with –O0 on about 650 files out of 1000, gzip-O2 was faster binomial distribution, p = 0.5, n = 1000 p < 3 x
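The binomial computation behind this sign test can be sketched as follows (the function name is my own; the 650-of-1000 count is from the slide):

```python
from math import comb

def sign_test_p(n_plus, n):
    """Two-sided sign test: under H0, sign(x - y) is a fair coin, so
    N+ ~ Binomial(n, 0.5). The p-value is the probability of a count
    at least as far from n/2 as the one observed."""
    k = max(n_plus, n - n_plus)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# gzip example from the slides: -O2 faster on about 650 of 1000 files
print(sign_test_p(650, 1000))   # vanishingly small
```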
Permutation Test H0: F = G. Suppose the difference in sample means is d. How likely is this difference (or a greater one) under H0? For i = 1 to P: randomly permute each pair (x_i, y_i); compute the difference in sample means. p ≈ fraction of the P permuted differences at least as extreme as d.
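A minimal Python sketch of this paired permutation test, where permuting within a pair amounts to flipping the sign of that pair's difference; the example data are made up:

```python
import random
import statistics

random.seed(0)

def paired_permutation_test(xs, ys, P=10000):
    """Paired permutation test of H0: F = G. Under H0 each pair
    (x_i, y_i) is exchangeable, so randomly swap within pairs
    (i.e., flip the sign of each difference) and count how often
    the permuted mean difference is as extreme as the observed one."""
    diffs = [x - y for x, y in zip(xs, ys)]
    observed = statistics.fmean(diffs)
    hits = 0
    for _ in range(P):
        permuted = [d if random.random() < 0.5 else -d for d in diffs]
        if abs(statistics.fmean(permuted)) >= abs(observed):
            hits += 1
    return hits / P

# hypothetical paired runtimes: ys systematically ~0.5 slower than xs
xs = [random.gauss(10, 1) for _ in range(50)]
ys = [x + random.gauss(0.5, 0.3) for x in xs]
print(paired_permutation_test(xs, ys))   # small p: reject H0
```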
Permutation Test nonparametric (no assumptions about the data) randomized test
Example: gzip speed 1000 permutations: the difference of sample means under H0 is centered on 0; the observed difference is very extreme; p ≈ 0.
Comparing speed is tricky! It is very difficult to control for everything that could affect runtime. Solution 1: do the best you can. Solution 2: many runs, and then do ANOVA tests (or their nonparametric equivalents). “Is there more variance between conditions than within conditions?”
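The quoted question (more variance between conditions than within?) is exactly what the one-way ANOVA F statistic measures. A hand-rolled sketch, with made-up runtime data for three hypothetical conditions:

```python
import statistics

def one_way_F(groups):
    """One-way ANOVA F statistic: the ratio of between-condition
    variance to within-condition variance. A large F suggests the
    conditions genuinely differ."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand = statistics.fmean(x for g in groups for x in g)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (N - k))

# made-up runtimes (seconds) for three conditions
fast = [1.0, 1.1, 0.9, 1.0]
slow = [2.0, 2.1, 1.9, 2.2]
same = [1.0, 1.0, 1.1, 0.9]
print(one_way_F([fast, slow, same]))   # large F: conditions differ
```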
Sampling method 1
for r = 1 to 10
  for each file f
    for each program p
      time p on f
Result (gzip first) student 2’s program faster than gzip!
Result (student first) student 2’s program is slower than gzip!
Sampling method 1
for r = 1 to 10
  for each file f
    for each program p
      time p on f
Order effects Well-known in psychology. What the subject does at time t will affect what she does at time t+1.
Sampling method 2
for r = 1 to 10
  for each program p
    for each file f
      time p on f
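The two sampling methods can be sketched as a timing harness; `time_run` and the program list are hypothetical stand-ins for the real experiment.

```python
import time

def time_run(program, f):
    """Hypothetical stand-in: time one run of `program` on input `f`."""
    start = time.perf_counter()
    program(f)
    return time.perf_counter() - start

def method_1(programs, files, runs=10):
    """Interleave programs within each file: every timing of one
    program immediately follows a run of the other, so order effects
    (e.g., cache warm-up) can systematically favor whichever runs second."""
    return [(p.__name__, f, time_run(p, f))
            for _ in range(runs) for f in files for p in programs]

def method_2(programs, files, runs=10):
    """Block by program: each program processes every file in sequence,
    so one program's runs do not immediately precede the other's."""
    return [(p.__name__, f, time_run(p, f))
            for _ in range(runs) for p in programs for f in files]

# hypothetical stand-ins for the two programs under test
def gzip_O2(f): pass
def student(f): pass

results = method_1([gzip_O2, student], ["file1", "file2"], runs=2)
print(len(results))   # 2 runs x 2 files x 2 programs = 8
```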
Result gzip wins
Sign and Permutation Tests [Diagram, built up over four slides: the space of all distribution pairs (F, G), organized by median(F) and median(G). One region marks where the sign test rejects H0, another where the permutation test rejects H0, and the final slide overlays the two regions.]
There are other tests! We have chosen two that are nonparametric easy to implement Others include: Wilcoxon Signed Rank Test Kruskal-Wallis (nonparametric “ANOVA”)
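For reference, the Wilcoxon signed-rank statistic can be sketched with its normal approximation. This simplified version ignores tied |differences|; a real analysis would use a library implementation such as scipy.stats.wilcoxon.

```python
import math

def wilcoxon_signed_rank_z(xs, ys):
    """Wilcoxon signed-rank test, normal approximation, no handling of
    tied |differences|: rank the nonzero differences by absolute value,
    sum the ranks of the positive ones (W+), and standardize W+ by its
    mean n(n+1)/4 and variance n(n+1)(2n+1)/24 under H0."""
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    ranked = sorted(diffs, key=abs)
    w_plus = sum(rank for rank, d in enumerate(ranked, start=1) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (w_plus - mean) / math.sqrt(var)

# made-up paired data where every x exceeds its y by 1
xs = list(range(1, 21))
ys = [x - 1 for x in xs]
print(wilcoxon_signed_rank_z(xs, ys))   # large positive z: reject H0
```

Unlike the sign test, the signed-rank test uses the magnitudes of the differences, not just their signs, so it is usually more powerful when the differences are roughly symmetric.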
Pre-increment? Conventional wisdom: “Better to use ++x than to use x++.” Really, with a modern compiler?
Two (toy) programs
for(i = 0; i < (1 << 30); ++i) j = ++k;
for(i = 0; i < (1 << 30); i++) j = k++;
Ran each 200 times (interleaved); the difference in mean runtimes was significant, p well below .05.
What?
leal -8(%ebp), %eax
incl (%eax)
movl -8(%ebp), %eax
vs.
leal -8(%ebp), %edx
incl (%edx)
(%edx is not used anywhere else)
Conclusion Compile with –O and the assembly code is identical!
Why was this a dumb experiment?
Pre-increment, take 2 Take gzip source code. Replace all post-increments with pre-increments, in places where semantics won’t change. Run on 1000 files, 10 times each. Compare average runtime by file.
Sign test p = 8.5 x 10^-8
Permutation test
Conclusion Preincrementing is faster!... but what about –O? sign test: p = [?]; permutation test: p = [?]. Preincrement matters without an optimizing compiler.
Joke.
Your programs... 8 students had a working program both weeks. 6 people changed their code. 1 person changed nothing. 1 person changed to –O3. 3 people’s programs were lossy in week 1. Everyone’s was lossy in week 2!
Your programs! Was there an improvement on compression between the two versions? H 0 : No. Find sampling distribution of difference in means, using permutations.
Student 1 (lossless week 1)
Compression < 1?
Student 2: worse compression
Compression < 1?
Student 3
Student 4 (lossless week 1)
Student 5 (lossless week 1)
Student 6
Student 7
Student 8
Homework Assignment 2 — 6 experiments:
1. Does your program compress text or images better?
2. What about variance of compression?
3. What about gzip’s compression?
4. Variance of gzip’s compression?
5. Was there a change in the compression of your program from week 1 to week 2?
6. In the runtime?
Remainder of the course
11/9: EDA
11/16: Regression and learning
11/23: Happy Thanksgiving!
11/30: Statistical debugging
12/7: Review, Q&A
Saturday 12/17, 2-5pm: Exam