Introduction to resampling in MATLAB. Feb 5, 2009. Austin Hilliard.



So you've done an experiment...
Two independent datasets: control, experimental
You have n numbers in each dataset, representing independent measurements
Question: did the experimental manipulation have an effect, i.e. did it make a difference?

The question restated
Is there a significant (i.e. "real") numerical difference between the control and experimental datasets?
What is the probability that the two datasets are samples from two different/independent distributions?
What is the probability that the two datasets are NOT samples from the same underlying distribution?

Answering the question
What do we need?
A number(s) to represent/summarize each dataset → data descriptors
A test for comparing these numbers → test statistic
A way to assess the significance of the test statistic → a distribution of test statistics generated from datasets that ARE from the same underlying distribution

Descriptors and tests
Know your data! Visualize it.
Does it have a central tendency? (i.e. is the histogram vaguely mound shaped?)
- If yes, use the mean as descriptor
- If no, the median may be better
Typical test: compute the difference (subtraction) between descriptors
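As a quick illustration of why this choice matters, here is a small Python/NumPy sketch (the datasets are simulated for illustration, not taken from the slides): on mound-shaped data the mean and median nearly coincide, while on skewed data they disagree noticeably, so the mean can be a misleading descriptor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Roughly symmetric ("mound shaped") data: mean and median nearly agree
symmetric = rng.normal(50, 5, size=1000)

# Heavily skewed data: a few large values pull the mean away from the median
skewed = rng.exponential(scale=10, size=1000)

print(abs(np.mean(symmetric) - np.median(symmetric)))  # small
print(abs(np.mean(skewed) - np.median(skewed)))        # noticeably larger
```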

We'll discuss how to assess the significance of your test statistic soon; that's the resampling part...
But first, let's visualize some data.

Simulate and inspect data

ctrl = 10.*(5.75+randn(25,1));      % simulated control data
exp = 10.*(5+randn(25,1));          % simulated experimental data
                                    % (note: this shadows MATLAB's
                                    % built-in exp() function)

boxplot([ctrl exp])

groups = {'control' 'experimental'};
boxplot([ctrl exp],'labels',groups)

Central tendency?

hist(ctrl)
title('Control group')

figure                              % new figure so the second histogram
hist(exp)                           % doesn't replace the first
title('Experimental group')

Compute descriptors and test stat
Our data looks pretty mound shaped, so let's use the mean as data descriptor.
Take the absolute value of the difference as our test stat.

ctrl_d = mean(ctrl);
exp_d = mean(exp);
test_stat = abs(ctrl_d - exp_d);

Assessing the test statistic
What we would really like to test: the probability that the two datasets are samples from two different/independent distributions.
Problem: we don't know what those hypothetical independent distributions are; if we did, we wouldn't have to go through all of this!

Assessing the test statistic
Solution: compute the probability that the two datasets are NOT samples from the same underlying distribution.
How to do this?
Start by assuming the datasets are samples from the same distribution → null hypothesis
See what the test statistic looks like when the null hypothesis is true

Assessing the test statistic
We need to generate a distribution of the test statistic when the datasets really ARE from the same underlying distribution → the distribution of the test stat under the null hypothesis.
Then we can quantify how (a)typical the value of our test stat is under the null hypothesis:
if typical, our datasets are likely from the same distribution → no effect of experiment
if atypical, there is a good chance the datasets are from different distributions → experiment had an effect

Generate the null distribution using bootstrap
Basic procedure:
Combine datasets
Randomly sample (w/ replacement) from the combined dataset to create two pseudo datasets w/ the same n as the real datasets
Compute descriptors and test statistic for the pseudo datasets
Repeat 10,000 times, keeping track of the pseudo test stat
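The steps above can also be checked outside MATLAB; here is a minimal Python/NumPy sketch of the same procedure, using simulated data in the spirit of the earlier demo (all names and the random seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated stand-ins for the control and experimental groups
ctrl = 10 * (5.75 + rng.standard_normal(25))
expt = 10 * (5.00 + rng.standard_normal(25))

test_stat = abs(ctrl.mean() - expt.mean())

box = np.concatenate([ctrl, expt])    # step 1: combine datasets
n_trials = 10_000
pseudo_stats = np.empty(n_trials)

for i in range(n_trials):
    # step 2: sample w/ replacement to build pseudo groups of the same sizes
    pseudo_ctrl = rng.choice(box, size=len(ctrl), replace=True)
    pseudo_expt = rng.choice(box, size=len(expt), replace=True)
    # step 3: same descriptor and test statistic as for the real data
    pseudo_stats[i] = abs(pseudo_ctrl.mean() - pseudo_expt.mean())

# p-value: fraction of pseudo test stats bigger than the actual one
pval = np.mean(pseudo_stats > test_stat)
print(pval)
```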

Combine datasets and resample

box = [ctrl; exp];                  % combine datasets into 'box'
                                    % could use concat()

pseudo_stat_dist = zeros(1,10000);  % create vector 'pseudo_stat_dist'
                                    % to store results of resampling

% --- start resampling ---
for trials = 1:10000                      % resample 10,000 times

    pseudo_ctrl = ...                     % create pseudo groups to be
        sample(length(ctrl), box);        % same size as actual groups
    pseudo_exp = ...                      % (sample(n, box) is a course
        sample(length(exp), box);         % helper: draws n values from
                                          % box w/ replacement)

    pseudo_ctrl_d = ...                   % compute pseudo group
        mean(pseudo_ctrl);                % descriptors
    pseudo_exp_d = ...
        mean(pseudo_exp);

    pseudo_stat = ...                     % compute pseudo test stat
        abs(pseudo_ctrl_d - pseudo_exp_d);

    pseudo_stat_dist(trials) = ...        % store pseudo test stat to
        pseudo_stat;                      % build null distribution
end

Now what?
Compute how many values of the pseudo test stat are greater than our actual test stat, then divide by the total number of data points in the null distribution.
This approximates the likelihood of getting our actual test stat (i.e. the likelihood of seeing a difference as big as we see) if our two datasets were truly from the same underlying distribution → p-value

Compute p-val and visualize null distribution

bigger = count(pseudo_stat_dist > test_stat);   % count pseudo test stats
                                                % bigger than actual stat
                                                % (count() is a course
                                                % helper; sum() also works)

pval = bigger / length(pseudo_stat_dist)        % divide by total number
                                                % of pseudo test stats to
                                                % compute p-value
                                                % could use proportion()

hist(pseudo_stat_dist)                          % show histogram of pseudo
title('Pseudo test stat distribution')          % test stats
xlabel('Pseudo test stat')
ylabel('Frequency')

How to interpret the p-value?
Assuming that the null hypothesis is true, the p-value is the likelihood that we would see a value of the test statistic that is greater than the value of our actual test statistic.
Restated → assuming both groups really are from the same underlying distribution, the p-value is the likelihood that we would see a difference between them as big as we do.

“Paired” data
What if your two datasets are not actually independent?
Maybe you have a single group of subjects/birds/cells/etc. with measurements taken under different conditions.
Important: the measurements within each condition must still be independent.

Bootstrap to test paired data
Basic strategy is the same, but...
Instead of one descriptor per group, have as many descriptors as data pairs: compute the difference between the measurement under condition 1 and the measurement under condition 2
Test statistic is the mean/median/etc. of the differences
Slightly different resampling procedure

Visualize data
Investigate distributions for the raw data; also plot the distribution of the differences between paired data-points.
Why? → The test statistic is computed from these differences, not the raw datasets.

Resampling procedure
Randomly sample 1 or -1 and store in a vector
Multiply by the vector of differences: this randomizes the direction of change
Compute test stat (mean, median, etc.) for the randomized differences
Repeat
Now we have the distribution of the test statistic if the direction of change is random, i.e. NOT dependent on experimental conditions
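The sign-flipping procedure above can likewise be sketched in Python/NumPy. The paired data here is simulated with a built-in shift between conditions, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical paired measurements: same subjects under two conditions,
# with a real shift of +3 built into the "experimental" condition
ctrl = 50 + 5 * rng.standard_normal(20)
expt = ctrl + 3 + 2 * rng.standard_normal(20)

diffs = ctrl - expt
test_stat = abs(diffs.mean())

n_trials = 10_000
pseudo_stats = np.empty(n_trials)
for i in range(n_trials):
    # randomize the direction of each difference with a +/-1 vector
    signs = rng.choice([-1, 1], size=len(diffs))
    pseudo_stats[i] = abs((signs * diffs).mean())

pval = np.mean(pseudo_stats > test_stat)
print(pval)   # should be small, since a real shift was built in
```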

Compute descriptors and test stat

diffs = ctrl - exp;                 % compute change from ctrl condition
                                    % to exp condition

test_stat = abs(mean(diffs));       % compute test statistic

pseudo_stat_dist = zeros(1,10000);  % create vector to store pseudo test
                                    % stats generated from resampling

Resampling

for trials = 1:10000                          % resample 10,000 times

    signs = sample(length(diffs), [-1 1])';   % randomly create vector of 1's
                                              % and -1's same size as diffs

    pseudo_diffs = signs .* diffs;            % multiply by actual diffs to
                                              % randomize direction of diff

    pseudo_stat = abs(mean(pseudo_diffs));    % compute pseudo test stat from
                                              % pseudo diffs

    pseudo_stat_dist(trials) = pseudo_stat;   % store pseudo test stat to
                                              % grow null distribution
end

Compute p-val and visualize null distribution

bigger = count(pseudo_stat_dist > test_stat);   % count pseudo test stats
                                                % bigger than actual stat

pval = bigger / length(pseudo_stat_dist)        % divide by total number
                                                % of pseudo test stats to
                                                % compute p-value
                                                % could use proportion()

hist(pseudo_stat_dist)                          % show histogram of pseudo
title('Pseudo test stat distribution')          % test stats
xlabel('Pseudo test stat')
ylabel('Frequency')

function [] = bootstrap2groupsMEAN_PAIRED (ctrl, exp)

% --- prepare for resampling ---

diffs = ctrl - exp;                 % compute change from ctrl condition
                                    % to exp condition

test_stat = abs(mean(diffs))        % compute test statistic

pseudo_stat_dist = zeros(1,10000);  % create vector to store pseudo test
                                    % stats generated from resampling

% --- resampling ---
for trials = 1:10000                          % resample 10,000 times

    signs = sample(length(diffs), [-1 1])';   % randomly create vector of 1's
                                              % and -1's same size as diffs

    pseudo_diffs = signs .* diffs;            % multiply by actual diffs to
                                              % randomize direction of diff

    pseudo_stat = abs(mean(pseudo_diffs));    % compute pseudo test stat from
                                              % pseudo diffs

    pseudo_stat_dist(trials) = pseudo_stat;   % store pseudo test stat to
                                              % grow null distribution
end
% --- end resampling ---

bigger = count(pseudo_stat_dist > test_stat);   % count pseudo test stats
                                                % bigger than actual stat

pval = bigger / length(pseudo_stat_dist)        % divide by total number
                                                % of pseudo test stats to
                                                % compute p-value
                                                % could use proportion()

hist(pseudo_stat_dist)                          % show histogram of pseudo
title('Pseudo test stat distribution')          % test stats
xlabel('Pseudo test stat')
ylabel('Frequency')

function [] = bootstrap2groupsMEANS (ctrl, exp)
%
% function [] = bootstrap2groupsMEANS (ctrl, exp)
%
% 'ctrl' and 'exp' are assumed to be column vectors; this is
% necessary for creation of 'box'
%

% --- prepare for resampling ---

ctrl_d = mean(ctrl);                % compute descriptors
exp_d = mean(exp);                  % 'ctrl_d' and 'exp_d'

test_stat = abs(ctrl_d - exp_d)     % compute test statistic 'test_stat'

box = [ctrl; exp];                  % combine datasets into 'box'
                                    % could use concat()

pseudo_stat_dist = zeros(1,10000);  % create vector 'pseudo_stat_dist'
                                    % to store results of resampling

% --- start resampling ---
for trials = 1:10000                      % resample 10,000 times

    pseudo_ctrl = ...                     % create pseudo groups to be
        sample(length(ctrl), box);        % same size as actual groups
    pseudo_exp = ...
        sample(length(exp), box);

    pseudo_ctrl_d = ...                   % compute pseudo group
        mean(pseudo_ctrl);                % descriptors
    pseudo_exp_d = ...
        mean(pseudo_exp);

    pseudo_stat = ...                     % compute pseudo test stat
        abs(pseudo_ctrl_d - pseudo_exp_d);

    pseudo_stat_dist(trials) = ...        % store pseudo test stat to
        pseudo_stat;                      % build null distribution
end
% --- end resampling ---

bigger = count(pseudo_stat_dist > test_stat);   % count pseudo test stats
                                                % bigger than actual stat

pval = bigger / length(pseudo_stat_dist)        % divide by total number
                                                % of pseudo test stats to
                                                % compute p-value
                                                % could use proportion()

hist(pseudo_stat_dist)                          % show histogram of pseudo
title('Pseudo test stat distribution')          % test stats
xlabel('Pseudo test stat')
ylabel('Frequency')

Don't forget to look online to get a more generalized version of 'bootstrap2groupsMEANS.m' called 'bootstrap2groups.m'.
It's more flexible (can define the method for the descriptor and the number of iterations as input), stores output in a struct, plots raw data, has one-tailed tests if needed, and can be used for demo (try running 'out =

Possible topics for next week
confidence intervals
normality testing
ANOVA (1 or 2-way)
power
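As a small preview of the first listed topic, a percentile bootstrap confidence interval reuses exactly the resampling machinery from these slides. This Python/NumPy sketch uses simulated data in the spirit of the earlier demo (the seed and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
data = 10 * (5.75 + rng.standard_normal(25))   # one simulated group

# Resample the group with replacement many times, recording the mean each time
n_trials = 10_000
boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(n_trials)
])

# 95% percentile interval: the middle 95% of the bootstrap means
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(lo, hi)
```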