Advanced Biostatistics


Resampling Methods. Advanced Biostatistics (EEOB 590C), Lecture 2. Dean C. Adams.

Inferential Statistics: Expected Distributions. Inference relies on a distribution of ‘expected’ values from H0: compare the observed value to the expected distribution to assess significance (“How ‘extreme’ is my observed value?”). Frequentist statistics: distributions come from theory. Resampling methods: expected distributions are generated from the data. [Figure: expected distribution with the observed value and its probability marked.]

Resampling Methods. Take many samples from the original data set and evaluate the significance of the original based on these samples. Nonparametric (no theoretical distribution). Very flexible (easy to assess complex designs). Major variants: randomization, bootstrap, jackknife, Monte Carlo. Useful for testing: standard designs; non-standard designs; high-dimensional data (small N; large p).

Randomization (Permutation). First true randomization: Fisher’s exact test (1935), a complete enumeration of possible pairings of data (for t-test). Protocol: (1) Calculate the observed statistic (e.g., T-statistic): Eobs. (2) Reorder the data set (i.e., randomly shuffle the data) and recalculate the statistic: Erand. (3) Repeat for all possible combinations to generate the distribution of possible statistics. (4) The percentage of Erand values more extreme than Eobs is the significance level. Note: Eobs is treated as one iteration. Randomization can be used to assess the significance of almost any test statistic. [Figure: the values 3 4 2 5 6 9 8 7 (Eobs) shown with two random reorderings, 6 8 5 2 3 4 9 7 and 3 4 9 6 5 2 8 7.]
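
The protocol above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `permutation_test`, the choice of the absolute difference in group means as the statistic, and the split of the eight toy values into two groups are all assumptions made for the example. It uses random shuffles rather than complete enumeration.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_iter=9999, seed=0):
    """Two-sided randomization test on the difference in group means.

    Hypothetical helper for illustration; any statistic could replace
    the absolute mean difference used here.
    """
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    e_obs = abs(mean(group_a) - mean(group_b))
    # Eobs is treated as one iteration, so the tally starts at 1.
    n_extreme = 1
    for _ in range(n_iter):
        rng.shuffle(pooled)  # reorder the data set
        e_rand = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if e_rand >= e_obs:
            n_extreme += 1
    # Proportion of statistics at least as extreme as Eobs.
    return e_obs, n_extreme / (n_iter + 1)

# Toy data (assumed grouping of eight illustrative values).
e_obs, p_rand = permutation_test([3, 4, 2, 5], [6, 9, 8, 7])
print(e_obs, p_rand)
```

With only eight observations all 70 group assignments could be enumerated exactly; random shuffling is used here because it carries over unchanged to larger data sets.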

Randomization: Example. P. cinereus & P. hoffmani compete when sympatric. What happens to jaw morphology? Compare squamosal/dentary (sq/dent) ratios between Plethodon cinereus and Plethodon hoffmani. Result: F = 15.47, P = 7.76 × 10^-9; Prand = 0.00001 (99,999 iterations). Data from Adams and Rohlf (2000). PNAS 97:4106-4111.

General Permutation Test. Enumerating all possible permutations is not feasible in most cases; use a large number of iterations instead (4,999, 9,999, etc.). Increasing the number of iterations improves the precision of the estimated significance. [Figure from Adams and Anthony (1996). Anim. Behav. 51:733-738.]

Randomization: Comments. EXTREMELY useful and flexible technique. Critical issue: what and how to resample. General procedure: shuffle the dependent (Y) variables relative to X. Works for: standard designs (ANOVA, regression, factorial ANOVA); non-standard designs; small N, large p.

Exchangeable Units. What one shuffles matters. Designing a proper resampling test requires: (1) identifying the null hypothesis (H0); (2) having a known expected value under H0; (3) identifying which values may be shuffled to estimate the distribution under H0. Not all things that can be shuffled should be shuffled!

Exchangeable Units: Example. High-dimensional PCM (phylogenetic comparative method). Option 1: shuffle the Y-data and recalculate the statistics each time (D-PGLS). Option 2: calculate PICs (phylogenetically independent contrasts), then shuffle these (PICrand). PICrand has high type I error rates (PICs are NOT the exchangeable units under the null hypothesis). Adams and Collyer 2015. Evol.

Standard Designs: T-Test / ANOVA. Assess association of X & Y. Shuffling Y relative to X models the expectation under H0 (no relationship). Example 1: comparison of groups (T-test or ANOVA). (1) Identify the column representing the independent variable (X). (2) Identify the column(s) representing the dependent variables (Y) and calculate F or T. (3) Shuffle Y on X and recalculate the statistic (F or T). Works for multivariate Y data (shuffle ROWS of Y). [Figure: X (groups M and F) paired with Y gives Eobs; X paired with shuffled Y gives Erand.]

Standard Designs: Regression/Correlation. Example 2: tests of association (correlation and regression). (1) Identify the column representing the independent variable (X). (2) Identify the column(s) representing the dependent variables (Y) and calculate F or r. (3) Shuffle Y on X and recalculate the statistic (F, r, etc.). Works for multivariate Y data (shuffle ROWS of Y). [Figure: X paired with Y gives Eobs; X paired with shuffled Y gives Erand.]
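
As a sketch of the "shuffle Y on X" idea for a test of association, the example below permutes Y while holding X fixed and recomputes Pearson's r each time. The helper names (`pearson_r`, `correlation_permutation_test`) and the toy data are assumptions for illustration only. For multivariate Y the same loop would shuffle whole rows of Y rather than single values.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient from raw sums."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_permutation_test(x, y, n_iter=4999, seed=1):
    """Shuffle Y relative to X to build the distribution of r under H0."""
    rng = random.Random(seed)
    r_obs = pearson_r(x, y)
    y_perm = list(y)
    n_extreme = 1  # the observed value counts as one iteration
    for _ in range(n_iter):
        rng.shuffle(y_perm)  # X stays in place; only Y is reordered
        if abs(pearson_r(x, y_perm)) >= abs(r_obs):
            n_extreme += 1
    return r_obs, n_extreme / (n_iter + 1)

# Hypothetical data with a strong positive association.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.3, 7.9, 9.2, 9.8, 11.1]
r_obs, p_rand = correlation_permutation_test(x, y)
print(round(r_obs, 3), p_rand)
```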

Restricted Randomization. Restrict the permutation of values to a subset of the data. Useful for hypotheses where some combinations don’t make sense (or where specific hypotheses are of interest). Example: two species with males and females. To compare species but preserve sexual dimorphism, shuffle only within each sex. To compare sexes but preserve species differences, shuffle only within each species. [Figure: 2 × 2 layout of sex (♂, ♀) by species (Spp. 1, Spp. 2).]
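
A minimal sketch of restricted randomization for the two-species, two-sex example follows. Values are shuffled only among individuals of the same sex, so sexual dimorphism is preserved while the species labels are tested. The function names, the statistic (absolute difference in species means) and the data are hypothetical.

```python
import random
from statistics import mean

def restricted_shuffle(values, strata, rng):
    """Shuffle values only among observations sharing a stratum label."""
    shuffled = list(values)
    for level in sorted(set(strata)):
        idx = [i for i, s in enumerate(strata) if s == level]
        block = [values[i] for i in idx]
        rng.shuffle(block)
        for i, v in zip(idx, block):
            shuffled[i] = v
    return shuffled

def species_difference(y, species):
    """Absolute difference between the two species means."""
    sp1 = [v for v, s in zip(y, species) if s == "sp1"]
    sp2 = [v for v, s in zip(y, species) if s == "sp2"]
    return abs(mean(sp1) - mean(sp2))

def restricted_test(y, species, sex, n_iter=999, seed=0):
    """Compare species while shuffling only within each sex."""
    rng = random.Random(seed)
    e_obs = species_difference(y, species)
    n_extreme = 1  # observed value counts as one iteration
    for _ in range(n_iter):
        y_rand = restricted_shuffle(y, sex, rng)
        if species_difference(y_rand, species) >= e_obs:
            n_extreme += 1
    return e_obs, n_extreme / (n_iter + 1)

# Hypothetical measurements: males are larger than females in both species.
species = ["sp1"] * 4 + ["sp2"] * 4
sex = ["M", "M", "F", "F"] * 2
y = [10.1, 10.4, 8.0, 8.3, 12.2, 12.5, 10.1, 10.0]
e_obs, p_rand = restricted_test(y, species, sex)
print(e_obs, p_rand)
```

Swapping the roles of `species` and `sex` in the call gives the complementary test (compare sexes, shuffle within species).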

Factorial Models. Model: Y ~ A + B + A*B. Assessing factors via resampling is challenging (requires estimates of EMS for each term). Option 1, unrestricted randomization: permute Y vs. (A + B + A*B). Can test all terms (MSA, MSB, & MSA*B), but this is often the wrong H0! It conflates MS across terms (can yield uninterpretable results). Option 2, restricted randomization: permute Y (within A; then within B). Can test MSA & MSB, but not MSA*B (could use unrestricted randomization for A*B). Option 3, residual randomization: permute Yresid from sequential H0 models. Provides the proper H0 for each term. See Edgington 1995; Manly 1998.

Factorial Models: Understanding the Null. Factorial models are sets of sequential hypothesis tests. Model: Y ~ A + B + A*B. Y ~ A: tests MSA vs. H0.r: Y ~ 1 (does A explain more variation than the mean?). Y ~ A + B: tests MSB vs. H0.r1: Y ~ A (does B|A explain more variation than A alone?). Y ~ A + B + A*B: tests MSA*B vs. H0.r2: Y ~ A + B (as above for A*B). Develop resampling procedures that appropriately test each H0. Residual randomization is most appropriate for factorial models. See Gonzalez and Manly 1998; Anderson and ter Braak 2003; Collyer, Sekora, and Adams 2015.

Residual Randomization. Permute Yresid from the reduced model (H0.r) with fewer terms. This holds constant the SS of terms in H0.r while testing the SS of terms not in H0.r. Protocol: (1) Calculate parameters and the observed test statistic (Eobs) from the full model (e.g., 2-factor ANOVA: Y = Xβ + e, where X contains factors A, B, and A×B). (2) Remove a term (e.g., A×B) from X, then calculate predicted values (Ŷ) and residuals (e). (3) Shuffle the residuals (e), add them to the predicted values, and calculate Erand. (4) Repeat many times; the percentage of Erand values more extreme than Eobs is the significance level. Higher statistical power for factorial designs (Anderson and ter Braak 2003). Extremely powerful for many E&E hypotheses. See Gonzalez and Manly 1998. Environmetrics; Collyer and Adams 2007. Ecology; Collyer, Sekora, and Adams 2015. Heredity.
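
The residual-randomization protocol can be sketched for the A×B term of a two-factor design. To stay dependency-free, this example assumes a balanced design, where the fitted values of the reduced model Y ~ A + B are simply (A mean + B mean - grand mean); an unbalanced design would need a least-squares fit instead. The function names and data are hypothetical, and the interaction sum of squares stands in for whichever statistic is of interest.

```python
import random
from collections import defaultdict

def group_means(y, labels):
    """Mean of y for each distinct label."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(y, labels):
        sums[g] += v
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}

def interaction_ss(y, a, b):
    """Sum of squares for A x B (valid for balanced designs only)."""
    grand = sum(y) / len(y)
    mean_a, mean_b = group_means(y, a), group_means(y, b)
    cells = list(zip(a, b))
    mean_ab = group_means(y, cells)
    return sum((mean_ab[(i, j)] - mean_a[i] - mean_b[j] + grand) ** 2
               for i, j in cells)

def residual_randomization(y, a, b, n_iter=999, seed=2):
    """Test A x B by permuting residuals of the reduced model Y ~ A + B."""
    rng = random.Random(seed)
    grand = sum(y) / len(y)
    mean_a, mean_b = group_means(y, a), group_means(y, b)
    # Reduced model (balanced case): additive fitted values.
    fitted = [mean_a[i] + mean_b[j] - grand for i, j in zip(a, b)]
    resid = [v - f for v, f in zip(y, fitted)]
    e_obs = interaction_ss(y, a, b)
    n_extreme = 1  # observed value counts as one iteration
    for _ in range(n_iter):
        rng.shuffle(resid)  # shuffle residuals, keep fitted values fixed
        y_rand = [f + e for f, e in zip(fitted, resid)]
        if interaction_ss(y_rand, a, b) >= e_obs:
            n_extreme += 1
    return e_obs, n_extreme / (n_iter + 1)

# Hypothetical balanced 2 x 2 design with three replicates per cell.
a = ["a1"] * 6 + ["a2"] * 6
b = (["b1"] * 3 + ["b2"] * 3) * 2
y = [1.0, 1.2, 0.9, 2.0, 2.1, 1.8, 2.1, 1.9, 2.2, 5.0, 5.3, 4.9]
e_obs, p_rand = residual_randomization(y, a, b)
print(round(e_obs, 3), p_rand)
```

Because the fitted values from Y ~ A + B are held fixed, the main effects of A and B are the same in every random data set, and only the variation attributable to A×B is randomized.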

Permutation For Non-Standard Designs. Permutation is useful when no theoretical distribution exists for H0. This is VERY COMMON in biology, as biologists frequently have specific hypotheses not ‘covered’ by current distribution theory. Protocol: (1) Collect data and generate the hypothesis. (2) Identify dependent and independent variables; calculate the appropriate Tobs. (3) Shuffle the data to generate the distribution of Trand.

Non-Standard Permutation: Example. P. cinereus & P. hoffmani compete in sympatry. Is there evidence of character displacement? Hypothesis: sympatric differences > allopatric differences. Data: head shape (multivariate). H1: Dsymp > Dallo (non-standard design). Results: Dsymp = 0.0753, Dallo = 0.0444, T = 0.0308, Prand = 0.0001. Conclusion: evidence for character displacement. [Figure: Plethodon cinereus and Plethodon hoffmani; sympatric P. cinereus (green) and sympatric P. hoffmani (red).] Data from Adams and Rohlf (2000). PNAS.

The ‘Small N to Large p’ Problem. High-dimensional multivariate data are increasingly common. If p > N, standard approaches can fail. Example: MANOVA design with p > N. Here |SSCPF| = 0, so the inverse SSCPF^-1 cannot be obtained (divide by zero) and MANOVA can’t be computed. Solution: use resampling-based methods. (1) Assess significance from other model parameters. (2) Use distance-based statistical approaches.

Resample Parameters for Hypothesis Testing. Test the significance of some parameter using randomization. (1) Obtain the original test statistic (Tobs): tr(SSCPmodel), Dgp1,gp2, etc. (2) Shuffle the data & calculate Trand. (3) Compare Tobs vs. Trand. (4) Repeat. Doesn’t require inverting the covariance matrix, so it is a general solution.

Distance-Based Approaches. Test significance based on distances between objects. Relies on the covariance matrix - distance matrix equivalency (Gower, 1966). MANOVA is covariance based; its ‘dual’ (permutational-MANOVA*) is distance-based. [Figure: diagram relating Y, VCV and PCA to Dist and PCoA.] Gower 1966. Biometrika. Adams 2014. Evol. & Syst. Biol. * Method will be discussed in more detail later this semester.

Permutational-MANOVA*: Computations. Permutational-MANOVA partitions variation in distances: SSBtwn and SSErr are found from the distances, using an indicator eij (same group: eij = 1; different group: eij = 0). (1) Obtain SSB and SSW; estimate Fobs. (2) Shuffle the data; estimate Frand. (3) Compare Fobs vs. Frand. (4) Repeat. Doesn’t require inverting the covariance matrix, so it is a general solution. *Method identical to Procrustes ANOVA and AMOVA.
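
The distance-based partition can be sketched as follows. This is an illustration under stated assumptions rather than the lecture's implementation: it uses squared Euclidean distances, computes the total sum of squares as the sum of squared pairwise distances divided by N, and the within-group sum of squares as the same quantity per group divided by group size. The function names and the 2-D toy points are hypothetical. Shuffling the group labels is equivalent to shuffling rows of Y.

```python
import random
from itertools import combinations
from math import dist

def pseudo_f(d2, groups):
    """F-ratio computed from a matrix of squared pairwise distances."""
    n_obs = len(groups)
    levels = sorted(set(groups))
    ss_total = sum(d2[i][j] for i, j in combinations(range(n_obs), 2)) / n_obs
    ss_within = 0.0
    for g in levels:
        idx = [i for i, lab in enumerate(groups) if lab == g]
        # Only same-group pairs contribute (the eij = 1 case).
        ss_within += sum(d2[i][j] for i, j in combinations(idx, 2)) / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (len(levels) - 1)) / (ss_within / (n_obs - len(levels)))

def permutational_manova(points, groups, n_iter=999, seed=3):
    """Permutation test of group differences based on distances only."""
    rng = random.Random(seed)
    n_obs = len(points)
    d2 = [[dist(points[i], points[j]) ** 2 for j in range(n_obs)]
          for i in range(n_obs)]
    f_obs = pseudo_f(d2, groups)
    labels = list(groups)
    n_extreme = 1  # observed value counts as one iteration
    for _ in range(n_iter):
        rng.shuffle(labels)
        if pseudo_f(d2, labels) >= f_obs:
            n_extreme += 1
    return f_obs, n_extreme / (n_iter + 1)

# Hypothetical 2-D data: two well-separated groups of four points.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (5, 5), (6, 5), (5, 6), (6, 6)]
groups = ["a"] * 4 + ["b"] * 4
f_obs, p_rand = permutational_manova(points, groups)
print(round(f_obs, 2), p_rand)
```

No covariance matrix is inverted at any step, which is why the approach still works when p exceeds N.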

Bootstrap. Permutation: resampling without replacement. Each observation is present; only the order is shuffled. Bootstrap: resampling with replacement. Some observations are chosen more than once, others not at all. Useful for estimating confidence intervals (CI), though it has other uses as well. Several approaches exist.

Standard Bootstrap CI. Proposed to alleviate bias in estimating s. Protocol: (1) Generate many bootstrap data sets. (2) Estimate the test statistic for each. (3) Find s from the bootstrap test statistics. (4) Calculate the CI from the observed statistic and s. [Figure: traditional CI in red, bootstrap CI in green.]
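
A minimal sketch of the standard bootstrap CI follows. The CI formula itself did not survive in this transcript, so the form used here, observed statistic ± z(α/2) × s with s the standard deviation of the bootstrap statistics, is an assumption (a t quantile is sometimes used in place of z). The function name and data are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev

def standard_bootstrap_ci(data, statistic=mean, n_boot=2000, alpha=0.05, seed=4):
    """CI as observed statistic +/- z * (SD of bootstrap statistics)."""
    rng = random.Random(seed)
    t_obs = statistic(data)
    # Resample with replacement, same size as the original data set.
    boot = [statistic(rng.choices(data, k=len(data))) for _ in range(n_boot)]
    s_boot = stdev(boot)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # assumed normal quantile
    return t_obs - z * s_boot, t_obs + z * s_boot

# Hypothetical sample with mean 5.0.
data = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 4.4]
ci_low, ci_high = standard_bootstrap_ci(data)
print(round(ci_low, 2), round(ci_high, 2))
```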

Percentile Bootstrap CI. Proposed to avoid reliance on the normal distribution. Protocol: (1) Generate many bootstrap data sets. (2) Estimate the test statistic for each. (3) Bootstrap CI: the lower and upper α/2 percentiles of the bootstrap statistics (usually 0.025 & 0.975). Note: assumes the distribution of bootstrap test statistics is centered on the observed test statistic. [Figure: traditional CI in red, bootstrap CI in blue.]
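
The percentile approach reads the limits directly off the sorted bootstrap statistics. The sketch below uses the median as the statistic; the function name, the simple index-based percentile rule and the data are assumptions for illustration.

```python
import random
from statistics import median

def percentile_bootstrap_ci(data, statistic=median, n_boot=2000, alpha=0.05, seed=5):
    """CI from the alpha/2 and 1 - alpha/2 percentiles of bootstrap statistics."""
    rng = random.Random(seed)
    boot = sorted(statistic(rng.choices(data, k=len(data))) for _ in range(n_boot))
    # Simple index-based percentiles; no interpolation is attempted.
    lower = boot[int((alpha / 2) * n_boot)]
    upper = boot[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical sample with median 5.0.
data = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 4.4]
ci_low, ci_high = percentile_bootstrap_ci(data)
print(ci_low, ci_high)
```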

Bias-Corrected Percentile Bootstrap CI. Accounts for cases where more than 50% of the bootstrap test statistics fall above or below the observed value (‘slides’ the percentiles a bit). Protocol: (1) Generate many bootstrap data sets. (2) Estimate the test statistic for each. (3) Find the fraction (Fr) of bootstrap values above/below the observed statistic. (4) Obtain the upper and lower CI limits as percentiles shifted according to Fr (Φ is the cumulative normal distribution, and α is the desired type I error: usually 0.05).
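
The exact expression for the shifted percentiles is missing from this transcript. The sketch below therefore assumes the commonly used bias-corrected form: z0 = Φ^-1(Fr), with Fr taken as the fraction of bootstrap values below the observed statistic, and limits at the Φ(2·z0 ± z(α/2)) percentiles. The function name and the right-skewed toy data are hypothetical.

```python
import random
from statistics import NormalDist, mean

def bias_corrected_ci(data, statistic=mean, n_boot=2000, alpha=0.05, seed=6):
    """Bias-corrected percentile bootstrap CI (assumed Efron-style form)."""
    rng = random.Random(seed)
    t_obs = statistic(data)
    boot = sorted(statistic(rng.choices(data, k=len(data))) for _ in range(n_boot))
    frac_below = sum(b < t_obs for b in boot) / n_boot
    # Keep the fraction strictly inside (0, 1) so the inverse CDF is defined.
    frac_below = min(max(frac_below, 1 / n_boot), 1 - 1 / n_boot)
    norm = NormalDist()
    z0 = norm.inv_cdf(frac_below)       # bias-correction term
    z = norm.inv_cdf(1 - alpha / 2)
    p_low = norm.cdf(2 * z0 - z)        # shifted lower percentile
    p_high = norm.cdf(2 * z0 + z)       # shifted upper percentile
    lower = boot[min(int(p_low * n_boot), n_boot - 1)]
    upper = boot[min(int(p_high * n_boot), n_boot - 1)]
    return t_obs, lower, upper

# Hypothetical right-skewed sample.
data = [1.2, 0.8, 1.5, 0.9, 1.1, 7.4, 1.0, 1.3, 2.2, 0.7, 1.6, 3.9]
t_obs, ci_low, ci_high = bias_corrected_ci(data)
print(round(t_obs, 3), round(ci_low, 3), round(ci_high, 3))
```

When exactly half of the bootstrap values fall below the observed statistic, z0 is zero and the limits reduce to the ordinary percentile interval.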

Bootstrapping and Phylogenetics. Felsenstein (1985) proposed bootstrapping to assess confidence in phylogenetic trees. (1) Calculate the phylogenetic tree from the data (e.g., parsimony or UPGMA). (2) Bootstrap the data set a large number of times and recalculate the tree each time. (3) The proportion of bootstrapped trees containing a node is the ‘support’ for that node in the observed tree. Logic: the measured characters are representative of the true character set, and the bootstrap generates alternative character matrices. CAREFUL IN INTERPRETATION! Bootstrap estimates on nodes are NOT independent. Bootstrap values often follow a particular pattern: large at the base and tips, smaller in the middle (a result of combinatoric branching theory).

Jackknife. Jackknifing resamples by systematically eliminating one observation at a time, so each iterated data set contains n-1 observations. Asks how precise the observed estimate is (or how sensitive it is to particular values). Typically used to estimate bias, standard errors, and CI of test statistics.

Jackknife Protocol for Bias. (1) Calculate the observed test statistic Eobs. (2) Remove one observation and calculate the estimate of the statistic, Ejack. (3) Repeat the above step, removing a different object each iteration. (4) Calculate the mean of the estimates. Note: the jackknife is used less frequently now that greater computing power has made full permutations and bootstraps computationally feasible.
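
The protocol stops at the mean of the leave-one-out estimates; the sketch below assumes the usual jackknife bias estimate, (n - 1) × (mean of Ejack - Eobs), to finish the calculation. The function name and data are hypothetical. The population variance is used as the statistic because its bias is known: the bias-corrected value should equal the ordinary sample variance.

```python
from math import sqrt
from statistics import pvariance, variance

def jackknife(data, statistic):
    """Leave-one-out estimates of bias and standard error for a statistic."""
    n = len(data)
    e_obs = statistic(data)
    # Remove a different observation each iteration.
    e_jack = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    e_mean = sum(e_jack) / n
    bias = (n - 1) * (e_mean - e_obs)  # assumed standard bias formula
    se = sqrt((n - 1) / n * sum((e - e_mean) ** 2 for e in e_jack))
    return bias, e_obs - bias, se

# Hypothetical sample; pvariance divides by n and is therefore biased.
data = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 5.1, 4.9, 6.2, 4.4]
bias, corrected, se = jackknife(data, pvariance)
print(round(bias, 4), round(corrected, 4), round(variance(data), 4))
```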

Monte Carlo Simulations. Use a parameterized model to simulate data, from which the distribution of Erand is generated. This is NOT a permutation or bootstrap, because the values in each iteration are not drawn from the original set of data. However, the parameters for the model are estimated from the original data. The approach assumes that the observed data are a representative sample; other such samples are generated, and patterns in the original sample are compared to those of the randomly generated samples.

Monte Carlo Simulations: Example Applications. Are plants distributed randomly in a forest? Calculate a point-pattern statistic for the actual plants, simulate random plant locations (using RandUnif, or another model), and compare patterns. Are species ‘evenly’ distributed among communities? Calculate an evenness measure (E) for the actual communities, simulate random communities from a community-assembly model, and compare Erand to Eobs. In E&E, one often hears of the ‘parametric bootstrap’ for hypothesis testing and generation of confidence intervals. This is a Monte Carlo procedure.
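
The plant-distribution question can be sketched as a small Monte Carlo test. Everything specific here is an assumption for illustration: the point-pattern statistic (mean nearest-neighbour distance), the unit-square study area, the uniform random null model, the one-tailed comparison toward clustering, and the clustered toy coordinates.

```python
import random
from math import dist

def mean_nn_distance(points):
    """Mean distance from each point to its nearest neighbour."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)

def monte_carlo_random_test(points, n_sim=999, seed=7):
    """Compare the observed pattern with uniform random patterns."""
    rng = random.Random(seed)
    e_obs = mean_nn_distance(points)
    n_extreme = 1  # observed value counts as one iteration
    for _ in range(n_sim):
        # Simulated data come from the model, not from the original values.
        sim = [(rng.random(), rng.random()) for _ in points]
        if mean_nn_distance(sim) <= e_obs:  # clustering gives small distances
            n_extreme += 1
    return e_obs, n_extreme / (n_sim + 1)

# Hypothetical plant locations: twelve points packed into a small patch.
plants = [(0.50 + 0.01 * i, 0.50 + 0.01 * (i % 3)) for i in range(12)]
e_obs, p_rand = monte_carlo_random_test(plants)
print(round(e_obs, 4), p_rand)
```

Unlike a permutation or bootstrap, each simulated data set is drawn fresh from the null model; only the number of points is taken from the observed data.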

Resampling: Comments. Resampling approaches are extremely useful and flexible. Much more powerful than rank-based nonparametric approaches, and can be as powerful as parametric tests in some circumstances. Can be used to assess significance when data don’t meet certain assumptions of a test (e.g., data not normal but in ANOVA format). Useful when no theoretical distribution exists (CCorA & 2B-PLS). Also useful when the data design or hypothesis is ‘non-standard’. Resampling methods can be implemented in: R; SAS; any computer programming language (Perl, Python, C, Pascal, etc.); Excel with the Pop-tools add-in (intuitive, but limited in capabilities); Permute (Legendre).