Advances in Statistics: Or, what you might find if you picked up a current issue of a Biological Journal

Presentation transcript:

1 Advances in Statistics Or, what you might find if you picked up a current issue of a Biological Journal

2 Advances in Statistics Extensions to the ANOVA Computer-intensive methods Maximum likelihood

3 Extensions to ANOVA
One-way ANOVA
– This works for a single explanatory variable
– Simplest possible design
Two-way ANOVA
– Two categorical explanatory variables
– Factorial design

4 ANOVA Tables
Source of variation   Sum of squares   df      Mean squares   F ratio   P
Treatment                              k - 1
Error                                  N - k
Total                                  N - 1
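
The empty cells of a one-way ANOVA table are filled in from the data. The sketch below uses the standard sums-of-squares definitions; the function name `one_way_anova` and the three small samples are hypothetical illustrations, not data from the slides.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    k = len(groups)            # number of treatments
    n_total = len(all_values)  # N, total number of observations

    # Variation among group means, within groups, and overall.
    ss_treatment = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)

    df_treatment, df_error = k - 1, n_total - k
    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    return {
        "SS": (ss_treatment, ss_error, ss_total),
        "df": (df_treatment, df_error, n_total - 1),
        "MS": (ms_treatment, ms_error),
        "F": ms_treatment / ms_error,
    }

# Hypothetical data: three treatments, four measurements each.
table = one_way_anova([
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 8.0, 6.0, 7.0],
    [5.0, 4.0, 5.0, 6.0],
])
print(table)
```

The P column would come from comparing F with an F distribution on (k - 1, N - k) degrees of freedom, which is left out here to keep the sketch dependency-free.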

5 Two-factor ANOVA Table
Source of variation         Sum of squares   df                      Mean square                          F ratio        P
Treatment 1                 SS_1             k_1 - 1                 SS_1 / (k_1 - 1)                     MS_1 / MSE
Treatment 2                 SS_2             k_2 - 1                 SS_2 / (k_2 - 1)                     MS_2 / MSE
Treatment 1 * Treatment 2   SS_1*2           (k_1 - 1)*(k_2 - 1)     SS_1*2 / ((k_1 - 1)*(k_2 - 1))       MS_1*2 / MSE
Error                       SS_error         XXX                     SS_error / XXX
Total                       SS_total         N - 1

6 Two-factor ANOVA Table
Source of variation         Sum of squares   df                      Mean square                          F ratio        P
Treatment 1                 SS_1             k_1 - 1                 SS_1 / (k_1 - 1)                     MS_1 / MSE
Treatment 2                 SS_2             k_2 - 1                 SS_2 / (k_2 - 1)                     MS_2 / MSE
Treatment 1 * Treatment 2   SS_1*2           (k_1 - 1)*(k_2 - 1)     SS_1*2 / ((k_1 - 1)*(k_2 - 1))       MS_1*2 / MSE
Error                       SS_error         XXX                     SS_error / XXX
Total                       SS_total         N - 1
Two categorical explanatory variables

7 General Linear Models Used to analyze variation in Y when there is more than one explanatory variable Explanatory variables can be categorical or numerical

8 General Linear Models First step: formulate a model statement Example:

9 General Linear Models First step: formulate a model statement Example: [model equation not captured in the transcript; its two terms are labeled "Overall mean" and "Treatment effect"]

10 General Linear Models Second step: Make an ANOVA table Example:
Source of variation   Sum of squares   df      Mean squares   F ratio   P
Treatment                              k - 1
Error                                  N - k
Total                                  N - 1

11 General Linear Models Second step: Make an ANOVA table Example:
Source of variation   Sum of squares   df      Mean squares   F ratio   P
Treatment                              k - 1
Error                                  N - k
Total                                  N - 1
This is the same as a one-way ANOVA!

12 General Linear Models
If there is only one explanatory variable, these are exactly equivalent to things we've already done:
– One categorical variable: ANOVA
– One numerical variable: regression
Great for more complicated situations

13 Example 1: Experiment with blocking
Fish experiment: sensitivity of goldfish to light.
Fish are randomly selected from the population.
Four different light treatments are applied to each fish.

14 Randomized Block Design Blocks (fish) Treatments (light wavelengths)

15 Randomized Block Design

16 Step 1: Make a model statement

17 Step 2: Make an ANOVA table

18 Another Example: Mole Rats
Are there lazy mole rats?
Two variables:
– Worker type: categorical ("frequent workers" and "infrequent workers")
– Body mass (ln-transformed): numerical

19

20 Step 1: Make a model statement

21 Step 2: Make an ANOVA table

22 Step 2: Make an ANOVA table

23 Step 1: Make a model statement

24 Step 2: Make an ANOVA table

25 Step 2: Make an ANOVA table Also called ANCOVA (Analysis of Covariance)

26 General Linear Models Can handle any number of predictor variables Each can be categorical or numerical Tables have the same basic structure Same assumptions as ANOVA

27 General Linear Models Don’t run out of degrees of freedom! Sometimes, the F-statistics will have DIFFERENT denominators - see book for an example

28 Computer-intensive methods
Hypothesis testing:
– Simulation
– Randomization
Confidence intervals:
– Bootstrap

29 Simulation Simulates the sampling process on a computer many times, with the computer assuming the null hypothesis is true. The estimates calculated from the simulated data form the null distribution.

30 Example: Social spider sex ratios Social spiders live in groups

31 Example: Social spider sex ratios
Groups are mostly females.
Hypothesis: Groups have just enough males to allow reproduction.
Test: Whether the distribution of the number of males is as predicted by chance.
Problem: Groups are of many different sizes, so a single binomial distribution doesn't apply.

32 Simulation: For each group, the number of spiders is known. The overall proportion of males, p m, is known. For each group, the computer draws the real number of spiders, and each has p m probability of being male. This is done for all groups, and the variance in proportion of males is calculated. This is repeated a large number of times.

33 The observed value (0.44), or something more extreme, occurs in only 4.9% of the simulations. Therefore P = 0.049.
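
The simulation steps for the spider groups can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the group sizes and male counts are made up (they are not the spider data), the variance is taken as the unweighted sample variance of the per-group proportions, and the P-value counts simulations with variance as low as or lower than the observed one, on the reading that "just enough males" predicts less variation than chance.

```python
import random

def variance_of_proportions(group_sizes, male_counts):
    """Sample variance of the proportion of males across groups."""
    props = [m / n for m, n in zip(male_counts, group_sizes)]
    mean_p = sum(props) / len(props)
    return sum((p - mean_p) ** 2 for p in props) / (len(props) - 1)

def simulate_null(group_sizes, p_male, n_sims, rng):
    """Null distribution of the variance when each spider is male with probability p_male."""
    null = []
    for _ in range(n_sims):
        # Draw the real number of spiders for every group.
        counts = [sum(rng.random() < p_male for _ in range(n)) for n in group_sizes]
        null.append(variance_of_proportions(group_sizes, counts))
    return null

# Hypothetical data: eight groups of different sizes.
group_sizes = [12, 30, 8, 45, 20, 16, 25, 10]
observed_males = [2, 5, 1, 8, 3, 3, 4, 2]

p_male = sum(observed_males) / sum(group_sizes)  # overall proportion of males
observed = variance_of_proportions(group_sizes, observed_males)

rng = random.Random(1)  # fixed seed so the run is repeatable
null = simulate_null(group_sizes, p_male, 5000, rng)

# Assumed one-tailed test: fraction of simulations at least as extreme (as low) as observed.
p_value = sum(v <= observed for v in null) / len(null)
print(observed, p_value)
```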

34 Randomization Used for hypothesis testing Mixes the real data randomly Variable 1 from an individual is paired with variable 2 data from a randomly chosen individual. This is done for all individuals. The estimate is made on the randomized data. The whole process is repeated numerous times. The distribution of the randomized estimates is the null distribution.

35 Without replacement Randomization is done without replacement. In other words, all data points are used exactly once in each randomized data set.
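
A minimal sketch of a randomization test for a difference between two group means follows. The labels `"fed"` and `"unfed"`, the ten waiting times, and the two-tailed comparison are hypothetical choices for illustration; `random.Random.shuffle` reorders the values without replacement, so every data point appears exactly once in each randomized data set.

```python
import random
from statistics import mean

def diff_in_means(labels, values):
    """Mean of the "fed" group minus mean of the "unfed" group."""
    fed = [v for lab, v in zip(labels, values) if lab == "fed"]
    unfed = [v for lab, v in zip(labels, values) if lab == "unfed"]
    return mean(fed) - mean(unfed)

def randomization_test(labels, values, n_rand, rng):
    """Return the observed difference and a two-tailed randomization P-value."""
    observed = diff_in_means(labels, values)
    shuffled = list(values)  # copy, so the real data are left untouched
    extreme = 0
    for _ in range(n_rand):
        rng.shuffle(shuffled)  # re-pair values with labels, without replacement
        if abs(diff_in_means(labels, shuffled)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_rand

# Hypothetical data: time to re-mating for two groups of five females.
labels = ["fed"] * 5 + ["unfed"] * 5
values = [3.1, 4.0, 2.8, 3.6, 4.4, 1.2, 2.0, 1.5, 2.6, 1.9]

observed, p_value = randomization_test(labels, values, 2000, random.Random(42))
print(observed, p_value)
```

Because the null distribution is built from the data themselves, the test does not rely on normality or equal variances.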

36 Randomization can be done for any test of association between two variables

37 Example: Sage crickets Sage cricket males sometimes offer their hind-wings to females to eat during mating. Do females who eat hind-wings wait longer to re-mate?

38

39 Problems: Unequal variance, non-normal distributions

40 [Figure: real data and randomized data, each shown in two groups, "Male wingless" and "Male winged"]

41 Note that each data point was only used once

randomizations P < 0.001

43 Randomization: Other questions Q: Is this periodic? (yes)

44 Bootstrap Method for estimation (and confidence intervals) Often used for hypothesis testing too "Picking yourself up by your own bootstraps"

45 Bootstrap For each group, randomly pick with replacement an equal number of data points, from the data of that group With this bootstrap dataset, calculate the estimate -- bootstrap replicate estimate
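
The resampling step can be sketched as below. The two samples are hypothetical, the estimate is taken to be a difference in group means, and the percentile method used for the interval is one common choice that the slides do not specify. `random.Random.choices` samples with replacement.

```python
import random
from statistics import mean

def bootstrap_replicates(group_a, group_b, n_boot, rng):
    """Bootstrap replicate estimates of mean(group_a) - mean(group_b)."""
    reps = []
    for _ in range(n_boot):
        # Resample each group separately, with replacement, keeping its sample size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        reps.append(mean(resample_a) - mean(resample_b))
    return reps

def percentile_interval(reps, level=0.95):
    """Percentile confidence interval (an assumed choice of method)."""
    ordered = sorted(reps)
    lower = ordered[int((1 - level) / 2 * len(ordered))]
    upper = ordered[int((1 + level) / 2 * len(ordered)) - 1]
    return lower, upper

# Hypothetical data for two groups.
group_a = [3.1, 4.0, 2.8, 3.6, 4.4]
group_b = [1.2, 2.0, 1.5, 2.6, 1.9]

reps = bootstrap_replicates(group_a, group_b, 4000, random.Random(7))
lower, upper = percentile_interval(reps)
print(lower, upper)
```

The spread of the replicate estimates approximates the sampling distribution of the estimate, which is what makes a confidence interval possible without a formula for the standard error.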

46 [Figure: real data and bootstrap data, each shown in two groups, "Male wingless" and "Male winged"]

47

48 Bootstraps are often used in evolutionary trees

49 Likelihood Likelihood considers many possible hypotheses, not just one

50 Law of likelihood A particular data set supports one hypothesis better than another if the likelihood of that hypothesis is higher than the likelihood of the other hypothesis. Therefore we try to find the hypothesis with the maximum likelihood.

51 All estimates we have learned so far are also maximum likelihood estimates.

52 "Simple" example Using likelihood to estimate a proportion Data: 3 out of 8 individuals are male. Question: What is the maximum likelihood estimate of the proportion of males?

53 Likelihood L(p = x), where x is a hypothesized value of the proportion of males. E.g., L(p = 0.5) is the likelihood of the hypothesis that the proportion of males is 0.5.

54 For this example only... The probability of getting 3 males out of 8 independent trials is given by the binomial distribution.

55 How to find the maximum likelihood hypothesis: 1. Calculus, or 2. Computer calculations

56 By calculus... Maximum value of L(p=x) is found when x = 3/8. Note that this is the same value we would have gotten by methods we already learned.
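
The calculus step for the 3-out-of-8 example can be written out as follows. This is a sketch that assumes the binomial likelihood named on the earlier slide and maximizes its logarithm, which has its maximum at the same x.

```latex
\begin{align*}
L(p = x) &= \binom{8}{3}\, x^{3} (1 - x)^{5} \\
\ln L(p = x) &= \ln\binom{8}{3} + 3 \ln x + 5 \ln(1 - x) \\
\frac{d}{dx} \ln L(p = x) &= \frac{3}{x} - \frac{5}{1 - x} = 0
  \quad\Longrightarrow\quad 3(1 - x) = 5x
  \quad\Longrightarrow\quad x = \frac{3}{8}
\end{align*}
```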

57 By computer calculation... Input likelihood formula to computer, plot the value of L for each value of x, and find the largest L.
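
A minimal sketch of the computer approach for the 3-out-of-8 example: evaluate the binomial likelihood on a grid of x values and keep the largest. The grid spacing of 0.001 and the function name `likelihood` are arbitrary choices for illustration.

```python
from math import comb

def likelihood(x, males=3, n=8):
    """Binomial likelihood L(p = x) for `males` males among `n` individuals."""
    return comb(n, males) * x ** males * (1 - x) ** (n - males)

# Evaluate L at x = 0.000, 0.001, ..., 1.000 and pick the x with the largest L.
grid = [i / 1000 for i in range(1001)]
best_x = max(grid, key=likelihood)
print(best_x, likelihood(best_x))
```

Plotting `likelihood(x)` against `x` over the same grid gives the likelihood curve, with its peak at the value the grid search returns.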

58 Finding genes for corn yield: Corn Chromosome 5

59 Hypothesis testing by likelihood Compares the likelihood of the maximum likelihood estimate to the likelihood of a null hypothesis. Log-likelihood ratio = ln[ L(maximum likelihood estimate) / L(null hypothesis) ]

60 Test statistic χ² = 2 × (log-likelihood ratio), with df equal to the number of variables fixed to make the null hypothesis

61 Example: 3 males out of 8 individuals H0: 50% are male Maximum likelihood estimate: p = 3/8

62 Likelihood of null hypothesis

63 Log likelihood ratio We fixed one variable in the null hypothesis (p), so the test has df = 1. The test statistic is smaller than the critical value for df = 1 (3.84 at α = 0.05), so we do not reject H0.
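
The whole likelihood-ratio test for the 3-out-of-8 example fits in a short script. This sketch assumes the usual form of the test, in which twice the log-likelihood ratio is compared with a χ² distribution. For df = 1 the upper-tail probability of χ² equals `erfc(sqrt(x / 2))`, so only the standard library is needed; the variable names are illustrative.

```python
from math import comb, erfc, log, sqrt

def binomial_likelihood(p, successes, n):
    """Probability of `successes` out of `n` independent trials when the true proportion is p."""
    return comb(n, successes) * p ** successes * (1 - p) ** (n - successes)

males, n = 3, 8
p_hat = males / n                                 # maximum likelihood estimate, 3/8
l_mle = binomial_likelihood(p_hat, males, n)      # likelihood at the estimate
l_null = binomial_likelihood(0.5, males, n)       # likelihood under H0: p = 0.5

log_lr = log(l_mle / l_null)                      # log-likelihood ratio
chi2 = 2 * log_lr                                 # test statistic, df = 1

# Upper-tail probability of a chi-square variable with 1 df.
p_value = erfc(sqrt(chi2 / 2))

print(chi2, p_value)
print("reject H0" if chi2 > 3.84 else "do not reject H0")
```

With these data the statistic comes out near 0.5, far below 3.84, which matches the conclusion on the slide.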