STA248 week 12: Bootstrap Test for Pairs of Means of a Non-Normal Population – small samples

Presentation transcript:

Bootstrap Test for Pairs of Means of a Non-Normal Population – small samples
Suppose $X_1, …, X_n$ are iid from some distribution, independent of $Y_1, …, Y_m$, which are iid from another distribution. Further suppose that both $n$ and $m$ are small and we are interested in testing whether the two populations have the same mean. We can use the t-test (pooled or unpooled), since it is robust as long as there are no extreme outliers or skewness. Alternatively, we can use bootstrap hypothesis testing.

Bootstrap Hypothesis Testing - Introduction
Suppose $X_1, …, X_n$ is a random sample of size $n$, independent of another random sample $Y_1, …, Y_m$ of size $m$, and we wish to test … vs. …. As a test statistic we will use …. The P-value of this test is …. We want the bootstrap estimate of this P-value.

Bootstrap Test Procedure
To obtain the bootstrap estimate of the P-value we need to generate samples with $H_0$ true. One way of doing this (assuming X and Y have the same distribution) is to combine the 2 samples into 1 of size $n+m$, then re-sample with replacement from this combined sample such that each re-sampling has two groups … For each bootstrap sample calculate the bootstrap estimate of the test statistic, $j = 1, …, B$. The bootstrap estimate of the P-value is ….
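
The pooled re-sampling procedure can be sketched in Python. The formulas for the test statistic and the P-value are not preserved in this transcript, so the sketch makes two assumptions: the test statistic is the difference in sample means, and the P-value is two-sided, estimated as the fraction of bootstrap statistics at least as extreme as the observed one. The function name `bootstrap_p_value` and its parameters are hypothetical.

```python
import random
from statistics import mean

def bootstrap_p_value(x, y, B=2000, seed=0):
    """Pooled-sample bootstrap estimate of a two-sided P-value for H0: equal means.

    Assumptions (not from the slides): the statistic is mean(x) - mean(y)
    and the P-value counts |t*_j| >= |t_obs| over j = 1, ..., B.
    """
    rng = random.Random(seed)
    n, m = len(x), len(y)
    pooled = list(x) + list(y)      # combine the 2 samples into 1 of size n + m
    t_obs = mean(x) - mean(y)       # observed value of the assumed statistic
    extreme = 0
    for _ in range(B):
        # Re-sample n + m values with replacement, then split into two groups
        # of sizes n and m, so both groups come from the same distribution.
        resample = rng.choices(pooled, k=n + m)
        t_star = mean(resample[:n]) - mean(resample[n:])
        if abs(t_star) >= abs(t_obs):
            extreme += 1
    return extreme / B
```

For example, `bootstrap_p_value([1, 2, 3, 4, 5], [101, 102, 103, 104, 105])` returns a value close to 0, since a pooled re-sample almost never reproduces such a large difference in means.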

Example

Data Collection
There are three main methods for collecting data:
- Observational studies
- Sample surveys
- Planned / designed experiments
These methods differ in the strength of the conclusions that can be drawn.

Observational Studies
In observational studies we simply collect information about variables of interest without applying any intervention or controlling for any factors. In some cases, a study may be undertaken retrospectively. When factors are not controlled we are not able to infer a cause-effect relationship. Other problems with observational studies are:
- Confounding – can’t separate the effect of one variable from another.
- Lack of generalization.

Sample Surveys
Sample surveys are observational in nature. Surveys require the existence of a physically real population. Data is collected on a random sample from the target population. Survey design includes selecting the sample so that it is representative of the population as a whole; statistics are then used to make inferences about the entire population. To allow generalization and to avoid bias, the sample must be chosen randomly, e.g., as an SRS. Confounding is still a problem, so the cause of any observed differences cannot be determined. However, the results can be generalized to the population.

Planned / Designed Experiments
There are a few key features of designed experiments that distinguish them from any other type of study. Independent variables of interest are carefully controlled by the experimenter in order to determine their effect on a response (dependent) variable. The researcher randomly assigns a treatment to the subjects or experimental units. Control of independent variables and randomization make it possible to infer a cause-and-effect relationship. Use of replication – multiple observations per treatment. Replication allows measurement of variability.

Treatments are sometimes called predictor variables and sometimes called “factors”. The values of a factor are its “levels”. A design is balanced if each treatment has the same number of experimental units. Problem: can’t always carry out an experiment.

Randomization
The use of randomization to allocate treatments to experimental units (or vice versa) is the key element of a well-designed experiment. Random allocation tends to produce subgroups which are comparable with respect to the variables known to influence the response. Randomization ensures that no bias is introduced in the allocation of treatments to experimental units. Randomization reduces the possibility that factors not included in the design will be confounded with treatment.
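
As an illustration of random allocation, here is a minimal Python sketch that assigns treatments to experimental units in a balanced design. The function name `randomize_treatments` and the unit and treatment labels are hypothetical; the sketch assumes the number of units is a multiple of the number of treatments.

```python
import random

def randomize_treatments(units, treatments, seed=None):
    """Randomly allocate treatments to units so each treatment gets equally many units."""
    units = list(units)
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    rng.shuffle(units)                        # random order removes any systematic allocation
    reps = len(units) // len(treatments)      # replicates per treatment (balanced design)
    # The first `reps` shuffled units get treatments[0], the next `reps` get treatments[1], ...
    return {unit: treatments[i // reps] for i, unit in enumerate(units)}

allocation = randomize_treatments(range(12), ["A", "B", "C"], seed=1)
```

Because the units are shuffled before being split into blocks, which units end up under each treatment depends only on the random draw, not on any property of the units.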

Cautions Regarding Experiments
“Effective sample size” – all statistical techniques we have learned assume observations are independent. If they are not independent but are treated as if they were, we get more power and narrower CIs than we should. “Fishing expedition” – if doing 100 tests at the α = 0.05 significance level, expect about 5 of the 100 tests to show significant differences from $H_0$ even when $H_0$ is true for every test (type I errors).
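
The “fishing expedition” arithmetic can be checked with a short simulation. This sketch assumes the tests are independent and that each P-value is uniform on (0, 1) under $H_0$, which holds for a continuous test statistic; the function names are hypothetical.

```python
import random

def average_false_positives(k=100, alpha=0.05, reps=2000, seed=0):
    """Average number of rejections among k tests when H0 is true for every test."""
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        # Under H0 each P-value is Uniform(0, 1), so it falls below alpha
        # with probability alpha (assumes independent tests).
        total += sum(rng.random() < alpha for _ in range(k))
    return total / reps

def prob_at_least_one(k=100, alpha=0.05):
    """P(at least one type I error) in k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k
```

With the defaults, `average_false_positives()` comes out near k·α = 5, and `prob_at_least_one()` is about 0.994, so at least one spurious "significant" result is almost guaranteed.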

Controlling for Type I error
One widely used method for controlling type I error uses the Bonferroni inequality. If $A_i$ is the event that the $i$th test has a type I error, and typically $P(A_i) = \alpha$, then by the Bonferroni inequality we have that
$$P\left(\bigcup_{i=1}^{k} A_i\right) \le \sum_{i=1}^{k} P(A_i) = k\alpha.$$
That is, the probability of committing at least one type I error in $k$ tests is at most $k\alpha$. Therefore, if we use a significance level of $\alpha/k$ for each individual test, then the “overall significance level” (P(at least 1 type I error)) is at most $\alpha$. The Bonferroni method is very conservative.
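
A minimal Python sketch of the Bonferroni method: each of the k P-values is compared with α/k. The function name `bonferroni_reject` and the example P-values are made up for illustration.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return, for each test, whether H0 is rejected at per-test level alpha / k."""
    k = len(p_values)
    threshold = alpha / k      # per-test significance level
    return [p <= threshold for p in p_values]

# Three tests at overall level 0.05: the per-test level is 0.05 / 3, about 0.0167.
decisions = bonferroni_reject([0.001, 0.02, 0.04], alpha=0.05)
# decisions == [True, False, False]
```

Note that 0.02 and 0.04 would both be significant at the unadjusted 0.05 level, which shows how conservative the correction is.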

Analysis of Variance – Introduction
The statistical methodology for comparing several means is called analysis of variance, or simply ANOVA. It is a generalization of the two-sample t-procedures (with equal variances). The objective in analysis of variance is to determine whether there are differences in the means of more than 2 groups. When studying the effect of only one factor on the response we use one-way ANOVA to analyze the data. When studying the effect of two factors on the response we use two-way ANOVA.

One-Way ANOVA model
The response variable $Y$ is measured on each experimental unit in each treatment group. Measure $Y_{ij}$ for the $j$th subject in the $i$th group. The one-way ANOVA model is:
$$Y_{ij} = \mu_i + \varepsilon_{ij}, \quad i = 1, 2, …, k, \quad j = 1, 2, …, n_i.$$
$\mu_i$ is the unknown mean response for the $i$th group. The $\varepsilon_{ij}$ are called “random errors” and are assumed to be i.i.d. $N(0, \sigma^2)$. The parameters of the model are the population means $\mu_1, \mu_2, …, \mu_k$ and the common standard deviation $\sigma$. The objective of one-way ANOVA is to test whether the mean response in each treatment group is the same. The null and alternative hypotheses are….
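
To make the model concrete, here is a minimal Python sketch that computes the standard one-way ANOVA F statistic (between-group mean square over within-group mean square). The derivation of the test statistic is not preserved in this transcript, so treating F as the statistic intended by the slides is an assumption; the function name `one_way_anova_f` and the data are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Standard one-way ANOVA F statistic for a list of groups of observations."""
    k = len(groups)                                # number of treatment groups
    n_total = sum(len(g) for g in groups)          # total number of observations
    grand_mean = mean([y for g in groups for y in g])
    group_means = [mean(g) for g in groups]        # estimates of mu_1, ..., mu_k
    # Between-group sum of squares: variation of the group means around the grand mean.
    ss_between = sum(len(g) * (gm - grand_mean) ** 2
                     for g, gm in zip(groups, group_means))
    # Within-group sum of squares: variation of observations around their own group mean.
    ss_within = sum((y - gm) ** 2
                    for g, gm in zip(groups, group_means) for y in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
# f_stat == 3.0: between mean square 6/2 = 3, within mean square 6/6 = 1
```

Under the model's assumptions and equal group means, this statistic follows an F distribution with k − 1 and N − k degrees of freedom, where N is the total number of observations.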

Derivation of Test Statistics