Topic 8 - Comparing two samples

Confidence intervals/hypothesis tests for two means
Hypothesis test for two variances

Comparing two populations
Sometimes we want to compare two populations rather than make decisions about a single population. For example, we might want to compare two population means or two population proportions to see if they are equal.
Is the expected drying time for one type of paint lower than that of another type of paint?
Is a new drug more effective? This could mean an increased or decreased mean versus the “established” drug, or an increased or decreased percentage versus a control.
Does the new method actually result in increased crop yields or percentages, or a decrease in tons lost to insects, etc.?

Behind the scenes. What do the distributions look like?

Comparing two population means
Suppose we have two independent samples, $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$, from two separate populations. A natural statistic for comparing the two population means, $\mu_X$ and $\mu_Y$, is $\bar{X} - \bar{Y}$. The distribution of $\bar{X} - \bar{Y}$ is also approximately Normal for $m$ and $n$ both large.

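Large samples test for comparing population means
To test $H_0: \mu_X - \mu_Y = D_0$, use the test statistic
$$Z = \frac{\bar{X} - \bar{Y} - D_0}{\sqrt{s_X^2/m + s_Y^2/n}}$$
$H_A: \mu_X - \mu_Y < D_0$. Reject $H_0$ if $Z < -z_{\alpha}$.
$H_A: \mu_X - \mu_Y > D_0$. Reject $H_0$ if $Z > z_{\alpha}$.
$H_A: \mu_X - \mu_Y \neq D_0$. Reject $H_0$ if $|Z| > z_{\alpha/2}$.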
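The large-sample test can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual statistic $Z = (\bar{X} - \bar{Y} - D_0)/\sqrt{s_X^2/m + s_Y^2/n}$; the function names and the summary statistics are hypothetical and are not taken from the home sales data.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar, ybar, var_x, var_y, m, n, d0=0.0):
    """Large-sample Z statistic for H0: mu_X - mu_Y = d0."""
    standard_error = sqrt(var_x / m + var_y / n)
    return (xbar - ybar - d0) / standard_error


def z_p_value(z, alternative):
    """P-value under the standard Normal.

    alternative is 'less', 'greater', or 'two-sided'.
    """
    std_normal = NormalDist()
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        return 1.0 - std_normal.cdf(z)
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))


# Hypothetical summary statistics, chosen only so the arithmetic is easy.
z = two_sample_z(xbar=105.0, ybar=100.0, var_x=100.0, var_y=144.0, m=50, n=72)
print(z, z_p_value(z, "greater"))
```

Rejecting $H_0$ when the p-value is below $\alpha$ is equivalent to comparing $Z$ with the critical value $z_{\alpha}$ (or $z_{\alpha/2}$ for the two-sided case).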

Home sales data
A realtor in Albuquerque wants to argue that houses in the Northeast are more expensive on average than those in the rest of town. NE = 0 indicates a home was not in the Northeast. Test the appropriate hypotheses with $\alpha = 0.01$.

This is what the StatCrunch data looks like.

Here’s the output in StatCrunch

What does it look like?

Large samples confidence interval for the difference between two population means
A large sample $(1-\alpha)100\%$ confidence interval for $\mu_X - \mu_Y$ is
$$\bar{X} - \bar{Y} \pm z_{\alpha/2}\sqrt{s_X^2/m + s_Y^2/n}$$
For the home sales data, what is a 99% confidence interval for the difference between sale prices in the Northeast and the rest of town?
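As a rough sketch, the large-sample interval can be computed from summary statistics as follows, assuming the usual form $\bar{X} - \bar{Y} \pm z_{\alpha/2}\sqrt{s_X^2/m + s_Y^2/n}$. The function name and the numbers are hypothetical; they are not the home sales data.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(xbar, ybar, var_x, var_y, m, n, conf=0.99):
    """Large-sample confidence interval for mu_X - mu_Y."""
    alpha = 1.0 - conf
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    half_width = z_crit * sqrt(var_x / m + var_y / n)
    diff = xbar - ybar
    return diff - half_width, diff + half_width


# Hypothetical summary statistics for illustration only.
lower, upper = large_sample_ci(105.0, 100.0, 100.0, 144.0, 50, 72, conf=0.99)
print(lower, upper)
```

If the interval contains 0, the two-sided test at the matching level $\alpha$ does not reject $H_0: \mu_X - \mu_Y = 0$.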

Equal population variances
Suppose we assume that the two populations have a common variance $\sigma^2$. We can then estimate this common variance using the pooled sample variance:
$$s_p^2 = \frac{(m-1)s_X^2 + (n-1)s_Y^2}{m+n-2}$$

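Small samples test for comparing population means from Normal distributions with equal variances
To test $H_0: \mu_X - \mu_Y = D_0$, use the test statistic
$$T = \frac{\bar{X} - \bar{Y} - D_0}{s_p\sqrt{1/m + 1/n}}$$
$H_A: \mu_X - \mu_Y < D_0$. Reject $H_0$ if $T < -t_{\alpha,\,n+m-2}$.
$H_A: \mu_X - \mu_Y > D_0$. Reject $H_0$ if $T > t_{\alpha,\,n+m-2}$.
$H_A: \mu_X - \mu_Y \neq D_0$. Reject $H_0$ if $|T| > t_{\alpha/2,\,n+m-2}$.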
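A minimal sketch of the pooled calculation from raw data, assuming the standard pooled variance $s_p^2 = [(m-1)s_X^2 + (n-1)s_Y^2]/(m+n-2)$ and statistic $T = (\bar{X} - \bar{Y} - D_0)/(s_p\sqrt{1/m + 1/n})$. The function name and the two small samples are hypothetical and are not the THC data. The critical value still has to come from a t table or a calculator such as StatCrunch.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y, d0=0.0):
    """Pooled two-sample t statistic, its degrees of freedom, and s_p^2."""
    m, n = len(x), len(y)
    # statistics.variance uses the n - 1 denominator (sample variance).
    sp2 = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    t = (mean(x) - mean(y) - d0) / sqrt(sp2 * (1.0 / m + 1.0 / n))
    return t, m + n - 2, sp2


# Hypothetical samples for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, df, sp2 = pooled_t(x, y)
print(t_stat, df, sp2)
```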

THC example with equal variances
The active component in marijuana is THC. An experiment was conducted to compare two slightly different configurations of this substance. The THC data set contains the time until the effect was perceived for 6 subjects exposed to each configuration. Is there any evidence that the mean time to perception is different between the two configurations at $\alpha = 0.01$?

Here’s what the calculations look like. Pooled standard deviation

What does it look like? Twice the one tail value.

Small samples confidence interval for the difference between two population means
Assuming equal variances, a small sample $(1-\alpha)100\%$ confidence interval for $\mu_X - \mu_Y$ is
$$\bar{X} - \bar{Y} \pm t_{\alpha/2,\,n+m-2}\; s_p\sqrt{1/m + 1/n}$$
For the THC data, what is a 99% confidence interval for the mean difference between the detection times for the two configurations?
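The pooled interval can be sketched as below, assuming the usual form $\bar{X} - \bar{Y} \pm t_{\alpha/2,\,n+m-2}\, s_p\sqrt{1/m + 1/n}$. The Python standard library has no t quantile function, so the critical value is passed in from a table: with 6 subjects per group there are 10 degrees of freedom, and $t_{0.005,10} \approx 3.169$ for a 99% interval. The function name and the means and variances are hypothetical, not the THC measurements.

```python
from math import sqrt


def pooled_ci(xbar, ybar, var_x, var_y, m, n, t_crit):
    """Equal-variance confidence interval for mu_X - mu_Y.

    t_crit is t_{alpha/2, m+n-2}, looked up in a t table.
    """
    sp2 = ((m - 1) * var_x + (n - 1) * var_y) / (m + n - 2)
    half_width = t_crit * sqrt(sp2 * (1.0 / m + 1.0 / n))
    diff = xbar - ybar
    return diff - half_width, diff + half_width


T_CRIT_99_DF10 = 3.169  # t_{0.005, 10} from a t table

# Hypothetical summary statistics for two groups of 6 subjects.
lower, upper = pooled_ci(20.0, 17.0, 4.0, 8.0, 6, 6, T_CRIT_99_DF10)
print(lower, upper)
```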

Unequal population variances
The pooled procedures discussed previously are fairly robust to violations of the equal-variance assumption. In other words, if the two population variances are relatively close, the procedures perform well:
The level of significance for the hypothesis test is close to what it should be.
The coverage probability for the confidence interval is close to what it should be.
If the variances are quite different, then we need a different procedure.

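Small samples test for comparing population means from Normal distributions with unequal variances
To test $H_0: \mu_X - \mu_Y = D_0$, use the test statistic
$$T = \frac{\bar{X} - \bar{Y} - D_0}{\sqrt{s_X^2/m + s_Y^2/n}}$$
with degrees of freedom
$$\nu = \frac{\left(s_X^2/m + s_Y^2/n\right)^2}{\dfrac{(s_X^2/m)^2}{m-1} + \dfrac{(s_Y^2/n)^2}{n-1}}$$
$H_A: \mu_X - \mu_Y < D_0$. Reject $H_0$ if $T < -t_{\alpha,\nu}$.
$H_A: \mu_X - \mu_Y > D_0$. Reject $H_0$ if $T > t_{\alpha,\nu}$.
$H_A: \mu_X - \mu_Y \neq D_0$. Reject $H_0$ if $|T| > t_{\alpha/2,\nu}$.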
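A short sketch of the unequal-variance calculation, assuming the standard Welch statistic $T = (\bar{X} - \bar{Y} - D_0)/\sqrt{s_X^2/m + s_Y^2/n}$ and the Satterthwaite approximation for the degrees of freedom. The function name and sample values are hypothetical, not the THC data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y, d0=0.0):
    """Unequal-variance t statistic and approximate degrees of freedom."""
    m, n = len(x), len(y)
    vx = variance(x) / m  # s_X^2 / m
    vy = variance(y) / n  # s_Y^2 / n
    t = (mean(x) - mean(y) - d0) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (m - 1) + vy ** 2 / (n - 1))
    return t, df


# Hypothetical samples for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, df = welch_t(x, y)
print(t_stat, df)
```

The degrees of freedom are generally not an integer; when using a printed t table it is common to round down, which is the conservative choice. With equal sample sizes the statistic matches the pooled one, but the degrees of freedom are smaller whenever the sample variances differ.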

Small samples confidence interval for the difference between two population means, with unequal variances
Assuming unequal variances, a small sample $(1-\alpha)100\%$ confidence interval for $\mu_X - \mu_Y$ is
$$\bar{X} - \bar{Y} \pm t_{\alpha/2,\nu}\sqrt{s_X^2/m + s_Y^2/n}$$
For the THC data, what is a 99% confidence interval for the mean difference between the detection times for the two configurations?

Comparing two population variances
Suppose two chemical companies can supply a raw material, but we suspect the variability in concentration may differ between the two. The standard deviation of concentration in a random sample of 15 batches from company 1 was found to be 4.7 g/l (variance 22.09). A sample of 21 batches from company 2 yielded a standard deviation of 5.8 g/l (variance 33.64). Is there sufficient evidence to conclude that the variability in concentration differs for the two companies?

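Test for comparing population variances from Normal distributions
To test $H_0: \sigma_X^2 = \sigma_Y^2$, use the test statistic
$$F = \frac{s_X^2}{s_Y^2}$$
$H_A: \sigma_X^2 > \sigma_Y^2$. Reject $H_0$ if $F > F_{\alpha,\,m-1,\,n-1}$.
$H_A: \sigma_X^2 < \sigma_Y^2$. Reject $H_0$ if $F < F_{1-\alpha,\,m-1,\,n-1}$.
$H_A: \sigma_X^2 \neq \sigma_Y^2$. Reject $H_0$ if $F > F_{\alpha/2,\,m-1,\,n-1}$ or $F < F_{1-\alpha/2,\,m-1,\,n-1}$.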

Chemical example
Is there sufficient evidence to conclude that the variability in concentration differs for the two companies with $\alpha = 0.05$? Demonstrate the F calculator.
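The arithmetic for the chemical example can be sketched as follows, assuming the usual statistic $F = s_X^2/s_Y^2$ with company 1 as X and company 2 as Y. Only the statistic and its degrees of freedom are computed here. The critical values $F_{\alpha/2,\,m-1,\,n-1}$ and $F_{1-\alpha/2,\,m-1,\,n-1}$ still need an F table or calculator, since the Python standard library does not provide F quantiles. The function name is hypothetical.

```python
def f_statistic(sd_x, sd_y):
    """Ratio of sample variances, computed from sample standard deviations."""
    return sd_x ** 2 / sd_y ** 2


# Values quoted in the chemical example.
m, n = 15, 21            # batches sampled from company 1 and company 2
sd_1, sd_2 = 4.7, 5.8    # sample standard deviations in g/l

f_stat = f_statistic(sd_1, sd_2)
degrees_of_freedom = (m - 1, n - 1)
print(round(f_stat, 4), degrees_of_freedom)
```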

Confidence interval for the ratio of two Normal population variances
A $(1-\alpha)100\%$ confidence interval for $\sigma_X^2/\sigma_Y^2$ is
$$\left(\frac{s_X^2}{s_Y^2}\cdot\frac{1}{F_{\alpha/2,\,m-1,\,n-1}},\ \frac{s_X^2}{s_Y^2}\cdot\frac{1}{F_{1-\alpha/2,\,m-1,\,n-1}}\right)$$
For the chemical example, what is a 95% confidence interval for the ratio of concentration variances?
The additional file for Topic 8 contains examples of large and small scale tests on the differences in population means and proportions.

Paired data
Sometimes we have a third variable that connects elements from the X and Y samples. In this case, the assumption of independence between the two samples may be violated. Is there any evidence that the first twin and the second twin have different average weights among boy-boy twins? In this case, the twins are clearly connected by the mother. It might be better to base our test on the $n$ pairwise differences, $D_i = X_i - Y_i$.

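Paired test for comparing population means
To test $H_0: \mu_X - \mu_Y = D_0$, use the test statistic
$$T = \frac{\bar{D} - D_0}{s_D/\sqrt{n}}$$
where $\bar{D}$ and $s_D$ are the sample mean and standard deviation of the differences.
$H_A: \mu_X - \mu_Y < D_0$. Reject $H_0$ if $T < -t_{\alpha,\,n-1}$.
$H_A: \mu_X - \mu_Y > D_0$. Reject $H_0$ if $T > t_{\alpha,\,n-1}$.
$H_A: \mu_X - \mu_Y \neq D_0$. Reject $H_0$ if $|T| > t_{\alpha/2,\,n-1}$.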
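A minimal sketch of the paired calculation, assuming the standard statistic $T = (\bar{D} - D_0)/(s_D/\sqrt{n})$ computed on the pairwise differences $D_i = X_i - Y_i$. The function name and the four pairs are hypothetical and are not the Twins data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y, d0=0.0):
    """Paired t statistic and degrees of freedom from matched samples."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    t = (mean(diffs) - d0) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical matched measurements for illustration only.
x = [10.0, 12.0, 9.0, 11.0]
y = [8.0, 11.0, 9.0, 8.0]
t_stat, df = paired_t(x, y)
print(t_stat, df)
```

Working on the differences turns the problem into a one-sample t test with $n - 1$ degrees of freedom, which is why the pairing variable no longer has to be modeled explicitly.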

Twins example
Load the Twins data from StatCrunch sample data sets. Is there any evidence that Twin A and Twin B have different average weights among boy-boy twins with $\alpha = 0.1$?

Additional pooled vs. paired Example: The article “Sex and Race Discrimination in the New Car Showroom: A fact or Myth” (J. Consumer Affairs, 1977, pp 107-113) reports the results of an experiment in which individuals of different races and sexes visited 9 car dealerships to request the best possible deal on a certain car. The actual car prices obtained are shown below:

The standard deviations are relatively close, so we could consider this as a pooled test of differences, with the following results:

Two ways to look at the situation
Why did we get such poor results from our test? The assumption in a pooled test is that the data are independent. In other words, any value from the woman’s distribution of prices is independent of any value from the man’s distribution. A valid comparison in that situation looks like this.

However, we know that’s not the case. Prices from dealership 1 can be compared to each other (M to W), likewise for dealership 2, and so on. There’s a relationship between the prices, a “pairing variable”. They are not independent, and when viewed correctly, the data show something completely different.

Paired confidence interval for the difference between two population means
A small sample $(1-\alpha)100\%$ confidence interval for $\mu_X - \mu_Y$ is
$$\bar{D} \pm t_{\alpha/2,\,n-1}\,\frac{s_D}{\sqrt{n}}$$
For the car price example, what is a 90% confidence interval for the mean difference between the prices quoted to the black woman vs. the white man?
CarData

Comparing two population proportions
A natural statistic for comparing the two population proportions, $p_X$ and $p_Y$, is $\hat{p}_X - \hat{p}_Y$. The distribution of $\hat{p}_X - \hat{p}_Y$ is also approximately Normal for $m$ and $n$ both large.

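Large samples test for comparing population proportions
To test $H_0: p_X - p_Y = 0$, use the test statistic
$$Z = \frac{\hat{p}_X - \hat{p}_Y}{\sqrt{\hat{p}(1-\hat{p})\left(1/m + 1/n\right)}}$$
$H_A: p_X - p_Y < 0$. Reject $H_0$ if $Z < -z_{\alpha}$.
$H_A: p_X - p_Y > 0$. Reject $H_0$ if $Z > z_{\alpha}$.
$H_A: p_X - p_Y \neq 0$. Reject $H_0$ if $|Z| > z_{\alpha/2}$.
Please note that the common $\hat{p}$ in the statistic is calculated as the total number of successes overall in the study, divided by the total number of observations.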
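The pooled-proportion test can be sketched as below, assuming the standard statistic $Z = (\hat{p}_X - \hat{p}_Y)/\sqrt{\hat{p}(1-\hat{p})(1/m + 1/n)}$ with $\hat{p}$ equal to total successes divided by total observations. The function name and the counts are hypothetical and do not come from the polio study.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(successes_x, m, successes_y, n):
    """Z statistic for H0: p_X - p_Y = 0 using the pooled proportion."""
    p_x = successes_x / m
    p_y = successes_y / n
    # Common p: total successes over total observations.
    p_pool = (successes_x + successes_y) / (m + n)
    standard_error = sqrt(p_pool * (1.0 - p_pool) * (1.0 / m + 1.0 / n))
    return (p_x - p_y) / standard_error, p_pool


# Hypothetical counts: 40 of 200 successes in group X, 60 of 200 in group Y.
z, p_pool = two_proportion_z(40, 200, 60, 200)
lower_tail_p = NormalDist().cdf(z)  # p-value for H_A: p_X - p_Y < 0
print(z, p_pool, lower_tail_p)
```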

Polio example
The following table summarizes a study of the efficacy of the Salk vaccine. (Please note that I changed the actual percentages who got polio in this example to make the numbers MUCH more workable. Don’t panic.) Was the vaccine effective? Test at $\alpha = 0.05$.
Treatment: Total Patients, Polio
Vaccine: 2,000, 30
Placebo: 100

Large samples confidence interval for the difference between two population proportions
A large sample $(1-\alpha)100\%$ confidence interval for $p_X - p_Y$ is
$$\hat{p}_X - \hat{p}_Y \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{m} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n}}$$
For the Polio data, what is a 95% confidence interval for the difference between the proportion who contract the disease under each treatment?
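A sketch of the interval for a difference in proportions, assuming the usual unpooled form $\hat{p}_X - \hat{p}_Y \pm z_{\alpha/2}\sqrt{\hat{p}_X(1-\hat{p}_X)/m + \hat{p}_Y(1-\hat{p}_Y)/n}$. Unlike the test, the interval does not assume the two proportions are equal, so each sample contributes its own variance estimate. The function name and the counts are hypothetical, not the polio data.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_ci(successes_x, m, successes_y, n, conf=0.95):
    """Large-sample confidence interval for p_X - p_Y (unpooled)."""
    p_x = successes_x / m
    p_y = successes_y / n
    z_crit = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)  # z_{alpha/2}
    half_width = z_crit * sqrt(p_x * (1.0 - p_x) / m + p_y * (1.0 - p_y) / n)
    diff = p_x - p_y
    return diff - half_width, diff + half_width


# Hypothetical counts: 40 of 200 in group X, 60 of 200 in group Y.
lower, upper = proportion_diff_ci(40, 200, 60, 200, conf=0.95)
print(lower, upper)
```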