Difference Between the Means of Two Populations

Presentation transcript:

1 Difference Between the Means of Two Populations

2 We have been studying inference methods for a single variable. When the variable was quantitative we had inference for the population mean. When the variable was qualitative we had inference for the population proportion. Now we want to study inference methods for two variables. Both variables could be quantitative, both qualitative, or one of each. Depending on which we have, we will look to certain techniques. At this stage of the game we will begin to look at these different methods. I want to start with one quantitative variable and one qualitative variable. In fact the qualitative variable is special here: the variable identifies membership in one of only two groups. Then on the quantitative variable we sort each observation into the appropriate group and think about the mean of the quantitative variable in each group.

3 Our context here is that we really want to know about the populations of the two groups, but we will only take a sample from each group. We will look at both confidence intervals and hypothesis tests in this context. Some notation: μi = the population mean of group i for i = 1, 2. σi = the population standard deviation for group i for i = 1, 2. x̄i = the sample mean of group i for i = 1, 2. si = the sample standard deviation for group i for i = 1, 2. Now for ease of typing I will call the population means mu1 or mu2, population standard deviations sigma1 or sigma2, sample means xbar1 or xbar2, and sample standard deviations s1 or s2. n1 is the sample size from population 1 and n2 has similar meaning.

4 Our context for inference is really the difference in means: mu1 minus mu2. So we are checking to see what difference there is in the means from the two groups. Our point estimator will be xbar1 minus xbar2. In a repeated sampling context the point estimator would vary from sample to sample. As an example say I want to check the average age of students in the economics program and the finance program. One sample from each group would yield one estimate and the estimate would likely be different when I get a different sample (from each major). Also note the sample obtained from group 1 is independent of the sample obtained from group 2. The sampling distribution of xbar1 minus xbar2 will be studied next.

5 The sampling distribution of xbar1 minus xbar2. Case 1 – we can use the normal distribution for the sampling distribution when sigma1 and sigma2 are known. This means we will use Z in our confidence intervals and hypothesis tests. The center of the sampling distribution is mu1 minus mu2 and the standard error is square root[((sigma1^2)/n1)+((sigma2^2)/n2)] (note that x^2 means x squared). Case 2 – we can use the t distribution for the sampling distribution when sigma1 and sigma2 are unknown. This means we will use t in our confidence intervals and hypothesis tests. The center of the sampling distribution is mu1 minus mu2 and the standard error is the denominator of equation 10.2 on page 313. This is not pretty, but we must use it. Note that when using a t distribution one needs a degrees of freedom value. In our current context the value is n1 + n2 – 2.
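
The two standard errors can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and the case 2 formula is assumed to be the pooled-variance form that the worked problem on slide 12 uses (S squared sub p times (1/n1 + 1/n2), under a square root).

```python
import math

def se_known_sigmas(sigma1, n1, sigma2, n2):
    """Case 1: standard error of xbar1 - xbar2 when sigma1 and sigma2 are known."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

def pooled_variance(s1, n1, s2, n2):
    """S squared sub p: the sample variances weighted by their degrees of freedom."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def se_pooled(s1, n1, s2, n2):
    """Case 2 (assumed pooled form): standard error when the sigmas are unknown."""
    return math.sqrt(pooled_variance(s1, n1, s2, n2) * (1 / n1 + 1 / n2))

def degrees_of_freedom(n1, n2):
    """Degrees of freedom for the t distribution in case 2."""
    return n1 + n2 - 2

# Numbers taken from the worked problems on slides 11 and 12.
print(se_known_sigmas(20, 40, 10, 50))  # sqrt(12)
print(pooled_variance(4, 8, 5, 15))     # sample variances 16 and 25
print(degrees_of_freedom(8, 15))
```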

6 Inference for case 1. Confidence interval: We are C% confident the unknown population difference mu1 minus mu2 is in the interval (xbar1 minus xbar2) ± MOE, where MOE = margin of error, and this equals the appropriate Z times the standard error of the sampling distribution. Remember if C = 95 then Z = 1.96, if C = 90 then Z = 1.645, and if C = 99 then Z = 2.58.
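
As a rough sketch of the case 1 interval, the Python below computes (xbar1 minus xbar2) ± MOE using the standard library's normal distribution. The function name and the sample values in the usage line are illustrative assumptions, not part of the original slides.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar1, xbar2, sigma1, n1, sigma2, n2, confidence=0.95):
    """Interval for mu1 - mu2 when both population standard deviations are known."""
    # Z that leaves (1 - confidence)/2 in each tail, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    moe = z * se  # margin of error
    diff = xbar1 - xbar2
    return diff - moe, diff + moe

# Illustrative values only: means 72 and 65, sigmas 9 and 10, 100 observations each.
low, high = z_confidence_interval(72, 65, 9, 100, 10, 100, confidence=0.95)
print(f"95% interval for mu1 - mu2: ({low:.2f}, {high:.2f})")
```

If the interval does not contain zero, that is consistent with a real difference between the two population means at the chosen confidence level.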

7 Hypothesis Test. Recall from our past work that in a hypothesis test context we have a null and an alternative hypothesis. Plus, the form of the alternative hypothesis will determine if we have a one or a two tailed test. Two tailed test: When we study the difference in the means from two populations, if we feel there is a difference of Do, but are not concerned about the difference being positive or negative, then the null and alternative hypotheses are Ho: mu1 minus mu2 = Do, Ha: mu1 minus mu2 ≠ Do, and we have a two tailed test. Based on an alpha value (the probability of a type I error), we pick critical values of Z, and if the calculated Z is more extreme than either critical value we reject the null and go with the alternative.

8 The calculated value of Z from the sample information = [(xbar1 minus xbar2) minus Do] divided by the standard error listed on slide 5 with case 1. Another way to think of the hypothesis test is with the use of the p-value for the calculated Z. If the p-value < alpha, reject the null. Otherwise you have to stick with the null. In practice with a two tailed test you will find the p-value as the area on one side of the distribution, but you must double it to cover both sides.
[Figure: normal curve with an area of alpha/2 in each tail, bounded by the lower critical Z and the upper critical Z.]
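
The two tailed procedure can be sketched as follows. The function name is hypothetical; the standard deviations and sample sizes echo the worked problem on slide 11, while the sample means 56 and 50 are invented so that their difference is 6.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(xbar1, xbar2, sigma1, n1, sigma2, n2, d0=0.0, alpha=0.05):
    """Two tailed Z test of Ho: mu1 - mu2 = d0 with known sigmas."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (xbar1 - xbar2 - d0) / se
    # Upper critical value; the lower one is its negative.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Area in one tail beyond |z|, doubled to cover both sides.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    reject = abs(z) > critical  # same decision as p_value < alpha
    return z, critical, p_value, reject

z, critical, p_value, reject = two_tailed_z_test(56, 50, 20, 40, 10, 50, alpha=0.01)
print(f"Z = {z:.2f}, critical Z = ±{critical:.2f}, p-value = {p_value:.4f}, reject: {reject}")
```

The critical value rule and the p-value rule always give the same decision, so either one can be reported.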

9 One tailed test. When the researcher has the feeling that the difference in mu1 and mu2 should be positive, then the alternative will reflect this feeling and we will have Ho: mu1 minus mu2 ≤ Do, Ha: mu1 minus mu2 > Do. The signs are reversed when the researcher feels the difference should be negative. The test proceeds in the same fashion as the two-tailed test, except the focus is just on one side of the distribution as directed by the alternative hypothesis. Note that Do is often zero. In that case we just want to see if the group means are different.

10 Common critical Z’s. Columns: Two tailed test | One tailed test (negative if on left). Rows: Alpha = , Alpha = , Alpha =
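
Critical Z values for any alpha can be computed rather than looked up. The sketch below assumes the usual alpha levels of .10, .05, and .01 (an assumption for illustration) and uses a hypothetical helper named critical_z.

```python
from statistics import NormalDist

def critical_z(alpha, tails=2):
    """Positive critical Z for a one or two tailed test at the given alpha."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two tailed test puts alpha/2 in each tail; a one tailed test puts all of alpha in one.
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.10, 0.05, 0.01):
    two = critical_z(alpha, tails=2)
    one = critical_z(alpha, tails=1)
    print(f"alpha = {alpha:.2f}   two tailed: +/-{two:.3f}   one tailed: {one:.3f}")
```

For a one tailed test on the left side of the distribution, use the negative of the one tailed value.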

11 Problems 1, 2, 3 page 319. 1) Zstat = [(xbar1 minus xbar2) – 0]/SQRT[((20^2)/40) + ((10^2)/50)] = 6/sqrt[(400/40) + (100/50)] = 6/sqrt(12) = 1.73. 2) With alpha = .01 and a two tailed test we have .01/2 = .005 in each tail, so the critical Z’s are -2.58 and 2.58. Since 1.73 is in the middle we do not reject the null. So we cannot conclude that the population means differ. 3) The tail area for Z = 1.73 is (1 – .9582) = .0418, and we double it because of the two tail test for a p-value of .0836. Since this is greater than .01 we cannot reject the null.

12 Inference for case 2. Inference for case 2 is similar to case 1, except in how the standard error is calculated (shown on slide 5) and in using t. Let’s do problem 4 page 319. a) Looking at page 313, let’s first calculate the S squared sub p amount: [(7)16 + (14)25]/21 = [112 + 350]/21 = 22. The tstat = [(xbar1 minus xbar2) – 0]/sqrt[22((1/8) + (1/15))] = 8/sqrt((22)(23/120)) = 8/2.05 = 3.90 (book answer is different due to rounding). b) Df = n1 + n2 – 2 = 8 + 15 – 2 = 21. c) The critical t comes from the t table with df = 21. d) Since our tstat of 3.90 is greater than the critical t, we reject the null and conclude mu1 > mu2.
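
The arithmetic in problem 4 can be checked with a short sketch. The function name is hypothetical; the inputs (a difference in sample means of 8, sample variances 16 and 25, sample sizes 8 and 15) are read off the calculation above. The critical t is not computed here because the Python standard library has no t distribution.

```python
import math

def pooled_t_stat(xbar_diff, s1_sq, n1, s2_sq, n2, d0=0.0):
    """Pooled-variance t statistic for Ho: mu1 - mu2 = d0."""
    df = n1 + n2 - 2
    # S squared sub p: sample variances weighted by their degrees of freedom.
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return (xbar_diff - d0) / se, sp_sq, df

t_stat, sp_sq, df = pooled_t_stat(8, 16, 8, 25, 15)
print(f"S squared sub p = {sp_sq:.2f}, df = {df}, tstat = {t_stat:.2f}")
```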

13 Problem 8 page 320. a) Let’s say mu1 = mean strength of the new machine and mu2 = mean strength of the old machine. Ho: mu1 ≤ mu2, H1: mu1 > mu2. With a one tail test (upper tail) the critical Z (use Z because the population standard devs are known) with alpha = .01 is 2.33. The Zstat from the sample information is (72 – 65 – 0)/sqrt[(81/100) + (100/100)] = 7/sqrt(181/100) = 5.20. This is more extreme than the critical value so we reject the null. There is evidence to get the new machine. b) The Zstat = 5.20 has a p-value (upper tail) of approximately 0, which is < .01. Reject the null.
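
A minimal sketch of the upper tail test in problem 8, using the numbers from the calculation above (sample means 72 and 65, population variances 81 and 100, 100 observations in each sample). The function name is an assumption made for illustration.

```python
import math
from statistics import NormalDist

def upper_tail_z_test(xbar1, xbar2, var1, n1, var2, n2, d0=0.0, alpha=0.01):
    """One tailed (upper) Z test of Ho: mu1 - mu2 <= d0 with known variances."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    z = (xbar1 - xbar2 - d0) / se
    critical = NormalDist().inv_cdf(1 - alpha)  # all of alpha sits in the upper tail
    p_value = 1 - NormalDist().cdf(z)           # upper tail area, not doubled
    return z, critical, p_value, z > critical

z, critical, p_value, reject = upper_tail_z_test(72, 65, 81, 100, 100, 100)
print(f"Zstat = {z:.2f}, critical Z = {critical:.2f}, p-value = {p_value:.2e}, reject: {reject}")
```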

14 Problem 12 page 321. a) Let’s say mu1 = mean score for men and mu2 = mean score for women. Ho: mu1 = mu2, H1: mu1 ≠ mu2. With a two tail test the critical t’s (use t because the population standard devs are UNknown) with alpha = .05 and df = 170 are -1.98 and 1.98 (df = 120 is as close as we can get in the table). The S squared sub p in the formula is [99(13.35)^2 + 71(9.42)^2]/[99 + 71] = 140.85. The tstat from the sample information is (40.26 – 36.85 – 0)/sqrt[140.85((1/100) + (1/72))] = 3.41/1.83 = 1.86. This is between the critical values so we cannot reject the null. The evidence does not show a difference between the population means for males and females. b) The tstat = 1.86 has an upper tail area between .025 and .05, and doubling for the two tail test gives a p-value between .05 and .10, which is > .05. Do not reject the null.