Testing a Hypothesis about means

Testing a Hypothesis about means The contents of this chapter are from Chapters 12 to 14 of the textbook. Topics: testing a single mean, testing two related means, and testing two independent means.

Testing a single mean This chapter uses the gssft.sav data, which includes data for full-time workers only. The variables are: Hrsl: number of hours worked last week; Agecat: age category; Rincome: respondent's income.

Example The left plot is a histogram of the number of hours worked in the previous week for 437 college graduates. The peak at 40 hours is higher than you would expect for a normal distribution. There is also a tail toward larger values of hours worked. It appears that people are more likely to work a long week than a short week.

Example basic statistics The sample mean (47) is not equal to the sample median (45). The distribution is right-skewed, which is consistent with the skewness Sk = 1.24, so the distribution is not normal. How would you go about determining whether 47 is an unlikely sample value if the population mean is 40?

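Testing a single mean The variance is unknown, so the test uses the statistic $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, where $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size, and $\mu_0$ the hypothesized mean. For a two-tailed test at level $\alpha$, the rejection region is $|t| > t_{\alpha/2}(n-1)$. The critical value of t can be found in many textbooks or with SPSS.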

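Testing a single mean The standard error of the mean is $s / \sqrt{n}$, with $n = 437$. The t statistic is $t = (\bar{x} - 40) / (s / \sqrt{n})$. The 95% confidence interval of the difference is $(\bar{x} - 40) \pm t_{0.025}(436)\, s / \sqrt{n}$.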

The t-distribution The statistic used on the previous slide follows a t-distribution with n-1 degrees of freedom. This is a 2-tailed test. The p-value is the probability that a sample t value is greater than 14.3 or less than -14.3. The p-value in this example is less than 0.0005. We can conclude that it is quite unlikely that college graduates work a 40-hour week on average.
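
As a rough check on the p-value quoted above, the two-tailed tail area can be computed from the reported t value and degrees of freedom. This is a sketch in Python with SciPy, which the slides do not use (they rely on SPSS); only t = 14.3 and n = 437 are taken from the slides.

```python
from scipy import stats

t_value = 14.3   # t statistic reported on the slide
n = 437          # sample size reported on the slide
df = n - 1       # degrees of freedom for a one-sample t test

# Two-tailed p-value: P(T > 14.3) + P(T < -14.3) for T ~ t(df)
p_value = 2 * stats.t.sf(abs(t_value), df)

print(f"df = {df}, two-tailed p-value = {p_value:.3g}")
```

The survival function `sf` is used instead of `1 - cdf` because it stays accurate for very small tail probabilities.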

Normal approximation The degrees of freedom in this test are 437-1=436. With this many degrees of freedom, the t distribution is very close to the standard normal distribution, so the critical values or confidence interval can be determined from the normal distribution.
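
To see how close the two distributions are at 436 degrees of freedom, one can compare their two-tailed 5% critical values. This is an illustrative sketch in Python with SciPy, not part of the original SPSS workflow.

```python
from scipy import stats

df = 437 - 1                      # degrees of freedom from the slide

t_crit = stats.t.ppf(0.975, df)   # 97.5th percentile of t(436)
z_crit = stats.norm.ppf(0.975)    # 97.5th percentile of the standard normal

print(f"t critical value: {t_crit:.4f}")
print(f"z critical value: {z_crit:.4f}")
print(f"difference:       {t_crit - z_crit:.4f}")
```

The t critical value is always slightly larger than the normal one, but at this sample size the gap is too small to change the conclusion.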

The 95% confidence interval is given by

Hypothesis Testing The p-value is the probability of getting a test statistic equal to or more extreme than the sample result, given that the null hypothesis is true.

Testing a Hypothesis about Two related means We use the endoph.sav data set provided by the author. Dale et al. (1987) investigated the possible role of β-endorphins in the collapse of runners. β-endorphins are morphine-like substances manufactured in the body. They measured plasma β-endorphin concentrations for 11 runners before and after they participated in a half-marathon run. The question of interest was whether average β-endorphin levels changed during a run.

Testing a Hypothesis about Two related means The paired-samples t test is the recommended method for this problem, because each runner is measured twice (before and after the run).

Testing a Hypothesis about Two related means The average difference is 18.74, which is large compared with the standard deviation of the differences (S.D. = 8.3). The 95% confidence interval for the average difference is (13.14, 24.33). Because it does not include the value 0, you can reject the hypothesis that the mean difference is zero. An equivalent way of testing the hypothesis is the t test. The p-value is less than 0.0005, so we should reject the hypothesis.
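
The interval and p-value above can be approximately reproduced from the summary statistics alone. The sketch below uses Python with SciPy (an assumption; the slides use SPSS) and takes the mean difference, standard deviation, and n = 11 from the slides. Because the standard deviation is reported to one decimal place, the result matches the SPSS interval only approximately.

```python
import math
from scipy import stats

mean_diff = 18.74   # average after-minus-before difference (from the slide)
sd_diff = 8.3       # standard deviation of the differences (rounded on the slide)
n = 11              # number of runners

se = sd_diff / math.sqrt(n)            # standard error of the mean difference
t_stat = mean_diff / se                # tests H0: mean difference = 0
t_crit = stats.t.ppf(0.975, n - 1)     # two-tailed 5% critical value, 10 df

ci_low = mean_diff - t_crit * se
ci_high = mean_diff + t_crit * se
p_value = 2 * stats.t.sf(abs(t_stat), n - 1)

print(f"t = {t_stat:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f}), p = {p_value:.2g}")
```

With the raw before and after measurements available, `scipy.stats.ttest_rel(after, before)` would give the same test directly.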

Testing a Hypothesis about Two related means diff Stem-and-Leaf Plot
 Frequency    Stem &  Leaf
     1.00        0 .  3
     4.00        1 .  0127
     5.00        2 .  00458
     1.00        3 .  0
 Stem width: 10.00
 Each leaf: 1 case(s)
Each difference uses only the first two digits with rounding.

Testing a Hypothesis about Two related means All the differences are positive. That is, the after values are always greater than the before values. The stem-and-leaf plot doesn't suggest any obvious departures from normality. A normal probability plot, or Q-Q plot, can help us assess the normality of the data.

Normal Probability Plot For each data point, the Q-Q plot shows the observed value and the value that is expected if the data are a sample from a normal distribution. The points should cluster around a straight line if the data are from a normal distribution. The normal Q-Q plot of the difference variable is more or less linear, so the assumption of normality appears to be reasonable.
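
The pairing of observed and expected values behind a Q-Q plot can be sketched numerically. The example below uses Python with SciPy (an assumption; the slides use SPSS), and the differences are two-digit approximations read off the stem-and-leaf plot, not the exact endoph.sav values.

```python
from scipy import stats

# Approximate differences read from the stem-and-leaf plot (stem width 10)
diffs = [3, 10, 11, 12, 17, 20, 20, 24, 25, 28, 30]

# probplot pairs each ordered observation with its expected normal quantile
# and fits a least-squares line through the resulting points.
(theoretical_q, ordered_vals), (slope, intercept, r) = stats.probplot(diffs, dist="norm")

for q, v in zip(theoretical_q, ordered_vals):
    print(f"expected normal quantile {q:6.2f}   observed {v:5.1f}")
print(f"correlation of the Q-Q points: r = {r:.3f}")
```

A correlation close to 1 means the points lie near a straight line, which is the numerical counterpart of the visual check described above.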

Testing Two Independent Means This section uses the gss.sav data set. Consider the number of hours of television viewing per day reported by internet users and non-users. It is clear that neither distribution is normal.

Testing Two Independent Means We find that there are some problems in the data. There are people who report watching television for 24 hours a day, which is impossible. "Watching TV" is not a very well-defined term. If you have the TV on while you are doing homework, are you studying or watching TV? The observations in the two groups are independent, so this is a comparison of two independent means.

Testing Two Independent Means The two sample means are 2.42 hours of TV viewing per day for internet users and 3.52 hours for those who don't use the internet, a difference of about 1.1 hours. The 5% trimmed means, which are calculated by removing the top and bottom 5% of the values, are 0.3 hours less than the arithmetic means for both groups. The trimmed means are more meaningful in this case study because they are less affected by extreme reports.
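
A 5% trimmed mean can be sketched as follows. The example uses Python with SciPy (an assumption; the slides use SPSS), and the `tv_hours` values are hypothetical, made up for illustration, not taken from gss.sav.

```python
from scipy import stats

# Hypothetical daily TV hours with one extreme report (NOT the gss.sav data)
tv_hours = [0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 6, 8, 24]

mean_all = sum(tv_hours) / len(tv_hours)

# trim_mean sorts the data and drops 5% of the observations from each end
# (here 1 of 20 at each end) before averaging the rest.
trimmed = stats.trim_mean(tv_hours, 0.05)

print(f"arithmetic mean: {mean_all:.2f}")
print(f"5% trimmed mean: {trimmed:.2f}")
```

In this made-up sample the single 24-hour report pulls the arithmetic mean upward, while the trimmed mean discards it along with the smallest value.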

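Testing Two Independent Means For testing the hypothesis $H_0: \mu_1 = \mu_2$ that the two population means are equal, there are several cases, depending on whether the population variances are known and whether they can be assumed equal.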

Testing Two Independent Means In most cases the variances are unknown.

Testing Two Independent Means Output from t test for TV watching hours

Testing Two Independent Means In the output, there are two different versions of the t test. One makes the assumption that the variances in the two populations are equal; the other does not. Both tests reject the hypothesis of equal means, with an observed significance level less than 0.0005. Both are two-tailed tests. Testing the equality of two variances is covered in the next section.
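
The two versions of the test can be sketched with SciPy's `ttest_ind`, switching the `equal_var` argument. This is an illustration in Python, not the SPSS procedure from the slides, and the two samples are simulated right-skewed data with made-up parameters, not the gss.sav observations.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical right-skewed samples (NOT gss.sav): means about 2.4 and 3.5 hours
users = rng.gamma(shape=2.0, scale=1.2, size=400)
non_users = rng.gamma(shape=2.0, scale=1.75, size=400)

# Pooled-variance t test: assumes equal population variances
pooled = stats.ttest_ind(non_users, users, equal_var=True)

# Welch t test: does not assume equal variances
welch = stats.ttest_ind(non_users, users, equal_var=False)

print(f"equal variances assumed:     t = {pooled.statistic:.2f}, p = {pooled.pvalue:.3g}")
print(f"equal variances not assumed: t = {welch.statistic:.2f}, p = {welch.pvalue:.3g}")
```

When the group sizes differ and the variances are unequal, the two versions can disagree, which is why the output reports both.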

Testing Two Independent Means The 95% confidence interval for the true difference is [0.77, 1.42] for equal variances not assumed, and [0.76, 1.42] for equal variances assumed. Neither interval covers the value 0, so we should reject the hypothesis of equal means.

F test for equality of Two Variances From the results below, the critical value is close to 1.00. Because the observed ratio of the sample variances exceeds it, we reject the hypothesis that the two populations have the same variance.
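
A variance-ratio F test can be sketched as below, which also shows why the critical value is close to 1 when both samples are large. The code is Python with NumPy and SciPy (an assumption; the slides use SPSS), and the two groups are simulated with made-up parameters, not the gss.sav data. Note that this F test relies on the populations being normal.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical normal samples with different spreads (NOT gss.sav)
group_a = rng.normal(loc=0.0, scale=2.5, size=500)
group_b = rng.normal(loc=0.0, scale=1.7, size=500)

var_a = np.var(group_a, ddof=1)   # sample variances (divide by n - 1)
var_b = np.var(group_b, ddof=1)

# Put the larger variance in the numerator; both groups have 500 - 1 df here
f_ratio = max(var_a, var_b) / min(var_a, var_b)
dfn = dfd = 500 - 1

f_crit = stats.f.ppf(0.975, dfn, dfd)                    # two-tailed 5% critical value
p_value = min(1.0, 2 * stats.f.sf(f_ratio, dfn, dfd))    # two-tailed p-value

print(f"F = {f_ratio:.2f}, critical value = {f_crit:.3f}, p = {p_value:.3g}")
```

With about 500 degrees of freedom in each group, even a modest difference between the sample variances pushes the ratio past the critical value.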

Levene's test for equality of variances The SPSS report uses Levene's test (1960), which tests whether k samples have equal variances. Equal variances across samples is called homogeneity of variance. Levene's test is less sensitive to departures from normality than some other tests. The SPSS output leads us to reject the hypothesis of equal variances.
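
Levene's test is available in SciPy as `scipy.stats.levene`. The sketch below is in Python rather than SPSS and runs on simulated skewed samples with made-up parameters, not the gss.sav data. The `center="mean"` option corresponds to Levene's original proposal; `center="median"` is SciPy's default and is more robust for skewed data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical right-skewed samples with different spreads (NOT gss.sav)
users = rng.gamma(shape=2.0, scale=1.2, size=400)
non_users = rng.gamma(shape=2.0, scale=1.75, size=400)

# Levene's test compares the groups' absolute deviations from their centers
stat_mean, p_mean = stats.levene(users, non_users, center="mean")
stat_median, p_median = stats.levene(users, non_users, center="median")

print(f"center = mean:   W = {stat_mean:.2f}, p = {p_mean:.3g}")
print(f"center = median: W = {stat_median:.2f}, p = {p_median:.3g}")
```

A small p-value is evidence against homogeneity of variance, in which case the "equal variances not assumed" row of the t test output is the safer one to read.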

Effect of Outliers Some people reported watching TV for a very long time, including 24 hours a day. We removed the observations where the person reported watching TV for more than 12 hours a day.

Effect of Outliers The average difference between the two groups is reduced from 1.09 to 1.05. The conclusions do not change.
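
The outlier check amounts to filtering on a cutoff and recomputing the difference in means. The sketch below uses Python with NumPy (an assumption; the slides use SPSS); the data and the helper `mean_difference` are hypothetical, and only the 12-hour cutoff comes from the slides.

```python
import numpy as np

# Hypothetical daily TV hours (NOT gss.sav), including implausible reports
users = np.array([1, 2, 2, 3, 3, 4, 24], dtype=float)
non_users = np.array([2, 3, 3, 4, 5, 6, 20, 24], dtype=float)

def mean_difference(a, b, cutoff=None):
    """Mean of b minus mean of a, optionally keeping only values <= cutoff."""
    if cutoff is not None:
        a = a[a <= cutoff]
        b = b[b <= cutoff]
    return b.mean() - a.mean()

diff_all = mean_difference(users, non_users)
diff_filtered = mean_difference(users, non_users, cutoff=12)   # drop reports above 12 hours

print(f"difference with all observations:   {diff_all:.2f}")
print(f"difference after the 12-hour cutoff: {diff_filtered:.2f}")
```

In this tiny made-up sample the change is large because a few extreme values dominate; in the survey data the slides report only a small shift, so the conclusion is unaffected there.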

Introducing More Variables Let us consider additional variables that may be related to TV watching time: age, education, and working hours.

Introducing More Variables We reject the hypothesis that in the population the two groups have the same average age, education, and hours. Internet users are significantly younger, better educated, and work more hours per week.
