The mean and the std. dev. of the sample mean


The mean and the std. dev. of the sample mean
Select an SRS of size n from a population and measure a variable X on each individual in the sample. The data consist of observations on n random variables $X_1, X_2, \dots, X_n$. If the population is large, we can consider $X_1, X_2, \dots, X_n$ to be independent.
The sample mean of an SRS of size n is $\bar{X} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$.
If the population has mean $\mu$ and std. dev. $\sigma$, what is the mean of the total $T = X_1 + X_2 + \cdots + X_n$?
Answer: $\mu_T = \mu_{X_1 + X_2 + \cdots + X_n} = \mu_{X_1} + \mu_{X_2} + \cdots + \mu_{X_n} = n\mu$.
week9

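Mean of the sample mean? Since $\bar{X} = T/n$, $\mu_{\bar{X}} = \mu_T / n = \mu$.
Variance of the total T? Because the observations are independent, $\sigma_T^2 = \sigma_{X_1}^2 + \cdots + \sigma_{X_n}^2 = n\sigma^2$.
Variance of the sample mean? $\sigma_{\bar{X}}^2 = \sigma_T^2 / n^2 = \sigma^2 / n$, so the std. dev. of the sample mean is $\sigma_{\bar{X}} = \sigma / \sqrt{n}$.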

Sampling distribution of a sample mean
If a population has the $N(\mu, \sigma)$ distribution, then the sample mean $\bar{X}$ of n independent observations has the $N(\mu, \sigma/\sqrt{n})$ distribution.
Example: A bottling company uses a filling machine to fill plastic bottles with a popular cola. The bottles are supposed to contain 300 milliliters (ml). In fact, the contents vary according to a normal distribution with mean 298 ml and standard deviation 3 ml.
(a) What is the probability that an individual bottle contains less than 295 ml?
(b) What is the probability that the mean contents of the bottles in a six-pack is less than 295 ml?
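The two parts of the bottling example differ only in the standard deviation used: $\sigma$ for a single bottle and $\sigma/\sqrt{n}$ for the six-pack mean. A minimal sketch using Python's standard library (the variable names are illustrative, not from the slides):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 298.0, 3.0   # population mean and std. dev. in ml (from the example)
n = 6                    # bottles in a six-pack

# (a) One bottle: X ~ N(mu, sigma)
p_single = NormalDist(mu, sigma).cdf(295)

# (b) Mean of six bottles: Xbar ~ N(mu, sigma / sqrt(n))
p_six_pack = NormalDist(mu, sigma / sqrt(n)).cdf(295)

print(f"(a) P(X < 295)    = {p_single:.4f}")
print(f"(b) P(Xbar < 295) = {p_six_pack:.4f}")
```

Averaging shrinks the spread, so a six-pack mean below 295 ml is far less likely than a single bottle below 295 ml.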

The central limit theorem
Draw an SRS of size n from a population with mean $\mu$ and std. dev. $\sigma$. When n is large, the sampling distribution of the sample mean $\bar{X}$ is approximately normal with mean $\mu$ and std. dev. $\sigma/\sqrt{n}$.
Note: The normal approximation for the sample proportion and counts is an important example of the central limit theorem.
Note: The total $T = X_1 + X_2 + \cdots + X_n$ is approximately normal with mean $n\mu$ and std. dev. $\sqrt{n}\cdot\sigma$.

Example (Question 24 Final Dec 98)
Suppose that the weights of airline passengers are known to have a distribution with a mean of 75 kg and a std. dev. of 10 kg. A certain plane has a passenger weight capacity of 7700 kg. What is the probability that a flight of 100 passengers will exceed the capacity?
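One way to work this example is to apply the central limit theorem to the total weight T, which is approximately normal with mean $n\mu$ and std. dev. $\sqrt{n}\,\sigma$. The sketch below assumes the 100 passenger weights are independent:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 75.0, 10.0   # passenger weight mean and std. dev. in kg
n = 100                  # passengers on the flight
capacity = 7700.0        # kg

# By the CLT, T = X1 + ... + Xn is approximately N(n*mu, sqrt(n)*sigma).
total = NormalDist(n * mu, sqrt(n) * sigma)

z = (capacity - total.mean) / total.stdev
p_exceed = 1 - total.cdf(capacity)

print(f"z = {z:.2f}, P(T > {capacity:.0f}) = {p_exceed:.4f}")
```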

Example
In a certain university, the course STA100 has tutorials of size 40, the course STA200 has tutorials of size 25, and the course STA300 has tutorials of size 15. Each course has 5 tutorials per year. Students are enrolled by computer one by one into tutorials. Assume that each student being enrolled by computer may be considered a random selection from a very big group of people wherein there is a 50-50 male to female sex ratio. Which of the following statements is true?
A) Over the years STA100 will have more tutorials with 2/3 females (or more).
B) Over the years STA200 will have more tutorials with 2/3 females (or more).
C) Over the years STA300 will have more tutorials with 2/3 females (or more).
D) Over the years, each course will have about the same number of tutorials with 2/3 females (or more).
E) No course will have tutorials with 2/3 females (or more).
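The tutorial question turns on how the variability of a sample proportion shrinks as n grows. As a rough check, the exact binomial probability of a tutorial being at least 2/3 female can be computed for each size. The helper name below is hypothetical, and the sketch assumes each enrolment is an independent draw with p = 0.5:

```python
from math import comb

def prob_two_thirds_female(n, p=0.5):
    """P(at least 2/3 of n students are female) when X ~ Binomial(n, p)."""
    k_min = -(-2 * n // 3)  # smallest integer k with k/n >= 2/3 (ceiling of 2n/3)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

probs = {size: prob_two_thirds_female(size) for size in (40, 25, 15)}

for size, prob in probs.items():
    print(f"tutorial size {size}: P(proportion female >= 2/3) = {prob:.4f}")
```

The smallest tutorials have the largest probability, since a proportion based on few students strays further from 0.5.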

Question
State whether the following statements are true or false.
(i) As the sample size increases, the mean of the sampling distribution of the sample mean decreases.
(ii) As the sample size increases, the standard deviation of the sampling distribution of the sample mean decreases.
(iii) The mean of a random sample of size 4 from a negatively skewed distribution is approximately normally distributed.
(iv) The distribution of the proportion of successes in a sufficiently large sample is approximately normal with mean p and standard deviation $\sqrt{p(1-p)/n}$, where p is the population proportion and n is the sample size.
(v) If $\bar{X}$ is the mean of a simple random sample of size 9 from the N(500, 18) distribution, then $\bar{X}$ has a normal distribution with mean 500 and variance 36.

Question
State whether the following statements are true or false.
(i) A large sample from a skewed population will have an approximately normal shaped histogram.
(ii) The mean of a population will be normally distributed if the population is quite large.
(iii) The average blood cholesterol level recorded in an SRS of 100 students from a large population will be approximately normally distributed.
(iv) The proportion of people with incomes over $200 000, in an SRS of 10 people selected from all Canadian income tax filers, will be approximately normal.

Exercise
A parking lot is patrolled twice a day (morning and afternoon). In the morning, the chance that any particular spot has an illegally parked car is 0.02. If the spot contained a car that was ticketed in the morning, the probability the spot is also ticketed in the afternoon is 0.1. If the spot was not ticketed in the morning, there is a 0.005 chance the spot is ticketed in the afternoon.
a) Suppose tickets cost $10. What is the expected value of the tickets for a single spot in the parking lot?
b) Suppose the lot contains 400 spots. What is the distribution of the value of the tickets for a day?
c) What is the probability that more than $200 worth of tickets are written in a day?

Exercises
1. Z ~ N(0, 1). Find P(-1.96 < Z < 1.96).
2. Z ~ N(0, 1). Find the value of c such that P(-c < Z < c) = 0.95.
3. Z ~ N(0, 1). Find the value of c such that P(-c < Z < c) = 0.90.
4. X ~ N(500, 15). Find the values of c and d such that P(c < X < d) = 0.95.
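These exercises can be checked numerically with the standard normal CDF and its inverse. The sketch below assumes, as elsewhere in these slides, that the second parameter of N(500, 15) is the std. dev., and it picks the symmetric interval about the mean (other choices of c and d also give probability 0.95). The helper `central_z` is a hypothetical name:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

# Central probability between -1.96 and 1.96
p_central = Z.cdf(1.96) - Z.cdf(-1.96)

def central_z(coverage):
    """Value c such that P(-c < Z < c) = coverage."""
    return Z.inv_cdf((1 + coverage) / 2)

c_95 = central_z(0.95)
c_90 = central_z(0.90)

# Symmetric interval for X ~ N(500, 15): mean +/- c_95 * std. dev.
c, d = 500 - c_95 * 15, 500 + c_95 * 15

print(f"P(-1.96 < Z < 1.96) = {p_central:.4f}")
print(f"c for 0.95 = {c_95:.3f}, c for 0.90 = {c_90:.3f}")
print(f"(c, d) for X = ({c:.1f}, {d:.1f})")
```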

5. $X \sim N(\mu, \sigma)$. Find the values of c and d (in terms of $\mu$ and $\sigma$) such that P(c < X < d) = 0.95.
6. $X \sim N(\mu, \sigma)$. Find the values of c and d (in terms of $\mu$ and $\sigma$) such that P(c < X < d) = 0.90.
7. X ~ N(500, 15). Let $\bar{X}$ be the mean of a random sample of size 9. Find the values of c and d such that $P(c < \bar{X} < d) = 0.95$.
8. $X \sim N(\mu, \sigma)$. Let $\bar{X}$ be the mean of a random sample of size n. Find the values of c and d such that $P(c < \bar{X} < d) = 0.95$.

Point Estimates and CI
A basic tool in statistical inference is the point estimate of a population parameter. However, an estimate without an indication of its variability is of little value. A confidence interval supplies that indication: a level C confidence interval for a parameter is an interval computed from sample data by a method that has probability C of producing an interval containing the true value of the parameter.
Parameter | Estimate | Std. Error
$\mu$ | $\bar{x}$ | $\sigma/\sqrt{n}$
$\sigma^2$ | $S^2$ |
$p$ | $\hat{p}$ | $\sqrt{p(1-p)/n}$

Confidence interval for the population mean
Choose an SRS of size n from a population having unknown mean $\mu$ and known std. dev. $\sigma$. A level C confidence interval for $\mu$ is an interval of the form
$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$.
Here $z^*$ is the value on the standard normal curve with area C between $-z^*$ and $z^*$. The interval is exact when the population distribution is normal and approximately correct for large n in other cases.
In general, CIs have the form: Estimate $\pm$ margin of error.
In the above case, margin of error $= m = z^* \frac{\sigma}{\sqrt{n}}$.

Note: in the above formula for the CI for the population mean, $\sigma/\sqrt{n}$ is the std. dev. of the sample mean (also known as the std. error of the sample mean), and it can also be written as $\sigma_{\bar{X}}$.
The width of any CI is L = 2m, i.e. twice the margin of error.
Here are three ways to reduce the margin of error (and the width of the CI):
1. Use a lower level of confidence (smaller C).
2. Increase the sample size n.
3. Reduce $\sigma$ (usually not possible).

Sample size for desired margin of error
The CI for a population mean will have a specified margin of error m when the sample size is
$n = \left( \frac{z^* \sigma}{m} \right)^2$.
Example: A limnologist wishes to estimate the mean phosphate content per unit volume of lake water. It is known from previous studies that the std. dev. has a fairly stable value of 4 mg. How many water samples must the limnologist analyze to be 90% certain that the error of estimation does not exceed 0.8 mg?
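The sample-size formula comes from solving $m = z^* \sigma / \sqrt{n}$ for n and rounding up to a whole number of observations. A minimal sketch applied to the limnologist example (the function name `sample_size` is hypothetical):

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, margin, confidence):
    """Smallest n with z* * sigma / sqrt(n) <= margin at the given confidence level."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return ceil((z_star * sigma / margin) ** 2)

# sigma = 4 mg, desired margin of error 0.8 mg, 90% confidence
n_lake = sample_size(sigma=4.0, margin=0.8, confidence=0.90)
print(f"water samples needed: {n_lake}")
```

Rounding up rather than to the nearest integer guarantees that the margin of error does not exceed the target.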

Example
You want to rent an unfurnished one-bedroom apartment for next semester. The mean monthly rent for a random sample of 10 apartments advertised in the local newspaper is $580. Assume that the std. dev. is $90.
(a) Find a 95% CI for the mean monthly rent for unfurnished one-bedroom apartments available for rent in this community.
(b) How large a sample of one-bedroom apartments would be needed to estimate the mean $\mu$ within ±$20 with 90% confidence?
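Both parts of the rent example follow directly from $\bar{x} \pm z^* \sigma/\sqrt{n}$. The sketch below treats $90 as the known population std. dev.; the helper `z_interval` is a hypothetical name:

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Level-C confidence interval for the mean when sigma is known."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# 95% CI from n = 10 apartments with mean rent $580
low, high = z_interval(xbar=580.0, sigma=90.0, n=10, confidence=0.95)

# Sample size for a margin of $20 at 90% confidence
z_90 = NormalDist().inv_cdf(0.95)
n_needed = ceil((z_90 * 90.0 / 20.0) ** 2)

print(f"95% CI: ({low:.2f}, {high:.2f})")
print(f"apartments needed: {n_needed}")
```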

Exercise
Data on the Degree of Reading Power (DRP) scores for 44 students are recorded. Suppose that the SD of the population of DRP scores is known to be $\sigma = 11$. A 95% CI for the population mean score is given in the MINITAB output below.
DRP Scores
40 26 39 14 42 18 25 43 46 27 19
47 19 26 35 34 15 44 40 38 31 46
52 25 35 35 33 29 34 41 49 28 52
47 35 48 22 33 41 51 27 14 54 45
Z Confidence Intervals
The assumed sigma = 11.0
Variable | N | Mean | StDev | SE Mean | 95.0 % CI
DRP Scor | 44 | 35.09 | 11.19 | 1.66 | (31.84, 38.34)
MINITAB Command: Stat > Basic Statistics > 1 Sample Z and select ‘Confidence interval’
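As a cross-check on the MINITAB output, the mean, the standard error, and the 95% interval can be recomputed from the 44 scores. This is only a sketch in Python, not the MINITAB procedure itself:

```python
from math import sqrt
from statistics import NormalDist, mean

drp_scores = [
    40, 26, 39, 14, 42, 18, 25, 43, 46, 27, 19,
    47, 19, 26, 35, 34, 15, 44, 40, 38, 31, 46,
    52, 25, 35, 35, 33, 29, 34, 41, 49, 28, 52,
    47, 35, 48, 22, 33, 41, 51, 27, 14, 54, 45,
]
sigma = 11.0                       # assumed known population std. dev.

n = len(drp_scores)
xbar = mean(drp_scores)
se_mean = sigma / sqrt(n)          # "SE Mean" uses the assumed sigma, not the sample StDev
z_star = NormalDist().inv_cdf(0.975)

ci_low = xbar - z_star * se_mean
ci_high = xbar + z_star * se_mean

print(f"N = {n}, Mean = {xbar:.2f}, SE Mean = {se_mean:.2f}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```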

Exercise
A random sample of 85 students in Chicago city high schools takes a course designed to improve SAT scores. Based on these students, a 90% CI for the mean improvement in SAT scores for all Chicago high school students is computed as (72.3, 91.4) points. Which of the following statements are true?
(i) 90% of the students in the sample improved their scores by between 72.3 and 91.4 points.
(ii) 90% of the students in the population improved their scores by between 72.3 and 91.4 points.
(iii) A 95% CI will contain the value 72.3.
(iv) The margin of error of the 90% CI above is 9.55.
(v) A 90% CI based on a sample of 340 (85 × 4) students will have margin of error 9.55/4.

Statistical Tests
A significance test is a formal procedure for comparing observed data with a hypothesis whose truth we want to assess. The hypothesis is a statement about the parameters in a population or model.
Null hypothesis
The statement being tested in a test of significance is called the null hypothesis. The test of significance is designed to assess the strength of the evidence against the null hypothesis. Usually the null hypothesis is a statement of “no effect” or “no difference”. We abbreviate “null hypothesis” as $H_0$.

Example
Each of the following situations requires a significance test about a population mean $\mu$. State the appropriate null hypothesis $H_0$ and alternative hypothesis $H_a$ in each case.
(a) The mean area of the several thousand apartments in a new development is advertised to be 1250 square feet. A tenant group thinks that the apartments are smaller than advertised. They hire an engineer to measure a sample of apartments to test their suspicion.
(b) Larry's car averages 32 miles per gallon on the highway. He now switches to a new motor oil that is advertised as increasing gas mileage. After driving 3000 highway miles with the new oil, he wants to determine if his gas mileage actually has increased.
(c) The diameter of a spindle in a small motor is supposed to be 5 millimeters. If the spindle is either too small or too large, the motor will not perform properly. The manufacturer measures the diameter in a sample of motors to determine whether the mean diameter has moved away from the target.

Test Statistic
The test is based on a statistic that estimates the parameter that appears in the hypotheses. Usually this is the same estimate we would use in a confidence interval for the parameter. When $H_0$ is true, we expect the estimate to take a value near the parameter value specified in $H_0$. Values of the estimate far from the parameter value specified by $H_0$ give evidence against $H_0$. The alternative hypothesis determines which directions count against $H_0$.
A test statistic measures compatibility between the null hypothesis and the data. We use it for the probability calculation that we need for our test of significance. It is a random variable with a distribution that we know.

Example
An air freight company wishes to test whether or not the mean weight of parcels shipped on a particular route exceeds 10 pounds. A random sample of 49 shipping orders was examined and found to have an average weight of 11 pounds. Assume that the std. dev. of the weights ($\sigma$) is 2.8 pounds.
The null and alternative hypotheses in this problem are: $H_0: \mu = 10$; $H_a: \mu > 10$.
The test statistic for this problem is the standardized version of $\bar{X}$: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.
Decision: ?
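A sketch of the calculation behind the decision follows. The slide does not state a significance level, so the comparison against 0.05 and 0.01 is an assumption made here for illustration:

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0 = 11.0, 10.0   # sample mean and hypothesized mean (pounds)
sigma, n = 2.8, 49       # known std. dev. and sample size

# Standardized sample mean under H0
z = (xbar - mu0) / (sigma / sqrt(n))

# Ha: mu > 10, so the P-value is the upper-tail area P(Z >= z)
p_value = NormalDist().cdf(-z)

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
for alpha in (0.05, 0.01):   # assumed levels, not given on the slide
    print(f"alpha = {alpha}: {'reject H0' if p_value <= alpha else 'do not reject H0'}")
```

With a P-value this small, the data give strong evidence that the mean parcel weight exceeds 10 pounds.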

P-value and Significance level
The probability, computed under the assumption that $H_0$ is true, that the test statistic would take a value as extreme or more extreme than that actually observed is called the P-value of the test. The smaller the P-value, the stronger the evidence against $H_0$ provided by the data.
The decisive value of P is called the significance level. It is denoted by $\alpha$.
Statistical significance
If the P-value is as small or smaller than $\alpha$, we reject $H_0$ and say that the data are statistically significant at level $\alpha$. The P-value is the smallest level $\alpha$ at which the data are significant.

Z Test for a population mean ($\sigma$ known)
To test the hypothesis $H_0: \mu = \mu_0$ based on an SRS of size n from a population with unknown mean $\mu$ and known std. dev. $\sigma$, compute the test statistic
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.
In terms of a standard normal variable Z, the P-value for the test of $H_0$ against
$H_a: \mu > \mu_0$ is $P(Z \ge z)$
$H_a: \mu < \mu_0$ is $P(Z \le z)$
$H_a: \mu \ne \mu_0$ is $2 \cdot P(Z \ge |z|)$
These P-values are exact if the population distribution is normal and are approximately correct for large n in other cases.
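The three P-value rules can be gathered into one small function. This is a sketch only: the function name `z_test` and the labels for the alternative are hypothetical choices, and the numbers are those of the air freight example (sample mean 11, $\mu_0 = 10$, $\sigma = 2.8$, n = 49):

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative):
    """Return (z, P-value) for H0: mu = mu0 with sigma known.

    alternative is "greater" (mu > mu0), "less" (mu < mu0), or "two-sided".
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    Z = NormalDist()
    if alternative == "greater":
        p = Z.cdf(-z)              # P(Z >= z)
    elif alternative == "less":
        p = Z.cdf(z)               # P(Z <= z)
    elif alternative == "two-sided":
        p = 2 * Z.cdf(-abs(z))     # 2 * P(Z >= |z|)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p

z, p_greater = z_test(11.0, 10.0, 2.8, 49, "greater")
_, p_less = z_test(11.0, 10.0, 2.8, 49, "less")
_, p_two_sided = z_test(11.0, 10.0, 2.8, 49, "two-sided")

print(f"z = {z:.2f}")
print(f"greater: {p_greater:.4f}  less: {p_less:.4f}  two-sided: {p_two_sided:.4f}")
```

The same z gives very different P-values depending on which direction the alternative hypothesis counts as evidence against $H_0$.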

Critical value approach
We can base our test conclusions on a fixed significance level $\alpha$ without computing the P-value. For this we need to find a critical value $z^*$ from the standard normal distribution with a specified tail area (to the right or left depending on $H_a$). This tail area is called the rejection region. If the test statistic falls in the rejection region, we reject $H_0$ and conclude that the data are statistically significant at level $\alpha$.
A P-value is more informative than a reject-or-not finding at a fixed significance level because it can tell us about the strength of evidence we found against $H_0$.

Example
The Pfft Light Bulb Company claims that the mean life of its 2 watt bulbs is 1300 hours. Suspecting that the claim is too high, Nalph Rader gathered a random sample of 64 bulbs and tested each. He found the average life to be 1295 hours. Test the company's claim using $\alpha$ = 0.01. Assume $\sigma$ = 20 hours.
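The light bulb example can be worked with either approach. The sketch below assumes the hypotheses are $H_0: \mu = 1300$ against $H_a: \mu < 1300$ (the claim is suspected to be too high), so the rejection region is the lower tail:

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0 = 1295.0, 1300.0   # sample mean and claimed mean life (hours)
sigma, n = 20.0, 64
alpha = 0.01

Z = NormalDist()
z = (xbar - mu0) / (sigma / sqrt(n))

# Critical value approach: lower-tail rejection region with area alpha
z_crit = Z.inv_cdf(alpha)
reject_by_critical_value = z <= z_crit

# P-value approach: P(Z <= z) for Ha: mu < mu0
p_value = Z.cdf(z)
reject_by_p_value = p_value <= alpha

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, P-value = {p_value:.4f}")
print("reject H0" if reject_by_p_value else "do not reject H0")
```

Both approaches agree, as they must: z lies in the rejection region exactly when the P-value is at most $\alpha$.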

Exercise
A standard intelligence examination has been given for several years with an average score of 80 and a standard deviation of 7. If 25 students, taught with special emphasis on reading skill, obtain a mean grade of 83 on the examination, is there reason to believe that the special emphasis changes the result on the test? Use $\alpha$ = 0.05.

Exercise
Data on the Degree of Reading Power (DRP) scores for 44 students in a suburban school district (same data as on slide 17). Suppose that the SD of scores in this school district is known to be $\sigma = 11$. The researcher believes that the mean score $\mu$ of all the students in this district is higher than the national mean, which is 32. The MINITAB output for the test is given below.
Z-Test
Test of mu = 32.00 vs mu > 32.00
The assumed sigma = 11.0
Variable | N | Mean | StDev | SE Mean | Z | P
DRP Scor | 44 | 35.09 | 11.19 | 1.66 | 1.86 | 0.031
MINITAB Command: Stat > Basic Statistics > 1 Sample Z and select ‘Test mean’

Confidence Intervals and two-sided tests
A level $\alpha$ two-sided significance test rejects a hypothesis $H_0: \mu = \mu_0$ exactly when the value $\mu_0$ falls outside the level $1 - \alpha$ confidence interval for $\mu$.
Example: For the exercise on slide 27, a 95% CI is 83 ± 1.96·(7/5) = (80.256, 85.744). The value 80 is not in this interval, and so we reject $H_0: \mu = 80$ at the 5% level of significance.
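The equivalence between the two-sided test and the confidence interval can be checked numerically. The sketch below uses the numbers from the example (sample mean 83, $\sigma$ = 7, n = 25, $\alpha$ = 0.05) and tries a few hypothesized means chosen here for illustration. The function names are hypothetical:

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

def z_interval(xbar, sigma, n, alpha):
    """Level (1 - alpha) confidence interval for mu, sigma known."""
    margin = Z.inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return xbar - margin, xbar + margin

def two_sided_rejects(xbar, mu0, sigma, n, alpha):
    """True if the level-alpha two-sided z test rejects H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2 * Z.cdf(-abs(z)) <= alpha

xbar, sigma, n, alpha = 83.0, 7.0, 25, 0.05
low, high = z_interval(xbar, sigma, n, alpha)
print(f"95% CI: ({low:.3f}, {high:.3f})")

# The test rejects mu0 exactly when mu0 lies outside the interval.
for mu0 in (79.0, 80.0, 81.0, 83.0, 85.0, 86.0):
    outside = mu0 < low or mu0 > high
    rejected = two_sided_rejects(xbar, mu0, sigma, n, alpha)
    print(f"mu0 = {mu0}: outside CI = {outside}, test rejects = {rejected}")
```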