Statistics for Business and Economics, Module 1: Probability Theory and Statistical Inference, Spring 2010. Lecture 4: Estimating parameters with confidence.


Statistics for Business and Economics, Module 1: Probability Theory and Statistical Inference, Spring 2010. Lecture 4: Estimating parameters with confidence. Priyantha Wijayatunga, Department of Statistics, Umeå University. These materials are adapted from copyrighted lecture slides (© 2009 W.H. Freeman and Company) from the homepage of the book The Practice of Business Statistics: Using Data for Decisions, Second Edition, by Moore, McCabe, Duckworth and Alwan.

Estimation of parameters with confidence
• Point estimation of mean and variance
• Statistical confidence
• Confidence intervals
• Behavior of confidence intervals
• Choosing the sample size
• Cautions

Overview of inference: estimation
• Methods for drawing conclusions about a population from sample data are called statistical inference.
• Methods:
  - Point estimates: estimating the value of a population parameter with a single value
  - Confidence intervals: estimating the value of a population parameter with a range of values, at a stated level of confidence
  - Tests of significance: assessing the evidence for a claim about a population
• Inference is appropriate when data are produced by either
  - a random sample, or
  - a randomized experiment.

Uncertainty and confidence
Although the sample mean, x̄, is a unique number for any particular sample, if you pick a different sample you will probably get a different sample mean. In fact, you could get many different values for the sample mean, and virtually none of them would exactly equal the true population mean, µ.

But the sampling distribution of x̄ is narrower than the population distribution, by a factor of √n. Thus, the estimates gained from our samples tend to be relatively close to the population parameter µ.
[Figure: distribution of sample means x̄ (samples of n subjects) compared with the population distribution of individual subjects x]
If the population is normally distributed N(µ, σ), the sampling distribution of x̄ is also normal: N(µ, σ/√n).
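The narrowing by a factor of √n can be checked with a small simulation. This is only a sketch: the function name `simulate_sample_means`, the seed, the number of repetitions, and the values µ = 65, σ = 5, n = 12 are illustrative assumptions, not part of the original slides.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from N(mu, sigma); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


# Illustrative (assumed) population and sample size.
mu, sigma, n = 65.0, 5.0, 12
means = simulate_sample_means(mu, sigma, n, num_samples=20000)

observed_spread = statistics.stdev(means)   # spread of the simulated sample means
theoretical_spread = sigma / math.sqrt(n)   # sigma / sqrt(n)

print(f"observed spread of sample means: {observed_spread:.3f}")
print(f"sigma / sqrt(n):                 {theoretical_spread:.3f}")
```

The two printed values should agree closely, and both are much smaller than the population standard deviation σ.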

Estimating mean and variance of a probability distribution
Mean and variance are two main characteristics of a probability distribution. If we don't know their values, we estimate them with point values.
• Take a (random) sample x₁, …, xₙ from the population.
• Estimate: the mean by the sample mean x̄ = (1/n) Σ xᵢ, and the variance by the sample variance s² = (1/(n − 1)) Σ (xᵢ − x̄)².
• For random samples these estimators are unbiased.
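As a brief illustration of point estimation, the sketch below computes both estimates for a small sample. It assumes the estimators meant are the usual sample mean and the sample variance with divisor n − 1 (the unbiased version); the data values are made up for the example.

```python
import statistics

# Hypothetical sample (made-up values, for illustration only).
sample = [62.1, 66.4, 64.8, 70.2, 61.5, 65.9]

x_bar = statistics.mean(sample)          # point estimate of the population mean
s_squared = statistics.variance(sample)  # point estimate of the variance (divides by n - 1)

print(f"sample mean x_bar  = {x_bar:.2f}")
print(f"sample variance s2 = {s_squared:.2f}")
```

Note that `statistics.pvariance` divides by n instead and would give the biased version.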

[Figure: each red dot is the mean value of an individual sample]
95% of all sample means will be within roughly 2 standard deviations (2σ/√n) of the population parameter µ. Because distances are symmetrical, this implies that the population parameter µ must be within roughly 2 standard deviations of the sample average x̄ in 95% of all samples. This reasoning is the essence of statistical inference.

The weight of a single egg of the brown variety is normally distributed N(65 g, 5 g). Think of a carton of 12 brown eggs as an SRS of size 12.
• What is the distribution of the sample means x̄? Normal (mean µ, standard deviation σ/√n) = N(65 g, 1.44 g).
• Find the middle 95% of the sample means distribution. Roughly ± 2 standard deviations from the mean, or 65 g ± 2.88 g.
You buy a carton of 12 white eggs instead. The box weighs 770 g. The average egg weight from that SRS is thus x̄ = 64.2 g.
• Knowing that the standard deviation of egg weight is 5 g, what can you infer about the mean µ of the white egg population? There is a 95% chance that the population mean µ is roughly within ± 2σ/√n of x̄, or 64.2 g ± 2.88 g.
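The arithmetic of the egg example can be reproduced in a few lines. This is a sketch using only the numbers given on the slide (σ = 5 g, n = 12, carton weight 770 g); the variable names are our own.

```python
import math

sigma = 5.0            # standard deviation of a single egg's weight (g)
n = 12                 # eggs per carton, treated as an SRS
carton_weight = 770.0  # total weight of the white-egg carton (g)

x_bar = carton_weight / n              # sample mean weight per egg
standard_error = sigma / math.sqrt(n)  # standard deviation of the sample mean
margin = 2 * standard_error            # "roughly 2 standard deviations"

print(f"x_bar          = {x_bar:.1f} g")
print(f"sigma/sqrt(n)  = {standard_error:.2f} g")
print(f"approx. 95% CI = {x_bar - margin:.1f} g to {x_bar + margin:.1f} g")
```

Without intermediate rounding the margin is about 2.89 g; the slide's 2.88 g comes from rounding σ/√n to 1.44 g before doubling.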

Confidence interval
The confidence interval is a range of values with an associated probability or confidence level C. The probability quantifies the chance that the interval contains the true population parameter.
x̄ ± 4.2 is a 95% confidence interval for the population parameter µ. This says that in 95% of the cases, the actual value of µ will be within 4.2 units of the value of x̄.

Implications
We don’t need to take a lot of random samples to “rebuild” the sampling distribution and find µ at its center. All we need is one SRS of size n, relying on the properties of the sample means distribution to infer the population mean µ.
[Figure: a single sample of size n drawn from the population]

Reworded
With 95% confidence, we can say that µ should be within roughly 2 standard deviations (2σ/√n) of our sample mean x̄.
• In 95% of all possible samples of this size n, µ will indeed fall in our confidence interval.
• In only 5% of samples would x̄ be farther from µ.
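The "95% of all possible samples" statement can be illustrated by simulation: draw many samples, build the interval sample mean ± 2σ/√n each time, and count how often it contains µ. This is a sketch under assumed values (µ = 65, σ = 5, n = 12, 10,000 repetitions, fixed seed); the function name `coverage_rate` is hypothetical.

```python
import math
import random
import statistics


def coverage_rate(mu, sigma, n, num_samples, z_star=2.0, seed=1):
    """Fraction of simulated samples whose interval x_bar +/- z_star*sigma/sqrt(n) contains mu."""
    rng = random.Random(seed)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(num_samples):
        x_bar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if abs(x_bar - mu) <= margin:  # mu lies inside x_bar +/- margin
            hits += 1
    return hits / num_samples


# Illustrative (assumed) population parameters and sample size.
rate = coverage_rate(mu=65.0, sigma=5.0, n=12, num_samples=10000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The share should come out near 0.95, slightly above it because 2 is a little larger than the exact critical value 1.96.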

A confidence interval can be expressed as:
• Mean ± m, where m is called the margin of error: µ within x̄ ± m. Example: 120 ± 6
• Two endpoints of an interval: µ within (x̄ − m) to (x̄ + m). Example: 114 to 126
A confidence level C (in %) indicates the probability that µ falls within the interval. It represents the area under the normal curve within ± m of the center of the curve.

Review: standardizing the normal curve using z
z = (x̄ − µ) / (σ/√n)
[Figure: the curves N(64.5, 2.5) and N(µ, σ/√n) standardized to N(0, 1); horizontal axis: standardized height (no units)]
Here, we work with the sampling distribution, and σ/√n is its standard deviation (spread). Remember that σ is the standard deviation of the original population.

Varying confidence levels
Confidence intervals contain the population mean µ in C% of samples. Different areas under the curve give different confidence levels C.
Example: For an 80% confidence level C, 80% of the normal curve’s area is contained in the interval.
[Figure: normal curve with central area C between −z* and z*]
Practical use of z: z*
• The critical value z* is related to the chosen confidence level C.
• C is the area under the standard normal curve between −z* and z*.
The confidence interval is thus: x̄ ± z* σ/√n

How do we find specific z* values?
We can use a table of z/t values (Table C). For a particular confidence level C, the appropriate z* value is listed just above it.
Example: For a 98% confidence level, z* = 2.326
We can use software. In Excel: =NORMINV(probability,mean,standard_dev) gives z for a given cumulative probability. Since we want the middle C probability, the probability we require is (1 − C)/2.
Example: For a 98% confidence level, =NORMINV(.01,0,1) = −2.326 (= −z*)
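The same lookup can be done outside Excel. As a sketch, Python's standard library offers `statistics.NormalDist.inv_cdf`, which plays the role of NORMINV here; the helper name `z_star` is our own.

```python
from statistics import NormalDist


def z_star(confidence):
    """Critical value z* such that the area between -z* and z* under N(0, 1) equals confidence."""
    lower_tail = (1 - confidence) / 2              # cumulative probability below -z*
    return -NormalDist(mu=0, sigma=1).inv_cdf(lower_tail)


for c in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"C = {c:.0%}: z* = {z_star(c):.3f}")
```

For C = 98% this reproduces the table value z* = 2.326.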

Link between confidence level and margin of error
The confidence level C determines the value of z* (in Table C). The margin of error also depends on z*: m = z* σ/√n.
[Figure: normal curve with central area C between −z* and z*, and margin of error m]
Higher confidence C implies a larger margin of error m (thus less precision in our estimates). A lower confidence level C produces a smaller margin of error m (thus better precision in our estimates).
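To see the trade-off numerically, the sketch below tabulates the margin of error m = z* σ/√n for several confidence levels. The values σ = 10 and n = 100 are hypothetical, chosen only so that σ/√n = 1 and the margin equals z*.

```python
import math
from statistics import NormalDist


def margin_of_error(confidence, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a given confidence level."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # upper critical value z*
    return z * sigma / math.sqrt(n)


sigma, n = 10.0, 100  # hypothetical values, so sigma / sqrt(n) = 1
margins = {c: margin_of_error(c, sigma, n) for c in (0.80, 0.90, 0.95, 0.99)}

for c, m in margins.items():
    print(f"C = {c:.0%}: m = {m:.3f}")
```

The margins grow as C grows, which is the loss of precision described above.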

How confidence intervals behave
The margin of error gets smaller:
• When z* gets smaller. Smaller z* is the same as lower confidence level C.
• When σ gets smaller. It is easier to estimate μ when σ is small.
• When n gets larger.

Changing the sample size n
Banks’ Loan-to-Deposit Ratio (LTDR)
The mean LTDR for 110 banks is 76.7 and the standard deviation is s = 12.3. The sample is large enough for us to use s as σ. Give a 95% confidence interval for mean LTDR.
• 95% confidence interval for the mean LTDR, z* = 1.96, n = 110: x̄ ± z* σ/√n = 76.7 ± 1.96(12.3/√110) = 76.7 ± 2.3 = (74.4, 79.0)
Suppose there were only 25 banks in the survey, and the sample mean and σ are unchanged.
• 95% confidence interval for the mean LTDR, z* = 1.96, n = 25: x̄ ± z* σ/√n = 76.7 ± 1.96(12.3/√25) = 76.7 ± 4.8 = (71.9, 81.5)
The margin of error increases for the smaller sample.
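The two LTDR intervals can be recomputed as a check. This sketch uses the numbers from the slide (sample mean 76.7, σ taken as 12.3, n = 110 and n = 25); the function name `z_interval` is our own, and the critical value is computed rather than read from Table C.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (low, high, margin) for the interval x_bar +/- z* * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # z* for the chosen confidence level
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin


for n in (110, 25):
    low, high, margin = z_interval(x_bar=76.7, sigma=12.3, n=n)
    print(f"n = {n:3d}: 76.7 +/- {margin:.1f} = ({low:.1f}, {high:.1f})")
```

Both intervals agree with the slide after rounding to one decimal place.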

Impact of sample size
The spread in the sampling distribution of the mean is a function of the number of individuals per sample.
• The larger the sample size, the smaller the standard deviation (spread) of the sample mean distribution.
• But the spread only decreases in proportion to 1/√n: to halve the spread, the sample size must be quadrupled.
[Figure: standard error σ/√n plotted against sample size n]
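A short table makes the square-root behaviour concrete. This is a sketch with a hypothetical σ = 10; each step quadruples n and the standard error only halves.

```python
import math

sigma = 10.0  # hypothetical population standard deviation

# Standard error sigma / sqrt(n) for sample sizes that quadruple at each step.
standard_errors = {n: sigma / math.sqrt(n) for n in (25, 100, 400, 1600)}

for n, se in standard_errors.items():
    print(f"n = {n:4d}: sigma/sqrt(n) = {se:.2f}")
```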

Sample size and experimental design
You may need a certain margin of error (e.g., drug trial, manufacturing specs). In many cases, the population variability (σ) is fixed, but we can choose the number of measurements (n). So plan ahead what sample size to use to achieve that margin of error.
Remember, though, that sample size is not always stretchable at will. There are typically costs and constraints associated with large samples. The best approach is to use the smallest sample size that can give you useful results.

What sample size for a given margin of error?
Banks’ Loan-to-Deposit Ratio (LTDR)
Recall that we found margin of error = 2.3 for estimating the mean LTDR of banks with n = 110 and 95% confidence. We are willing to settle for margin of error 3.0 next year. How many banks should we survey?
For 95% confidence, z* = 1.96; assume σ = 12.3. Solving m = z* σ/√n for n gives n = (z* σ/m)² = (1.96 × 12.3/3.0)² ≈ 64.6.
Using only 64 observations will not be enough to ensure that m is no more than 3.0. Therefore, we need at least 65 observations.
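The sample-size calculation amounts to solving m = z* σ/√n for n and rounding up. The sketch below assumes σ = 12.3, the value used in the earlier LTDR interval 76.7 ± 1.96(12.3/√110); the function name `required_sample_size` is our own.

```python
import math


def required_sample_size(margin, sigma, z_star=1.96):
    """Smallest n with z_star * sigma / sqrt(n) <= margin, i.e. n >= (z_star * sigma / margin)^2."""
    return math.ceil((z_star * sigma / margin) ** 2)


sigma = 12.3  # assumed, taken from the earlier LTDR calculation
n_needed = required_sample_size(margin=3.0, sigma=sigma)
print(f"banks to survey: {n_needed}")

# Check the margin actually achieved just below and at the required size.
for n in (n_needed - 1, n_needed):
    print(f"n = {n}: m = {1.96 * sigma / math.sqrt(n):.3f}")
```

With 64 banks the margin is slightly above 3.0; with 65 it falls just below, which is why the result is rounded up.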

Some cautions
• The data must be an SRS from the population.
• The formula is not correct for sampling designs more complex than an SRS (such as multistage or stratified samples).
• Outliers can have a large effect on the confidence interval.
• Examine your data carefully for skewness and other signs of non-Normality.
• You must know the standard deviation σ of the population.