The Standardized Normal Distribution


The Standardized Normal Distribution
Z is N(0, 1^2), the standardized normal. X is N(μ, σ^2).
The standardized normal is used:
1. For comparison of several different normal distributions
2. For calculations without a computer
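As a minimal sketch of the comparison use case, the standard conversion Z = (X − μ)/σ puts values from different normal distributions on one scale. The two tests and their means and standard deviations below are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x from N(mu, sigma^2) to the Z scale N(0, 1^2)."""
    return (x - mu) / sigma

# Hypothetical example: two tests graded on different scales.
z_a = z_score(650, mu=500, sigma=100)   # a score of 650 on test A
z_b = z_score(27, mu=21, sigma=5)       # a score of 27 on test B

# One standardized normal serves both distributions.
p_a = NormalDist().cdf(z_a)             # fraction of test A scores below 650
p_b = NormalDist().cdf(z_b)             # fraction of test B scores below 27
print(f"Test A: z = {z_a:.2f}, fraction below = {p_a:.3f}")
print(f"Test B: z = {z_b:.2f}, fraction below = {p_b:.3f}")
```

The larger z value marks the score that sits further above its own mean, which is what makes the two scores comparable.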

The Normal Approximation to the Binomial
Both distributions have the same shape…

This normal approximation to the binomial works reasonably well when
1. np ≥ 5 and n(1 − p) ≥ 5
2. No computer is nearby
But it is an important fact that the binomial distribution and the normal distribution are similar … we will return to this subject relatively soon … central limit …
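The quality of the approximation can be checked numerically. The sketch below compares an exact binomial probability with the normal value that has the same mean and standard deviation. The values of n, p, and k are illustrative, and the +0.5 continuity correction is a common refinement that the slides do not mention.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """Approximate P(X <= k) with a normal of the same mean and std dev.

    The +0.5 continuity correction is an assumption added here.
    """
    mean = n * p
    std = sqrt(n * p * (1 - p))
    return NormalDist(mean, std).cdf(k + 0.5)

n, p, k = 50, 0.3, 15   # illustrative values: np = 15 and n(1 - p) = 35
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```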

Conceptual Idea of New Topics
When talking about binomial and normal probabilities, we’ve taken the following point of view: a situation follows certain “probabilities,” and we can use this knowledge to deduce specific information about the situation.
Now we will take the reverse point of view: specific information about a situation can be used to find “probabilities” that describe the situation.
(The word “features” could replace “probabilities.”)

For example, think about the quiz you took…
Professors have always noticed that students’ scores on a test tend to follow a normal distribution.
By actually giving a test to a sample of students, you can estimate the mean and standard deviation of the underlying normal distribution.
For tests like the SAT, the underlying distribution is then used as a ranking measure for students taking the same test later.
These ideas are loose, and first we’re going to learn how to work with sample data.

Populations and Samples
A population is a complete set of data representing a given situation.
A sample is a subset of the population, ideally a small-scale replica of the population.
E.g., all students who take the SAT constitute a population, while those taking the test on a particular Saturday are a sample.
E.g., all American citizens are a population, while those selected for a survey are a sample.
Populations are a relative concept.

For the following definitions, imagine a population like the starting salaries of all MBA students graduating this year.
A population is assumed to follow a random variable X, with values and probabilities:
X = starting salary of a particular MBA student
P(X = $100,000) = ????
So we could calculate the expected value μ of the population, as well as the standard deviation σ.
Except it is sometimes hard to get a handle on the entire population. Imagine finding out the starting salaries of every single graduating MBA student in the U.S.!

So instead of trying to look at the entire population, we look at a sample of the population, which hopefully gives us a good picture of the population.
We might take a survey of graduating MBAs to determine the average (or expected) starting salary.
A sample statistic is a quantitative measure of a sample, used to make estimates about the population.
A sample mean is used to estimate the population mean (or expected value).
A sample standard deviation is used to estimate the population standard deviation.

Summary/Sample Measures
A sample is made up of n observations X_1, X_2, …, X_n.
Sample mean = X-bar = (X_1 + X_2 + … + X_n) / n
Sample std dev = S_X = sqrt( [ (X_1 − X-bar)^2 + (X_2 − X-bar)^2 + … + (X_n − X-bar)^2 ] / (n − 1) )
The median is the middle of the values; 50% of the observation values fall below the median and 50% above.
The mode is the most frequent observation value.
The maximum and minimum are the largest and smallest observations; the range is the difference between the max and min.

Relevant Excel Commands
=AVERAGE(array)
=STDEV(array)
Tools → Data Analysis → Descriptive Statistics (see Excel file)
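For readers working outside Excel, the same sample measures can be sketched with Python's standard `statistics` module. The eight observations are hypothetical.

```python
import statistics

# Hypothetical sample of n = 8 observations.
sample = [4.0, 7.0, 7.0, 9.0, 10.0, 12.0, 13.0, 18.0]

mean = statistics.mean(sample)          # counterpart of =AVERAGE(array)
std_dev = statistics.stdev(sample)      # divides by n - 1, like =STDEV(array)
median = statistics.median(sample)      # middle of the sorted values
mode = statistics.mode(sample)          # most frequent observation value
data_range = max(sample) - min(sample)  # maximum minus minimum

print(f"mean = {mean}, std dev = {std_dev:.4f}, median = {median}")
print(f"mode = {mode}, range = {data_range}")
```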

Setup and Assumptions for This Lecture
You have a population about which you’d like to know things such as mean, std dev, and proportions.
Each member of the population is assumed to follow the random variable X with mean μ_X, std dev σ_X, and particular proportion p_X.
Again, however, μ_X, σ_X, and p_X are unknown.
The population is too big to measure directly.
You will take samples instead.
What information can be deduced?

A Practical Approach: Point Estimates
To estimate μ_X, σ_X, and p_X, take a sample of size n and calculate the sample mean X-bar, the sample std dev S_X, and the sample proportion X/n.
Use these as “point estimates” of the true μ_X, σ_X, and p_X.
Here’s an idea…
(a) X_i = value of the i-th observation
(b) X_i = 1 if the i-th observation has the attribute, 0 otherwise
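A small sketch of the three point estimates, including the 0/1 coding in (b), under which the sample proportion is just the mean of the indicator values. The salaries and the attribute "salary of at least 100" are hypothetical.

```python
import statistics

# Hypothetical sample: starting salaries in thousands of dollars.
salaries = [82, 95, 101, 88, 120, 99, 105, 93, 110, 97]
n = len(salaries)

# (a) X_i = value of the i-th observation.
x_bar = statistics.mean(salaries)   # point estimate of mu_X
s_x = statistics.stdev(salaries)    # point estimate of sigma_X

# (b) X_i = 1 if the i-th observation has the attribute, 0 otherwise.
# Attribute chosen for illustration: a salary of at least 100.
indicators = [1 if s >= 100 else 0 for s in salaries]
p_hat = sum(indicators) / n         # sample proportion X/n, estimate of p_X

print(f"X-bar = {x_bar}, S_X = {s_x:.2f}, X/n = {p_hat}")
```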

Can we do better?
Point estimates are nice, but is there a better idea? After all, who knows how close X-bar, S_X, and X/n are to μ_X, σ_X, and p_X?
…interval estimates…
For example, point estimate: “An estimate for the true mean μ_X is the point estimate X-bar = ___.”
For example, interval estimate: “There is a 95% probability that the true mean μ_X lies between ___ and ___.”
Interval estimates are stronger than point estimates.

Yes, we can do better! But it takes the investigation of some pretty tricky concepts… the sampling random variable and the sampling distribution

The Sampling Random Variable and the Sampling Distribution
Fix in your mind a number n, the number of observations taken in a single sample.
Now think about taking many different samples of size n and calculating the sample mean for each sample taken.
The sampling random variable is the random variable that assigns the sample mean to each sample of size n…
And the sampling distribution is the distribution of this random variable.

Key Facts about the Sampling Distribution
The mean of the sampling distribution is the mean of the population.
The std dev of the sampling distribution is the std dev of the population divided by the square root of n.
Central Limit Theorem: If n is large, then the sampling random variable is approximately normally distributed.
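These facts can be checked by simulation. The sketch below draws many samples of size n from a deliberately skewed population and compares the mean and std dev of the resulting sample means with the values predicted above. The exponential population, the seed, and the sizes are all assumptions made for illustration.

```python
import random
import statistics

random.seed(0)        # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, so mean 1 and std dev 1.
POP_MEAN = 1.0
POP_STD = 1.0
n = 40                # observations per sample
num_samples = 5000    # number of repeated samples

# One sample mean per simulated sample of size n.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
std_of_means = statistics.stdev(sample_means)
predicted_std = POP_STD / n ** 0.5   # population std dev / sqrt(n)

print(f"mean of sample means = {mean_of_means:.4f} (population mean {POP_MEAN})")
print(f"std dev of sample means = {std_of_means:.4f} (predicted {predicted_std:.4f})")
```

Even though the population is far from normal, a histogram of `sample_means` would look roughly bell-shaped, which is the Central Limit Theorem at work.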

Comments on the Sampling Distribution
Remember: we don’t actually know μ_X or σ_X, and so we don’t know the mean and standard deviation of the sampling distribution either.
We can make statements like: “The sample mean of a random sample of size n has a 95% chance of falling within 2 std devs (of the sampling distribution) up or down from the true population mean.”
The standard deviation of the sampling distribution is commonly called the standard error.

An Example
We can make statements like: “The sample mean of a random sample of size n has a 95% chance of falling within 2 standard errors up or down from the true population mean” (see Excel).
Again, we must stress that we don’t know the true population mean, the population std dev, or the sampling distribution std error.
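The "2 standard errors, 95% chance" statement is a rounded form of a standardized normal calculation. A quick sketch, using Python's `statistics.NormalDist` as a stand-in for the Z table:

```python
from statistics import NormalDist

Z = NormalDist()                     # standardized normal N(0, 1^2)

within_two = Z.cdf(2) - Z.cdf(-2)    # P(-2 <= Z <= 2)
exact_95 = Z.inv_cdf(0.975)          # z with P(-z <= Z <= z) = 0.95

print(f"P(-2 <= Z <= 2) = {within_two:.4f}")
print(f"z for exactly 95% = {exact_95:.3f}")
```

The probability within 2 standard errors is slightly above 95%, and the multiplier for exactly 95% is slightly below 2.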

How the Sampling Distribution Can Be Used
If we don’t know anything about the sampling distribution except “in theory,” then how can we really use it?
Well, we can determine some information about the sampling distribution by taking an actual sample of size n and computing S_X-bar = S_X / sqrt(n).
S_X-bar is called the sample standard error.

A Practical Approach: Interval Estimates (Means)
Using a sample of size n, let X-bar serve as a point estimate of the true population mean μ_X and of the mean μ_X-bar of the sampling distribution.
Also let the sample standard error S_X-bar serve as an estimate of the standard error of X-bar.
From this information, we can build “confidence intervals” for the true mean μ_X of the population.

Confidence Intervals (Means)
Using a single sample of size n ≥ 30 with information X-bar and S_X-bar, a 95% confidence interval for the actual population mean μ_X is
X-bar ± 1.96 * S_X-bar
“We are 95% confident that the true population mean μ_X is between these two numbers.”
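A minimal sketch of this interval, assuming the usual form X-bar ± z * S_X / sqrt(n) with z = 1.96 for 95%. The function name and the 33 measurements are hypothetical.

```python
import statistics
from math import sqrt

def mean_confidence_interval(sample, z=1.96):
    """Return (lower, upper) for the population mean; z = 1.96 gives 95%."""
    n = len(sample)
    x_bar = statistics.mean(sample)                # point estimate of mu_X
    std_err = statistics.stdev(sample) / sqrt(n)   # sample standard error S_X-bar
    return x_bar - z * std_err, x_bar + z * std_err

# Hypothetical sample of n = 33 measurements scattered around 10.0.
sample = [10.0 + 0.1 * ((7 * i) % 11 - 5) for i in range(33)]

lower, upper = mean_confidence_interval(sample)
print(f"95% C.I. for the mean: ({lower:.4f}, {upper:.4f})")
```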

Confidence Intervals (Means) (cont’d)
For a k% confidence interval, 1.96 is replaced with the value z having P(−z ≤ Z ≤ z) = 0.01*k:
k%     0.01*(50 + k/2)    z = NORMSINV(0.01*(50 + k/2))
90%    0.950              1.645
95%    0.975              1.960
99%    0.995              2.576
Z is the standardized normal.
By formula, z = NORMSINV( 0.01*( 50 + k/2 ) )
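Outside Excel, the same multiplier can be sketched with `statistics.NormalDist().inv_cdf`, which plays the role of NORMSINV here. The function name is hypothetical.

```python
from statistics import NormalDist

def z_for_confidence(k: float) -> float:
    """Multiplier z for a k% confidence interval, mirroring
    NORMSINV(0.01 * (50 + k/2))."""
    return NormalDist().inv_cdf(0.01 * (50 + k / 2))

for k in (90, 95, 99):
    print(f"{k}% -> z = {z_for_confidence(k):.3f}")
```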

Sample Problem
The corresponding Excel file contains a sample of size 80 on the length of a precision shaft for use in lathes.
a. Calculate the mean, standard deviation, and standard error of the 80 values.
b. Construct 95% and 99% confidence intervals (C.I.s) for the population mean (see Excel).

Confidence Intervals (Proportions)
If you have a sample of size n with sample proportion X/n, then a 95% confidence interval for the true population proportion is
X/n ± 1.96 * sqrt( (X/n)(1 − X/n) / n )
The interval follows the same rules as for means, with n ≥ 30.

Estimate = X/100

The estimate is, therefore, a binomial random variable (divided by n) with:
Mean = np/n = p
Standard Deviation = sqrt(np[1-p])/n = sqrt(p[1-p]/n)
Note: We can apply the CLT to approximate the binomial with a normal having the same mean and standard deviation.

Central Limit Theorem Restated for Population Proportions
As the sample size, n, increases, the sampling distribution approaches a normal distribution with
Mean = p
Standard deviation = sqrt[p (1 – p)/n]

Heart Valve Example Re-Visited
79 out of 100 assemblies were good, so the estimates for mean and stdev are:
Mean = 0.79
Stdev = sqrt[0.79 (1 – 0.79)/100] ≈ 0.0407

P(0.79 – r <= p <= 0.79 + r) = 95%
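A short sketch of the arithmetic for the heart valve example, assuming the 95% multiplier 1.96 used elsewhere in these slides, so that r = 1.96 times the estimated standard deviation:

```python
from math import sqrt

good, n = 79, 100                         # 79 of 100 assemblies were good
p_hat = good / n                          # estimate of the proportion p
std_err = sqrt(p_hat * (1 - p_hat) / n)   # sqrt[0.79 (1 - 0.79)/100]
r = 1.96 * std_err                        # half-width of the 95% interval
lower, upper = p_hat - r, p_hat + r

print(f"stdev = {std_err:.4f}, r = {r:.4f}")
print(f"95% C.I. for p: ({lower:.3f}, {upper:.3f})")
```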

Example Problem Continued (see Excel)
c. Estimate the population proportion of shafts that exceed ___ inches. Construct a 90% C.I. for this proportion.

Sample Size Needed to Achieve High Confidence (Means)
Considering estimating μ_X, how many observations n are needed to obtain a 95% confidence interval for a particular error tolerance?
The error tolerance E is ½ the width of the confidence interval.
n = (1.96 * σ / E)^2
Here, σ is a conservative (high) estimate of the true std dev σ_X, often gotten by doing a preliminary small sample.
The value 1.96 can be adjusted to get different confidences.
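A sketch of the sample size calculation for means, assuming the standard formula n = (z * σ / E)^2 rounded up to a whole observation. The function name and the values of σ and E are hypothetical.

```python
from math import ceil

def sample_size_for_mean(sigma: float, error: float, z: float = 1.96) -> int:
    """Smallest n whose confidence interval half-width is at most `error`.

    sigma is a conservative (high) estimate of the population std dev.
    """
    n = (z * sigma / error) ** 2
    # Round away floating-point noise before taking the ceiling.
    return ceil(round(n, 6))

# Hypothetical inputs: sigma = 0.05 and E = 0.01, in the same units.
n_95 = sample_size_for_mean(sigma=0.05, error=0.01)            # 95% confidence
n_99 = sample_size_for_mean(sigma=0.05, error=0.01, z=2.576)   # 99% confidence
print(f"n for 95% = {n_95}, n for 99% = {n_99}")
```

Higher confidence at the same error tolerance needs a larger sample, because z enters the formula squared.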

Example Problem Continued (see Excel)
d. Consider the sample of 80 as a preliminary sample. Find the minimal sample size to yield a 95% C.I. for the population mean with E = ___. What about for 99% confidence?

Sample Size Needed to Achieve High Confidence (Proportions)
Considering estimating p_X, how many observations n are needed to obtain a 95% confidence interval for a particular error tolerance?
The error tolerance E is ½ the width of the confidence interval.
n = (1.96 / E)^2 * p(1 − p)
Here, p is a conservative (closer to 0.5) estimate of the true population proportion p_X, often gotten by doing a preliminary small sample.
The value 1.96 can be adjusted to get different confidences.

Polling Example
In estimating the proportion of the population that approves of Bush’s performance as President, how many people should be polled to provide a 95% confidence interval with error 0.02?
“83% of Americans approve of the job Bush is doing, plus or minus 2 percentage points”
We should use a conservative estimate of the true proportion:
n = (1.960/0.02)^2 (0.5)(1 – 0.5) = 2401
But, if we’re certain p_X ≥ 0.6…
n = (1.960/0.02)^2 (0.6)(1 – 0.6) = 2304.96, so n = 2305
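The polling arithmetic can be sketched as a small function. The function name is hypothetical; the formula and inputs are the ones on the slide, with the result rounded up to a whole respondent.

```python
from math import ceil

def sample_size_for_proportion(p: float, error: float, z: float = 1.960) -> int:
    """Smallest n for a confidence interval with half-width `error`.

    p is a conservative (closer to 0.5) estimate of the true proportion.
    """
    n = (z / error) ** 2 * p * (1 - p)
    # Round away floating-point noise before taking the ceiling.
    return ceil(round(n, 6))

n_conservative = sample_size_for_proportion(p=0.5, error=0.02)
n_if_p_at_least_06 = sample_size_for_proportion(p=0.6, error=0.02)
print(f"p = 0.5: n = {n_conservative}")
print(f"p = 0.6: n = {n_if_p_at_least_06}")
```

Using p = 0.5 maximizes p(1 − p), so it gives the largest (safest) sample size when nothing is known about the true proportion.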