Sampling Theory: Determining the Distribution of Sample Statistics


Sampling Theory: sampling distributions. A sample is a subset of the population. In many instances it is too costly to collect data from the entire population. Note: It is important to recognize the variability we should expect to see among different samples from the same population. It is important that we model this variability and use it to assess the accuracy of decisions made from samples.

Statistics and Parameters. A statistic is a numerical value computed from a sample. Its value may differ from sample to sample, e.g. the sample mean \(\bar{x}\), the sample standard deviation \(s\), and the sample proportion \(\hat{p}\). A parameter is a numerical value associated with a population. It is considered fixed and unchanging, e.g. the population mean \(\mu\), the population standard deviation \(\sigma\), and the population proportion \(p\).

Observations on a measurement \(X\), \(x_1, x_2, x_3, \ldots, x_n\), taken on individuals (cases) selected at random from a population are random variables prior to their observation. The observations are numerical quantities whose values are determined by the outcome of a random experiment (the choosing of a random sample from the population).

The probability distribution of the observations \(x_1, x_2, x_3, \ldots, x_n\) is sometimes called the population. This distribution is the smooth histogram of the variable \(X\) for the entire population.

The population is unobserved (unless every individual in the population has been observed).

A histogram computed from the observations \(x_1, x_2, x_3, \ldots, x_n\) gives an estimate of the population.

A statistic computed from the observations \(x_1, x_2, x_3, \ldots, x_n\) is also a random variable prior to observation of the sample. A statistic is also a numerical quantity whose value is determined by the outcome of a random experiment (the choosing of a random sample from the population).

The probability distribution of a statistic computed from the observations \(x_1, x_2, x_3, \ldots, x_n\) is sometimes called its sampling distribution. This distribution describes the random behaviour of the statistic.
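The idea that a statistic is a random variable can be seen in a small simulation. The sketch below is only an illustration: the population (10,000 values from a Normal distribution with mean 100 and standard deviation 15), the sample size of 25, and the name `sample_mean` are all assumptions chosen for the demonstration, not values from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 values from Normal(mean=100, sd=15).
population = [random.gauss(100, 15) for _ in range(10_000)]

def sample_mean(n):
    """Draw a simple random sample of size n and return its mean."""
    return statistics.mean(random.sample(population, n))

# Each new sample gives a different value of the statistic.
means = [sample_mean(25) for _ in range(2_000)]

print("first five sample means:", [round(m, 2) for m in means[:5]])
print("mean of the sample means:", round(statistics.mean(means), 2))
print("sd of the sample means:", round(statistics.stdev(means), 2))
print("population sd / sqrt(25):", round(statistics.pstdev(population) / 5, 2))
```

The collection of 2,000 sample means is an empirical picture of the sampling distribution of the mean: it is centred near the population mean and is much less spread out than the individual observations.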

It is important to determine the sampling distribution of a statistic. It will describe its sampling behaviour. The sampling distribution will be used to assess the accuracy of the statistic when used for the purpose of estimation. Sampling theory is the area of Mathematical Statistics that is concerned with determining the sampling distribution of various statistics.

Many statistics have a normal distribution. This is quite often true if the population is Normal. It is also sometimes true if the sample size is reasonably large (reason: the Central Limit Theorem, to be mentioned later).

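Two important statistics that have a normal distribution (exactly or approximately): 1) The sample mean, \(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\). 2) The sample proportion, \(\hat{p} = X/n\), where \(X\) is the number of successes in a Binomial experiment.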

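The sampling distribution of the sample mean. If \(x_1, x_2, \ldots, x_n\) is a sample from a Normal distribution with mean \(\mu\) and standard deviation \(\sigma\), then the sample mean \(\bar{x}\) has a Normal distribution with mean \(\mu_{\bar{x}} = \mu\) and standard deviation \(\sigma_{\bar{x}} = \sigma/\sqrt{n}\).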

Graphs [Figure: the probability distribution of individual observations and the sampling distribution of the mean.]

Example. Suppose we are measuring the cholesterol level of men in a particular age group. This measurement has a Normal distribution with mean \(\mu = 220\) and standard deviation \(\sigma = 17\). A sample of \(n = 10\) men in that age group is selected and the cholesterol level is measured for each of them; \(x_1, x_2, \ldots, x_{10}\) are those 10 measurements. Find the probability distribution of \(\bar{x}\). Compute the probability that \(\bar{x}\) is between 215 and 225.

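Solution. Find the probability distribution of \(\bar{x}\): since the population is Normal, \(\bar{x}\) is Normal with mean \(\mu_{\bar{x}} = \mu = 220\) and standard deviation \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 17/\sqrt{10} \approx 5.376\). Then \(P(215 \le \bar{x} \le 225) = P(-0.93 \le z \le 0.93) \approx 0.648\).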
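The cholesterol calculation can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the comparison with a single observation is an addition meant to mirror the two curves in the graphs.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 220, 17, 10      # values given in the example
se = sigma / sqrt(n)            # standard deviation of the sample mean

single = NormalDist(mu, sigma)  # distribution of one observation
xbar = NormalDist(mu, se)       # sampling distribution of the mean

p_single = single.cdf(225) - single.cdf(215)
p_mean = xbar.cdf(225) - xbar.cdf(215)

print(f"standard error of the mean: {se:.3f}")
print(f"P(215 <= x    <= 225) = {p_single:.3f}")
print(f"P(215 <= xbar <= 225) = {p_mean:.3f}")
```

The mean of 10 measurements falls in the interval far more often than a single measurement does, because its distribution is narrower by a factor of \(\sqrt{10}\).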

Graphs [Figure: the probability distribution of individual observations and the sampling distribution of the mean.]

The Central Limit Theorem. The Central Limit Theorem (C.L.T.) states that if \(n\) is sufficiently large, the sample means of random samples from a population with mean \(\mu\) and finite standard deviation \(\sigma\) are approximately normally distributed with mean \(\mu\) and standard deviation \(\sigma/\sqrt{n}\). Technical Note: The mean and standard deviation given in the CLT hold for any sample size; it is only the “approximately normal” shape that requires \(n\) to be sufficiently large.
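The technical note can be justified with a short derivation. It assumes the observations \(x_1, \ldots, x_n\) are independent, each with mean \(\mu\) and variance \(\sigma^2\); independence is an assumption added here and is not stated on the slide.

```latex
\begin{align*}
E(\bar{x}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(x_i)
            = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^{2}}\sum_{i=1}^{n} \operatorname{Var}(x_i)
            = \frac{1}{n^{2}}\, n\sigma^{2}
            = \frac{\sigma^{2}}{n},
\qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Neither step uses the shape of the population or the size of \(n\), which is why only the normal shape depends on a large sample.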

Graphical Illustration of the Central Limit Theorem [Figure: panels showing the original population and the distribution of \(\bar{x}\) for n = 2, n = 10, and n = 30.]
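The same illustration can be reproduced by simulation. In the sketch below the skewed population is assumed to be Exponential with rate 1 (mean 1, standard deviation 1); the slides do not say which population was used, and the helper names `skewness` and `simulate_means` are hypothetical. Skewness near 0 indicates a symmetric, more normal-looking shape.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_means(n, reps=10_000):
    """Means of `reps` samples of size n from an Exponential(1) population."""
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

results = {}
for n in (2, 10, 30):
    means = simulate_means(n)
    results[n] = (statistics.fmean(means),
                  statistics.pstdev(means),
                  skewness(means))
    centre, spread, skew = results[n]
    print(f"n={n:2d}: mean={centre:.3f}  sd={spread:.3f}  "
          f"sigma/sqrt(n)={1 / n ** 0.5:.3f}  skewness={skew:.2f}")
```

As n grows, the simulated standard deviation tracks \(\sigma/\sqrt{n}\) and the skewness shrinks toward 0, which is the pattern the panels are meant to show.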

Implications of the Central Limit Theorem. The conclusion that the sampling distribution of the sample mean is Normal will be approximately true if the sample size is large (n > 30), even though the population may be non-normal. When the population can be assumed to be normal, the sampling distribution of the sample mean is Normal for any sample size. Knowing the sampling distribution of the sample mean allows us to answer probability questions related to the sample mean.

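Example. Consider a normal population with \(\mu = 50\) and \(\sigma = 15\). Suppose a sample of size 9 is selected at random. Find: 1) \(P(45 \le \bar{x} \le 60)\) 2) \(P(\bar{x} \le 47.5)\). Solutions: Since the original population is normal, the distribution of the sample mean is also (exactly) normal, with \(\mu_{\bar{x}} = \mu = 50\) and \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 15/\sqrt{9} = 15/3 = 5\).

Example 1) Using \(z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}\): \[ P(45 \le \bar{x} \le 60) = P\!\left(\frac{45 - 50}{5} \le z \le \frac{60 - 50}{5}\right) = P(-1.00 \le z \le 2.00) \approx 0.8186 \]

Example 2) Using \(z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}\): \[ P(\bar{x} \le 47.5) = P\!\left(z \le \frac{47.5 - 50}{5}\right) = P(z \le -0.50) \approx 0.3085 \]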
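The standardization step in this example can be written out explicitly. The sketch below converts each bound for the sample mean to a z-score and looks it up in the standard normal distribution; the helper name `z_score` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 15, 9
se = sigma / sqrt(n)          # 15 / 3 = 5
std_normal = NormalDist()     # mean 0, standard deviation 1

def z_score(xbar):
    """Standardize a value of the sample mean."""
    return (xbar - mu) / se

p1 = std_normal.cdf(z_score(60)) - std_normal.cdf(z_score(45))
p2 = std_normal.cdf(z_score(47.5))

print(f"z-scores: {z_score(45):.2f}, {z_score(60):.2f}, {z_score(47.5):.2f}")
print(f"P(45 <= xbar <= 60) = {p1:.4f}")
print(f"P(xbar <= 47.5)     = {p2:.4f}")
```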

Example. A recent report stated that the day-care cost per week in Boston is $109. Suppose this figure is taken as the mean cost per week and that the standard deviation is known to be $20. 1) Find the probability that a sample of 50 day-care centers would show a mean cost of $105 or less per week. 2) Suppose the actual sample mean cost for the sample of 50 day-care centers is $120. Is there any evidence to refute the claim of $109 presented in the report? Solutions: The shape of the original distribution is unknown, but the sample size, n, is large, so the CLT applies. The distribution of \(\bar{x}\) is approximately normal with \(\mu_{\bar{x}} = \mu = 109\) and \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 20/\sqrt{50} \approx 2.83\).

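Example 1) Using \(z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}\) with \(\sigma/\sqrt{n} = 20/\sqrt{50} \approx 2.83\): \[ P(\bar{x} \le 105) = P\!\left(z \le \frac{105 - 109}{2.83}\right) = P(z \le -1.41) \approx 0.079 \]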

2) To investigate the claim, we need to examine how likely it is to observe a sample mean of $120. Consider how far out in the tail of the distribution of the sample mean $120 lies. Using \(z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}\): \[ P(\bar{x} \ge 120) = P\!\left(z \ge \frac{120 - 109}{20/\sqrt{50}}\right) = P(z \ge 3.89) \approx 0.00005 \] Since the probability is so small, this suggests the observation of $120 is very rare (if the mean cost is really $109). There is evidence (the sample) to suggest the claim of \(\mu\) = $109 is likely wrong.
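Both parts of the day-care example can be checked in a few lines. This is a sketch that assumes, as the solution does, that the CLT makes the sample mean approximately normal; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 109, 20, 50    # claimed mean, known sd, sample size
se = sigma / sqrt(n)          # standard error of the mean
xbar = NormalDist(mu, se)     # approximate sampling distribution (CLT)

p_105_or_less = xbar.cdf(105)        # part 1
p_120_or_more = 1 - xbar.cdf(120)    # part 2: upper tail beyond $120

print(f"standard error: {se:.2f}")
print(f"P(xbar <= 105) = {p_105_or_less:.4f}")
print(f"P(xbar >= 120) = {p_120_or_more:.6f}")
```

A tail probability this small means a sample mean of $120 would almost never occur if the true mean were $109, which is the basis for doubting the claim.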

Summary. The distribution of \(\bar{x}\) is (exactly) normal when the original population is normal. The CLT says: the distribution of \(\bar{x}\) is approximately normal regardless of the shape of the original distribution, when the sample size is large enough! The mean of the sampling distribution of \(\bar{x}\) is equal to the mean of the original population: \(\mu_{\bar{x}} = \mu\). The standard deviation of the sampling distribution of \(\bar{x}\) (also called the standard error of the mean) is equal to the standard deviation of the original population divided by the square root of the sample size: \(\sigma_{\bar{x}} = \sigma/\sqrt{n}\).

Sampling Distribution for Any Statistic Every statistic has a sampling distribution, but the appropriate distribution may not always be normal, or even approximately bell-shaped.
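One way to see a non-normal sampling distribution is to simulate the sample variance for small samples. The sketch below is an illustration under assumed settings (a standard Normal population, samples of size 5, and a hypothetical `skewness` helper); none of these come from the slides.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

n, reps = 5, 10_000
sample_means = []
sample_variances = []
for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))
    sample_variances.append(statistics.variance(sample))

print(f"skewness of sample means:     {skewness(sample_means):.2f}")
print(f"skewness of sample variances: {skewness(sample_variances):.2f}")
```

Even though the population is Normal, the sample variance has a clearly right-skewed sampling distribution at this sample size, while the sample mean stays symmetric.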

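Sampling Distribution for Sample Proportions. Let \(p\) = population proportion of interest or binomial probability of success. Let \(\hat{p}\) = sample proportion or proportion of successes. For a large sample size \(n\), the sampling distribution of \(\hat{p}\) is approximately a normal distribution with mean \(\mu_{\hat{p}} = p\) and standard deviation \(\sigma_{\hat{p}} = \sqrt{p(1-p)/n}\).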

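Example: Sample Proportion Favoring a Candidate. Suppose 20% of all voters favor Candidate A. Pollsters take a sample of n = 600 voters. Then the sample proportion who favor A will have approximately a normal distribution with mean \(\mu_{\hat{p}} = p = 0.20\) and standard deviation \(\sigma_{\hat{p}} = \sqrt{p(1-p)/n} = \sqrt{0.20 \times 0.80 / 600} \approx 0.0163\).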

Suppose 20% of all voters favor Candidate A. Pollsters take a sample of n = 600 voters. Using the sampling distribution, determine the probability that the sample proportion will be between 0.18 and 0.22.

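Solution: With \(\sigma_{\hat{p}} = \sqrt{0.20 \times 0.80 / 600} \approx 0.0163\), \[ P(0.18 \le \hat{p} \le 0.22) = P\!\left(\frac{0.18 - 0.20}{0.0163} \le z \le \frac{0.22 - 0.20}{0.0163}\right) \approx P(-1.22 \le z \le 1.22) \approx 0.78 \]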
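The polling probability can be computed from the normal approximation and, as a cross-check, from the exact Binomial distribution. The exact comparison is an addition not found in the slides; it uses the fact that \(0.18 \le X/600 \le 0.22\) is the same event as \(108 \le X \le 132\).

```python
from math import comb, sqrt
from statistics import NormalDist

p, n = 0.20, 600
se = sqrt(p * (1 - p) / n)    # standard deviation of the sample proportion

# Normal approximation to the sampling distribution of the proportion.
phat = NormalDist(p, se)
approx = phat.cdf(0.22) - phat.cdf(0.18)

# Exact Binomial probability that the count X is between 108 and 132.
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
            for k in range(108, 133))

print(f"standard deviation of p-hat: {se:.4f}")
print(f"normal approximation: {approx:.3f}")
print(f"exact binomial:       {exact:.3f}")
```

The two answers differ slightly because the count X is discrete while the normal curve is continuous; the approximation is close but not exact at this sample size.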