Ka-fu Wong © 2003 Chap 8- 1 Dr. Ka-fu Wong ECON1003 Analysis of Economic Data.


1 Ka-fu Wong © 2003 Chap 8- 1 Dr. Ka-fu Wong ECON1003 Analysis of Economic Data

2 Chapter Eight: Sampling Methods and the Central Limit Theorem. GOALS: Explain why a sample is the only feasible way to learn about a population. Describe methods to select a sample. Define and construct a sampling distribution of the sample mean. Explain the central limit theorem. Use the central limit theorem to find probabilities of selecting possible sample means from a specified population.

3 Why Sample the Population? The physical impossibility of checking all items in the population. The cost of studying all the items in a population. The sample results are usually adequate. Contacting the whole population would often be time-consuming. The destructive nature of certain tests.

4 Probability Sampling A probability sample is a sample selected such that each item or person in the population being studied has a known likelihood of being included in the sample.

5 Methods of Probability Sampling Simple Random Sample: A sample selected so that each item or person in the population has the same chance of being included. Systematic Random Sampling: The items or individuals of the population are arranged in some order. A random starting point is selected and then every kth member of the population is selected for the sample.

6 Methods of Probability Sampling Stratified Random Sampling: A population is first divided into subgroups, called strata, and a sample is selected from each stratum. Cluster Sampling: A population is first divided into primary units, then samples are selected from the primary units.
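
The four probability sampling methods described on slides 5 and 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered items, the two strata, the ten "blocks" used as primary units, and all function names are hypothetical and do not come from the slides.

```python
import random

def simple_random_sample(population, n, rng):
    # Every item has the same chance of being included.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Random starting point among the first k items, then every kth item.
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    # Draw a simple random sample from each stratum.
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    # Pick some primary units at random, then sample within each chosen unit.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return sample

rng = random.Random(0)                      # fixed seed for repeatability
population = list(range(1, 101))            # hypothetical population: items 1..100
strata = {"low": population[:50], "high": population[50:]}
clusters = {f"block{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
sys_s = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, 5, rng)

print("simple random:", sorted(srs))
print("systematic:   ", sys_s)
print("stratified:   ", sorted(strat))
print("cluster:      ", sorted(clus))
```

In every case the inclusion probability of each item is known in advance, which is what makes these probability samples.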

7 Potential problems with the sampling method of “Sampling Straws” Choice of sampling method is important. As illustrated in the tutorial “Sampling Straws” experiments, some sampling methods can produce a biased estimate of the population parameters. The bag contains a total of 12 straws, 4 of which are 4 inches in length, 4 are 2 inches long, and 4 are 1 inch long. The population mean length is 2.33 (= 4*(1+2+4)/12). Randomly draw 4 straws one by one with replacement. Compute the sample mean. The average of the sample means of the experiments is 2.37. The corresponding standard deviation is 0.61.

8 Methods of Probability Sampling “Sampling Straws” experiments The bag contains a total of 12 straws, 4 of which are 4 inches in length, 4 are 2 inches long, and 4 are 1 inch long. The population mean length is 2.33 (= 4*(1+2+4)/12). Randomly draw 4 straws one by one with replacement. Compute the sample mean. The average of the sample means of the experiments is 2.37. The corresponding standard deviation is 0.61. The sampling scheme is biased because the longer straws have a higher chance of being drawn even when the drawer makes no deliberate choice (say, by pulling out the first straw touched). The draw may also fail to be random because we can feel the length of a straw before we pull it out.

9 Methods of Probability Sampling “Sampling Straws” experiments The bag contains a total of 12 straws, 4 of which are 4 inches in length, 4 are 2 inches long, and 4 are 1 inch long. The population mean length is 2.33 (= 4*(1+2+4)/12). Randomly draw 4 straws one by one with replacement. Compute the sample mean. The average of the sample means of the experiments is 2.37. The corresponding standard deviation is 0.61. Alternative sampling scheme: Label the straws 1 to 12. Label 12 identical balls 1 to 12. Draw four balls with replacement. Measure the corresponding straws and compute the sample mean.
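
A small simulation can contrast the two straw schemes. The sketch below rests on an assumption that is not in the slides: the "first straw touched" draw is modeled as picking a straw with probability proportional to its length. The alternative scheme with numbered balls is modeled as an equal-chance draw. The seed, the number of trials, and the function names are illustrative only.

```python
import random
from statistics import mean, pstdev

# The bag from the slides: four straws each of 4, 2 and 1 inches.
straws = [4.0] * 4 + [2.0] * 4 + [1.0] * 4
population_mean = mean(straws)              # 28/12, about 2.33

rng = random.Random(42)                     # fixed seed for repeatability

def sample_mean_equal_chance(n=4):
    # Numbered-ball scheme: each straw is equally likely, with replacement.
    return mean(rng.choices(straws, k=n))

def sample_mean_length_biased(n=4):
    # ASSUMPTION: chance of being drawn is proportional to straw length.
    return mean(rng.choices(straws, weights=straws, k=n))

trials = 20000
fair = [sample_mean_equal_chance() for _ in range(trials)]
biased = [sample_mean_length_biased() for _ in range(trials)]

print("population mean:            ", round(population_mean, 2))
print("equal-chance, mean of means:", round(mean(fair), 2))
print("equal-chance, SD of means:  ", round(pstdev(fair), 2))
print("length-biased, mean of means:", round(mean(biased), 2))
```

Under the equal-chance scheme the mean of the sample means settles near the population mean of 2.33, while under the assumed length-proportional model it settles near 3, since the expected drawn length is (16 + 4 + 1)/(4 + 2 + 1) = 3.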

10 Methods of Probability Sampling In a nonprobability sample, inclusion in the sample is based on the judgment of the person selecting the sample. The sampling error is the difference between a sample statistic and its corresponding population parameter. Sampling error is almost always nonzero.

11 Sampling Distribution of the Sample Means The sampling distribution of the sample mean is a probability distribution consisting of all possible sample means of a given sample size selected from a population.

12 EXAMPLE 1 The law firm of Hoya and Associates has five partners. At their weekly partners' meeting each reported the number of hours they billed clients for their services last week. The population mean is 25.2 hours.

13 Example 1 If two partners are selected randomly, how many different samples are possible? This is the combination of 5 objects taken 2 at a time. That is: $_5C_2 = \frac{5!}{2!\,(5-2)!} = 10$. There are a total of 10 different samples.

14 Example 1 continued

15 EXAMPLE 1 continued Organize the sample means into a sampling distribution. The mean of the sample means is 25.2 hours. The mean of the sample means is exactly equal to the population mean.

16 Example 1 Population variance: $[(22-25.2)^2 + (26-25.2)^2 + \dots + (22-25.2)^2]/5 = 8.96$. Variance of the sample means: $[(1)(22-25.2)^2 + (4)(24-25.2)^2 + (3)(26-25.2)^2 + (2)(28-25.2)^2]/(1+4+3+2) = 3.36$. The variance of the sample means is smaller than the population variance: $3.36/8.96 = 0.375 < 1$. Note that this is like sampling without replacement.
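
The figures in Example 1 can be checked by enumerating all 10 samples. In the sketch below the five billed-hours values (22, 26, 30, 26, 22) are an inference, not a quotation: they are the values implied by the sampling distribution in the variance calculation (sample means 22, 24, 26, 28 with frequencies 1, 4, 3, 2) and by the population mean of 25.2. The last lines also compare the result with the standard finite population correction for sampling without replacement, which the slides do not mention.

```python
from itertools import combinations
from statistics import mean, pvariance

# ASSUMPTION: hours inferred from the stated sampling distribution and mean.
hours = [22, 26, 30, 26, 22]
N, n = len(hours), 2

# All C(5, 2) = 10 samples of two partners, drawn without replacement.
sample_means = [mean(pair) for pair in combinations(hours, n)]

pop_mean = mean(hours)
pop_var = pvariance(hours)              # divide by N, as on the slide
var_of_means = pvariance(sample_means)  # divide by the number of samples

# Finite population correction for sampling without replacement.
fpc = (N - n) / (N - 1)

print("number of samples:       ", len(sample_means))
print("population mean:         ", pop_mean)
print("mean of sample means:    ", mean(sample_means))
print("population variance:     ", round(pop_var, 2))
print("variance of sample means:", round(var_of_means, 2))
print("(pop_var / n) * fpc:     ", round(pop_var / n * fpc, 2))
```

The ratio 3.36/8.96 = 0.375 equals (1/n) × (N − n)/(N − 1) = (1/2)(3/4), which is why the variance of the sample means falls below the population variance divided by the sample size here.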

17 Example Suppose we had a uniformly distributed population containing equal proportions (hence equally probable instances) of (0,1,2,3,4). If you were to draw a very large number of random samples from this population, each of size n=2, the possible combinations of drawn values and the sums are:
Sum   Combinations
0     0,0
1     0,1  1,0
2     1,1  2,0  0,2
3     1,2  2,1  3,0  0,3
4     1,3  3,1  2,2  4,0  0,4
5     1,4  4,1  3,2  2,3
6     3,3  4,2  2,4
7     3,4  4,3
8     4,4
Note that this is sampling with replacement.

18 Example Population mean = mean of sample means. Population mean: $(0+1+2+3+4)/5 = 2$. Mean of sample means: $[(1)(0) + (2)(0.5) + \dots + (1)(4)]/25 = 2$. Variance of sample means = population variance / sample size. Population variance: $[(0-2)^2 + (1-2)^2 + \dots + (4-2)^2]/5 = 2$. Variance of sample means: $[(1)(0-2)^2 + (2)(0.5-2)^2 + \dots + (1)(4-2)^2]/25 = 1$.
Mean  Combinations
0.0   0,0
0.5   0,1  1,0
1.0   1,1  2,0  0,2
1.5   1,2  2,1  3,0  0,3
2.0   1,3  3,1  2,2  4,0  0,4
2.5   1,4  4,1  3,2  2,3
3.0   3,3  4,2  2,4
3.5   3,4  4,3
4.0   4,4
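
The with-replacement example can be verified by listing all 25 ordered samples of size 2 from (0,1,2,3,4). This is a minimal sketch; the variable names are illustrative.

```python
from collections import Counter
from itertools import product
from statistics import mean, pvariance

population = [0, 1, 2, 3, 4]
n = 2

# All 5 * 5 = 25 ordered samples drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Frequency of each possible sample mean (the sampling distribution).
counts = Counter(sample_means)
for value in sorted(counts):
    print(f"mean {value:.1f}: {counts[value]} of {len(samples)} samples")

pop_var = pvariance(population)
var_of_means = pvariance(sample_means)

print("population mean:         ", mean(population))
print("mean of sample means:    ", mean(sample_means))
print("population variance:     ", pop_var)
print("variance of sample means:", var_of_means)
```

With replacement, the variance of the sample means is exactly the population variance divided by the sample size, 2/2 = 1, with no finite population correction.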

19 Probability Histograms In a probability histogram, the area of each bar represents the chance of a value happening as a result of the random (chance) process. Empirical histograms (from observed data) for a process converge to the probability histogram.

20 Examples of empirical histogram Roll a fair die 20, 50, and 200 times. [Figure: empirical histograms for 20, 50, and 200 rolls.] The empirical histogram will approach the probability histogram as the number of draws increases.

21 Empirical histogram of sum Roll a fair die 20 times and sum the rolls.

22 Empirical histogram of average Roll a fair die 20 times and average the rolls.
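
The convergence of the empirical histogram to the probability histogram can be illustrated with simulated die rolls. The sketch below is an assumption-laden illustration: the seed, the extra run of 20,000 rolls, and the helper name `empirical_histogram` are not from the slides.

```python
import random
from collections import Counter

def empirical_histogram(rolls):
    # Relative frequency of each face 1..6 in the observed rolls.
    counts = Counter(rolls)
    return {face: counts[face] / len(rolls) for face in range(1, 7)}

rng = random.Random(1)                  # fixed seed for repeatability
gaps = {}
for n_rolls in (20, 50, 200, 20000):
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    hist = empirical_histogram(rolls)
    # Largest distance between an observed frequency and the true chance 1/6.
    gaps[n_rolls] = max(abs(p - 1 / 6) for p in hist.values())
    print(n_rolls, {face: round(p, 3) for face, p in hist.items()},
          "max gap:", round(gaps[n_rolls], 3))
```

The gap need not shrink at every step, because each run is random, but it tends toward zero as the number of rolls grows.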

23 Distribution of sample means for different sample sizes and different population distributions http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html http://www.kuleuven.ac.be/ucs/java/index.htm (choose "basic" and "distribution of mean") http://faculty.vassar.edu/lowry/central.html

24 Central Limit Theorem For a population with a mean $\mu$ and a variance $\sigma^2$, the sampling distribution of the means of all possible samples of size n generated from the population will be approximately normally distributed when n is large. The mean of the sampling distribution is equal to $\mu$ and the variance is equal to $\sigma^2/n$. [Figure: the population distribution and the distribution of the sample mean of n observations.]

25 Central Limit Theorem: Sums For a large number of random draws, with replacement, the distribution of the sum approximately follows the normal distribution. The mean of the normal distribution is n × (expected value for one repetition). The SD for the sum (SE) is $\sqrt{n}$ × (SD for one repetition). This holds even if the underlying population is not normally distributed.

26 Central Limit Theorem: Averages For a large number of random draws, with replacement, the distribution of the average = (sum)/n approximately follows the normal distribution. The mean for this normal distribution is the expected value for one repetition. The SD for the average (SE) is (SD for one repetition)/$\sqrt{n}$. This holds even if the underlying population is not normally distributed.

27 Law of large numbers The sample mean converges to the population mean as n gets large. For a large number of random draws from any population, with replacement, the distribution of the average = (sum)/n approximately follows the normal distribution. The mean for this normal distribution is the expected value for one repetition. The SD for the average (SE) is (SD for one repetition)/$\sqrt{n}$, so the SD for the average tends to zero as n increases. This holds even if the underlying population is not normally distributed.
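
The statements about sums and averages can be checked by simulation. The sketch below draws with replacement from a deliberately skewed "box" of tickets; the box contents, the number of draws n = 50, the number of trials, and the seed are all illustrative assumptions rather than anything taken from the slides.

```python
import random
from math import sqrt
from statistics import mean, pstdev

# ASSUMPTION: a small, right-skewed population chosen only for illustration.
box = [1, 1, 1, 1, 2, 2, 3, 10]
ev = mean(box)                  # expected value for one draw
sd = pstdev(box)                # SD for one draw

n = 50                          # draws per repetition
trials = 5000                   # number of repetitions
rng = random.Random(7)          # fixed seed for repeatability

sums = [sum(rng.choices(box, k=n)) for _ in range(trials)]
averages = [s / n for s in sums]

se_sum = sqrt(n) * sd           # SD for the sum
se_avg = sd / sqrt(n)           # SD for the average

# Share of averages within one SE of the expected value;
# a normal distribution would put about 68% there.
within_one_se = sum(abs(a - ev) <= se_avg for a in averages) / trials

print("mean of sums:", round(mean(sums), 2), "vs n * EV =", round(n * ev, 2))
print("SD of sums:  ", round(pstdev(sums), 2), "vs sqrt(n) * SD =", round(se_sum, 2))
print("mean of avgs:", round(mean(averages), 3), "vs EV =", round(ev, 3))
print("SD of avgs:  ", round(pstdev(averages), 3), "vs SD / sqrt(n) =", round(se_avg, 3))
print("share within one SE:", round(within_one_se, 3))
```

Raising n shrinks `se_avg`, which is the law of large numbers at work, while the shape of the histogram of averages moves closer to the normal curve.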

28 Point Estimates A point estimate is one value (a single point) that is used to estimate a population parameter. Examples of point estimates are the sample mean, the sample standard deviation, the sample variance, and the sample proportion.

29 Point Estimates If a population follows the normal distribution, the sampling distribution of the sample mean will also follow the normal distribution. To determine the probability a sample mean falls within a particular region, use: $z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

30 Point Estimates If the population does not follow the normal distribution, but the sample is of at least 30 observations, the sample means will approximately follow the normal distribution. To determine the probability a sample mean falls within a particular region, use: $z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

31 Example 2 Suppose the mean selling price of a gallon of gasoline in the United States is $1.30. Further, assume the distribution is positively skewed, with a standard deviation of $0.28. What is the probability of selecting a sample of 35 gasoline stations and finding the sample mean within $0.08 of the population mean?

32 Example 2 continued The first step is to find the z-values corresponding to $1.22 and $1.38. These are the two points within $0.08 of the population mean: $z = \frac{1.22 - 1.30}{0.28/\sqrt{35}} = -1.69$ and $z = \frac{1.38 - 1.30}{0.28/\sqrt{35}} = 1.69$.

33 Example 2 continued Next we determine the probability of a z-value between -1.69 and 1.69. It is: $P(-1.69 \le z \le 1.69) = 2(0.4545) = 0.9090$. We would expect about 91 percent of the sample means to be within $0.08 of the population mean.
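
Example 2 can be reproduced with the standard library's `statistics.NormalDist`. This is a sketch under the same normal approximation the slides rely on (a sample of 35 from a skewed population); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 1.30        # population mean price per gallon
sigma = 0.28     # population standard deviation
n = 35           # number of gasoline stations sampled
margin = 0.08    # "within $0.08 of the population mean"

se = sigma / sqrt(n)             # standard error of the sample mean
z = margin / se                  # z-value for the upper bound mu + margin

# Probability that the sample mean lies between mu - margin and mu + margin.
sampling_dist = NormalDist(mu, se)
prob = sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)

print("standard error:", round(se, 4))
print("z-values: +/-", round(z, 2))
print("probability:", round(prob, 4))
```

The computed probability differs from the table-based 0.9090 only in later decimals, because the table lookup rounds z to 1.69.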

34 - END - Chapter Eight Sampling Methods and the Central Limit Theorem

