CHAPTER-6 Sampling error and confidence intervals
[Diagram: population (parameter), sample (statistic), error]
Section 1 sampling error of mean
Section 2 t distribution
Section 3 confidence intervals for the population mean
Section 1 sampling error of mean
A simple random sample is a sample of size n drawn from a population of size N in such a way that every possible sample of size n has the same probability of being selected. Variability among the simple random samples drawn from the same population is called sampling variability, and the probability distribution that characterizes some aspect of the sampling variability, usually the mean but not always, is called a sampling distribution. These sampling distributions allow us to make objective statements about population parameters without measuring every object in the population.
[Example 1] The population mean of diastolic blood pressure (DBP) in Chinese adult men is 72 mmHg with a standard deviation of 5 mmHg. Ten participants are chosen at random from this population, and the sample mean and sample standard deviation are calculated. If this sampling is repeated 100 times, what is the result?
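The repeated sampling in Example 1 can be imitated with a short simulation. This is only a sketch: it assumes that DBP is normally distributed in the population (the example does not say so), and the seed and variable names are illustrative choices.

```python
import random
import statistics

# Values from Example 1; the normal shape of the population is an assumption.
POP_MEAN = 72.0   # mmHg
POP_SD = 5.0      # mmHg
N = 10            # participants per sample
REPEATS = 100     # number of repeated samples

rng = random.Random(2024)  # fixed seed so the run is reproducible

sample_means = []
sample_sds = []
for _ in range(REPEATS):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.mean(sample))
    sample_sds.append(statistics.stdev(sample))  # n - 1 in the denominator

print("first five sample means:", [round(m, 1) for m in sample_means[:5]])
print("mean of the 100 sample means:", round(statistics.mean(sample_means), 2))
print("SD of the 100 sample means:", round(statistics.stdev(sample_means), 2))
print("sigma / sqrt(n):", round(POP_SD / N ** 0.5, 2))
```

The individual sample means differ from one another and from 72, but they cluster around 72, and their spread is much smaller than the population standard deviation of 5.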
If random samples are repeatedly drawn from a population with mean μ and standard deviation σ, we find that:
1. the sample means differ from one another;
2. the sample means are not necessarily equal to the population mean μ;
3. the distribution of the sample means is symmetric about μ.
How to explore the sampling distribution of the mean?
The difference between a sample statistic and the population parameter, or the difference among sample statistics, is called sampling error.
In real life we sample only once, but we realize that our sample comes from a theoretical sampling distribution of all possible samples of a particular size. The sampling distribution concept provides a link between sampling variability and probability. Choosing a random sample is a chance operation and generating the sampling distribution consists of many repetitions of this chance operation.
Central limit theorem: when sampling from a normally distributed population with mean μ, the distribution of the sample mean will be normal with mean μ.
[Figure: population distribution of X with μ = 50 and σ = 10, and sampling distributions of the sample mean for n = 4 and n = 16]
Central limit theorem: when sampling from a non-normally distributed population with mean μ, the distribution of the sample mean will be approximately normal with mean μ as long as n is large enough (n > 50).
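The statement about non-normal populations can be checked by simulation. The sketch below assumes an exponential population with mean 1 as the skewed example (a choice made here, not taken from the slides) and a sample size of 60, just above the n > 50 rule of thumb.

```python
import random
import statistics

rng = random.Random(7)   # fixed seed for reproducibility

POP_MEAN = 1.0           # exponential population: mean = SD = 1 (assumed example)
N = 60                   # sample size, above the n > 50 rule of thumb
REPEATS = 2000           # number of repeated samples

def draw_sample_mean():
    """Mean of one random sample of size N from the skewed population."""
    return statistics.mean([rng.expovariate(1.0 / POP_MEAN) for _ in range(N)])

means = [draw_sample_mean() for _ in range(REPEATS)]

centre = statistics.mean(means)
spread = statistics.stdev(means)
# For a normal curve about 95% of values lie within 1.96 SD of the centre.
inside = sum(abs(m - centre) <= 1.96 * spread for m in means) / REPEATS

print("mean of sample means:", round(centre, 3))
print("SD of sample means:", round(spread, 3))
print("sigma / sqrt(n):", round(POP_MEAN / N ** 0.5, 3))
print("share within 1.96 SD:", round(inside, 3))
```

Although single observations from this population are strongly skewed, the sample means centre on the population mean and roughly 95% of them fall within 1.96 standard deviations of that centre, as a normal curve would predict.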
Standard error (SE) is used to measure the sampling error of the mean. Although sampling error is inevitable, its magnitude can be estimated.
Calculation of standard error (SE)
Theoretical value of SE: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
Estimate of SE: $s_{\bar{X}} = s / \sqrt{n}$
s ↑ → SE ↑; n ↑ → SE ↓
Example 5.2 An analyst randomly chose a sample of n = 100 people and measured their weights, obtaining a mean of 72 kg and a standard deviation of 15 kg. Question: what is the standard error?
Solution: SE = $s / \sqrt{n} = 15 / \sqrt{100} = 1.5$ kg
Exercise 5.1 Consider a sample of 100 measurements with mean 121 cm and standard deviation 7 cm drawn from a normal population. Compute its standard error.
Solution: SE = $s / \sqrt{n} = 7 / \sqrt{100} = 0.7$ cm
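The standard error calculation used in the example and the exercise above can be written as a small helper. The function name `standard_error` is an illustrative choice.

```python
import math

def standard_error(sd, n):
    """Estimated standard error of the mean: s / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / math.sqrt(n)

print(standard_error(15, 100))  # weights example: 1.5 (kg)
print(standard_error(7, 100))   # exercise: 0.7 (cm)
print(standard_error(15, 400))  # same s, four times the sample size: 0.75
```

The last call illustrates n ↑ → SE ↓: quadrupling the sample size halves the standard error.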
Section 2 t distribution
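1. Definition
If $X \sim N(\mu, \sigma^2)$, then $z = \frac{X - \mu}{\sigma} \sim N(0, 1)$.
Under random sampling, $\bar{X} \sim N(\mu, \sigma^2 / n)$, so $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$.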
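Usually the population standard deviation σ is unknown and only the sample standard deviation s is available, so we calculate $t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$, with degrees of freedom $\nu = n - 1$.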
This sampling distribution was developed by W. S. Gosset and published under the pseudonym "Student"; it is therefore sometimes called "Student's t distribution". It is really a family of distributions that depends on the degrees of freedom, n − 1.
$\nu = n - 1$
[Figure: Z distribution compared with t distribution]
2. The characteristics of the t distribution graph
FIG 4 The graph of the t distribution with different degrees of freedom
1. It is symmetric about 0.
2. The shape of the t curve is determined by the degrees of freedom, df = n − 1.
3. The t distribution approaches the standard normal distribution as n tends to infinity.
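These three characteristics can be checked numerically from the density of the t distribution. The sketch below uses the standard formula for the t density; the helper name `t_pdf` and the chosen degrees of freedom are illustrative.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

standard_normal = NormalDist()  # N(0, 1)

print("df    height at 0   height at 3")
for df in (1, 5, 30, 1000):
    print(df, round(t_pdf(0.0, df), 4), round(t_pdf(3.0, df), 4))
print("normal", round(standard_normal.pdf(0.0), 4), round(standard_normal.pdf(3.0), 4))

# Symmetry about 0: the density is the same at -x and +x.
print(t_pdf(-1.3, 7) == t_pdf(1.3, 7))
```

With few degrees of freedom the curve is lower at the centre and heavier in the tails than the standard normal curve; as the degrees of freedom grow, both columns move toward the normal values.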
t critical value with one-sided probability → $t_{\alpha,\nu}$
t critical value with two-sided probability → $t_{\alpha/2,\nu}$
Example 5.2 With n = 15, find $t_0$ such that $P(-t_0 \le t \le t_0) = 0.90$.
solution From the t value table with df = 15 − 1 = 14, the two-tailed shaded area equals 0.10, so $-t_0 = -1.761$ and $t_0 = 1.761$.
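The table lookup can be cross-checked numerically. The sketch below is not how the table was produced: it integrates the t density with Simpson's rule and then searches for the critical value by bisection, using only the Python standard library. The helper names and the search range of 0 to 100 are assumptions made for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(upper_tail, df):
    """Value t0 with P(T > t0) = upper_tail, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

# P(-t0 <= t <= t0) = 0.90 leaves 0.05 in each tail; df = 15 - 1 = 14.
t0 = t_critical(0.05, 14)
print(round(t0, 3))  # 1.761, matching the table value
```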
Section 3 confidence intervals for the population mean
Statistical methods
  descriptive statistics
  inferential statistics
    parameter estimation
      point estimation
      interval estimation
    hypothesis test
1. Basic concepts
Parameter estimation: inferring the population parameter from the sample statistics.
Point Estimate: a single-valued estimate; a single element chosen from a sampling distribution. It conveys little information about the actual value of the population parameter or about the accuracy of the estimate.
Confidence Interval or Interval Estimation An interval or range of values believed to include the unknown population parameter.
[Figure: a point estimate compared with an interval estimate bounded by a lower limit and an upper limit; the central area under the curve is 1 − α, with α/2 in each tail]
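2. Methods
Z distribution: (1) σ is known; (2) σ is unknown, n > 50. CI: $\bar{X} \pm z_{\alpha/2} \, \sigma / \sqrt{n}$, with s in place of σ when σ is unknown
t distribution: σ is unknown, n ≤ 50. CI: $\bar{X} \pm t_{\alpha/2,\nu} \, s / \sqrt{n}$, with $\nu = n - 1$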
Example 5.3 A horticultural scientist is developing a new variety of apple. One of the important traits, in addition to taste, color, and storability, is the uniformity of the fruit size. To estimate the weight, she samples 100 mature fruits and calculates a sample mean of 220 g and a standard deviation of 5 g. Develop a 95% confidence interval for the population mean μ from her sample.
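solution n = 100 > 50, so the Z distribution is used: $220 \pm 1.96 \times 5 / \sqrt{100} = 220 \pm 0.98$. The 95% confidence interval for the population mean is between 219.02 g and 220.98 g.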
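The large-sample interval for the apple example can be computed with the standard library's `statistics.NormalDist`. This is a minimal sketch; the function name `z_confidence_interval` is an illustrative choice.

```python
import math
from statistics import NormalDist

def z_confidence_interval(mean, sd, n, confidence=0.95):
    """Interval mean +/- z * sd / sqrt(n), for known sigma or large n."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95%
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Apple example: n = 100, mean 220 g, standard deviation 5 g.
lower, upper = z_confidence_interval(220, 5, 100)
print(round(lower, 2), round(upper, 2))  # 219.02 220.98
```

Raising the confidence level, for example `confidence=0.99`, widens the interval because a larger critical value is used.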
Exercise A forester is interested in estimating the average number of 'count trees' per acre. A random sample of n = 64 one-acre plots is selected and examined. The average (mean) number of count trees per acre is found to be 27.3, with a standard deviation of ____. Use this information to construct a 95% confidence interval for μ.
solution The 95% confidence interval for the population mean is between 24.36 and 30.24 (the interval is symmetric about the sample mean 27.3).
The forester is 95% confident that the population mean for "count trees" per acre is between 24.36 and 30.24.
Example 5.4 An ecologist samples 25 plants and measures their heights. He finds that the sample has a mean of 15 cm and a sample standard deviation of 4 cm. What is the 95% confidence interval for the population mean μ?
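solution df = 25 − 1 = 24 and $t_{0.05/2,24} = 2.064$, so $15 \pm 2.064 \times 4 / \sqrt{25} = 15 \pm 1.65$.
The plant ecologist is 95% confident that the population mean height of these plants is between 13.35 cm and 16.65 cm.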
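The small-sample interval for the plant heights can be reproduced without a t table. As a sketch under stated assumptions, the code below obtains the t critical value by numerically integrating the t density and searching by bisection, using only the standard library; the helper names and the search range of 0 to 100 are illustrative choices.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(upper_tail, df):
    """Value t0 with P(T > t0) = upper_tail, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > upper_tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_confidence_interval(mean, sd, n, confidence=0.95):
    """Interval mean +/- t * s / sqrt(n) with n - 1 degrees of freedom."""
    alpha = 1 - confidence
    t = t_critical(alpha / 2, n - 1)
    margin = t * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Plant heights: n = 25, mean 15 cm, sample standard deviation 4 cm.
lower, upper = t_confidence_interval(15, 4, 25)
print(round(lower, 2), round(upper, 2))  # 13.35 16.65
```

With the same data, the critical value 1.96 from the Z distribution would give a slightly narrower interval; the t critical value is larger because s is only an estimate of σ when n is small.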
Exercise 1 A doctor samples 25 men and measures their heights. He finds that the sample has a mean of ____ cm and a sample standard deviation of 4.50 cm. What is the 95% confidence interval for the population mean μ?
solution The 95% confidence interval for the population mean is between ____ and ____.
Exercise 2 Random samples of size 9 are repeatedly drawn from a normal distribution with a mean of 65 and a standard deviation of 18. Describe the sampling distribution of the mean.
[Figure: sampling distribution of the mean centred at 65, with lower and upper limits marked]
PROBLEM
1. What is the difference between SD and SE?
2. What is the medical reference range? What is the confidence interval for the population mean?