Published by Mabel Young. Modified over 9 years ago.
1
QMS 6351 Statistics and Research Methods
Chapter 7: Sampling and Sampling Distributions
Prof. Vera Adamchik
2
Chapter 7 Outline
Simple random sampling
Point estimation
Introduction to sampling distributions
Sampling distribution of $\bar{x}$
Sampling distribution of $\bar{p}$
Other sampling methods
3
Statistical inference The purpose of statistical inference is to obtain information about a population from information contained in a sample. A population is the set of all the elements of interest in a study. A sample is a subset of the population.
4
A parameter is a numerical characteristic of a population. A sample statistic is a numerical characteristic of a sample. We will use a sample statistic in order to judge tentatively or approximately the value of the population parameter.
5
The sample results provide only estimates (that is, rough and approximate values) of the values of the population characteristics. The reason is simply that the sample contains only a portion of the population. With proper sampling methods, the sample results will provide “good” estimates of the population characteristics.
6
Simple random sampling procedure
7
Selecting a sample Sampling from a finite population. Finite populations are often defined by lists such as organization membership roster, class roster, inventory product numbers, etc. Sampling from an infinite population (a process). The population is usually considered infinite if it involves an ongoing process that makes listing or counting every element impossible. For example, parts being manufactured on a production line, customers entering a store, etc.
8
Sampling from a finite population
A simple random sample from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected. Replacing each sampled element before selecting subsequent elements is called sampling with replacement. Sampling without replacement is the procedure used most often. In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.
9
Example: St. Andrew’s College
St. Andrew’s College received 900 applications for admission in the upcoming year from prospective students. The applicants were numbered from 1 to 900 as their applications arrived. The Director of Admissions would like to select a simple random sample of 30 applicants.
10
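Sampling from a finite population using Excel
=RAND() generates a random number greater than or equal to 0 and less than 1.
=RAND()*N generates a random number greater than or equal to 0 and less than N.
=INT(RAND()*900) generates a random integer from 0 to 899. Because the applicants are numbered from 1 to 900, =INT(RAND()*900)+1 gives a random applicant number.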
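The same selection can be sketched outside Excel. The Python below is only an illustration, not the procedure used in the slides; the function names `select_applicants` and `select_with_replacement` and the fixed seed are assumptions introduced here. It contrasts sampling without replacement (no applicant can appear twice) with sampling with replacement.

```python
import random

def select_applicants(population_size=900, sample_size=30, seed=None):
    """Simple random sample of applicant numbers, drawn without replacement."""
    rng = random.Random(seed)
    # Applicants are numbered 1..population_size; sample() never repeats a number.
    return rng.sample(range(1, population_size + 1), sample_size)

def select_with_replacement(population_size=900, sample_size=30, seed=None):
    """Same draw, but an applicant may be selected more than once."""
    rng = random.Random(seed)
    return [rng.randint(1, population_size) for _ in range(sample_size)]

sample = select_applicants(seed=1)  # seed fixed only to make the run repeatable
print(sorted(sample))
```

Drawing without replacement avoids the duplicate applicant numbers that repeated use of a random-integer formula can produce.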
11
Sampling from an infinite population
In the case of infinite populations, it is impossible to obtain a list of all elements in the population. The random number selection procedure cannot be used for infinite populations.
12
Sampling from an infinite population
A simple random sample from an infinite population is a sample selected such that the following conditions are satisfied:
Each element selected comes from the same population.
Each element is selected independently.
13
Point estimation
14
In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter. A point estimate is a statistic computed from a sample that gives a single value for the population parameter. An estimator is a rule or strategy for using the data to estimate the parameter.
15
Terminology of point estimation
We refer to $\bar{x}$ as the point estimator of the population mean $\mu$.
We refer to $s$ as the point estimator of the population standard deviation $\sigma$.
16
Terminology of point estimation
We refer to $\bar{p}$ as the point estimator of the population proportion $p$.
The actual numerical value obtained for $\bar{x}$, $s$, or $\bar{p}$ in a particular sample is called the point estimate of the parameter.
17
Example: St. Andrew’s College
Recall that St. Andrew’s College received 900 applications from prospective students. The application form contains a variety of information, including the individual’s scholastic aptitude test (SAT) score and whether or not the individual desires on-campus housing. At a meeting in a few hours, the Director of Admissions would like to announce the average SAT score and the proportion of applicants that want to live on campus, for the population of 900 applicants.
18
Example: St. Andrew’s College
However, the necessary data on the applicants have not yet been entered in the college’s computerized database. So, the Director decides to estimate the values of the population parameters of interest based on sample statistics. The sample of 30 applicants selected earlier with Excel’s RAND() function will be used.
19
Point estimation using Excel
Excel Value Worksheet

Row   A: Applicant Number   C: SAT Score   D: On-Campus Housing
2     12                    1107           No
3     773                   1043           Yes
4     408                   991            Yes
5     58                    1008           No
6     116                   1127           Yes
7     185                   982            Yes
8     510                   1163           Yes
9     394                   1008           No

Note: Rows 10-31 are not shown.
20
Point estimates
Note: Different random numbers would have identified a different sample, which would have resulted in different point estimates.
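As a rough illustration of how the point estimates are computed, the sketch below uses only the eight applicants visible in the worksheet excerpt (rows 2-9). Because the other 22 sampled applicants are not shown, the values it prints will differ from the estimates based on the full sample of 30. The variable names are assumptions introduced here.

```python
from statistics import mean, stdev

# Only the eight rows visible in the worksheet excerpt (rows 2-9);
# the remaining 22 sampled applicants are not shown in the slides.
sat_scores = [1107, 1043, 991, 1008, 1127, 982, 1163, 1008]
wants_housing = ["No", "Yes", "Yes", "No", "Yes", "Yes", "Yes", "No"]

x_bar = mean(sat_scores)    # point estimate of the population mean
s = stdev(sat_scores)       # sample standard deviation (n - 1 divisor)
p_bar = wants_housing.count("Yes") / len(wants_housing)  # sample proportion

print(f"x_bar = {x_bar:.1f}, s = {s:.1f}, p_bar = {p_bar:.3f}")
```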
21
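Population parameters
Once all the data for the 900 applicants were entered in the college’s database, the values of the population parameters of interest were calculated:
Population mean SAT score: $\mu = 990$
Population standard deviation for SAT score: $\sigma = 80$
Population proportion wanting campus housing: $p = .72$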
22
Summary of point estimates obtained from a simple random sample

Population parameter                                     Parameter value   Point estimator                                          Point estimate
$\mu$ = Population mean SAT score                        990               $\bar{x}$ = Sample mean SAT score                        997
$\sigma$ = Population std. deviation for SAT score       80                $s$ = Sample std. deviation for SAT score                75.2
$p$ = Population proportion wanting campus housing       .72               $\bar{p}$ = Sample proportion wanting campus housing     .68
23
Making inferences about a population mean
24
Making inferences about a population mean
Population with mean $\mu$ = ?
A simple random sample of n elements is selected from the population.
The sample data provide a value for the sample mean $\bar{x}$.
The value of $\bar{x}$ is used to make inferences about the value of $\mu$.
25
Population vs sampling distribution
The population distribution is the probability distribution derived from the information on all elements of a population.
The probability distribution of a sample statistic (such as $\bar{x}$) is called its sampling distribution.
26
Sampling distribution of $\bar{x}$
The sampling distribution of the sample mean ($\bar{x}$) is the probability distribution of all possible values of $\bar{x}$.
We need to know:
Expected value of $\bar{x}$
Standard deviation of $\bar{x}$
Form of the sampling distribution of $\bar{x}$
27
Mean of the sampling distribution of $\bar{x}$
The mean of the sampling distribution of $\bar{x}$ is equal to the mean of the population. Thus, $E(\bar{x}) = \mu$.
28
Standard deviation of the sampling distribution of $\bar{x}$
Finite population and $n/N > 0.05$: $\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}} \left( \frac{\sigma}{\sqrt{n}} \right)$
(1) Infinite population (N is unknown); (2) Finite population and $n/N \le 0.05$: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$\sigma_{\bar{x}}$ is referred to as the standard error of the mean.
$\sqrt{(N-n)/(N-1)}$ is the finite population correction factor.
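The two cases for the standard error of the mean can be sketched as one small function. This is an illustrative sketch only; the function name `standard_error_of_mean` and the convention that `N=None` means an infinite or unknown population are assumptions introduced here. It applies the finite population correction factor $\sqrt{(N-n)/(N-1)}$ only when $n/N > 0.05$.

```python
from math import sqrt

def standard_error_of_mean(sigma, n, N=None):
    """Standard deviation of the sampling distribution of the sample mean.

    N=None treats the population as infinite (or of unknown size).
    """
    se = sigma / sqrt(n)
    # Apply the finite population correction only when n/N exceeds 0.05.
    if N is not None and n / N > 0.05:
        se *= sqrt((N - n) / (N - 1))
    return se

# St. Andrew's College: sigma = 80, n = 30, N = 900, so n/N = 0.033 <= 0.05.
print(round(standard_error_of_mean(80, 30, N=900), 1))  # 14.6
```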
29
Two important observations
1. The spread of the sampling distribution of $\bar{x}$ is smaller than the spread of the corresponding population distribution. In other words, $\sigma_{\bar{x}} < \sigma$.
2. The standard deviation of the sampling distribution of $\bar{x}$ decreases as the sample size increases.
30
Form of the sampling distribution of $\bar{x}$
1. The population has a normal distribution. If the population from which the samples are drawn is normally distributed, then the sampling distribution of the sample mean will also be normally distributed for any sample size.
31
Form of the sampling distribution of $\bar{x}$
2. The population is not normally distributed but the sample size is large ($n \ge 30$). According to the Central Limit Theorem, for a large sample size ($n \ge 30$), the sampling distribution of the sample mean is approximately normal, irrespective of the shape of the population distribution. In cases where the population is highly skewed or outliers are present, samples of size 50 or more may be needed.
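A small simulation can make the Central Limit Theorem statement concrete. The sketch below is an illustration under an assumption that is not in the slides: the population is taken to be exponential with rate 1 (mean 1, standard deviation 1), which is strongly skewed. The function name `simulate_sample_means` is also hypothetical. The mean of the simulated sample means should be close to the population mean, and their standard deviation close to $\sigma/\sqrt{n}$.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(n, repetitions, seed=0):
    """Draw repeated samples of size n from a skewed population; return their means."""
    rng = random.Random(seed)
    # Assumed population: exponential with rate 1 (mean 1, standard deviation 1).
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(repetitions)]

means = simulate_sample_means(n=30, repetitions=4000)
print("mean of sample means:", round(mean(means), 3))
print("std. dev. of sample means:", round(stdev(means), 3))
print("sigma / sqrt(n):", round(1 / sqrt(30), 3))
```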
32
Form of the sampling distribution of $\bar{x}$
3. The sample size is small (n < 30) and the population is not normally distributed. Use special statistical procedures.
33
Example: St. Andrew’s College
[Figure: Sampling distribution of $\bar{x}$ for SAT scores, with $E(\bar{x}) = 990$ and $\sigma_{\bar{x}} = 80/\sqrt{30} = 14.6$]
34
What is the probability that the sample mean will be between 980 and 1000? In other words, what is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean SAT score that is within +/-10 of the actual population mean $\mu$?
$P(980 < \bar{x} < 1000) = ?$
35
Example: St. Andrew’s College
[Figure: Sampling distribution of $\bar{x}$ for SAT scores, centered at 990; the area between 980 and 1000 is .5034]
36
Example: St. Andrew’s College
The probability of 0.5034 means that, for a large number of samples of size 30 selected from the population, we can expect that in 50.34% of all cases the sample mean will be within +/-10 of the actual population mean (that is, between 980 and 1000) and in 49.66% of all cases the sample mean will be further than +/-10 from the actual population mean (that is, below 980 or above 1000).
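The probability for the sample mean can be checked numerically. The sketch below assumes a normal sampling distribution with mean 990 and standard error $80/\sqrt{30}$; the function name `prob_mean_within` is hypothetical. Its result is close to, but not exactly, .5034, since a value read from a standard normal table typically rounds the z-score to two decimals.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_within(mu, sigma, n, margin):
    """P(mu - margin <= x_bar <= mu + margin) under a normal sampling distribution."""
    sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))
    return sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)

p30 = prob_mean_within(mu=990, sigma=80, n=30, margin=10)
print(f"P(980 < x_bar < 1000) with n = 30: {p30:.4f}")
```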
37
Relationship between the sample size and the sampling distribution of $\bar{x}$
Example: St. Andrew’s College
Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.
38
$E(\bar{x}) = \mu$ regardless of the sample size. In our example, $E(\bar{x})$ remains at 990.
Whenever the sample size is increased, the standard error of the mean is decreased. With the increase in the sample size to n = 100, the standard error of the mean is decreased from 14.6 to:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{100}} = 8$
39
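Relationship between the sample size and the sampling distribution of $\bar{x}$
Example: St. Andrew’s College
[Figure: Sampling distributions of $\bar{x}$ for the two sample sizes. With n = 30, $\sigma_{\bar{x}} = 14.6$. With n = 100, $\sigma_{\bar{x}} = 8$.]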
40
Example: St. Andrew’s College
Recall that when n = 30, $P(980 < \bar{x} < 1000) = .5034$. Now, with n = 100, $P(980 < \bar{x} < 1000) = .7888$.
Because the sampling distribution with n = 100 has a smaller standard error, the values of $\bar{x}$ have less variability and tend to be closer to the population mean than the values of $\bar{x}$ with n = 30.
41
Example: St. Andrew’s College
[Figure: Sampling distribution of $\bar{x}$ for SAT scores with n = 100, centered at 990; the area between 980 and 1000 is .7888]
42
Example: St. Andrew’s College
The probability of 0.7888 means that, for a large number of samples of size 100 selected from the population, we can expect that in 78.88% of all cases the sample mean will be within +/-10 of the actual population mean (that is, between 980 and 1000) and in 21.12% of all cases the sample mean will be further than +/-10 from the actual population mean (that is, below 980 or above 1000).
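The "large number of samples" interpretation can be illustrated by simulation. The sketch below rests on an assumption the slides do not make: SAT scores are drawn from a normal population with mean 990 and standard deviation 80. The function name `fraction_within` and the number of repetitions are also choices made here. The simulated share should land near 0.79, with some run-to-run variation.

```python
import random

def fraction_within(n, margin=10, mu=990, sigma=80, repetitions=5000, seed=0):
    """Share of simulated samples whose mean falls within +/- margin of mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        # Assumption: SAT scores come from a normal population.
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(x_bar - mu) <= margin:
            hits += 1
    return hits / repetitions

frac_100 = fraction_within(n=100)
print(f"Share of samples of size 100 with mean in [980, 1000]: {frac_100:.3f}")
```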
43
Making inferences about a population proportion
44
Making inferences about a population proportion
Population with proportion $p$ = ?
A simple random sample of n elements is selected from the population.
The sample data provide a value for the sample proportion $\bar{p}$.
The value of $\bar{p}$ is used to make inferences about the value of $p$.
45
Sampling distribution of $\bar{p}$
The sampling distribution of the sample proportion ($\bar{p}$) is the probability distribution of all possible values of $\bar{p}$.
We need to know:
Expected value of $\bar{p}$
Standard deviation of $\bar{p}$
Form of the sampling distribution of $\bar{p}$
46
Mean of the sampling distribution of $\bar{p}$
The mean of the sampling distribution of $\bar{p}$ is equal to the population proportion. Thus, $E(\bar{p}) = p$.
47
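Standard deviation of the sampling distribution of $\bar{p}$
Finite population and $n/N > 0.05$: $\sigma_{\bar{p}} = \sqrt{\frac{N-n}{N-1}} \sqrt{\frac{p(1-p)}{n}}$
(1) Infinite population (N is unknown); (2) Finite population and $n/N \le 0.05$: $\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$
$\sigma_{\bar{p}}$ is referred to as the standard error of the proportion.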
48
Form of the sampling distribution of $\bar{p}$
The sampling distribution of $\bar{p}$ can be approximated by a normal probability distribution whenever the sample size is large. The sample size is considered large whenever the following two conditions are satisfied:
$np \ge 5$
$n(1-p) \ge 5$
49
Form of the sampling distribution of $\bar{p}$
For values of p near .50, sample sizes as small as 10 permit a normal approximation. With very small (approaching 0) or very large (approaching 1) values of p, much larger samples are needed.
50
Example: St. Andrew’s College
For our example, with n = 30 and p = .72, the normal distribution is an acceptable approximation because:
np = 30(.72) = 21.6 > 5 and n(1 - p) = 30(.28) = 8.4 > 5
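The large-sample check is easy to automate. The sketch below assumes the conditions are that both np and n(1 - p) must be at least 5, in line with the comparison against 5 in the example; the function name `normal_approximation_ok` and the `threshold` parameter are hypothetical.

```python
def normal_approximation_ok(n, p, threshold=5):
    """Check the large-sample conditions np >= 5 and n(1 - p) >= 5."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approximation_ok(30, 0.72))  # True: 21.6 and 8.4 both exceed 5
print(normal_approximation_ok(30, 0.95))  # False: n(1 - p) is only about 1.5
```

The second call shows why values of p close to 0 or 1 call for much larger samples.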
51
Example: St. Andrew’s College
[Figure: Sampling distribution of $\bar{p}$, with $E(\bar{p}) = .72$ and $\sigma_{\bar{p}} = \sqrt{.72(1 - .72)/30} = .082$]
52
Example: St. Andrew’s College
Recall that 72% of the prospective students applying to St. Andrew’s College desire on-campus housing. What is the probability that a simple random sample of 30 applicants will provide an estimate of the population proportion of applicants desiring on-campus housing that is within plus or minus .05 of the actual population proportion?
$P(.67 < \bar{p} < .77) = ?$
53
Example: St. Andrew’s College
[Figure: Sampling distribution of $\bar{p}$, centered at .72; the area between .67 and .77 is .4582]
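The area of .4582 can be reproduced approximately with a short calculation. This sketch assumes the normal approximation applies and uses hypothetical function names; the finite population correction is applied only when $n/N > 0.05$, which is not the case for n = 30 and N = 900. Small differences from .4582 are expected, since a table-based value typically rounds the standard error and the z-score.

```python
from math import sqrt
from statistics import NormalDist

def standard_error_of_proportion(p, n, N=None):
    """Standard deviation of the sampling distribution of the sample proportion."""
    se = sqrt(p * (1 - p) / n)
    # Finite population correction, used only when n/N exceeds 0.05.
    if N is not None and n / N > 0.05:
        se *= sqrt((N - n) / (N - 1))
    return se

def prob_proportion_within(p, n, margin, N=None):
    """P(p - margin <= p_bar <= p + margin) under the normal approximation."""
    dist = NormalDist(mu=p, sigma=standard_error_of_proportion(p, n, N))
    return dist.cdf(p + margin) - dist.cdf(p - margin)

se = standard_error_of_proportion(0.72, 30, N=900)
prob = prob_proportion_within(0.72, 30, 0.05, N=900)
print(f"standard error = {se:.3f}, P(.67 < p_bar < .77) = {prob:.4f}")
```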
54
Other sampling methods
55
Stratified random sampling Cluster sampling Systematic sampling Convenience sampling Judgment sampling
56
Stratified random sampling The population is first divided into groups of elements called strata. Each element in the population belongs to one and only one stratum. Best results are obtained when the elements within each stratum are as much alike as possible (i.e. a homogeneous group). A simple random sample is taken from each stratum.
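A minimal sketch of stratified random sampling follows. The strata used here (applicants split into "freshman" and "transfer" groups), the per-stratum sample sizes, and the function name `stratified_sample` are all hypothetical and serve only to show the mechanics: a simple random sample is drawn separately within each stratum.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a simple random sample from every stratum.

    strata: dict mapping stratum name -> list of elements
    sizes:  dict mapping stratum name -> number of elements to draw
    """
    rng = random.Random(seed)
    return {name: rng.sample(elements, sizes[name])
            for name, elements in strata.items()}

# Hypothetical strata: each applicant number belongs to exactly one stratum.
strata = {"freshman": list(range(1, 501)), "transfer": list(range(501, 901))}
picked = stratified_sample(strata, {"freshman": 5, "transfer": 4}, seed=3)
print(picked)
```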
57
Cluster sampling The population is first divided into separate groups of elements called clusters. Ideally, each cluster is a representative small-scale version of the population (i.e. heterogeneous group). A simple random sample of the clusters is then taken. All elements within each sampled (chosen) cluster form the sample.
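Cluster sampling can be sketched in the same style. The clusters below (three small "blocks" of element IDs) and the function name `cluster_sample` are hypothetical. The random draw is made over clusters, and every element of each chosen cluster enters the sample.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # sample cluster names
    return [element for name in chosen for element in clusters[name]]

# Hypothetical clusters of element IDs.
clusters = {"block_a": [1, 2, 3], "block_b": [4, 5, 6, 7], "block_c": [8, 9]}
in_sample = cluster_sample(clusters, num_clusters=2, seed=5)
print(in_sample)
```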
58
Systematic sampling
If a sample size of n is desired from a population containing N elements, we might sample one element for every N/n elements in the population. We randomly select one of the first N/n elements from the population list. We then select every (N/n)th element that follows in the population list. For example, with N = 900 and n = 30, we would select one of the first 30 elements at random and then every 30th element after it.
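Systematic sampling can be sketched with a sampling interval of k = N/n (rounded down when N is not a multiple of n). The function name `systematic_sample` is hypothetical, and the sketch assumes n is no larger than N.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start among the first k elements, then every k-th element."""
    rng = random.Random(seed)
    k = len(population) // n         # sampling interval, N/n rounded down
    start = rng.randrange(k)         # random position within the first k elements
    return population[start::k][:n]  # trim in case N is not a multiple of n

applicants = list(range(1, 901))     # applicant numbers 1..900
chosen = systematic_sample(applicants, 30, seed=4)
print(chosen)
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the interval.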
59
Convenience sampling It is a nonprobability sampling technique. Items are included in the sample without known probabilities of being selected. The sample is identified primarily by convenience. Example: A professor conducting research might use student volunteers to constitute a sample.
60
Judgment sampling The person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population. It is a nonprobability sampling technique. Example: A reporter might sample three or four senators, judging them as reflecting the general opinion of the senate.