Published by Clemence Fowler. Modified over 9 years ago.
1
Chapter 7: Sampling and Sampling Distributions
2
Learning Objectives
- LO1: Contrast sampling with a census and differentiate among methods of sampling, which include simple, stratified, systematic, and cluster random sampling, and convenience, judgment, quota, and snowball nonrandom sampling, by assessing the advantages associated with each.
- LO2: Describe the distribution of a sample’s mean using the central limit theorem, correcting for a finite population if necessary.
- LO3: Describe the distribution of a sample’s proportion using the z formula for sample proportions.
3
Reasons for Sampling
- Sampling is used to gather useful information about a population.
- Sampling can provide information in a timely and convenient form.
- Sampling can save time and money.
- For given resources, sampling is more efficient and can broaden the scope of a study.
- When the research process requires destroying the product, sampling reduces the cost of the product destroyed.
- If accessing the entire population is impossible, sampling is the only option.
4
Reasons for Taking a Census
- When it is essential to eliminate the possibility that, by chance, a random sample is not representative of the population.
- When sampling errors could have fatal consequences, a census is required for the safety of the consumer.
5
Population Frame
- A frame is a list, map, directory, or other source used to identify and locate the members of the population, such as a school list, trade association list, telephone directory, or a list sold by list brokers.
- Ideally the frame is in one-to-one correspondence with the population, but it may differ from the population because of overregistration or underregistration.
6
Population Frame
- Overregistration: the frame contains all members of the target population plus some additional elements.
  Example: using the Bell Montreal telephone registry as a listing of residences with Bell telephones in Montreal.
- Underregistration: the frame does not contain all members of the target population.
  Example: using the chamber of commerce membership directory as the frame for a target population of all businesses.
7
Random Versus Nonrandom Sampling
- Random sampling
  - A chance mechanism is used to select units of the population.
  - Every unit of the population has the same probability of being included in the sample.
  - Eliminates bias in the selection process.
  - Also called probability sampling.
- Nonrandom sampling
  - Not every unit of the population has the same probability of being included in the sample.
  - Open to selection bias.
  - Not an appropriate data-gathering technique for most statistical methods presented in this text.
  - Also known as nonprobability sampling.
8
Basic Random Sampling Techniques
Four basic sampling techniques:
- Simple random sampling
- Systematic random sampling
- Stratified random sampling
- Cluster (or area) sampling
9
Simple Random Sample
- The most elementary sampling technique.
- The basis for developing other random sampling methods.
- Number or code each frame unit from 1 to N.
- Use a random number generator to select units. Random numbers are a sequence of numbers that lacks any pattern.
- Easier to perform for small populations.
- Cumbersome for large populations.
- Seldom used in practice.
10
Application of the Simple Random Sample Technique
- Uses a random number table or a random number generator.
- Each unit of the frame is numbered from 1 to N.
- Each unit of the frame has an equal chance of being selected for the sample.
- Use the random number table to select n distinct numbers between 1 and N, inclusive.
- Does not guarantee that the sample is representative of the population.
11
Simple Random Sampling: Random Number Generator Table
12
Simple Random Sample: Sample Members Selected
01 Acceleware Corp.
02 Apption Software
03 Auctionwire Inc.
04 Audability Inc.
05 b5media Inc.
06 Bond Consulting Group
07 Cadre Staffing Inc.
08 Direct Sales Force Inc.
09 Diversified Brands 2005 Inc.
10 Eagle Wake Ltd./ Ticket Gold
11 EFT Canada Inc.
12 Filemobile Inc.
13 Hutton Forest Products Inc.
14 KMA Contracting Inc.
15 League Assets Corp.
16 Lettuce Eatery (Freshii Inc.)
17 LOGiQ3 Inc.
18 MedicLINK Systems Ltd.
19 Mortgagebrokers.com Holdings Inc.
20 Rapido Trains Inc.
21 Pacesetter Directional and Performance Drilling Ltd.
22 PrecisionERP Inc.
23 Scalar Decisions Inc.
24 Siamons International Inc.
25 Simcoe Canada Land Development Inc.
26 Stiris Research Inc.
27 Sweetspot.ca Inc.
28 TAG Recruitment Group Inc.
29 Unity Telecom Corp.
30 Vortex Mobile (Vortxt Interactive Inc.)
Population size = N = 30
Sample size = n = 6
13
Simple Random Sample: Numbered Population Frame
Use Excel’s RANDBETWEEN function to generate a random sample of size 6.
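The same selection can be sketched outside Excel. The snippet below is a minimal illustration, not the slide's own procedure: the function name `simple_random_sample` and the seed are assumptions made for the example. It uses Python's `random.sample`, which draws without replacement, so the six frame numbers are guaranteed to be distinct. Independent draws such as repeated RANDBETWEEN(1, 30) calls can repeat a number, in which case the duplicate would have to be redrawn.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct frame numbers drawn from 1..population_size."""
    rng = random.Random(seed)  # seed is optional; fix it only to make a run repeatable
    # random.sample draws without replacement, so no frame number repeats.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# N = 30 numbered companies, n = 6 selected.
selected = simple_random_sample(population_size=30, sample_size=6, seed=7)
print(selected)
```

Each printed number is then looked up in the numbered frame to identify the company it stands for.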
14
Stratified Random Sample
- The population is divided into nonoverlapping subpopulations called strata.
- Within each stratum, units should be as homogeneous as possible; between strata, they should contrast with each other.
- A random sample is selected from each stratum.
- Potential for reducing sampling error.
- Proportionate: the percentage of the sample taken from each stratum equals the percentage of the population that the stratum represents.
- Disproportionate: the proportions of the strata within the sample differ from the proportions of the strata within the population.
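A proportionate stratified sample can be sketched as follows. This is an illustration under stated assumptions: the strata names, their sizes, and the function `proportionate_stratified_sample` are hypothetical and do not come from the slides.

```python
import random


def proportionate_stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample from each stratum, sized by its population share."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Allocate in proportion to the stratum's share of the population.
        # Rounding can make the overall size differ slightly from sample_size.
        k = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, k)
    return sample


# Hypothetical frame: 1,000 units split into three strata of 300, 500, and 200.
strata = {
    "stratum_a": list(range(0, 300)),
    "stratum_b": list(range(300, 800)),
    "stratum_c": list(range(800, 1000)),
}
stratified = proportionate_stratified_sample(strata, sample_size=100, seed=1)
print({name: len(units) for name, units in stratified.items()})
```

With these hypothetical sizes the allocation is 30, 50, and 20 units, matching the 30%, 50%, and 20% population shares. A disproportionate design would replace the allocation line with sizes chosen on some other basis.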
15
Stratified Random Sample: Population of FM Radio Listeners
16
Systematic Sampling
- Convenient and relatively easy to administer.
- Population elements are an ordered sequence (at least conceptually).
- The first sample element is selected randomly from the first k population elements.
- Thereafter, sample elements are selected at a constant interval, k, from the ordered frame.
17
Problems With Systematic Sampling
- When used with an alphabetically ordered frame, it is no better than simple random sampling and therefore does not guarantee a representative sample.
- The sample becomes nonrandom when the data are subject to periodicity, that is, when a pattern in the ordered frame repeats at an interval related to k.
18
Systematic Sampling: Example
- Frame: Scott’s National manufacturers of Canada Directory, listing N = 105,000 manufacturers in alphabetical order.
- Sample size n = 1,000, so k = N/n = 105,000/1,000 = 105.
- The first sample element is randomly selected from the first 105 manufacturers. Assume the 5th manufacturer was selected from random tables: the first element is the manufacturer coded 5.
- Subsequent sample elements are k + 5, 2k + 5, and so on: 110, 215, 320, ..., until 1,000 manufacturers are selected.
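The arithmetic of this example can be checked with a short sketch. The function name `systematic_sample` is an assumption for illustration; the numbers N = 105,000, n = 1,000, and the starting position 5 are taken from the example itself.

```python
def systematic_sample(population_size, sample_size, start):
    """Return 1-based frame positions start, start + k, start + 2k, ..."""
    k = population_size // sample_size  # sampling interval
    if not 1 <= start <= k:
        raise ValueError("start must lie within the first k elements")
    return [start + i * k for i in range(sample_size)]


positions = systematic_sample(population_size=105_000, sample_size=1_000, start=5)
print(positions[:4], len(positions), positions[-1])
# prints [5, 110, 215, 320] 1000 104900
```

In practice the starting position would itself be drawn at random, for example with `random.randint(1, k)`, rather than fixed at 5.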
19
Cluster or Area Sampling
- The population is divided into nonoverlapping clusters, or areas, that are internally heterogeneous.
- Ideally, each cluster is a miniature, or microcosm, of the population.
- A subset of the clusters is selected randomly for the sample.
- If the number of elements in the selected clusters is larger than the desired value of n, these clusters may be subdivided to form a new set of clusters, which is then subjected to a random selection process.
20
Cluster Sampling
- Advantages
  - More convenient for geographically dispersed populations
  - Reduced travel costs to contact sample elements
  - Simplified administration of the survey
  - Usable when the unavailability of a sampling frame prohibits other random sampling methods
- Disadvantages
  - Statistically less efficient when the cluster elements are similar
  - Costs and problems of statistical analysis are greater than for simple random sampling
21
Two-Stage Cluster Sampling
- In cluster sampling, the clusters are sometimes too large, and a second set of clusters is taken from each original cluster. This technique is called two-stage sampling.
- Canadian example: divide Canada into clusters of cities, then divide the cities into clusters of blocks, and randomly select individual houses from the block clusters.
- Advantages
  - Clusters are usually convenient to obtain.
  - The cost of sampling the entire population is reduced because the scope of the study is reduced.
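The two stages can be sketched in a few lines. This is a simplified illustration, assuming a hypothetical frame of 10 blocks with 40 houses each; the function `two_stage_cluster_sample` and all names in the frame are invented for the example, and the city level of the Canadian example is left out for brevity.

```python
import random


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Select clusters at random, then select units at random within each one."""
    rng = random.Random(seed)
    # Stage 1: randomly select whole clusters.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: randomly select units within each chosen cluster.
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}


# Hypothetical frame: 10 blocks, each listing 40 houses.
blocks = {f"block_{b}": [f"house_{b}_{h}" for h in range(40)] for b in range(10)}
two_stage = two_stage_cluster_sample(blocks, n_clusters=3, n_per_cluster=5, seed=42)
for block, houses in two_stage.items():
    print(block, houses)
```

Only the three selected blocks need to be visited, which is where the reduction in travel and administration cost comes from.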
22
Nonrandom Sampling
- Convenience sampling: sample elements are selected for the convenience of the researcher.
- Judgment sampling: sample elements are selected by the judgment of the researcher.
- Quota sampling: sample elements are selected until the quota controls are satisfied.
- Snowball sampling: survey subjects are selected based on referral from other survey respondents.
23
Nonsampling Errors
- Data from nonrandom samples are not appropriate for analysis by inferential statistical methods.
- Sampling error occurs when the sample is not representative of the population. All errors other than sampling errors are nonsampling errors.
- Sampling errors are unavoidable and usually not measurable. Biases may be avoidable and are usually measurable.
- Causes of nonsampling errors
  - Missing data and recording, data entry, and analysis errors
  - Poorly conceived concepts, unclear definitions, and defective questionnaires
  - Response errors, which occur when people do not know, will not say, or overstate in their answers
- Virtually no statistical method exists to control for nonsampling errors. Diligence in planning and executing the survey is required.
24
Sampling Distribution of x̄
- Proper analysis and interpretation of a sample statistic requires knowledge of its distribution.
- The sample mean, x̄, is one of the more common statistics used in inferential statistics.
- Its underlying probability function and the inferential process
25
Distribution of a Small Finite Population
Suppose a small finite population consists of only N = 8 numbers: 54, 55, 59, 63, 64, 68, 69, 70
26
Sample Space Generated by Taking Samples of Size n = 2 with Replacement
27
Excel Produced Histogram of the 64 Sample Means for n=2
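The 64 sample means behind this histogram can be enumerated directly. The sketch below assumes the N = 8 population values 54, 55, 59, 63, 64, 68, 69, 70 from the earlier slide and treats each ordered pair drawn with replacement as one sample; the variable names are choices made for the example.

```python
from collections import Counter
from itertools import product
from statistics import mean

population = [54, 55, 59, 63, 64, 68, 69, 70]

# Every ordered pair drawn with replacement: 8 * 8 = 64 samples.
samples = list(product(population, repeat=2))
sample_means = [mean(pair) for pair in samples]

print(len(sample_means))                      # number of samples
print(mean(population), mean(sample_means))   # population mean vs. mean of sample means

# Frequency of each distinct sample mean, i.e. the data behind the histogram.
for value, count in sorted(Counter(sample_means).items()):
    print(f"{value:5.1f} {'*' * count}")
```

The mean of the 64 sample means equals the population mean, and the frequencies are highest for values near the middle of the range.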
28
Histogram of a Poisson-Distributed Population, λ = 1.25
29
Histogram of Sample Means for the Data In Previous Slide
30
The Changing Shape of the Distribution of Sample Means Relative to the Sample Size n
- The previous slides illustrate that as the sample size n increases and as the number of samples increases, the histogram of sample means becomes more symmetric and smoother.
- The next set of slides demonstrates this for sampling from a population that has a uniform distribution with a = 10 and b = 30.
- Note that even for small sample sizes, the sample means begin to pile up in the middle.
- General rule: as sample sizes become much larger, the distribution of sample means approaches a normal distribution and the variation among the means decreases.
31
Means of 90 Samples (n = 2 to n = 30) from a Uniform Distribution
32
1,800 Randomly Selected Values from a Uniform Distribution
33
Means of 60 Samples (n = 2) from a Uniform Distribution
34
Means of 60 Samples (n = 5) from a Uniform Distribution
35
Means of 60 Samples (n = 30) from a Uniform Distribution
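The pattern in these three histograms can be reproduced with a small simulation. This is a sketch under stated assumptions: it draws from a uniform distribution with a = 10 and b = 30 as in the slides, but the seed, the helper name `simulate_sample_means`, and the use of Python's pseudo-random generator are choices made for the example, so the individual values will not match the slides' figures.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(n, num_samples, a=10.0, b=30.0, seed=0):
    """Return the means of num_samples samples, each of size n, from Uniform(a, b)."""
    rng = random.Random(seed)
    return [mean([rng.uniform(a, b) for _ in range(n)]) for _ in range(num_samples)]


spread = {}
for n in (2, 5, 30):
    means = simulate_sample_means(n, num_samples=60)
    spread[n] = stdev(means)
    # Standard deviation of a Uniform(10, 30) population is (30 - 10) / sqrt(12).
    expected = (30 - 10) / sqrt(12) / sqrt(n)
    print(f"n = {n:2d}: observed spread {spread[n]:.2f}, sigma / sqrt(n) = {expected:.2f}")
```

The observed spread of the 60 means shrinks as n grows and stays close to σ/√n, which is the decrease in variation the general rule describes.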
36
Central Limit Theorem
∗ Note that the central limit theorem itself does not specify what a “large sample size” is. As a guideline, it is assumed to be 30 or more, although this does not follow from the central limit theorem itself. The derivations are beyond the scope of this text and are not shown.
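For reference, a standard statement of the central limit theorem is sketched below. The notation is an assumption made here: μ and σ denote the population mean and standard deviation, and n the sample size. If samples of size n are drawn randomly from a population with mean μ and standard deviation σ, then for sufficiently large n the sample means are approximately normally distributed, whatever the shape of the population distribution, with

```latex
\[
\mu_{\bar{x}} = \mu
\qquad\text{and}\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} .
\]
```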
37
Shapes of the Distribution of Sample Means for 3 Sample Sizes and the Normal and Uniform Distributions
38
Distribution of Sample Means for 3 Sample Sizes and the U-shape and Normal Distributions
39
Sampling from a Normal Population
If the population is normally distributed, the distribution of sample means is normal for any sample size.
40
Z Formula for Sample Means
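The formula is not shown here. The standard z formula for sample means, written with the symbols μ, σ, and n used in the demonstration problems, is assumed to be:

```latex
\[
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
\]
```

The denominator σ/√n is the standard deviation of the distribution of sample means, also called the standard error of the mean.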
41
Tire Store Example in Figure 7.6
42
Graphic Solution to the Store Example
43
Demonstration Problem 7.1
For this problem, μ = 448, σ = 21, and n = 49. The problem is to determine P(441 ≤ x̄ ≤ 446). The following diagram depicts the problem.
44
Demonstration Problem 7.1
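The calculation can be sketched numerically using the values stated for this problem (μ = 448, σ = 21, n = 49) and the standard z formula for sample means. This is a check under those assumptions, not a reproduction of the slide's worked solution; it uses `statistics.NormalDist` from the Python standard library in place of a z table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 448, 21, 49
standard_error = sigma / sqrt(n)  # 21 / 7 = 3

# Convert both bounds on the sample mean to z scores.
z_low = (441 - mu) / standard_error
z_high = (446 - mu) / standard_error

# Area under the standard normal curve between the two z scores.
std_normal = NormalDist()
probability = std_normal.cdf(z_high) - std_normal.cdf(z_low)
print(f"z from {z_low:.2f} to {z_high:.2f}, P = {probability:.4f}")
```

The z scores are about −2.33 and −0.67, and the probability is roughly 0.24. A z table read at rounded z values gives a slightly different fourth decimal.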
45
Sampling from a Finite Population without Replacement
- When sampling without replacement from a finite population, the standard deviation of the distribution of sample means is smaller than when sampling from an infinite population (or from a finite population with replacement).
- The correct value of this standard deviation is computed by applying a finite correction factor to the standard deviation for sampling from an infinite population.
- If the sample size is less than 5% of the population size, the adjustment is unnecessary.
46
Sampling from a Finite Population: Finite Correction Factor and Modified Z Formula
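The two formulas named on this slide are not shown. Their standard forms are assumed to be the following, with N the population size and n the sample size:

```latex
\[
\text{finite correction factor} = \sqrt{\frac{N - n}{N - 1}},
\qquad
z = \frac{\bar{x} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}
\]
```

Because the factor is less than 1 whenever n > 1, it shrinks the standard error relative to sampling from an infinite population.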
47
Finite Correction Factor for Selected Sample Sizes
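A table of this kind can be generated with a few lines. The sketch assumes the standard factor √((N − n)/(N − 1)); the population and sample sizes below are hypothetical choices for illustration and are not the values from the slide's table.

```python
from math import sqrt


def finite_correction_factor(population_size, sample_size):
    """Return sqrt((N - n) / (N - 1)) for sampling without replacement."""
    return sqrt((population_size - sample_size) / (population_size - 1))


# Hypothetical (N, n) pairs chosen to show small and large sampling fractions.
for N, n in [(2000, 30), (2000, 500), (500, 30), (500, 200)]:
    factor = finite_correction_factor(N, n)
    print(f"N = {N:5d}, n = {n:4d}, n/N = {n / N:6.1%}, factor = {factor:.3f}")
```

When n is a small fraction of N the factor stays close to 1, which is why the adjustment is usually skipped when the sample is less than 5% of the population.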
48
Sampling Distribution of p̂
- If research or an experiment produces countable rather than measurable items, such as the frequency with which an attribute occurs, the sample proportion is often the statistic of choice.
- Example: take samples of size 3 with replacement from a group of 5 items. The total number of possible ordered samples is 5³ = 125.
- If only two attributes or countable outcomes are possible (defective and non-defective), each sample has a certain proportion of defective items, so the 125 samples yield 125 sample proportions.
- As with the means of measurable outcomes, these proportions have a distribution, with parameters that differ from those of the original population.
49
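Sampling Distribution of p̂
The sampling distribution of the sample proportion and its parameters:
- The sample proportion is p̂ = x / n, where x is the number of items in the sample that have the characteristic and n is the number of items in the sample.
- The standard deviation of the distribution is √(p ∙ q / n), where p is the population proportion and q = 1 − p.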
50
Sampling Distribution of p̂
- The sampling distribution is approximately normal if n ∙ p > 5 and n ∙ q > 5, where p is the population proportion and q = 1 − p.
- The mean of sample proportions for all samples of size n randomly drawn from a population is p (the population proportion), and the standard deviation of sample proportions is √(p ∙ q / n), which is sometimes referred to as the standard error of the proportion.
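The normality check and the standard error can be sketched as follows. The function name `proportion_sampling_distribution` and the values of p and n are hypothetical, chosen only to show one case that passes the n ∙ p > 5 and n ∙ q > 5 condition and one that fails it. The standard error is computed as √(p ∙ q / n), the usual formula for the standard error of the proportion.

```python
from math import sqrt


def proportion_sampling_distribution(p, n):
    """Summarize the sampling distribution of the sample proportion."""
    q = 1 - p
    return {
        "mean": p,
        "standard_error": sqrt(p * q / n),
        # Rule of thumb from the slide: both n*p and n*q must exceed 5.
        "approximately_normal": n * p > 5 and n * q > 5,
    }


passes = proportion_sampling_distribution(p=0.10, n=80)   # n*p = 8, n*q = 72
fails = proportion_sampling_distribution(p=0.02, n=100)   # n*p = 2
print(passes)
print(fails)
```

In the second case the normal approximation should not be used, because too few items with the characteristic are expected in each sample.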
51
Z Formula for Sample Proportions
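The formula is not shown here. The standard z formula for sample proportions is assumed to be the following, where p̂ is the sample proportion, p the population proportion, q = 1 − p, and n the sample size:

```latex
\[
z = \frac{\hat{p} - p}{\sqrt{\dfrac{p \, q}{n}}}
\]
```

It has the same structure as the z formula for sample means: the statistic minus its mean, divided by its standard error.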
52
Solution for Demonstration Problem 7.3
53
COPYRIGHT Copyright © 2014 John Wiley & Sons Canada, Ltd. All rights reserved. Reproduction or translation of this work beyond that permitted by Access Copyright (The Canadian Copyright Licensing Agency) is unlawful. Requests for further information should be addressed to the Permissions Department, John Wiley & Sons Canada, Ltd. The purchaser may make back-up copies for his or her own use only and not for distribution or resale. The author and the publisher assume no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information contained herein.