Basic Sampling Issues Chapter 11
What Is Sampling?
Sampling: a way of studying a subset of the population while still ensuring “generalizability” (vs. a census, which studies the entire population). Does the study have external validity?
Definitions of Important Terms
- Population or Universe: the entire set of elements to be studied
- Census: all elements that make up the population
- Sample: a subset of the population
Unit of Analysis: level of social life being studied (individuals or groups of individuals)
- Examples: Child; Neighborhood
Elements: individual members of the population
- Child example: Charlie, Lucy, Linus, Patty, Violet, etc.
- Neighborhood example: Midtown, Natomas, Land Park
Sampling Frame: list of all elements, or other units containing the elements, used for drawing the sample
- Child example: public school rolls; phone listings; marketing list of households with children
- Neighborhood example: list of neighborhoods; list of cities in Sacramento region
Steps in Developing a Sample Plan
Step 1: Define the Population of Interest
Step 2: Choose a Data Collection Method
Step 3: Choose a Sampling Frame
Step 4: Select a Sampling Method
Step 5: Determine the Sample Size
Boundaries, Operational, Implementability
Sampling Method
Probability samples: samples in which every element of the population has a known, nonzero probability of selection.
- Generalizable
- Sampling error can be computed
- Expensive; more time and effort needed
Nonprobability samples: samples that include elements from the population selected in a nonrandom manner.
- Hidden agendas
- Biased toward well-known members of the population
- Biased against unusual population members
Sampling and Nonsampling Errors
Parameter vs. Statistic (Estimate)
- Sample statistic: a statistic (e.g., the mean) computed from sample data
- Population parameter: the true value of that statistic (e.g., the mean) for the population (we don’t know this)
- Sampling error: population parameter minus sample statistic (we don’t know this)
- Confidence interval: an interval in which we can be confident that the true value lies, based on the sample statistic and its standard error
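The link between a sample statistic, its standard error, and a confidence interval can be sketched in a few lines. This is a minimal illustration, assuming a simple random sample and a normal approximation (z = 1.96 for roughly 95% confidence). The function name `mean_confidence_interval` and the `responses` data are hypothetical and do not come from the chapter.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (mean, standard_error, (low, high)) for a sample.

    Assumes a simple random sample and a normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    interval = (sample_mean - z * standard_error,
                sample_mean + z * standard_error)
    return sample_mean, standard_error, interval

# Hypothetical survey responses on a 1-5 scale (illustrative only)
responses = [4, 5, 3, 4, 5, 4, 2, 5, 4, 4]
sample_mean, standard_error, (low, high) = mean_confidence_interval(responses)
print(f"mean={sample_mean:.2f}, SE={standard_error:.2f}, "
      f"95% CI=({low:.2f}, {high:.2f})")
```

With a sample this small, a t-based multiplier would usually replace 1.96; the structure of the calculation stays the same.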
Advantages of Probability Samples
1. Information from a representative cross-section
2. Sampling error can be computed
3. Results are projectable to the total population
Disadvantages of Probability Samples
1. More expensive than nonprobability samples
2. Take more time to design and execute
Disadvantages of Nonprobability Samples
1. Sampling error cannot be computed
2. Representativeness of the sample is not known
3. Results cannot be projected to the population
Advantages of Nonprobability Samples
1. Cost less than probability samples
2. Can be conducted more quickly
3. Can produce samples that are reasonably representative
Classification of Sampling Methods
Probability Samples: Simple Random, Systematic, Stratified, Cluster
Nonprobability Samples: Judgment, Convenience, Snowball, Quota
Sampling and Nonsampling Errors
Sampling Error: the error that results when the sample is not perfectly representative of the population. Remember?
$\bar{X} = \mu \pm \epsilon_{ss} \pm \epsilon_{ns}$
where
$\bar{X}$ = sample mean
$\mu$ = true population mean
$\epsilon_{ss}$ = sampling error
$\epsilon_{ns}$ = nonsampling error
Sampling and Nonsampling Errors
Sampling Error: the error that results when the sample is not perfectly representative of the population.
- Administrative error: problems in the execution of the sample (can be reduced)
- Random error: due to chance and cannot be avoided, but it can be controlled by random sampling and it can be estimated
Measurement or Nonsampling Error: includes everything other than sampling error that can cause inaccuracy and bias (data entry errors, biased questions, bad analysis, etc.).
Probability Sampling Methods
Simple Random Sampling
A simple random sample is a probability sample in which every element of the population has a known and equal probability of being selected into the sample (EPSEM, the equal probability of selection method).
$\text{Probability of Selection} = \frac{\text{Sample Size}}{\text{Population Size}}$
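A simple random draw from a sampling frame can be sketched with Python's standard `random` module. The frame of 500 household labels, the seed, and the helper names `simple_random_sample` and `selection_probability` are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)

def selection_probability(sample_size, population_size):
    """Probability of Selection = Sample Size / Population Size."""
    return sample_size / population_size

# Hypothetical sampling frame of 500 households
frame = [f"household_{i}" for i in range(1, 501)]
sample = simple_random_sample(frame, 50, seed=11)

print(len(sample))                            # 50 households drawn
print(selection_probability(50, len(frame)))  # 0.1
```

Every household in the frame has the same 50 / 500 = 0.1 chance of ending up in the sample, which is what makes the design EPSEM.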
Probability Sampling Methods
Systematic Sampling
Probability sampling in which the entire population is numbered, and elements are drawn using a skip interval.
$\text{Skip Interval} = \frac{\text{Population Size}}{\text{Sample Size}}$
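The skip-interval idea can be sketched as follows. The random starting point within the first interval is a common convention that the slide does not state, so treat it as an assumption; the function name `systematic_sample` and the numbered frame are also hypothetical. The skip interval is rounded down when the population size is not an exact multiple of the sample size.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th element of a numbered frame, where k is the skip interval."""
    population_size = len(frame)
    if n <= 0 or n > population_size:
        raise ValueError("sample size must be between 1 and the population size")
    skip = population_size // n      # Skip Interval = Population Size / Sample Size
    rng = random.Random(seed)
    start = rng.randrange(skip)      # assumed: random start within the first interval
    return [frame[start + i * skip] for i in range(n)]

# Hypothetical numbered population of 1,000 elements
frame = list(range(1, 1001))
sample = systematic_sample(frame, 100, seed=11)
print(sample[:5])  # consecutive picks are 10 apart
```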
Probability Sampling Methods
Stratified Samples: probability samples that select elements from relevant population subsets to be more representative.
Cluster Samples: probability samples of geographic areas.
Stratified Samples
Probability samples that select elements from relevant population subsets to be more representative.
Three steps in implementing a properly stratified sample:
1. Identify salient demographic or classification factors correlated with the behavior of interest.
2. Determine what proportions of the population fall into the various subgroups under each stratum.
   - proportional allocation
   - disproportional or optimal allocation
3. Select separate simple random samples from each stratum.
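The three steps above can be sketched for the proportional allocation case. This is only an illustration: the owner/renter frame, the `stratum_of` classifier, and the function name are hypothetical, and rounding each stratum's share can make the total differ slightly from the requested sample size.

```python
import random
from collections import defaultdict

def proportional_stratified_sample(frame, stratum_of, n, seed=None):
    """Draw a separate simple random sample from each stratum,
    sized in proportion to the stratum's share of the population."""
    rng = random.Random(seed)

    # Steps 1-2: classify every element and find each stratum's size
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)

    # Step 3: separate simple random sample within each stratum
    population_size = len(frame)
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / population_size)  # proportional allocation
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical frame: 400 owners and 600 renters
frame = [("owner", i) for i in range(400)] + [("renter", i) for i in range(600)]
sample = proportional_stratified_sample(frame, lambda e: e[0], n=50, seed=11)
print({name: len(members) for name, members in sample.items()})
```

Disproportional or optimal allocation would replace the line that computes `k` with stratum-specific sample sizes chosen by the researcher.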
Cluster Samples
Sampling units are selected in groups.
1. The population of interest is divided into mutually exclusive and exhaustive subsets.
2. A random sample of the subsets is selected.
   - One-stage cluster: all elements in the selected subsets are used
   - Two-stage cluster: elements are selected in some probabilistic manner from the selected subsets
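One-stage and two-stage cluster sampling differ only in what happens inside the selected subsets, which a short sketch can make concrete. The district and household labels and the function name `cluster_sample` are hypothetical, and the second stage here uses a simple random sample, which is just one of the probabilistic options.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """clusters maps a cluster name to the list of its elements.

    per_cluster=None -> one-stage: keep every element of each chosen cluster.
    per_cluster=k    -> two-stage: simple random sample of k elements
                        within each chosen cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random sample of subsets
    result = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            result[name] = list(members)
        else:
            result[name] = rng.sample(members, min(per_cluster, len(members)))
    return result

# Hypothetical city: 10 districts with 20 households each
clusters = {
    f"district_{d}": [f"d{d}_household_{h}" for h in range(1, 21)]
    for d in range(1, 11)
}
one_stage = cluster_sample(clusters, 3, seed=11)
two_stage = cluster_sample(clusters, 3, per_cluster=5, seed=11)
print({name: len(members) for name, members in one_stage.items()})
print({name: len(members) for name, members in two_stage.items()})
```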
Stratified
Example: 1. Divide city into districts (strata). 2. Draw random sample of households from each district.
Reason for use: To ensure desired number of households in each district.
Cluster
Example: 1. Divide city into districts (clusters). 2. Draw random sample of districts. 3. Draw random sample of households from each selected district.
Reason for use: To make it easier to do door-to-door surveys.
Handout 1: Baseball Example
1. Ramon Aviles
2. Larry Bowa
3. Pete Rose
4. Mike Schmidt
5. Manny Trillo
6. John Yukovich 0.161
Mean = (sum of the six averages) / 6 = 0.261
SRS of sample size = 2
Columns: Sample, Mean, Error
Aviles, Bowa
Aviles, Rose
Aviles, Schmidt
Aviles, Trillo
Aviles, Yukovich
Bowa, Rose
Bowa, Schmidt
SRS of sample size = 2 (continued)
Bowa, Trillo
Bowa, Yukovich
Rose, Schmidt
Rose, Trillo
Rose, Yukovich
Schmidt, Trillo
Schmidt, Yukovich
Trillo, Yukovich
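The full table of 15 possible samples of size 2 can be generated rather than filled in by hand. In this sketch only Yukovich's 0.161 and Schmidt's 0.286 are taken from the handout; the other four averages are assumed placeholders, chosen so the six-player mean rounds to the handout's 0.261, and should be replaced with the real handout values. Error is computed as population mean minus sample mean.

```python
from itertools import combinations
from statistics import mean

# Yukovich and Schmidt come from the handout; the other four values
# are ASSUMED placeholders for illustration only.
averages = {
    "Aviles": 0.277,    # assumed placeholder
    "Bowa": 0.267,      # assumed placeholder
    "Rose": 0.282,      # assumed placeholder
    "Schmidt": 0.286,   # from the handout
    "Trillo": 0.292,    # assumed placeholder
    "Yukovich": 0.161,  # from the handout
}

def srs_table(values, size=2):
    """List every possible sample of the given size with its mean and error."""
    population_mean = mean(values.values())
    rows = []
    for names in combinations(values, size):
        sample_mean = mean(values[name] for name in names)
        # error = population parameter - sample statistic
        rows.append((names, sample_mean, population_mean - sample_mean))
    return rows

rows = srs_table(averages)
for names, sample_mean, error in rows:
    print(f"{', '.join(names):<20} {sample_mean:.3f} {error:+.3f}")
```

Across all 15 equally likely samples the errors cancel out, which is the sense in which the sample mean is an unbiased estimate, even though any single sample can miss by a wide margin.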
Stratification
Let’s divide the population into two strata: one with Yukovich and another with all the others.
Stratum 1: Yukovich
Stratum 2: Aviles, Bowa, Rose, Trillo, Schmidt
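Stratified Sampling
1. Yukovich, Aviles
2. Yukovich, Bowa
3. Yukovich, Rose
4. Yukovich, Schmidt
5. Yukovich, Trillo
Weight the sample. Why? Each Stratum 2 player in the sample stands in for all 5 players in that stratum, while Yukovich represents only himself.
For anyone from Stratum 2, multiply their value by 5.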
Example: Mean computation
Sample: Yukovich, Schmidt
Yukovich = 0.161
Schmidt = 0.286
Therefore, Schmidt’s weighted value is (0.286 * 5), which is 1.43.
Yukovich + Schmidt = 0.161 + 1.43 = 1.591
Mean (Yukovich + Schmidt) = 1.591 / 6 = 0.265
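The weighting arithmetic generalizes to any number of strata: multiply each stratum's sample mean by the stratum's population size, add, and divide by the total population size. The sketch below uses only the two values given in the example (0.161 and 0.286); the function name `stratified_estimate` and the stratum labels are hypothetical.

```python
from statistics import mean

def stratified_estimate(sampled, stratum_sizes):
    """Weighted estimate of the population mean.

    sampled:       stratum name -> list of sampled values from that stratum
    stratum_sizes: stratum name -> number of population elements in the stratum
    """
    weighted_total = sum(
        mean(values) * stratum_sizes[name] for name, values in sampled.items()
    )
    return weighted_total / sum(stratum_sizes.values())

estimate = stratified_estimate(
    sampled={"stratum_1": [0.161], "stratum_2": [0.286]},  # Yukovich, Schmidt
    stratum_sizes={"stratum_1": 1, "stratum_2": 5},
)
print(round(estimate, 3))  # 0.265
```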
Stratified Sampling
Columns: Sample, Mean, Error
1. Yukovich, Aviles
2. Yukovich, Bowa
3. Yukovich, Rose
4. Yukovich, Schmidt
5. Yukovich, Trillo
What’s happening to errors of estimate?
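One way to explore the question is to compute the error of each of the five stratified samples and compare the worst case with that of simple random samples of size 2. As before, only 0.161 (Yukovich) and 0.286 (Schmidt) come from the handout; the remaining averages are assumed placeholders, so the exact figures are illustrative.

```python
from itertools import combinations
from statistics import mean

stratum_1 = {"Yukovich": 0.161}  # from the handout
stratum_2 = {
    "Aviles": 0.277,   # assumed placeholder
    "Bowa": 0.267,     # assumed placeholder
    "Rose": 0.282,     # assumed placeholder
    "Schmidt": 0.286,  # from the handout
    "Trillo": 0.292,   # assumed placeholder
}
population = {**stratum_1, **stratum_2}
population_mean = mean(population.values())

def stratified_estimate(player):
    """Estimate from Yukovich plus one Stratum 2 player, weighted by stratum size."""
    total = (stratum_1["Yukovich"] * len(stratum_1)
             + stratum_2[player] * len(stratum_2))
    return total / len(population)

# Error = population mean - estimate, for each of the 5 stratified samples
stratified_errors = {
    player: population_mean - stratified_estimate(player) for player in stratum_2
}
# Same error definition for all 15 simple random samples of size 2
srs_errors = [
    population_mean - mean(pair) for pair in combinations(population.values(), 2)
]

worst_stratified = max(abs(e) for e in stratified_errors.values())
worst_srs = max(abs(e) for e in srs_errors)
print(f"worst stratified error: {worst_stratified:.3f}")
print(f"worst SRS error:        {worst_srs:.3f}")
```

Under these placeholder values the stratified estimates stay much closer to the population mean, because the unusual element (Yukovich) is always included and given its proper weight instead of being left to chance.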
Nonprobability Sampling Methods
Convenience Samples: nonprobability samples used primarily because they are easy to collect.
- Theory testing
Judgment Samples: nonprobability samples in which the selection criteria are based on personal judgment that the element is representative of the population under study.
Nonprobability Sampling Methods
Snowball Samples: nonprobability samples in which selection of additional respondents is based on referrals from the initial respondents.
Quota Samples: nonprobability samples in which population subgroups are classified on the basis of researcher judgment.
- Different from stratified samples: elements within each subgroup are not selected randomly.