1 Systematic Sampling (SYS) Up to now, we have only considered one design: SRS of size n from a population of size N New design: SYS DEFN: A 1-in-k systematic sample is a sample obtained by randomly selecting one sampling unit from the first k sampling units in the sampling frame, and every k-th sampling unit thereafter
2 Sampling procedure for SYS Have a frame, or list of N SUs Assume SU = OU for now Determine sampling interval, k k is the next integer after N/n Select first SU in the list Choose a random number, R, between 1 & k R-th SU is the first SU to be included in the sample Select every k-th SU after the R-th SU Sample includes unit R, unit R + k, unit R + 2k, …, unit R + (n-1)k
3 Example Telephone survey of members in an organization abut organization’s website use N = 500 members Have resources to do n = 75 calls N / n = 500/75 = 6.67 k = 7 Random number table entry: Rule: if pick 1, 2, …, 7, assign as R; otherwise discard # Select R = 5 Take SU 5, then SU 5+7 =12, then SU 12+7 =19, 26, 33, 40, 47, …
4 Example – 2 Arrange population in rows of length k = 7 R i ……
5 Properties of systematic sampling – 1 Number of possible SYS samples of size n is k Only 1 random act - selecting R After select 1 st SU, all other SUs to be included in the sample are predetermined A SYS is a cluster with sample size 1 Cluster = set of SUs separated by k units Unlike SRS, some sample sets of size n have no chance of being selected given a frame A SU belongs to 1 and only 1 sample
6 Properties of systematic sampling – 2 Number of possible SYS samples of size n is k Are these samples equally likely to be selected? Probability of selecting a sample P{S } = 1/k Inclusion probability for a SU P{ SU i S } = 1/k
7 Properties of systematic sampling – 3 Plan for sample size of n, but actual sample size may vary If N / k is an integer, then n = N / k If N / k is NOT an integer, then n is either the integer part of (N / k ) or the integer part of (N / k ) + 1
8 Properties of systematic sampling – 4 Because only the starting SU of a SYS sample is randomized, a direct estimate of the variance of the sampling distribution can not be estimated Under SRS, variance of the sampling distribution was a function of the population variance, S 2 Have no such relationship for SYS
9 Estimation for SYS Use SRS formulas to estimate population parameters and variance of estimator
10 Properties of systematic sampling – 5 Properties of SRS estimators depends on frame ordering SRS estimators for population parameters usually have little or no bias under SYS Precision of SRS estimators under SYS depends on ordering of sample frame
11 Order of sampling frame Random order SYS acts very much like SRS SRS variance formula is good approximation Ordered in relation to y Improves representativeness of sample SRS formula overestimates sampling variance (estimate is more precise than indicated by SE) Periodicity in y = sampling interval k Poor quality estimates SRS formula underestimates sampling variance (overstate precision of estimate)
12 Example – 3 Suppose X [age of member] is correlated with Y [use of org website] Sort list by X before selecting sample k Xi young …mid… old72
13 Practicalities Another building block (like SRS) used in combination with other designs SYS is more likely to be used than SRS if there is no stratification or clustering Useful when a full frame cannot be enumerated at beginning of study Exit polls for elections Entrance polls for parks
14 Practicalities – 2 Best if you can sort the sampling frame by an auxiliary variable X that is related to Y Improve representativeness of sample (relative to SRS) Improve precision of estimates Essentially offers implicit form of stratification