LECTURE 3 SAMPLING THEORY EPSY 640 Texas A&M University
POPULATIONS A finite population consists of the actual group of objects or persons, which we know is potentially countable and finite. An infinite population is a mathematical abstraction that is useful because the properties of the population are assumed or defined carefully.
POPULATIONS Parameter = characteristic of the population. If a sample is drawn and the characteristic computed, it will be a statistic for the sample.
POPULATIONS Accessible vs. Target Populations. Target Population: the population we wish to represent. We often cannot sample from it directly. Instead, we might be able to draw from all public school grade 3 students in class during a particular week in the school year. This is our Accessible Population: the population we have access to.
Sampling Methods
RANDOM SAMPLING
–SIMPLE
–STRATIFIED
–MULTISTAGE
–CLUSTER
SYSTEMATIC SAMPLING
CONVENIENCE (NONRANDOM) SAMPLING
RANDOM SAMPLING A sample is random if every member of the population has an equal chance of being selected. This involves being able to define and count the population. We can then use a process called randomization to select the sample.
Table of Random Numbers [Table: columns Location and RN (random number); entries not recoverable] In selecting a sample of 20 students from a list of 75, a random start point was selected in the table. The ad hoc rule was to go down the column to the bottom and up the next. Thus, children with identifiers 75, 1, 13, 26, 69, 22, 46, 29, 55, 34, 59, 59, and 17 have been selected within this section of the random number table. The location value allows checking and replication of a random sample selection process.
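The same selection can be sketched in software instead of a printed table. This is a minimal illustration, not the procedure used in the lecture: the function name, the identifiers 1 to 75, and the seed value are assumptions. The recorded seed plays the role of the table's location value, so the draw can be checked and replicated.

```python
import random

def draw_simple_random_sample(population_ids, n, seed):
    """Draw n distinct identifiers, each with an equal chance of selection."""
    rng = random.Random(seed)              # seeded generator makes the draw replicable
    return rng.sample(population_ids, n)   # sampling without replacement

# Hypothetical list of 75 students identified as 1..75
student_ids = list(range(1, 76))
sample = draw_simple_random_sample(student_ids, 20, seed=640)
print(sorted(sample))
```

Rerunning with the same seed reproduces the same 20 students, which is the software counterpart of recording the start location in the table.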
finite population correction $\mathrm{fpc} = 1 - n/N = 1 - f$, where $n$ = number in the sample, $N$ = number in the population, and $f = n/N$ is the sampling fraction.
Finding survey sample size
$$n = \frac{(z\sigma/d)^2}{1 + (1/N)(z\sigma/d)^2}$$
$z$ = z-score for the probability required for the confidence interval (usually 1.96 for .05 or 2.58 for .01)
$\sigma$ = SD of distribution (can be 1.0 for arbitrary units)
$d$ = desired degree of error in SD units
$N$ = number in population
Finding survey sample size - example
Alpha = .05, $N$ = 1,000,000, $d$ = .1, $\sigma$ = 1
$$n = \frac{(1.96/.1)^2}{1 + (1/1{,}000{,}000)(1.96/.1)^2} = \frac{384.16}{1.00038416} \approx 384$$
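The sample size formula can be checked numerically. The sketch below assumes the formula with the SD term included, $n = (z\sigma/d)^2 / [1 + (1/N)(z\sigma/d)^2]$; the function and argument names are illustrative only.

```python
def survey_sample_size(z, sigma, d, N):
    """Sample size for a finite population of size N.

    z: z-score for the confidence level, sigma: SD of the distribution,
    d: desired degree of error (same units as sigma).
    """
    base = (z * sigma / d) ** 2      # size needed for an infinite population
    return base / (1 + base / N)     # shrink it for a finite population

n_required = survey_sample_size(z=1.96, sigma=1.0, d=0.1, N=1_000_000)
print(round(n_required, 2))
```

For a small population the denominator matters much more: with the same z, sigma, and d, a population of 500 needs far fewer than 384 cases.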
Table 4.2: Sample sizes required for various population sizes for 95% and 99% confidence intervals [Table: columns Population Size and Sample Size Required for d = .1, for alpha = .05 and for alpha = .01; entries not recoverable]
Mean and standard deviation for simple random sampling
$E(\bar{x}_{\cdot}) = \mu$ (sample mean estimates population mean unbiasedly)
$V(\bar{x}_{\cdot}) = (1/n)\, s^2 (1 - f)$ (variance must be corrected)
$s_{\bar{x}_{\cdot}} = \sqrt{V(\bar{x}_{\cdot})}$ = standard error of the mean = $s_m$
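A short sketch of the standard error calculation with the finite population correction. It assumes that $s^2$ is the usual sample variance with an $n - 1$ denominator (the slide does not say); the data values and the population size of 80 are made up for illustration.

```python
import math
import statistics

def standard_error_srs(sample, N):
    """Standard error of the mean for a simple random sample from N cases."""
    n = len(sample)
    s2 = statistics.variance(sample)   # assumption: n - 1 denominator
    f = n / N                          # sampling fraction
    return math.sqrt((s2 / n) * (1 - f))

# Hypothetical scores for 8 cases drawn from a population of 80
scores = [2, 4, 4, 4, 5, 5, 7, 9]
se_corrected = standard_error_srs(scores, N=80)
se_uncorrected = math.sqrt(statistics.variance(scores) / len(scores))
print(se_corrected, se_uncorrected)
```

The corrected value is always the smaller of the two, and the gap grows as the sample takes up a larger share of the population.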
[Figure: Original Data Distribution and Distribution of Means, with the axis marked at $-1.96 s_m$, $-s_m$, $s_m$, and $1.96 s_m$, and the mean from a particular sample indicated.]
Confidence interval
Mean $\pm\, z\, s_{\bar{x}_{\cdot}}$
$z$ = number of SDs of the normal distribution for the chosen confidence level (alpha is usually .01 or .05)
for real data: $\bar{x}_{\cdot} \pm 1.96\, s_{\bar{x}_{\cdot}}$ gives a 95% confidence interval around the mean:
–Interpretation: in 95 of 100 times we do the study, the population mean will be in the interval we construct.
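As a small worked example of the interval, the sketch below uses made-up values (a sample mean of 50 and a standard error of 2); the function name is illustrative.

```python
def confidence_interval(mean, standard_error, z=1.96):
    """Return (lower, upper) limits: mean plus or minus z standard errors."""
    half_width = z * standard_error
    return mean - half_width, mean + half_width

# Hypothetical sample mean and standard error
lower, upper = confidence_interval(mean=50.0, standard_error=2.0)
print(lower, upper)
```

Using a larger z (for alpha = .01) widens the interval around the same mean.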
[Figure: Distribution of Means with the axis marked at $-1.96 s_m$, $-s_m$, $s_m$, and $1.96 s_m$, the mean from a particular sample, and its confidence interval.] Interpretation: in any one study the population mean is either IN or OUT of the confidence interval; for 100 intervals, it should be IN 95 times on average.
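The IN or OUT interpretation can be illustrated by simulation. This sketch assumes an infinite normal population (so no finite population correction) with a made-up mean of 100 and SD of 15; it repeats the study many times and counts how often the interval contains the population mean.

```python
import math
import random
import statistics

def coverage_rate(n_studies, n, mu, sigma, seed, z=1.96):
    """Proportion of simulated studies whose interval contains mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
        if mean - z * se <= mu <= mean + z * se:       # population mean is IN
            hits += 1
    return hits / n_studies

rate = coverage_rate(n_studies=2000, n=50, mu=100.0, sigma=15.0, seed=3)
print(rate)
```

The proportion should land close to .95. Any single simulated interval either contains the population mean or it does not.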
Stratified random sampling The population is divided into subpopulations, called strata. We then use simple random sampling for each stratum. We can decide to sample proportionately or disproportionately.
Stratified random sampling
Proportionate sampling: percentage in sample is the same as in the population.
Disproportionate sampling: percentage in sample is different from that of the population.
Example: Males and Females (50% each in population).
–Proportional: 50 males, 50 females
–Disproportional: 75 males, 25 females
Stratified random sampling
Example: Ethnicity of students in District: 80% Anglo, 10% Hispanic, 5% African American, 5% Native American
Proportional for 200 student sample:
–160 Anglo, 20 Hispanic, 10 African American, 10 Native American
–May give poor estimates for Hispanic, African American, and Native American samples
Disproportional:
–50 Anglo, 50 Hispanic, 50 African American, 50 Native American
–Will give estimates with similar confidence intervals for all groups
–May need fpc for some groups
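The two allocations in the district example can be sketched as follows. The function names are illustrative, and the disproportional rule shown here is the equal split used in the example, which is only one of many possible disproportional schemes. With other percentages, rounding can make the proportional counts sum to slightly more or less than the intended total.

```python
def proportional_allocation(percentages, n):
    """Allocate n cases so each stratum's share matches its population share."""
    return {group: round(n * pct / 100) for group, pct in percentages.items()}

def equal_allocation(percentages, n):
    """Allocate the same number of cases to every stratum."""
    per_stratum = n // len(percentages)
    return {group: per_stratum for group in percentages}

district = {"Anglo": 80, "Hispanic": 10, "African American": 5, "Native American": 5}
proportional = proportional_allocation(district, 200)
disproportional = equal_allocation(district, 200)
print(proportional)
print(disproportional)
```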
Mean for stratified random sample.
$$\bar{x}_{\cdot\cdot\,\mathrm{est}} = \Big(\sum_{i=1}^{s} N_i\, \bar{x}_{i\cdot}\Big) / N$$
where $N_i$ = number of cases in population stratum $i$, $N$ = total number of cases in the entire population, and $s$ = number of strata.
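Mean for stratified random sample - example
3 strata, $N_1 = 1000$, $N_2 = 2000$, $N_3 = 3000$; $\bar{x}_{1\cdot} = 70$, $\bar{x}_{2\cdot} = 80$, $\bar{x}_{3\cdot} = 90$
$$\bar{x}_{\cdot\cdot\,\mathrm{est}} = \Big(\sum_{i=1}^{s} N_i\, \bar{x}_{i\cdot}\Big) / N = [(1000 \times 70) + (2000 \times 80) + (3000 \times 90)] / 6000 = 83.33$$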
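The weighted mean in this example can be reproduced with a few lines of code. The function name is illustrative; the stratum sizes and means are the ones given in the example.

```python
def stratified_mean(stratum_sizes, stratum_means):
    """Population-size-weighted mean of the stratum sample means."""
    N = sum(stratum_sizes)
    weighted_total = sum(N_i * mean_i for N_i, mean_i in zip(stratum_sizes, stratum_means))
    return weighted_total / N

estimate = stratified_mean([1000, 2000, 3000], [70, 80, 90])
print(round(estimate, 2))
```

Note that the weights are the population stratum sizes, not the sample sizes, so the estimate is the same whether the strata were sampled proportionately or disproportionately.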
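SD for stratified random sample.
$$V(\bar{x}_{\cdot\cdot\,\mathrm{est}}) = \sum_{i=1}^{s} N_i^2\, s^2_{\bar{x}_{i\cdot}} / N^2$$
$$s_{\bar{x}_{\cdot\cdot\,\mathrm{est}}} = \sqrt{V(\bar{x}_{\cdot\cdot\,\mathrm{est}})}$$
where $s^2_{\bar{x}_{i\cdot}} = V(\bar{x}_{i\cdot})$, the variance error of the mean using the simple random sample formula.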
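SD for stratified random sample.
Table 4.3: Calculation of stratified sample mean and variance error of the mean
SUBPOPULATION   N_i   n_i   mean   s_i   s_m
A                77    50    …      5    .419
B               229    50    …      6    .751
C               738    50    12     7    .956
$\bar{x}_{\cdot\cdot\,\mathrm{est}} = (77 \times \ldots + 229 \times \ldots + 738 \times 12)/1044 = \ldots$
$V(\bar{x}_{\cdot\cdot\,\mathrm{est}}) = (77^2 \times (.419)^2 + 229^2 \times (.751)^2 + 738^2 \times (.956)^2) / 1044^2 = .485$
$s(\bar{x}_{\cdot\cdot\,\mathrm{est}}) = .696$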
SD for stratified random sample.
Variance error of the mean for each subpopulation: $s_m^2 = (1/n_i)\, s_i^2\, (1 - f_i)$
A: $s_m^2 = (1/50)\, 5^2\, (1 - 50/77)$
B: $s_m^2 = (1/50)\, 6^2\, (1 - 50/229)$
C: $s_m^2 = (1/50)\, 7^2\, (1 - 50/738)$
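The Table 4.3 calculation can be checked with a short script. It uses the stratum sizes (77, 229, 738), the sample size of 50 per stratum, and the SDs (5, 6, 7) from the formulas above; the function names and the tuple layout are illustrative choices.

```python
import math

def srs_variance_of_mean(s, n, N):
    """Variance error of the mean for one stratum: (1/n) s^2 (1 - f)."""
    return (s ** 2 / n) * (1 - n / N)

def stratified_variance_of_mean(strata):
    """strata: list of (N_i, n_i, s_i) tuples, one per stratum."""
    N = sum(N_i for N_i, _, _ in strata)
    total = sum(N_i ** 2 * srs_variance_of_mean(s_i, n_i, N_i)
                for N_i, n_i, s_i in strata)
    return total / N ** 2

strata = [(77, 50, 5), (229, 50, 6), (738, 50, 7)]   # subpopulations A, B, C
variance = stratified_variance_of_mean(strata)
standard_error = math.sqrt(variance)
print(round(variance, 3), round(standard_error, 3))
```

Stratum A has the largest correction because 50 of its 77 cases are sampled, while stratum C dominates the total because of its large $N_i^2$ weight.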