Collecting Samples
Chapter 2.3 – In Search of Good Data
Mathematics of Data Management (Nelson)
MDM 4U
Why Sampling?
Sampling is done because a census is too expensive or time-consuming.
The challenge is being confident that the sample represents the population accurately.
Convenience sampling occurs when you simply take data from the most convenient place (for example, collecting data by walking around the hallways at school).
Convenience sampling is not representative.
Random Sampling
Representative samples involve random sampling.
Random events are events that are considered to occur by chance.
Random numbers are numbers that occur without pattern.
Random numbers can be generated using a calculator, computer, or random number table.
Random choice is used to select members of a population without introducing bias.
1) Simple Random Sampling
This method requires that all selections be equally likely and that all combinations of selections be equally likely.
The sample is likely to be representative of the population; if it isn’t, this is due to chance.
Example: put the entire population’s names in a hat and draw names at random.
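The "names in a hat" idea can be sketched in Python. This is a minimal illustration, not a prescribed method: the population of 800 numbered students, the sample size of 200, and the function name `simple_random_sample` are all assumptions made for the example.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size members without replacement.

    random.sample makes every possible group of that size equally likely,
    which is the requirement for a simple random sample.
    """
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    return rng.sample(population, sample_size)

# Hypothetical population: students numbered 1 to 800.
students = list(range(1, 801))
chosen = simple_random_sample(students, 200, seed=1)
print(len(chosen), "students selected")
```

Because the draw is made without replacement, no student can be selected twice.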
2) Systematic Random Sampling
You decide to sample a fixed percent of the population, using some random starting point, and you select every nth individual.
Here n is the sampling interval (population size ÷ sample size).
Example: you decide to sample 10% of 800 people, so the sample size is 80. n = 800 ÷ 80 = 10, so generate a random number between 1 and 10, start at this number, and sample every 10th person after that.
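A short Python sketch of the same procedure, using the 10% of 800 example. The function name `systematic_sample` and the numbered list of people are hypothetical; the sketch assumes the population is already in a fixed order.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start within the first interval, then take every nth member."""
    rng = random.Random(seed)
    interval = len(population) // sample_size  # sampling interval n
    start = rng.randint(1, interval)           # random start from 1 to n, inclusive
    # Positions are counted from 1, so subtract 1 to index the list.
    picks = [population[pos - 1] for pos in range(start, len(population) + 1, interval)]
    # If the population size is not an exact multiple of the sample size,
    # one extra member can appear; trim to the requested size.
    return picks[:sample_size]

# Hypothetical population: people numbered 1 to 800, sampling 10% (80 people).
people = list(range(1, 801))
sample = systematic_sample(people, 80, seed=3)
print(sample[:5])
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the interval.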
3) Stratified Random Sampling
The population is divided into groups called strata (which could be MSIPs or grades).
A simple random sample is taken from each stratum, with the size of each sample determined by the size of the stratum.
Example: sample CPHS students by MSIP, with samples randomly drawn from each MSIP (the number drawn is relative to the size of the MSIP).
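The sketch below shows one way to size each stratum's sample in proportion to the stratum. The grade sizes, the student labels, and the function name `stratified_sample` are assumptions made for illustration only.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Take a simple random sample from each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional share of the overall sample. Rounding means the pieces
        # may not always add up to exactly sample_size.
        k = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical school of 800 students split by grade (sizes are made up).
grade_sizes = {"Grade 9": 220, "Grade 10": 180, "Grade 11": 200, "Grade 12": 200}
strata = {
    grade: [f"{grade} student {i}" for i in range(1, size + 1)]
    for grade, size in grade_sizes.items()
}
by_grade = stratified_sample(strata, 200, seed=2)
for grade, members in by_grade.items():
    print(grade, len(members))
```

With these made-up sizes, the 180 students in Grade 10 are 22.5% of the school, so they contribute 22.5% of the 200 interviews.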
4) Cluster Random Sampling
The population is organized into groups (like MSIPs or schools).
Groups are randomly chosen for sampling, and then all members of the chosen groups are surveyed.
Example: student attitudes could be measured by randomly choosing schools from across Ontario and then surveying all students in those schools.
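As a rough sketch, cluster sampling randomizes the choice of groups, not of individuals. The 32 rooms of 25 students and the function name `cluster_sample` are hypothetical values chosen for the example.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole groups, then survey every member of each chosen group."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # pick group names at random
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical school: 32 MSIP rooms with 25 students each (800 students).
clusters = {
    f"MSIP {room}": [f"MSIP {room} student {s}" for s in range(1, 26)]
    for room in range(1, 33)
}
surveyed = cluster_sample(clusters, 8, seed=7)
print(sorted(surveyed))
```

Every student in a chosen room is surveyed, so the sample size is fixed by the number and size of the groups selected.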
5) Multistage Random Sampling
Groups are randomly chosen from a population, subgroups are randomly chosen from these groups, and then individuals in these subgroups are randomly chosen to be surveyed.
Example: to understand student attitudes, a school might randomly choose one period, randomly choose MSIPs during that period, and then randomly choose students from within those MSIPs.
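The three stages can be sketched as nested random choices. The layout below (5 periods, 6 rooms per period, 25 students per room), the numbers chosen at each stage, and the function name `multistage_sample` are assumptions for illustration.

```python
import random

def multistage_sample(groups, n_groups, n_subgroups, n_individuals, seed=None):
    """Randomly choose groups, then subgroups within them, then individuals within those."""
    rng = random.Random(seed)
    sample = []
    for group in rng.sample(sorted(groups), n_groups):             # stage 1: groups
        subgroups = groups[group]
        for sub in rng.sample(sorted(subgroups), n_subgroups):     # stage 2: subgroups
            sample.extend(rng.sample(subgroups[sub], n_individuals))  # stage 3: individuals
    return sample

# Hypothetical school: periods 1 to 5, six MSIP rooms per period, 25 students per room.
school = {
    period: {
        f"P{period}-Room{r}": [f"P{period}-Room{r}-S{s}" for s in range(1, 26)]
        for r in range(1, 7)
    }
    for period in range(1, 6)
}
# One period, then 3 rooms in that period, then 10 students from each room.
picked = multistage_sample(school, n_groups=1, n_subgroups=3, n_individuals=10, seed=4)
print(len(picked), "students selected")
```

Unlike cluster sampling, the final stage here samples within the chosen rooms instead of surveying everyone in them.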
6) Destructive Sampling
Sometimes the act of sampling prevents the surveyor from returning the element to the population.
Example: cars used in crash tests cannot be used again for the same purpose.
Example: taking a standardized test (individuals may learn during the test, which would introduce bias if they were tested again).
Example: do students at CPHS want a longer lunch? (sample 200 of 800 students)
Simple Random Sampling
Create a numbered, alphabetical list of students, have a computer randomly select 200 names, and interview those students.
Systematic Random Sampling
Sampling interval n = 800 ÷ 200 = 4.
Generate a random number between 1 and 4.
Start with that number on the list and interview every 4th person after that.
Example: do students at CPHS want a longer lunch?
Stratified Random Sampling
Group students by grade and have a computer generate a random group of names from each grade to interview.
The number of students interviewed from each grade is probably not equal; rather, it is proportional to the size of the grade.
If there were 180 grade 10s, then 180 ÷ 800 = 0.225 and 200 × 0.225 = 45, so we would need to interview 45 grade 10s.
Example: do students at CPHS want a shorter lunch?
Cluster Random Sampling
Randomly choose enough MSIPs to sample 200 students.
Say there are 25 students per MSIP; we would need 8 MSIPs, since 8 × 25 = 200.
Interview every student in each of these rooms.
Example: do UCDSB high school students want a shorter lunch?
Multistage Random Sampling
Randomly select 4 high schools in the UCDSB.
Randomly choose a period from 1 to 5.
In each school, randomly choose 2 MSIP classes of 25 during that period.
Interview every student in those MSIPs.
200 students total (4 × 2 × 25 = 200).
Sample Size
The size of the sample will have an effect on the reliability of the results: the larger the sample, the better.
Factors:
Variability in the population (the more variation, the larger the sample required to capture that variation).
Degree of precision required for the survey.
The sampling method chosen.
Techniques for Experimental Studies
Experimental studies are different from studies where a population is sampled as it exists.
In experimental studies, some treatment is applied to some part of the population.
However, the effect of the treatment can only be known in comparison to some part of the population that has not received the treatment.
Vocabulary
Treatment group: the part of the experimental group that receives the treatment.
Control group: the part of the experimental group that does not receive the treatment.
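The slides do not say how the experimental group is split into these two parts. One common approach, assumed here for illustration, is random assignment, sketched below with a hypothetical group of 40 numbered subjects and a hypothetical function name `assign_groups`.

```python
import random

def assign_groups(subjects, seed=None):
    """Shuffle the experimental group and split it into treatment and control halves."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)           # random order, so assignment is left to chance
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]  # (treatment group, control group)

# Hypothetical experimental group: subjects numbered 1 to 40.
treatment, control = assign_groups(range(1, 41), seed=5)
print(len(treatment), "in treatment,", len(control), "in control")
```

Each subject ends up in exactly one of the two groups.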
Vocabulary
Placebo: a treatment that has no value, given to the control group to reduce bias in the experiment. No one knows whether they are receiving the treatment or not (why?).
Double-blind test: neither the subjects nor the researchers doing the testing know who has received the treatment (why?).
Class Activity
How would we take a sample of the students in this class using the following methods?
a) 40% Simple Random Sampling
b) 20% Systematic Random Sampling
c) 40% Stratified Random Sampling
d) 50% Cluster Random Sampling
MSIP / Homework
p. 99 #1, 5, 6, 10, 11
For 6b, see Ex. 1 on p. 95