Chapter 5 Producing Data 5.1 Designing Samples
A Class Survey
Example…Do NOT Copy!
Exploratory Data Analysis Seeks to discover and describe what data say by using graphs and numerical summaries. The conclusions we draw apply to the specific data we examine. But, what if we want to extend our analysis to a larger group? How can I do that?
Population: The entire group of individuals that we want information about. Sample: The part of the population that we actually examine in order to gather information. Parameter: A number that describes the population. Statistic: A number that describes the sample. Remember: P goes with P (Population & Parameter) and S goes with S (Sample & Statistic).
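To see the difference between a parameter and a statistic, here is a minimal Python sketch. The population of 500 student heights is invented purely for illustration, and the names `population`, `sample`, `population_mean`, and `sample_mean` are assumptions, not something from the textbook.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 500 students, invented for illustration
population = [rng.gauss(168, 9) for _ in range(500)]

# Parameter: a number that describes the whole population
population_mean = statistics.mean(population)

# Sample: 25 students drawn at random from the population
sample = rng.sample(population, 25)

# Statistic: the same kind of number, computed from the sample only
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```

In real studies we usually cannot compute the parameter at all, which is why we use the statistic to estimate it.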
Just a bit more info… Statistical techniques learned in this chapter for producing data open the door to statistical inference. 15-20% of the AP Exam is from this chapter! This is a very vocabulary-intense chapter. Be SURE to read before you arrive daily so we can focus on the FUN of sampling and experimental design!
Samples vs Experiments In an observational study, you observe and measure, but you do not attempt to influence responses. Sampling is a form of observational study. An experiment deliberately imposes a treatment on individuals in order to observe their responses. Sampling involves studying a part to gain information about the whole. A census attempts to gather data from the entire population of interest.
Questions we need to answer in designing a study or experiment: How many individuals must we collect data from? (sample size (n)) How will we select the individuals to be studied? If (as in many experiments) several groups of individuals are to receive different treatments, how will we form the groups?
Designing Samples for Observational Studies Goal: To use information obtained from a “representative” sample to make inferences about the population from which the sample was taken. The only alternative is taking a census, which is not very practical!
“Bad” Sampling Methods Voluntary response sample – consists of people who choose themselves by responding to a general appeal. Voluntary response samples tend to be biased (the method systematically favors one group) because people who have strong opinions are the most likely to respond. Convenience sample – “grabs” the first n individuals available, with no random selection. Example: Prob 5.7 p. 274
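Why voluntary response goes wrong can be sketched with a small simulation. Everything here is an assumption made up for illustration: the population of 10,000, the 20% who are strongly against a proposal, and the response rates in `response_rate`.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20% strongly against a proposal, 80% not
population = ["against"] * 2000 + ["not against"] * 8000

# Assumed response rates: people with strong opinions respond far more often
response_rate = {"against": 0.50, "not against": 0.05}

# Voluntary response: each person decides for themselves whether to respond
voluntary = [p for p in population if rng.random() < response_rate[p]]

# For comparison: a simple random sample of 400 people
srs = rng.sample(population, 400)

def share_against(group):
    """Proportion of a group that is against the proposal."""
    return sum(1 for p in group if p == "against") / len(group)

print(f"true share against:        {share_against(population):.2f}")
print(f"voluntary response sample: {share_against(voluntary):.2f}")
print(f"simple random sample:      {share_against(srs):.2f}")
```

Under these assumed rates, the voluntary response sample badly overstates the opposition, while the random sample lands near the true 20%.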
Random is hard? On the next slide, look at the numbers quickly and pick a number at random.
What did you choose? Almost 75% of all people pick the number 3. About 20% pick 2 or 4. Only about 5% choose 1.
Why be random? Statisticians don’t think of randomness as the annoying tendency of things to be unpredictable or haphazard. Statisticians use randomness as a tool. But, truly random values are surprisingly hard to get…
“Good” Sampling Methods Probability Samples Simple random sample (SRS) – every individual in the population has an equal chance of being included in the sample, and every possible group of n individuals has an equal chance of being the sample selected.
How to choose an SRS You can select an SRS by labeling every individual in the population with a number and then randomly choosing the sample using a table of random digits or your calculator. Example: prob 5.10 p. 279 Example: prob 5.21 p. 286
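The label-then-choose steps can be sketched in Python instead of a table or calculator. The roster of 30 students and the sample size `n = 5` are hypothetical; `random.sample` draws without replacement, so nobody is chosen twice.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical roster; in practice this is the full list of the population
roster = [f"Student {i:02d}" for i in range(1, 31)]

# Step 1: label every individual with a number
labels = {number: name for number, name in enumerate(roster, start=1)}

# Step 2: randomly choose n distinct labels (no repeats)
n = 5
chosen = rng.sample(sorted(labels), n)

# Step 3: the individuals with those labels form the SRS
srs = [labels[number] for number in chosen]

print(chosen)
print(srs)
```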
Other random sample methods Systematic random sample – randomly select a starting place/number and then take every kth value/individual (ex: prob 5.30 p. 289). Stratified random sample – break the population into strata (groups of similar individuals), then take an SRS from each stratum (ex: prob 5.25 p. 287). Multistage sample – divide the population into units, sample those units, divide the sampled units into smaller units, sample those units, and so on for as many stages as desired (ex: prob 5.23 p. 286).
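The three methods above can each be sketched as a short Python function. This is one possible reading, not the textbook's procedure: the function names, the list of 100 students, the two strata, and the six schools are all hypothetical, and the stratified version assumes the same sample size in every stratum.

```python
import random

rng = random.Random(4)  # fixed seed so the sketch is repeatable

def systematic_sample(items, k, rng):
    """Pick a random start among the first k items, then take every kth item."""
    start = rng.randrange(k)
    return items[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS of the same size from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def multistage_sample(units, n_units, n_per_unit, rng):
    """Stage 1: SRS of units. Stage 2: SRS of individuals within each chosen unit."""
    chosen_units = rng.sample(sorted(units), n_units)
    return {unit: rng.sample(units[unit], n_per_unit) for unit in chosen_units}

# Hypothetical data, invented for illustration
students = list(range(1, 101))                      # labels 1..100
strata = {"juniors": students[:60], "seniors": students[60:]}
schools = {f"School {c}": [f"{c}{i}" for i in range(1, 41)] for c in "ABCDEF"}

every_tenth = systematic_sample(students, 10, rng)  # every 10th student
by_grade = stratified_sample(strata, 5, rng)        # 5 juniors and 5 seniors
two_stage = multistage_sample(schools, 2, 8, rng)   # 2 schools, 8 students each

print(every_tenth)
print(by_grade)
print(two_stage)
```

Notice that randomness enters every method, but in a different place: the starting point, the draw within each stratum, or the draw at each stage.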
Time to turn around! Half of you will turn around and face the back wall; the other half of you will look at the screen and answer the question SILENTLY in your notes. Then we will swap – the first half will look and SILENTLY write while the other group stares at the wall. Comparison to follow!
Face or Vase? Vase?
Swap!
Vase or Face?
Compare time! What did the first group see? Second group? Why are we getting these results? Discuss among yourselves and see if you can figure it out! One more time – let’s try it again!
Face or Vase?
Swap!
Face or Vase? Vase?
Let’s compare again Ok, now what did we say as the answer? Again, discuss. What do you think this tells us about surveying?
Cautions Regarding Surveys Undercoverage – some groups are left out in the process of choosing the sample. Nonresponse – the selected individual cannot be contacted or refuses to answer the questions. Response bias – caused by the behavior of the respondent or the interviewer. Wording of questions – the most important influence on the answers given to a survey. Example: prob 5.15 p. 284; prob 5.26 p. 288; prob 5.22 p. 286