1
4.1 (Day 2)
2
Other Sampling Methods
Last week we talked about simple random samples (SRS), the simplest way to eliminate bias.
Sometimes, however, there are factors that make more complex sampling methods more appealing:
1. Taking an SRS is too difficult/expensive.
2. We are not convinced that an SRS will actually give us a representative sample.
3
Stratified Random Sample
Probably the most commonly used sampling method.
Instead of sampling from the entire population, you divide the population into sub-groups (“strata”).
You then randomly sample from each of these strata.
Combine these “sub-samples” together to form your overall sample.
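The three steps above (divide into strata, sample within each stratum, combine) can be sketched in Python. This is only an illustration under assumptions: the function name `stratified_sample`, the made-up student list, and the choice to take the same number from every stratum are not from the slides.

```python
import random

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Take an SRS of fixed size from each stratum and combine the results."""
    rng = random.Random(seed)
    # Step 1: divide the population into strata.
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    # Step 2: randomly sample within each stratum.
    # Step 3: combine the sub-samples into one overall sample.
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = min(n_per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 students, each tagged with a grade level.
grades = ["freshman", "sophomore", "junior"]
students = [(f"student{i}", grades[i % 3]) for i in range(60)]
sample = stratified_sample(students, stratum_of=lambda s: s[1], n_per_stratum=4, seed=1)
print(len(sample))  # 3 strata x 4 students each
```

Every grade level is guaranteed to appear in the sample, which a single SRS of 12 students would not guarantee.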
5
Stratified Random Sampling
1. Taking an SRS is too difficult/expensive.
2. We are not convinced that an SRS will actually give us a representative sample.
Which of these problems is stratified random sampling fixing? #2.
Sidenote: do NOT abbreviate stratified random sampling as SRS. SRS always means simple random sample.
6
Stratified Random Sampling
When is it most commonly used?
It is commonly used to make sure that important groups are all represented, such as groups defined by income, race/ethnicity, gender, or age.
8
Downsides to Stratified Random Sampling
Harder/more expensive than an SRS.
You have to know in advance who falls into which stratum.
You have to know how many to sample from each stratum to make the sample representative.
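One common way to decide how many to sample from each stratum is proportional allocation: each stratum gets a share of the sample equal to its share of the population. The slides do not name a specific rule, so treat this as one possible sketch; the function name `proportional_allocation` and the stratum sizes are hypothetical.

```python
def proportional_allocation(stratum_sizes, total_sample_size):
    """Split a total sample size across strata in proportion to stratum size."""
    population_size = sum(stratum_sizes.values())
    # Ideal (possibly fractional) share for each stratum.
    exact = {name: total_sample_size * size / population_size
             for name, size in stratum_sizes.items()}
    # Round every share down first.
    allocation = {name: int(share) for name, share in exact.items()}
    # Hand out the slots lost to rounding, largest fractional part first.
    leftover = total_sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation

# Hypothetical population of 1,000 people in three income strata.
sizes = {"low": 500, "middle": 300, "high": 200}
allocation = proportional_allocation(sizes, 50)
print(allocation)  # {'low': 25, 'middle': 15, 'high': 10}
```

Note that this requires knowing the size of every stratum in advance, which is exactly the extra cost described above.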
9
Cluster Sampling
Cluster sampling is a solution to the other problem: that an SRS can be expensive/difficult.
Particularly if your population is large and/or very spread out, it is difficult to use an SRS or stratified random sampling.
These methods are also very difficult to use if you cannot easily identify each individual in your population. Think of the trees in a forest.
10
Cluster Sampling to the Rescue
The procedure for carrying out a cluster sample is similar to that for a stratified random sample.
First divide the population into smaller groups (or “clusters”). So far this sounds exactly like stratified random sampling.
But now we are not dividing into groups to try to make the sample representative; instead we are dividing into groups that can be readily found together, such as homerooms, neighborhoods, or 50x50 sections of the forest.
11
Cluster Sampling to the Rescue
Then we randomly select which clusters to sample.
Typically, once a cluster is selected, ALL individuals in the cluster are included in the sample.
Occasionally you may choose to then take an SRS within each selected cluster.
12
Cluster Sampling
As with stratified random sampling, we then combine the selected clusters to form our overall sample.
So cluster sampling is NOT necessarily making our sample more representative than an SRS.
But it is, in some cases, making the sample easier to collect and more cost/time effective.
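A cluster sample can be sketched the same way. The example below assumes the typical version described earlier, where every individual in a selected cluster is included; the function name `cluster_sample` and the homeroom data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and include every individual in them."""
    rng = random.Random(seed)
    # Randomly select which clusters to sample (the clusters, not the people).
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Combine everyone in the chosen clusters into one overall sample.
    sample = []
    for name in chosen:
        sample.extend(clusters[name])
    return chosen, sample

# Hypothetical school: 8 homerooms of 25 students each.
homerooms = {f"room{r}": [f"room{r}-student{s}" for s in range(25)]
             for r in range(8)}
chosen, sample = cluster_sample(homerooms, n_clusters=2, seed=3)
print(chosen, len(sample))
```

Only two homerooms have to be visited to collect 50 responses, which is where the savings in time and cost come from.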
13
The Stratified/Cluster Sample Hybrid
On a test, a given method will clearly be an SRS, a stratified random sample, or a cluster sample.
But in practice, there is some overlap between cluster samples and stratified random samples.
You may even use such a hybrid later this semester.
14
The Stratified/Cluster Sample Hybrid
For example, if my population is citizens of the United States, it would be too difficult to take an SRS.
So I decide to use a cluster sample to make it easier.
But I want to make sure that people in both smaller cities/towns (under 40,000 people) and bigger cities (over 40,000 people) are represented.
15
The Stratified/Cluster Sample Hybrid
So I decide to randomly sample individuals from 10 big cities and 5 smaller cities.
16
The Stratified/Cluster Sample Hybrid
Notice that I have clustered by city but also stratified by size of the city.
So this is not a pure stratified random sample, and not a pure cluster sample: it is a hybrid.
Your book (as far as I know) does not really talk about this, but I just wanted you to be aware of it.
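The hybrid design can be sketched as follows. Several details here are assumptions that the slides leave open: that the cities themselves are chosen at random within each size group, that a fixed number of people is then sampled in each chosen city, and that a city of exactly 40,000 counts as small. The function name `hybrid_sample` and the city data are made up.

```python
import random

def hybrid_sample(cities, n_big, n_small, people_per_city, cutoff=40_000, seed=None):
    """Stratify cities by size, pick cities at random, then sample residents."""
    rng = random.Random(seed)
    # Stratify: split the cities by population size.
    # Assumption: a city of exactly `cutoff` people goes in the small group.
    big = sorted(n for n, info in cities.items() if info["population"] > cutoff)
    small = sorted(n for n, info in cities.items() if info["population"] <= cutoff)
    # Cluster: randomly choose whole cities within each stratum.
    chosen = rng.sample(big, n_big) + rng.sample(small, n_small)
    # Sample individuals within each chosen city.
    sample = []
    for name in chosen:
        residents = cities[name]["residents"]
        sample.extend(rng.sample(residents, min(people_per_city, len(residents))))
    return chosen, sample

# Hypothetical data: 20 big cities and 10 small ones, 200 listed residents each.
cities = {}
for i in range(30):
    cities[f"city{i:02d}"] = {
        "population": 100_000 if i < 20 else 10_000,
        "residents": [f"city{i:02d}-person{j}" for j in range(200)],
    }
chosen, sample = hybrid_sample(cities, n_big=10, n_small=5, people_per_city=20, seed=7)
print(len(chosen), len(sample))  # 15 cities, 300 people
```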
18
Inference
Why random sampling? Let’s do an activity.
I have 3 decks of cards.
One at a time, come shuffle a deck, then pick 5 cards from it.
Then put them back in the deck.
Then record on the (left) dotplot what percentage were red.
What does it look like?
19
Inference
Now pick 10 cards.
Then record on the (middle) dotplot what percentage were red.
What does it look like?
20
Inference
Now pick 20 cards.
Then record on the (right) dotplot what percentage were red.
What does it look like?
21
Inference
What did this activity tell us?
What does it imply for random sampling?
22
Random Sampling
Random sampling works so well because each individual is equally likely to be chosen. In the activity, each card was equally likely to be chosen.
As the sample gets larger, it comes closer and closer to representing the entire population.
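The card activity can be imitated with a short simulation. This sketch assumes a standard 52-card deck (26 red, 26 black) and repeats each draw 2,000 times, far more than a class would; the function name `red_percentages` is made up. It measures how spread out the "percentage red" results are for each sample size.

```python
import random
import statistics

def red_percentages(sample_size, n_draws, seed=None):
    """Repeatedly draw cards from a shuffled deck and record the percentage red."""
    rng = random.Random(seed)
    deck = ["red"] * 26 + ["black"] * 26  # standard 52-card deck
    results = []
    for _ in range(n_draws):
        # Draw without replacement; the cards go back before the next draw.
        hand = rng.sample(deck, sample_size)
        results.append(100 * hand.count("red") / sample_size)
    return results

# Standard deviation of the recorded percentages for each sample size.
spreads = {size: statistics.pstdev(red_percentages(size, n_draws=2000, seed=42))
           for size in (5, 10, 20)}
for size, spread in spreads.items():
    print(size, round(spread, 1))
```

The true value is 50% red. The spread shrinks as the number of cards drawn grows, so larger samples land closer to 50% more often, which is what the three dotplots should show.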
23
Margin of Error
Each sample has a margin of error.
It sets bounds on the range in which the true population value is likely to fall, based on the sample.
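The slides do not give a formula, but a standard approximation for a sample proportion from a reasonably large SRS is margin of error ≈ z × sqrt(p̂(1 − p̂)/n), with z = 1.96 for roughly 95% confidence. The sketch below uses that approximation; the function name `margin_of_error` and the sample sizes are illustrative.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a large simple random sample; z=1.96 gives about 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Margin of error, in percentage points, when the sample proportion is 50%.
for n in (100, 400, 1000):
    print(n, round(100 * margin_of_error(0.5, n), 1))
```

Quadrupling the sample size from 100 to 400 only halves the margin of error, because n sits under a square root.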
24
Sample Surveys: What Can Go Wrong?
Undercoverage occurs when some groups in the population are left out of the process of choosing the sample.
Nonresponse occurs when an individual chosen for the sample can’t be contacted or refuses to participate.
A systematic pattern of incorrect responses in a sample survey leads to response bias.
The wording of questions is the most important influence on the answers given to a sample survey.
25
Wording of the Question
“Since they are breaking the law, should illegal immigrants be deported?”
“Should illegal immigrants be deported?”
“Despite holding jobs and helping the US economy, should illegal immigrants be deported?”
26
Order of the Questions
A series of two questions was asked of college students as part of a larger survey. The order of the two questions was randomly assigned.
Order #1:
“How happy are you with your life in general? (on a scale of 1 to 5)”
“How many dates have you been on in the past month?”
This order found almost no correlation between happiness and number of dates.
27
Order of the Questions
Order #2:
“How many dates have you been on in the past month?”
“How happy are you with your life in general? (on a scale of 1 to 5)”
This order found a moderately strong positive correlation between happiness and number of dates.
Sometimes the answer to one question changes the answer to another, so even the order can matter.
28
Surveys
Moral of the story:
There are lots of ways to create bias in a survey. You will explore these in the first semester project.
In general, if you want unbiased results:
Be careful with your wording: make sure that the questions are as neutral as possible.
Try not to put questions near each other when one would affect the answer to the other.
Make your sample as representative as possible.
It is HARD to know for sure that you’re not getting bias, but you can take precautions.
It is EASY to intentionally create bias, but this makes it more PROPAGANDA than statistics.
29
The End