Causes of Error in Sampling

Sampling Error
Sampling error is error caused by the way you choose your sample.
– Volunteer sampling & convenience sampling
– Causes bias
How do you reduce it?
– Use randomness in choosing your sample

Sampling Error: Random Sampling Error
Random sampling error
– Your sample may randomly have a higher or lower percentage of females, college-educated people, Hispanics, etc. than what is found in your population.
– Causes variation
How do you reduce it?
– Choose a larger sample size.
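
The "choose a larger sample size" advice can be checked with a small simulation. This is only a sketch: the function names, the 50% population rate, and the sample sizes are made up for illustration and are not from the slides. It draws many random samples of each size and measures how much the sample percentage bounces around from sample to sample.

```python
import random
import statistics


def sample_proportion(population_rate, n, rng):
    """Pick n people at random; return the share who have the trait."""
    hits = sum(1 for _ in range(n) if rng.random() < population_rate)
    return hits / n


def spread_of_estimates(population_rate, n, trials=2000, seed=0):
    """Standard deviation of the sample proportion over many repeated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = [sample_proportion(population_rate, n, rng) for _ in range(trials)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    # Hypothetical population in which 50% have the trait.
    for n in (25, 100, 400):
        print(n, round(spread_of_estimates(0.5, n), 3))
```

The printed spread should shrink as the sample size grows: the variation gets smaller, but any bias in how the sample was chosen would be untouched.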

Sampling Error: Random Sampling Error
From the bell curve, 95% of the data lie within ± 2 standard deviations of the mean.
This means that we are 95% confident that the true value is within 2 standard deviations of our measured value.
Std. Dev. for Categorical Variables

Sampling Error: Random Sampling Error
If you change the 2, you can find other confidence intervals.
– Example: using 3 yields a 99.7% confidence interval
– We’ll mostly stick to 95% confidence
This only finds the expected error from randomness (variability). It does not account for bias.
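
As a worked example of the "2 standard deviations" rule, here is a minimal sketch. The slide's own formula did not survive, so the code assumes the usual textbook standard deviation for a sample proportion, sqrt(p(1 − p)/n). The function names and the poll numbers (50% of 400 people) are hypothetical.

```python
import math


def proportion_std_dev(p_hat, n):
    """Assumed textbook formula: sqrt(p(1 - p) / n) for a sample proportion."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def confidence_interval(p_hat, n, multiplier=2):
    """Measured value plus or minus `multiplier` standard deviations.

    multiplier=2 gives roughly 95% confidence, multiplier=3 roughly 99.7%.
    """
    margin = multiplier * proportion_std_dev(p_hat, n)
    return p_hat - margin, p_hat + margin


if __name__ == "__main__":
    # Hypothetical poll: 400 people sampled, 50% say yes.
    print(confidence_interval(0.5, 400))                # about 95% confidence
    print(confidence_interval(0.5, 400, multiplier=3))  # about 99.7% confidence
```

For this made-up poll the 95% interval runs from about 45% to 55%. That range covers random sampling error only; a biased sample can miss by far more.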

Sampling Error: Undercoverage
Undercoverage is when some groups in the population are left out of the process of choosing the sample.
What groups CAN’T be contacted by a survey where you call random people and ask their opinion?
– Amish
– Homeless
– People serving overseas
– Prison inmates
– People with unlisted numbers & people without phones

Sampling Error: Undercoverage
Undercoverage causes… bias
How do you reduce/prevent undercoverage?
– Census (poll everyone)
– Reduce bias in your sampling (avoid convenience & volunteer sampling)
– Stratified sample: split the population into groups and then sample each group.
  Example: instead of randomly selecting 100 people, randomly select 50 men and 50 women
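
The stratified-sample idea (50 men and 50 women rather than any 100 people) can be sketched as below. This is an illustration under assumptions, not a prescribed method: the function name, the `(id, group)` record layout, and the 300/700 population are all made up.

```python
import random


def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Split the population into groups, then randomly sample each group."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for label in sorted(strata):
        # Draw the same number from every group, without replacement.
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample


if __name__ == "__main__":
    # Hypothetical population: 300 men and 700 women, stored as (id, group).
    people = [(i, "man") for i in range(300)] + [(i, "woman") for i in range(300, 1000)]
    chosen = stratified_sample(people, stratum_of=lambda p: p[1], per_stratum=50)
    print(sum(1 for p in chosen if p[1] == "man"),
          sum(1 for p in chosen if p[1] == "woman"))
```

Every group is guaranteed a seat in the sample, so no group is left out by chance. Groups that never appear in the list you sample from are still missed.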

Nonsampling Error: Processing Error
Examples
– Enter a number wrong in Excel
– Math error
How do you reduce/prevent it?
– Double-check your work
– Don’t rush
Causes variation

Nonsampling Error: Response Error
Response error: the respondent gives an incorrect response.
Why would someone do that?
– Lie
  How much do you weigh?
  Have you ever used drugs?
– Remember incorrectly
  How many minutes have you watched TV this week?
  Where exactly were you at 3:30 PM last Saturday?
– Vague or confusing question
  How many windows do you have? Do door windows count?

Nonsampling Error: Response Error
Lying causes bias.
How can we reduce lying?
– Confidential: the interviewer promises the respondent’s name won’t be released with the results
– Anonymous: even the interviewer doesn’t know which response sheet corresponds to which person
– Study people without them knowing, as with the handwashing homework problem. Is this ethical?

Nonsampling Error: Response Error
How can we reduce memory errors?

Nonsampling Error: Response Error
Confusing questions cause bias if they favor one answer over another. They cause variability if they are just confusing in general but don’t favor a particular outcome.
How can we keep people from misunderstanding the question?
– Clear, careful and extremely specific wording

Nonsampling Error: Response Error: Question Wording
13% of Americans think we are spending too much on “assistance to the poor,” but 44% think we are spending too much on “welfare.”
A poll in Scotland showed that 51% would vote in favor of “independence for Scotland,” but 34% would vote in favor of “an independent Scotland separate from the United Kingdom.”
“Assistance” & “independence” are positive words, while “welfare” & “separate” are negative words.

Wording Questions
Loaded questions cause bias
– Do you favor banning private ownership of handguns in order to reduce the rate of violent crime?
– George Bush: great president or greatest president?
– Do you support our president?
– Do you agree with all of Obama’s policies?
– Do you approve or disapprove of the way Barack Obama is handling his job as president?

Wording Questions: Vague
“How many windows are in your house?” is a simple enough question…
– Does this room have 2 or 12 windows on the wall?
– Do the windows on the door count?
If you’re not specific, everyone is going to interpret the question differently, adding a lurking variable and variability to your data.

Wording Questions: Open vs. Closed Questions
– Closed: “Rate Obama’s performance on a scale of 1 to 10” vs. open: “What do you think about Obama’s job as president?”
– Limiting people’s options can cause bias
– Limiting people’s options can reduce misunderstanding of the question
– Closed questions are easier to analyze

Nonresponse
People refuse to answer your survey. They hang up on you, slam the door in your face or just politely say no.

Nonadherer
“Nonadherer” is a generic term for someone who doesn’t follow directions for whatever reason.
– Example: they forgot to take the experimental pill every 6 hours.
– Example: they lied on their survey form.
How would you reduce this?

Dropouts
Sometimes you study the same group over several days, weeks, months or even years. Dropouts are people who start the experiment but then stop before it is complete.