Journalism 614: Sampling and Non-Response


Sampling
- Probability sampling: based on random selection
- Non-probability sampling: based on convenience

Sampling Miscues: Alf Landon for President (1936)
- Literary Digest: post cards to voters in 6 states
  - Correctly predicted elections from 1920 to 1932
  - Names selected from telephone directories and automobile registrations
- In 1936, they sent out 10 million post cards
  - Results picked Landon, 57% to Roosevelt's 43%
- Election: Roosevelt in the largest landslide
  - Roosevelt took 61% of the vote and won 523-8 in the Electoral College
- Why so inaccurate?
  - Poor sampling frame
  - Led to selection of wealthy respondents

Sampling Miscues: Thomas E. Dewey for President (1948)
- Gallup picked the winner from 1936 to 1944
- Used quota sampling: matches sample characteristics to the population
  - Gallup quota-sampled on the basis of income
- In 1948, Gallup picked Dewey to defeat Truman
- Reasons:
  1. Most pollsters quit polling in October
  2. Undecided voters went for Truman
  3. Unrepresentative samples: WWII had changed society since the census

Non-probability Sampling
- Used in situations where a sampling frame for randomization doesn’t exist
- Types of non-probability samples:
  1. Reliance on available subjects (convenience sampling)
  2. Purposive or judgmental sampling
  3. Snowball sampling
  4. Quota sampling

Reliance on Available Subjects
- Person on the street, easily accessible
- Examples: mall intercepts, college students, e-polls
- Frequently used, but usually biased
- Notoriously inaccurate
  - Especially in making inferences about a larger population, even with many respondents

Purposive or Judgmental Sampling
- Dictated by the purpose of the study
- Situational judgments about which individuals should be surveyed to make for a useful or representative sample
- E.g., using college students to study third-person effects (3PE) regarding rap and metal music
  - 3PE: the belief that others are more affected by exposure than oneself
  - Assessing effects on self and others
  - Using college students makes for homogeneity of self

Snowball Sampling
- Used when the population of interest is difficult to locate
  - E.g., homeless people, meth addicts
- Researcher collects data from a few people in the targeted group
- Initially surveyed individuals are asked to name other people to contact
- Good for exploration
- Bad for generalizability

Quota Sampling
- Begins with a table of relevant characteristics of the population
  - Proportions of gender, age, education, ethnicity from census data
- Selects a sample to match those proportions
- Problems:
  1. Quota frame must be accurate
  2. Sample is not random, but can be representative

Probability Sampling
- Goal: representativeness
  - Sample resembles the larger population
- Random selection
  - Enhances the likelihood of a representative sample
  - Each unit of the population has an equal chance of being selected into the sample

Population Parameters
- Parameter: summary value describing the population
  - E.g., mean age of the population
- A sample allows parameter estimates
  - E.g., mean age of the sample
  - Used as an estimate of the population parameter

Sampling Error
- Every time you draw a sample from the population, the parameter estimate will fluctuate slightly
- E.g.:
  - Sample 1: mean age = 37.2
  - Sample 2: mean age = 36.4
  - Sample 3: mean age = 38.1
- If you draw lots of samples, the estimates form an approximately normal curve centered on the population parameter

[Figure: Normal Curve of Sample Estimates. Vertical axis: frequency of estimated means from multiple samples. Horizontal axis: estimated mean. The peak of the curve marks the likely population parameter.]
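The idea behind the Sampling Error slide can be checked with a small simulation. This is a minimal sketch, not part of the original deck: the population of ages is made up (uniform between 18 and 90), and the function name `sample_means` is hypothetical.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples simple random samples and return the mean of each."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 100,000 ages, invented purely for illustration.
pop_rng = random.Random(42)
population = [pop_rng.randint(18, 90) for _ in range(100_000)]
true_mean = statistics.mean(population)

# 1,000 samples of 400 people each; every sample gives its own estimate.
means = sample_means(population, sample_size=400, n_samples=1000)

print(f"population mean:       {true_mean:.2f}")
print(f"first three estimates: {[round(m, 1) for m in means[:3]]}")
print(f"mean of all estimates: {statistics.mean(means):.2f}")
print(f"spread of estimates:   {statistics.stdev(means):.2f}")
```

Individual estimates differ from sample to sample, but they cluster around the population mean. Plotting `means` as a histogram should give roughly the bell shape the slides describe.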

Error and Sample Size
- As the sample size increases, the error decreases
  - In other words, a large-sample estimate is likely to be closer to the population parameter
- As the sample size increases, we get more confident in our parameter estimate

Confidence Interval
- Range around the estimate within which we are 95% confident the population parameter lies
- For example, we predict that Candidate X will receive 45% of the vote, plus or minus 3 percentage points
  - We are 95% sure the parameter is between 42% and 48%
  - This is the “margin of error” in a poll
- Confidence interval shrinks as:
  - Error is smaller
  - Sample size is larger

Sample Size & Confidence Interval
- How precise does the estimate have to be?
  - More precise: larger sample size
- Larger samples increase precision
  - But at a diminishing rate
- Each unit you add to your sample contributes to the accuracy of your estimate
  - But the amount it adds shrinks with each additional unit

95% Confidence Intervals: margin of error (± percentage points) by sample size
% split   N=100   N=200   N=300   N=400   N=500   N=700   N=1000   N=1500
50/50      10.0     7.1     5.8     5.0     4.5     3.8      3.2      2.6
70/30       9.2     6.5     5.3     4.6     4.1     3.5      2.9      2.4
90/10       6.8     4.2       -     3.0     2.7     2.3      1.9      1.5
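The margins in the table above follow from the usual formula for a proportion, z * sqrt(p(1 - p) / n). The sketch below assumes z = 2 (a common rounding of 1.96). That value is inferred from the numbers, not stated in the slides. The function name `margin_of_error` is hypothetical. The 90/10 row is incomplete in the transcript, so only the 50/50 and 70/30 rows are reproduced.

```python
import math


def margin_of_error(p, n, z=2.0):
    """95% margin of error, in percentage points, for proportion p and sample size n.

    z=2.0 is an assumption inferred from the table; 1.96 is the exact value.
    """
    return 100 * z * math.sqrt(p * (1 - p) / n)


sizes = [100, 200, 300, 400, 500, 700, 1000, 1500]

for a, b in [(50, 50), (70, 30)]:
    row = [round(margin_of_error(a / 100, n), 1) for n in sizes]
    print(f"{a}/{b}: {row}")
```

The diminishing returns are visible in the 50/50 row: going from 100 to 400 respondents halves the margin (10.0 to 5.0), while going from 400 to 1500 only brings it down to 2.6.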

Describe Sampling Frame
- List of units from which the sample is drawn
  - Defines your population
  - E.g., list of members of the population
- Ideally you’d list all members of your population as your sampling frame
  - Randomly select your sample from that list
- Often impractical to list the entire population

Sampling Frames for Surveys
- Limitations of the telephone book:
  - Misses unlisted numbers and mobile numbers
  - SES and age bias:
    - Poor people may not have a phone
    - Less likely to have multiple phone lines
    - Young people have mobile phone numbers
- Most studies use a technique such as Random Digit Dialing as a way around this

Types of Sampling Designs
- Simple Random Sampling
- Systematic Sampling
- Stratified Sampling
- Multi-stage Cluster Sampling

Simple Random Sampling
- Establish a sampling frame
  - A number is assigned to each element
  - Elements are randomly selected into the sample
- Use a random number generator to select every case you need for inclusion
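A minimal sketch of simple random sampling, assuming the sampling frame is already available as a list. The frame contents and the function name `simple_random_sample` are hypothetical.

```python
import random


def simple_random_sample(frame, sample_size, seed=None):
    """Select sample_size elements from the frame, without replacement."""
    rng = random.Random(seed)
    return rng.sample(frame, sample_size)


# Hypothetical sampling frame: 1,000 numbered elements.
frame = [f"person_{i:04d}" for i in range(1, 1001)]

sample = simple_random_sample(frame, 200, seed=1)
print(len(sample), sample[:5])
```

Passing a seed makes the draw repeatable, which helps when the selection has to be documented or audited.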

Systematic Sampling
- Establish a sampling frame
- Select every kth element, with a random start
  - E.g., with 1000 on the list, choosing every 5th name yields a sample size of 200
- Sampling interval: standard distance between units selected from the sampling frame
  - Sampling interval = population size / sample size
- Sampling ratio: proportion of the population selected
  - Sampling ratio = sample size / population size
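The slide's example (1000 names, every 5th one, sample of 200) can be sketched as follows. The frame and the function name `systematic_sample` are hypothetical, and the sketch assumes the population size divides evenly by the sample size.

```python
import random


def systematic_sample(frame, sample_size, seed=None):
    """Take every kth element of the frame after a random start."""
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and len(frame)")
    interval = len(frame) // sample_size      # sampling interval k
    rng = random.Random(seed)
    start = rng.randrange(interval)           # random start within the first interval
    return frame[start::interval][:sample_size]


# Hypothetical sampling frame of 1,000 names.
frame = [f"name_{i:04d}" for i in range(1000)]

sample = systematic_sample(frame, 200, seed=7)
interval = len(frame) // len(sample)          # 1000 / 200 = 5
ratio = len(sample) / len(frame)              # 200 / 1000 = 0.2
print(interval, ratio, sample[:3])
```

One caution when using this approach: if the list has a periodic pattern that lines up with the interval, the sample can be biased, so the ordering of the frame matters.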

Stratified Sampling
- Modification used to reduce the potential for sampling error
- Researcher ensures that certain groups are represented proportionately in the sample
  - E.g., if the population is 60% female, a stratified sample selects 60% females into the sample
  - E.g., stratifying by region of the country to make sure that each region is proportionately represented
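A minimal sketch of proportionate stratified sampling, using the 60% female example from the slide. The frame, the stratum labels, and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(frame, stratum_of, sample_size, seed=None):
    """Sample each stratum in proportion to its share of the frame."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        # Rounding means the total can differ slightly from sample_size
        # when the proportions do not divide evenly.
        k = round(sample_size * len(members) / len(frame))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical frame: 600 women and 400 men, as (id, gender) pairs.
frame = [(i, "female") for i in range(600)] + [(i, "male") for i in range(600, 1000)]

sample = stratified_sample(frame, stratum_of=lambda unit: unit[1], sample_size=100, seed=3)
counts = Counter(gender for _, gender in sample)
print(counts)
```

Within each stratum the selection is still random. Only the split between strata is fixed in advance.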

Cluster Sampling
- Frequently, there is no convenient way of listing the population for sampling
  - E.g., a sample of Dane County or Wisconsin
  - Hard to get a list of the population members
- Cluster sample
  - Sample of census blocks
  - List census blocks, then list people for the selected blocks
  - Select a sub-sample of people living on each block

Multi-stage Cluster Sample
- Cluster sampling done in a series of stages: list, then sample within
- Example:
  - Stage 1: List zip codes; randomly select zip codes
  - Stage 2: List census blocks within selected zip codes; randomly select census blocks
  - Stage 3: List households on selected census blocks; randomly select households
  - Stage 4: List residents of selected households; randomly select a person to interview
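The four stages can be sketched with nested random selections. Everything here is illustrative: the nested dictionary stands in for lists that would normally be compiled stage by stage, and the function name `multistage_sample` and all identifiers in the data are hypothetical.

```python
import random


def multistage_sample(zip_codes, n_zips, n_blocks, n_households, seed=None):
    """List, then sample within: zip codes -> blocks -> households -> one resident."""
    rng = random.Random(seed)
    respondents = []
    # Stage 1: randomly select zip codes.
    for zip_code in rng.sample(sorted(zip_codes), n_zips):
        blocks = zip_codes[zip_code]
        # Stage 2: randomly select census blocks within each selected zip code.
        for block in rng.sample(sorted(blocks), n_blocks):
            households = blocks[block]
            # Stage 3: randomly select households on each selected block.
            for household in rng.sample(sorted(households), n_households):
                # Stage 4: randomly select one resident to interview.
                person = rng.choice(households[household])
                respondents.append((zip_code, block, household, person))
    return respondents


# Hypothetical frame: 5 zip codes x 10 blocks x 20 households x 3 residents.
zip_codes = {
    f"zip{z}": {
        f"block{b}": {
            f"hh{h}": [f"z{z}b{b}h{h}_r{r}" for r in range(1, 4)]
            for h in range(1, 21)
        }
        for b in range(1, 11)
    }
    for z in range(1, 6)
}

respondents = multistage_sample(zip_codes, n_zips=2, n_blocks=3, n_households=5, seed=11)
print(len(respondents), respondents[0])
```

The practical benefit is that only the selected zip codes, blocks, and households ever need to be listed, so a full list of the population is never required.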

Nonresponse
- Declining contact and cooperation rates
  - Especially for “gold standard” RDD national telephone surveys
- Early research suggests the issues are rather small, with little bias in results
  - Examined by comparing “easy to contact” individuals with “hard to contact” individuals
  - A more systematic approach compares a standard 5-day survey with a “rigorous” survey

Accelerating Problem
- Survey firms report increasingly high rates of non-contact and non-cooperation
  - Americans lead increasingly busy lives
  - More and more unsolicited calls to the home
  - Sophisticated technologies to avoid calls
- Big drop-offs in the last 15-20 years
  - Call screening (“I only take known callers”)
  - Cell phones (“I pay for minutes during the survey”)

Hard to Gauge the Effect
- Initial work conducted in the late 1990s
- Curtin et al.: low-effort “restricted call” design versus high-effort “all call” design
  - Saw no difference in population estimates
- Keeter et al.: two parallel surveys, one using a standard 5-day design, the other a “rigorous” design
  - On average, a two-percentage-point difference
- These seem to suggest that a lower response rate does not affect survey quality

Non-response in this Century
- A lot has changed in the last decade or more
  - More legislative restrictions
  - More mobile technologies
  - More VOIP technologies
- The study was re-run and found similar results comparing 5-day and rigorous designs
  - 5-day: 10 call backs, one refusal conversion
  - Rigorous: 21 weeks, advance letters, left messages, additional call backs, etc.
- Little difference in findings

The Problem of Cell Phones
- In 2006, 13% of households (HHs) were cell-phone-only
  - Increasing 1 percentage point every six months, 2003-2006
  - Increasing 2 percentage points every six months after 2006
- By 2015, 46% of U.S. adults lived in cellphone-only HHs
  - 64% of Millennials (born 1977-1994) are cellphone-only
- Bias in terms of who is missed is most prominent among young people
  - “Serious coverage problem”
  - “Particular challenge”

Big differences in wireless-only HHs by Age and SES
- Only 16% of those 65+
- Nearly 70% among those 25-29
- Just over 40% among the “not poor”
- Over 50% for the “near poor”
- Nearly 60% for the “poor”
- This creates systematic biases

Some substantial differences
- Big differences between cell and non-cell respondents on a range of questions
- Especially for issues that affect younger people, and behaviors such as voting
  - Registered to vote?
  - Political knowledge?
  - Media usage?

Strive for Higher Response Rate

To Achieve a High Response Rate
- Incentives: gifts or drawings for completion
  - Prize drawing vs. smaller incentives to all
  - Donating to a charity as an inducement
  - Run experiments to see which incentive works
- Online: pre-invitation and landing page
  - Test different versions to see what encourages respondents to click on the survey link
- Reminders and follow-ups to boost response
  - In general, you only want to send one or two email reminders
- Make the online survey friendly for all devices and browsers
  - Make it usable on mobile or tablet
  - See if there is a high bounce rate from particular devices