Chapter 2 Sampling Design.

How do we gather data? Surveys. Opinion polls. Studies: observational, retrospective (past), prospective (future). Experiments. Simulations.

Population: the entire group we want information about.

Census: a complete survey of the population.

How effective is a census? Frog fairy tale . . . The answer is 83!

Why not use a census all the time? Not accurate: look at the U.S. census, which has a huge amount of error in it. Plus, it takes a long time to compile the data, making the data obsolete by the time we get it! Very expensive: since a census of any population takes time, censuses are VERY costly to do! Perhaps impossible: suppose you wanted to know the average weight of the white-tail deer population in Michigan. Would it be feasible to do a census? Destructive sampling would destroy the population: breaking strength of soda bottles, lifetime of flashlight batteries, safety ratings for cars.

Sample: the part of the population that we actually examine to gather information. We generalize sample data to the population.

Sampling Design: the method used to choose the sample from the population.

Sampling Frame: a list of every individual in the population.

Simple Random Sample (SRS): n individuals chosen so that every individual has an equal chance of being chosen and every group of n individuals has an equal chance of being chosen. Suppose we were to take an SRS of 100 GBHS students: put each student’s name in a hat, then randomly select 100 names from the hat. Not only does each student have the same chance to be selected, but every possible group of 100 students has the same chance to be selected! So it has to be possible for all 100 students to be seniors in order for it to be an SRS.

Describing an SRS: population = 3000, sample = 100. Assign each person a number from 1 to 3000. Use a random number generator (RNG) to choose 100 distinct numbers from 1 to 3000.
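
The two steps above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name simple_random_sample and the seed argument are assumptions made for the example, and the standard-library call random.Random.sample plays the role of the RNG by drawing distinct values without replacement.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick `sample_size` distinct ID numbers from 1..population_size.

    Hypothetical helper for illustration; `seed` only makes a run repeatable.
    """
    rng = random.Random(seed)
    # sample() draws without replacement, so every group of
    # `sample_size` IDs is equally likely to be the result.
    return rng.sample(range(1, population_size + 1), sample_size)

# Population = 3000, sample = 100, as on the slide.
chosen = simple_random_sample(3000, 100, seed=1)
print(sorted(chosen)[:10])
```

Because the draw is without replacement, no person can be selected twice, which matches the "100 distinct numbers" requirement.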

SRS Advantages: unbiased; easy. Disadvantages: large variance; may not be representative; need sampling frame.

Stratified Random Sample: Divide the population into homogeneous groups, called strata, then draw an SRS from each stratum. Homogeneous groups are groups in which the individuals are all alike based upon some characteristic. Suppose we were to take a stratified random sample of 100 GBHS students. Since students are already divided by grade level, grade levels can be our strata. Then randomly select 25 students from each grade.
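
As a rough sketch of the grade-level example, the Python below groups a roster by grade and draws an SRS of 25 from each stratum. The roster is made up for illustration (3000 ID numbers split evenly across grades 9 to 12, which is an assumption), and stratified_sample is a hypothetical helper name.

```python
import random
from collections import defaultdict

def stratified_sample(roster, per_stratum, seed=None):
    """Draw an SRS of `per_stratum` students from each grade (stratum).

    `roster` is a list of (student_id, grade) pairs. Hypothetical helper.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for student_id, grade in roster:
        strata[grade].append(student_id)  # divide the population into strata
    # One separate SRS inside every stratum.
    return {grade: rng.sample(members, per_stratum)
            for grade, members in sorted(strata.items())}

# Made-up roster: IDs 1..3000, grades 9-12 assigned in rotation (assumption).
roster = [(i, 9 + (i - 1) % 4) for i in range(1, 3001)]
picked = stratified_sample(roster, 25, seed=1)
print({grade: len(ids) for grade, ids in picked.items()})
```

Unlike an SRS, this design cannot produce a sample made up entirely of seniors, because each grade always contributes exactly 25 students.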

Stratified Advantages: more precise than SRS (less variability); cost reduced if strata already exist. Disadvantages: difficult if you must divide strata; need sampling frame.

Systematic Random Sample: Randomly choose where to begin, then follow a systematic process from there (every __th person). Suppose we want to do a systematic random sample of GBHS students. Number a list of students. There are approximately 3000 students, so if we want a sample of 100, we use 3000/100 = 30. Select a number between 1 and 30 at random. That student will be the first student chosen; then choose every 30th student from there.
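
The arithmetic in this example (3000/100 = 30, a random start between 1 and 30, then every 30th student) can be sketched as follows. The function name systematic_sample is an assumption for illustration, and the sketch assumes the population size is an exact multiple of the sample size, as it is here.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Random start in the first interval, then every `step`-th position.

    Hypothetical helper; assumes population_size is a multiple of sample_size.
    """
    rng = random.Random(seed)
    step = population_size // sample_size   # 3000 // 100 = 30
    start = rng.randint(1, step)            # random start from 1 to 30 inclusive
    return list(range(start, population_size + 1, step))[:sample_size]

positions = systematic_sample(3000, 100, seed=1)
print(positions[:5])
```

Only the starting position is random; once it is fixed, the rest of the sample is determined, which is why a cycle in the list with period 30 could mislead this design.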

Systematic Advantages: unbiased; don’t need sampling frame; ensures sample is spread across population; cheaper and more efficient. Disadvantages: large variance; can be tricked by trends or cycles.

Cluster Sample: Based on location. Randomly pick a location and sample everything there. Suppose we want to do a cluster sample of GBHS students. One way to do this would be to randomly select 10 classrooms during 2nd hour and sample all students in those rooms!
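
A minimal sketch of the classroom example: pick 10 rooms at random, then take every student in those rooms. The room layout is invented for illustration (100 rooms of 30 students each is an assumption), and cluster_sample is a hypothetical helper name.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep everyone inside them.

    `clusters` maps a room number to the list of students in it.
    Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    chosen_rooms = rng.sample(sorted(clusters), n_clusters)
    # No sampling inside a room: every student there is included.
    students = [s for room in chosen_rooms for s in clusters[room]]
    return chosen_rooms, students

# Made-up 2nd-hour layout: rooms 1..100 with 30 student IDs each (assumption).
classrooms = {room: list(range((room - 1) * 30 + 1, room * 30 + 1))
              for room in range(1, 101)}
rooms, students = cluster_sample(classrooms, 10, seed=1)
print(rooms, len(students))
```

Note that the randomness is applied to rooms, not to individual students, so only a list of rooms is needed rather than a full list of students.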

Cluster Advantages: unbiased; cheaper; don't need sampling frame. Disadvantages: clusters may not be representative of population.

Multistage Sample: Choose successively smaller groups, using an SRS at each stage. To use a multistage approach to sampling GBHS students, we could first divide 4th hour classes by level (AP, Honors, Regular, etc.) and randomly select 4 classes from each group. Then we could randomly select 5 students from each of those classes. The selection process is done in stages.
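
The staged selection can be sketched like this: within each level, draw 4 classes at random, then draw 5 students at random from each chosen class. The class lists are invented for illustration (three levels, 10 classes per level, 30 students per class are all assumptions), and multistage_sample is a hypothetical helper name.

```python
import random

def multistage_sample(classes_by_level, classes_per_level,
                      students_per_class, seed=None):
    """Stage 1: SRS of classes within each level.
    Stage 2: SRS of students within each chosen class.
    Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    sample = []
    for level in sorted(classes_by_level):
        classes = classes_by_level[level]
        for name in rng.sample(sorted(classes), classes_per_level):
            sample.extend(rng.sample(classes[name], students_per_class))
    return sample

# Made-up 4th-hour classes: labels look like "AP-3-s17" (assumption).
classes_by_level = {
    level: {f"{level}-{c}": [f"{level}-{c}-s{s}" for s in range(1, 31)]
            for c in range(1, 11)}
    for level in ("AP", "Honors", "Regular")
}
sample = multistage_sample(classes_by_level, 4, 5, seed=1)
print(len(sample), sample[:3])
```

With three levels this gives 3 x 4 x 5 = 60 students; the slide's "etc." suggests the real school has more levels than the three assumed here.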

Identify the sampling design 1) The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into groups of similar types (small public, small private, etc.). They then randomly selected 3 colleges from each group. Answer: stratified random sample.

Identify the sampling design 2) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all residents who live on those blocks. Answer: cluster sample.

Identify the sampling design 3) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 and 10. He then gives a survey to the customer with that number, and to every 10th customer after that, to fill out before they leave. Answer: systematic random sample.

Random Digit Table: Each entry is equally likely to be any digit. Digits are independent of each other. Numbers can be read across, vertically, or diagonally. The following is part of the random digit table found on pages 814-815 of your textbook: Row 1: 4 5 1 8 5 0 3 3 7 1 2 4 2 5 5 8 0 4 5 7 0 3 8 9 9 3 4 3 5 0 6 3

Suppose your population consists of these 20 people: 1) Aidan 2) Bob 3) Chico 4) Doug 5) Edward 6) Fred 7) Gloria 8) Hannah 9) Israel 10) Jung 11) Kathy 12) Lori 13) Matthew 14) Nan 15) Opus 16) Paul 17) Shawnie 18) Tracy 19) Uncle Sam 20) Vernon. Use the following random digits to select a sample of five from these people. Row 1: 4 5 1 8 0 5 1 3 7 1 2 0 1 5 5 8 0 1 5 7 0 3 8 9 9 3 4 3 5 0 6 3. We need to use double digit random numbers, ignoring any number greater than 20. Start with Row 1 and read across. Stop when five people are selected. So my sample would consist of: Aidan, Edward, Matthew, Opus, and Tracy.
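
The reading procedure can be sketched in Python. The digit string below is made up for illustration and is not taken from the textbook table or from the row on the slide; sample_from_digits is a hypothetical helper name. The sketch assumes two-digit labels, skips 00 and any pair larger than the population size, and also skips a label that has already been chosen.

```python
def sample_from_digits(digits, labels, sample_size):
    """Read a row of random digits two at a time and pick labeled people.

    Hypothetical helper; assumes at most 99 people so two digits suffice.
    """
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop the spaces
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2])
        # Ignore 00, numbers beyond the population, and repeats.
        if 1 <= number <= len(labels) and number not in chosen:
            chosen.append(number)
            if len(chosen) == sample_size:
                break  # stop when enough people are selected
    return [labels[n - 1] for n in chosen]

people = ["Aidan", "Bob", "Chico", "Doug", "Edward", "Fred", "Gloria",
          "Hannah", "Israel", "Jung", "Kathy", "Lori", "Matthew", "Nan",
          "Opus", "Paul", "Shawnie", "Tracy", "Uncle Sam", "Vernon"]

# Made-up digits (not the slide's Row 1): 96 07 34 12 07 55 03 20 81 15
made_up_row = "9 6 0 7 3 4 1 2 0 7 5 5 0 3 2 0 8 1 1 5"
print(sample_from_digits(made_up_row, people, 5))
# -> ['Gloria', 'Lori', 'Chico', 'Vernon', 'Opus']
```

In this made-up row, 96, 34, 55, and 81 are ignored as too large, and the second 07 is ignored as a repeat.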

Bias: Systematic error in measuring the statistic. Favors certain outcomes. Anything that causes the data to be wrong; it could be attributed to the researchers, the subjects, or the sampling method!

Sources of Bias: Many things can cause bias. We cannot do anything with biased data.

Voluntary Response Bias: People choose to respond. Usually people with very strong opinions respond. Examples: online polls, call-in voting shows like American Idol. The respondents select themselves to participate! Remember: Voluntary Response = Self-Selection.

Convenience Sampling / Convenience Bias: Only ask people who are easy to ask. Examples: stopping friendly-looking people in the mall to survey, or surveying the first 20 people that walk into a restaurant. These are convenient methods! The data obtained by a convenience sample will be biased; however, this method is often used for surveys and results reported in the news!

Undercoverage Bias: Some groups are left out of the sampling process. Suppose you randomly select names from the phone book. Which groups will not have the opportunity of being selected? People with unlisted phone numbers (usually high income families), people without phone numbers (usually low income families), and people with only cell phones (usually young adults).

Nonresponse Bias: Selected individuals can’t be contacted or refuse to cooperate. Telephone surveys: 70% nonresponse. Because of huge telemarketing efforts in the past few years, telephone surveys have a MAJOR problem with nonresponse! This is often confused with voluntary response. Here, people are chosen by the researchers but refuse to participate, so they are NOT self-selected! One way to help fix nonresponse bias is to make follow-up contact with people who are not home on the first try.

Response Bias: Behavior of the interviewer or respondent causes bias, leading to wrong answers. Suppose we want to survey high school students on drug abuse and we use a uniformed police officer to interview each student in our sample. Would we get honest answers? Response bias occurs when you get incorrect answers, whether the fault is the interviewer's or the respondent's.

Question Wording: Wording can influence answers, through the connotation of words or the use of “big” or technical words. Questions must be worded as neutrally as possible to avoid influencing the response. The level of vocabulary should be appropriate for the population you are surveying.

Source of Bias? 1) Before the presidential election of 1936 (Democrat FDR vs. Republican Alf Landon), the magazine Literary Digest predicted that Landon would win the election by a 3-to-2 margin, based on a survey of 2.8 million people drawn from magazine subscribers, car owners, telephone directories, etc. George Gallup surveyed only 50,000 people and predicted that Roosevelt would win. Answer: Undercoverage. The Digest’s survey probably came mostly from high-income families, who were more likely to be Republicans.

2) Suppose you want to estimate the total amount of money spent by students on textbooks each semester at MSU. You collect register receipts for students as they leave the bookstore during lunch one day. Answer: Convenience bias (an easy way to collect data) or undercoverage (students who buy books online are not included).

3) A radio host asks listeners to call in and answer the question, "Given how much schools have to spend on heating in the winter, do you think we should make winter break longer?" Answer: Question wording (encourages listeners to lean toward a particular opinion) or voluntary response (respondents will largely be those with strong opinions).