Collecting Data and Planning Observational Studies

1 Chapter 1: Collecting Data and Planning Observational Studies

2 Statistics - the science of collecting, analyzing, and drawing conclusions from data

3 How do we gather data?
Surveys, opinion polls, interviews
Observational studies: cross-sectional (present), retrospective (past), prospective (future)
Experiments

4 Definitions:
Observational study - observe outcomes without imposing any treatment
Experiment - actively impose some treatment in order to observe the response

5 Population - the entire group of individuals that we want information about

6 Census - a complete count of the population

7 How good is a census? Do frog fairy tale . . .
The answer is 83!

8 Why would we not use a census all the time?
Not accurate: look at the U.S. census. It has a huge amount of error in it, and it takes so long to compile that the data are obsolete by the time we get them!
Very expensive: since taking a census of any population takes time, censuses are VERY costly to do!
Perhaps impossible: suppose you wanted to know the average weight of the white-tail deer population in Texas. Would it be feasible to do a census?
Destructive sampling: if measuring an item destroys it, a census would destroy the population. Examples: breaking strength of soda bottles, lifetime of flashlight batteries, safety ratings for cars.

9 Sample - a part of the population that we actually examine in order to gather information. Use the sample to generalize to the population.

10 Sampling design - the method used to choose the sample from the population

11 Sampling frame - a list of every individual in the population

12 Jelly Blubber Activity
Select 10 Jelly Blubbers that you think are representative of the population of blubbers with regard to length. Find the mean length of your sample.

13 Judgmental Sample Samples that are selected on the basis of being “typical”. The person selecting the samples chooses items that he or she thinks are representative of the population. The validity of the results reflects the soundness of the collector’s judgement.

14 Simple Random Sample (SRS)
An SRS consists of n individuals from the population chosen in such a way that every individual has an equal chance of being selected and every set of n individuals has an equal chance of being selected.
Example: suppose we were to take an SRS of 100 MHS students. Put each student's name in a hat, then randomly select 100 names from the hat. Each student has the same chance to be selected, and every possible group of 100 students has the same chance to be selected. Therefore, it has to be possible for all 100 students to be seniors in order for it to be an SRS!
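The hat draw for an SRS can be sketched in Python. This is only an illustration, not part of the original slides: the roster of 2000 numbered student IDs and the function name simple_random_sample are assumptions made for the example. random.sample draws without replacement, so every possible group of 100 students is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n individuals without replacement; every set of n is equally likely."""
    rng = random.Random(seed)          # seed only makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical roster standing in for the names in the hat (assumed size: 2000).
students = [f"student_{i:04d}" for i in range(1, 2001)]

sample = simple_random_sample(students, 100, seed=1)
print(len(sample), len(set(sample)))   # 100 100 (no student is drawn twice)
```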

15 Stratified random sample
The population is divided into homogeneous groups called strata, and an SRS is pulled from each stratum. Homogeneous groups are groups that are alike based upon some characteristic of the group members.
Example: suppose we were to take a stratified random sample of 100 MHS students. Since students are already divided by grade level, the grade levels can be our strata. Then randomly select 50 seniors and randomly select 50 juniors.
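A stratified draw of 50 seniors and 50 juniors might look like the sketch below. The stratum sizes (500 students each), the ID strings, and the function name stratified_sample are assumptions for illustration only; the point is that a separate SRS is taken inside each stratum.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a separate simple random sample from each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum[name])
        for name, members in strata.items()
    }

# Hypothetical strata: grade level is the characteristic that defines each group.
strata = {
    "senior": [f"senior_{i}" for i in range(1, 501)],
    "junior": [f"junior_{i}" for i in range(1, 501)],
}

sample = stratified_sample(strata, {"senior": 50, "junior": 50}, seed=1)
print({grade: len(chosen) for grade, chosen in sample.items()})
```

Unlike an SRS of 100 students, this design cannot return 100 seniors: the split between strata is fixed in advance.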

16 Systematic random sample
Select the sample by following a systematic approach, but randomly select where to begin.
Example: suppose we want to do a systematic random sample of MHS students. Number a list of students. There are approximately 2000 students, so if we want a sample of 100, the interval is 2000/100 = 20. Select a number between 1 and 20 at random. That student will be the first student chosen; then choose every 20th student from there.
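The systematic procedure (interval 2000/100 = 20, random start between 1 and 20) can be sketched as follows. The numbered roster and the function name systematic_sample are assumptions for the example, and the roster size is taken to be exactly 2000 even though the slide says "approximately".

```python
import random

def systematic_sample(roster, n, seed=None):
    """Pick a random start within the first interval, then take every k-th entry."""
    rng = random.Random(seed)
    k = len(roster) // n            # sampling interval: 2000 // 100 = 20
    start = rng.randint(1, k)       # random start, 1 through k inclusive
    return roster[start - 1::k][:n] # [:n] guards against rosters not divisible by n

# Hypothetical numbered list of students (positions 1..2000).
roster = list(range(1, 2001))

sample = systematic_sample(roster, 100, seed=1)
print(sample[:3], len(sample))
```

Only the starting point is random; once it is chosen, the rest of the sample is determined.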

17 Cluster Sample
Based upon location: randomly pick a location and sample all there.
Example: suppose we want to do a cluster sample of MHS students. One way to do this would be to randomly select 10 classrooms during 2nd period. Sample all students in those rooms!
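A cluster sample of classrooms might be sketched like this. The 80 rooms of 25 students each and the function name cluster_sample are hypothetical; what matters is that the random choice is made among rooms, and then everyone in a chosen room is included.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then include every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [student for room in chosen for student in clusters[room]]
    return chosen, members

# Hypothetical 2nd-period classrooms: 80 rooms with 25 students each.
classrooms = {
    f"room_{r:02d}": [f"room_{r:02d}_student_{i}" for i in range(1, 26)]
    for r in range(1, 81)
}

rooms, students = cluster_sample(classrooms, 10, seed=1)
print(len(rooms), len(students))   # 10 250
```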

18 For the Jelly Blubber colony:
μ = 19.41

19 Convenience Sample
Consists only of the available people: it chooses the individuals easiest to reach. Ex. students who come to class early. This often leads to biased studies and is not recommended.

20 Identify the sampling design
1) The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into groups of similar types (small public, small private, etc.). Then they randomly selected 3 colleges from each group.
Answer: Stratified random sample

21 Identify the sampling design
2) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks.
Answer: Cluster sampling

22 Identify the sampling design
3) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 and 10. He then gives a survey to that customer, and to every 10th customer after them, to fill out before they leave.
Answer: Systematic random sampling

23 Random digit table
The following is part of the random digit table found on page 847 of your textbook: (table rows not reproduced in this transcript)
Each entry is equally likely to be any of the 10 digits. Digits are independent of each other.
Numbers can be read across, vertically, or diagonally.

24 Suppose your population consisted of these 20 people:
01) Aidan    06) Fred     11) Kathy     16) Paul
02) Bob      07) Gloria   12) Lori      17) Shawnie
03) Chico    08) Hannah   13) Matthew   18) Tracy
04) Doug     09) Israel   14) Nan       19) Uncle Sam
05) Edward   10) Jung     15) Opus      20) Vernon
Use the random digits from the table (not reproduced in this transcript) to select a sample of five from these people. We will need to use double-digit random numbers, ignoring any number greater than 20. Start with Row 1 and read across. Stop when five people are selected.
Selected, in the order read: 01) Aidan, 13) Matthew, 18) Tracy, 05) Edward, 15) Opus
So my sample would consist of: Aidan, Edward, Matthew, Opus, and Tracy
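The random-digit procedure (read two-digit numbers across, ignore anything greater than 20, stop at five people) can be sketched in Python. The digit string below is made up for illustration and is NOT the row from the textbook table, so it selects different people than the slide does. Skipping 00 and skipping repeated labels are assumptions that the slide does not state.

```python
def select_from_digits(digits, population_size, n):
    """Read two-digit labels left to right; skip out-of-range labels and repeats."""
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop spacing
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        # Assumption: 00 and already-chosen labels are ignored, like labels above 20.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

names = ["Aidan", "Bob", "Chico", "Doug", "Edward", "Fred", "Gloria", "Hannah",
         "Israel", "Jung", "Kathy", "Lori", "Matthew", "Nan", "Opus", "Paul",
         "Shawnie", "Tracy", "Uncle Sam", "Vernon"]

hypothetical_row = "07551 39820 13420 16618"   # invented digits, not from the textbook
labels = select_from_digits(hypothetical_row, len(names), 5)
print(labels)                                  # [7, 13, 20, 1, 18]
print([names[label - 1] for label in labels])  # ['Gloria', 'Matthew', 'Vernon', 'Aidan', 'Tracy']
```

In this made-up row the pairs read 07, 55, 13, 98, 20, 13, 42, 01, 66, 18, so 55, 98, 42, and 66 are ignored as too large and the second 13 is ignored as a repeat.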

25 Bias
A systematic error in measuring the estimate; it favors certain outcomes. Anything that causes the data to be wrong! It might be attributed to the researchers, the respondent, or the sampling method.

26 Sources of Bias
Things that can cause bias in your sample. You cannot do anything with bad data.

27 Voluntary response
People chose to respond. Usually only people with very strong opinions respond. An example would be the surveys in magazines that ask readers to mail in the survey. Other examples are call-in shows, American Idol, etc. The respondents select themselves to participate in the survey!
Remember: the way to identify voluntary response is self-selection!

28 Selection Bias
Some groups of the population are left out of the sampling process. Suppose you take a sample by randomly selecting names from the phone book: some groups will not have the opportunity of being selected!
People with unlisted phone numbers - usually high-income families
People without phone numbers - usually low-income families
People with ONLY cell phones - usually young adults

29 Nonresponse Bias
Occurs when an individual chosen for the sample can't be contacted or refuses to cooperate. People are chosen by the researchers, BUT refuse to participate. They are NOT self-selected! This is often confused with voluntary response.
Telephone surveys: 70% nonresponse. Because of huge telemarketing efforts in the past few years, telephone surveys have a MAJOR problem with nonresponse!
One way to help with the problem of nonresponse is to make follow-up contact with the people who are not home when you first contact them.

30 Response bias
Occurs when the behavior of the respondent or interviewer causes bias in the sample; that is, when for some reason (the interviewer's or the respondent's fault) you get incorrect answers.
Example: suppose we wanted to survey high school students on drug abuse and we used a uniformed police officer to interview each student in our sample. Would we get honest answers?

31 More Response Bias
Wording can influence the answers that are given.
Connotation of words: questions must be worded as neutrally as possible to avoid influencing the response.
Use of "big" words or technical words: the level of vocabulary should be appropriate for the population you are surveying. If surveying Podunk, TX, then you should avoid complex vocabulary. If surveying doctors, then use more complex, technical wording.

32 Source of Bias?
1) Before the presidential election of 1936, FDR against Republican Alf Landon, the magazine Literary Digest predicted that Landon would win the election in a 3-to-2 victory. George Gallup surveyed only 50,000 people and predicted that Roosevelt would win. The Digest's survey came from magazine subscribers, car owners, telephone directories, etc.
Selection Bias: since the Digest's survey came from car owners, etc., the people selected were mostly from high-income families and thus mostly Republican! (other answers are possible)

33 Source of Bias?
2) Suppose that you want to estimate the total amount of money spent by students on textbooks each semester at UVA. You collect register receipts for students as they leave the bookstore during lunch one day.
Convenience: an easy way to collect data, or Selection Bias: students who buy books from on-line bookstores are not included.

34 Source of Bias?
3) To find the average value of a home in Norfolk, one averages the price of homes that are listed for sale with a realtor.
Selection Bias: leaves out homes that are not for sale or homes that are listed with different realtors. (other answers are possible)

