
2 Section 1.2 Sampling from a Population

3 Sample versus Population A population includes all individuals or objects of interest. A sample is all the cases that we have collected data on (a subset of the population). Statistical inference is the process of using data from a sample to gain information about the population.

4 The Big Picture [Diagram: Sampling leads from the Population to the Sample; Statistical Inference leads from the Sample back to the Population.]

5 Dewey Defeats Truman?

6 The paper was published before the conclusion of the 1948 presidential election, and was based on the results of a large telephone poll which showed Dewey sweeping Truman. However, Harry S. Truman won the election. What went wrong?

7 Sampling Bias Sampling bias occurs when the method of selecting a sample causes the sample to differ from the population in some relevant way. If sampling bias exists, we cannot trust generalizations from the sample to the population.

8 Sampling [Diagram: Population, Sample] GOAL: Select a sample that is similar to the population, only smaller.

9 Can you avoid sampling bias? The next slide shows Lincoln’s Gettysburg Address. The entire population, all words in his address, will be shown to you. What is the average word length? Your task: Select a sample of 10 words that resembles the overall address. Write them down. Calculate the average number of letters for the words in your sample. Place a dot above your sample average on the board.

10 Lincoln’s Gettysburg Address “Four score and seven years ago our fathers brought forth, on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this. But, in a larger sense, we can not dedicate—we can not consecrate—we can not hallow—this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us—that from these honored dead we take increased devotion to that cause for which they here gave the last full measure of devotion—that we here highly resolve that these dead shall not have died in vain—that this nation, under God, shall have a new birth of freedom—and that government of the people, by the people, for the people, shall not perish from the earth.”

11 Can you avoid sampling bias? Actual average: 4.29 letters. People are TERRIBLE at selecting a good sample, even when explicitly trying to avoid sampling bias! We need a better way…

12 Random Sampling How can we make sure to avoid sampling bias? Take a RANDOM sample! Imagine putting the names of all the units of the population into a hat, and drawing out names at random to be in the sample. More often, we use technology.
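The "names in a hat" idea can be sketched with technology in a few lines of Python. This is only an illustration: the list of names and the helper `draw_from_hat` are made up for the example, and the standard library's `random.sample` plays the role of the hat.

```python
import random

# Hypothetical population: these names are invented for illustration.
population = ["Ana", "Ben", "Chloe", "Dev", "Elena",
              "Farid", "Grace", "Hiro", "Imani", "Jonas"]

def draw_from_hat(units, n, seed=None):
    """Return n distinct units chosen at random, like drawing names from a hat."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.sample(units, n)    # sampling without replacement

sample = draw_from_hat(population, 3, seed=1)
print(sample)
```

Because the draw is made without replacement, no name can appear twice, just as a slip of paper cannot be pulled from the hat twice.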

13 Random Sampling Before the 2008 election, the Gallup Poll took a random sample of 2,847 Americans. 52% of those sampled supported Obama. In the actual election, 53% voted for Obama. Random sampling is a very powerful tool!
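To get a feel for why a random sample of a few thousand people can land so close to the truth, here is a rough simulation. It is a sketch under simplifying assumptions, not a model of how Gallup actually polls: each simulated respondent independently supports the candidate with probability 0.53, and `simulate_poll` is a hypothetical helper.

```python
import random

def simulate_poll(true_support, n, seed=None):
    """Simulate asking n randomly chosen voters; return the sample proportion.

    Assumption: the population is so large that each respondent can be
    treated as an independent draw with probability true_support.
    """
    rng = random.Random(seed)
    supporters = sum(rng.random() < true_support for _ in range(n))
    return supporters / n

# 53% is the election result quoted on the slide; 2,847 is the sample size.
p_hat = simulate_poll(0.53, 2847, seed=2008)
print(f"Simulated sample support: {p_hat:.1%}")
```

Rerunning with different seeds gives slightly different percentages, but with this sample size they typically stay within a couple of percentage points of 53%.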

14 “Random” Numbers
1. Pick 10 “random” numbers between 1 and 268. Write these numbers down. (Note: When choosing a real sample, you should use technology to generate random numbers. This is simply for illustrative purposes in class.)
2. Using the next slide, calculate the average number of letters in the words corresponding to your random numbers.
3. Place a dot below this average on the board.

15 Numbered words of the Gettysburg Address (words 1 to 268, reading down each column):
  1 Four          35 in            69 dedicate     103 But,         137 add          171 here         205 these        239 that
  2 score         36 a             70 a            104 in           138 or           172 to           206 honored      240 this
  3 and           37 great         71 portion      105 a            139 detract.     173 the          207 dead         241 nation,
  4 seven         38 civil         72 of           106 larger       140 The          174 unfinished   208 we           242 under
  5 years         39 war,          73 that         107 sense,       141 world        175 work         209 take         243 God,
  6 ago,          40 testing       74 field        108 we           142 will         176 which        210 increased    244 shall
  7 our           41 whether       75 as           109 cannot       143 little       177 they         211 devotion     245 have
  8 fathers       42 that          76 a            110 dedicate,    144 note,        178 who          212 to           246 a
  9 brought       43 nation,       77 final        111 we           145 nor          179 fought       213 that         247 new
 10 forth         44 or            78 resting      112 cannot       146 long         180 here         214 cause        248 birth
 11 upon          45 any           79 place        113 consecrate,  147 remember,    181 have         215 for          249 of
 12 this          46 nation        80 for          114 we           148 what         182 thus         216 which        250 freedom,
 13 continent     47 so            81 those        115 cannot       149 we           183 far          217 they         251 and
 14 a             48 conceived     82 who          116 hallow       150 say          184 so           218 gave         252 that
 15 new           49 and           83 here         117 this         151 here,        185 nobly        219 the          253 government
 16 nation:       50 so            84 gave         118 ground.      152 but          186 advanced.    220 last         254 of
 17 conceived     51 dedicated,    85 their        119 The          153 it           187 It           221 full         255 the
 18 in            52 can           86 lives        120 brave        154 can          188 is           222 measure      256 people,
 19 liberty,      53 long          87 that         121 men,         155 never        189 rather       223 of           257 by
 20 and           54 endure.       88 that         122 living       156 forget       190 for          224 devotion,    258 the
 21 dedicated     55 We            89 nation       123 and          157 what         191 us           225 that         259 people,
 22 to            56 are           90 might        124 dead,        158 they         192 to           226 we           260 for
 23 the           57 met           91 live.        125 who          159 did          193 be           227 here         261 the
 24 proposition   58 on            92 It           126 struggled    160 here.        194 here         228 highly       262 people,
 25 that          59 a             93 is           127 here         161 It           195 dedicated    229 resolve      263 shall
 26 all           60 great         94 altogether   128 have         162 is           196 to           230 that         264 not
 27 men           61 battlefield   95 fitting      129 consecrated  163 for          197 the          231 these        265 perish
 28 are           62 of            96 and          130 it,          164 us           198 great        232 dead         266 from
 29 created       63 that          97 proper       131 far          165 the          199 task         233 shall        267 the
 30 equal.        64 war.          98 that         132 above        166 living,      200 remaining    234 not          268 earth.
 31 Now           65 We            99 we           133 our          167 rather,      201 before       235 have
 32 we            66 have         100 should       134 poor         168 to           202 us,          236 died
 33 are           67 come         101 do           135 power        169 be           203 that         237 in
 34 engaged       68 to           102 this.        136 to           170 dedicated    204 from         238 vain,
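The exercise of picking random numbers and averaging the matching word lengths might look like this in Python. To keep the sketch short it uses only words 1 to 30 of the numbered list (the first sentence) as a stand-in population; with the full list the numbers would be drawn from 1 to 268 instead. The helper names are hypothetical.

```python
import random

# Words 1-30 from the numbered list, used here as a small stand-in population.
WORDS = ["Four", "score", "and", "seven", "years", "ago,", "our", "fathers",
         "brought", "forth", "upon", "this", "continent", "a", "new",
         "nation:", "conceived", "in", "liberty,", "and", "dedicated", "to",
         "the", "proposition", "that", "all", "men", "are", "created", "equal."]

def letters(word):
    """Count letters only, so punctuation such as ',' or '.' is ignored."""
    return sum(ch.isalpha() for ch in word)

def average_length_of_random_sample(words, n, seed=None):
    """Pick n distinct word numbers at random and average the word lengths."""
    rng = random.Random(seed)
    numbers = rng.sample(range(1, len(words) + 1), n)   # 1-based word numbers
    lengths = [letters(words[k - 1]) for k in numbers]
    return numbers, sum(lengths) / n

numbers, avg = average_length_of_random_sample(WORDS, 10, seed=0)
print("Word numbers drawn:", sorted(numbers))
print("Average letters per word in the sample:", avg)
```

Letting the computer choose the numbers removes the human tendency to favor some numbers, or some words, over others.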

16 Lincoln’s Gettysburg Address

17 Random vs Non-Random Sampling Random samples have averages that are centered around the correct number. Non-random samples may suffer from sampling bias, and averages may not be centered around the correct number. Only random samples can truly be trusted when making generalizations to the population!
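A small simulation can make the "centered around the correct number" claim concrete. Everything here is an assumption made for illustration: the population is a synthetic list of 268 word lengths (not the real address), and the non-random method is modeled as picking words with probability proportional to their length, a guess at how longer words catch the eye.

```python
import random
from statistics import mean

rng = random.Random(42)

# Synthetic population of 268 word lengths (hypothetical, not the real address).
population = [rng.choice([1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 9, 10, 11])
              for _ in range(268)]
true_mean = mean(population)

def random_sample_mean(n):
    """Average of a simple random sample of n word lengths."""
    return mean(rng.sample(population, n))

def eye_catching_sample_mean(n):
    """Average when longer words are more likely to be picked.

    Assumed bias model: selection chance proportional to word length.
    Drawn with replacement to keep the sketch simple.
    """
    return mean(rng.choices(population, weights=population, k=n))

random_means = [random_sample_mean(10) for _ in range(2000)]
biased_means = [eye_catching_sample_mean(10) for _ in range(2000)]

print("True average:                  ", round(true_mean, 2))
print("Center of random-sample means: ", round(mean(random_means), 2))
print("Center of biased-sample means: ", round(mean(biased_means), 2))
```

Individual random samples still vary, but their averages cluster around the true value, while the length-biased samples are centered too high.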

18 Bowl of Soup Analogy Think of tasting a bowl of soup… Population = entire bowl of soup. Sample = whatever is in your tasting bites. If you take bites non-randomly from the soup (if you stab with a fork, or prefer noodles to vegetables), you may not get a very accurate representation of the soup. If you take bites at random, even a few bites can give you a very good idea of the overall taste of the soup.

19 Simple Random Sample In a simple random sample, each unit of the population has the same chance of being selected, regardless of the other units chosen for the sample. More complicated random sampling schemes exist, but will not be covered in this course.
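The "same chance of being selected" property can be checked empirically. The sketch below repeatedly draws a sample of 5 from a population of 20 numbered units and records how often each unit is included; the sizes and the helper name `inclusion_frequencies` are arbitrary choices for the example. Each unit should show up in about 5/20 = 25% of the samples.

```python
import random
from collections import Counter

def inclusion_frequencies(population_size, sample_size, repetitions, seed=None):
    """Fraction of repeated random samples in which each unit appears."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(repetitions):
        # One sample: sample_size distinct units drawn without replacement.
        counts.update(rng.sample(range(population_size), sample_size))
    return [counts[unit] / repetitions for unit in range(population_size)]

freqs = inclusion_frequencies(20, 5, 20000, seed=7)
print("Lowest inclusion frequency: ", min(freqs))
print("Highest inclusion frequency:", max(freqs))
```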

20 Realities of Sampling While a random sample is ideal, often it isn’t feasible. A list of the entire population may not be available, or it may be impossible or too difficult to contact all members of the population. In practice, think hard about potential sources of sampling bias, and try your best to avoid them.

21 Non-Random Samples Suppose you want to estimate the average number of hours that students spend studying each week. Which of the following is the best method of sampling?
(a) Go to the library and ask all the students there how much they study.
(b) Email all students asking how much they study, and use all the data you get.
(c) Give a clicker question in this class and force every student to respond.
(d) Stand outside the student center and ask everyone going in how much they study.

22 Bad Methods of Sampling Letting your sample consist of whoever chooses to participate (volunteer bias). People who choose to participate or respond are probably not representative of the entire population. Emailing or mailing the entire population, and then drawing conclusions about the population based on whoever chooses to respond. Example: An airline emails all of its customers asking them to rate their satisfaction with their recent travel.
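The airline example can be turned into a toy simulation of volunteer bias. All numbers below are invented for illustration: satisfaction ratings from 1 to 5 are assigned uniformly to 10,000 hypothetical customers, and the response probabilities assume that very unhappy (and, to a lesser degree, very happy) customers are the most likely to answer the email.

```python
import random
from statistics import mean

rng = random.Random(3)

# Hypothetical population: a satisfaction rating (1-5) for each customer.
ratings = [rng.choice([1, 2, 3, 4, 5]) for _ in range(10000)]

# Assumed chance of answering the email, by rating (made-up values).
RESPONSE_PROB = {1: 0.50, 2: 0.30, 3: 0.10, 4: 0.10, 5: 0.20}

# Volunteer sample: whoever chooses to respond.
respondents = [r for r in ratings if rng.random() < RESPONSE_PROB[r]]

# Random sample of the same size, for comparison.
random_sample = rng.sample(ratings, len(respondents))

print("Population average rating:   ", round(mean(ratings), 2))
print("Average among respondents:   ", round(mean(respondents), 2))
print("Average in the random sample:", round(mean(random_sample), 2))
```

Under these assumptions the respondents look noticeably less satisfied than the customer base as a whole, while the random sample of the same size tracks the population average closely.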

23 Alcohol, Marijuana, and Driving The Federal Office of Road Safety in Australia conducted a study on the effects of alcohol and marijuana on performance. Volunteers who responded to advertisements for the study on rock radio stations were given a random combination of the two drugs, then their performance was observed. What is the sample? What is the population? Is there sampling bias? Will the results be informative, and do you think the study is worth conducting? Source: Chesher, G., Dauncey, H., Crawford, J. and Horn, K., “The Interaction between Alcohol and Marijuana: A Dose Dependent Study on the Effects of Human Moods and Performance Skills,” Report No. C40, Federal Office of Road Safety, Federal Department of Transport, Australia, 1986.

24 Data Collection and Bias [Diagram: Population, Sample, DATA. Sampling bias? Other forms of bias?]

25 Other Forms of Bias Even with a random sample, data can still be biased, especially when collected on humans. Other forms of bias to watch out for in data collection: question wording; context; inaccurate responses; and many other possibilities (examine the specifics of each study!).

26 Question Wording “Do you think the US should allow public speeches against democracy?” 21% said speeches should be allowed. “Do you think the US should not forbid public speeches against democracy?” 39% said speeches should not be forbidden. Source: Rugg, D. (1941). “Experiments in wording questions,” Public Opinion Quarterly, 5, 91-92.

27 Question Wording A random sample was asked: “Should there be a tax cut, or should money be used to fund new government programs?” Tax cut: 60%; programs: 40%. A different random sample was asked: “Should there be a tax cut, or should money be spent on programs for education, the environment, health care, crime-fighting, and military defense?” Tax cut: 22%; programs: 78%.

28 Context Ann Landers’ column asked readers, “If you had it to do over again, would you have children?” The first request for data contained a letter from a young couple which listed worries about parenting and various reasons not to have kids: 30% said “yes.” The second request for data was in response to this number, in which Ann wrote how she was “stunned, disturbed, and just plain flummoxed”: 95% said “yes.”

29 Inaccurate Responses In a study on US students, 93% of the sample said they were in the top half of the sample regarding driving skill. Source: Svenson, O. (February 1981). "Are we all less risky and more skillful than our fellow drivers?" Acta Psychologica 47 (2): 143–148. From a random sample of all US college students, 22.7% reported using illicit drugs. Do you think this number is accurate? Source: Substance Abuse and Mental Health Services Administration (2010). “Results from the 2009 National Survey on Drug Use and Health: Volume 1.” Summary of National Findings (Office of Applied Studies, NSDUH Series H-38A, HHS Publication No. SMA 10-4856Findings). Rockville, MD, https://nsduhweb.rti.org/

30 Summary Always think critically about how the data were collected, and recognize that not all forms of data collection lead to valid inferences. This is the easiest way to instantly become a more statistically literate individual!

