Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Collecting Data: Sampling SECTION 1.2 Sample versus Population Statistical.

Slides:



Advertisements
Similar presentations
STAT Section 5 Lecture 3 Professor Hao Wang University of South Carolina Spring 2012.
Advertisements

Chapter 3 Generalization: How broadly do the results apply?
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan Collecting Data: Sampling SECTION 1.2 Sample versus Population Statistical.
Data Collection: Sampling
Chapter 7: Data for Decisions Lesson Plan
© 2012 W.H. Freeman and Company Lecture 7 – Sept Sampling designs We have a population we want to study. It is impractical to collect data on the.
Drawing Samples in “Observational Studies” Sample vs. the Population How to Draw a Random Sample What Determines the “Margin of Error” of a Poll?
Chapter 3 Producing Data 1. During most of this semester we go about statistics as if we already have data to work with. This is okay, but a little misleading.
Describing Data: One Quantitative Variable
Today’s Agenda Review Homework #1 [not posted]
Chapter 12 Sample Surveys. At the end of this chapter, you should be able to Identify populations, samples, parameters and statistics for a given problem.
Stat 512 – Lecture 13 Chi-Square Analysis (Ch. 8).
Chapter 12 Sample Surveys
7-3F Unbiased and Biased Samples
Copyright ©2005 Brooks/Cole, a division of Thomson Learning, Inc. How to Get a Good Sample Chapter 4.
The number of minutes each of 26 students in a class spent to complete an obstacle course is shown below. 5,2,5,5,8,12,6,7,5,5,6,5,5,5,6,10,7,5,5,7,5,7,5,7,6,6.
PRODUCING DATA. A look at your class The class survey The class survey.
Do Now If you were to take a poll of V.C. students, what do you think would be the overall opinion of: School Lunch (Does it need improvement, why/why.
Quantitative vs. Categorical Data
Section 1.2 Sampling from a Population Sample versus Population A population includes all individuals or objects of interest. A sample is all the cases.
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 8/30/12 Collecting Data: Sampling SECTION 1.2 Sample versus Population Statistical.
4.2 Statistics Notes What are Good Ways and Bad Ways to Sample?
Where Do Data Come From? ● Conceptualization and operationalization of concepts --> measurement strategy --> data. ● Different strategies --> different.
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Multiple Regression SECTIONS 10.1, 10.3 (?) Multiple explanatory variables.
THE WHO AND HOW. Opinion Polling. Who does polling? News organizations like CNN, Fox News, ABC, and NBC. Polling organizations like Rasmussen, Gallup,
Sampling 12/4/2012. Readings Chapter 8 Correlation and Linear Regression (Pollock) (pp ) Chapter 6 Foundations of Statistical Inference (Pollock)
Psychological Methods Original Content Copyright by HOLT McDougal. Additions and changes to the original content are the responsibility of the instructor.
Statistics: Unlocking the Power of Data Lock 5 Afternoon Session Using Lock5 Statistics: Unlocking the Power of Data Patti Frazer Lock University of Kentucky.
 Sampling Design Unit 5. Do frog fairy tale p.89 Do frog fairy tale p.89.
Estimation: Sampling Distribution
Political Science 30: Political Inquiry Drawing a Good Sample.
Sample Surveys.  The first idea is to draw a sample. ◦ We’d like to know about an entire population of individuals, but examining all of them is usually.
Introduction to Sampling “If you don’t believe in sampling, the next time you have a blood test tell the doctor to take it all.”
Chapter 12 Designing Good Samples. Doubting the Holocaust? An opinion poll conducted in 1992 for the American Jewish Committee asked: Does it seem possible.
Surveys and Questionnaires Government agencies, news organizations, and marketing companies often conduct surveys. The results can be factual or subjective.
Section 1.2 ~ Sampling Introduction to Probability and Statistics Ms. Young.
CONFIDENCE INTERVALS Feb. 18 th, A STATS PROFESSOR ASKED HER STUDENTS WHETHER OR NOT THEY WERE REGISTERED TO VOTE. IN A SAMPLE OF 50 OF HER STUDENTS.
Statistical Inference: Making conclusions about the population from sample data.
Sampling is the other method of getting data, along with experimentation. It involves looking at a sample from a population with the hope of making inferences.
Chapter 7: Data for Decisions Lesson Plan Sampling Bad Sampling Methods Simple Random Samples Cautions About Sample Surveys Experiments Thinking About.
Chapter 12 Sample Surveys
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Cautions STAT 250 Dr. Kari Lock Morgan SECTION 4.3, 4.5 Type I and II errors (4.3) Statistical.
Chapter 2 Lesson 2.2a Collecting Data Sensibly 2.2: Sampling.
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Collecting Data: Randomized Experiments SECTION 1.3 Control/comparison group.
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Collecting Data: Observational Studies SECTION 1.3 Association versus Causation.
Public Opinion Public Opinion is the aggregation of people’ view about issues, situations, and public figures. Why is Public Opinion important? Public.
STT 421 Day 7: September 28, 2015 September 28, 2015
Random Samples 12/5/2013. Readings Chapter 6 Foundations of Statistical Inference (Pollock) (pp )
Chapter 7 Sampling Distributions Target Goal: DISTINGUISH between a parameter and a statistic. DEFINE sampling distribution. DETERMINE whether a statistic.
Chapter 19 Confidence intervals for proportions
Chapter 21 Samples, Good and Bad. Chapter 22 Thought Question 1 Popular magazines often contain surveys that ask their readers to answer questions about.
I can identify the difference between the population and a sample I can name and describe sampling designs I can name and describe types of bias I can.
METHODS OF OBSERVATION. SURVEY METHOD Gathering information by asking directly Series of questions on a particular subject ex. voting preferences, shopping,
Organization of statistical investigation. Medical Statistics Commonly the word statistics means the arranging of data into charts, tables, and graphs.
Statistics 100 Lecture Set 2. Lecture Set 2 Chapter 2 … please read Will be doing chapter 3 in the next lecture set Some suggested problems: –Chapter.
Chapter 3 Surveys and Sampling © 2010 Pearson Education 1.
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Multiple Regression SECTIONS 10.1, 10.3 Multiple explanatory variables (10.1,
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Collecting Data: Sampling SECTION 1.2 Sample versus Population Statistical.
Copyright © 2009 Pearson Education, Inc. Chapter 12 Sample Surveys.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 1 Statistics: The Art and Science of Learning from Data Section 1.1 Using Data to Answer.
Introduction Sample surveys involve chance error. Here we will study how to find the likely size of the chance error in a percentage, for simple random.
Ten percent of U. S. households contain 5 or more people
During routine screening, a doctor notices that 22% of her adult patients show higher than normal levels of glucose in their blood – a possible warning.
Collecting Data: Sampling
Section 5.1 Designing Samples
Confidence Intervals: Sampling Distribution
Bias On-Level Statistics.
Inference for Sampling
How do I study different sampling methods for collecting data?
Chapter 1 Statistics: The Art and Science of Learning from Data
Presentation transcript:

Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Collecting Data: Sampling SECTION 1.2 Sample versus Population Statistical Inference Sampling Bias Simple Random Sample Other Sources of Bias

Statistics: Unlocking the Power of Data Lock 5 Sample versus Population A population includes all individuals or objects of interest. A sample is all the cases that we have collected data on (a subset of the population). Statistical inference is the process of using data from a sample to gain information about the population.

Statistics: Unlocking the Power of Data Lock 5 The Big Picture Population Sample Sampling Statistical Inference

Statistics: Unlocking the Power of Data Lock 5 Medical School? Do you want to go to medical school? a) Yes b) No c) Maybe

Statistics: Unlocking the Power of Data Lock 5 Medical School What is the sample? Suppose Penn State is interested in the proportion of it’s students who want to go to medical school. What is the population? Can our sample data be generalized to make inferences about the population? Why or why not?

Statistics: Unlocking the Power of Data Lock 5 Dewey Defeats Truman? The paper was published before the conclusion of the 1948 presidential election, and was based on the results of a large telephone poll Harry S. Truman won the election What went wrong?

Statistics: Unlocking the Power of Data Lock 5 Sampling Bias Sampling bias occurs when the method of selecting a sample causes the sample to differ from the population in some relevant way. If sampling bias exists, we cannot trust generalizations from the sample to the population

Statistics: Unlocking the Power of Data Lock 5 Scope of Inference Inference can only extend to a population that “looks like” the sample Examples:  Drug testing on animals  Agricultural experiment in a certain location  23andme genetic data  Tracking disease by those who visit doctors

Statistics: Unlocking the Power of Data Lock 5 How does this apply to you? Come up with an example of how sampling bias might arise in your field or where the sample clearly differs from the population:

Statistics: Unlocking the Power of Data Lock 5 Sampling Population Sample ✓ New population ?

Statistics: Unlocking the Power of Data Lock 5 Random Sampling How do we get a sample that looks like the population??? A random sample will resemble the population! Random sampling avoids sampling bias! Take a RANDOM sample!

Statistics: Unlocking the Power of Data Lock 5 Random Sampling Right before the 2012 election, the Gallup Poll took a random sample of 2,551 likely voters. 49% of those sampled supported Obama, 48% supported Romney. In the actual election, 50.4% voted for Obama, and 48.1% voted for Romney. Random sampling is a very powerful tool!!!

Statistics: Unlocking the Power of Data Lock 5 Bowl of Soup Analogy Think of tasting a bowl of soup… Population = entire bowl of soup Sample = whatever is in your tasting bites If you take bites non-randomly from the soup (if you stab with a fork, or prefer noodles to vegetables), you may not get a very accurate representation of the soup If you take bites at random, only a few bites can give you a very good idea for the overall taste of the soup

Statistics: Unlocking the Power of Data Lock 5 Simple Random Sample In a simple random sample, each unit of the population has the same chance of being selected, regardless of the other units chosen for the sample Like drawing names out of a hat More complicated random sampling schemes exist, but will not be covered in this course

Statistics: Unlocking the Power of Data Lock 5 Random Sample Examples Random sample of Americans to infer the proportion of Americans who are obese Random sample of plants in a field to infer average yield per plant in that field Random sample of patients to measure satisfaction Random sample of square meters in a forest to measure tree blight Note: Won’t get answer exactly (how close you get depends on sample size), but will avoid bias!

Statistics: Unlocking the Power of Data Lock 5 How does this apply to you? Come up with a situation where random sampling could be feasible and help to answer a question in your field:

Statistics: Unlocking the Power of Data Lock 5 Reality Unfortunately, obtaining random samples are often not possible  Clinical trials – can’t force someone to participate  Random sample of everyone with a particular disease or trait?  Random sample of all humans?  Random sample of any species of animal or plant? May have to alter the population to be feasible In reality, just watch out for ways in which the sample may differ from the population

Statistics: Unlocking the Power of Data Lock 5 DATA Data Collection and Bias Population Sample Sampling Bias? Other forms of bias?

Statistics: Unlocking the Power of Data Lock 5 Other Forms of Bias Other forms of bias to watch out for in data collection:  Inaccurate responses (self-reporting, measurement error)  Influential question wording  Context  Many other possibilities – examine the specifics of each study!

Statistics: Unlocking the Power of Data Lock 5 Self-Reported Values NHANES (random sample!) got self-reported height and weight and measured height and weight  Men over reported their height by 0.48 in and their weight by 0.66 lbs  Women over reported their height by 0.27 in and under reported their weight by 3.06 lbs Merrill, R.M. and Richardson, J.S. (2009). Validity of Self-Reported Height, Weight, and Body Mass Index: Findings From the National Health and Nutrition Examination Survey, 2001 – Preventing Chronic Disease: Public Health Research, Practice, and Policy. 6(4).Validity of Self-Reported Height, Weight, and Body Mass Index: Findings From the National Health and Nutrition Examination Survey

Statistics: Unlocking the Power of Data Lock 5 Inaccurate Responses In a study on US students, 93% of the sample said they were in the top half of the sample regarding driving skill Svenson, O. (February 1981). "Are we all less risky and more skillful than our fellow drivers?" Acta Psychologica 47 (2): 143–148.Are we all less risky and more skillful than our fellow drivers? From random sample of all US college students, 22.7% reported using illicit drugs. Do you think this number is accurate? Substance Abuse and Mental Health Services Administration (2010). “Results from the 2009 National Survey on Drug Use and Health: Volume 1.” Summary of National Findings (Office of Applied Studies, NSDUH Series H-38A, HHS Publication No. SMA Findings). Rockville, MD, heeps://nsduhweb.rti.org/Results from the 2009 National Survey on Drug Use and Health: Volume 1

Statistics: Unlocking the Power of Data Lock 5 Question Wording A random sample was asked: “Should there be a tax cut, or should money be used to fund new government programs?” A different random sample was asked: “Should there be a tax cut, or should money be spent on programs for education, the environment, health care, crime-fighting, and military defense?”

Statistics: Unlocking the Power of Data Lock 5 Context Ann Landers column asked readers “If you had it to do over again, would you have children? The first request for data contained a letter from a young couple which listed worries about parenting and various reasons not to have kids The second request for data was in response to this number, in which Ann wrote how she was “stunned, disturbed, and just plain flummoxed”

Statistics: Unlocking the Power of Data Lock 5 Having Children If we were to run the question all by itself in the newspaper with a request for responses, could we trust the results? (a) Yes (b) No

Statistics: Unlocking the Power of Data Lock 5 Having Children Newsday conducted a random sample of all US adults, and asked them the same question, without any additional leading material Do you think the true proportion of US parents who are happy they had children is close to 91%? a) Yes b) No

Statistics: Unlocking the Power of Data Lock 5 Summary Always think critically about how the data were collected, and recognize that not all forms of data collection lead to valid inferences Random sampling is a powerful way to avoid sampling bias

Statistics: Unlocking the Power of Data Lock 5 To Do Read Section 1.2 Due TODAY: Pretests and intro survey  Pretest 1  Pretest 2  Survey Due Friday, 1/23: Sections 1.1, 1.2 HW If you haven’t already…  Get the textbook with WileyPlus  Get a clicker and register it on ANGEL by 1/23