Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research.


Class Information Handouts With: Blackboard Info Student Survey (please fill out & return after class)

Class Information Go to Blackboard (for class details): Website: Log-in with Onyen Choose this course Control Panel > Content Areas Course Information Choose Item “Course Information”

Relationship to Textbook Ordering of material in textbook is usual But I don’t like it (poorly motivated) So will change the order of the material (for better motivation) Will jump around a lot through the text

Reading In Textbook Approximate Reading for Today’s Material: Pages 1-5, , Approximate Reading for Next Class: Pages

What is Statistics? Definition 1: Gaining Insight from Numbers (similar to text’s definition) Definition 2: The Science of Managing Uncertainty

What is Statistics? Subtopics: Gathering the Numbers –E.g. Statistician at a ball game –Will see: how this is done is critical Forming Conclusions –Will use math, etc. –Major focus of this course

Key Themes I.Uncertainty II.Variability (will get quantitative about these) Favorite Quote: “I was never good at math, but statistics is easy, since it is just common sense”

Motivating Examples 1.Political Polls –Try to predict outcome of election –Too expensive to ask everyone –So ask some (hope they are “representative”) 2.Measurement Error –No measurement is exact –Can improve by multiple measurements –How to model? Lessons of these are broadly applicable

Common Structure For both, find out about truth from a sample E.g. 1: truth = % for Cand. in population, sample = % for Cand. in sample E.g. 2: truth = true size, sample = observed measurement

Motivating Examples 1.Political Polls 2.Measurement Error Will study each using mathematical models Do E.g. 1 first, since easier Appropriate Models?

Political Polls Appropriate Mathematical Models? Depends on how data are gathered. See Text, pages Seems easy??? “Just choose some”??? Take a look at history…

How to sample? History of Presidential Election Polls During Campaigns, constantly hear in news “polls say …” How good are these? Why? 1936 Landon vs. Roosevelt Literary Digest Poll: 43% for R Result: 62% for R What happened? Sample size not big enough? 2.4 million Biggest Poll ever done (before or since)

Bias in Sampling Bias: Systematically favoring one outcome (need to think carefully) Selection Bias: Addresses from L. D. readers, phone books, club memberships (representative of population?) Non-Response Bias: Return-mail survey (who had time?)
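A minimal simulation sketch of why a huge sample does not cure bias. The response rates below are hypothetical numbers chosen only for illustration (they are not estimates of what happened in 1936), and the function name `simulate_mail_poll` is made up for this example. Supporters of candidate R are assumed to return the mail survey less often than opponents, so the poll lands well below the true share no matter how many surveys are mailed.

```python
import random


def simulate_mail_poll(true_share, n_mailed, rate_for, rate_against, seed=0):
    """Return the share for the candidate among people who mail the survey back.

    true_share   : true fraction of the population supporting the candidate
    rate_for     : HYPOTHETICAL chance that a supporter responds
    rate_against : HYPOTHETICAL chance that a non-supporter responds
    """
    rng = random.Random(seed)
    responses_for = 0
    responses_total = 0
    for _ in range(n_mailed):
        supports = rng.random() < true_share
        rate = rate_for if supports else rate_against
        if rng.random() < rate:          # this person returns the survey
            responses_total += 1
            responses_for += 1 if supports else 0
    return responses_for / responses_total


if __name__ == "__main__":
    # True share 62%, but supporters respond half as often as opponents.
    estimate = simulate_mail_poll(0.62, 200_000, rate_for=0.2, rate_against=0.4)
    print(f"poll estimate: {estimate:.3f}  (true share: 0.620)")
```

Under these assumed rates the expected poll share is 0.62 x 0.2 / (0.62 x 0.2 + 0.38 x 0.4), about 45%, and mailing more surveys only makes the poll more precisely wrong.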

How to sample? 1936 Presidential Election (cont.) Interesting Alternative Poll: Gallup: 56% for R (sample size ~ 50,000) Gallup's prediction of the L. D. result: 44% for R (sample size ~ 3,000) Gallup predicted both the correct winner (actual result: 62% for R) and the L. D. error (L. D. poll: 43% for R)! (how was improvement done?)

Improved Sampling Gallup’s Improvements: (i)Personal Interviews (attacks non-response bias) (ii)Quota Sampling (attacks selection bias)

Quota Sampling Idea: make “sample like population” So surveyor chooses people to give: i.Right % male ii.Right % “young” iii.Right % “blue collar” iv.… This worked fairly well (~5% error), until …

How to sample? 1948
            Dewey   Truman   sample size
Crossley    50%     45%
Gallup      50%     44%      ~50,000
Roper       53%     38%      ~15,000
Actual      45%     50%      -
Note: Embarrassing for polls, famous photo of Truman + Headline “Dewey Defeats Truman”

What went wrong? Problem: Unintentional Bias (surveyors understood bias, but still made choices) Lesson: Human Choice cannot give a Representative Sample Surprising Improvement: Random Sampling Now called “scientific sampling” Random = Scientific???

Random Sampling Key Idea: “random error” is smaller than “unintentional bias”, for large enough sample sizes How large? Current sample sizes: ~1, ,000 Note: now << 50,000 used in So surveys are much cheaper (thus many more done now….)

Random Sampling How Accurate? Can (& will) calculate using “probability” Justifies term “scientific sampling” 2nd improvement over quota sampling
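A rough sketch of what "calculate using probability" can look like. The slides have not yet given a formula, so the standard-error expression sqrt(p(1-p)/n) used here is brought in as an assumption from standard probability, and the function names are made up for this example. The simulation draws many random polls of size n and reports how far a typical poll lands from the truth.

```python
import math
import random


def poll_once(true_share, n, rng):
    """Share for the candidate in one simple random sample of size n."""
    hits = sum(1 for _ in range(n) if rng.random() < true_share)
    return hits / n


def typical_error(true_share, n, repeats=200, seed=1):
    """Average absolute miss over many simulated polls of size n."""
    rng = random.Random(seed)
    errors = [abs(poll_once(true_share, n, rng) - true_share)
              for _ in range(repeats)]
    return sum(errors) / repeats


def standard_error(true_share, n):
    """Textbook standard error of a sample proportion (an added assumption)."""
    return math.sqrt(true_share * (1 - true_share) / n)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, round(typical_error(0.5, n), 4), round(standard_error(0.5, n), 4))
```

The random error shrinks as the sample grows (roughly halving each time n is quadrupled), whereas an unintentional bias stays the same size however large the sample is.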

Random Sampling What is random? Simple Random Sampling: Each member of population is equally likely to be in sample Key Idea: Different from “just choose some”
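A short sketch of computer-generated simple random sampling, assuming the population is available as a list. The helper name `simple_random_sample` and the roster of student IDs are hypothetical. Python's `random.Random.sample` draws without replacement, so every member is equally likely to end up in the sample.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k distinct members; each member is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, k)


if __name__ == "__main__":
    roster = [f"student_{i}" for i in range(1, 201)]  # hypothetical population
    print(simple_random_sample(roster, 5, seed=42))
```

Passing a seed makes the draw reproducible, which helps when a sample needs to be checked later; leaving it as `None` gives a fresh draw each time.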

Random Sampling An old (but still fun?) experiment: Choose a number among 1,2,3,4 Old typical results: about 70% choose “3” (perhaps you have seen this before…) Main lesson: human choice does not give “equally likely” outcomes (i.e. it does not give a random sample)

Random Sampling How to choose a random sample? Old Approaches: –Random Number Table –Roll Dice Modern Approach: –Computer Generated

Random Sampling HW Interesting Question: What is the % of Male Students at UNC? (Your chance of date, or take 100% - to get your chance) HW: C1: Class Handout

Random Sampling HW Notes on HW C1: 3 dumb ways to sample, 1 good one Goal is to learn about sampling, not “get right answer” Part 1, put symbol for yourself, Ms and Fs for others Put both count & % (% = 100 x count / 25) Part 2, “tally” is: Part 4, student phone directory available in Student Union?

Random Sampling HW Notes on HW C1, Hints on Part 4: –For each draw, first draw a “random page” –Tools > Data Analysis > Random Number Generation > Uniform is one way to do this –In “Uniform”, you need to set “Parameters”, to 0 and “number of pages” –This gives a random decimal, to get an integer, round up, using CEILING –In CEILING, set “significance” to 1

Random Sampling HW Notes on HW C1, Hints on Part 4 (cont.): –Next Choose Random Column –Next Choose Random Name –Caution: Different numbers on each page. –Challenge: still make equally likely –Approach: choose larger number –Approach: when not there, just toss it out –Approach: then do a “redraw” –Also redraw if can’t tell gender
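The Part 4 hints can be sketched in Python instead of a spreadsheet. This is only an illustration under assumptions: the directory is modeled as a list of pages, each page a list of columns, each column a list of names, and the function names are made up. It follows the hinted approach: draw a uniform decimal and round up, always draw the column and name numbers from the largest possible range, and redraw whenever the chosen slot does not exist.

```python
import math
import random


def draw_index(upper, rng):
    """Uniform decimal on (0, upper), rounded up to an integer 1..upper."""
    while True:
        u = rng.uniform(0, upper)
        if u > 0:                      # guard the rare exact 0, which would round to 0
            return math.ceil(u)


def draw_name(directory, rng):
    """Pick one name so that every name in the directory is equally likely."""
    max_cols = max(len(page) for page in directory)
    max_names = max(len(col) for page in directory for col in page)
    while True:
        page = directory[draw_index(len(directory), rng) - 1]
        col_no = draw_index(max_cols, rng)     # "choose larger number"
        name_no = draw_index(max_names, rng)
        if col_no > len(page):
            continue                           # slot not there: toss it, redraw
        column = page[col_no - 1]
        if name_no > len(column):
            continue
        return column[name_no - 1]


if __name__ == "__main__":
    toy = [[["Ann", "Bob", "Cy"], ["Dee"]], [["Eve"]]]  # hypothetical directory
    rng = random.Random(0)
    print([draw_name(toy, rng) for _ in range(5)])
```

Because every (page, column, name) slot has the same chance on each attempt and empty slots are simply discarded, the names that survive are equally likely even though pages hold different numbers of names. A redraw for names whose gender cannot be determined would be one more `continue` condition in the same loop.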