What is statistics? Statistics is the science of dealing with data.

Presentation transcript:

What is statistics? Statistics is the science of dealing with data. Data is any type of info packaged in numerical form. Common examples: Political polls, Health/medical studies

Some Basic Definitions Population: collection of individuals or objects we want to study statistically “What is the population to which the statistical statement applies?” N-value: how many individuals/objects there are in the population

Example Study: What percentage of the M&Ms in the jar are blue? Population: all of the M&Ms in the jar N-value: 4392

Census Census: the process of collecting data by going through every member of the population Our example: Count all M&Ms in the jar, count all of the blue ones, find percentage. Drawbacks: Expensive Too much work Almost impossible for large populations

Census vs Survey Census: the process of collecting data by going through every member of the population Survey: process of collecting data only from some members of the population (and use that data to draw conclusions & make inferences about the entire population) Poll: data collection done by asking questions

Use samples! Sample: a subgroup of the population chosen to provide the data Sampling: the act of selecting a sample Finding a good sample is EXTREMELY DIFFICULT!!!! Sampling frame: the actual subset of the population from which the sample will be drawn

Example Study: What percentage of our class likes cheeseburgers? Population: all members of our class N-value: 20 Sampling frame: all of the women in our class A Sample: all of the women in our class who are present today

Sampling frames make a difference! CNN/USA Today/Gallup Poll, Nov 2004: If the election for Congress were being held today, which party’s candidate would you vote for in your district? Asked of 1866 registered voters nationwide: 49% for Dem, 47% for Rep, 4% undecided Asked of 1573 likely voters nationwide: 50% for Rep, 46% for Dem, 3% undecided Differences: the two polls used different sampling frames. The sampling frame for the second poll was more representative of the people who actually voted, and it closely predicted the actual results. However, it’s much easier to get a list of registered voters than a list of likely voters.

Representative Samples When a population is highly homogeneous, a very small sample may be representative Ex: blood samples, thoroughly mixed cake batter, etc More heterogeneous populations -> more difficult to find representative samples

Are these samples representative? Question: What is the average time it takes a UNL student to walk to class? Samples: All students living in dorms All students who use city buses All students in the Union at noon All students currently taking math classes

1936 Literary Digest Poll US presidential election: Alfred Landon (R) vs incumbent Franklin D Roosevelt (D) Sampling frame included: Every person listed in a telephone directory anywhere in the US Every person on a magazine subscription list Every person listed on the roster of a club or professional association List of 10 million people created to whom mock ballots were mailed

1936 Literary Digest Poll Poll predicted Landon with 57% of vote vs Roosevelt’s 43% Reality: 62% for Roosevelt and 38% for Landon What went wrong?! Think about the sample. Representative? Biased? During the depression, those people with phones, magazine subscriptions, club memberships were RICH

Bias Selection bias: when the choice of the sample has a built-in tendency to exclude a particular group or characteristic within the population Literary Digest poll only had 24% response rate Low response rate -> nonresponse bias (selection bias) People selected themselves out of the survey. Always low response rate for mail surveys. Also, people more passionate about a topic are more likely to respond.

Lots of different kinds of bias Leading-question bias: Are you in favor of paying higher taxes to bail the federal government out of its disastrous economic policies and its mismanagement of the federal budget? Question order bias Afraid to answer bias: Have you ever cheated on your income taxes?

Morals Bigger samples aren’t necessarily better samples! Watch out for different types of bias! A representative sample is key!

Lots of Sampling Methods Convenience sampling: selection of individuals included in the sample is dictated by what is easiest or cheapest Notoriously bad! Ex: Want to know the average score on the last quiz? Sample: Look at the scores of the people sitting next to you. Ex: Want to know how people feel about making the switch to the Big Ten? Sample: Set up a table outside of your house for people to come by and fill out a questionnaire

Quota sampling Quota sampling: the sample should have so many women, so many men, so many Christians, so many Muslims, so many urban-dwellers, so many rural farmers, etc The proportions in each category in the sample should be the same as those in the population

Example of quota sampling Intro to Stats has 120 students 40 freshmen 30 sophomores 30 juniors 20 seniors To fill out questionnaire, prof selects 24 freshmen 18 sophomores 18 juniors 12 seniors
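The quotas in this example can be checked with a few lines of Python. This is only a sketch: the function name `quota_allocation` is made up for illustration, and the total sample size of 72 is simply the sum 24 + 18 + 18 + 12 from the slide.

```python
def quota_allocation(population_counts, sample_size):
    """Split sample_size across categories in proportion to population share.

    Note: with less convenient numbers the rounded quotas may not add up
    exactly to sample_size; this sketch does not correct for that.
    """
    total = sum(population_counts.values())
    return {
        group: round(sample_size * count / total)
        for group, count in population_counts.items()
    }

# Class makeup from the slide: 120 students in four categories.
class_counts = {"freshmen": 40, "sophomores": 30, "juniors": 30, "seniors": 20}

# 72 = 24 + 18 + 18 + 12, the total the professor selects.
quotas = quota_allocation(class_counts, sample_size=72)
print(quotas)
```

Each category keeps the same share in the sample as in the class (for example, 40/120 = 24/72 = 1/3 for freshmen).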

1948 US Presidential Election Gallup poll used detailed quota sampling Sample size: 3250 people Prediction vs reality: Thomas Dewey: 49.5% / 44.5% Harry Truman: 44.5% / 49.9% What went wrong? Missing criterion wrt the categories considered for quota. Interviewers were free to choose whom to interview -> selection bias

Simple Random Sampling SRS: all members of the population have an equal chance at being included in the sample How were previous examples not SRS? Examples of methods: Pull names from a hat Flip a coin Random number generator
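The "random number generator" method can be sketched with Python's standard `random` module. The roster below is hypothetical (a class of N = 20 with invented names), and `simple_random_sample` is an illustrative helper, not something from the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable for demonstration
    return rng.sample(population, n)

# Hypothetical roster for a class of N = 20.
roster = [f"student_{i}" for i in range(1, 21)]

sample = simple_random_sample(roster, n=5, seed=0)
print(sample)
```

Because `random.sample` draws without replacement, nobody can appear in the sample twice, just as with pulling names from a hat.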

Stratified Sampling Break the sampling frame into categories (strata), then randomly choose a sample from these strata Those chosen strata are subdivided into substrata, and a random sample taken. Subdivide again and take a random sample, etc End up with clusters, but usually reliable
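The subdivide-and-sample procedure described above might look roughly like this in Python. Everything here is an assumption made for illustration: the region, block, and house names, the number picked at each level, and the helper `multistage_sample`.

```python
import random

def multistage_sample(frame, picks_per_level, rng):
    """Randomly pick strata at each level, then randomly pick units at the bottom.

    frame: nested dicts (strata, substrata, ...) ending in lists of units.
    picks_per_level: how many to pick at each level, top to bottom.
    """
    if isinstance(frame, list):
        # Bottom level: sample the actual units (e.g. houses).
        return rng.sample(frame, min(picks_per_level[0], len(frame)))
    chosen = rng.sample(sorted(frame), min(picks_per_level[0], len(frame)))
    units = []
    for key in chosen:
        units.extend(multistage_sample(frame[key], picks_per_level[1:], rng))
    return units

# Hypothetical sampling frame: 3 regions, 3 blocks each, 5 houses per block.
frame = {
    region: {
        block: [f"{region}-{block}-house{i}" for i in range(1, 6)]
        for block in ("block1", "block2", "block3")
    }
    for region in ("north", "south", "east")
}

rng = random.Random(1)
# Pick 2 regions, then 1 block in each, then 2 houses in each chosen block.
houses = multistage_sample(frame, picks_per_level=[2, 1, 2], rng=rng)
print(houses)
```

The selected houses end up grouped in a few blocks, which is the clustering the slide mentions.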

Stratified Sampling Example

Now survey these houses!

More Definitions Statistic: Numerical information drawn from a sample Parameter: unknown measure (numerical info) from the population Hopefully, the statistic will be close to the parameter so conclusions made about the sample will be true for the whole population.

Error and Bias Sampling error: the difference between the parameter (estimated) and the statistic Sampling error attributed to: Chance error Sampling variability: different samples give different results Sampling bias: bad sample chosen
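Sampling variability is easy to see in a small simulation. The sketch below reuses the jar of N = 4392 M&Ms from the earlier example; the number of blue ones (1000) is an invented value, chosen only so that there is a known parameter to compare against.

```python
import random

N = 4392        # N-value from the M&M jar example
BLUE = 1000     # assumed number of blue M&Ms (hypothetical)

jar = ["blue"] * BLUE + ["other"] * (N - BLUE)
parameter = jar.count("blue") / N  # true proportion of blue in the population

def sample_proportion(population, n, rng):
    """Statistic: proportion of blue M&Ms in a simple random sample of size n."""
    sample = rng.sample(population, n)
    return sample.count("blue") / n

rng = random.Random(42)
statistics = [sample_proportion(jar, 100, rng) for _ in range(5)]

for s in statistics:
    # Sampling error here is statistic minus parameter.
    print(f"statistic = {s:.2f}, sampling error = {s - parameter:+.3f}")
```

Each run of the loop draws a different sample from the same jar, so the statistic moves around the parameter even though no bias was introduced.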

Sample Size Population size = N Sample size = n Sampling proportion = n/N Modern public opinion polls: 1000 ≤ n ≤ 1500

Capture-Recapture Used to estimate the N-value Steps: Choose a sample of size n1, tag the members, and release. After some time, capture a new sample of size n2 and take an exact head count of tagged individuals. Call that number k. The N-value is approximately N ≈ (n1*n2)/k Reasoning: the proportion tagged in the second sample (k/n2) is approximately the proportion tagged in the population (n1/N)

Small fish in a big pond A pond of fish! Capture n1 = 200 fish. Tag them. Capture n2 = 150 fish. Notice that k = 21 of these fish have tags. There are approximately N ≈ (200*150)/21 ≈ 1429 fish
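The fish estimate can be reproduced in code. In this sketch, n1 is the size of the first (tagged) sample, n2 the size of the second sample, and k the number of tagged individuals found in the second sample; the function name is an illustrative choice.

```python
def capture_recapture_estimate(n1, n2, k):
    """Estimate the population size N from the relation k / n2 ≈ n1 / N."""
    if k <= 0:
        raise ValueError("need at least one tagged individual in the second sample")
    return n1 * n2 / k

# Pond example: 200 fish tagged, 150 recaptured, 21 of those carry tags.
estimate = capture_recapture_estimate(n1=200, n2=150, k=21)
print(round(estimate))
```

The estimate is only as good as the assumption that tagged fish have mixed evenly back into the pond before the second capture.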

CORRELATION DOES NOT IMPLY CAUSATION! Clinical Studies: try to study cause and effect, whereas surveys just observe and report

Alar Scare Alar: chemical used by apple growers 1973: mice exposed to the active chemicals in Alar at a dose 8 times greater than the max tolerated dosage A child would have to eat 200,000 apples per day to get that dosage Alar doesn’t really cause cancer, but it is no longer used. Washington State apple industry lost $375 million.

Clinical studies Concerned with determining whether a single variable or treatment (vaccine, drug, therapy, etc) can cause a certain effect (disease, symptom, cure, etc) Confounding variables: all other possible contributing causes that could produce the same effect First step: isolate the treatment under investigation from confounding variables

Controlled Study Subjects are divided into two different groups: Treatment group: consists of subjects receiving the actual treatment Control group: consists of subjects that are not receiving any treatment (for comparison only) Randomized controlled study: subjects are assigned to the treatment group or control group randomly....hopefully groups are representative samples
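Random assignment to the two groups can be sketched as a shuffle followed by a split. The subject IDs and the even half-and-half split are assumptions for illustration; real studies may use other allocation ratios or schemes.

```python
import random

def randomize_groups(subjects, seed=None):
    """Randomly assign subjects to a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the original order is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical study with 20 subjects identified by number.
subject_ids = list(range(1, 21))
treatment, control = randomize_groups(subject_ids, seed=3)

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Since chance alone decides who lands in which group, confounding variables tend to be spread across both groups instead of piling up in one.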

Placebos Placebo: fake treatment intended to look like the real treatment Controlled placebo study: controlled study in which control group is given a placebo Placebo effect: just the idea of getting treatment can produce positive results

Don’t tell them about the placebo! Blind study: neither the members of the treatment group nor the members of the control group know to which of the two groups they belong Double-blind study: the scientists conducting the study don’t know either

Homework Read Chapter 13 Answer the questions on the Vocabulary worksheet Exercises beginning on page 515: 1-4, 13, 17-25, 30-32, 45-48, 57-60, 70