INFERENTIAL STATISTICS


INFERENTIAL STATISTICS
- Samples are only estimates of the population
- Sample statistics will be slightly off from the true values of the population’s parameters
  - Sampling error: the difference between a sample statistic and a population parameter
- Probability theory permits us to estimate the accuracy or representativeness of the sample

The “Catch-22” of Inferential Statistics
- When we collect a sample, we know nothing about the population’s distribution of scores
  - We can calculate the mean (x-bar) and standard deviation (s) of our sample, but μ and σ are unknown
  - The shape of the population distribution (normal or skewed?) is also unknown

Probability Theory Allows Us To Answer: What is the likelihood that a given sample statistic accurately represents a population parameter?
[Figure: a population (N = thousands) with unknown mean, μ = ???, and a sample of N = 150 with sample mean x-bar = 9.6. Variable: number of serious crimes committed in the year prior to prison, for inmates entering the prison system.]

Sampling Distribution (a.k.a. “Distribution of Sample Outcomes”)
- “OUTCOMES” = proportions, means, etc.
- From repeated random sampling, a mathematical description of all possible sampling event outcomes, and the probability of each one
- Permits us to make the link between sample and population
  - Answers the question: “What is the probability that a sample finding is due to chance?”

Relationship between Sample, Sampling Distribution & Population
- POPULATION: Empirical (exists in reality) but unknown
- SAMPLING DISTRIBUTION (distribution of sample means, proportions, or other outcomes): Nonempirical (theoretical or hypothetical). Laws of probability allow us to describe its characteristics (shape, central tendency, dispersion)
- SAMPLE: Empirical & known (e.g., distribution shape, mean, standard deviation)

Sampling Distribution: Characteristics
- Central tendency
  - Sample means will cluster around the population mean
  - Since samples are random, the sample means should be distributed equally on either side of the population mean
  - The mean of the sampling distribution of sample means is always equal to the population mean
- Shape: Normal distribution
  - Central Limit Theorem: Regardless of the shape of a raw score distribution (sample or population) of an interval-ratio variable, the sampling distribution of sample means will be approximately normal as long as the sample size is large (rule of thumb used here: N ≥ 100)
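The two characteristics above can be checked with a small simulation. This is only a sketch under stated assumptions: the skewed population, the seed, and the helper name `sample_means` are all hypothetical and chosen for illustration, not taken from the slides.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical, strongly right-skewed population (exponential, mean near 10).
population = [random.expovariate(1 / 10) for _ in range(50_000)]
population_mean = statistics.fmean(population)


def sample_means(pop, n, repeats):
    """Draw `repeats` random samples of size n and return each sample's mean."""
    return [statistics.fmean(random.sample(pop, n)) for _ in range(repeats)]


means = sample_means(population, n=100, repeats=2_000)

mean_of_means = statistics.fmean(means)      # center of the sampling distribution
spread_of_means = statistics.stdev(means)    # its standard deviation
expected_se = statistics.pstdev(population) / 100 ** 0.5  # sigma / sqrt(N)

print(f"population mean:            {population_mean:.2f}")
print(f"mean of sample means:       {mean_of_means:.2f}")
print(f"std. dev. of sample means:  {spread_of_means:.2f}")
print(f"sigma / sqrt(N):            {expected_se:.2f}")
```

Even though the raw scores are skewed, the sample means should cluster around the population mean, with a spread close to σ / √N.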

Sampling Distribution: Characteristics
- Dispersion: Standard Error (SE)
  - Measures the spread of sampling error that occurs when a population is sampled repeatedly
  - Same thing as the standard deviation of the sampling distribution
  - Tells how much error, on average, should exist between the sample mean & the population mean
  - Formula: SE = σ / √N
  - However, because σ usually isn’t known, s (the sample standard deviation) is used to estimate the population standard deviation
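As a minimal sketch of the formula, the snippet below estimates the standard error from a single sample, substituting s for σ as the slide describes. The data values and the function name `standard_error_of_mean` are hypothetical; note that `statistics.stdev` uses the N - 1 divisor for s.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Estimate SE = s / sqrt(N), using the sample SD s in place of sigma."""
    s = statistics.stdev(values)  # sample standard deviation (N - 1 divisor)
    return s / math.sqrt(len(values))


# Hypothetical sample of eight scores, for illustration only.
sample = [9, 12, 7, 10, 11, 8, 13, 10]

print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"sample s    = {statistics.stdev(sample):.2f}")
print(f"estimated SE = {standard_error_of_mean(sample):.4f}")
```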

Sampling Distribution: Standard Error
- Law of Large Numbers: The larger the sample size (N), the more probable it is that the sample mean will be close to the population mean
  - In other words: a big sample works better (should give a more accurate estimate of the population) than a small one
  - Makes sense if you study the formula for standard error: N is in the denominator, so SE shrinks as N grows
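To see why the formula supports this, here is a short sketch that holds the standard deviation fixed and varies N. The standard deviation of 15 and the function name `se_from_sd` are assumptions made for illustration.

```python
import math


def se_from_sd(sd, n):
    """Standard error of the mean for a given standard deviation and sample size."""
    return sd / math.sqrt(n)


sd = 15.0  # hypothetical standard deviation
for n in (25, 100, 400, 1600):
    print(f"N = {n:>4}: SE = {se_from_sd(sd, n):.3f}")
```

Because N sits under a square root, the sample must be four times larger to cut the standard error in half.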

ESTIMATION

Introduction to Estimation
- Estimation procedures
  - Purpose: To estimate population parameters from sample statistics
  - Using the sampling distribution to infer from a sample to the population
  - Most commonly used for polling data
  - 2 components: point estimate and confidence interval

Estimation
- Point Estimate: Value of a sample statistic used to estimate a population parameter
- Confidence Interval: A range of values around the point estimate
[Figure: diagram labeling the confidence interval, the point estimate, and the lower and upper confidence limits.]

Example
CNN Poll (CNN.com; Feb 20, 2009): Slight majority thinks stimulus package will improve economy
“The White House's economic stimulus plan isn't a surefire winner with the American public, but a majority does think the recovery plan will help. According to a new poll, fifty-three percent said the plan will improve economic conditions, while 44 percent said it won't stimulate the economy.”
“On an individual level, there was less hope for improvement. According to the poll, 67 percent said it would not help them personally.”
“The Poll was conducted Wednesday and Thursday (Feb 18-19, 2009), with 1,046 people questioned by telephone. The survey's sampling error is plus or minus 3 percentage points.”

Estimation
- POINT ESTIMATES (another way of saying sample statistics)
- CONFIDENCE INTERVAL: the point estimate plus or minus the “MARGIN OF ERROR”
  - Indicates that over the long run, 95 percent of the time, the true population value will fall within a range of +/- 3 percentage points around the sample estimate
- Point estimates & confidence intervals should be reported together
“… but a majority does think the recovery plan will help, according to a new poll. Fifty-three percent said the plan will improve economic conditions, while 44 percent said it won't stimulate the economy. …. The Poll was conducted Wednesday and Thursday (Feb 18-19, 2009), with 1,046 people questioned by telephone. The survey's sampling error is plus or minus 3 percentage points.”

Estimation
1. Pick Confidence Level
- Confidence LEVEL: Probability that the unknown population parameter falls within the interval
- Alpha (α): The probability that the parameter is NOT within the interval
  - α is the chance of making an error
- Confidence level = 1 - α
- Conventionally, confidence level values are almost always 95% or 99%

Procedure for Constructing an Interval Estimate
2. Divide the probability of error equally into the upper and lower tails of the distribution (2.5% error in each tail with a 95% confidence level)
- Find the corresponding Z score (for a 95% confidence level, Z = ±1.96)
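Instead of reading a Z table, the Z score for a chosen confidence level can be computed from the standard normal distribution. This is a sketch using Python's standard library; the function name `z_for_confidence` is hypothetical.

```python
from statistics import NormalDist


def z_for_confidence(level):
    """Return the two-tailed Z score for a confidence level such as 0.95."""
    alpha = 1 - level                 # total probability of error
    upper_tail_cutoff = 1 - alpha / 2  # alpha is split equally between the two tails
    return NormalDist().inv_cdf(upper_tail_cutoff)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = ±{z_for_confidence(level):.2f}")
```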

Procedure for Constructing an Interval Estimate
3. Construct the confidence interval
- Proportions (like the poll example above):
  - Sample point estimate (convert % to a proportion): “Fifty-three percent said the plan will improve economic conditions…” → 0.53
  - Sample size (N) = 1,046
  - Formula 7.3 in Healey
    - Numerator = (your proportion)(1 - proportion)
  - 95% confidence level (replicating results from article)
  - 99% confidence level: intervals widen as level of confidence increases

Example 1: Estimate for the economic recovery poll
- p = .53 (53% think it will help)
- Z = 1.96 (95% confidence level)
- N = 1046 (sample size)
What happens when we…
- Recalculate for N = 10,000
- Set N back to the original and recalculate for p = .90
- Go back to the original values, but change the confidence level to 99%
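The calculation in Example 1 can be sketched as follows. It assumes the interval is p ± Z·√(p(1 - p)/N), which matches the "numerator = (proportion)(1 - proportion)" description in the procedure. The function name `proportion_ci` is hypothetical, and Z = 2.58 for the 99% level is the usual table value rather than a number given on the slide.

```python
import math


def proportion_ci(p, n, z):
    """Return (lower, upper, margin) for a sample proportion p with sample size n."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin, margin


scenarios = {
    "original poll":    dict(p=0.53, n=1046, z=1.96),
    "N = 10,000":       dict(p=0.53, n=10_000, z=1.96),
    "p = .90":          dict(p=0.90, n=1046, z=1.96),
    "99% (Z = 2.58)":   dict(p=0.53, n=1046, z=2.58),  # assumed table value
}

for label, args in scenarios.items():
    lower, upper, margin = proportion_ci(**args)
    print(f"{label:<16} {lower:.3f} to {upper:.3f}  (margin ±{margin:.3f})")
```

The original-poll scenario should reproduce the article's margin of about plus or minus 3 percentage points.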

Example 2
Houston Chronicle (2008): A University of Texas poll to be released today shows Republican presidential candidate John McCain and GOP Sen. John Cornyn leading by comfortable margins in Texas, as expected. But the statewide survey of 550 registered voters has one very surprising finding: 23 percent of Texans are convinced that Democratic presidential nominee Barack Obama is a Muslim.
The Obama-is-a-Muslim confusion is caused by fallacious Internet rumors and radio talk-show gossip. McCain went so far at one of his town hall meetings to grab a microphone from a woman who claimed that Obama was an Arab.
1. Given this info, identify a point estimate & calculate the confidence interval (assuming a 95% confidence level).
2. Calculate the confidence interval assuming a 99% confidence level.

A Good Estimate is Unbiased
- Sample means and proportions (like the .53 [53%] & .23 [23%]) are UNBIASED estimates of the population parameters
  - We know that the mean of the sampling distribution = the population mean
- Other sample statistics (such as the standard deviation) are biased
  - The standard deviation of a sample tends, on average, to underestimate the standard deviation of the population
- Bottom line: A good estimate is UNBIASED
  - A trustworthy estimator of the population parameter
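A quick simulation can illustrate the difference between an unbiased and a biased statistic. This is a sketch only: the population values (mean 50, standard deviation 10), the small sample size, and the seed are hypothetical choices, and the sample standard deviation here is computed with the plain N divisor.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 50.0, 10.0  # hypothetical population parameters
N, REPEATS = 5, 5_000          # small samples make the bias easy to see

sample_means, sample_sds = [], []
for _ in range(REPEATS):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))
    sample_sds.append(statistics.pstdev(sample))  # SD with divisor N

avg_mean = statistics.fmean(sample_means)
avg_sd = statistics.fmean(sample_sds)

print(f"average of sample means: {avg_mean:.2f} (population mean = {POP_MEAN})")
print(f"average of sample SDs:   {avg_sd:.2f} (population SD = {POP_SD})")
```

The sample means should average out close to the population mean, while the sample standard deviations should average out below the population value.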

A Good Estimate is Efficient
- Efficiency refers to the extent to which the sampling distribution is clustered about its mean
- Efficiency depends largely on sample size: as the sample size increases, the sampling distribution gets tighter (narrower)
- Remember from earlier: the sampling distribution can be assumed approximately normal when N ≥ 100
- BOTTOM LINE: THE LESS SPREAD (THE SMALLER THE S.E.), THE BETTER

Estimation of Population Means
EXAMPLE: A researcher has gathered information from a random sample of 178 households. An average of 2.3 people reside in each household, and the standard deviation is .35. Construct a confidence interval to estimate the population mean at the 95% level.
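One way to work this example is sketched below. It assumes the interval is x-bar ± Z·(s / √N), following the standard error formula given earlier. Some textbooks divide by √(N - 1) when s replaces σ; with N = 178 the difference is negligible. The function name `mean_ci` is hypothetical.

```python
import math


def mean_ci(mean, s, n, z):
    """Return (lower, upper) for a sample mean, using s / sqrt(n) as the standard error."""
    margin = z * s / math.sqrt(n)
    return mean - margin, mean + margin


# Household example: N = 178, mean = 2.3 people, s = .35, 95% level (Z = 1.96).
lower, upper = mean_ci(mean=2.3, s=0.35, n=178, z=1.96)
print(f"95% confidence interval: {lower:.2f} to {upper:.2f} people per household")
```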

PROCEDURE FOR CONSTRUCTING AN INTERVAL ESTIMATE
- A random sample of 429 college students was interviewed
  - They reported they had spent an average of $178 on textbooks during the previous semester. If the standard deviation (s) of these data is $15, construct an estimate of the population mean at the 95% confidence level.
  - They reported they had missed an average of 2.8 days of class per semester because of illness. If the sample standard deviation is 1.0, construct an estimate of the population mean at the 99% confidence level.
- Two individuals are running for mayor of Duluth. You conduct an election survey of 100 adult Duluth residents 1 week before the election and find that 45% of the sample support candidate Long Duck Dong, while 40% plan to vote for candidate Singalingdon. Using a 95% confidence level, based on your findings, can you predict a winner?
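The mayoral question can be approached by building an interval for each candidate and checking whether the two intervals overlap. This is a simplified sketch: treating overlap as "too close to call" is an assumption of this illustration rather than a formal test of the difference, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # Z score for the 95% confidence level


def proportion_interval(p, n, z=Z_95):
    """Return (lower, upper) for a sample proportion p with sample size n."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return (p - margin, p + margin)


def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


n = 100
leader = proportion_interval(0.45, n)   # candidate with 45% support
trailer = proportion_interval(0.40, n)  # candidate with 40% support
overlap = intervals_overlap(leader, trailer)

print(f"45% candidate: {leader[0]:.3f} to {leader[1]:.3f}")
print(f"40% candidate: {trailer[0]:.3f} to {trailer[1]:.3f}")
print("intervals overlap, no clear winner" if overlap else "intervals do not overlap")
```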

What influences confidence intervals?
- The width of a confidence interval depends on three things
  - Confidence level: can be raised (e.g., to 99%) or lowered (e.g., to 90%); the higher the level, the wider the interval
  - N: we have more confidence in larger sample sizes, so as N increases, the interval narrows
  - Variation: more variation = more error, so a wider interval
    - % agree closer to 50%
    - Higher standard deviations
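The three influences can be compared side by side with a short sketch for proportions. The baseline values (p = .50, N = 1,000, 95% level) and the function name `ci_width` are hypothetical and chosen only to make the comparison easy to read.

```python
import math
from statistics import NormalDist


def ci_width(p, n, level):
    """Full width (upper limit minus lower limit) of a confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * math.sqrt(p * (1 - p) / n)


baseline = ci_width(p=0.50, n=1000, level=0.95)
higher_level = ci_width(p=0.50, n=1000, level=0.99)  # raise the confidence level
larger_n = ci_width(p=0.50, n=4000, level=0.95)      # quadruple the sample size
less_variation = ci_width(p=0.90, n=1000, level=0.95)  # proportion far from 50%

print(f"baseline (p=.50, N=1000, 95%): {baseline:.3f}")
print(f"99% confidence level:          {higher_level:.3f}")
print(f"N = 4000:                      {larger_n:.3f}")
print(f"p = .90:                       {less_variation:.3f}")
```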