Statistics for Social and Behavioral Sciences Session #15: Interval Estimation, Confidence Interval (Agresti and Finlay, Chapter 5) Prof. Amine Ouazad.

Statistics Course Outline
Part I. Introduction and Research Design
Part II. Describing Data
Part III. Drawing Conclusions from Data: Inferential Statistics
Part IV. Correlation and Causation: Regression Analysis
Timing, in order: Week 1, Weeks 2-4, Weeks 5-9, Weeks …
This is where we talk about Zmapp and Ebola! Firenze or Lebanese Express's ratings are within a MoE of each other!

Last Session: Inference A conservative Margin of Error (= 2 standard errors) for Cafe Firenze's restaurant rating is 1.1 with 14 votes. For any rating from 1 to 5, the largest possible Margin of Error is 4/√N, where N is the number of ratings. With TripAdvisor, we see the rating of each individual customer, and so we can calculate s_X! Central Limit Theorem: with a large sample size N, the sampling distribution of the sample mean is approximately normal. The mean of the sampling distribution is the population mean. The standard deviation of the sampling distribution is σ_X/√N, where σ_X is the standard deviation of X.

Today Use this margin of error to provide interval estimates: – A 95% confidence interval for Café Firenze is [2.3, 4.5]. – "The true rating of Café Firenze is between 2.3 and 4.5 with probability 95%". – Note: average was 3.4 and MoE was 1.1. – A 95% confidence interval for Cory Gardner's vote share in Colorado is [48 – 3.6, 48 + 3.6] = [44.4, 51.6]. – "The true vote share for Cory Gardner is between 44.4% of the vote and 51.6% of the vote with 95% probability". – Note: MoE was 3.6.

News: Last Tuesday We learnt the population proportion π! – Proportion of voters for Cory Gardner. The latest poll was giving us a sample proportion of the vote p (N around 1000).

Outline 1.Interval Estimation Confidence Interval 2.Choosing between 90, 95, 99% confidence 3.When distributions are normal: t-distribution Next time:Estimation, Confidence Intervals (continued) Chapter 5 of A&F

Parameters and Interval Estimate An interval estimate is an interval of numbers around the point estimate, which includes the parameter with probability either 90%, 95%, or 99%. Example: "the interval estimate [156.2 cm – 0.49 cm ; 156.2 cm + 0.49 cm] includes the population average height with probability 95%." Sample mean: 156.2 cm, MoE = 0.49 cm.

Parameters and Interval Estimate An interval estimate that includes the parameter with probability 95% is called a 95% confidence interval. The expression "95% confidence interval" is widely used. Example: "[156.2 cm – 0.49 cm ; 156.2 cm + 0.49 cm] is a 95% confidence interval for the population average height." Sample mean: 156.2 cm, MoE = 0.49 cm.

How do we build a 95% confidence interval? Goal: estimate the population average μ. From previous sessions: [μ – MoE ; μ + MoE] includes the sample mean m with probability 95%. The sample mean m is within MoE of μ exactly when μ is within MoE of m. We conclude: the interval [m – MoE ; m + MoE] includes the population mean with probability 95%. [m – MoE ; m + MoE] is a 95% confidence interval for μ. MoE = 1.96 x Standard Error. Standard Error = s_X/√N. We use 1.96 instead of 2 from now on.
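The recipe on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `confidence_interval_95` and the list of ratings are made up for the example and do not come from the course data.

```python
import math
import statistics

def confidence_interval_95(sample):
    """Large-sample 95% confidence interval for the population mean."""
    n = len(sample)
    m = statistics.mean(sample)              # sample mean m
    s_x = statistics.stdev(sample)           # sample standard deviation s_X
    standard_error = s_x / math.sqrt(n)      # Standard Error = s_X / sqrt(N)
    moe = 1.96 * standard_error              # MoE = 1.96 x Standard Error
    return m - moe, m + moe

# Hypothetical data: 400 ratings on a 1-to-5 scale, invented for illustration.
ratings = [1] * 20 + [2] * 40 + [3] * 100 + [4] * 160 + [5] * 80
lower, upper = confidence_interval_95(ratings)
print(f"95% confidence interval: [{lower:.2f}, {upper:.2f}]")
```

The interval is centered on the sample mean by construction, so its midpoint is always m.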

Outline 1.Interval Estimation Confidence Interval 2.Choosing between 90, 95, 99% confidence 3.When distributions are normal: t-distribution Next time:Estimation, Confidence Intervals (continued) Chapter 5 of A&F

Choosing between 90%, 95%, 99%
The interval estimate [Sample Mean – MoE, Sample Mean + MoE] includes the population mean (the parameter) with probability:
– 99% if MoE = 2.58 * Standard Error
– 95% if MoE = 1.96 * Standard Error
– 90% if MoE = 1.65 * Standard Error
The width of a confidence interval:
1. Increases as the confidence level increases.
2. Decreases as the sample size increases.
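The three multipliers come from the standard normal distribution: each one leaves half of the excluded probability in each tail. A minimal sketch using Python's standard library (the helper name `z_multiplier` is ours, not from the textbook):

```python
from statistics import NormalDist

def z_multiplier(confidence):
    """Two-sided standard-normal multiplier for a given confidence level."""
    tail = (1 - confidence) / 2              # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: MoE = {z_multiplier(level):.3f} * Standard Error")
```

The values 1.65 and 2.58 used on the slide are these multipliers rounded to two decimals.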

Building 90%, 95%, 99% confidence intervals Exercise: The sample mean weight (a sample of individuals in the US) is 60.0 kg, and the sample standard deviation is 29.9 kg. Find a 90% (resp., 95%, 99%) confidence interval for the population mean weight.
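One possible way to set up the exercise is sketched below. The sample size is not given in the transcript, so `N_ASSUMED = 1000` is a purely hypothetical value chosen to make the code run; substitute the actual sample size from the exercise.

```python
import math

# Multipliers from the previous slide.
Z = {"90%": 1.65, "95%": 1.96, "99%": 2.58}

def interval(sample_mean, sample_sd, n, z):
    """Confidence interval: sample mean plus or minus z standard errors."""
    standard_error = sample_sd / math.sqrt(n)
    moe = z * standard_error
    return sample_mean - moe, sample_mean + moe

N_ASSUMED = 1000  # hypothetical sample size, NOT from the exercise

intervals = {level: interval(60.0, 29.9, N_ASSUMED, z) for level, z in Z.items()}
for level, (lo, hi) in intervals.items():
    print(f"{level} confidence interval: [{lo:.2f} kg, {hi:.2f} kg]")
```

Whatever the sample size, the 99% interval comes out wider than the 95% interval, which is wider than the 90% interval.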

Why 90%, 95%, 99%? Confidence intervals were invented by Jerzy Neyman in the 1930s. R.A. Fisher developed the theory of statistical testing. Sample sizes were small at the time (a few hundred), and 95% seemed a reasonable confidence level. Medical sciences introduced confidence intervals in medicine soon after their discovery. 95% became the standard.

Outline 1.Interval Estimation Confidence Interval 2.Choosing between 90, 95, 99% confidence 3.When distributions are normal: t-distribution Next time:Estimation, Confidence Intervals (continued) Chapter 5 of A&F

Central Limit Theorem Requires a large sample size N. This is because it applies to any distribution of X. Example #1: – We had a sample of N songs, and the number of times X_i that song had been played. – The number of times X_i a song is played on Spotify does not have a normal distribution. – But we can build a confidence interval for the average number of times a song is played (μ), provided we have a large enough number N of songs. – MoE = 1.96 * σ_X/√N for a 95% confidence interval.

We can use our formulas to find a 95% confidence interval around the sample mean m because N is large, even though X does not have a normal distribution.
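A small simulation can illustrate this claim. The sketch below is an assumption-laden toy: it uses an exponential distribution as a stand-in for a skewed, non-normal X (the real Spotify play counts are not available here), and the function names are ours. It estimates how often the 1.96-standard-error interval contains the true mean.

```python
import math
import random

def covers(sample, true_mean):
    """True if the 1.96-standard-error interval contains the true mean."""
    n = len(sample)
    m = sum(sample) / n
    s_x = math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
    moe = 1.96 * s_x / math.sqrt(n)
    return m - moe <= true_mean <= m + moe

def coverage_rate(n, repetitions, rng):
    """Share of simulated samples of size n whose interval covers the mean."""
    true_mean = 1.0  # mean of an exponential distribution with rate 1
    hits = 0
    for _ in range(repetitions):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        hits += covers(sample, true_mean)
    return hits / repetitions

rng = random.Random(0)  # fixed seed so the run is reproducible
large_n = coverage_rate(400, 1000, rng)
small_n = coverage_rate(5, 1000, rng)
print(f"Coverage with N=400: {large_n:.3f}")
print(f"Coverage with N=5:   {small_n:.3f}")
```

With the large sample the coverage should land close to 95%; with N=5 it falls noticeably short, which motivates the next slides.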

What if N is small? If N is "small", the Central Limit Theorem does not apply… – We cannot use our formulas. "Small"? Less than a few hundred (from experience). If N is very small, the sampling distributions are not normal (figure: sampling distributions for N=2 and N=5).

If N is small, s_X is potentially very far from σ_X. But… we can still find confidence intervals if X is normal. The standardized sample mean, (Sample Mean – Population Mean)/(s_X/√N), follows Student's t distribution with degrees of freedom (df) equal to N-1; the standard error is still s_X/√N.

If N is small A 95% confidence interval for the population mean is: [Sample Mean – MoE, Sample Mean + MoE], with MoE = z * Standard Error. z = 1.96 when df = ∞; z > 1.96 when the df are small. See next table for the exact value of z.

t Table
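As a rough sketch of how the table is used, the code below hard-codes the two-sided 95% multipliers for 1 to 10 degrees of freedom, as found in standard t tables, and applies them to a tiny sample. The five measurements and the function name are hypothetical, and the method assumes X is normally distributed.

```python
import math
import statistics

# Two-sided 95% multipliers from standard t tables (df -> multiplier).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def t_interval_95(sample):
    """95% confidence interval for the mean of a small normal sample."""
    n = len(sample)
    df = n - 1                                   # degrees of freedom
    m = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    moe = T_95[df] * standard_error              # multiplier > 1.96 for small df
    return m - moe, m + moe

# Hypothetical measurements from 5 containers, invented for illustration.
sample = [4.9, 5.1, 5.0, 5.3, 4.7]
lower, upper = t_interval_95(sample)
print(f"95% confidence interval (df={len(sample) - 1}): [{lower:.2f}, {upper:.2f}]")
```

With df = 4 the multiplier is 2.776 rather than 1.96, so the interval is about 40% wider than the large-sample formula would give.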

Why is it called Student's t distribution? The t distribution was allegedly invented by a person called Student. That "Student" was the pseudonym of an engineer at Guinness's Factories in Ireland: William Sealy Gosset. He was producing small samples of a drink, seeking guidance for industrial quality control: – He was trying a small number of samples (N=2, 4, perhaps 7). – And from these samples he was trying to infer the quality of all containers of the product (the population). Reference: W.S. Gosset and Some Neglected Concepts in Experimental Statistics: Guinnessometrics II, Stephen T. Ziliak, 2011.

Wrap up Interval estimates for a population mean (a parameter) when N is large, for any distribution of X. Build a confidence interval for a parameter: the interval [Sample Mean – MoE ; Sample Mean + MoE] includes the parameter with probability: 99% if MoE = 2.58 * Standard Error, 95% if MoE = 1.96 * Standard Error, 90% if MoE = 1.65 * Standard Error. The t-distribution gives confidence intervals when the sample size N is small… and when the distribution of X is normal. Use z given by Table 5.1 of Agresti and Finlay for degrees of freedom N-1.

Coming up: Readings: This week and next week: – Chapter 5 entirely – estimation, confidence intervals. Online quiz deadline Tuesday 9am. Deadlines are sharp and attendance is followed. For help: Amine Ouazad Office 1135, Social Science building Office hour: Tuesday from 5 to 6.30pm. GAF: Irene Paneda Sunday recitations. At the Academic Resource Center, Monday from 2 to 4pm.