Statistics for Social and Behavioral Sciences Session #14: Estimation, Confidence Interval (Agresti and Finlay, Chapter 5) Prof. Amine Ouazad.

Statistics Course Outline
– Part I. Introduction and Research Design (Week 1)
– Part II. Describing Data (Weeks 2-4)
– Part III. Drawing Conclusions from Data: Inferential Statistics (Weeks 5-9)
– Part IV. Correlation and Causation: Regression Analysis (Weeks …)
This is where we talk about ZMapp and Ebola!
Firenze or Lebanese Express now

Last 2 Sessions
– A statistic is a random variable. The distribution of a statistic is called its sampling distribution.
– In particular, the mean of a variable in a sample is a statistic.
– The expected value of the sample mean is equal to the true mean.
– The standard deviation of the sample mean is called the standard error.
– Central Limit Theorem: with a large sample size, the sampling distribution of the mean of X is approximately normal, and the empirical rule applies.
– The standard error is σ_X / √N.

Last 2 Sessions
– For a proportion (X is 0,1): σ_X = √(π(1-π)). As we typically do not observe the true proportion π but only the sample proportion p, we approximate σ_X by √(p(1-p)).
– For other variables (X is not 0,1): as we do not observe the true standard deviation σ_X but rather the sample standard deviation s_X, we approximate σ_X by s_X and thus approximate the standard error by s_X / √N.
– We are interested in estimating parameters, but we only observe statistics. Can we use statistics as estimators?
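The two standard-error approximations recalled above can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers in the example are hypothetical and do not come from the slides.

```python
import math

def standard_error_mean(sample_sd, n):
    """Approximate standard error of a sample mean: s_X / sqrt(N)."""
    return sample_sd / math.sqrt(n)

def standard_error_proportion(p, n):
    """Approximate standard error of a sample proportion: sqrt(p(1-p)) / sqrt(N)."""
    return math.sqrt(p * (1 - p)) / math.sqrt(n)

# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
se_mean = standard_error_mean(sample_sd=2.0, n=100)   # 2.0 / 10
se_prop = standard_error_proportion(p=0.5, n=100)     # 0.5 / 10
print(se_mean, se_prop)
```

In both cases the standard error shrinks with the square root of the sample size, so quadrupling N halves it.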

Outline
1. Back to Zomato: just applying the formulas we know
2. Estimators:
– Point Estimator
– Biased vs Unbiased Estimators
– Efficient vs Inefficient Estimators
– Interval Estimator
Next time: Estimation, Confidence Intervals (continued), Chapter 5 of A&F

Back to Zomato
1. What statistical issue would preclude us from using the Central Limit Theorem?
2. Assuming we can use the CLT, what is the margin of error on Cafe Firenze's and Lebanese Express's ratings?
Think!!

Questions:
1. When rating a restaurant, what are the possible choices for the user?
2. What is 3.4 on this rating?
3. What are we trying to estimate?
4. What is the formula for the standard error of ratings? Is a rating X a 0,1 variable?
5. What is the standard deviation s_X of ratings?
6. Finally, what is the standard error of the rating 3.4?
7. And what is the margin of error for the rating 3.4? (MoE = twice the standard error)

Recap: Central Limit Theorem
Central Limit Theorem: with a large sample size, the distribution of the sample mean is approximately normal, with mean equal to the true mean and standard deviation (= standard error) equal to σ_X / √N.
– X is not 0,1: approximate the true standard deviation σ_X using the sample standard deviation s_X.
– X is 0,1: approximate σ_X = √(π(1-π)), where π is the true proportion, using the sample proportion p in place of π.
Café Firenze's case

Back to Zomato
If we had all the ratings of individual users:
– John, 3: "Hated it, service is poor"
– Abdullah, 4: "Great venue"
– Anthony, 5: "Perfect, loved the al dente pasta"
– Claire, 3: "Ok for a downtown lunch"
– Al Bloom, 3: "The Italian restaurant of the world"
– John Sexton, 3: "Can achieve more"
– Ayesha, 3: "There are alternatives"
The average is 3.4, and we would find s_X = …………….
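As a check on the exercise above, the seven ratings can be run through the formulas directly. The sketch below computes the standard deviation with both denominators (N and N-1), since the slides and the textbook differ on that point. The variable names are illustrative.

```python
import math

# The seven individual ratings listed on the slide.
ratings = [3, 4, 5, 3, 3, 3, 3]

n = len(ratings)
mean = sum(ratings) / n                            # about 3.4
sum_sq_dev = sum((x - mean) ** 2 for x in ratings)

s_n = math.sqrt(sum_sq_dev / n)                    # denominator N, about 0.73
s_n_minus_1 = math.sqrt(sum_sq_dev / (n - 1))      # denominator N-1, about 0.79

# Standard error of the mean rating, using the denominator-N version.
standard_error = s_n / math.sqrt(n)

print(round(mean, 2), round(s_n, 2), round(s_n_minus_1, 2), round(standard_error, 2))
```

With only seven ratings the sample is far too small for the Central Limit Theorem to be trusted, so the standard error here is purely a worked calculation.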

Zomato Problemo
The website only reports the sample mean of ratings… We thus have to figure out a conservative estimate of s_X (the largest possible). What is the highest possible s_X?
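One possible way to answer this question, not necessarily the one used in class, rests on the fact that for a given mean the spread is largest when every rating sits at one of the two extremes of the scale. For values between lo and hi with mean m, the variance (denominator N) then cannot exceed (hi - m)(m - lo). The sketch below assumes a 1-to-5 rating scale, which the slides do not state explicitly.

```python
import math

def max_sd_given_mean(mean, lo, hi):
    """Upper bound on the standard deviation (denominator N) of values in
    [lo, hi] with the given mean; reached when all values equal lo or hi."""
    return math.sqrt((hi - mean) * (mean - lo))

# Assumption: ratings range from 1 to 5. The reported mean is 3.4.
max_sd = max_sd_given_mean(3.4, lo=1, hi=5)   # sqrt(1.6 * 2.4)
print(round(max_sd, 2))
```

Under that assumption the bound is close to 2. A real set of ratings will almost always be less spread out, which is why this counts as a conservative choice.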

Outline
1. Back to Zomato: just applying the formulas we know
2. Estimators:
– Point Estimate
– Biased vs Unbiased Estimators
– Efficient vs Inefficient Estimators
– Interval Estimate
Next time: Estimation, Confidence Intervals (continued), Chapter 5 of A&F

Parameters and their point estimates
Each parameter ("true" value) is followed by its point estimate:
– Population mean μ (example: population mean rating of Cafe Firenze): sample mean m (sample mean rating of Cafe Firenze)
– Population median: sample median
– Population standard deviation σ_X (example: population standard deviation of ratings of Cafe Firenze): sample standard deviation s_X (sample standard deviation of ratings of Cafe Firenze)
– Population variance σ_X²: sample variance s_X²
– Population p-th percentile: sample p-th percentile
This is called a "point estimate" because we give a single number (a "point" on the axis).

Biased vs Unbiased Estimator
We have seen that to get the standard error of the sample mean, we need to have an estimate of σ_X.
– So far we have used: s_X = √( Σ (x_i - m)² / N )
– And the textbook has given: s_X = √( Σ (x_i - m)² / (N-1) )
These are two different estimators of the same quantity σ_X. The textbook's estimator of σ_X is unbiased (strictly speaking, it is its square that is an unbiased estimator of the variance σ_X²). These two formulas are "point estimates".

Efficient vs Inefficient Estimator
Among all possible estimators, an estimator is efficient if it has the smallest standard error. The standard error of √( Σ (x_i - m)² / N ) is smaller than the standard error of √( Σ (x_i - m)² / (N-1) ). The slides' version is efficient, while the textbook's version is unbiased. There is a conundrum. These two formulas are "point estimates".

What do you actually need to remember?
"Good" estimators are unbiased and efficient.
– The sample mean is an unbiased and efficient estimator of the population mean.
"Less good" estimators may be either unbiased or efficient.
– The sample standard deviation with denominator N-1 is unbiased but inefficient.
– The sample standard deviation with denominator N is biased but efficient.
– We keep using the formula we learnt…
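The trade-off between the two denominators can be checked exactly on a toy example. The sketch below enumerates every possible sample of size 3 drawn with replacement from a small hypothetical population, then compares the two variance estimators (the squares of the two standard deviation formulas). The population and the sample size are illustrative assumptions, not data from the course.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: each value 1..5 is equally likely.
population = [1, 2, 3, 4, 5]
N = 3  # sample size

def variance(values, denominator):
    """Sum of squared deviations from the mean, divided by `denominator`."""
    m = Fraction(sum(values), len(values))
    return sum((x - m) ** 2 for x in values) / denominator

def average(xs):
    return sum(xs) / len(xs)

true_variance = variance(population, len(population))

# Every ordered sample of size N, drawn with replacement (all equally likely).
samples = list(product(population, repeat=N))
est_n = [variance(s, N) for s in samples]              # denominator N
est_n_minus_1 = [variance(s, N - 1) for s in samples]  # denominator N-1

# Expected value of each estimator across all samples.
mean_est_n = average(est_n)
mean_est_n_minus_1 = average(est_n_minus_1)

# Spread (variance) of each estimator across all samples.
spread_n = average([(e - mean_est_n) ** 2 for e in est_n])
spread_n_minus_1 = average([(e - mean_est_n_minus_1) ** 2 for e in est_n_minus_1])

print(true_variance, mean_est_n, mean_est_n_minus_1)
print(spread_n < spread_n_minus_1)
```

In this example the denominator N-1 version averages out to the true variance, while the denominator N version averages below it but varies less from sample to sample. Exact fractions are used so that the comparison is not blurred by rounding.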

Parameters and Interval Estimate
An interval estimate is an interval of numbers around the point estimate, which includes the parameter with a chosen probability, typically 90%, 95%, or 99%.
Example: "the interval estimate [156.2 cm - 0.49 cm ; 156.2 cm + 0.49 cm] includes the population average height with probability 95%."

Parameters and Interval Estimate
An interval estimate that includes the parameter with probability 95% is called a 95% confidence interval. The expression "95% confidence interval" is widely used.
Example: "[156.2 cm - 0.49 cm ; 156.2 cm + 0.49 cm] is a 95% confidence interval for the population average height."

How do we build a 95% confidence interval?
Goal: estimate the population average μ.
– From previous session: [μ - MoE ; μ + MoE] includes the sample mean m with probability 95%.
– We conclude: the interval [m - MoE ; m + MoE] includes the population mean with probability 95%.
– [m - MoE ; m + MoE] is a 95% confidence interval for μ.
MoE = 1.96 × Standard Error
Standard Error = s_X / √N
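The recipe above translates directly into a short function. In the sketch below, the sample mean of 156.2 cm echoes the height example from the slides, while the sample standard deviation (2.5 cm) and the sample size (100) are hypothetical values chosen so that the margin of error comes out at 0.49 cm.

```python
import math

def confidence_interval_95(sample_mean, sample_sd, n):
    """Return (lower, upper) bounds of [m - MoE ; m + MoE],
    with MoE = 1.96 * s_X / sqrt(N)."""
    standard_error = sample_sd / math.sqrt(n)
    margin_of_error = 1.96 * standard_error
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Assumed inputs: m = 156.2 cm (from the slides), s_X = 2.5 cm and N = 100 (hypothetical).
low, high = confidence_interval_95(sample_mean=156.2, sample_sd=2.5, n=100)
print(round(low, 2), round(high, 2))
```

The function relies on the Central Limit Theorem, so it is only appropriate when the sample is large.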

Wrap up
– Central Limit Theorem: with a large sample size, the sampling distribution of the sample mean of X is approximately normal, and the empirical rule applies.
– The standard error is the standard deviation of the sampling distribution, σ_X / √N.
– For a proportion: σ_X = √(π(1-π)). As we typically do not observe the true proportion π but only the sample proportion p, we approximate σ_X by √(p(1-p)).
– For other variables: as we do not observe the true standard deviation σ_X but rather the sample standard deviation s_X, we approximate the standard error by s_X / √N.
– We are interested in estimating parameters, but we only observe statistics. Can we use statistics as estimators?
– Estimators can be unbiased and efficient.

Coming up:
Readings, this week and next week:
– Chapter 5 entirely: estimation, confidence intervals.
– Understand the confidence interval, the point estimate.
Online quiz on Thursday. Deadlines are sharp and attendance is followed.
Tonight is the midterm election!! Watch :
For help:
– Amine Ouazad, Office 1135, Social Science building. Office hour: Tuesday from 5 to 6.30pm.
– GAF: Irene Paneda, Sunday recitations.
– At the Academic Resource Center, Monday from 2 to 4pm.