Statistics for Social and Behavioral Sciences Session #16: Confidence Interval and Hypothesis Testing (Agresti and Finlay, from Chapter 5 to Chapter 6)

Presentation transcript:

Statistics for Social and Behavioral Sciences Session #16: Confidence Interval and Hypothesis Testing (Agresti and Finlay, from Chapter 5 to Chapter 6) Prof. Amine Ouazad

Statistics Course Outline
Part I. Introduction and Research Design
Part II. Describing Data
Part III. Drawing Conclusions from Data: Inferential Statistics
Part IV. Correlation and Causation: Regression Analysis
Week 1 Weeks 2-4 Weeks 5-9 Weeks
This is where we talk about Zmapp and Ebola! Firenze or Lebanese Express’s ratings are within a MoE of each other!

Last Session
Interval estimates for a population mean (a parameter) when N is large, for any distribution of X.
Build a confidence interval for a parameter: the interval [Sample Mean - MoE ; Sample Mean + MoE] includes the parameter with probability:
– 99% if MoE = 2.58 * Standard Error
– 95% if MoE = 1.96 * Standard Error
– 90% if MoE = 1.65 * Standard Error
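The three multipliers above can be recovered from the standard normal distribution. The following is a minimal sketch using Python's standard library; the function names `z_multiplier` and `margin_of_error` are illustrative and not part of the course material. Note that the exact 90% multiplier is about 1.645, which the slide rounds to 1.65.

```python
from statistics import NormalDist


def z_multiplier(confidence: float) -> float:
    """Two-sided standard normal multiplier for a confidence level in (0, 1)."""
    tail = (1.0 - confidence) / 2.0          # probability left in each tail
    return NormalDist().inv_cdf(1.0 - tail)  # quantile of the standard normal


def margin_of_error(confidence: float, standard_error: float) -> float:
    """MoE = multiplier * standard error."""
    return z_multiplier(confidence) * standard_error


if __name__ == "__main__":
    for level in (0.99, 0.95, 0.90):
        print(f"{level:.0%} confidence: multiplier = {z_multiplier(level):.3f}")
```

Higher confidence requires a larger multiplier, and therefore a wider interval for the same standard error.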

Today
What happens if N is small?
– The Central Limit Theorem does not apply.
– If X is normally distributed, then we have a way of getting the sampling distribution of the sample mean, and thus of building standard errors and confidence intervals.
– The standardized sample mean follows a t-distribution.
Hypothesis testing in Statistics: the Foundation of (Social) Sciences.
– The Confidence Interval method of testing μ = v.

Outline
1. Small samples, normal distribution: t-distribution
2. Testing hypothesis: The Foundation of (Social) Sciences
Next time: t test of mean and proportion
Chapter 6 of A&F

Central Limit Theorem
Requires a large sample size N. This is because it applies to any distribution of X.
Example:
– We had a sample of N songs, and the number of times X_i that each song had been played.
– The number of times X_i a song is played on Spotify does not have a normal distribution.
– But we can build a confidence interval for the average number of times a song is played (μ), provided we have a large enough number N of songs.
– MoE = 1.96 * σ_X / √N for a 95% confidence interval.
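The song example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the course's own code: the play counts are simulated (hypothetical, drawn from a skewed exponential distribution), the function name `large_sample_ci` is made up, and the sample standard deviation s_X stands in for σ_X, which is reasonable only when N is large.

```python
import math
import random
import statistics


def large_sample_ci(xs, z=1.96):
    """Confidence interval for the population mean, valid for large N.

    z=1.96 gives a 95% interval. The sample standard deviation is used
    as an estimate of sigma_X.
    """
    n = len(xs)
    m = statistics.mean(xs)                   # sample mean
    se = statistics.stdev(xs) / math.sqrt(n)  # standard error of the mean
    moe = z * se                              # margin of error
    return m - moe, m + moe


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical, heavily skewed play counts (clearly not normal).
    plays = [int(rng.expovariate(1 / 50)) for _ in range(1000)]
    low, high = large_sample_ci(plays)
    print(f"95% CI for the mean number of plays: [{low:.1f}, {high:.1f}]")
```

The interval is trustworthy here only because N = 1000 is large; the same formula applied to a handful of songs would not be.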

We can use our formulas to find a 95% confidence interval for μ (interval shown on the slide). This works because N is large, even though X does not have a normal distribution: the distribution of X may be different from a normal distribution.

What if N is small?
If N is “small”, the Central Limit Theorem does not apply.
– We cannot use our formulas.
“Small”? Less than a few hundred (from experience).
If N is very small, the sampling distributions are not normal. (Figure: sampling distributions for N=2 and N=5.)

If N is small
s_X is potentially very far from σ_X.
But we can still find confidence intervals if X is normal.
The standardized sample mean, (m - μ) / (s_X / √N), follows Student’s t distribution with degrees of freedom (df) equal to N-1. The standard error is s_X / √N.

If N is small
A 95% confidence interval for the population mean is: [Sample Mean - MoE, Sample Mean + MoE]
With MoE = z * Standard Error, where z is the t-score for N-1 degrees of freedom.
– z = 1.96 when df = ∞
– z > 1.96 when the df are small.
See next table for the exact value of z.

t Table
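The slide's table is not reproduced in this transcript. As a rough sketch of how such a table is used, the Python snippet below hard-codes the standard two-sided 95% t-scores for 1 to 10 degrees of freedom (standard t-table values, not copied from the slide) and builds the small-sample interval. The names `T_SCORE_95` and `small_sample_ci` are illustrative, and the approach assumes X is normally distributed.

```python
import math
import statistics

# Standard two-sided 95% t-scores (upper 2.5% point), indexed by df = N - 1.
T_SCORE_95 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def small_sample_ci(xs):
    """95% confidence interval for the population mean of a small sample.

    Assumes the observations come from a normal distribution.
    """
    n = len(xs)
    df = n - 1
    if df not in T_SCORE_95:
        raise ValueError("this sketch only tabulates df from 1 to 10")
    m = statistics.mean(xs)
    se = statistics.stdev(xs) / math.sqrt(n)  # s_X / sqrt(N)
    moe = T_SCORE_95[df] * se                 # t-score replaces 1.96
    return m - moe, m + moe


if __name__ == "__main__":
    # Hypothetical measurements from N = 5 containers.
    sample = [9.0, 10.0, 11.0, 10.0, 10.0]
    low, high = small_sample_ci(sample)
    print(f"95% CI with df = {len(sample) - 1}: [{low:.3f}, {high:.3f}]")
```

With df = 4 the multiplier is 2.776 rather than 1.96, so the interval is noticeably wider than the large-sample formula would suggest.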

Why is it called Student’s t distribution?
The t distribution was allegedly invented by a person called Student. That “Student” was an engineer at factories in Ireland: William Sealy Gosset.
He was producing small samples of a product, seeking guidance for industrial quality control:
– He was trying a small number of samples (N=2, 4, perhaps 7).
– And from these samples was trying to infer the quality of all containers of the product (the population).
W.S. Gosset and Some Neglected Concepts in Experimental Statistics, Stephen T. Ziliak, 2011.

Outline
1. Small samples, normal distribution: t-distribution
2. Testing hypothesis: The Foundation of (Social) Sciences
Next time: t test of mean and proportion
Chapter 6 of A&F

Thinking like a statistician: Step 1
Empirical question, type #1: “What is an estimate of the population proportion of voters for Republicans in Colorado?”
Or…
Empirical question, type #2:
– In plain English: “Is Cory Gardner likely to win the election?”
– In more formal terms: “Can we reject the hypothesis that Cory Gardner will lose the election with some confidence?”

Thinking like a statistician: Step 1
Empirical question, type #1: “What is an estimate of the population impact of Zmapp on Ebola?”
Or…
Empirical question, type #2:
– In plain English: “Is Zmapp effective at treating Ebola?”
– In more formal terms: “Can we reject the hypothesis that Zmapp does not treat Ebola patients with some confidence?”

Hypothesis testing
Hypothesis: an empirical statement about a population parameter. Usually of the shape:
– “The parameter is equal to a given value”
– “The parameter is greater than a given value”
– “The parameter is lower than a given value”
Almost all scientific/sociological/economic statements can be reduced to one of these three types.
– “The population proportion of voters for Cory Gardner is greater than 50%.” (second type of hypothesis)
– “The impact of ZMapp on Ebola patients’ condition is zero.” (first type of hypothesis)

Hypothesis
Fundamental principle of scientific analysis: we can only provide evidence to reject a hypothesis. We never actually accept a hypothesis…
“Science must begin with myths, and with the criticism of myths.” Karl Popper.
Scientific hypotheses are falsifiable, i.e. it is possible to bring data to test such hypotheses.
This applies to social science as well: impact of taxes on individuals’ mobility, impact of abortion on crime (Steve Levitt, Freakonomics).
Logik der Forschung, Vienna. Translated as The Logic of Scientific Discovery, 1959.

Formulating Hypothesis
H_0, null hypothesis (to be rejected by evidence): “Zmapp does not improve Ebola patients’ condition.” or “The population impact of Zmapp on Ebola is zero.”
H_a, alternative hypothesis: “Zmapp has a positive impact on the population of Ebola patients’ condition.”
See that the formulation of the alternative matters. Beware of your priors.

Formulating Hypothesis (simpler)
H_0, null hypothesis (to be rejected by evidence): “The fraction of men in Abu Dhabi is equal to 50%.”
H_a, alternative hypothesis: “The fraction of men in Abu Dhabi is different from 50%.”
See that the formulation of the alternative matters. Beware of your priors.
(Figure: a tourist’s pic. What’s going on?)

Testing H_0: μ = v using confidence intervals
H_0: “The fraction of men in Abu Dhabi is 50%”, equivalently “μ = 0.5”.
By simple random sampling, gather N observations X_i, each equal to 0 or 1.
Build a confidence interval around the sample mean m of the X_i.
– Same methods as seen in previous sessions.
If the null hypothesis is true, only 5% of the 95% confidence intervals will not include 0.5. Thus if the null hypothesis is true, there is only a 5% probability that my confidence interval will not include 0.5.
☞ Reject the null hypothesis if the confidence interval built around m does not include v.
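The decision rule above can be written as a short Python sketch. The samples are hypothetical (invented counts, not survey data), the function name `ci_test` is illustrative, and the large-sample multiplier 1.96 is assumed to be appropriate because N = 400.

```python
import math
import statistics


def ci_test(xs, v, z=1.96):
    """Confidence interval method for H_0: mu = v against H_a: mu != v.

    Returns (reject, (low, high)). With z=1.96 the significance level is 5%.
    """
    n = len(xs)
    m = statistics.mean(xs)
    moe = z * statistics.stdev(xs) / math.sqrt(n)
    low, high = m - moe, m + moe
    reject = not (low <= v <= high)  # reject only if v falls outside the interval
    return reject, (low, high)


if __name__ == "__main__":
    # Hypothetical samples of 400 people, coded 1 = man, 0 = woman.
    sample_a = [1] * 260 + [0] * 140  # 65% men
    sample_b = [1] * 205 + [0] * 195  # 51.25% men
    for name, sample in (("A", sample_a), ("B", sample_b)):
        reject, (low, high) = ci_test(sample, v=0.5)
        verdict = "reject H_0" if reject else "do not reject H_0"
        print(f"Sample {name}: CI = [{low:.3f}, {high:.3f}] -> {verdict}")
```

Failing to reject for sample B does not show that the fraction is 50%; it only means the data do not provide enough evidence against it.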

Statistical Errors: “The Truth (μ) is out there”
                                    Null hypothesis is true    Alternative hypothesis is true
Do not reject the null hypothesis   Correct decision           Type II error
Reject the null hypothesis          Type I error               Correct decision
By selecting 95% confidence intervals, what is the probability of a type I error? It is 5%. This is called the significance level (α level) of the test.
It is not possible to rule out both type I and type II errors at once.
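A small simulation can make the 5% figure concrete. The Python sketch below assumes the null hypothesis is true (the population proportion really is 0.5), draws many samples, and counts how often the 95% confidence interval excludes 0.5. The function name, sample size, number of trials and seed are all illustrative choices, not part of the course material.

```python
import math
import random


def type_one_error_rate(trials=2000, n=400, p_true=0.5, z=1.96, seed=0):
    """Share of simulated samples whose confidence interval excludes p_true.

    Because the data are generated with the null hypothesis true, every
    rejection counted here is a type I error.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        successes = sum(1 for _ in range(n) if rng.random() < p_true)
        p_hat = successes / n
        moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if not (p_hat - moe <= p_true <= p_hat + moe):
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    rate = type_one_error_rate()
    print(f"Simulated type I error rate: {rate:.3f}")
```

The simulated rate should land close to 0.05, with some variation from the finite number of trials and the discreteness of the counts.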

Wrap up
When the sample size is small and the distribution of X is normal:
– The t-distribution gives confidence intervals.
– Then MoE = z * Standard Error. Use z given by Table 5.1 of Agresti and Finlay for degrees of freedom N-1.
Hypothesis testing is the foundation of (social) sciences.
– Hypothesis: a parameter is equal to …, a parameter is greater than …, a parameter is lower than ….
– We can only provide evidence to reject a null hypothesis.
Confidence interval method for the test of H_0: μ = v against H_a: μ ≠ v.
– Reject H_0 with significance level 5% if the 95% confidence interval built around the sample mean m does not include v.
– Reject H_0 with significance level 10% if the 90% confidence interval built around the sample mean m does not include v.

Coming up: Readings: Mid term on Tuesday, November 25. – Coverage: up to Chapter 6. Deadlines are sharp and attendance is followed. For help: Amine Ouazad Office 1135, Social Science building Office hour: Tuesday from 5 to 6.30pm. GAF: Irene Paneda Sunday recitations. At the Academic Resource Center, Monday from 2 to 4pm.