5.1 Chapter 5: Inference in the Simple Regression Model. In this chapter we study how to construct confidence intervals and how to conduct hypothesis tests using the simple regression model from Chapters 3 and 4. Concepts for review: the least squares estimators b1 and b2 are random variables, where
b1 ~ Normal(β1, var(b1))
b2 ~ Normal(β2, var(b2))

5.2 Interval Estimation. Least squares gives us point estimates for β1 and β2. We need to address the issue of precision using knowledge of 1) the variance of b2 and 2) the shape of b2's probability distribution. With these we can construct a margin of error around the point estimates. Review of confidence intervals: 95% of all possible values of a normal random variable lie within 1.96 standard deviations of its mean, so b2 lies within 1.96·√var(b2) of β2 with probability 0.95.
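
As a quick numerical check of the 1.96 figure, here is a minimal Python sketch using scipy.stats (an added illustration, not part of the original slides):

    # Check that 1.96 is the value leaving 2.5% in each tail of the standard normal.
    from scipy import stats

    z = stats.norm.ppf(0.975)                                # ~1.96
    coverage = stats.norm.cdf(1.96) - stats.norm.cdf(-1.96)  # ~0.95
    print(z, coverage)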

5.3 If we knew σ², we would have no problem constructing the interval, because var(b2) = σ²/Σ(x_t − x̄)² would be known and the interval would be b2 ± 1.96·√var(b2). Note that this interval makes a probabilistic statement about the interval itself (its endpoints are random), not about β2, which is a fixed constant. However, σ² is unknown and must be estimated. This adds an additional source of uncertainty to the interval and also changes the shape of the standardized distribution.

5.4 The Student t-distribution. We know how to estimate σ²: σ̂² = Σê_t²/(T − 2), where the ê_t are the least squares residuals. However, when we standardize b2 using an estimate of σ², we no longer have a standard normal random variable. Instead we have a random variable with a t-distribution:
t = (b2 − β2)/se(b2) ~ t(T−2)
But what is se(b2)? It is the square root of the estimated variance of b2: se(b2) = √(σ̂²/Σ(x_t − x̄)²).
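
A minimal sketch of these calculations on made-up data (the sample and "true" parameter values below are illustrative assumptions, not the textbook's food expenditure data):

    # Estimate sigma^2 and se(b2) for a simple regression y = b1 + b2*x + e.
    import numpy as np

    rng = np.random.default_rng(0)
    T = 40
    x = rng.uniform(5, 30, size=T)
    y = 40 + 0.13 * x + rng.normal(0, 8, size=T)   # made-up "true" values

    xbar = x.mean()
    Sxx = np.sum((x - xbar) ** 2)
    b2 = np.sum((x - xbar) * (y - y.mean())) / Sxx
    b1 = y.mean() - b2 * xbar

    resid = y - (b1 + b2 * x)                      # least squares residuals
    sigma2_hat = np.sum(resid ** 2) / (T - 2)      # estimate of sigma^2
    se_b2 = np.sqrt(sigma2_hat / Sxx)              # standard error of b2
    print(b2, se_b2)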

5.5 About the Student t-distribution. Compare a z random variable to a t random variable: 1) In the expression for z, the only random variable is b2, so z has the same (normal) shape as b2, because β2 and σ_b2 are constants; the distribution is standard normal. 2) In the expression for t, both b2 and se(b2) are random variables: b2 has a normal distribution, and se(b2) is a function of σ̂², which has a (scaled) chi-square distribution. The ratio of a standard normal random variable to the square root of an independent chi-square random variable divided by its degrees of freedom has a t-distribution.
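
An illustrative simulation of this construction (an added sketch with assumed setup, not from the slides): draw Z ~ N(0,1) and an independent chi-square W with 38 degrees of freedom; the ratio Z/√(W/38) behaves like a t random variable with 38 degrees of freedom.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    df, n = 38, 100_000
    z = rng.standard_normal(n)
    w = rng.chisquare(df, size=n)
    t_draws = z / np.sqrt(w / df)                  # should follow t with 38 d.o.f.

    # Compare a simulated tail probability with the exact t-distribution.
    print((t_draws > 2.024).mean())                # ~0.025
    print(stats.t.sf(2.024, df))                   # ~0.025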

5.6 More on the t-distribution. t-values carry a measure of degrees of freedom; for the simple regression model this is T − 2. See Table 2 on the front cover of the book. Suppose T = 40, so there are 38 degrees of freedom; then 95% of the values lie within ±2.024 of the mean. Identify the relevant area on the diagram.
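
The same table lookup can be done in Python with scipy.stats (an added sketch, not part of the slides):

    from scipy import stats

    tc = stats.t.ppf(0.975, 38)    # critical value leaving 2.5% in the upper tail
    print(round(tc, 3))            # ~2.024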

5.7 Confidence Intervals Using the t-Distribution. t_c is the critical t-value that, for a 95% interval, leaves 2.5% of the values in each tail. Its value depends on the degrees of freedom and the level of confidence. A confidence interval for β2 has the general form:
b2 ± t_c·se(b2)
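
A minimal sketch of the interval calculation with placeholder numbers (the estimate and standard error below are assumptions, not the textbook's food expenditure results):

    from scipy import stats

    b2, se_b2, df = 0.13, 0.03, 38                 # illustrative values
    tc = stats.t.ppf(0.975, df)
    lower, upper = b2 - tc * se_b2, b2 + tc * se_b2
    print(f"95% interval estimate for beta2: ({lower:.4f}, {upper:.4f})")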

5.8 Example of a Confidence Interval. In Chapter 3 we found the least squares estimates b1 and b2 for the food expenditure example, and in Chapter 4 we found the corresponding estimated variances and standard errors. Substituting b2, se(b2), and the critical value t_c into the formula above gives the interval on the next slide.

5.9 This is the 95% confidence interval for β2. In repeated sampling, 95% of intervals constructed this way contain the true value of β2.

5.10 Hypothesis Testing. The idea: a hypothesis is a conjecture about a population parameter, such as "we believe the marginal propensity to spend on food is $0.10 out of every dollar," i.e. β2 = 0.10. Remember that population parameters are unknown constants. We "test" hypotheses about β2 using b2, our estimator of β2, which is calculated from a sample of data. If b2 is "reasonably" close to the hypothesized value for β2, we say that the data support the hypothesis. If b2 is not "reasonably" close, we say that the data do not support the hypothesis.

5.11 Formal Hypothesis Testing. The model is y = β1 + β2·x + e. 1) Null hypothesis: specify a value for the parameter, H0: β2 = c, where c can be any value. For example, let c = 0; the null hypothesis becomes H0: β2 = 0. If this were true, it would say that x has no effect on y. This test is called a test of significance.

5.12 2) Alternative hypothesis: a logical alternative to the null hypothesis, because if we reject the null hypothesis we must be prepared to accept the alternative hypothesis. Typically it is H1: β2 ≠ c, H1: β2 > c, or H1: β2 < c. For a test of significance where H0: β2 = 0, the alternative hypothesis is H1: β2 ≠ 0, H1: β2 > 0, or H1: β2 < 0. Whether we use ≠, >, or < depends on the situation and on economic theory. For example, it is theoretically impossible that β2 < 0 where β2 is the marginal propensity to consume, so a test of significance would be: H0: β2 = 0 versus H1: β2 > 0.

5.13 3) Test statistic: we use a statistic to "test" the hypothesis. The idea: if the test statistic "disagrees" with H0, reject H0. Whether the test statistic agrees or disagrees with H0 must be addressed in probabilistic terms. Our test statistic is based on b2. The mean of b2 is β2, but β2 is unknown. Make this assumption: H0 is true. Suppose H0: β2 = c; then b2's distribution is centered at c, and the test statistic is
t = (b2 − c)/se(b2) ~ t(T−2) if H0 is true.
This is our test statistic. What do we do with it?

5.14 4) The rejection region: we have assumed H0 to be true, so we examine the distribution of b2 under this hypothesis. Suppose we calculate our test statistic and it falls into a tail of this distribution. There are two reasons why this might happen: i) the assumption that H0 is true is a bad one (meaning the true distribution is centered at a value other than c); ii) H0 is true but our sample data were very unlikely (they came from the tail). Extreme values are those that fall into the tails; which tail matters depends on the alternative hypothesis. We typically use the 5% most extreme values as the rejection region, a region of low probability.

5.15 Suppose H0: β2 = 0 and H1: β2 ≠ 0. The test statistic is t = b2/se(b2). The rejection region consists of t-values that fall into either tail: a two-tailed test, because H1: β2 ≠ 0. If we use a 5% level of significance, we put 2.5% into each tail. What t-values leave 0.025 in each tail? Use the t-table. Suppose T = 40, so we have 38 degrees of freedom; the critical values are ±2.024.

5.16 Suppose H0: β2 = 0 and H1: β2 > 0. The test statistic is t = b2/se(b2). The rejection region consists of t-values that fall into the right tail: a one-tailed test. If we use a 5% level of significance, we put the full 5% into the right tail. What t-value leaves 0.05 in the tail? Use the t-table. Suppose T = 40, so we have 38 degrees of freedom; the critical value is 1.686.
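
Both rejection-region cutoffs can be computed with scipy.stats; a short added sketch:

    from scipy import stats

    df = 38
    two_tailed = stats.t.ppf(0.975, df)   # ~2.024: reject H0 if |t| exceeds this
    one_tailed = stats.t.ppf(0.95, df)    # ~1.686: reject H0 if t exceeds this
    print(two_tailed, one_tailed)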

5.17 5) Conduct the test: compare the t-statistic to the rejection region and conclude whether the data reject or fail to reject the null hypothesis H0. Example: food expenditure, H0: β2 = 0 versus H1: β2 > 0. Conclusion?
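
A sketch of the decision step with placeholder numbers (illustrative values, not the slide's food expenditure estimates):

    from scipy import stats

    b2, se_b2, c = 0.13, 0.03, 0.0        # assumed estimate, std. error, hypothesized value
    t_stat = (b2 - c) / se_b2
    t_crit = stats.t.ppf(0.95, 38)        # one-tailed test, H1: beta2 > 0
    if t_stat > t_crit:
        print(f"t = {t_stat:.2f} > {t_crit:.3f}: reject H0")
    else:
        print(f"t = {t_stat:.2f} <= {t_crit:.3f}: fail to reject H0")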

5.18 6) Think about possible errors. We never know for sure whether we have made an error, because the truth is never revealed to us; we can only analyze the probability of making an error. When we set our level of significance, we are actually setting the probability of a Type I error. Why? Suppose H0 is true: 5% of the time we will get samples of data that generate a test statistic lying in the rejection region, leading us to reject H0 when in fact it is true. We can make the probability of a Type I error smaller by using a 1% level of significance instead of 5%.

                              The truth
  Our decision            H0 is true        H0 is false
  Reject H0               Type I error      Correct
  Fail to reject H0       Correct           Type II error
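
An illustrative simulation (an added sketch with assumed data-generating values, not from the slides) showing that with a 5% one-tailed test the rejection rate when H0 is true is indeed about 5%:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    T, reps = 40, 10_000
    x = rng.uniform(5, 30, size=T)        # keep the regressor fixed across samples
    t_crit = stats.t.ppf(0.95, T - 2)
    xbar = x.mean()
    Sxx = np.sum((x - xbar) ** 2)

    rejections = 0
    for _ in range(reps):
        y = 40 + 0.0 * x + rng.normal(0, 8, size=T)   # H0 is true: beta2 = 0
        b2 = np.sum((x - xbar) * (y - y.mean())) / Sxx
        b1 = y.mean() - b2 * xbar
        resid = y - (b1 + b2 * x)
        se_b2 = np.sqrt(np.sum(resid ** 2) / (T - 2) / Sxx)
        if b2 / se_b2 > t_crit:
            rejections += 1                           # Type I error

    print(rejections / reps)                          # close to 0.05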

5.19 A Type II error occurs when we fail to reject H0 when in fact it is false (meaning the alternative hypothesis H1 is true). To measure the probability of this error occurring, we need a more specific H1.

5.20 7) P-values. As an alternative to specifying the level of significance for a test, we can calculate the p-value of the test, which stands for "probability value." It is simply the probability of getting the sample test statistic, or something more extreme, under the assumption that H0 is true. Suppose H0: β2 = 0 and H1: β2 > 0, and our estimate b2 yields a t-statistic of 4.20. The p-value is P(t ≥ 4.20) = area in the right tail. In Excel, use this formula: =TDIST(4.2,38,1).

5.21 For a two-tailed test, we double the one-tailed probability. Suppose H0: β2 = 0 and H1: β2 ≠ 0, and our t-statistic is again 4.20. The p-value is 2 × P(t ≥ 4.20). In Excel, use this formula: =TDIST(4.2,38,2).
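
The same p-values can be computed with scipy.stats instead of Excel (an added sketch, not part of the slides):

    from scipy import stats

    t_stat, df = 4.2, 38
    p_one_tailed = stats.t.sf(t_stat, df)        # matches =TDIST(4.2,38,1)
    p_two_tailed = 2 * stats.t.sf(t_stat, df)    # matches =TDIST(4.2,38,2)
    print(p_one_tailed, p_two_tailed)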

5.22 Least Squares Predictor. This "predictor" is a random variable because it is a function of b1 and b2, which are random variables. Suppose x = x0; the model predicts ŷ0 = b1 + b2·x0. The forecast error is f = ŷ0 − y0 = (b1 + b2·x0) − (β1 + β2·x0 + e0). The variance of this error tells us about the precision of the prediction:
var(f) = σ²[1 + 1/T + (x0 − x̄)²/Σ(x_t − x̄)²]

5.23 An estimator of var(f) replaces σ² with its estimator σ̂²; the square root of the estimated variance is the standard error of the forecast, se(f). We can now construct a confidence interval for our prediction:
ŷ0 ± t_c·se(f)
Example: predicting food expenditure at a chosen income level x0.
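
A minimal sketch of the prediction-interval calculation on made-up data, assuming the forecast variance formula above (all numbers are illustrative):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    T = 40
    x = rng.uniform(5, 30, size=T)
    y = 40 + 0.13 * x + rng.normal(0, 8, size=T)

    xbar = x.mean()
    Sxx = np.sum((x - xbar) ** 2)
    b2 = np.sum((x - xbar) * (y - y.mean())) / Sxx
    b1 = y.mean() - b2 * xbar
    sigma2_hat = np.sum((y - b1 - b2 * x) ** 2) / (T - 2)

    x0 = 20.0                                     # point at which we predict
    y0_hat = b1 + b2 * x0
    var_f = sigma2_hat * (1 + 1 / T + (x0 - xbar) ** 2 / Sxx)
    tc = stats.t.ppf(0.975, T - 2)
    print(y0_hat - tc * np.sqrt(var_f), y0_hat + tc * np.sqrt(var_f))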

5.24 The Idea Behind Hypothesis Testing.
1) The probability distribution for b2 is centered at β2, which is an unknown parameter. [Remember that E(b2) = β2.]
2) Assume a value for β2: the value we assume is the value of β2 in the null hypothesis. By assuming a value, we tie down the distribution for b2 (we center the distribution for b2 at the assumed value of β2).
3) Use a sample of data on X and Y to calculate the estimate b2.
4) Take this value of b2 and match it up to the distribution from step 2). Does the value of b2 fall near the center of the distribution or out in the tails? If it falls near the center, this value of b2 has a high probability of occurring under the assumed β2 value, so the assumed value is said to be consistent with the data. If, on the other hand, the b2 value falls into the tails, it has a low probability of occurring under the assumed value, so the assumed value is not consistent with the data. What it means to be out in the tails or near the center is determined by setting a significance level and the corresponding rejection region.