Regression, Econ 240A

Presentation transcript:

1 Regression Econ 240A

2 Retrospective
- Week One: descriptive statistics, exploratory data analysis
- Week Two: probability, binomial distribution
- Week Three: normal distribution; interval estimation, hypothesis testing, decision theory

3 Last Week
- Bivariate relationships
- Correlation and analysis of variance

4 Outline
- A cognitive device to help understand the formulas for estimating the slope and the intercept, as well as the analysis of variance
- Table of Analysis of Variance (ANOVA) for regression
- F distribution for testing the significance of the regression, i.e. does the independent variable, x, significantly explain the dependent variable, y?

5 Outline (Cont.)
- The coefficient of determination, $R^2$, and the coefficient of correlation, $r$
- Estimate of the error variance, $\sigma^2$
- Hypothesis tests on the slope, $b$

6 Part I: A Cognitive Device

7 A Cognitive Device: The Conceptual Model
- (1) $y_i = a + b\,x_i + e_i$
- Take expectations, E:
- (2) $E y_i = a + b\,E x_i + E e_i$, where we assume (3) $E e_i = 0$
- Subtract (2) from (1) to obtain the model in deviations:
- (4) $[y_i - E y_i] = b\,[x_i - E x_i] + e_i$
- Multiply (4) by $[x_i - E x_i]$ and take expectations:

8 A Cognitive Device (Cont.)
- (5) $E\{[y_i - E y_i][x_i - E x_i]\} = b\,E[x_i - E x_i]^2 + E\{e_i [x_i - E x_i]\}$, where we assume $E\{e_i [x_i - E x_i]\} = 0$, i.e. e and x are independent
- By definition, (6) $\mathrm{cov}(y,x) = b\,\mathrm{var}(x)$, i.e.
- (7) $b = \mathrm{cov}(y,x) / \mathrm{var}(x)$
- The corresponding empirical estimate, by the method of moments: (8) $\hat b = \sum_i (y_i - \bar y)(x_i - \bar x) \big/ \sum_i (x_i - \bar x)^2$
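
The moment estimate in (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, the function name `method_of_moments_line` is hypothetical, and the intercept is taken from the sample analogue of (2), so the line passes through the sample means.

```python
def method_of_moments_line(x, y):
    """Return (a_hat, b_hat) with b_hat = sample cov(y, x) / sample var(x)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample moments; the 1/n factors cancel in the ratio.
    cov_yx = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - x_bar) ** 2 for xi in x) / n
    b_hat = cov_yx / var_x
    # Sample analogue of (2): y_bar = a_hat + b_hat * x_bar.
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat


# Made-up illustrative data, not from the lecture.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a_hat, b_hat = method_of_moments_line(x, y)
print(f"intercept = {a_hat:.3f}, slope = {b_hat:.3f}")
```

For these five points the sample covariance is 1.2 and the sample variance of x is 2.0, so the slope is 0.6 and the intercept is 2.2.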

9 A Cognitive Device (Cont.)
- The empirical counterpart to (2): (9) $\bar y = \hat a + \hat b\,\bar x$
- Square both sides of (4) and take expectations:
- (10) $E[y_i - E y_i]^2 = b^2 E[x_i - E x_i]^2 + 2b\,E\{e_i [x_i - E x_i]\} + E[e_i]^2$
- where, as in (5), $E\{e_i [x_i - E x_i]\} = 0$, i.e. the explanatory variable x and the error e are assumed to be independent, so $\mathrm{cov}(e,x) = 0$

10 A Cognitive Device (Cont.)
- From (10), by definition,
- (11) $\mathrm{var}(y) = b^2\,\mathrm{var}(x) + \mathrm{var}(e)$. This is the partition of the total variance in y into the variance explained by x, $b^2\,\mathrm{var}(x)$, and the unexplained or error variance, $\mathrm{var}(e)$.
- The empirical counterpart to (11) is that the total sum of squares equals the explained sum of squares plus the unexplained sum of squares: TSS = ESS + USS

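11 A Cognitive Device (Cont.)
- From Eq. 7, substitute for b in Eq. 11: $\mathrm{var}(y) = [\mathrm{cov}(y,x)]^2 / \mathrm{var}(x) + \mathrm{var}(e)$
- Divide by $\mathrm{var}(y)$: $1 = [\mathrm{cov}(y,x)]^2 / [\mathrm{var}(y)\,\mathrm{var}(x)] + \mathrm{var}(e)/\mathrm{var}(y)$, or $1 = r^2 + \mathrm{var}(e)/\mathrm{var}(y)$, where r is the correlation coefficient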
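
As a numerical check of the variance partition in Eq. 11 and of $1 = r^2 + \mathrm{var}(e)/\mathrm{var}(y)$, the following Python sketch uses sample moments on made-up data. The variable names are illustrative assumptions, not part of the lecture.

```python
# Made-up illustrative data, not from the lecture.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample moments (divisor n throughout, so the identities hold exactly).
var_x = sum((xi - x_bar) ** 2 for xi in x) / n
var_y = sum((yi - y_bar) ** 2 for yi in y) / n
cov_yx = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y)) / n

b_hat = cov_yx / var_x                      # Eq. 7
a_hat = y_bar - b_hat * x_bar               # line through the sample means
residuals = [yi - a_hat - b_hat * xi for xi, yi in zip(x, y)]
var_e = sum(e ** 2 for e in residuals) / n  # residuals have mean zero

explained = b_hat ** 2 * var_x
r_squared = cov_yx ** 2 / (var_x * var_y)

print(f"var y = {var_y:.3f}, explained + unexplained = {explained + var_e:.3f}")
print(f"r^2 + var e / var y = {r_squared + var_e / var_y:.3f}")
```

Here the explained variance is 0.72 and the error variance is 0.48, which add up to the total variance of 1.2, and $r^2 = 0.6$.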

12 Population Model and Sample Model Side by Side

13 Conceptual vs. Fitted Model
Conceptual:
- (1) $y_i = a + b\,x_i + e_i$
- Take expectations, E:
- (2) $E y = a + b\,E x + E e_i$
- (3) where $E e_i = 0$
- Subtract (2) from (1):
- (4) $[y_i - E y] = b\,[x_i - E x] + e_i$
Fitted:
- Minimize the sum of squared residuals, $\sum_i \hat e_i^2 = \sum_i (y_i - \hat a - \hat b\,x_i)^2$

14 Conceptual vs. Fitted (Cont.)
Conceptual:
- Multiply (4) by $[x_i - E x]$ and take expectations, E:
- $E\{[y_i - E y][x_i - E x]\} = b\,E[x_i - E x]^2 + E\{e_i [x_i - E x]\}$,
- (5) where $E\{e_i [x_i - E x]\} = 0$
- (6) $\mathrm{cov}(y,x) = b\,\mathrm{var}(x)$
- (7) $b = \mathrm{cov}(y,x) / \mathrm{var}(x)$
Fitted:
- First-order condition
- Compare (3) and (vi)
- From (v), the fitted line goes through the sample means

15 Conceptual vs. Fitted (Cont.)

16 Part II: ANOVA in Regression

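17 ANOVA
- Testing the significance of the regression, i.e. does x significantly explain y?
- $F_{1,\,n-2} = \mathrm{EMS}/\mathrm{UMS}$, the explained mean square divided by the unexplained mean square
- Under the null hypothesis, this statistic follows the F distribution with 1 degree of freedom in the numerator and n - 2 degrees of freedom in the denominator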

18 Table of Analysis of Variance (ANOVA): $F_{1,\,n-2}$ = Explained Mean Square / Error Mean Square
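
The entries of a regression ANOVA table can be computed directly from the sums of squares. The Python sketch below is a minimal illustration for the bivariate case, on made-up data; the function name `regression_anova` and the dictionary layout are assumptions made for this example.

```python
def regression_anova(x, y):
    """ANOVA quantities for a bivariate least-squares regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar
    fitted = [a_hat + b_hat * xi for xi in x]

    ess = sum((fi - y_bar) ** 2 for fi in fitted)           # explained
    uss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total

    ems = ess / 1        # explained mean square, 1 degree of freedom
    ums = uss / (n - 2)  # unexplained mean square, n - 2 degrees of freedom
    return {"ESS": ess, "USS": uss, "TSS": tss,
            "EMS": ems, "UMS": ums, "F": ems / ums, "df": (1, n - 2)}


# Made-up illustrative data, not from the lecture.
table = regression_anova([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in table.items():
    print(name, value)
```

For this data set ESS = 3.6, USS = 2.4, and TSS = 6, so F = 3.6 / (2.4 / 3) = 4.5 with (1, 3) degrees of freedom.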

19 Example from Lab Four
- Linear trend model for UC Budget

21 Time index, t = 0 for , t=1 for etc

22 Example from Lab Four
- Exponential trend model for UC Budget
- $\mathrm{UCBud}(t) = \exp[a + b\,t + e(t)]$
- Taking the logarithms of both sides:
- $\ln \mathrm{UCBud}(t) = a + b\,t + e(t)$
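
The log transformation turns the exponential trend into a linear regression of ln(budget) on t. The Python sketch below illustrates this on a synthetic series that grows at exactly 5% per period from a level of 100. These numbers are assumptions chosen for the example and are not the UC Budget data.

```python
import math

# Synthetic series: level 100 at t = 0, continuous growth rate 0.05, no error.
t = list(range(6))
budget = [100.0 * math.exp(0.05 * ti) for ti in t]

# Regress ln(budget) on t by least squares.
log_budget = [math.log(v) for v in budget]
n = len(t)
t_bar = sum(t) / n
z_bar = sum(log_budget) / n
b_hat = (sum((zi - z_bar) * (ti - t_bar) for ti, zi in zip(t, log_budget))
         / sum((ti - t_bar) ** 2 for ti in t))
a_hat = z_bar - b_hat * t_bar

# b_hat estimates the growth rate; exp(a_hat) estimates the level at t = 0.
level_at_zero = math.exp(a_hat)
print(f"growth rate = {b_hat:.4f}, level at t = 0: {level_at_zero:.2f}")
```

Because the synthetic series has no error term, the fit recovers the growth rate 0.05 and the starting level 100 up to rounding.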

24 Time index, t = 0 for , t=1 for etc. Exp( ) = 376.9

25 Part III: The F Distribution

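26 The F Distribution
- The density function of the F distribution, for $F > 0$: $f(F) = \frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{F^{(\nu_1-2)/2}}{\left(1 + \nu_1 F/\nu_2\right)^{(\nu_1+\nu_2)/2}}$
- $\nu_1$ and $\nu_2$ are the numerator and denominator degrees of freedom.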

27 The F Distribution
- This density function generates a rich family of distributions, depending on the values of $\nu_1$ and $\nu_2$.
- [Figure: F densities for $(\nu_1, \nu_2)$ = (5, 10), (50, 10), (5, 10), and (5, 1)]

28 Determining Values of F
- The values of the F variable can be found in the F table (Table 6(a) in Appendix B, for a type I error of 5%) or in Excel.
- The entries in the table are the values $F_A$ of the F variable with right-hand tail probability A, i.e. $P(F_{\nu_1,\nu_2} > F_A) = A$.
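
As an alternative to the table, a critical value $F_A$ can be approximated numerically. The Python sketch below integrates the standard F density with Simpson's rule and solves $P(F > F_A) = A$ by bisection. It is a rough illustration and not a replacement for a statistics library: the function names are hypothetical, and the integration assumes $\nu_1 > 2$ so that the density is zero at the origin.

```python
import math


def f_density(x, nu1, nu2):
    """Density of the F distribution with (nu1, nu2) degrees of freedom."""
    if x <= 0:
        return 0.0  # valid at x = 0 only when nu1 > 2
    log_const = (math.lgamma((nu1 + nu2) / 2) - math.lgamma(nu1 / 2)
                 - math.lgamma(nu2 / 2) + (nu1 / 2) * math.log(nu1 / nu2))
    return math.exp(log_const + (nu1 / 2 - 1) * math.log(x)
                    - ((nu1 + nu2) / 2) * math.log(1 + nu1 * x / nu2))


def f_upper_tail(x, nu1, nu2, steps=2000):
    """P(F > x), computed as 1 minus a Simpson's-rule integral over [0, x]."""
    h = x / steps
    total = f_density(0.0, nu1, nu2) + f_density(x, nu1, nu2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_density(i * h, nu1, nu2)
    return 1.0 - total * h / 3


def f_critical(alpha, nu1, nu2):
    """Value F_A with right-hand tail probability alpha, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_upper_tail(mid, nu1, nu2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


critical_value = f_critical(0.05, 5, 10)
print(f"F critical value for A = 0.05, nu1 = 5, nu2 = 10: {critical_value:.3f}")
```

For $\nu_1 = 5$, $\nu_2 = 10$ and A = 0.05 the result should agree with the tabulated value of about 3.33.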

30 Time index, t = 0 for , t=1 for etc

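31 Part IV: The Pearson Coefficient of Correlation, r
- The Pearson coefficient of correlation, r, is (13) $r = \mathrm{cov}(y,x) \big/ \left([\mathrm{var}(x)]^{1/2} [\mathrm{var}(y)]^{1/2}\right)$
- Estimated counterpart: (14) $\hat r = \sum_i (y_i - \bar y)(x_i - \bar x) \big/ \left[\sum_i (x_i - \bar x)^2 \sum_i (y_i - \bar y)^2\right]^{1/2}$
- Comparing (13) to (7), note that (15) $r\,[\mathrm{var}(y)]^{1/2} / [\mathrm{var}(x)]^{1/2} = b$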

32 A Cognitive Device (Cont.)
- (5) $E\{[y_i - E y_i][x_i - E x_i]\} = b\,E[x_i - E x_i]^2 + E\{e_i [x_i - E x_i]\}$, where we assume $E\{e_i [x_i - E x_i]\} = 0$, i.e. e and x are independent
- By definition, (6) $\mathrm{cov}(y,x) = b\,\mathrm{var}(x)$, i.e.
- (7) $b = \mathrm{cov}(y,x) / \mathrm{var}(x)$
- The corresponding empirical estimate: (8) $\hat b = \sum_i (y_i - \bar y)(x_i - \bar x) \big/ \sum_i (x_i - \bar x)^2$

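33 Part IV (Cont.): The Coefficient of Determination, $R^2$
- For a bivariate regression of y on a single explanatory variable, x, $R^2 = r^2$, i.e. the coefficient of determination equals the square of the Pearson coefficient of correlation
- Using (14) to square the estimate of r: (16) $\hat r^2 = \left[\sum_i (y_i - \bar y)(x_i - \bar x)\right]^2 \big/ \left[\sum_i (x_i - \bar x)^2 \sum_i (y_i - \bar y)^2\right]$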

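34 Part IV (Cont.)
- Using (8), (16) can be expressed as $\hat r^2 = \hat b^2 \sum_i (x_i - \bar x)^2 \big/ \sum_i (y_i - \bar y)^2 = \mathrm{ESS}/\mathrm{TSS}$
- And so $\hat r^2 = 1 - \mathrm{USS}/\mathrm{TSS}$
- In general, including multivariate regression, the estimate of the coefficient of determination, $\hat R^2$, can be calculated from (21) $\hat R^2 = 1 - \mathrm{USS}/\mathrm{TSS}$.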

35 Part IV (Cont.)
- For the bivariate regression, the F-test can be calculated from $F_{1,\,n-2} = [(n-2)/1]\,[\mathrm{ESS}/\mathrm{TSS}] / [\mathrm{USS}/\mathrm{TSS}] = (n-2)\,\mathrm{ESS}/\mathrm{USS} = (n-2)\,R^2/(1-R^2)$
- For a multivariate regression with k explanatory variables, the F-test can be calculated as $F_{k,\,n-k-1} = [(n-k-1)/k]\,\mathrm{ESS}/\mathrm{USS} = [(n-k-1)/k]\,R^2/(1-R^2)$
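
The relation between F and $R^2$ is simple enough to wrap in a helper. The Python sketch below is illustrative only: the function name `f_from_r_squared` is hypothetical, and the inputs in the usage lines are made-up values rather than results from the lecture.

```python
def f_from_r_squared(r_squared, n, k=1):
    """F statistic with (k, n - k - 1) degrees of freedom from R^2.

    n is the number of observations and k the number of explanatory
    variables; k = 1 gives the bivariate formula (n - 2) * R^2 / (1 - R^2).
    """
    if not 0 <= r_squared < 1:
        raise ValueError("R^2 must lie in [0, 1)")
    return ((n - k - 1) / k) * r_squared / (1 - r_squared)


# Made-up inputs: R^2 = 0.6 with 5 observations and one regressor.
f_bivariate = f_from_r_squared(0.6, n=5)
# Made-up inputs: R^2 = 0.5 with 23 observations and two regressors.
f_two_regressors = f_from_r_squared(0.5, n=23, k=2)
print(f_bivariate, f_two_regressors)
```

The first call gives 3 × 0.6 / 0.4 = 4.5 and the second gives (20 / 2) × 1 = 10.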

36 Time index, t = 0 for , t=1 for etc. $F_{1,\,33} = (n-2)\,R^2/(1-R^2) = 33 \times (0.938/0.062) \approx 500$

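37 Part V: Estimate of the Error Variance
- $\mathrm{var}(e_i) = \sigma^2$
- The estimate is the unexplained mean square, UMS; for the bivariate regression, $\hat\sigma^2 = \mathrm{UMS} = \mathrm{USS}/(n-2)$
- The standard error of the regression is $\hat\sigma = \sqrt{\mathrm{UMS}}$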

38 Time index, t = 0 for , t=1 for etc

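39 Part VI: Hypothesis Tests on the Slope
- Hypotheses: $H_0: b = 0$; $H_A: b > 0$
- Test statistic: $t = (\hat b - 0)/s_{\hat b}$, where $s_{\hat b}$ is the standard error of the estimated slope; under $H_0$ it has n - 2 degrees of freedom
- Set the probability of a type I error, say 5%
- Note: for bivariate regression, the square of the t-statistic for the null that the slope is zero is the F-statistic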
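
The slope test can be sketched in Python as follows. The example assumes the usual least-squares standard error of the slope, the square root of UMS divided by the sum of squared deviations of x, and uses made-up data. The function name `slope_t_test` is hypothetical.

```python
import math


def slope_t_test(x, y):
    """Return (b_hat, se_b, t, F) for the null hypothesis that the slope is 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar

    uss = sum((yi - a_hat - b_hat * xi) ** 2 for xi, yi in zip(x, y))
    ess = b_hat ** 2 * s_xx
    ums = uss / (n - 2)            # estimate of the error variance
    se_b = math.sqrt(ums / s_xx)   # standard error of the slope
    t_stat = (b_hat - 0.0) / se_b  # n - 2 degrees of freedom
    f_stat = ess / ums             # F with (1, n - 2) degrees of freedom
    return b_hat, se_b, t_stat, f_stat


# Made-up illustrative data, not from the lecture.
b_hat, se_b, t_stat, f_stat = slope_t_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"b = {b_hat:.3f}, se = {se_b:.4f}, t = {t_stat:.4f}, "
      f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

With these data the slope is 0.6, its standard error is about 0.283, and t is about 2.12, whose square equals the F statistic of 4.5.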

40 $t = (\hat b - 0)/3.76 = 22.4$; $t^2 = F$, i.e. $22.36 \times 22.36 \approx 500$

41 Part VII: Student’s t-Distribution

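42 The Student t Distribution
- The Student t density function: $f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}$
- $\nu$ is the parameter of the Student t distribution (the degrees of freedom)
- $E(t) = 0$, $V(t) = \nu/(\nu - 2)$ (for $\nu > 2$)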

43 The Student t Distribution
- [Figure: Student t densities for $\nu = 3$ and $\nu = 10$]

44 Determining Student t Values
- The Student t distribution is used extensively in statistical inference.
- Thus, it is important to determine values of $t_A$ associated with a given number of degrees of freedom.
- We can do this using the t table (Table 4 in Appendix B) or Excel.

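45 Using the t Table
- The t distribution is symmetrical around 0, so the lower-tail value is $-t_A$.
- The table provides the t values ($t_A$) for which $P(t > t_A) = A$, in columns $t_{.100}$, $t_{.05}$, $t_{.025}$, $t_{.01}$, $t_{.005}$.
- [Figure: t density with right-tail area A = .05 and $t_A = 1.812$]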
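
A table entry such as 1.812 can be reproduced numerically. The Python sketch below integrates the standard Student t density with Simpson's rule and finds $t_A$ by bisection, using the symmetry of the distribution around 0. It is a rough illustration with hypothetical function names. The choice of 10 degrees of freedom is an assumption made here because 1.812 is the standard tabulated value for A = .05 with $\nu = 10$.

```python
import math


def t_density(x, nu):
    """Density of the Student t distribution with nu degrees of freedom."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - ((nu + 1) / 2) * math.log(1 + x * x / nu))


def t_upper_tail(x, nu, steps=2000):
    """P(t > x) for x >= 0: one half minus the Simpson integral over [0, x]."""
    h = x / steps
    total = t_density(0.0, nu) + t_density(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, nu)
    return 0.5 - total * h / 3


def t_critical(alpha, nu):
    """Value t_A with P(t > t_A) = alpha, for alpha < 0.5, by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, nu) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_value = t_critical(0.05, 10)
print(f"t critical value for A = 0.05, nu = 10: {t_value:.3f}")
```

By symmetry, the corresponding lower-tail value is the negative of the result.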

46 Problem 6.32 in Text Table of Joint Probabilities

47 Problem 6.32
- The method of instruction in college and university applied statistics courses is changing. Historically, most courses were taught with an emphasis on manual calculation. The alternative is to employ a computer and a software package to perform the calculations. An analysis of applied statistics courses investigated whether the instructor’s educational background is primarily mathematics (or statistics) or some other field.

48 Problem 6.32
- A. What is the probability that a randomly selected applied statistics course instructor whose education was in statistics emphasizes manual calculations?
- B. What proportion of applied statistics courses employ a computer and software?
- C. Are the educational background of the instructor and the way his or her course is taught independent?

49 Midterm 2000 (15 points)
The following table shows the results of regressing the natural logarithm of California General Fund expenditures, in billions of nominal dollars, against year, beginning in 1968 and ending in . A plot of actual, estimated and residual values follows.
- How much of the variance in the dependent variable is explained by trend?
- What is the meaning of the F statistic in the table? Is it significant?
- Interpret the estimated slope.
- If General Fund expenditures were $ billion in California for fiscal year , provide a point estimate for state expenditures for

50 Midterm 2000 (Cont.)
- A state senator believes that state expenditures in nominal dollars have grown over time at 7% a year. Is the senator in the ballpark, or is his impression significantly below the estimated rate, using a 5% level of significance?
- If you were an aide to the Senator, how might you criticize this regression?
