Part 1: Simple Linear Model 1-1/30 Regression Models. Professor William Greene, Stern School of Business, IOMS Department, Department of Economics
Part 1: Simple Linear Model 1-2/30 Regression and Forecasting Models. Part 1 – Simple Linear Model
Part 1: Simple Linear Model 1-3/30 Theory. Demand Theory: Q = f(Price). “The Law of Demand”: demand curves slope downward. What does “ceteris paribus” mean here?
Part 1: Simple Linear Model 1-4/30 Data on the U.S. Gasoline Market Quantity = G = Expenditure / Price
Part 1: Simple Linear Model 1-5/30 Shouldn’t Demand Curves Slope Downward?
Part 1: Simple Linear Model 1-6/30 Data on 62 Movies in 2010
Part 1: Simple Linear Model 1-7/30 Average Box Office Revenue is about $20.7 Million
Part 1: Simple Linear Model 1-8/30 Is There a Theory for This? Scatter plot of box office revenues vs. number of “Can’t Wait To See It” votes on Fandango for 62 movies.
Part 1: Simple Linear Model 1-9/30 Average Box Office by Internet Buzz Index = Average Box Office for Buzz in Interval
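Averaging box office within buzz intervals amounts to grouping the observations by ranges of x and taking the mean of y in each range. A minimal sketch, assuming integer-valued buzz scores and an interval width of 10; the function name `binned_means` and all numbers are made up for illustration and are not the movie data from the slides.

```python
from collections import defaultdict

def binned_means(x, y, width):
    """Average y within consecutive x-intervals of the given width."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        lower = (xi // width) * width  # left edge of the interval containing xi
        groups[lower].append(yi)
    # One average per interval, ordered by the interval's left edge.
    return {lower: sum(vals) / len(vals) for lower, vals in sorted(groups.items())}

# Hypothetical illustration data (NOT the 62-movie sample).
buzz = [1, 4, 12, 17, 23, 28]
box_office = [10.0, 14.0, 18.0, 22.0, 30.0, 34.0]

means = binned_means(buzz, box_office, width=10)
for lower, avg in means.items():
    print(f"buzz in [{lower}, {lower + 10}): average box office = {avg}")
```

If the interval averages rise steadily with the interval's position, that is informal evidence that the mean of y is a function of x, which is the idea the regression model formalizes.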
Part 1: Simple Linear Model 1-10/30 Deterministic Relationship: Not a Theory Expected High Temperatures, August 11-20, 2013, ZIP 10012, NY
Part 1: Simple Linear Model 1-11/30 Probabilistic Relationship What Explains the Noise? Fuel Bill = Function of Rooms + Random Variation
Part 1: Simple Linear Model 1-12/30 Movie Buzz Data Probabilistic Relationship?
Part 1: Simple Linear Model 1-13/30 The Regression Model: $y = \beta_0 + \beta_1 x + \varepsilon$, where y = dependent variable and x = independent variable. The ‘regression’ is the deterministic part, $\beta_0 + \beta_1 x$. The ‘disturbance’ (noise) is $\varepsilon$. The regression model is $E[y|x] = \beta_0 + \beta_1 x$.
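To see what "the regression is the conditional mean" means in practice, one can simulate many outcomes at a single value of x and compare their average to the deterministic part. This is a sketch under stated assumptions: the parameter values, the normal distribution for the disturbance, and the function name `simulate` are all hypothetical choices, not taken from the slides.

```python
import random

def simulate(beta0, beta1, xs, sigma, seed=0):
    """Draw y = beta0 + beta1*x + noise, with mean-zero normal noise (an assumption)."""
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

beta0, beta1 = 2.0, 0.5   # hypothetical intercept and slope
x_value = 4.0
xs = [x_value] * 20000    # many observations at the same x

ys = simulate(beta0, beta1, xs, sigma=1.0)
mean_y_given_x = sum(ys) / len(ys)          # sample average of y at this x
regression_value = beta0 + beta1 * x_value  # deterministic part of the model

print(mean_y_given_x, regression_value)
```

Individual draws scatter around the line, but their average at a given x settles close to the deterministic part because the disturbance averages out to zero.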
Part 1: Simple Linear Model 1-14/30 Linear Regression Model. [Figure: the line $E[y|x] = \beta_0 + \beta_1 x$ drawn on axes y and x; $\beta_0$ = y intercept, $\beta_1$ = slope.]
Part 1: Simple Linear Model 1-15/30 The Model. Constructed to provide a framework for interpreting the observed data: what is the meaning of the observed relationship (assuming there is one)? How it’s used: (1) Prediction: what reason is there to assume that we can use sample observations to predict outcomes? (2) Testing relationships.
Part 1: Simple Linear Model 1-16/30 The slope is the interesting quantity. Each additional year of education is associated with an increase of in disability adjusted life expectancy.
Part 1: Simple Linear Model 1-17/30 A Cost Model (Electricity.mpj). Total cost in millions of dollars; output in millions of KWH; N = 123 American electric utilities. Model: $\text{Cost} = \beta_0 + \beta_1\,\text{KWH} + \varepsilon$
Part 1: Simple Linear Model 1-18/30 Cost Relationship
Part 1: Simple Linear Model 1-19/30 Sample Regression
Part 1: Simple Linear Model 1-20/30 Interpreting the Model: $\text{Cost} = b_0 + b_1\,\text{Output} + e$, where Cost is in millions of dollars and Output is in millions of KWH. Fixed cost = cost when output is 0 = $b_0$ = 2.44 million dollars. Marginal cost = change in cost / change in output = $b_1$ million dollars per million KWH = $b_1$ dollars/KWH = $100\,b_1$ cents/KWH.
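The unit arithmetic can be sketched in a few lines. The fixed cost of 2.44 (millions of dollars) comes from the slide; the slope used below is a hypothetical placeholder, since the estimated slope is not available here, and the function names are likewise illustrative.

```python
FIXED_COST = 2.44  # millions of dollars, as stated on the slide

def predicted_cost(output_million_kwh, slope):
    """Fitted cost in millions of dollars for a given output in millions of KWH."""
    return FIXED_COST + slope * output_million_kwh

def slope_to_cents_per_kwh(slope):
    """Convert a slope in (million dollars)/(million KWH) to cents per KWH.

    The two 'million' factors cancel, so the slope is already in dollars/KWH;
    multiplying by 100 converts dollars to cents.
    """
    return slope * 100.0

hypothetical_slope = 0.005  # NOT the estimated value; for illustration only

print(predicted_cost(0.0, hypothetical_slope))     # cost at zero output = fixed cost
print(slope_to_cents_per_kwh(hypothetical_slope))  # marginal cost in cents/KWH
```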
Part 1: Simple Linear Model 1-21/30 Covariation and Causality Does more education make you live longer (on average)?
Part 1: Simple Linear Model 1-22/30 Causality? Height (inches) and Income (dollars per month) in first post-MBA job (men). WSJ, 12/30/86. [Table: pairs of height (Ht.) and income (Inc.).] Estimated Income = $b_0 + b_1\,\text{Height}$
Part 1: Simple Linear Model 1-23/30 How to compute the y intercept, $b_0$, and the slope, $b_1$, in $y = b_0 + b_1 x$.
Part 1: Simple Linear Model 1-24/30 Least Squares Regression
Part 1: Simple Linear Model 1-25/30 Fitting a Line to a Set of Points. Choose $b_0$ and $b_1$ to minimize the sum of squared residuals (Gauss’s method of least squares). [Figure: data points $(x_i, y_i)$, predictions $b_0 + b_1 x_i$ on the fitted line, and residuals $y_i - (b_0 + b_1 x_i)$.]
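The least squares criterion can be made concrete by scoring candidate lines on the same data. A minimal sketch, with a hypothetical four-point data set and an illustrative function name `sum_squared_residuals`:

```python
def sum_squared_residuals(b0, b1, x, y):
    """Sum over observations of (y_i - (b0 + b1*x_i))^2."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

ssr_a = sum_squared_residuals(0.0, 2.0, x, y)  # candidate line y = 2x
ssr_b = sum_squared_residuals(1.0, 1.0, x, y)  # candidate line y = 1 + x

print(ssr_a, ssr_b)  # the candidate with the smaller value fits better
```

Least squares picks, among all possible pairs of intercept and slope, the one for which this quantity is smallest.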
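Part 1: Simple Linear Model 1-26/30 Computing the Least Squares Parameters $b_0$ and $b_1$
Part 1: Simple Linear Model 1-27/30 $b_0 = \bar{y} - b_1 \bar{x}$, $b_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$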
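A minimal sketch of the computation, using the standard closed-form least squares expressions: the slope is the sum of cross-deviations of x and y from their means divided by the sum of squared deviations of x, and the intercept is the mean of y minus the slope times the mean of x. The function name `least_squares` and the data are hypothetical.

```python
def least_squares(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation; the slope is not defined")
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

b0, b1 = least_squares(x, y)
print(f"intercept b0 = {b0:.4f}, slope b1 = {b1:.4f}")
```

The guard on `sxx` reflects a real requirement: if every x is the same, no slope can be estimated from the data.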
Part 1: Simple Linear Model 1-28/30 Least Squares Uses Calculus
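A sketch of the standard calculus argument, written in the $b_0$, $b_1$ notation of the preceding slides: differentiate the sum of squared residuals with respect to each parameter, set both derivatives to zero, and solve. The symbol $S$ for the sum of squares is introduced here for convenience.

```latex
\begin{align*}
S(b_0, b_1) &= \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right)^2 \\[4pt]
\frac{\partial S}{\partial b_0} &= -2 \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right) = 0 \\[4pt]
\frac{\partial S}{\partial b_1} &= -2 \sum_{i=1}^{n} x_i \left( y_i - b_0 - b_1 x_i \right) = 0 \\[4pt]
\text{First equation:}\quad b_0 &= \bar{y} - b_1 \bar{x} \\[4pt]
\text{Substituting into the second:}\quad
b_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{align*}
```

The first condition says the residuals sum to zero; the second says the residuals are uncorrelated with x in the sample. The solution requires that the $x_i$ are not all equal, so that the denominator is positive.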
Part 1: Simple Linear Model 1-29/30 Least squares minimizes the sum of squared deviations from the line.
Part 1: Simple Linear Model 1-30/30 Summary: Theory vs. practice; Linear relationship; Deterministic; Random, stochastic, ‘probabilistic’; Mean is a function of x; Regression relationship; Causality vs. correlation; Least squares