QM222 Class 14 Today’s New topic: What if the Dependent Variable is a Dummy Variable? QM222 Fall 2017 Section A1.

To-dos: We have class tomorrow. Assignment 3 is due tomorrow. Come to office hours (I am holding Monday office hours today, 2:15-3:15, and I will be there later as well) or make an appointment if you don't understand the class material or if you are stuck on your data.

Assignment 3 has several parts. The Question 1 log file should be posted on QuestromTools→Assignments. The Question 1 data file should be uploaded to https://tinyurl.com/qm222a1; put your name in the filename. Hand in written (typed?) answers to Question 2, along with a revised Project Description (questions 1-5) if it has changed. Project Description: some of you never handed one in, and others have changed theirs.

Today we: review how to measure goodness of fit in regressions; get more practice with an in-class exercise; learn how to interpret regressions when the dependent variable is a dummy.

Multiple regression: Why use it? There are 2 reasons to run a multiple regression: (1) to get closer to the "correct/causal" (unbiased) coefficient by controlling for confounding factors; (2) to increase the predictive power of the regression.

Measuring Goodness of Fit with R². R² is the fraction of the variation in Y explained by the regression. It is always between 0 and 1. What do I mean by the variation in Y? The total sum of squared deviations of Y from its mean (SST). The part the regression leaves unexplained is the Sum of Squared Errors (SSE), so R² = 1 - SSE/SST. In a simple regression with ONE X variable, R² = [correlation(X,Y)]².
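A minimal sketch of the R² calculation, using the standard definition R² = 1 - SSE/SST, where SST is the total variation of Y around its mean. The x and y values are made up for illustration and are not course data; the helper names are hypothetical. It also checks the simple-regression fact that R² equals the squared correlation.

```python
import math

# Illustrative numbers only (assumption: not from the course data set).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def mean(values):
    return sum(values) / len(values)


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """R^2 = 1 - SSE/SST for a simple regression of y on x."""
    intercept, slope = simple_ols(x, y)
    my = mean(y)
    sst = sum((yi - my) ** 2 for yi in y)  # total variation in Y
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / sst


def correlation(x, y):
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


r2 = r_squared(x, y)
corr = correlation(x, y)
print(f"R-squared = {r2:.4f}, correlation squared = {corr ** 2:.4f}")
```

In Stata, `regress` reports this R² directly; the sketch only spells out the arithmetic behind that number.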

R² measures the goodness of fit. Higher is better. Compare .9414 to .6408.

Measuring Goodness of Fit with R². What does R² = 1 tell us (with a single X)? It is the same as a correlation of 1 OR -1. It means that the regression predicts Y perfectly. What does R² = 1 tell us (with multiple X's)? It also means that the regression predicts Y perfectly.

Regression of price on size in sq. ft. The Stata command is:
    regress yvariablename xvariablename
For instance, to run a regression of price on size, type:
    regress price size
This R² is quite large: size is a better predictor of the condo price.

Goodness of Fit in a Multiple Regression. If you compare two regressions (on the same Y observations) with the same number of explanatory variables, the one with the higher R² has the better fit. However, every time you add a new variable, you fit Y at least a little better, so R² never goes down, even if the new variable has |t| < 1 and is not at all significant. In multiple regressions, use the adjusted R² instead (right below R² in the Stata output):
    Number of obs = 1085
    F(2, 1082)    = 1627.49
    Prob > F      = 0.0000
    R-squared     = 0.7505
    Adj R-squared = 0.7501
    Root MSE      = 1.3e+05
If you compare two regressions (on the same Y observations) with a different number of explanatory variables, the one with the higher adjusted R² has the better fit.

When you add a new X explanatory variable, what happens to the adjusted R²? The adjusted R² goes up if you add a variable with |t| > 1 (roughly, you are more than 68% confident that its coefficient is not 0). The adjusted R² goes down if you add a variable with |t| < 1 (you are less than 68% confident that its coefficient is not 0). At exactly |t| = 1 it is unchanged.
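A rough sketch of how the adjusted R² relates to R², using the standard formula adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of explanatory variables. The inputs are the figures from the Stata output above (n = 1085, k = 2, F = 1627.49); the function names are hypothetical. Because the printed R² of 0.7505 is rounded, the sketch also recovers a less-rounded R² from the F statistic.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def r2_from_f(f_stat, n, k):
    """Recover R^2 from the overall F statistic F(k, n - k - 1)."""
    return k * f_stat / (k * f_stat + (n - k - 1))


n, k = 1085, 2

# Using the rounded R-squared printed in the output.
adj_from_rounded = adjusted_r2(0.7505, n, k)

# Using R-squared backed out of F(2, 1082) = 1627.49, which is less rounded.
r2_precise = r2_from_f(1627.49, n, k)
adj_from_f = adjusted_r2(r2_precise, n, k)

print(f"adjusted R-squared from rounded R-squared: {adj_from_rounded:.4f}")
print(f"adjusted R-squared from F statistic:       {adj_from_f:.4f}")
```

With only two explanatory variables and over a thousand observations, the penalty for extra variables is tiny, which is why R² and adjusted R² are almost identical here.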

Pay for grades? www.youtube.com/watch?v=tkVcO8M4QVc 4:01 QM222 Fall 2016 Sections E1 and G1

In-class exercise: Pay to Learn? Parts: 1 (a, b, c); 2 (a, d); 4.

Example: Your company wants to know how many people will buy a specific ebook at different prices. Let's say you ran an experiment, randomly offering the book at different prices. You make a dummy variable for "Did they purchase the book?" (0 = no, 1 = purchased one or more).
Visitor #1: 1 (i.e. bought the ebook), P = $10
Visitor #2: 0 (i.e. didn't buy), P = $10
Visitor #3: 0 (i.e. didn't buy), P = $5
Visitor #4: 1 (i.e. bought the ebook), P = $5
Visitor #5: 1 (i.e. bought the ebook), P = $7
What is the average probability that a visitor bought the book? It is just the average of all of the 1's and 0's: here (1 + 0 + 0 + 1 + 1)/5 = 0.6.
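As a small illustration of why the mean of a dummy variable is a probability, the sketch below averages the five purchase dummies from the example, overall and separately by price. The variable names are hypothetical.

```python
# The five visitors from the example: price offered and purchase dummy.
prices = [10, 10, 5, 5, 7]
bought = [1, 0, 0, 1, 1]

# The mean of a 0/1 variable is the share of 1's, i.e. the fraction who bought.
overall_share = sum(bought) / len(bought)

# The same average, computed separately for each price offered.
by_price = {}
for price, b in zip(prices, bought):
    by_price.setdefault(price, []).append(b)
share_by_price = {p: sum(v) / len(v) for p, v in by_price.items()}

print(f"overall share who bought: {overall_share:.2f}")
for price in sorted(share_by_price):
    print(f"price ${price}: share who bought = {share_by_price[price]:.2f}")
```

With so few visitors per price, the by-price shares are very noisy; a regression pools the information across prices.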

Example, continued: for the same five visitors, we said that the average probability that a visitor bought the book is just the average of all of the 1's and 0's. How can we measure how the price affects this probability? We run a regression where the dummy variable is the dependent (Y, left-hand-side) variable and Price is the explanatory X variable. The regression then predicts the "Probability of Purchase" based on price.
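A minimal sketch of such a regression, fitting the purchase dummy on price by ordinary least squares for the five visitors listed in the example. This is hand-rolled arithmetic for illustration, not the course's Stata workflow, and the names are hypothetical. With only five observations the estimates are very noisy and are not meant to match any equation used elsewhere in the lecture.

```python
# The five visitors from the example.
prices = [10, 10, 5, 5, 7]
bought = [1, 0, 0, 1, 1]   # dummy dependent variable (1 = bought)

n = len(prices)
mean_price = sum(prices) / n
mean_bought = sum(bought) / n

# Least-squares slope and intercept for: bought = intercept + slope * price
sxx = sum((p - mean_price) ** 2 for p in prices)
sxy = sum((p - mean_price) * (b - mean_bought) for p, b in zip(prices, bought))
slope = sxy / sxx
intercept = mean_bought - slope * mean_price

print(f"bought = {intercept:.4f} + ({slope:.4f}) * price")

# The fitted value at a given price is read as a predicted probability of purchase.
for p in (5, 7, 10):
    print(f"price ${p}: predicted probability = {intercept + slope * p:.3f}")
```

The slope is read as the change in the probability of purchase per one-dollar increase in price. In Stata the equivalent would be a `regress` of the dummy on price.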

Example: Buy = .9 - .05 Price. If Price = 5, Buy = .9 - .25 = .65, or a 65% probability of buying. If Price = 10, Buy = .9 - .5 = .4, or a 40% probability of buying. QM222 Fall 2016 Section D1

Example continued: Buy = .9 - .05 Price, so at Price = 5 the predicted probability of buying is 65% and at Price = 10 it is 40%. But never extrapolate outside the range of prices you observe. If Price = 20, Buy = .9 - 1.0 = -.1, or a -10% probability of buying, which makes no sense. With a dummy dependent variable, predictions below 0 or above 1 can occur even within the range of X's you observe. If you have this problem in your project, there are other ways to model it that do not use linear regression (for example, logit or probit models).
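The arithmetic in this example can be sketched as follows, using the lecture's illustrative equation Buy = .9 - .05 Price. The function names and the validity check are hypothetical additions that flag predictions which cannot be probabilities.

```python
INTERCEPT = 0.9   # from the example equation Buy = .9 - .05 Price
SLOPE = -0.05


def predicted_buy(price):
    """Predicted probability of purchase from the linear equation."""
    return INTERCEPT + SLOPE * price


def is_valid_probability(p):
    """A probability must lie between 0 and 1."""
    return 0.0 <= p <= 1.0


for price in (5, 10, 20):
    p = predicted_buy(price)
    note = "" if is_valid_probability(p) else "  <- not a valid probability"
    print(f"price ${price}: predicted Buy = {p:.2f}{note}")
```

A straight line keeps falling as price rises, so at a high enough price the prediction drops below 0; nothing in linear regression constrains it to stay between 0 and 1.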

Today we: reviewed how to measure goodness of fit in regressions; got more practice with an in-class exercise; learned how to interpret regressions when the dependent variable is a dummy.