QM222 A1 Nov. 27 More tips on writing your projects


QM222 A1 Nov. 27 More tips on writing your projects QM222 Fall 2017 Section A1

From Nov. 10 notes: How important is each variable?

The t-stat tells you whether the impact of the variable might be zero, i.e. whether it is statistically significant. But how can you tell how much each variable contributes to explaining the variation in Y? In other words, is the variable practically important? Does it make a meaningful difference? I can suggest two ways you can do this.

1. In terms of fit: Using your best regression, drop that variable and see how much the adjusted R-squared changes. THIS IS PARTICULARLY USEFUL FOR SETS OF DUMMIES!
2. For each variable, multiply coef * (max X - min X), where max X is the maximum in your sample (and likewise for min X). This is the largest change in Y that this variable can be responsible for.

QM222 Fall 2016 Section D1

How important is each variable?

3. For (2-category) dummy variables, explain the coefficient in the context of the average, i.e. if the coefficient is 200, how large is this compared to the mean of Y?
4. For multiple-category dummies, each coefficient is relative to the excluded reference category.

More generally, as you discuss your results, make sure it is clear what each coefficient tells us, whether it belongs to a dummy or a numerical variable.
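The arithmetic behind the range calculation (coef times the sample range of X) and the dummy-versus-mean comparison can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and every number except the coefficient of 200 mentioned above is hypothetical.

```python
def max_contribution(coef, x_min, x_max):
    """Largest change in Y the variable can account for over the sample range of X."""
    return coef * (x_max - x_min)


def share_of_mean(coef, mean_y):
    """A dummy's coefficient expressed as a fraction of the average Y."""
    return coef / mean_y


# Hypothetical numbers for illustration only (not from any course dataset).
age_effect = max_contribution(coef=1.5, x_min=18, x_max=65)  # 1.5 * (65 - 18)
dummy_share = share_of_mean(coef=200, mean_y=1000)           # 200 relative to a mean of 1000

print(f"Largest change in Y attributable to age: {age_effect}")
print(f"Dummy coefficient as a share of mean Y: {dummy_share:.0%}")
```

Comparing these numbers across variables gives a rough ranking of practical importance that the t-stats alone do not provide.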

Some additional tips

Your readers will be helped by knowing some key summary statistics of your variables, especially your dependent variable and your key explanatory variable.

Do not use the terms t-stat or adjusted R-squared. Instead, say things like: “This regression explains 50% of the variation in cigarette consumption”, or, “We are more than 99% certain that spending has an impact on total points scored.”

Cite any fact you bring that is not from your own analysis. To do this, in the text write (author, year), such as (US Bureau of the Census, 2012), and then put the actual citation in the references at the paper’s end.

Some additional tips

Many projects would benefit from having a section called something like “Background” that introduces the background to the issue and the key summary statistics.

Many projects would benefit from dividing the results into several sections to tell your story step by step.

Label every graph and table, e.g. Exhibit 1 (or Table 1), and give it a descriptive title.

Some additional tips

If your dependent variable is a dummy, explain how to interpret the coefficient (as a percentage point difference in Y). … You might also add that since the average Y is __%, the percentage change in Y is around ____ times the coefficient, e.g. if the average Y is .5, the percentage change is twice the coefficient. (Then continue to make this percentage point/percentage distinction when talking about results.)

DO NOT focus on getting a high adjusted R-squared. Focus only on getting the least biased measure of the impact of a variable.

We should know who the client is upfront!
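To make the percentage point versus percentage distinction concrete, here is a small Python sketch. The function name and the coefficient of 0.05 are assumptions made for illustration; the average Y of .5 is the example value used above.

```python
def percent_change(coef, mean_y):
    """Convert a percentage-point effect into a percent change relative to the average Y.

    Both arguments are proportions, so 0.05 means 5 percentage points.
    """
    return coef / mean_y


mean_y = 0.5  # average of the 0/1 dependent variable (the example value above)
coef = 0.05   # hypothetical coefficient: a 5 percentage point difference

relative = percent_change(coef, mean_y)
print(f"{coef * 100:.0f} percentage points = {relative * 100:.0f}% of the average Y")
```

With an average Y of .5, dividing by the mean is the same as doubling the coefficient, which is the "twice the coefficient" rule of thumb.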

Do NOT put regressions in the text; put them in a regression table.

(1) (2) (3) (4) (5)
Medals   Has Medals
(standard errors in parentheses)
Civil unrest: -6.463*** 1.480 0.358 0.578 -0.026
    (2.534) (2.290) (2.271) (2.275) (0.058)
GDP: 1.394 0.002 -0.025 0.087
    (0.108) (0.355) (0.009)
GDP squared: 0.061*** 0.064*** -0.0025
    (0.015) (0.000)
Olympic year: -0.071 0.0024**
    (0.052) (0.001)
Intercept: 9.542*** -2.850 1.686 141.641 -4.664*
    (0.888) (1.236) (1.642) (102.517) (2.612)
# observations: 637
Adjusted R-squared: 0.0106 0.2512 0.2739 0.2751 0.2646

What the table needs:
- A column for each regression
- 2 lines for each variable: one for the coefficient and one for the standard error or t-stat
- Asterisks for significance, # obs, Adj R-sq
- Footnotes that say whether the se or t-stat is shown and what the excluded categories are

QM222 Fall 2017 Sections E1 & H1

Some additional tips: On Quadratics

When you have both a linear and a squared term for a variable (for instance age and age squared), you cannot interpret the linear term’s coefficient on its own as the variable’s effect. Instead, the linear and quadratic terms must be combined to figure out the effect.

In general, if the quadratic is b X + a X^2, the best way to calculate the effect of a change in X is to take the derivative (or slope), which is b + 2*a*X. As you see, the slope depends on the specific value of X (e.g. age) you are at.

Moreover, you cannot use the t-stats of the two separate terms to measure the significance of the impact of the variable, let’s say age. Instead, you need to measure the joint significance at a specific age, let’s say the age of 30. To do this, after the regression, type

lincom age+agesq*2*30

(which is the slope at the age of 30). Or you can test whether together the two terms add explanatory power (are jointly significant):

test age agesq
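As a quick check on the slope formula b + 2*a*X, the following Python sketch evaluates it at a few ages. The coefficients are hypothetical, chosen only so that the effect of age rises and then falls; they are not taken from any regression in these notes.

```python
def quadratic_slope(b, a, x):
    """Slope of b*x + a*x**2 with respect to x, i.e. b + 2*a*x."""
    return b + 2 * a * x


# Hypothetical coefficients on age and age squared (illustration only).
b_age, a_agesq = 2.0, -0.025

for age in (20, 30, 40, 50):
    print(age, round(quadratic_slope(b_age, a_agesq, age), 3))
```

With these made-up coefficients the slope is positive at younger ages, reaches zero at age 40, and turns negative after that, which is why a single number for "the effect of age" is not enough.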

Some additional tips: On Quadratics

A good way to illustrate the quadratic is to draw a graph so you can see the shape of the relationship. Use the range of X’s (e.g. age) in your sample.

But many of you are confused about the Y-axis, since we are only including the contribution of these two terms (to Y). I have a way to solve this. Let’s consider the variable age and assume that the youngest people in your dataset are 18. Start the Y value at the average Y of 18-year-olds in your sample. Let’s say that the Y value at age 18 is 100. Then, make the rest of the curve change from there (by adding 100 to the predicted effect of age at each of the other ages, measured relative to age 18).
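One way to build the points for such a graph is sketched below in Python. It reads "change from there" as the change in the two age terms relative to age 18, which is an interpretation rather than something spelled out above. The coefficients are hypothetical; 18 and 100 are the example values from the text.

```python
def age_curve(b, a, ages, base_age, base_y):
    """Return (age, Y) pairs for plotting.

    Y equals base_y at base_age; at other ages it moves by the change in
    b*age + a*age**2 relative to base_age.
    """
    def effect(x):
        return b * x + a * x ** 2

    return [(age, base_y + effect(age) - effect(base_age)) for age in ages]


# b and a are hypothetical coefficients; 18 and 100 are the example values above.
points = age_curve(b=2.0, a=-0.025, ages=range(18, 66), base_age=18, base_y=100)

for age, y in points[::10]:  # print every tenth point
    print(age, round(y, 2))
```

The resulting pairs can be pasted into Excel or any plotting tool, so the Y-axis is in the same units as the dependent variable instead of showing only the contribution of the two age terms.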