Does Poverty Cause Domestic Terrorism? Who knows? (Regression alone can’t establish causality.) There does appear to be some.


Does Poverty Cause Domestic Terrorism? Who knows? (Regression alone can’t establish causality.) There does appear to be some level of positive association. But the level of political freedom within a nation also plays a role. The article states: “… the relationship between the level of political rights and terrorism is not a simple one. Countries with an intermediate range of political rights experience a greater risk of terrorism than countries either with a very high degree of political rights or severely authoritarian countries with very low levels of political rights.” This clearly signals a nonlinear relationship (a downward-bending “U”), and suggests adding the square of the “political rights” variable to a model which predicts a nation’s level of domestic terrorism. And indeed, this is what the author did.

The second appendix to the full article reports the following regression (the “no rights” variable takes values between 1 (great political freedom) and 7 (an oppressive authoritarian regime)):

Regression: log(Global Terrorism Index)

                     constant    log(GDP/cap)   no rights   (no rights)^2
coefficient          something   […]            […]         […]
std error of coef    something   […]            […]         […]
significance         something   […]%           […]%        […]%

adjusted coef of det: 24%

– There’s strong evidence that the squared variable belongs in the relationship (from the significance level of the squared term).
– The no-rights variable relates to domestic terrorism in the form of a downward-bending “U” (the coefficient of the squared term is negative).
– The “U” peaks at a no-rights level of –0.2966/(2 × c) = 4.94, where c is the coefficient of the squared term (using the –b/(2c) formula), i.e., between the extremes, as seen in this chart from the article.
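The –b/(2c) step can be checked with a minimal Python sketch. The linear coefficient 0.2966 is read off the numerator quoted on the slide; the squared-term coefficient did not survive in this transcript, so the value used below is an assumed placeholder chosen only for illustration, not the article's reported figure.

```python
def quadratic_peak(b: float, c: float) -> float:
    """Return the x at which a + b*x + c*x**2 reaches its turning point."""
    if c == 0:
        raise ValueError("no turning point: the squared term has a zero coefficient")
    return -b / (2 * c)


b_no_rights = 0.2966     # linear coefficient implied by the slide's -0.2966 numerator
c_no_rights_sq = -0.03   # ASSUMED placeholder, not the article's reported value

peak = quadratic_peak(b_no_rights, c_no_rights_sq)
is_maximum = c_no_rights_sq < 0  # negative squared term: downward-bending "U"

print(round(peak, 2), is_maximum)
```

With a negative squared-term coefficient the turning point is a maximum, which is what places the highest predicted terrorism level at an intermediate no-rights score rather than at either extreme.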

CEO Overconfidence, Corporate Investment, and the Market’s Reaction

The next article examines the link between the personal characteristics of a CEO and his/her propensity to invest corporate resources unwisely. It reports (in the middle of the second column): “… overconfidence among acquiring CEOs is one important explanation of merger activity. Using a dataset of large U.S. companies from 1980 to 1994 and the CEOs’ personal portfolio decisions as measures of overconfidence, they find that overconfident CEOs conduct more mergers and, in particular, more value-destroying mergers. These effects are most pronounced in firms with abundant cash or untapped debt capacity.”

In other words, the effect of CEO overconfidence on overinvestment in value-destroying merger activity depends on the availability of ready financial resources (waiting to be misspent). What we have here is an interaction, captured in the regression model by the introduction of the product of the “overconfidence” and “ready financial resources” variables.
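To see why a product term captures “the effect depends on”, here is a small Python sketch. Every coefficient below is hypothetical (the article's estimates are not given on the slide); the point is only that, once the product is in the model, the per-unit effect of overconfidence becomes b1 + b3 × resources rather than a single constant.

```python
def predicted_overinvestment(overconfidence, resources, b0, b1, b2, b3):
    """Linear model with an interaction: the last term is the product of the two variables."""
    return b0 + b1 * overconfidence + b2 * resources + b3 * overconfidence * resources


def overconfidence_effect(resources, b1, b3):
    """Change in the prediction per unit of overconfidence, at a given level of resources."""
    return b1 + b3 * resources


# Hypothetical coefficients, for illustration only.
B0, B1, B2, B3 = 1.0, 0.5, 0.2, 0.8

effect_when_cash_poor = overconfidence_effect(0.0, B1, B3)
effect_when_cash_rich = overconfidence_effect(1.0, B1, B3)
print(effect_when_cash_poor, effect_when_cash_rich)
```

A positive coefficient on the product term is what would correspond to the effect being “most pronounced in firms with abundant cash or untapped debt capacity.”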

Smoking, Drinking, and Drug Use Respond to Price Changes

Finally, the last article suggests “that legalization and taxation (of currently-illegal drugs) – the approach that characterizes the regulation of cigarettes and alcohol – may be better than the current approach.” It notes (starting at the bottom of the first column on the last page of the Digest): “Alcohol use and abuse cannot be correlated indisputably with reductions in the real prices of alcoholic drinks without factoring in other elements. These include changes in the minimum legal drinking age and the redefining of blood-alcohol levels in regard to drunk driving. However, when these factors are taken into account, the 7 percent increase in the real price of beer between 1990 and 1992 attributable to the Federal excise tax hike on that beverage in 1991 explains almost 90 percent of the 4-percentage-point reduction in binge drinking in that period.”

Clearly, a direct regression of “binge drinking” onto “real price of alcoholic drinks” suffers from specification bias, and fails to accurately capture the true effect of price on alcohol abuse. But when the confounding variables – “legal drinking age” and “illegal blood-alcohol level” – are taken into account, the price effect is clearly revealed in the resulting “more complete” model.
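Specification bias of this kind can be illustrated with a toy Python sketch. The data are invented (not the article's): the outcome is built exactly as 10 – 2 × price + 4 × confounder, and the confounder tends to rise with price. Regressing on price alone then misstates the price effect, while including the confounder recovers it.

```python
def centered_cross_sum(x, y):
    """Sum of (x - mean(x)) * (y - mean(y)) over paired observations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))


def simple_slope(x, y):
    """Least-squares slope when y is regressed on x alone."""
    return centered_cross_sum(x, y) / centered_cross_sum(x, x)


def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes when y is regressed on x1 and x2 together."""
    s11, s22 = centered_cross_sum(x1, x1), centered_cross_sum(x2, x2)
    s12 = centered_cross_sum(x1, x2)
    s1y, s2y = centered_cross_sum(x1, y), centered_cross_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Invented data: the true price effect is -2, the confounder's effect is +4.
price = [1, 2, 3, 4, 5, 6]
confounder = [0, 0, 0, 1, 1, 1]
outcome = [10 - 2 * p + 4 * c for p, c in zip(price, confounder)]

naive_price_slope = simple_slope(price, outcome)
full_price_slope, full_confounder_slope = two_predictor_slopes(price, confounder, outcome)
print(round(naive_price_slope, 4), round(full_price_slope, 4), round(full_confounder_slope, 4))
```

In this toy setup the price-only regression reports a slope of roughly –0.97, less than half the true –2, because part of the confounder's effect is attributed to price.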

General Qualitative Data, and “Dummy Variables”

How might we have represented “make-of-car” in the motorpool case, had there been more than just two makes?
– Assume that Make takes four categorical values (Ford, Honda, BMW, and Sterling). Choose one value as the “foundation” case. Create three 0/1 (“yes”/“no”, so-called “dummy”) variables for the other three cases. These three variables jointly represent the four-valued qualitative Make variable. Here are the details. We’ll use this representational trick in order to include “day of game” (either Friday, Saturday, or Sunday) in a model which predicts attendance at a professional indoor soccer team’s home games. Here is the example.
– Using this trick requires that we extend the “significance level” (with respect to whether a variable “belongs” in the model) to groups of variables. This is done via “analysis of variance” (ANOVA).
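The encoding described above can be sketched in a few lines of Python. The function name and the `D_` column prefix are illustrative choices, and Ford is arbitrarily taken as the foundation case.

```python
def dummy_encode(value, levels, foundation):
    """Return one 0/1 indicator per non-foundation level of a categorical variable."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {f"D_{level}": int(value == level) for level in levels if level != foundation}


MAKES = ["Ford", "Honda", "BMW", "Sterling"]

bmw_row = dummy_encode("BMW", MAKES, foundation="Ford")
ford_row = dummy_encode("Ford", MAKES, foundation="Ford")
print(bmw_row)
print(ford_row)
```

The foundation case is the row in which all three dummies are 0, which is why four makes need only three columns.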

Discounts on Car Purchases: Does Salesperson Identity Matter?

Assume there are five salesfolks: Andy, Bob, Chuck, Dave and Ed. Take one (e.g., Andy) as the foundation case, and add four new “dummy” variables:
– D_B = 1 only if Bob, 0 otherwise
– D_C = 1 only if Chuck, 0 otherwise
– D_D = 1 only if Dave, 0 otherwise
– D_E = 1 only if Ed, 0 otherwise

The coefficient of each (in the most-complete model) will differentiate the average discount that each salesperson gives a customer from the average discount Andy would give the same customer.

Does Salesperson Identity Matter?

Imagine that, after adding the new variables (four new columns of data) to your model, the regression yields:

Discount_pred = […]×Age – […]×Income + […]×Sex + 240×D_B + (–300)×D_C + (–50)×D_D + 370×D_E

– With similar customers, you’d expect Bob to give a discount $240 higher than would Andy.
– With similar customers, you’d expect Chuck to give a discount $300 lower than would Andy, $540 lower than would Bob, and also lower than would Dave (by $250) and Ed (by $670).
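The pairwise comparisons above are just differences of dummy coefficients, as this Python sketch shows. Andy, the foundation case, has an implicit coefficient of 0. Bob's 240, Chuck's –300 and Dave's –50 come from the slide; Ed's 370 is inferred here from the stated $670 gap between Chuck and Ed.

```python
# Dummy coefficients relative to Andy (the foundation case, so Andy is 0).
DUMMY_COEF = {"Andy": 0, "Bob": 240, "Chuck": -300, "Dave": -50, "Ed": 370}


def discount_gap(seller_a, seller_b):
    """Expected discount of seller_a minus that of seller_b, for similar customers."""
    return DUMMY_COEF[seller_a] - DUMMY_COEF[seller_b]


gaps_over_chuck = {name: discount_gap(name, "Chuck") for name in DUMMY_COEF if name != "Chuck"}
print(gaps_over_chuck)
```

The customer terms (Age, Income, Sex) cancel in each difference, which is why the comparison holds for any pair of similar customers.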

Does “Salesperson” Interact with “Sex”?

Are some of the salesfolk better at selling to a particular Sex of customer?
– Add D_B, D_C, D_D, D_E, and D_B×Sex, D_C×Sex, D_D×Sex, D_E×Sex to the model.
– Imagine that your regression yields:
  Discount_pred = […]×Age – […]×Income + […]×Sex + 240×D_B – 350×D_C + 75×D_D + 10×D_E – 375×(D_B×Sex) – 150×(D_C×Sex) – 50×(D_D×Sex) + […]×(D_E×Sex)
– Interpret this back in the “conceptual” model:
  Discount_pred = […]×Age – […]×Income + […]×Sex + (240 – 375×Sex)×D_B + (–350 – 150×Sex)×D_C + (75 – 50×Sex)×D_D + (10 + […]×Sex)×D_E

Discount_pred = […]×Age – […]×Income + […]×Sex + (240 – 375×Sex)×D_B + (–350 – 150×Sex)×D_C + (75 – 50×Sex)×D_D + (10 + […]×Sex)×D_E

– Given a male (Sex=0) customer, you’d expect Bob (D_B=1) to give a greater discount (by $240 – $375×0 = $240) than Andy.
– Given a female (Sex=1) customer, you’d expect Bob to give a smaller discount (by $240 – $375×1 = –$135) than Andy.
– Chuck has been giving smaller discounts to both men and women than has Andy, and Dave and Ed have been giving larger discounts than Andy to both sexes.
– And we could take the same approach to investigate whether “Salesperson” interacts with Age, including also D_B×Age, D_C×Age, D_D×Age, D_E×Age in our model.
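The sex-specific comparisons with Andy can be tabulated with a short Python sketch. It uses only the dummy and interaction coefficients that appear on the slide for Bob, Chuck and Dave; Ed is left out because his interaction coefficient is missing from this transcript.

```python
# (dummy coefficient, coefficient on dummy x Sex), each relative to Andy.
COEF = {"Bob": (240, -375), "Chuck": (-350, -150), "Dave": (75, -50)}


def gap_vs_andy(seller, sex):
    """Expected discount relative to Andy; sex is 0 for male, 1 for female."""
    base, interaction = COEF[seller]
    return base + interaction * sex


table = {seller: (gap_vs_andy(seller, 0), gap_vs_andy(seller, 1)) for seller in COEF}
for seller, (male_gap, female_gap) in table.items():
    print(seller, male_gap, female_gap)
```

Bob is the only one of the three whose gap changes sign between male and female customers; Chuck stays below Andy and Dave stays above Andy for both.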

Outliers

An outlier is a sample observation which fails to “fit” with the rest of the sample data. Such observations may distort the results of an entire study.
– Types of outliers (three)
– Identification of outliers (via “model analysis”)
– Dealing with outliers (perhaps yielding a better model)

These issues are dealt with here.

Additional Session 4 Materials

Optional readings on logarithmic transformations, and on testing for differences (benchmarking).

Two more thorough sample exams:
– One based on a firm converting from Microsoft Office software to open-source Linux software, choosing between training programs, with a 90-minute prerecorded Webex tutorial
– One based on a real-estate developer studying the impact on home values of having a clubhouse in a development