Multivariate Analysis Lec 4


Fall, 2008

Regression Analysis
- A single dependent (criterion) variable and several independent (predictor) variables
- Independent variables are weighted to ensure maximal prediction
- The weighted combination of independent variables is the regression variate
- To apply:
  - The data must be metric, or appropriately transformed
  - Decide which variable is to be dependent and which remaining variables will be independent

An Example

Setting a Baseline
- Prediction without an independent variable
  - Predicted # of credit cards = average # of credit cards
- But how accurate is the baseline prediction?
  - The sum of squared errors (SSE) gives the amount of prediction error
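The baseline idea can be made concrete in a few lines of Python. This is a minimal sketch: the credit-card counts are illustrative values assumed for the example, not data shown on the slides. It predicts the mean for every family and sums the squared errors.

```python
# Illustrative data (assumed for this sketch): number of credit cards per family.
cards = [4, 6, 6, 7, 8, 7, 8, 10]

# Baseline prediction: with no independent variable, predict the mean for everyone.
baseline = sum(cards) / len(cards)

# Prediction errors and their sum of squares (SSE).
errors = [y - baseline for y in cards]
sse_baseline = sum(e ** 2 for e in errors)

print(f"baseline prediction = {baseline}")
print(f"sum of raw errors   = {sum(errors)}")
print(f"baseline SSE        = {sse_baseline}")
```

The raw errors around the mean always sum to zero, which is why the errors are squared before they are added up.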

Simple Regression

Specifying the Equation
- $\hat{Y} = b_0 + b_1 V_1$
- Regression coefficient
- Prediction error: the residual ($e$)
- Least squares
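A short sketch of the least squares fit for one predictor follows. The data are illustrative values assumed for the example, and the helper name `fit_simple_regression` is hypothetical. The slope is the ratio of the cross-product sum to the sum of squares of the predictor, and the intercept makes the line pass through the two means.

```python
# Illustrative data (assumed): family size (V1) and number of credit cards (Y).
family_size = [2, 2, 4, 4, 5, 5, 6, 6]
cards = [4, 6, 6, 7, 8, 7, 8, 10]


def fit_simple_regression(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # regression coefficient (slope)
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1


b0, b1 = fit_simple_regression(family_size, cards)
predicted = [b0 + b1 * v for v in family_size]
residuals = [y - y_hat for y, y_hat in zip(cards, predicted)]
sse = sum(e ** 2 for e in residuals)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSE = {sse:.3f}")
```

With these numbers the SSE is well below the SSE obtained by predicting the mean alone, which is the sense in which the predictor improves on the baseline.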

Confidence Interval for the Prediction
- We would like to estimate the range of predicted values we might expect, rather than relying on the single (point) estimate
- The standard error of the estimate (SEE) establishes the upper and lower bounds for our prediction
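As a rough sketch of how the SEE turns a point estimate into a range, the code below computes SEE as the square root of SSE divided by n - 2 and builds an approximate 95% interval. The data are illustrative assumptions, `approx_interval` is a hypothetical helper, and the interval is deliberately simplified: it ignores the extra uncertainty that comes from estimating the coefficients themselves.

```python
import math

# Illustrative data (assumed): family size and number of credit cards.
family_size = [2, 2, 4, 4, 5, 5, 6, 6]
cards = [4, 6, 6, 7, 8, 7, 8, 10]
n = len(cards)

# Least squares fit for one predictor.
x_bar = sum(family_size) / n
y_bar = sum(cards) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(family_size, cards))
      / sum((x - x_bar) ** 2 for x in family_size))
b0 = y_bar - b1 * x_bar

# Standard error of the estimate: residual variation per degree of freedom.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(family_size, cards))
see = math.sqrt(sse / (n - 2))

# Tabulated two-sided 95% t value for n - 2 = 6 degrees of freedom.
T_CRIT = 2.447


def approx_interval(x_new):
    """Simplified interval: point estimate plus or minus t * SEE."""
    y_hat = b0 + b1 * x_new
    return y_hat - T_CRIT * see, y_hat + T_CRIT * see


low, high = approx_interval(4)
print(f"SEE = {see:.3f}; family of 4: {low:.2f} to {high:.2f} cards")
```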

Assessing Prediction Accuracy
- The sum of squares regression (SSR)
- Total sum of squares (TSS): $TSS = SSE + SSR$
- Coefficient of determination: $R^2 = SSR/TSS = (TSS - SSE)/TSS$
- Sign and strength of the relationship
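The decomposition can be checked numerically. The sketch below, on illustrative data assumed for the example, computes the three sums of squares separately, confirms that they add up, and derives the coefficient of determination along with a signed correlation.

```python
import math

# Illustrative data (assumed): family size and number of credit cards.
family_size = [2, 2, 4, 4, 5, 5, 6, 6]
cards = [4, 6, 6, 7, 8, 7, 8, 10]
n = len(cards)

# Least squares fit for one predictor.
x_bar = sum(family_size) / n
y_bar = sum(cards) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(family_size, cards))
      / sum((x - x_bar) ** 2 for x in family_size))
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in family_size]

# The three sums of squares, each computed from its own definition.
tss = sum((y - y_bar) ** 2 for y in cards)                  # total
sse = sum((y - p) ** 2 for y, p in zip(cards, predicted))   # error
ssr = sum((p - y_bar) ** 2 for p in predicted)              # regression

r_squared = ssr / tss
# The slope's sign gives the direction of the relationship.
r = math.copysign(math.sqrt(r_squared), b1)

print(f"TSS = {tss:.3f}, SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"R^2 = {r_squared:.3f}, r = {r:.3f}")
```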

Multiple Regression
- The impact of multicollinearity
  - Reduces the predictive power of any single independent variable
- The multiple regression equation
  - # of credit cards $= b_0 + b_1 V_1 + b_2 V_2 + e$ (the predicted value is the same expression without the residual $e$)

A Decision Process for MRA
- Factors that impact the creation, estimation, interpretation, and validation of a regression analysis

Stage 1: Objectives of MR
- Research problems appropriate for MR
  - Prediction
    - Maximize the overall predictive power (achieve acceptable levels of predictive accuracy)
    - Compare two or more sets of independent variables
  - Explanation: interpretation of the variate
    - The importance of the IVs
    - The types of relationships found
    - The interrelationships among the independent variables

Specifying a Statistical Relationship
- A functional relationship and a statistical relationship

Selection of DV and IVs
- Should not be made indiscriminately or solely on empirical grounds
- Measurement error in the DV
- Specification error in the selection of IVs
  - Inclusion of irrelevant variables
    - Reduces model parsimony
    - May mask or replace the effects of more useful variables
    - Makes the tests of significance less precise
  - Exclusion of relevant variables (most troublesome)
    - Biases the results
- Measurement error in the IVs

Stage 2: Research Design
- Sample size
  - Statistical power and sample size: detecting a significant $R^2$ and significant coefficients
  - Fewer than 20 observations: appropriate only for simple regression, and power is too low
  - More than 1,000 observations: the tests are too powerful (overly sensitive)

Fixed vs. Random Effects Predictors
- A random IV is one whose levels are selected at random
- Most regression models based on survey data are random effects models: the IVs are randomly selected from the population (inference regarding the population)
- The two estimation procedures are the same except for the error terms
  - In the random effects model, a portion of the random error comes from the sampling of the independent variables
- Procedures based on the fixed model are robust

Creating Additional Variables
- The basic relationship: the linear association between the metric DV and IVs, based on the product-moment correlation
- Transformations
  - Motivated by the desire to deal with nonmetric data and nonlinear relationships
  - Theoretical reasons: based on the nature of the data
  - Data-derived reasons: suggested by examining the data

Incorporating Nonmetric Data with Dummy Variables
- Indicator coding
  - Coefficients are differences in means from the reference category (the all-zeros category)
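A small sketch of indicator coding may help. The data, the variable names, and the `ols` helper (a plain normal-equations solver written for this example) are all assumptions. With category "A" as the reference, the intercept reproduces the mean of "A" and each dummy coefficient reproduces the difference between its category mean and the reference mean.

```python
def ols(X, y):
    """Least squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Illustrative data (assumed): a three-level nonmetric IV and a metric DV.
region = ["A", "A", "B", "B", "C", "C", "C"]
sales = [10, 12, 15, 17, 20, 22, 24]

# Indicator coding: "A" is the reference category (both dummies are zero).
X = [[1.0, float(r == "B"), float(r == "C")] for r in region]
b0, b_B, b_C = ols(X, sales)


def group_mean(level):
    values = [s for r, s in zip(region, sales) if r == level]
    return sum(values) / len(values)


print(f"intercept = {b0:.2f} (mean of A = {group_mean('A'):.2f})")
print(f"b_B = {b_B:.2f} (mean B - mean A = {group_mean('B') - group_mean('A'):.2f})")
print(f"b_C = {b_C:.2f} (mean C - mean A = {group_mean('C') - group_mean('A'):.2f})")
```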

Representing Curvilinear Effects with Polynomials
- Power transformations of an independent variable
  - Only simple curvilinear relationships, e.g., U-shaped relationships
  - No statistical means for assessing whether the curvilinear or linear relationship model is more appropriate
  - Accommodate only univariate relationships
  - $Y = b_0 + b_1 X_1 + b_2 X_1^2$
- Interaction or moderator effects
  - $Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_1 X_2$
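Both equations are still linear in the coefficients, so they can be estimated by ordinary least squares once the squared or product term is added as an extra column. The sketch below illustrates this on noise-free data generated from known coefficients (an assumption made so the recovered values can be checked). The `ols` helper is a plain normal-equations solver written for this example.

```python
def ols(X, y):
    """Least squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Curvilinear effect: add the squared term X1^2 as an extra column.
x1 = [-2, -1, 0, 1, 2, 3]
y_curve = [1 + 2 * x + 3 * x ** 2 for x in x1]   # generated without noise
X_poly = [[1.0, x, x ** 2] for x in x1]
b_poly = ols(X_poly, y_curve)

# Interaction (moderator) effect: add the product X1 * X2 as an extra column.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 3), (3, 1), (3, 2)]
y_inter = [1 + 2 * a + 3 * b + 0.5 * a * b for a, b in pairs]
X_inter = [[1.0, a, b, a * b] for a, b in pairs]
b_inter = ols(X_inter, y_inter)

print("polynomial coefficients: ", [round(b, 6) for b in b_poly])
print("interaction coefficients:", [round(b, 6) for b in b_inter])
```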

Stage 3: Assumptions
- Assessing individual variables vs. the variate
- Principal measure of prediction error for the variate: the residual
  - Needs some form of standardization: the studentized residual
- Residual plot
  - The residuals vs. the predicted dependent values
  - Null plot: the pattern seen when all assumptions are met

Linearity of the Phenomenon
- The degree to which the change in the dependent variable is associated with the IVs
  - The regression coefficient is constant across the range of values of the IV
  - The concept of correlation is based on linearity
- Partial regression plots
  - Show the relationship between a specific IV and the DV
  - Non-horizontal line, sloping up or down
  - Examine the residuals around the line

Constant Variance of the Error Term
- Unequal variance (heteroscedasticity): the most common assumption violation
  - The most common pattern in the residual plot: triangle-shaped in either direction
  - Diamond-shaped: more variance in the midrange
  - A number of violations can occur simultaneously
- Levene test for homogeneity of variance
- Remedies
  - Weighted least squares
  - Variance-stabilizing transformations

Independence of the Error Terms
- Basic assumption in regression: each predicted value is independent, i.e., not sequenced by any variable
  - Association with time
  - Basic model conditions change: seasonal effect

Normality of the Error Term Distribution
- Normal probability plot

Stage 4: Estimating the Regression Model and Assessing Overall Model Fit
- Select a method for model specification
- Assess the significance of the overall model
- Check for unduly influential observations

General Approaches to Variable Selection
- Confirmatory specification
  - Simple in concept, but the researcher must be assured that the set of variables achieves maximum prediction while maintaining a parsimonious model
- Sequential search methods
  - Stepwise estimation
    - Based on incremental contribution
    - Only one variable at a time, no combined effect
  - Forward addition and backward elimination
    - Largely a trial-and-error process
  - Caveats
    - Multicollinearity
    - Multiple significance tests in the stepwise procedure

Testing the Regression Variate for Meeting the Regression Assumptions

Examining the Statistical Significance of Our Model
- The F ratio
- Adjusted $R^2$
- Significance tests of the regression coefficients
  - Sampling variation of the estimated regression coefficients (Table 4.8)
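The two overall-fit statistics can be written directly from the sums of squares. The sketch below uses the standard definitions, with n observations and k independent variables. The function names and the input numbers are illustrative assumptions, not values from the slides or from Table 4.8.

```python
def f_ratio(ssr, sse, n, k):
    """Explained variance per IV divided by unexplained variance per residual df."""
    return (ssr / k) / (sse / (n - k - 1))


def adjusted_r2(ssr, sse, n, k):
    """R^2 corrected for the number of IVs relative to the sample size."""
    tss = ssr + sse
    return 1 - (sse / (n - k - 1)) / (tss / (n - 1))


# Illustrative sums of squares (assumed): 8 observations, 1 independent variable.
ssr, sse, n, k = 16.5, 5.5, 8, 1

F = f_ratio(ssr, sse, n, k)
r2 = ssr / (ssr + sse)
r2_adj = adjusted_r2(ssr, sse, n, k)

print(f"F({k}, {n - k - 1}) = {F:.2f}")
print(f"R^2 = {r2:.3f}, adjusted R^2 = {r2_adj:.3f}")
```

The F value is then compared against the critical value of the F distribution with k and n - k - 1 degrees of freedom. Adjusted $R^2$ is always at most $R^2$, and the gap widens as IVs are added to a small sample.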

Identifying Influential Observations

Stage 5: Interpreting the Regression Variate
- Using the regression coefficients
  - Used to calculate the predicted values
  - Interpretation: the impact of each individual IV
    - The problem of differing measurement scales
- Beta coefficients (unit: standard deviations)
  - Use only when collinearity is minimal
  - Interpret only in the context of (in relation to) the other IVs in the equation
  - The levels affect the beta value
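The measurement-scale problem, and how the beta coefficient sidesteps it, can be seen in a short sketch. The data and the helper names `slope` and `beta_weight` are assumptions for illustration. Expressing income in dollars instead of thousands of dollars changes the raw coefficient by a factor of 1,000 but leaves the standardized coefficient unchanged.

```python
import statistics


def slope(x, y):
    """Raw (unstandardized) regression coefficient for one predictor."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def beta_weight(x, y):
    """Standardized coefficient: change in Y (in SDs) per one-SD change in X."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)


# Illustrative data (assumed): family income and number of credit cards.
income_thousands = [14, 16, 14, 17, 18, 21, 17, 25]
cards = [4, 6, 6, 7, 8, 7, 8, 10]
income_dollars = [v * 1000 for v in income_thousands]

print(f"b (income in $000): {slope(income_thousands, cards):.6f}")
print(f"b (income in $):    {slope(income_dollars, cards):.6f}")
print(f"beta ($000):        {beta_weight(income_thousands, cards):.4f}")
print(f"beta ($):           {beta_weight(income_dollars, cards):.4f}")
```

With a single predictor the beta coefficient equals the simple correlation between the IV and the DV. With several correlated IVs it does not, which is one reason betas are best used when collinearity is minimal.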

Assessing Multicollinearity
- The effects of multicollinearity
  - Estimation: it affects the ability of the regression procedure and the researcher to represent and understand the effects of each IV in the regression variate
  - Limits the size of the coefficient of determination (it becomes difficult to add unique explanatory prediction)
  - Makes it difficult to determine the contribution of each individual IV

Identifying Multicollinearity
- Assess the extent of collinearity and the degree to which the estimated coefficients are affected
- The simplest and most obvious means: the correlation matrix (generally, correlations of 0.9 and above)
  - However, collinearity may be due to the combined effect of several IVs
- Common measures
  - The tolerance value (the share of an IV's variation not explained by the other IVs): common cutoff 0.1
  - Its inverse, the variance inflation factor (VIF): common cutoff 10
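The sketch below illustrates why tolerance and VIF catch collinearity that the correlation matrix can miss. The data are constructed for the example (an assumption): the third IV is nearly the sum of the first two, so no pairwise correlation reaches 0.9, yet every IV is almost fully explained by the other two together. The helper names `ols`, `tolerance`, and `pearson` are hypothetical.

```python
import itertools
import math


def ols(X, y):
    """Least squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def tolerance(ivs, j):
    """Share of IV j's variation NOT explained by the other IVs (1 - R^2_j)."""
    target = ivs[j]
    others = [iv for i, iv in enumerate(ivs) if i != j]
    X = [[1.0] + [iv[row] for iv in others] for row in range(len(target))]
    coefs = ols(X, target)
    fitted = [sum(c * v for c, v in zip(coefs, row)) for row in X]
    mean = sum(target) / len(target)
    sse = sum((t - f) ** 2 for t, f in zip(target, fitted))
    tss = sum((t - mean) ** 2 for t in target)
    return sse / tss


def pearson(a, b):
    a_bar, b_bar = sum(a) / len(a), sum(b) / len(b)
    s_ab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    s_aa = sum((x - a_bar) ** 2 for x in a)
    s_bb = sum((y - b_bar) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


# Constructed data (assumed): x3 is approximately x1 + x2.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 1, 7, 3, 8, 2, 6, 4]
x3 = [6.1, 2.8, 10.2, 6.9, 13.1, 8.2, 12.8, 11.9]
ivs = [x1, x2, x3]

tolerances = [tolerance(ivs, j) for j in range(len(ivs))]
vifs = [1 / t for t in tolerances]
max_pairwise = max(abs(pearson(a, b)) for a, b in itertools.combinations(ivs, 2))

print(f"largest pairwise |r| = {max_pairwise:.3f}")
for name, t, v in zip(["x1", "x2", "x3"], tolerances, vifs):
    print(f"{name}: tolerance = {t:.4f}, VIF = {v:.1f}")
```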

Remedies for Multicollinearity
- Omit some IV(s)
- Use the model for prediction only
- Use the simple correlation between the DV and each IV to understand the relationship
- Use a more sophisticated method, such as Bayesian regression or regression on principal components
- Make a judgment on the variables included in the regression variate, which should always be guided by the theoretical background of the study

Stage 6: Validating the Results
- Additional or split sample
- Calculating the PRESS statistic
  - Estimate n regression models, each omitting one observation (so each is fit on n - 1 cases)
  - Similar to R square for predictive accuracy
  - Similar to bootstrapping
- Comparing regression models
  - R square increases with the # of IVs
  - Use adjusted R square
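A minimal sketch of the PRESS calculation for a one-predictor model follows. The data and the helper names `fit` and `press` are assumptions for illustration. Each observation is left out in turn, the model is re-estimated on the remaining cases, and the squared error of predicting the omitted case is accumulated.

```python
def fit(x, y):
    """Least squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - b1 * x_bar, b1


def press(x, y):
    """Sum of squared leave-one-out prediction errors."""
    total = 0.0
    for i in range(len(x)):
        # Re-estimate the model without observation i, then predict it.
        b0, b1 = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total


# Illustrative data (assumed): family size and number of credit cards.
family_size = [2, 2, 4, 4, 5, 5, 6, 6]
cards = [4, 6, 6, 7, 8, 7, 8, 10]

b0_full, b1_full = fit(family_size, cards)
y_bar = sum(cards) / len(cards)
tss = sum((y - y_bar) ** 2 for y in cards)
sse_full = sum((y - (b0_full + b1_full * x)) ** 2
               for x, y in zip(family_size, cards))

press_value = press(family_size, cards)
r2_full = 1 - sse_full / tss
predictive_r2 = 1 - press_value / tss   # R^2 analogue based on PRESS

print(f"SSE = {sse_full:.3f}, PRESS = {press_value:.3f}")
print(f"R^2 = {r2_full:.3f}, PRESS-based R^2 = {predictive_r2:.3f}")
```

Because each case is predicted by a model that never saw it, PRESS exceeds the in-sample SSE and the PRESS-based $R^2$ is lower than the ordinary one. The size of the gap is a rough indication of how much the fit depends on the particular sample.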

Predicting with the Model
- Apply the model to a new set of data
- Factors to be considered
  - Sampling variation from both samples: use confidence intervals for the predictions
  - Conditions and relationships must not have changed
  - Do not use the model to estimate beyond the range of the IVs

Interpreting the Regression Variate
- $Y = -1.151 + 0.319 X_9 + 0.369 X_6 + 0.775 X_{12} - 0.417 X_7 + 0.174 X_{11}$
- Multicollinearity
