Extension: The General Linear Model with Categorical Predictors

Extension
- Regression can actually handle different types of predictors, and in the social sciences we are often interested in differences between groups.
- For now we will concern ourselves with the two independent groups case.
  - E.g. gender, Republican vs. Democrat, etc.

Dummy coding
- There are different ways to code categorical data for regression; in general, to represent a categorical variable you need k - 1 coded variables.
  - k = number of categories/groups
- Dummy coding involves using zeros and ones to identify group membership, and since we only have two groups, one group will be coded zero (the reference group) and the other one.
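
As a minimal sketch (not from the slides), this is how such dummy coding might look in R; the party variable and its values are made up purely for illustration.

```r
## Minimal sketch of dummy coding in R (hypothetical variable, not from the slides).
## For a two-group categorical predictor, k - 1 = 1 coded variable is needed.
party <- factor(c("democrat", "republican", "democrat", "republican", "democrat"))

## Hand-coded dummy: democrat = 0 (the reference group), republican = 1
dummy <- ifelse(party == "republican", 1, 0)
dummy

## model.matrix() builds the same coding automatically (first factor level = reference)
model.matrix(~ party)
```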

Dummy coding
- Example
- The thing to note at this point is that we have a simple bivariate correlation/simple regression setting.
- The correlation between group and the DV is .76.
- This is sometimes referred to as the point-biserial correlation (r_pb) because of the categorical variable.
- However, don't be fooled: it is calculated exactly the same way as the Pearson correlation before, i.e. you treat that 0/1 grouping variable like any other variable when calculating the correlation coefficient.
- However, the sign is arbitrary, since either group could have been the one or the zero, and so that needs to be noted.
- [The slide includes a small data table with columns Group and Outcome.]
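
The R sketch below uses made-up two-group data (not the data behind the r = .76 on the slide) to show that the point-biserial correlation is just Pearson's r computed on a 0/1 variable, and that the sign depends only on which group happens to be coded 1.

```r
## Hypothetical two-group data (made up; not the values from the slide)
group   <- rep(c(0, 1), each = 5)
outcome <- c(3, 4, 5, 4, 4,  5, 6, 7, 6, 6)

## The "point-biserial" correlation is just Pearson's r on the 0/1 variable
cor(group, outcome)

## Reversing which group gets the 1 flips only the sign, not the magnitude
cor(1 - group, outcome)
```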

Example
- Graphical display [figure shown on the slide].
- The R-square is .76² ≈ .577.
- The regression equation is [given on the slide; the values are not reproduced in this transcript].
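
Continuing with the same made-up data, a quick check that the squared correlation matches the model's R-square and that lm() returns the intercept and slope of the regression equation:

```r
group   <- rep(c(0, 1), each = 5)                 # hypothetical data
outcome <- c(3, 4, 5, 4, 4,  5, 6, 7, 6, 6)

r   <- cor(group, outcome)
fit <- lm(outcome ~ group)

r^2                        # squared (point-biserial) correlation
summary(fit)$r.squared     # the model's R-square -- same value
coef(fit)                  # intercept and slope of the regression equation
```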

Example
- Look closely at the descriptive output compared to the coefficients.
- What do you see?

The constant
- Note again our regression equation.
- Recall the definitions of the slope and the constant.
- First the constant: what does "when X = 0" mean in this setting?
- It means when we are in the 0 group.
- What is that predicted value?
  - Y_pred = b0 + b1(0) = b0 = 4
- That is the group's mean.
- The constant here is thus the reference group's mean.
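
A small R sketch of this point, again with made-up data (constructed so that the reference group happens to have a mean of 4): the intercept from lm() equals that group mean and is the predicted value when X = 0.

```r
group   <- rep(c(0, 1), each = 5)                 # hypothetical data
outcome <- c(3, 4, 5, 4, 4,  5, 6, 7, 6, 6)
fit <- lm(outcome ~ group)

coef(fit)[1]                                    # the constant (intercept)
mean(outcome[group == 0])                       # mean of the reference (0) group -- same value
predict(fit, newdata = data.frame(group = 0))   # predicted value when X = 0
```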

The coefficient
- Now think about the slope.
- What does a '1 unit change in X' mean in this setting?
- It means we go from one group to the other.
- Based on that, what does the slope represent in this case (i.e. can you derive that coefficient from the descriptive stats in some way)?
- The coefficient is the difference between the group means.
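
And the companion check for the slope, using the same made-up data: the coefficient for the dummy variable equals the difference between the two group means.

```r
group   <- rep(c(0, 1), each = 5)                 # hypothetical data
outcome <- c(3, 4, 5, 4, 4,  5, 6, 7, 6, 6)
fit <- lm(outcome ~ group)

coef(fit)[2]                                           # the slope for the dummy variable
mean(outcome[group == 1]) - mean(outcome[group == 0])  # difference between the group means
```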

The regression line
- The regression line covers the values represented, i.e. 0 and 1 for the two groups.
- It passes through each of their means.
- With least squares regression the line always passes through the mean of X and the mean of Y, though the mean of X here (just the proportion of cases coded 1) is not a meaningful quantity in itself.
- The constant (if we are using dummy coding) is the mean for the zero (reference) group.
- The coefficient is the difference between the means.

- Furthermore, the previous gives the same results we would have gotten via a t-test, to which we are about to turn.
- However, you can now see that the t-test is not a distinct procedure: it is a regression, a linear model of some outcome predicted by a grouping variable.

Two Sample t-test
data: Outcome by Group
t = [value not shown], df = 8, p-value = [value not shown]
[confidence interval not shown in the transcript]
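
Here is a sketch of the equivalence in R, again with made-up data rather than the slide's: an equal-variance two-sample t-test and the regression of the outcome on the dummy-coded group give the same test of the group difference.

```r
group   <- rep(c(0, 1), each = 5)                 # hypothetical data
outcome <- c(3, 4, 5, 4, 4,  5, 6, 7, 6, 6)

## Classic two-sample t-test with equal variances assumed (the same assumption lm makes)
t.test(outcome ~ factor(group), var.equal = TRUE)

## The group coefficient in the regression has the same |t|, df, and p-value;
## the sign of t just depends on which mean is subtracted from which.
summary(lm(outcome ~ group))$coefficients
```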

- Understanding the basics of the general linear model can go a long way toward one's ability to understand any analysis.
- It not only holds here specifically, but is used in more complex univariate and multivariate analyses; even in some nonlinear situations (e.g. logistic regression) we use 'generalized' linear models.
- Y = Xb + e
- For properly specified models, linear models provide reasonable fits and an intuitive understanding relative to more complex approaches.
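
As a final sketch (made-up data again), the matrix form Y = Xb + e can be solved directly from the normal equations and compared with lm(); this is only an illustration of the general linear model idea, not a recommended way to fit models.

```r
group <- rep(c(0, 1), each = 5)                   # hypothetical data
y     <- c(3, 4, 5, 4, 4,  5, 6, 7, 6, 6)

## Design matrix X: a column of ones (for the constant) plus the dummy variable
X <- cbind(1, group)

## Ordinary least squares solution b = (X'X)^(-1) X'y
b <- solve(t(X) %*% X) %*% t(X) %*% y
b

coef(lm(y ~ group))   # lm() gives the same estimates
```

In practice, inverting X'X with solve() is numerically less stable than the QR decomposition lm() uses internally; the point here is only to connect the matrix formula to the familiar output.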