Chapter 8: Bivariate Regression and Correlation

Overview
The Scatter Diagram
Two Examples: Education & Prestige
Correlation Coefficient
Bivariate Linear Regression Line
SPSS Output Interpretation
Covariance

Overview
The technique depends on the level of measurement of the independent and dependent variables:
Nominal independent variable, nominal dependent variable: considers the distribution of one variable across the categories of another variable. Measure: Lambda. You already know how to deal with two nominal variables.
Nominal independent variable, interval dependent variable: compares the mean of one group on a variable with the mean of another group. Techniques: confidence intervals, t-test. We will deal with this later in the course.
Interval independent variable, nominal dependent variable: considers how a change in a variable affects a discrete outcome. Technique: logistic regression. This cell is not covered in this course.
Interval independent variable, interval dependent variable: considers the degree to which a change in one variable results in a change in another. Techniques: regression, correlation. TODAY!

General Examples
Does a change in one variable significantly affect another variable?
Do two scores tend to co-vary positively (high on one score, high on the other; low on one, low on the other)?
Do two scores tend to co-vary negatively (high on one score, low on the other; low on one, high on the other)?

Specific Examples
Does getting older significantly influence a person’s political views?
Does marital satisfaction increase with length of marriage?
How does an additional year of education affect one’s earnings?

Scatter Diagrams
Scatter diagram (scatterplot): a visual method used to display a relationship between two interval-ratio variables. Typically, the independent variable is placed on the X-axis (horizontal axis), while the dependent variable is placed on the Y-axis (vertical axis).

Scatter Diagram Example The data…

Scatter Diagram Example

A Scatter Diagram Example of a Negative Relationship

Linear Relationships
Linear relationship: a relationship between two interval-ratio variables in which the observations displayed in a scatter diagram can be approximated with a straight line.
Deterministic (perfect) linear relationship: a relationship between two interval-ratio variables in which all the observations (the dots) fall along a straight line. The line provides a predicted value of Y (the vertical axis) for any value of X (the horizontal axis).

Graph the data below and examine the relationship:

The Seniority-Salary Relationship

Example: Education & Prestige
Does education predict occupational prestige? If so, then the higher the respondent’s level of education, as measured by number of years of schooling, the greater the prestige of the respondent’s occupation. Take a careful look at the scatter diagram on the next slide and see whether you think a relationship exists between these two variables…

Scatterplot of Prestige by Education

Example: Education & Prestige
The points in the scatter diagram can be approximated by a straight line, so there is a linear relationship between these two variables. In addition, since occupational prestige rises as years of education increase, the relationship is a positive one.

Take your best guess?
If you know nothing else about a person except that he or she lives in the United States, and I asked you to guess his or her age, what would you guess? The mean age for U.S. residents.
Now if I tell you that this person owns a skateboard, would you change your guess? (Of course!)
With quantitative analyses we are generally trying to predict, or take our best guess at, the value of the dependent variable. One way to assess the relationship between two variables is to consider the degree to which the extra information of the second variable makes your guess better. If someone owns a skateboard, that is likely to indicate that s/he is younger, and we may be able to guess closer to the actual value.
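The skateboard idea can be sketched numerically. The ages below are invented for illustration and do not come from the slides; the sketch only shows that guessing each person's group mean leaves less squared error than guessing the overall mean for everyone.

```python
# Hypothetical ages, invented for illustration only.
ages_skateboard = [12, 15, 17, 20]       # people who own a skateboard
ages_no_skateboard = [30, 42, 55, 61]    # people who do not


def mean(values):
    return sum(values) / len(values)


def squared_error(values, guess):
    """Sum of squared differences between each value and a single guess."""
    return sum((v - guess) ** 2 for v in values)


all_ages = ages_skateboard + ages_no_skateboard

# Guess 1: ignore skateboard ownership and guess the overall mean for everyone.
error_overall = squared_error(all_ages, mean(all_ages))

# Guess 2: use skateboard ownership and guess the mean of each group.
error_by_group = (
    squared_error(ages_skateboard, mean(ages_skateboard))
    + squared_error(ages_no_skateboard, mean(ages_no_skateboard))
)

print(error_overall, error_by_group)
```

The second error is smaller because the extra variable carries information about age, which is the sense in which it "makes your guess better".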

Take your best guess?
Similar to the example of age and the skateboard, we can take a much better guess at someone’s occupational prestige if we have information about her/his years or level of education.

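Equation for a Straight Line
$Y = a + bX$
where
a = intercept
b = slope (rise / run)
Y = dependent variable
X = independent variable
[Figure: a straight line plotted against the X and Y axes, with the intercept a and the slope b = rise/run marked.]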
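As a small illustration of "slope = rise / run", the sketch below recovers a and b from two points on a line. The function name `line_through` and the two points are hypothetical, chosen only for this example.

```python
def line_through(p1, p2):
    """Return (a, b) for the line Y = a + bX passing through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no finite slope")
    b = (y2 - y1) / (x2 - x1)   # slope = rise / run
    a = y1 - b * x1             # intercept = value of Y when X = 0
    return a, b


a, b = line_through((1, 5), (3, 9))
print(a, b)  # prints: 3.0 2.0
```

Here Y rises by 4 while X runs by 2, so the slope is 2, and extending the line back to X = 0 gives the intercept 3.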

Bivariate Linear Regression Equation
$\hat{Y} = a + bX$
Y-intercept (a): the point where the regression line crosses the Y-axis, or the predicted value of Y when X = 0.
Slope (b): the change in variable Y (the dependent variable) with a unit change in X (the independent variable).
Ordinary least squares (OLS) chooses the estimates of a and b so that the sum of the squared differences between the observed and predicted values, $\sum (Y - \hat{Y})^2$, is minimized. Under the standard (Gauss-Markov) assumptions, these OLS estimates are the Best Linear Unbiased Estimators (BLUE) of the intercept and slope.
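To make "minimizes the sum of squared differences" concrete, here is a deliberately naive sketch that tries many candidate lines and keeps the one with the smallest sum of squared errors. The data are invented for illustration, and the brute-force grid search is not how SPSS or any statistics package actually fits the line; it only demonstrates the criterion.

```python
# Hypothetical data, invented for illustration (not from the slides).
education = [8, 10, 12, 14, 16]    # X: years of schooling
prestige = [25, 30, 41, 44, 55]    # Y: occupational prestige score


def sum_squared_errors(a, b, xs, ys):
    """Sum of (Y - Y_hat)^2 for the candidate line Y_hat = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Try every intercept from -10.0 to 10.0 and every slope from 0.0 to 10.0,
# in steps of 0.1, and keep the pair with the smallest sum of squared errors.
candidates = [(i / 10, j / 10) for i in range(-100, 101) for j in range(0, 101)]
best_a, best_b = min(
    candidates,
    key=lambda line: sum_squared_errors(line[0], line[1], education, prestige),
)
best_sse = sum_squared_errors(best_a, best_b, education, prestige)

print(best_a, best_b, round(best_sse, 2))
```

For these made-up numbers the search settles on a = -5.4 and b = 3.7, which is also what the closed-form least-squares formulas give; any other line on the grid has a larger sum of squared errors.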

SPSS Regression Output (GSS) Education & Prestige

SPSS Regression Output (GSS) Education & Prestige Now let’s interpret the SPSS output...

The Regression Equation
Prediction equation: $\hat{Y} = 6.120 + 2.762X$
This line represents the predicted values of Y for any and all values of X.

Interpreting the regression equation
$\hat{Y} = 6.120 + 2.762X$
If a respondent had zero years of schooling, this model predicts that his or her occupational prestige score would be 6.120 points.
For each additional year of education, our model predicts a 2.762-point increase in occupational prestige.
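The interpretation can be checked by plugging values into the prediction equation. The coefficients below are the ones reported on the slide; the function name `predicted_prestige` and the chosen years of education are only illustrative.

```python
A = 6.120  # intercept from the prediction equation on the slide
B = 2.762  # slope from the prediction equation on the slide


def predicted_prestige(years_of_education):
    """Predicted occupational prestige score: Y_hat = A + B * X."""
    return A + B * years_of_education


for years in (0, 12, 16):
    print(years, round(predicted_prestige(years), 3))
```

With 0 years the prediction is the intercept itself, and each additional year adds exactly the slope, so 12 years gives about 39.264 and 16 years about 50.312.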

Ordinary Least Squares
Least-squares line (best-fitting line): a line where the error sum of squares, or $\sum e^2$, is at a minimum.
Least-squares method: the technique that produces the least-squares line.

Estimating the slope: b
The bivariate regression coefficient, or the slope of the regression line, can be obtained from the observed X and Y scores.

Covariance and Variance
Covariance of X and Y: a measure of how X and Y vary together. Covariance will be close to zero when X and Y are unrelated. It will be greater than zero when the relationship is positive and less than zero when the relationship is negative.
Variance of X: we have talked a lot about variance in the dependent variable. This is simply the variance for the independent variable.
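The formulas for these quantities did not survive in the transcript. The standard sample definitions, which are presumably what the slide showed, are given below; the symbols $S_{YX}$ and $S_X^2$ are assumed notation, and the slope is the ratio of the two.

```latex
S_{YX} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}
\qquad
S_X^2 = \frac{\sum (X - \bar{X})^2}{N - 1}
\qquad
b = \frac{S_{YX}}{S_X^2}
```

Because both quantities share the denominator $N - 1$, it cancels in the ratio, so b depends only on the two sums of deviations.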

Estimating the Intercept
The regression line always goes through the point corresponding to the means of both X and Y, $(\bar{X}, \bar{Y})$. So we use this information to solve for a: $a = \bar{Y} - b\bar{X}$
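A minimal sketch of the two estimation steps, assuming the usual sample formulas: the slope is the covariance of X and Y divided by the variance of X, and the intercept follows from the line passing through the two means. The data and function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance (N - 1 in the denominator)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def variance(xs):
    """Sample variance, i.e. the covariance of a variable with itself."""
    return covariance(xs, xs)


def regression_line(xs, ys):
    """Return (a, b) for the least-squares line Y_hat = a + b*X."""
    b = covariance(xs, ys) / variance(xs)   # slope
    a = mean(ys) - b * mean(xs)             # line passes through the means
    return a, b


# Hypothetical data, invented for illustration (not from the slides).
education = [8, 10, 12, 14, 16]
prestige = [25, 30, 41, 44, 55]

a, b = regression_line(education, prestige)
print(round(a, 3), round(b, 3))

# The fitted line evaluated at the mean of X returns the mean of Y.
print(round(a + b * mean(education), 3), mean(prestige))
```

For these numbers the covariance is 37 and the variance of X is 10, so b = 3.7 and a = 39 - 3.7 × 12 = -5.4.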

Back to the original scatterplot:

A Representative Line

Other Representative Lines

Calculating the Regression Equation

Calculating the Regression Equation

The Least Squares Line!

Summary: Properties of the Regression Line
Represents the predicted values for Y for any and all values of X.
Always goes through the point corresponding to the mean of both X and Y.
It is the best-fitting line in that it minimizes the sum of the squared deviations.
Has a slope that can be positive or negative; the null hypothesis is that the slope is zero.

Coefficient of Determination
Coefficient of determination ($r^2$): a PRE measure reflecting the proportional reduction of error that results from using the linear regression model. It reflects the proportion of the total variation in the dependent variable, Y, explained by the independent variable, X.
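The "proportional reduction of error" reading can be sketched directly: compare the squared error from always guessing the mean of Y with the squared error from guessing with the regression line. The data and the variable names `total_ss` and `error_ss` are assumptions made for this illustration.

```python
# Hypothetical data, invented for illustration (not from the slides).
education = [8, 10, 12, 14, 16]
prestige = [25, 30, 41, 44, 55]


def mean(values):
    return sum(values) / len(values)


mean_x, mean_y = mean(education), mean(prestige)

# Least-squares slope and intercept.
b = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(education, prestige))
    / sum((x - mean_x) ** 2 for x in education)
)
a = mean_y - b * mean_x

# Error when we ignore X and always guess the mean of Y.
total_ss = sum((y - mean_y) ** 2 for y in prestige)

# Error when we guess with the regression line instead.
error_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(education, prestige))

# Proportion of the original error removed by using the regression line.
r_squared = (total_ss - error_ss) / total_ss
print(round(r_squared, 3))
```

In this made-up sample the regression line removes about 97% of the squared error left by guessing the mean, so education "explains" about 97% of the variation in prestige.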

Coefficient of Determination

Coefficient of Determination

The Correlation Coefficient
Pearson’s correlation coefficient (r): the square root of $r^2$, taking the sign of the slope b. It is a measure of association between two interval-ratio variables.
Symmetrical measure: no specification of independent or dependent variables.
Ranges from –1.0 to +1.0. The sign (±) indicates direction.
The closer the number is to ±1.0, the stronger the association between X and Y.
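The symmetry of r can be illustrated with a short sketch: swapping X and Y leaves r unchanged, whereas the regression slope changes depending on which variable is treated as dependent. The data and helper names are hypothetical.

```python
import math

# Hypothetical data, invented for illustration (not from the slides).
education = [8, 10, 12, 14, 16]
prestige = [25, 30, 41, 44, 55]


def mean(values):
    return sum(values) / len(values)


def sum_of_products(xs, ys):
    """Sum of (x - mean_x) * (y - mean_y) over all pairs."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def pearson_r(xs, ys):
    """Pearson's correlation coefficient."""
    return sum_of_products(xs, ys) / math.sqrt(
        sum_of_products(xs, xs) * sum_of_products(ys, ys)
    )


def slope(xs, ys):
    """Least-squares slope for predicting ys from xs."""
    return sum_of_products(xs, ys) / sum_of_products(xs, xs)


r_xy = pearson_r(education, prestige)
r_yx = pearson_r(prestige, education)
b_y_on_x = slope(education, prestige)
b_x_on_y = slope(prestige, education)

print(round(r_xy, 3), round(r_yx, 3))          # same value both ways
print(round(b_y_on_x, 3), round(b_x_on_y, 3))  # two different slopes
```

The two slopes differ, but their product equals $r^2$, which is one way to see that r summarizes the association without designating an independent variable.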

The Correlation Coefficient
r = 0 means that there is no linear association between the two variables.
r = +1 means a perfect positive correlation.
r = –1 means a perfect negative correlation.
[Figures: scatter diagrams of Y against X illustrating r = 0, r = +1, and r = –1.]
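The three reference values of r can be reproduced with tiny made-up data sets. The numbers below are hypothetical; the third case is a curve, included to show that r = 0 rules out a linear association but not every kind of relationship.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]
rising = [2 * x + 1 for x in xs]    # every point on an upward-sloping line
falling = [-3 * x for x in xs]      # every point on a downward-sloping line
curved = [x ** 2 for x in xs]       # a perfect U-shape, but not a line

print(pearson_r(xs, rising))    # 1.0
print(pearson_r(xs, falling))   # -1.0
print(pearson_r(xs, curved))    # 0.0
```

In the U-shaped case Y is completely determined by X, yet the positive and negative halves cancel, so the linear measure r comes out as zero.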