
Simple Linear Regression

Correlation indicates the magnitude and direction of the linear relationship between two variables. Linear regression: variable Y (the criterion) is predicted from variable X (the predictor) using a linear equation. Advantages: scores on X allow prediction of scores on Y, and the framework allows multiple predictors (continuous and categorical), so you can control for other variables.

Linear Regression Equation
Geometry equation for a line: y = mx + b
Regression equation for a line (population): y = β₀ + β₁x
β₀: point where the line intercepts the y-axis
β₁: slope of the line
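The equation can be read as a simple function: given an intercept and a slope, every x maps to a predicted y. A minimal sketch in Python (the lecture itself uses SPSS; the values below are illustrative):

```python
def predict(x, b0, b1):
    """Predicted y on the line y = b0 + b1*x."""
    return b0 + b1 * x

# Illustrative line with intercept 3 and slope -2 (as in the y = 3 - 2x slide example):
print(predict(0, 3, -2))  # at x = 0 the prediction is the intercept: 3
print(predict(1, 3, -2))  # each 1-unit increase in x changes y by the slope: 1
```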

Regression: Finding the Best-Fitting Line
[scatterplot: Grade in Class (x) vs. Course Evaluations (y)]

Best-Fitting Line
[scatterplot: Grade in Class (x) vs. Course Evaluations (y), with vertical distances from each point to the line]
The line is chosen to minimize this squared distance, summed across all data points.
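Minimizing the summed squared distances has a closed-form solution: b₁ = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², and b₀ = ȳ − b₁x̄. A plain-Python sketch with made-up data (the course-evaluation data from the slide are not available):

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the line minimizing the sum of squared residuals."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    b0 = ybar - b1 * xbar
    return b0, b1

# Made-up example data
b0, b1 = least_squares([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 8.1, 9.8])
print(round(b0, 2), round(b1, 2))  # 0.14 1.96
```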

Slope and Intercept in Scatterplots
y = b₀ + b₁x + e
Example: y = 3 − 2x + e
slope: rise/run = −2/1 = −2

Estimating the Equation from a Scatterplot
y = b₀ + b₁x + e
Example: y = 5 + .3x + e (run = 50, rise = 15, so slope = 15/50 = .3)
Predict price at quality = 90: y = 5 + .3(90) = 32

Example: Van Camp, Barden & Sloan (2010)
Contact with Blacks Scale, e.g., “What percentage of your neighborhood growing up was Black?” (0%–100%)
Race-Related Reasons for College Choice, e.g., “To what extent did you come to Howard specifically because the student body is predominantly Black?” (1 = not very much, 10 = very much)
Your prediction: how would prior contact predict race-related reasons?

Results: Van Camp, Barden & Sloan (2010)
Regression equation (sample): y = b₀ + b₁x + e
Contact (x) predicts Reasons: y = b₀ + b₁x + e
b₀: t(107) = 14.17, p < .01
b₁: t(107) = −2.93, p < .01
df = N − k − 1 = 109 − 1 − 1 = 107 (k = number of predictors entered)
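The degrees of freedom for the coefficient t tests follow directly from the sample size and the number of predictors, as a quick check:

```python
def residual_df(n, k):
    """Residual degrees of freedom for regression coefficient t tests: N - k - 1."""
    return n - k - 1

# N = 109 participants, k = 1 predictor entered
print(residual_df(109, 1))  # 107, matching the t(107) values reported on the slide
```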

Unstandardized and Standardized b
Unstandardized b: in the original units of X and Y; tells us how much a change in X produces a change in Y in the original units (meters, scale points, …); not possible to compare the relative impact of multiple predictors.
Standardized b: scores are first standardized to SD units; a +1 SD change in X produces a change of b SDs in Y; indicates the relative importance of multiple predictors of Y.
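The two slopes are linked by the standard deviations of X and Y: standardized b (beta) = unstandardized b × (SDx / SDy). A hypothetical check (the numbers here are made up, not the study's):

```python
def standardized_slope(b, sd_x, sd_y):
    """Standardized (beta) slope from an unstandardized slope b."""
    return b * sd_x / sd_y

# Hypothetical: b = 0.5 scale points per unit of X, SDx = 2, SDy = 4
print(standardized_slope(0.5, 2.0, 4.0))  # 0.25 SDs of Y per SD of X
```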

Results: Van Camp, Barden & Sloan (2010)
Contact predicts Reasons:
Unstandardized: y = b₀ + b₁x + e (Mx = 5.89, SDx = 2.53; My = 5.61, SDy = 2.08)
Standardized: y = b₀ + b₁x + e (Mx = 0, SDx = 1.00; My = 0, SDy = 1.00)

SPSS: save new variables that are standardized versions of the current variables.

SPSS chart editor: add fit lines; add reference lines (may need to adjust to the mean); select the fit line.

Predicting Y from X
Once we have the line, we know the change in Y that accompanies each change in X.
Y prime (Y′) is the prediction of Y at a given X; it is the average Y score at that X score.
Warning: predictions can only be made (1) within the range of the sample, and (2) for individuals taken from a similar population under similar circumstances.

Errors Around the Regression Line
The regression equation gives us the straight line that minimizes the error involved in making predictions (the least-squares regression line).
Residual: the difference between an actual Y value and the predicted (Y′) value: Y − Y′.
– It is the amount of the original value left over after the prediction is subtracted out.
– The amount of error above and below the line is the same.

[figure: scatterplot showing Y, Y′, and the residual Y − Y′]

Dividing Up Variance
Total: deviation of individual data points from the sample mean.
Explained: deviation of the regression line from the mean.
Unexplained: deviation of individual data points from the regression line (error in prediction).
total variance = explained variance + unexplained (residual) variance

[figure: scatterplot decomposing the total deviation from the mean into an explained part (line vs. mean) and an unexplained residual part (point vs. line)]

Coefficient of determination: the proportion of the total variance that is explained by the predictor variable.
R² = explained variance / total variance
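Putting the pieces together, a plain-Python sketch (made-up data) that fits the least-squares line, partitions the variance, checks that SS total = SS explained + SS residual, and returns R² as the explained share:

```python
def r_squared(xs, ys):
    """Fit y = b0 + b1*x by least squares and return R^2."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    b0 = ybar - b1 * xbar
    preds = [b0 + b1 * x for x in xs]
    ss_total = sum((y - ybar) ** 2 for y in ys)                  # points vs. mean
    ss_explained = sum((p - ybar) ** 2 for p in preds)           # line vs. mean
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, preds))   # points vs. line
    assert abs(ss_total - (ss_explained + ss_residual)) < 1e-9   # partition holds
    return ss_explained / ss_total

print(round(r_squared([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 8.1, 9.8]), 3))  # 0.998
```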

SPSS: Regression
Analyze → Regression → Linear
Select the criterion variable (Y) [RaceReas] (SPSS calls this the DV)
Select the predictor variable (X) [ContactBlacks] (SPSS calls this the IV)
OK

SPSS output (numeric values lost in this transcript):
Coefficients table: unstandardized coefficients (B, Std. Error) and standardized coefficients (Beta), with t and Sig. for the Constant and ContactBlacksperc rows; dependent variable: RaceReasons.
ANOVA table: Sum of Squares, df, Mean Square, F, and Sig. for Regression, Residual, and Total; predictors: (Constant), ContactBlacksperc.
Model Summary: R, R Square (the coefficient of determination), Adjusted R Square, Std. Error of the Estimate.
Unstandardized: y = b₀ + b₁x + e; Standardized: y = b₀ + b₁x + e.
Reporting in Results: b = −.27, t(107) = −2.93, p < .01 (p. 240 in Van Camp 2010).
SS error: minimized in OLS.

Assumptions Underlying Linear Regression
1. Independent random sampling
2. Normal distribution
3. Linear relationships (not curvilinear)
4. Homoscedasticity of errors (homogeneity)
Best way to check 2–4? Diagnostic plots.

Test for Normality
Right (positive) skew: solution is to transform the data.
Narrow distribution: not serious.
Positive outliers: investigate further.
Normal distribution: no correction needed.

Homoscedastic? Linear Appropriate?
Heteroscedasticity (of residual errors): solution is to transform the data or use weighted least squares (WLS).
Curvilinear relationship (linear regression not appropriate as-is): solution is to add x² as a predictor.
Homoscedastic residual errors & linear relationship: no correction needed.
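“Add x² as a predictor” means building the model y = b₀ + b₁x + b₂x² and fitting it as a multiple regression. A minimal sketch of just the feature-building step (the function name is my own, not SPSS's):

```python
def quadratic_design(xs):
    """Design-matrix rows for y = b0 + b1*x + b2*x^2: (intercept, x, x^2)."""
    return [(1.0, x, x ** 2) for x in xs]

rows = quadratic_design([1.0, 2.0, 3.0])
print(rows[2])  # (1.0, 3.0, 9.0): intercept column, linear term, squared term
```

Fitting these rows with any multiple-regression routine then estimates the curvature coefficient b₂ alongside b₀ and b₁.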

SPSS—Diagnostic Graphs

END