
4 basic analytical tasks in statistics:
1) Comparing scores across groups → look for differences in means
2) Cross-tabulating categorical variables → look for contingencies
3) Computing correlations among variables → look for covariances
4) Predicting scores on an outcome variable from numerical predictor variables → look for causal effects (or predicted outcomes)
-- The focus this week is on the 4th task
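As a minimal sketch (hypothetical data, not from the slides), each of the four tasks maps onto a standard routine in Python's scipy.stats:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
g1, g2 = rng.normal(50, 10, 30), rng.normal(55, 10, 30)  # scores in two groups
table = np.array([[20, 30], [25, 25]])                   # a 2x2 cross-tabulation
x = rng.normal(size=50)
y = 2 * x + rng.normal(size=50)                          # two numeric variables

print(stats.ttest_ind(g1, g2))        # 1) difference in group means
print(stats.chi2_contingency(table))  # 2) contingency between categorical variables
print(stats.pearsonr(x, y))           # 3) correlation between variables
print(stats.linregress(x, y))         # 4) predict y from x (this week's focus)
```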

“Correlation” (revisited)
Correlation = strength of the linear association between 2 numeric variables
– It reflects the degree to which the association is described by a “straight-line” relationship
– The degree to which the two variables covary or share common variance [“covariance” = a key term]
– It reflects the “commonality” (“predictability”) between the two variables
Note: r² (r-squared) = the proportion of variance that is “shared” or common to both variables
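A quick numeric illustration of r and r², using hypothetical paired scores:

```python
import numpy as np

# Hypothetical paired scores on two numeric variables
x = np.array([1., 2., 3., 4., 5., 6.])
y = np.array([2., 1., 4., 3., 7., 6.])

r = np.corrcoef(x, y)[0, 1]    # Pearson r: strength of the linear association
print(f"r   = {r:.3f}")        # corrcoef(y, x) would give the same value
print(f"r^2 = {r**2:.3f}")     # proportion of variance the two variables share
```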

“Regression” = closely related topic
The relationship/difference between correlation and regression?
– Correlation = compute the degree to which values of the variables cluster around a straight line
→ a symmetric description (r_xy = r_yx)
→ a standardized measure
– Regression = compute the equation for the “best fitting” straight line (Y = a + bX)
→ an asymmetric description (b_yx ≠ b_xy)
→ an unstandardized measure (usually)
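The symmetry/asymmetry point can be checked directly. In this sketch (hypothetical data), the correlation is the same in either direction, while the two slopes differ but multiply to r²:

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 2., 5., 4., 8.])

r = np.corrcoef(x, y)[0, 1]                              # same whichever variable comes first
b_yx = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)    # slope regressing Y on X
b_xy = np.cov(x, y, ddof=1)[0, 1] / np.var(y, ddof=1)    # slope regressing X on Y

print(f"r = {r:.3f} (symmetric)")
print(f"b_yx = {b_yx:.3f}, b_xy = {b_xy:.3f} (asymmetric)")
print(np.isclose(r**2, b_yx * b_xy))  # the two slopes are linked: r^2 = b_yx * b_xy
```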

Linear Regression

So, what’s the deal with “Regression”? Why is “regression” called that?
a) The term was introduced by Francis Galton in the late 19th century to describe the prediction of genetic traits across generations, reflecting imperfect correlations between parents and children
b) It referred to the tendency of extreme values of traits to “regress toward the mean” across successive generations, reflecting Galton’s interest in the heritability of genius & other unusual traits
c) Correct word use: we “regress the dependent variable on the independent variable” → Y = a + b_yx X

What’s the deal with “Regression”? (cont.)
Why is regression used in data analysis?
→ To describe the functional pattern that links 2 variables together in a correlation – i.e., what are the optimal values of a and b for X & Y?
→ Two basic uses of regression:
a) Prediction: predict values of one variable (Y) from values of another variable (X) (using the linear equation)
b) Explanation: estimate the causal influence of one variable (X) on another (Y) (based on the measurable correlation); test a causal hypothesis about how Y and X are related

How is regression analysis done? By fitting a straight line to a set of bivariate points (values on 2 variables for the same data units)
– y = a + b_yx x (basic formula for a linear relation)
– y = the dependent variable
– x = the independent variable
– a = the “intercept”
– b_yx = the “slope” of the line
The concern is with fitting the straight line that minimizes the errors of prediction (of y from x): observed = predicted + error

2 ways of expressing the prediction equation:
Ŷ = a + b_yx X (predicted values)
or
Y = a + b_yx X + e (observed values = predicted + error)


“Regression”
How to obtain the straight line that “best fits” the data?
– Rely on a method called “least squares”, which minimizes the sum of the squared errors (deviations between the line and the data points)
– Yields the best-fitting line to the points
– Yields the formulas for a and b provided in the book
How to compute the regression coefficients?
By hand calculations:
– Definitional formula (the familiar one)
– Computational formula (no deviation scores)
By SPSS: Analyze → Regression → Linear
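A minimal sketch of the least-squares criterion in Python (hypothetical data; np.polyfit is one of several routines that will do this):

```python
import numpy as np

# Hypothetical bivariate data points
x = np.array([1., 2., 3., 4., 5., 6.])
y = np.array([3., 5., 4., 8., 7., 10.])

# Degree-1 polyfit returns the (slope, intercept) that minimize the
# sum of squared vertical deviations of the points from the line
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x
sse = np.sum((y - y_hat) ** 2)  # the quantity "least squares" minimizes
print(f"Y = {a:.3f} + {b:.3f}X, SSE = {sse:.3f}")
```

Any other line through these points would yield a larger SSE; SPSS's Linear Regression procedure fits by the same criterion.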

Regression Coefficient: Definitional Formula
b_yx = Σ(X − X̄)(Y − Ȳ) / Σ(X − X̄)²
Regression Coefficient: Computational Formula
b_yx = [N ΣXY − (ΣX)(ΣY)] / [N ΣX² − (ΣX)²]
Intercept (Constant): Computational Formula
a = Ȳ − b_yx X̄
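Translated directly into Python (hypothetical data; the function names are ours), the definitional and computational formulas give identical slopes:

```python
import numpy as np

def b_definitional(x, y):
    # b = sum((X - Xbar)(Y - Ybar)) / sum((X - Xbar)^2)
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

def b_computational(x, y):
    # b = (N*sum(XY) - sum(X)*sum(Y)) / (N*sum(X^2) - (sum(X))^2)
    n = len(x)
    return (n * np.sum(x * y) - x.sum() * y.sum()) / (n * np.sum(x ** 2) - x.sum() ** 2)

def intercept(x, y, b):
    # a = Ybar - b * Xbar
    return y.mean() - b * x.mean()

x = np.array([1., 2., 3., 4., 5.])
y = np.array([2., 4., 5., 4., 7.])
b = b_definitional(x, y)
print(b, b_computational(x, y), intercept(x, y, b))  # the two b formulas agree
```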

“Regression” Use
Example from the Fox/Levin/Forde text (p. 277) (handout)
Data: Prior Charges (X) and Sentence in months (Y)

# Priors (X)   Sentence (Y)   X²        Y²         XY
(individual cases as listed in the handout)
Σ = 40         Σ = 260        Σ = 260   Σ = 8010   Σ = 1340

Regression Example (cont.)
b = [N ΣXY − (ΣX)(ΣY)] / [N ΣX² − (ΣX)²] = [10(1340) − (40)(260)] / [10(260) − 40²] = 3000 / 1000 = 3.0
a = Ȳ − b X̄ = 26 − (3.0)(4) = 14.0
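A quick check of the arithmetic from the column sums alone (a minimal sketch; N = 10 is taken from the example, consistent with X̄ = 4 and Ȳ = 26):

```python
# Verifying the worked example from its summary sums
N = 10
sum_x, sum_y, sum_x2, sum_xy = 40, 260, 260, 1340

b = (N * sum_xy - sum_x * sum_y) / (N * sum_x2 - sum_x ** 2)  # 3000 / 1000
a = sum_y / N - b * (sum_x / N)                               # 26 - 3.0 * 4
print(b, a)  # 3.0 14.0 -> predicted sentence = 14 + 3 * (prior charges)
```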

Regression example (continued)

Regression (continued) – How to interpret the results?
Slope (b) = predicted change in Y for a 1-unit change in X
→ Unstandardized slope (b) = in the original units/metric
→ Standardized slope (β) [beta] = in standard (Z) units
Intercept (a) = predicted value of Y when X = 0
→ Interpretable only when zero is a meaningful value of X
→ Also called the “constant” term since it is the same for all values of X
R (multiple r) = correlation between Y and the predictor(s) (predictability of Y from the Xs)
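One way to see the unstandardized/standardized distinction (hypothetical data): fit the raw scores for b and a, then fit z-scores for β, which in the bivariate case equals Pearson's r:

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5., 6.])
y = np.array([10., 14., 13., 19., 18., 24.])

b, a = np.polyfit(x, y, 1)          # slope and intercept in the original units
zx = (x - x.mean()) / x.std(ddof=1) # convert both variables to Z scores
zy = (y - y.mean()) / y.std(ddof=1)
beta, _ = np.polyfit(zx, zy, 1)     # slope of the z-score regression

print(f"b = {b:.3f} (change in Y per 1-unit change in X)")
print(f"a = {a:.3f} (predicted Y when X = 0)")
print(f"beta = {beta:.3f} (equals Pearson r in the bivariate case)")
```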

Regression (continued)
What are the assumptions/requirements of regression?
1. Numeric variables (interval or ratio level)
2. Linear relationship between the variables
3. Random sampling
4. Normal distribution of the data
5. Homoscedasticity (equal conditional variances)
What if the assumptions do not hold?
1. Don’t worry about small deviations
2. May be able to transform variables
3. May use alternative procedures
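If checks are wanted before trusting the output, here is one rough sketch on simulated data (these particular tests are an assumed choice, not prescribed by the slides): Shapiro-Wilk for residual normality and Levene's test for roughly equal residual spread across the range of X:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2 + 3 * x + rng.normal(0, 2, 100)  # simulated data meeting the assumptions

b, a = np.polyfit(x, y, 1)
resid = y - (a + b * x)                # residuals: observed minus predicted

print(stats.shapiro(resid))            # normality of the residuals
lo, hi = resid[x < np.median(x)], resid[x >= np.median(x)]
print(stats.levene(lo, hi))            # crude homoscedasticity check across X
```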

Regression (continued)
How to test for significance of the results?
– F-test for the overall regression
– t-test for individual b coefficients
What is R? (or R²?)
Can we use more than one independent variable?
– Yes – it’s called “multiple regression”
– Regress a single dependent variable (Y) on multiple independent variables (a linear combination that best predicts Y)
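Both tests appear in standard regression output; a minimal sketch with Python's statsmodels on simulated data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 1 + 2 * x + rng.normal(size=50)

X = sm.add_constant(x)           # adds the intercept column to the design matrix
fit = sm.OLS(y, X).fit()

print(fit.fvalue, fit.f_pvalue)  # F-test for the overall regression
print(fit.tvalues, fit.pvalues)  # t-tests for the intercept and each b
print(fit.rsquared)              # R^2: proportion of Y variance explained
```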

Multiple Regression – addenda
Simultaneous analysis of the regression of a dependent variable on 2 or more independent variables
→ Y_i = a + b₁X₁ + b₂X₂ + b₃X₃ + e_i
All coefficients are computed at once
– In this case, the b coefficients are partial regression coefficients
– They reflect the unique predictive ability of each variable (with the covariance of the other independent variables “partialled out”)
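A minimal sketch on simulated data with two correlated predictors; solving the least-squares problem for the full design matrix yields all the partial coefficients at once:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)       # x1 and x2 are correlated
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)

# Solve for a, b1, b2 simultaneously by least squares
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # [a, b1, b2]; each b is a partial coefficient: the effect of
             # that predictor with the other predictor held constant
```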

Multiple Regression
What is Multiple Regression good for?
→ It allows us to estimate:
– The combined effects of multiple variables
– The unique effects of individual variables
→ It allows us to test causal theories about these combined and unique effects
In this case, R² measures how well the entire model does in predicting Y.
→ The overall F-test refers to the whole set of variables
→ The t-tests apply to the coefficients of each variable