Cal State Northridge Psy 320, Andrew Ainsworth PhD: Regression

2 What is regression? How do we predict one variable from another? How does one variable change as the other changes? Cause and effect.

3 Linear Regression A technique used to predict the most likely score on one variable from scores on another variable. It uses the nature of the relationship (i.e., the correlation) between two (or more; next chapter) variables to improve your prediction.

4 Linear Regression: Parts
–Y = the variable you are predicting (i.e., the dependent variable)
–X = the variable you are using to predict (i.e., the independent variable)
–Ŷ = your predictions (also known as Y′)

5 Why Do We Care? We may want to make a prediction. More likely, we want to understand the relationship. –How fast does CHD mortality rise with a one-unit increase in smoking? –Note: we speak about predicting, but often don’t actually predict.

6 An Example Cigarettes and CHD mortality, from Chapter 9; data repeated on the next slide. We want to predict the level of CHD mortality in a country averaging 10 cigarettes per adult per day.

7 The Data Based on the data we have, what would we predict the CHD rate to be in a country that smokes 10 cigarettes per adult per day on average? First, we need to establish a prediction equation for CHD from smoking…

8 For a country that smokes 6 C/A/D, we predict a CHD rate of about 14. [Scatterplot with fitted regression line shown on slide.]

9 Regression Line Formula: Ŷ = bX + a
–Ŷ = the predicted value of Y (e.g., CHD mortality)
–X = the predictor variable (e.g., average cig./adult/day for each country)

10 Regression Coefficients The “coefficients” are a and b. b = slope –the change in predicted Y for a one-unit change in X. a = intercept –the value of Ŷ when X = 0.
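The slope and intercept can be sketched in a few lines of Python. This is a minimal illustration with made-up numbers, not the slides' CHD data:

```python
def regression_coefficients(x, y):
    """Least-squares coefficients: b = cov(X, Y) / s^2_X, a = mean(Y) - b * mean(X)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Sample covariance and variance (n - 1 in the denominator)
    cov_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    b = cov_xy / var_x          # slope
    a = my - b * mx             # intercept
    return a, b

# Hypothetical data for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
a, b = regression_coefficients(x, y)
```

For these numbers the slope works out to 1.96 and the intercept to 0.14, so the prediction line is Ŷ = 1.96X + 0.14.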

11 Calculation
–Slope: b = cov_XY / s²_X
–Intercept: a = Ȳ − bX̄

12 For Our Data
cov_XY = 11.12
s²_X = 5.447
b = 11.12 / 5.447 = 2.04
a = Ȳ − b·X̄ = Ȳ − (2.04)(5.952) = 2.32
See SPSS printout on next slide. Answers are not exact due to rounding error and the desire to match SPSS.

13 SPSS Printout [Shown on slide.]

14 Note: The values we obtained are shown on the printout. The intercept is the value in the B column labeled “Constant.” The slope is the value in the B column labeled with the name of the predictor variable.

15 Making a Prediction Second, once we know the relationship, we can predict. For a country with an average of 10 C/A/D: Ŷ = 2.04(10) + 2.32 ≈ 22.7, so we predict about 22.7 people per 10,000 will die of CHD.
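This arithmetic can be checked directly from the slides' own rounded figures (the exact value shifts slightly with rounding, as the slides note):

```python
b = 11.12 / 5.447        # slope = covariance / variance of X (slide 12)
a = 2.32                 # intercept reported on slide 12
y_hat = a + b * 10       # predicted CHD mortality at 10 cigarettes/adult/day
```

The result is roughly 22.7 deaths per 10,000, matching the hand calculation above to rounding.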

16 Accuracy of Prediction Finnish smokers smoke 6 C/A/D. We predict ≈ 14.62 deaths/10,000; they actually have 23 deaths/10,000. Our error (“residual”) = 23 − 14.62 = 8.38 –a large error.

17 [Scatterplot shown on slide: Cigarette Consumption per Adult per Day (X) vs. CHD Mortality per 10,000 (Y), with the prediction line and a residual marked.]

18 Residuals When we predict Ŷ for a given X, we will sometimes be in error. Y − Ŷ for any X is an error of estimate, also known as a residual. We want to make Σ(Y − Ŷ) as small as possible. BUT there are infinitely many lines that can do this: just draw ANY line that goes through the mean of the X and Y values. Minimize errors of estimate… how?

19 Minimizing Residuals Again, the problem lies with this property of the mean: the deviations sum to zero, Σ(Y − Ŷ) = 0, so positive and negative errors cancel out. So how do we get rid of the 0’s? Square them.

20 Regression Line: A Mathematical Definition The regression line is the line which, when drawn through your data set, produces the smallest value of Σ(Y − Ŷ)², called the sum of squared residuals or SS_residual. The regression line is also called the “least squares line.”
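A small numerical sketch (hypothetical data) of why the least-squares line is "least": among all lines through the point (mean X, mean Y), the least-squares slope produces the smallest sum of squared residuals.

```python
def ss_residual(x, y, a, b):
    """Sum of squared residuals, sum((Y - Yhat)^2), for the line Yhat = a + b*X."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data for illustration only
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
mx, my = sum(x) / len(x), sum(y) / len(y)

# Least-squares slope: covariance over variance of X
b_ls = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
ss_ls = ss_residual(x, y, my - b_ls * mx, b_ls)

# Any other slope through the same point (mean X, mean Y) does no better
for b_other in (b_ls - 0.5, b_ls + 0.5, 0.0):
    assert ss_residual(x, y, my - b_other * mx, b_other) >= ss_ls
```

Note that every candidate line here passes through the means, so each has Σ(Y − Ŷ) = 0; only the squared criterion distinguishes them.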

21 Summarizing Errors of Prediction Residual variance –the variability of the observations around the predicted values: s²_{Y·X} = Σ(Y − Ŷ)² / (N − 2) = SS_residual / (N − 2)

22 Standard Error of Estimate Standard error of estimate –the standard deviation of the observations around the predicted values: s_{Y·X} = √(SS_residual / (N − 2)). A common measure of the accuracy of our predictions –we want it to be as small as possible.
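A quick sketch of the standard error of estimate (hypothetical data; when the points fall exactly on a line, the value is 0):

```python
import math

def std_error_of_estimate(x, y):
    """s_est = sqrt(sum((Y - Yhat)^2) / (N - 2)): SD of the data around the regression line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ss_res / (n - 2))

perfect = std_error_of_estimate([1, 2, 3, 4], [2, 4, 6, 8])   # exact line: no error
noisy = std_error_of_estimate([1, 2, 3, 4], [2, 4, 5, 9])     # scattered points: some error
```

The divisor N − 2 reflects the two degrees of freedom spent estimating a and b.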

23 Example [Worked calculation shown on slide.]

24 Regression and Z Scores When your data are standardized (linearly transformed to z-scores), the slope of the regression line is called β. DO NOT confuse this β with the β associated with Type II errors; they’re different. When we have one predictor, r = β, and ẑ_Y = βz_X, since the intercept a now equals 0.
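The claim that r = β with one predictor can be verified numerically. A sketch with hypothetical data: standardize both variables, regress z_Y on z_X, and the slope comes out equal to Pearson's r.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the SDs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x) / (n - 1))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y) / (n - 1))
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    return cov / (sx * sy)

def standardized_slope(x, y):
    """Slope of the regression of z_Y on z_X; the intercept is 0 because both means are 0."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x) / (n - 1))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y) / (n - 1))
    zx = [(xi - mx) / sx for xi in x]
    zy = [(yi - my) / sy for yi in y]
    # slope = cov(zx, zy) / var(zx), and var(zx) = 1 after standardizing
    return sum(u * v for u, v in zip(zx, zy)) / (n - 1)

# Hypothetical data for illustration only
x = [2, 4, 5, 7, 9]
y = [1, 3, 4, 8, 9]
assert abs(standardized_slope(x, y) - pearson_r(x, y)) < 1e-9
```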

25 Partitioning Variability Sums of squared deviations:
–Total: SS_total = Σ(Y − Ȳ)²
–Regression: SS_regression = Σ(Ŷ − Ȳ)²
–Residual (already covered): SS_residual = Σ(Y − Ŷ)²
SS_total = SS_regression + SS_residual

26 Partitioning Variability Degrees of freedom:
–Total: df_total = N − 1
–Regression: df_regression = number of predictors
–Residual: df_residual = df_total − df_regression
df_total = df_regression + df_residual

27 Partitioning Variability Variance (or Mean Square):
–Total variance: s²_total = SS_total / df_total
–Regression variance: s²_regression = SS_regression / df_regression
–Residual variance: s²_residual = SS_residual / df_residual
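The partition SS_total = SS_regression + SS_residual can be demonstrated on any data set. A sketch with hypothetical numbers:

```python
def partition_ss(x, y):
    """Return (SS_total, SS_regression, SS_residual) for a simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    yhat = [a + b * xi for xi in x]
    ss_total = sum((yi - my) ** 2 for yi in y)           # deviations of Y from its mean
    ss_reg = sum((yh - my) ** 2 for yh in yhat)          # deviations of predictions from the mean
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # deviations of Y from predictions
    return ss_total, ss_reg, ss_res

# Hypothetical data for illustration only
x = [1, 2, 3, 4, 5, 6]
y = [2, 5, 4, 7, 8, 9]
ss_t, ss_r, ss_e = partition_ss(x, y)
assert abs(ss_t - (ss_r + ss_e)) < 1e-9   # the partition holds exactly
```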

28 Example [Worked calculation shown on slide.]

29 Example [Continued on slide.]

30 Coefficient of Determination r² is a measure of the percentage of predictable variability: the percentage of the total variability in Y explained by X. r² = SS_regression / SS_total

31 r² for our example: r = .713, so r² = .713² = .508, or approximately 50%. About 50% of the variability in the incidence of CHD mortality is associated with variability in smoking.
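The slide's arithmetic checks out to three decimal places:

```python
r = 0.713                 # correlation reported on the slides
r_squared = r ** 2        # proportion of variability in CHD mortality explained by smoking
```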

32 Coefficient of Alienation Defined as 1 − r², the proportion of variability not explained. Example: 1 − .508 = .492

33 r², SS and s_{Y−Y′}
r² · SS_total = SS_regression
(1 − r²) · SS_total = SS_residual
We can also use r² to calculate the standard error of estimate as: s_{Y−Y′} = s_Y √((1 − r²)(N − 1)/(N − 2))
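That the r²-based formula for the standard error of estimate agrees with the direct definition √(SS_residual / (N − 2)) is an exact algebraic identity, since SS_residual = (1 − r²)·SS_total and SS_total = (N − 1)·s²_Y. A sketch with hypothetical data:

```python
import math

# Hypothetical data; any paired sample works for this identity check
x = [1, 2, 3, 4, 5, 6, 7]
y = [2.0, 2.9, 5.1, 6.0, 6.8, 8.9, 10.1]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sx2 = sum((xi - mx) ** 2 for xi in x) / (n - 1)
sy2 = sum((yi - my) ** 2 for yi in y) / (n - 1)
cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
r = cov / math.sqrt(sx2 * sy2)
b = cov / sx2
a = my - b * mx
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Two routes to the standard error of estimate agree:
s_est_direct = math.sqrt(ss_res / (n - 2))
s_est_from_r = math.sqrt(sy2) * math.sqrt((1 - r ** 2) * (n - 1) / (n - 2))
```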

34 Hypothesis Testing Tests for the overall model. Null hypotheses: –b = 0 –a = 0 –population correlation (ρ) = 0. We saw how to test the last one in Chapter 9.

35 Testing Overall Model We can test the overall prediction of the model by forming the ratio F = s²_regression / s²_residual. If the calculated F value is larger than a tabled value (Table D.3 for α = .05 or Table D.4 for α = .01), we have significant prediction.

36 Testing Overall Model Example F critical is found in Table D.3 using two things: df_regression (numerator) and df_residual (denominator). From Table D.3, F_crit(1, 19) = 4.38; our obtained F exceeds 4.38, so the prediction is significant overall. Should all sound familiar…
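With one predictor, the overall F can also be computed from r² alone, via F = (r²/df_reg) / ((1 − r²)/df_res), which equals MS_regression / MS_residual. Plugging in the slides' r² = .508 and the df from F_crit(1, 19) gives an obtained F of roughly 19.6, comfortably past the critical value:

```python
r2 = 0.508                # r^2 from the slides
df_reg, df_res = 1, 19    # df for one predictor; df_residual from the slide's F_crit(1, 19)
f_obtained = (r2 / df_reg) / ((1 - r2) / df_res)   # equals MS_regression / MS_residual
```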

37 SPSS Output [Shown on slide.]

38 Testing Slope and Intercept The regression coefficients can be tested for significance. Each coefficient divided by its standard error gives a t value that can be looked up in a table (Table D.6). Each coefficient is tested against 0.

39 Testing Slope With only 1 predictor, the standard error of the slope is: se_b = s_{Y−Y′} / (s_X √(N − 1)). For our example: [calculation shown on slide].
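A sketch of the slope test with hypothetical data. Since s_X √(N − 1) = √SS_X, the standard error of the slope is s_est / √SS_X, and with one predictor the resulting t, squared, equals the overall F from the previous slides:

```python
import math

# Hypothetical data to illustrate t = b / se_b and the identity t^2 = F
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.8, 4.9, 4.4, 6.3, 7.9, 7.4, 9.6]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
ssx = sum((xi - mx) ** 2 for xi in x)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / ssx
a = my - b * mx
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s_est = math.sqrt(ss_res / (n - 2))          # standard error of estimate

# Standard error of the slope: s_est / (s_X * sqrt(N - 1)) = s_est / sqrt(SS_X)
se_b = s_est / math.sqrt(ssx)
t = b / se_b                                 # compare with t_crit on N - 2 df

# With one predictor, the t for the slope squared equals the overall F
ss_reg = sum((a + b * xi - my) ** 2 for xi in x)
f_overall = ss_reg / (ss_res / (n - 2))
assert abs(t ** 2 - f_overall) < 1e-9
```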

40 Testing Slope and Intercept With only 1 predictor, the standard error of the intercept is: se_a = s_{Y−Y′} √(1/N + X̄² / ((N − 1) s²_X)). For our example: [calculation shown on slide].

41 Testing Slope These are given in the computer printout as t tests.

42 Testing The t values in the second-from-right column are tests on the slope and intercept. The associated p values are next to them. The slope is significantly different from zero, but the intercept is not. Why do we care?

43 Testing What does it mean if the slope is not significant? –How does that relate to the test on r? What if the intercept is not significant? Does a significant slope mean we predict quite well?