Linear Regression

Presentation transcript:

Linear Regression:

The relationship between two variables (e.g. height and weight; age and IQ) can be described graphically with a scatterplot.

[Figure: scatterplot. x-axis: reaction time (msec), from short through medium to long; y-axis: age (years), from young through medium to old. Each point is one individual's performance: each person supplies two scores, age and RT.]

Often in psychology, we are interested in seeing whether or not a linear relationship exists between two variables. Here, there is a strong positive relationship between RT and age:

Here is an equally strong but negative relationship between RT and age:

Here, there is no relationship between RT and age:

If we find a reasonably strong linear relationship between two variables, we might want to fit a straight line to the scatterplot. There are two reasons for wanting to do this: (a) for description: the line acts as a succinct description of the "idealised" relationship between our two variables, a relationship which we assume the real data reflect somewhat imperfectly. (b) for prediction: we could use the line to obtain estimates of values for one of the variables, on the basis of knowledge of the value of the other variable (e.g. if we knew a person's height, we could predict their weight).

Linear Regression is an objective method of fitting a line to our scatterplot - better than trying to do it by eye! Which line is the best fit to the data?

The recipe for drawing a straight line: To draw a line, we need two values: (a) the intercept - the point at which the line intersects the vertical axis of the graph; (b) the slope of the line.

[Figures: lines with the same intercept but different slopes; lines with different intercepts but the same slope.]

The formula for a straight line: Y = a + b * X Y is a value on the vertical (Y) axis; a is the intercept (the point at which the line intersects the vertical axis of the graph); b is the slope of the line; X is any value on the horizontal (X) axis.
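As a minimal sketch of the formula Y = a + b * X, the snippet below evaluates a line for a few X values. The function name `line_y` and all intercept and slope values are hypothetical, chosen only to illustrate "same intercept, different slopes" and "different intercepts, same slope".

```python
def line_y(a: float, b: float, x: float) -> float:
    """Return Y = a + b * X for intercept a and slope b."""
    return a + b * x


# Hypothetical X values, used only for illustration.
xs = [0, 1, 2]

# Same intercept (a = 1.0), different slopes (b = 0.5 and b = 2.0).
same_intercept = {b: [line_y(1.0, b, x) for x in xs] for b in (0.5, 2.0)}

# Different intercepts (a = 1.0 and a = 3.0), same slope (b = 0.5).
same_slope = {a: [line_y(a, 0.5, x) for x in xs] for a in (1.0, 3.0)}

print(same_intercept)
print(same_slope)
```

At X = 0 every line returns its intercept, which is why lines sharing an intercept meet on the vertical axis.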

Linear regression step-by-step: 10 individuals do two tests: a stress test, and a statistics test. What is the relationship between stress and statistics performance?

subject   stress (X)   test score (Y)
A         18           84
B         31           67
C         25           63
D         29           89
E         21           93
F         32           63
G         40           55
H         36           70
I         35           53
J         27           77

Draw a scatterplot to see what the data look like:

There is a negative relationship between stress scores and statistics scores: people who scored high on the statistics test tend to have low stress levels, and people who scored low on the statistics test tend to have high stress levels.

Calculating the regression line: We need to find "a" (the intercept) and "b" (the slope) of the line. Work out "b" first, and "a" second.

To calculate “b”, the slope of the line:

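subject   X (stress)   X²             Y (test)   XY
A         18           18² = 324      84         18 * 84 = 1512
B         31           31² = 961      67         31 * 67 = 2077
C         25           25² = 625      63         25 * 63 = 1575
D         29           29² = 841      89         29 * 89 = 2581
E         21           21² = 441      93         21 * 93 = 1953
F         32           32² = 1024     63         32 * 63 = 2016
G         40           40² = 1600     55         40 * 55 = 2200
H         36           36² = 1296     70         36 * 70 = 2520
I         35           35² = 1225     53         35 * 53 = 1855
J         27           27² = 729      77         27 * 77 = 2079

ΣX = 294    ΣX² = 9066    ΣY = 714    ΣXY = 20368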

We also need: N = the number of pairs of scores, = 10 in this case. (ΣX)² = "the sum of X, squared" = 294 * 294 = 86436. NB: (ΣX)² means "square the sum of X": add together all of the X values to get a total, and then square this total. ΣX² means "sum the squared X values": square each X value, and then add together these squared X values to get a total.
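The sums needed for the slope can be checked with a short script. This is a sketch using the ten stress and test scores from the data table; the variable names are illustrative, not part of the original slides.

```python
# Stress scores (X) and statistics test scores (Y) for subjects A to J.
stress = [18, 31, 25, 29, 21, 32, 40, 36, 35, 27]
score = [84, 67, 63, 89, 93, 63, 55, 70, 53, 77]

n = len(stress)                                     # N, the number of pairs
sum_x = sum(stress)                                 # sum of X
sum_y = sum(score)                                  # sum of Y
sum_x_sq = sum(x * x for x in stress)               # sum of the squared X values
sq_sum_x = sum_x ** 2                               # square of the sum of X
sum_xy = sum(x * y for x, y in zip(stress, score))  # sum of the X * Y products

print(n, sum_x, sum_y, sum_x_sq, sq_sum_x, sum_xy)
```

Note that `sum_x_sq` and `sq_sum_x` are very different numbers, which is the distinction the NB above draws.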

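Working through the formula for b:

b = (N * ΣXY - ΣX * ΣY) / (N * ΣX² - (ΣX)²)
b = (10 * 20368 - 294 * 714) / (10 * 9066 - 86436)
b = (203680 - 209916) / (90660 - 86436)
b = -6236 / 4224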

b = -1.476. b is negative, because the regression line slopes downwards from left to right: as stress scores (X) increase, statistics scores (Y) decrease.

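Now work out a: a = Ȳ - (b * X̄). Ȳ is the mean of the Y scores: 714 / 10 = 71.4. X̄ is the mean of the X scores: 294 / 10 = 29.4. b = -1.4763 (carried to four decimal places to limit rounding error). Therefore a = 71.4 - (-1.4763 * 29.4) = 71.4 + 43.40 = 114.80.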
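The slope and intercept can be reproduced as follows. This sketch assumes the standard least-squares computational formula for b, namely (N * ΣXY - ΣX * ΣY) / (N * ΣX² - (ΣX)²), and a = mean(Y) - b * mean(X); variable names are illustrative.

```python
# Stress scores (X) and statistics test scores (Y) for subjects A to J.
stress = [18, 31, 25, 29, 21, 32, 40, 36, 35, 27]
score = [84, 67, 63, 89, 93, 63, 55, 70, 53, 77]

n = len(stress)
sum_x, sum_y = sum(stress), sum(score)
sum_x_sq = sum(x * x for x in stress)
sum_xy = sum(x * y for x, y in zip(stress, score))

# Slope first (assumed standard least-squares formula), then intercept.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x_sq - sum_x ** 2)
mean_x, mean_y = sum_x / n, sum_y / n
a = mean_y - b * mean_x

print(f"b = {b:.3f}, a = {a:.2f}")
```

Working from the unrounded slope avoids the small discrepancy that appears if b is rounded to three decimal places before a is calculated.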

The complete regression equation: Y' = 114.80 + (-1.476 * X). To draw the line, input any three different values for X, in order to get associated values for Y'.
For X = 10, Y' = 114.80 + (-1.476 * 10) = 100.04
For X = 30, Y' = 114.80 + (-1.476 * 30) = 70.52
For X = 50, Y' = 114.80 + (-1.476 * 50) = 41.00
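The three points used to draw the line can be generated in a few lines. This sketch assumes the rounded coefficients a = 114.80 and b = -1.476 obtained from the ten data pairs; the function name `predict_score` is hypothetical.

```python
# Rounded regression coefficients for predicting test score from stress score.
A_INTERCEPT = 114.80
B_SLOPE = -1.476


def predict_score(stress_score: float) -> float:
    """Return the predicted test score Y' = a + (b * X), rounded to 2 d.p."""
    return round(A_INTERCEPT + B_SLOPE * stress_score, 2)


# Any three different X values are enough to draw (and check) a straight line.
points = {x: predict_score(x) for x in (10, 30, 50)}
print(points)
```

Two points define a line; the third serves as a check that no arithmetic slip has crept in.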

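Regression line for predicting test scores (Y) from stress scores (X):

[Figure: regression line plotted with stress score (X) on the horizontal axis and test score (Y) on the vertical axis. Plotted points: X = 10, Y' = 100.04; X = 30, Y' = 70.52; X = 50, Y' = 41.00. Intercept = 114.80.]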

This is the regression line for predicting test score on the basis of knowledge of a person's stress score; this is the "regression of Y on X". To predict stress score on the basis of knowledge of test score (the "regression of X on Y"), we can't use this regression line! To predict Y from X requires a line that minimises the deviations of the predicted Y's from actual Y's. To predict X from Y requires a line that minimises the deviations of the predicted X's from actual X's - a different task! Solution: to calculate regression of X on Y, swap the column labels (so that the "X" values are now the "Y" values, and vice versa); and re-do the calculations.
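A sketch of the "swap the columns and re-do the calculations" recipe: the same fitting routine is called twice, once with stress as the predictor and once with test score as the predictor. The helper `fit_line` is a hypothetical name, and it assumes the standard least-squares formulas for slope and intercept.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line predicting ys from xs."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x_sq = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x_sq - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


stress = [18, 31, 25, 29, 21, 32, 40, 36, 35, 27]
score = [84, 67, 63, 89, 93, 63, 55, 70, 53, 77]

# Regression of Y on X: predict test score from stress score.
a_yx, b_yx = fit_line(stress, score)

# Regression of X on Y: swap the columns and re-do the calculations.
a_xy, b_xy = fit_line(score, stress)

# Algebraically inverting the first line does NOT give the second line.
inverted_slope = 1 / b_yx

print(f"score from stress: a = {a_yx:.2f}, b = {b_yx:.3f}")
print(f"stress from score: a = {a_xy:.2f}, b = {b_xy:.3f}")
print(f"slope from inverting the first line: {inverted_slope:.3f}")
```

The inverted slope differs clearly from the fitted X-on-Y slope, which illustrates why the first regression line cannot simply be read backwards.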

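Regression lines for predicting stress score from test score, and vice versa (the previous graph redrawn, so that in both cases the predicted variable is on the vertical axis of the graph):

Predicting test score (Y') from stress score (X): Y' = 114.80 + (-1.476 * X)
Predicting stress score (Y') from test score (X), after swapping the column labels: Y' = 55.04 + (-0.359 * X)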

Linear Regression using SPSS: Analyze... > Regression... > Curve Estimation

[Figure: SPSS output, annotated.]
b, the slope.
a, the intercept.
R²: how much variation in test score is accounted for by its relationship with stress?
ANOVA: is our regression any better at predicting test score than simply using the mean test score?
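The two summary quantities can also be computed by hand. The sketch below is not SPSS output; it assumes the usual definitions, R² = SS(regression) / SS(total) and F = (SS(regression) / 1) / (SS(residual) / (N - 2)) for a single predictor, and the variable names are illustrative.

```python
stress = [18, 31, 25, 29, 21, 32, 40, 36, 35, 27]
score = [84, 67, 63, 89, 93, 63, 55, 70, 53, 77]

n = len(stress)
mean_x = sum(stress) / n
mean_y = sum(score) / n

# Least-squares slope and intercept in deviation-score form.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(stress, score))
s_xx = sum((x - mean_x) ** 2 for x in stress)
b = s_xy / s_xx
a = mean_y - b * mean_x

predicted = [a + b * x for x in stress]

# Total variation around the mean vs. variation left over around the line.
ss_total = sum((y - mean_y) ** 2 for y in score)
ss_residual = sum((y - p) ** 2 for y, p in zip(score, predicted))
ss_regression = ss_total - ss_residual

r_squared = ss_regression / ss_total
f_stat = (ss_regression / 1) / (ss_residual / (n - 2))

print(f"R squared = {r_squared:.2f}, F(1, {n - 2}) = {f_stat:.2f}")
```

An R² of about 0.53 means roughly half of the variation in test scores is accounted for by stress; the F ratio compares the regression's predictions against simply using the mean test score for everyone.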