Lecture 6 Correlation and Regression STAT 3120 Statistical Methods I.



STAT3120 – Correlation and Linear Regression

Dependent Variable | Independent (Predictor) Variable | Statistical Test | Comments
Quantitative | Categorical | t-test (one-sample, two-sample, or paired) | Determines whether the categorical variable (factor) affects the dependent variable; typically used for experimental or planned-change studies
Quantitative | Quantitative | Correlation / regression analysis | Establishes a regression model; used to explain, predict, or control the dependent variable
Categorical | Categorical | Chi-square | Tests whether the variables are statistically independent (i.e., are they related or not?)

STAT3120 – Correlation

- Correlation coefficients assess the strength of the linear relationship between two quantitative variables.
  - The correlation measure ranges from -1 to +1.
  - A negative correlation means that X and Y are inversely related.
  - A positive correlation means that X and Y are directly related.
  - A zero correlation means that X and Y are not linearly related.
  - A correlation of +1 indicates that X and Y are directly related and that all the points fall on the same straight line.
  - A correlation of -1 indicates that X and Y are inversely related and that all the points fall on the same straight line.
- Plot a scatter diagram of each predictor variable against the dependent variable.
  - Look for departures from linearity.
  - Look for extreme data points (outliers).
- Examine partial correlations.
  - Correlation cannot determine causality, but partial correlations can help isolate confounding variables.
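The properties listed above can be checked with a short Python sketch. This is not part of the lecture, which uses Excel and SAS. The data and the three correlation values passed to `partial_r` are invented for illustration, and `pearson_r` and `partial_r` are hypothetical helper names. The partial correlation uses the standard first-order formula for controlling a single third variable.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data, invented for illustration only.
x = [-2, -1, 0, 1, 2]
r_up = pearson_r(x, [2 * v + 1 for v in x])     # all points on a rising line
r_down = pearson_r(x, [-3 * v + 4 for v in x])  # all points on a falling line
r_curve = pearson_r(x, [v ** 2 for v in x])     # exact relationship, but not linear

# Hypothetical correlations: X and Y look related until Z is controlled for.
r_given_z = partial_r(r_xy=0.6, r_xz=0.8, r_yz=0.7)

print(r_up, r_down, r_curve, r_given_z)
```

The parabola case shows why a zero correlation only rules out a linear relationship: Y is completely determined by X there, yet the coefficient is zero. This is one reason to plot the scatter diagram first.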

STAT3120 – Correlation

For example, let's take two variables and evaluate their correlation. Open the stats98 dataset in Excel.

- What would you expect the correlation of the Verbal SAT scores and the Math SAT scores to be? Why?
- What would you expect the correlation of the Math SAT scores and the percent taking the test to be? Why?

STAT3120 – Correlation

What would you expect the correlation of the Verbal SAT scores and the Math SAT scores to be? Why?

STAT3120 – Correlation

What would you expect the correlation of the Math SAT scores and the percent of HS students who took the test to be? Why?

STAT3120 – Correlation

Let's pull up the 2000 Florida Vote Count in Excel…

STAT3120 – Correlation

Let's pull up the UCDAVIS2 dataset in Excel and plot Ideal Height versus Actual Height. What would you expect the correlation value to be? Can you explain someone’s Ideal Height using their Actual Height?

STAT3120 – Regression

STAT3120 – Regression

From the previous slide, the “regression line” has been superimposed on the relationship between ideal height and actual height. The equation of this line takes the general form $y = mx + b$, where:

- $y$ is the dependent variable (ideal height)
- $m$ is the slope of the line
- $x$ is the independent variable (actual height)
- $b$ is the y-intercept

When we discuss regression models, we rewrite this equation as

$$y = b_0 + b_1 x_1 + \dots + b_n x_n$$

where $b_0$ is the y-intercept and each $b_i$ is the slope for predictor $x_i$; with a single predictor, $b_1$ is the slope of the line. The “slope” is also the effect on $y$ of a one-unit change in $x$.
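For the single-predictor case, the intercept $b_0$ and slope $b_1$ can be estimated by ordinary least squares. The Python sketch below is not the lecture's method (the lecture uses Excel and SAS). The `heights` and `ideal` values are invented, not taken from the UCDAVIS2 dataset, and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept: the line passes through the means
    return b0, b1

def predict(b0, b1, x):
    """Predicted y for a given x under the fitted line."""
    return b0 + b1 * x

# Hypothetical heights in inches, invented for illustration only.
heights = [60, 62, 64, 66, 68, 70]
ideal = [62, 63, 65, 66, 68, 69]

b0, b1 = fit_line(heights, ideal)

# The slope is the change in predicted y for a one-unit change in x.
one_unit_effect = predict(b0, b1, 67) - predict(b0, b1, 66)

print(b0, b1, one_unit_effect)
```

Whatever value of x is chosen, the difference between predictions one unit apart equals the slope. That is the sense in which the slope is the effect of a one-unit change.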

STAT3120 – Regression

From the previous slide, the model equation is presented in the form of the equation of a line: y=.8174x From this, we would say:

1. For every 1-inch change in someone’s actual height, there is a .8174-inch change in their predicted ideal height.
2. Everyone “starts” with inches.
3. If someone has an actual height of 68 inches, their predicted ideal height is inches.

The $R^2$ value of .7372 is interpreted as “73.72% of the variation in ideal height can be explained by a linear model with actual height as the only predictor”.
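With a single predictor, $R^2$ is the square of the correlation coefficient, so the reported $R^2$ of .7372 corresponds to a correlation of roughly 0.86 (positive, because the slope is positive). The Python sketch below illustrates that identity on invented data, not the UCDAVIS2 values; `r_squared` and `pearson_r` are hypothetical helper names.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared(xs, ys):
    """R^2 = 1 - SSE/SST for a least-squares line with an intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
    sst = sum((y - mean_y) ** 2 for y in ys)                     # total variation
    return 1 - sse / sst

# Hypothetical heights in inches, invented for illustration only.
heights = [60, 62, 64, 66, 68, 70]
ideal = [62, 63, 65, 66, 68, 69]

r = pearson_r(heights, ideal)
r2 = r_squared(heights, ideal)

# Correlation implied by the R^2 value reported on the slide.
implied_r = math.sqrt(0.7372)

print(r, r2, implied_r)
```

The identity $R^2 = r^2$ holds only for simple linear regression; with several predictors, $R^2$ is no longer the square of any single pairwise correlation.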

STAT3120 – Regression

Let's do this in SAS. After you import the data, the code to run a correlation looks like this:

    Proc Corr data=jlp.ucdavis2;
      Var Idealht Height;
    Run;

The output looks like this:

STAT3120 – Regression

The SAS code to develop a regression model on the data looks like this:

    Proc Reg data=jlp.ucdavis2;
      /* dependent variable = independent variable(s) / options */
      Model idealht = height / p r;
      /* save predictions and residuals to a temporary dataset */
      output out=preds p=pred r=resid;
    run;

In this code, the regression model is developed using the “model” statement. The dependent variable of interest comes first; the independent variable(s) follow the = sign. The p and r options after the / produce the predictions and the residuals, respectively. The output statement creates a new (temporary) dataset called “preds” that contains the predictions and the residuals, so that we can examine them.
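To make the contents of a dataset like “preds” concrete, the Python sketch below builds the same kind of table by hand: each observation with its prediction and its residual (observed minus predicted). It is an analogue, not a reproduction of SAS output. The data are invented, and `fit_line` is a hypothetical helper; only the column names `pred` and `resid` mirror the SAS code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

# Hypothetical heights in inches, invented for illustration only.
heights = [60, 62, 64, 66, 68, 70]
ideal = [62, 63, 65, 66, 68, 69]

b0, b1 = fit_line(heights, ideal)

# One row per observation, mirroring the pred= and resid= columns.
preds = []
for height, idealht in zip(heights, ideal):
    pred = b0 + b1 * height   # predicted ideal height
    resid = idealht - pred    # residual: observed minus predicted
    preds.append({"height": height, "idealht": idealht,
                  "pred": pred, "resid": resid})

for row in preds:
    print(row)

# For a least-squares line with an intercept, the residuals sum to zero
# (up to floating-point rounding).
resid_total = sum(row["resid"] for row in preds)
```

Large residuals flag observations the line fits poorly, which is why it is useful to keep them in a dataset for later inspection.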

STAT3120 – Regression

Here is some of the associated output:

- This information tells us about the performance of the model.
- This information tells us about the equation of the model and the impact of the predictor(s).