BIVARIATE/MULTIVARIATE DESCRIPTIVE STATISTICS Displaying and analyzing the relationship between continuous variables.

Correlation and Regression
Correlation: a measure of the strength of an association (relationship) between continuous variables.
Regression: predicting the value of a continuous dependent variable (y) from the value of a continuous independent variable (x).

Correlation statistic: r
Values of r range from –1 to +1.
–1 is a perfect negative association (correlation), meaning that as the scores of one variable increase, the scores of the other variable decrease at exactly the same rate.
+1 is a perfect positive association, meaning that both variables go up or down together, in lock-step.
Intermediate values of r (close to zero) indicate a weak relationship or none.
An r of exactly zero (never seen in real life) means no linear relationship: the variables do not change or “vary” together except by chance.
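The two extremes of r can be checked with a short calculation. This is a minimal sketch in plain Python using the standard product-moment formula; the function name `pearson_r` and the data are made up for illustration and are not from the course files.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]                      # hypothetical scores
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # y rises in lock-step with x
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # y falls at the same rate as x rises
print(r_pos, r_neg)
```

Real data rarely fall exactly on a line, so values strictly between these two extremes are the norm.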

[Two scattergrams, X and Y axes: r = +1 and r = -1]
Two “scattergrams,” each with a “cloud” of dots. Can changes in one variable be predicted by changes in the other?

[Scattergram, X and Y axes: r = 0]
Can changes in one variable be predicted by changes in the other?

“Line of best fit”
To arrive at a value of r, a straight line is placed through the cloud of dots (the actual, “observed” data).
A linear relationship between the variables is assumed.
The line is placed so that the overall distance between itself and the dots is minimized (in ordinary least squares, the sum of the squared vertical distances).

“Line of best fit”
To place this line in the cloud of dots, it is necessary to compute a and b from the observed (known) values of x and y.
– a = where the line crosses the y axis (the intercept)
– b = “slope,” or the number of units that the value of y changes when x changes one unit
When x is the “independent variable”:
$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}, \qquad a = \bar{y} - b\,\bar{x}$$
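The slope and intercept formulas on this slide translate directly into code. The sketch below assumes x is the independent variable; `fit_line` and the five data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b: sum of cross-products of deviations over sum of squared x deviations
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # a: the line always passes through the point (mean_x, mean_y)
    a = mean_y - b * mean_x
    return a, b

xs = [1, 2, 3, 4, 5]   # hypothetical observed x values
ys = [2, 4, 5, 4, 5]   # hypothetical observed y values
a, b = fit_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```

Note that a and b are computed once from the whole data set, not separately for each observation.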

y = a + bx
a = where the line crosses the y axis
b = “slope,” or the number of units that y changes when x changes one unit
[Plot, X and Y axes, with the intercept a marked on the Y axis]

How closely will a straight line fit the “observed” (actual) data?
A perfect fit yields an r of +1 or –1.
[Two scattergrams, X and Y axes]

An intermediate fit yields an intermediate value of r.
[Scattergram, X and Y axes]

A poor fit yields a low value of r.
[Scattergram, X and Y axes: r = -.19]

The line of best fit predicts a value for one variable given the value of the other variable.
There will be a difference between these estimated values and the actual, known (“observed”) values. This difference is called a “residual” or an “error of the estimate.”
As the error between the known and predicted values decreases (as the dots cluster more tightly around the line), the absolute value of r (ignoring the + or – sign) increases.
[Scattergram, X and Y axes, with example predictions: if x = .5, y = 2.3; if y = 5, x = 3.4]
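Residuals are simply observed values minus predicted values. The sketch below uses a small hypothetical data set whose least-squares line works out to y = 2.2 + 0.6x; the data and variable names are illustrative only.

```python
xs = [1, 2, 3, 4, 5]   # hypothetical observed x values
ys = [2, 4, 5, 4, 5]   # hypothetical observed y values

# Least-squares intercept and slope for these particular data
a, b = 2.2, 0.6

# Predicted y for each observed x, read off the line of best fit
predicted = [a + b * x for x in xs]

# Residual ("error of the estimate") = observed minus predicted
residuals = [y - p for y, p in zip(ys, predicted)]

# Sum of squared residuals: the quantity the line of best fit minimizes
sse = sum(e ** 2 for e in residuals)

for x, y, p, e in zip(xs, ys, predicted, residuals):
    print(f"x={x}  observed={y}  predicted={p:.1f}  residual={e:+.1f}")
print(f"sum of squared residuals = {sse:.2f}")
```

For a least-squares line the residuals sum to zero, so it is their squares (not their raw sum) that measure how tightly the dots cluster around the line.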

Measurement scales can be changed from continuous to categorical.
[Figure, X and Y axes]

To evaluate a relationship between categorical variables, count the cell frequencies, then compare changes in the distribution of the dependent variable as the value of the independent variable changes.
[Figure, X and Y axes]

Coefficient of determination (r²)
Proportion of the variation in the dependent variable that is accounted for by changes in the independent variable.
Multiple correlation (R²) is the proportion of the variation in the dependent variable that is accounted for by the combined effects of multiple independent variables.
– The relative contribution of each independent variable can be estimated
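For a single independent variable, r² can be computed two ways that agree: by squaring r, or as one minus the share of variation left unexplained by the fitted line. The sketch below checks this on hypothetical data; all names are illustrative.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5]   # hypothetical independent variable
ys = [2, 4, 5, 4, 5]   # hypothetical dependent variable

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y

# Route 1: square the correlation coefficient
r = sxy / sqrt(sxx * syy)
r_squared = r ** 2

# Route 2: fit the line, then compare unexplained to total variation
b = sxy / sxx
a = mean_y - b * mean_x
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
explained_share = 1 - sse / syy

print(f"r = {r:.3f}, r squared = {r_squared:.3f}, "
      f"explained share = {explained_share:.3f}")
```

Here about 60 percent of the variation in y is accounted for by x, even though r itself is roughly .77.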

Other correlation/regression techniques
“Partial correlation”
– Using a control or “test” variable to assess its potential influence on a bivariate (two-variable) relationship
– All variables must be continuous
“Spearman’s r” is used to assess the correlation between two ordinal variables.
“Logit” or “logistic” regression is used when one desires to use regression techniques and the dependent variable is categorical (here, can only have two possible values).
– Categorical independent variables are entered as “dummy” variables coded 0 or 1
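Spearman's coefficient can be understood as the ordinary correlation applied to ranks rather than raw scores. The sketch below assumes that definition, with tied values sharing an average rank; the helper names and data are hypothetical.

```python
from math import sqrt

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman_r(xs, ys):
    """Correlation between the ranks of xs and the ranks of ys."""
    return pearson_r(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]          # rises with x, but not along a straight line
rank_corr = spearman_r(x, y)   # order agrees perfectly
linear_corr = pearson_r(x, y)  # straight-line fit is slightly imperfect
print(rank_corr, linear_corr)
```

Because only the ordering matters, the rank-based coefficient suits ordinal data where the distances between categories are not meaningful.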

Correlation matrix
Load “Height weight gender age.sav” (or .xls).
Choose Analyze|Correlate|Bivariate.
Load all variables.
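Outside SPSS, a correlation matrix is just r computed for every pair of variables. The sketch below builds one in plain Python; the values are invented for illustration and are not taken from the course data file.

```python
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical values for six people (not the course data)
data = {
    "height": [62, 65, 68, 70, 72, 69],
    "weight": [120, 150, 140, 160, 180, 145],
    "age":    [25, 40, 31, 52, 47, 36],
}

names = list(data)
# matrix[row][col] holds r between the row variable and the column variable
matrix = {row: {col: pearson_r(data[row], data[col]) for col in names}
          for row in names}

print(" " * 8 + "  ".join(name.rjust(6) for name in names))
for row in names:
    print(row.ljust(8) + "  ".join(f"{matrix[row][col]:6.2f}" for col in names))
```

The diagonal is always 1 (each variable correlates perfectly with itself) and the matrix is symmetric, so only the cells on one side of the diagonal carry new information.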

Scattergram
Usually displays only two variables at a time.
Graphs|Scatter/dot|Simple
Convention is to place the independent variable on the “X” axis.
Optional: add a fit line.

Controlling for a third variable
Analyze|Correlate|Partial
Place “Age” in “Controlling for.”
Did the original, “zero-order” relationship between height and weight change?
– Any reduction suggests that the independent variables are “intercorrelated”
– When we measure height or weight, the effect of the other variable winds up being included
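With a single control variable, the partial correlation can be obtained from the three zero-order correlations using the standard first-order formula. The sketch below assumes that formula; the function name and the input correlations are hypothetical, not results from the course data.

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical zero-order correlations
r_height_weight = 0.8   # the original, "zero-order" relationship
r_height_age = 0.6
r_weight_age = 0.6

controlled = partial_r(r_height_weight, r_height_age, r_weight_age)
print(f"zero-order r = {r_height_weight:.2f}, "
      f"controlling for age = {controlled:.4f}")
```

In this made-up case the height and weight correlation drops from .80 to about .69 once age is controlled, the kind of reduction the slide asks you to look for.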

Recoding from continuous to categorical
(“Height weight gender age REC.sav” and .xls)
Recode Height
– 61 to 68 inches: Short
– 69 inches +: Tall
Recode Weight
– … pounds: Light
– 146 pounds +: Heavy
Run crosstabs
– Does there appear to be a strong association when these variables are categorical?
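The recode-then-crosstab steps can be sketched in a few lines. The cutoffs for Tall (69 inches and up) and Heavy (146 pounds and up) come from the slide; treating everything below 146 pounds as Light is an assumption, since the slide's Light range is incomplete. The people listed are hypothetical.

```python
from collections import Counter

def recode_height(inches):
    # Slide cutoffs: 61 to 68 inches is Short, 69 and up is Tall
    return "Tall" if inches >= 69 else "Short"

def recode_weight(pounds):
    # Slide cutoff: 146 and up is Heavy; below that is assumed to be Light
    return "Heavy" if pounds >= 146 else "Light"

# Hypothetical (height in inches, weight in pounds) pairs
people = [(62, 120), (65, 150), (68, 140), (70, 160), (72, 180), (69, 145)]

# Count how many people fall in each (height category, weight category) cell
table = Counter((recode_height(h), recode_weight(w)) for h, w in people)

print("        Light  Heavy")
for height_cat in ("Short", "Tall"):
    light = table[(height_cat, "Light")]
    heavy = table[(height_cat, "Heavy")]
    print(f"{height_cat:<6}  {light:>5}  {heavy:>5}")
```

Comparing the rows shows how the distribution of the weight categories shifts between Short and Tall, which is the comparison a crosstab is meant to support.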