Relationships Among Variables


Chapter 8: Relationships Among Variables (Research Methods in Physical Activity)

Correlation: A statistical technique used to determine the relationship between two or more variables. A correlation is simple when it involves only two variables and multiple when it involves more than two. A multiple correlation has one dependent (criterion) variable and two or more independent (predictor) variables. A canonical correlation establishes the relationship between two or more dependent variables and two or more independent variables.

Positive correlation: A relationship between two variables in which a small value for one variable is associated with a small value for the other, and a large value for one variable is associated with a large value for the other.

Negative correlation: A relationship between two variables in which a small value for the first variable is associated with a large value for the second variable, and a large value for the first variable is associated with a small value for the second variable.

Correlation and Causation: A correlation between two variables does not mean that one variable causes the other. Two variables must be correlated for a cause-and-effect relationship to exist, but correlation alone does not guarantee such a relationship. Correlation is a necessary but not sufficient condition for causation. Causation can be shown only with an experimental study in which an independent variable is manipulated to bring about an effect.

Coefficient of correlation (r): A quantitative value of the relationship between two or more variables that can range from .00 to 1.00 in either a positive or negative direction.
Pearson product moment coefficient of correlation: The most commonly used method of computing the correlation between two variables; also called interclass correlation, simple correlation, or Pearson r. The Pearson r has one criterion (dependent) variable and one predictor (independent) variable. An important assumption for the use of r is that the relationship between the variables is linear, that is, that a straight line is the best model of the relationship. When that is not true (e.g., figure 8.4d, p. 129), r is an inappropriate way to analyze the data.

Computation of the correlation coefficient: The computation of the correlation coefficient involves the relative distances of the scores from the two means of the distributions. The formula consists of only three operations:
1) Sum each set of scores.
2) Square and sum each set of scores.
3) Multiply each pair of scores and obtain the cumulative sum of these products.
See Example 8.1, p. 130, for an example of the computation.
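
The three operations above can be sketched in Python. This is a minimal illustration, not the worked Example 8.1 from the text: it assumes the standard raw-score form of the Pearson formula, and the function name pearson_r and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson r computed from raw-score sums (assumed standard raw-score formula)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two scores")
    n = len(x)                                   # N = number of pairs of scores
    sum_x, sum_y = sum(x), sum(y)                # 1) sum each set of scores
    sum_x2 = sum(v * v for v in x)               # 2) square and sum each set
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))    # 3) sum of the paired products
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator               # undefined if either set is constant

# Hypothetical scores, made up for illustration only
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]
r = pearson_r(x_scores, y_scores)
print(f"r = {r:.2f}")
```

For these made-up scores the result is a moderately high positive correlation (about .77). Swapping the two lists gives the same r, which matches the point that the choice of X and Y does not matter when only the relationship is of interest.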

Computation of the correlation coefficient: In a correlation problem that simply determines the relationship between two variables, it does not matter which one is X and which is Y. If the investigator wants to predict one score from the other, then Y designates the criterion (dependent) variable, the one being predicted, and X designates the predictor (independent) variable. In the example of the positive correlation to the left, the criterion variable is “Years of education” and the predictor variable is annual income. Thus, we would “predict” years of education from annual income.

Interpreting the reliability of r: What does a coefficient of correlation mean in terms of being high or low, satisfactory or unsatisfactory? One criterion is its reliability, or significance. Does it represent a real relationship? That is, if the study were repeated, what is the probability of finding a similar relationship? For this statistical criterion of significance, simply consult a table. Table 3 in the appendix (p. 428) contains the correlation coefficients needed for significance at the .05 and .01 levels. To use Table 3, select the desired level of significance, such as the .05 level, and then find the appropriate degrees of freedom (df, which is based on the number of participants corrected for sample bias). For r, df is equal to N - 2 (remember, N in correlation refers to the number of pairs of scores).

Some Observations about “significant r” (refer to Table 3):
1) The correlation needed for significance decreases as the number of participants (df) increases. Very low correlation coefficients can be significant if you have a large sample of participants. At the .05 level, r = .38 is significant with 25 df, r = .27 is significant with 50 df, and r = .195 is significant with 100 df.
2) A higher correlation is required for significance at the .01 level than at the .05 level. The .05 level means that if there were truly no relationship and 100 experiments were conducted, the null hypothesis (that there is no relationship) would be rejected incorrectly, just by chance, on about 5 of the 100 occasions. At the .01 level, we would expect a relationship of this magnitude because of chance less than once in 100 experiments. The test of significance at the .01 level is therefore more stringent than at the .05 level, so a higher correlation is required.
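
The table lookup described here can be sketched as follows. This is only an illustration: the dictionary holds just the three rounded .05-level values quoted on this slide, not the full Table 3, and the function names are hypothetical. Comparing the absolute value of r against the table value assumes the table applies to correlations in either direction.

```python
# Critical values of r at the .05 level, as quoted on the slide (rounded).
# A real table such as Table 3 has many more rows; these three are illustrative.
CRITICAL_R_05 = {25: 0.38, 50: 0.27, 100: 0.195}

def degrees_of_freedom(n_pairs):
    """df for r is N - 2, where N is the number of pairs of scores."""
    return n_pairs - 2

def is_significant_05(r, n_pairs):
    """Compare |r| with the tabled value for the matching df (assumed either-direction test)."""
    df = degrees_of_freedom(n_pairs)
    if df not in CRITICAL_R_05:
        raise KeyError(f"no tabled value for df = {df} in this small sketch")
    return abs(r) >= CRITICAL_R_05[df]

# The same r = .30 is significant with 52 pairs (df = 50) but not with 27 pairs (df = 25)
print(is_significant_05(0.30, 52))
print(is_significant_05(0.30, 27))
```

The two calls illustrate the first observation: the same correlation can pass or fail the significance test depending only on the sample size.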

Interpreting the Meaningfulness of r: The interpretation of a correlation for statistical significance is important, but because of the vast influence of sample size, this criterion is not always meaningful. The most commonly used criterion for interpreting the meaningfulness of the correlation coefficient is the coefficient of determination (r^2). With r^2, the portion of common association of the factors that influence the two variables is determined. Thus, the coefficient of determination indicates the portion of the total variance in one measure that can be explained, or accounted for, by the variance in the other measure. The Venn diagram visually depicts this idea: Circle A represents the variance in one variable, and Circle B represents the variance in a second variable. For the overlap C, r = .60; thus r^2 = .36 (shared variance), so 36% of the variance in A can be explained by the variance in B. (Unexplained variance is equal to 1 - r^2.)
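
A short sketch of the arithmetic, using the r = .60 value from the Venn diagram and, for contrast, the r = .195 value that is significant at 100 df. The function names are hypothetical.

```python
def coefficient_of_determination(r):
    """Shared (explained) variance: r squared."""
    return r ** 2

def unexplained_variance(r):
    """Portion of variance not accounted for: 1 - r squared."""
    return 1 - r ** 2

for r in (0.60, 0.195):
    shared = coefficient_of_determination(r)
    left_over = unexplained_variance(r)
    print(f"r = {r:.3f}  shared = {shared:.1%}  unexplained = {left_over:.1%}")
```

The second value shows why significance alone can mislead: a correlation of .195 can be statistically significant with a large sample yet account for less than 4% of the variance.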

Using Correlation for Prediction (Regression): Prediction is based on correlation. The stronger the relationship between two variables, the more accurately you can predict one from the other. If the correlation were perfect, you could predict with complete accuracy. Thus, r = .00 means no predictive ability, and r = 1.00 (in either direction) means absolute predictive ability.
Prediction equation: A formula to predict some criterion (e.g., some measure of performance) based on the relationship between the predictor variable(s) and the criterion; also called a regression equation. We predict “Y” (the criterion or dependent variable) on the ordinate/vertical axis from “X” (the predictor or independent variable) on the abscissa/horizontal axis.

Using Correlation for Prediction (regression): “Y” (the criterion or dependent variable, on the ordinate/vertical axis) is predicted from “X” (the predictor or independent variable, on the abscissa/horizontal axis) with the equation for a straight line:
Y = a + bX
Y = the predicted score (dependent score)
X = the predictor score (independent score)
a = the intercept, the point where the line crosses the “Y” axis
b = the slope of the line
Keep it simple: 1) “a” is the place on the “Y” axis where the line intersects, and 2) the slope “b” describes how much “Y” changes for each one-unit change in “X” (the degree or magnitude of the slope) and in which direction (positive or negative). So, if we want to predict “Y” from “X”, we need to calculate “a” and “b”.

Using Correlation for Prediction (regression): Calculating “a” and “b”. First calculate “b”, which is determined by the correlation coefficient and the standard deviations of “X” and “Y”, with the following formula:
b = r(s_Y / s_X)
s_Y = the standard deviation of “Y”
s_X = the standard deviation of “X”
Note that the slope of the line reflects not only the direction of the association between “X” and “Y” (positive or negative, from the sign of r) but also how the spread of “Y” compares with the spread of “X” (rise over run, the degree or magnitude of the slope).
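
The slope formula can be sketched with Python's standard library. The data and function names are hypothetical, and r is computed here from deviation scores, which is algebraically equivalent to the raw-score approach. The sample standard deviation is used for both variables; because only the ratio s_Y / s_X enters the formula, the population version would give the same slope.

```python
import statistics

def pearson_r(x, y):
    """Pearson r from deviations about the two means."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / (sum_xx * sum_yy) ** 0.5

def slope(x, y):
    """b = r * (s_Y / s_X)."""
    r = pearson_r(x, y)
    return r * (statistics.stdev(y) / statistics.stdev(x))

# Hypothetical scores, made up for illustration only
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]
b = slope(x_scores, y_scores)
print(f"b = {b:.2f}")
```

For these scores the slope works out to about 0.6: each one-unit increase in X goes with a predicted increase of about 0.6 in Y.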

Using Correlation for Prediction (regression): Calculating “a” and “b”. Next calculate “a”, which is the intercept on the “Y” axis:
a = M_Y - b M_X
M_Y = the mean of the “Y” scores
M_X = the mean of the “X” scores
b = the slope of the regression line (see last slide)
Note that this formula produces a single value that depends on the central tendency (the means of “X” and “Y”) and, through “b”, on the variability of “X” and “Y”. So, the intercept and the slope of the line depend on the means and standard deviations of “X” and “Y”. (See your text for examples of using the regression formula.)
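
A minimal sketch of the intercept and the resulting prediction equation. The data and function names are hypothetical, and the slope b = 0.6 is assumed to have been obtained already from b = r(s_Y / s_X) for these same made-up scores.

```python
import statistics

def intercept(x, y, b):
    """a = M_Y - b * M_X."""
    return statistics.fmean(y) - b * statistics.fmean(x)

def predict(x_value, a, b):
    """Predicted Y from the straight-line equation Y = a + bX."""
    return a + b * x_value

# Hypothetical scores, made up for illustration only
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]
b = 0.6                                  # assumed slope for these scores
a = intercept(x_scores, y_scores, b)

print(f"a = {a:.2f}")
print(f"predicted Y at X = 6: {predict(6, a, b):.2f}")
print(f"predicted Y at the mean of X: {predict(statistics.fmean(x_scores), a, b):.2f}")
```

The last line shows a consequence of the intercept formula: the predicted Y at the mean of X is the mean of Y, so the line always passes through the point (M_X, M_Y).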

Line of Best Fit (regression line): The line of best fit passes through the intersection of the X and Y means. The slope of the line depends not only on the means but also on the variability of X and Y (see previous formulas). The line of best fit is the line with the least total distance from all of the X and Y coordinates (it minimizes the sum of the squared vertical distances), so it is the “best fit” for all the data. It is a regression line because it is used to predict Y from X (see previous slides). X and Y coordinates that do not fall on the line produce residuals, or residual scores.
Residual scores: The differences between the predicted and actual scores, which represent the error of prediction.
Note that with a perfect correlation, all scores in the scatter plot would fall on a straight line (the line of best fit) and there would be no residual scores. Residual scores therefore represent unexplained variance (error of prediction).

Line of Best Fit (regression line): If we were to compute all the residual scores, their mean would be zero (i.e., the line of best fit is the least distance from all of the X and Y coordinates), and their standard deviation, which reflects the unexplained variance, is called the standard error of prediction, or standard error of the estimate. The larger the standard error of the estimate, the lower the predictive ability; the larger r^2 is, the smaller the error of prediction.
Note: Chapter 8 also contains information on partial, semipartial, and multiple regression principles and on the Fisher Z transformation of r. This information is beyond the scope of our class's introduction to statistical principles. I welcome you to read it, but I will not review it or test you on the material.
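
The claims about residuals can be checked with a small sketch. The data are hypothetical, and the intercept, slope, and r^2 below are assumed to be the least-squares values for those data. The shortcut s_Y * sqrt(1 - r^2) for the standard error of the estimate is a commonly given form and is an assumption here, not a formula quoted from the slides. It matches the standard deviation of the residuals exactly when both use the population standard deviation (dividing by N).

```python
import math
import statistics

# Hypothetical scores, made up for illustration only
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6          # assumed intercept and slope of the line of best fit
r_squared = 0.6          # assumed coefficient of determination for these scores

predicted = [a + b * x for x in x_scores]
residuals = [y - p for y, p in zip(y_scores, predicted)]   # actual minus predicted

mean_residual = statistics.fmean(residuals)                # about zero
see_from_residuals = statistics.pstdev(residuals)          # SD of the residuals
see_from_r = statistics.pstdev(y_scores) * math.sqrt(1 - r_squared)

print(f"mean residual = {mean_residual:.4f}")
print(f"standard error of estimate (from residuals) = {see_from_residuals:.4f}")
print(f"standard error of estimate (from r^2)       = {see_from_r:.4f}")
```

The two estimates agree, which shows the link stated on the slide: as r^2 grows, the term 1 - r^2 shrinks, and so does the error of prediction.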

End of Lecture