Published by Abigail Potter. Modified over 9 years ago.
1
Psychology 820: Correlation, Regression & Prediction
2
Concept of Correlation
A coefficient of correlation (r or ρ, "rho") is a statistical summary of the degree and direction of the relationship or association between two variables (X and Y).
Degree of relationship: the absolute value of a correlation ranges from 0 to 1.00, so the coefficient itself runs from -1.00 to +1.00.
Direction of relationship:
Positive (+) relationship: a high score on X goes with a high score on Y.
Negative (-) relationship: a high score on X goes with a low score on Y.
3
The Bivariate Normal Distribution
A family of three-dimensional surfaces.
4
Scatterplots
The chief purpose of the scatterplot is to study the nature of the relationship between two variables.
Components of r: the Pearson product-moment correlation.
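The components of r can be sketched in a few lines of code. This is a minimal illustration, not the slide's own formula: it uses the standard deviation-score form (sum of cross-products divided by the square root of the product of the two sums of squares), and the five score pairs are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r: sum of cross-products over sqrt(SS_x * SS_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of cross-products of deviation scores.
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator parts: sums of squared deviations for X and Y.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)

# Hypothetical scores for five people (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```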
5
Additional Measures of Relationships
Spearman rank correlation: both X and Y are ranks.
Phi coefficient: both X and Y are dichotomies.
Point-biserial coefficient: one dichotomous variable and one continuous measure.
Biserial correlation: one artificial dichotomy and one continuous measure.
Tetrachoric coefficient: both X and Y are artificial dichotomies.
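As one illustration of these measures, the Spearman rank correlation can be obtained by applying the Pearson formula to ranks. The sketch below assumes there are no tied scores; the helper names `pearson_r` and `ranks` and the data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)

def ranks(values):
    """Rank 1 = smallest value. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Hypothetical data: Y rises with X, but not along a straight line.
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]

spearman = pearson_r(ranks(x), ranks(y))  # Pearson r computed on the ranks
pearson = pearson_r(x, y)                 # Pearson r computed on raw scores
print(spearman, round(pearson, 3))
```

Because the ordering of X and Y agrees perfectly, the rank correlation is 1.0, while the raw-score r is lower because the relationship is not linear.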
6
Linear and Curvilinear Relationships
Only the degree of linear relationship is described by r or ρ.
If there is a substantial nonlinear relationship between two variables, a different correlation coefficient (such as eta, η) should be used.
7
Linear Transformations and Correlation
A linear transformation of X or Y does not affect the magnitude of the correlation coefficient.
This includes transformations to z-scores or T-scores, adding or subtracting a constant, and multiplying or dividing by a non-zero constant.
Multiplying or dividing one variable by a negative constant reverses the sign of r but leaves its absolute value unchanged.
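A quick numerical check of this property, as a sketch with hypothetical scores: rescaling X with a positive multiplier and an added constant leaves r unchanged, while a negative multiplier flips only its sign.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)

# Hypothetical scores (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_raw = pearson_r(x, y)
# Multiply by a positive constant and add a constant: r is unchanged.
r_scaled = pearson_r([2 * a + 10 for a in x], y)
# Multiply by a negative constant: same magnitude, opposite sign.
r_flipped = pearson_r([-3 * a for a in x], y)
print(round(r_raw, 4), round(r_scaled, 4), round(r_flipped, 4))
```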
8
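Effects of Variability on Correlation
The variability (heterogeneity) of the sample has an important influence on r.
Range restriction: when the sample covers only a narrow range of X or Y, r is typically smaller in magnitude than it would be in a more heterogeneous sample.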
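Range restriction can be illustrated with simulated data. The sketch below is an assumption-laden example, not data from the course: it generates scores with a built-in linear relationship, then recomputes r using only cases with high X scores. The seed, sample size, slope, and cutoff are all arbitrary choices.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson r from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)

random.seed(820)  # arbitrary seed so the run is repeatable

# Simulated (not real) scores: Y depends linearly on X plus random noise.
x = [random.gauss(0, 1) for _ in range(5000)]
y = [0.6 * a + random.gauss(0, 0.8) for a in x]
r_full = pearson_r(x, y)

# Restrict the range: keep only cases more than 1 SD above the mean on X.
kept = [(a, b) for a, b in zip(x, y) if a > 1.0]
r_restricted = pearson_r([a for a, _ in kept], [b for _, b in kept])

print(len(kept), round(r_full, 2), round(r_restricted, 2))
```

In the restricted group the same underlying relationship produces a noticeably smaller r, because X varies much less there.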
9
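Causation and Correlation
Correlation must be carefully distinguished from causation.
Third-variable factor: X and Y may be correlated because both depend on a third variable.
Effect of outliers: a few extreme cases can substantially raise or lower r.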
10
Regression and Prediction
Prediction and correlation are opposite sides of the same coin.
Regression is usually the statistical method of choice when the predicted variable is on an ordinal, interval, or ratio scale.
Simple linear regression (one independent variable and one dependent variable) extends to multiple regression (more than one independent variable).
11
The Regression Effect
The sons of tall fathers tend to be taller than average, but shorter than their fathers.
The sons of short fathers tend to be shorter than average, but taller than their fathers.
This is regression to the mean.
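In standard-score form, the least-squares prediction makes the regression effect explicit: whenever the correlation is less than perfect, the predicted z-score on Y is closer to the mean than the z-score on X. The value r = 0.5 in the worked line is a hypothetical number chosen for illustration.

```latex
\hat{z}_Y = r \, z_X, \qquad |r| < 1 \;\Rightarrow\; |\hat{z}_Y| < |z_X|

% Hypothetical example: with r = 0.5, a father 2 SDs above the mean
% gives a predicted son 1 SD above the mean.
r = 0.5,\quad z_X = +2 \;\Rightarrow\; \hat{z}_Y = 0.5 \times 2 = +1
```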
12
Regression Equation
Y = bX + c (the equation of a straight line)
Line of best fit
Line of least squares
Prediction equation
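The slope b and intercept c of the least-squares line can be computed directly from deviation scores. The sketch below uses the standard least-squares formulas (b = SP_xy / SS_x, c = mean of Y minus b times mean of X); the function name and the data are hypothetical.

```python
def least_squares_line(x, y):
    """Return (b, c) for the least-squares line Y = bX + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-products divided by the sum of squares of X.
    sp_xy = sum((a - mean_x) * (v - mean_y) for a, v in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    b = sp_xy / ss_x
    # Intercept: the line passes through the point (mean_x, mean_y).
    c = mean_y - b * mean_x
    return b, c

# Hypothetical scores (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, c = least_squares_line(x, y)

# Use the prediction equation for a new X value.
predicted = b * 6 + c
print(round(b, 2), round(c, 2), round(predicted, 2))  # 0.6 2.2 5.8
```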
13
Proportion of Variance Interpretation of Correlation
The coefficient of determination (r²) is the proportion of variance in Y that can be accounted for by knowing X and, conversely, the proportion of variance in X that can be accounted for by knowing Y.
The coefficient of nondetermination (k² = 1 - r²) is the proportion of variance "not accounted for".
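The proportion-of-variance reading can be verified numerically: r² equals one minus the ratio of residual variation around the least-squares line to total variation in Y. This is a sketch with hypothetical scores; the variable names are illustrative.

```python
# Hypothetical scores (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sp_xy = sum((a - mean_x) * (v - mean_y) for a, v in zip(x, y))
ss_x = sum((a - mean_x) ** 2 for a in x)
ss_y = sum((v - mean_y) ** 2 for v in y)  # total variation in Y

# Coefficient of determination from the correlation itself.
r_squared = sp_xy ** 2 / (ss_x * ss_y)

# Same quantity from the regression line: 1 - SS_residual / SS_total.
b = sp_xy / ss_x
c = mean_y - b * mean_x
ss_residual = sum((v - (b * a + c)) ** 2 for a, v in zip(x, y))
explained = 1 - ss_residual / ss_y

# Coefficient of nondetermination: variance not accounted for.
k_squared = 1 - r_squared
print(round(r_squared, 3), round(explained, 3), round(k_squared, 3))  # 0.6 0.6 0.4
```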
14
Homoscedasticity
This assumption means that the variance around the regression line is the same for all values of the predictor variable (X).
The accompanying plot (not reproduced here) shows a violation of this assumption: for the lower values on the X-axis, the points are all very near the regression line, while for the higher values on the X-axis there is much more variability around the regression line.
In a bivariate normal distribution, the variance of scores on Y is the same for all values of X. This equal variance of Y scores at each value of X is known as homoscedasticity.
15
Part Correlation
The part correlation is the correlation of X1 (IQ) with X2 (achievement posttest) after the portion of the posttest that can be predicted from the pretest has been removed.
16
Partial Correlation
A simple extension of part correlation.
The correlation of X1 and X2 with X3 “held constant”, removed, or partialed out is a partial correlation.
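Both coefficients can be computed from the three pairwise correlations using the standard first-order formulas. In this sketch the third variable (for example, the pretest) is labeled X3 as an assumption, the function names are hypothetical, and the correlation values are made up for illustration.

```python
from math import sqrt

def part_r(r12, r13, r23):
    """Part (semipartial) correlation of X1 with X2, X3 removed from X2 only."""
    return (r12 - r13 * r23) / sqrt(1 - r23 ** 2)

def partial_r(r12, r13, r23):
    """Partial correlation of X1 and X2, X3 removed from both."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Hypothetical pairwise correlations (illustration only).
r12, r13, r23 = 0.5, 0.4, 0.6

part = part_r(r12, r13, r23)
partial = partial_r(r12, r13, r23)
print(round(part, 3), round(partial, 3))  # 0.325 0.355
```

The two formulas share a numerator. The partial correlation divides by one extra term, which is why its magnitude is never smaller than that of the part correlation.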
17
Multiple Regression
Multiple regression is the statistical method most commonly employed for predicting Y from two or more independent variables.
18
Multiple Correlation
The correlation between Y and predicted Y, when the prediction is based on two or more independent variables, is termed the multiple correlation (R).
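For the two-predictor case, the multiple correlation can be written in terms of the three pairwise correlations. The sketch below uses that standard formula; the function name and the correlation values are hypothetical.

```python
from math import sqrt

def multiple_r_two_predictors(r_y1, r_y2, r_12):
    """Multiple R for Y predicted from X1 and X2, from pairwise correlations."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)

# Hypothetical correlations (illustration only).
R = multiple_r_two_predictors(r_y1=0.5, r_y2=0.4, r_12=0.3)
print(round(R, 3))  # 0.565

# With uncorrelated predictors, R squared is the sum of the squared validities.
R_uncorrelated = multiple_r_two_predictors(r_y1=0.3, r_y2=0.4, r_12=0.0)
print(round(R_uncorrelated, 3))  # 0.5
```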