
Correlation A bit about Pearson’s r

Questions
- What does it mean when a correlation is positive? Negative?
- What is the purpose of the Fisher r-to-z transformation?
- What is range restriction? Range enhancement? What do they do to r?
- Give an example in which data properly analyzed by ANOVA cannot be used to infer causality.
- Why do we care about the sampling distribution of the correlation coefficient?
- What is the effect of reliability on r?

Basic Ideas
- Nominal vs. continuous IV
- Degree (direction) & closeness (magnitude) of linear relations
- Sign (+ or -) for direction; absolute value for magnitude
- Pearson product-moment correlation coefficient

Illustrations: positive, negative, and zero correlations.

Always Plot Your Data!

Simple Formulas
Use either N throughout or else N-1 throughout (in both the SDs and the denominator); the result is the same as long as you are consistent. Pearson's r is the average cross-product of z scores:

r = (1/N) * sum(z_X * z_Y)

That is, the product of (standardized) moments about the means.
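The average cross-product definition is easy to check numerically. A minimal Python sketch (the function name is mine, and the height/weight values are one dataset consistent with the descriptives on the r = 1.0 slide that follows):

```python
import math

def pearson_r(x, y):
    """Pearson's r as the average cross-product of z scores.

    Uses N throughout (population SDs); using N-1 in both the SDs
    and the denominator gives the same r, as the slide notes.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x) / n)
    sy = math.sqrt(sum((v - my) ** 2 for v in y) / n)
    return sum(((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)) / n

# A perfectly linear relation gives r = 1
ht = [60, 62, 64, 66, 68, 70, 72, 74, 76, 78]
wt = [110, 120, 130, 140, 150, 160, 170, 180, 190, 200]
print(round(pearson_r(ht, wt), 4))  # 1.0
```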

Graphic Representation
1. Conversion from raw scores to z scores.
2. Points & quadrants: positive & negative products.
3. Correlation is the average of the cross-products; the sign & magnitude of r depend on where the points fall.
4. The product is at its maximum (average = 1) when the points fall on the line where zX = zY.

r = 1.0

Descriptive Statistics
                     N    Minimum   Maximum   Mean      Std. Deviation
Ht                   10   60.00     78.00     69.0000   6.05530
Wt                   10   110.00    200.00    155.0000  30.27650
Valid N (listwise)   10

Start from r = 1. Leave X alone and add error to Y: r drops to .99.

Add still more error to Y: r drops from .99 to .91.

With two variables, the correlation is the slope of the regression line in z-score units.
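That identity (z-score slope = r) can be verified directly: standardize both variables, fit an ordinary least-squares slope, and compare it with r. A Python sketch (function names and the small dataset are mine, for illustration only):

```python
import math

def zscores(v):
    # Standardize using population SD (N in the denominator)
    n, m = len(v), sum(v) / len(v)
    s = math.sqrt(sum((x - m) ** 2 for x in v) / n)
    return [(x - m) / s for x in v]

def ols_slope(x, y):
    # Ordinary least-squares slope of y on x
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sum((a - mx) ** 2 for a in x)
    return num / den

x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
zx, zy = zscores(x), zscores(y)
r = sum(a * b for a, b in zip(zx, zy)) / len(x)  # average cross-product
b_z = ols_slope(zx, zy)                          # slope in z-score units
print(abs(r - b_z) < 1e-9)  # True: the z-score slope equals r
```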

Review
What does it mean when a correlation is positive? Negative?

Sampling Distribution of r
The statistic is r; the parameter is ρ (rho). In general, r is slightly biased. The sampling variance is approximately

Var(r) ≈ (1 - ρ^2)^2 / (N - 1)

so the sampling variance depends both on N and on ρ.

Fisher's r to z Transformation

r:  .10  .20  .30  .40  .50  .60  .70  .80  .90
z:  .10  .20  .31  .42  .55  .69  .87  1.10  1.47

The sampling distribution of z is approximately normal as N increases; the transformation pulls out the short tail to make a better (more nearly normal) distribution. The sampling variance of z, 1/(N - 3), does not depend on ρ. The r-to-z function is the inverse hyperbolic tangent, atanh.
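Since the transformation is just atanh, the table above and the standard error can be reproduced in a few lines of Python (illustrative; variable names are mine):

```python
import math

# Fisher's r-to-z is the inverse hyperbolic tangent:
# z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))
table = {r / 100: round(math.atanh(r / 100), 2)
         for r in range(10, 100, 10)}
print(table[0.3], table[0.8])  # 0.31 1.1

# The standard error of z depends only on N, not on rho:
n = 103
se_z = math.sqrt(1 / (n - 3))
print(se_z)  # 0.1
```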

Hypothesis test 1: is ρ = 0? The result

t = r sqrt(N - 2) / sqrt(1 - r^2)

is compared to t with (N - 2) df for significance. Say r = .25, N = 100: t = .25 sqrt(98) / sqrt(1 - .0625) ≈ 2.56 > t(.05, 98) = 1.984, so p < .05.
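The t statistic for testing ρ = 0 (t = r·sqrt(N-2)/sqrt(1-r²), df = N-2) is a one-liner; this sketch checks the slide's worked example (function name is mine):

```python
import math

def t_for_r(r, n):
    # t = r * sqrt(N - 2) / sqrt(1 - r^2), compared to t with N - 2 df
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t = t_for_r(0.25, 100)
print(round(t, 3))  # 2.556, beyond t(.05, 98) = 1.984, so p < .05
```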

Hypothesis test 2: a one-sample z test, where r is the sample value and ρ0 is the hypothesized population value:

z = (z_r - z_ρ0) sqrt(N - 3)

Say N = 200, r = .54, and ρ0 = .30: z = (.604 - .310) sqrt(197) ≈ 4.13. Compare to the unit normal: 4.13 > 1.96, so the test is significant. Our sample was not drawn from a population in which ρ is .30.
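The one-sample test against a nonzero ρ works on the Fisher z scale: transform both r and the hypothesized ρ0, divide the difference by sqrt(1/(N-3)). A Python sketch of the slide's example (function name is mine):

```python
import math

def z_against_rho0(r, rho0, n):
    # z = (atanh(r) - atanh(rho0)) / sqrt(1 / (N - 3))
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)

z = z_against_rho0(0.54, 0.30, 200)
print(round(z, 2))  # about 4.1, well beyond 1.96
```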

Hypothesis test 3: testing the equality of correlations from 2 INDEPENDENT samples:

z = (z1 - z2) / sqrt(1/(N1 - 3) + 1/(N2 - 3))

Say N1 = 150, r1 = .63, N2 = 175, r2 = .70: z = (.741 - .867) / sqrt(1/147 + 1/172) ≈ -1.12, n.s.

Hypothesis test 4: testing the equality of any number of independent correlations. Compare

Q = sum over studies of (n_i - 3)(z_i - zbar)^2

to chi-square with k - 1 df, where zbar is the (n - 3)-weighted mean of the z's.

Study   r    n    z    (n-3)z   (z-zbar)^2   (n-3)(z-zbar)^2
1       .2   200  .20  39.94    .0441        8.69
2       .5   150  .55  80.75    .0196        2.88
3       .6   75   .69  49.91    .0784        5.64
sum          425       170.6                 17.21 = Q

zbar = 170.6/416 = .41. Chi-square at .05 with 2 df = 5.99, so not all ρ are equal.
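The Q statistic in the table above can be sketched in Python (function name is mine; using unrounded Fisher z's gives Q ≈ 17.1 rather than the table's 17.21, which was computed from rounded z's):

```python
import math

def homogeneity_q(rs, ns):
    """Q = sum (n_i - 3) * (z_i - zbar)^2, compared to chi-square, k - 1 df.

    zbar is the (n - 3)-weighted mean of the Fisher z's.
    """
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    zbar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    return sum(w * (z - zbar) ** 2 for w, z in zip(ws, zs))

q = homogeneity_q([0.2, 0.5, 0.6], [200, 150, 75])
print(round(q, 1))  # about 17, versus chi-square(.05, 2) = 5.99
```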

Hypothesis test 5: dependent correlations (the Hotelling-Williams test). For H0: ρ12 = ρ13 within a single sample,

t = (r12 - r13) sqrt[ (N - 1)(1 + r23) / ( 2((N - 1)/(N - 3))|R| + rbar^2 (1 - r23)^3 ) ], df = N - 3,

where |R| = 1 - r12^2 - r13^2 - r23^2 + 2 r12 r13 r23 and rbar = (r12 + r13)/2. Say N = 101, r12 = .4, r13 = .6, r23 = .3: t ≈ -2.10, and |t| > t(.05, 98) = 1.98, so the two correlations differ. See my notes.
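A sketch of one common form of the Hotelling-Williams t (the version popularized by Steiger; function name is mine), checking the slide's example:

```python
import math

def williams_t(r12, r13, r23, n):
    """Hotelling-Williams t for H0: rho12 = rho13, df = n - 3."""
    det_r = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    rbar = (r12 + r13) / 2
    num = (n - 1) * (1 + r23)
    den = 2 * ((n - 1) / (n - 3)) * det_r + rbar**2 * (1 - r23)**3
    return (r12 - r13) * math.sqrt(num / den)

t = williams_t(0.4, 0.6, 0.3, 101)
print(round(t, 2))  # about -2.1; |t| > t(.05, 98) = 1.98
```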

Review
- What is the purpose of the Fisher r-to-z transformation?
- Test the hypothesis that ρ1 = ρ2, given that r1 = .50, N1 = 103, r2 = .60, N2 = 128, and the samples are independent.
- Why do we care about the sampling distribution of the correlation coefficient?

Range Restriction
Sampling only a narrow slice of the X range (e.g., only admitted applicants) typically shrinks |r| relative to its full-range value.

Range Enhancement
Sampling only extreme X values (dropping the middle of the range) typically inflates |r| relative to its full-range value.
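Both effects are easy to demonstrate by simulation. A Python sketch (not from the slides; the seed, sample size, and cutoffs are arbitrary choices of mine): generate data with a true correlation of .5, then correlate using only the middle of the X range (restriction) or only the tails (enhancement).

```python
import random

random.seed(1)  # fixed seed so the demonstration is reproducible

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Bivariate normal data with a true correlation of 0.5
x = [random.gauss(0, 1) for _ in range(5000)]
y = [0.5 * xi + (1 - 0.25) ** 0.5 * random.gauss(0, 1) for xi in x]

full = pearson_r(x, y)
middle = [(a, b) for a, b in zip(x, y) if abs(a) < 0.5]  # restricted range
tails = [(a, b) for a, b in zip(x, y) if abs(a) > 1.0]   # enhanced range
rx, ry = zip(*middle)
ex, ey = zip(*tails)
r_restricted = pearson_r(rx, ry)
r_enhanced = pearson_r(ex, ey)
print(r_restricted < full < r_enhanced)  # True
```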

Reliability
Reliability sets the ceiling for validity: measurement error attenuates correlations. If the correlation between true scores is .7 and the reliabilities of X and Y are both .8, the observed correlation is .7 × sqrt(.8 × .8) = .7 × .8 = .56.

Disattenuated correlation: if our observed correlation is .56 and the reliabilities of both X and Y are .8, our estimate of the correlation between true scores is .56 / sqrt(.8 × .8) = .56 / .8 = .70.
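The attenuation formula and its inverse can be sketched in Python (function names are mine), reproducing the slide's numbers:

```python
import math

def attenuate(r_true, rel_x, rel_y):
    # Observed r = true-score r * sqrt(rel_x * rel_y)
    return r_true * math.sqrt(rel_x * rel_y)

def disattenuate(r_obs, rel_x, rel_y):
    # Estimated true-score r = observed r / sqrt(rel_x * rel_y)
    return r_obs / math.sqrt(rel_x * rel_y)

print(round(attenuate(0.7, 0.8, 0.8), 2))      # 0.56
print(round(disattenuate(0.56, 0.8, 0.8), 2))  # 0.7
```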

Add Error to Y Only
- The correlation decreases.
- The distribution of X does not change.
- The distribution of Y becomes wider (increased variance).
- The slope of Y on X remains constant (the SDy effect on b and on r cancels out).
- The last point is not true for error in X, which biases the slope as well.
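A simulation makes the asymmetry concrete (a Python sketch, not from the slides; seed and sample size are arbitrary choices of mine): error in Y lowers r but leaves the slope near its true value of 2, while error in X lowers r and shrinks the slope toward 0.

```python
import random

random.seed(2)  # fixed seed for reproducibility

def slope_and_r(x, y):
    # OLS slope of y on x, and Pearson's r
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

x = [random.gauss(0, 1) for _ in range(5000)]
y = [2 * xi for xi in x]                         # exact relation, slope = 2
y_noisy = [yi + random.gauss(0, 1) for yi in y]  # error added to Y
x_noisy = [xi + random.gauss(0, 1) for xi in x]  # error added to X

b_y, r_y = slope_and_r(x, y_noisy)  # r drops, slope still about 2
b_x, r_x = slope_and_r(x_noisy, y)  # r drops AND slope shrinks toward 0
print(round(b_y, 1), round(b_x, 1))
```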

Review What is range restriction? Range enhancement? What do they do to r? What is the effect of reliability on r?

SAS Power Estimation

proc power;
  onecorr dist=fisherz
    corr = 0.35
    nullcorr = 0.2
    sides = 1
    ntotal = 100
    power = .;
run;

Computed Power
Actual alpha = .05
Power = .486

proc power;
  onecorr
    corr = 0.35
    nullcorr = 0
    sides = 2
    ntotal = .
    power = .8;
run;

Computed N Total
Alpha = .05
Actual Power = .801
Ntotal = 61

Power for Correlations

ρ     N required (against H0: ρ = 0)
.10   782
.15   346
.20   193
.25   123
.30   84
.35   61

Sample sizes required for powerful conventional significance tests for typical values of the correlation coefficient in psychology. Power = .8, two-tailed, alpha = .05.
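The table above can be approximated without SAS by the Fisher z normal approximation, N ≈ ((z_alpha + z_beta) / atanh(ρ))² + 3. A Python sketch (function name is mine; it lands within a case or two of SAS's exact values):

```python
import math

def n_required(rho, z_alpha=1.959964, z_beta=0.841621):
    """Approximate N for a two-tailed test of rho = 0 via Fisher's z:
    N = ((z_alpha + z_beta) / atanh(rho))^2 + 3, rounded up.

    Defaults are the standard-normal quantiles for alpha = .05
    (two-tailed) and power = .80.
    """
    return math.ceil(((z_alpha + z_beta) / math.atanh(rho)) ** 2 + 3)

for rho in [0.10, 0.15, 0.20, 0.25, 0.30, 0.35]:
    print(rho, n_required(rho))  # close to the slide's table (782 ... 61)
```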

Programs
- Review the 'corrs' Excel program from the website: download the Excel file and work through the examples of tests for correlations.
- Review the R program for computing correlations.

Exercises
Download Spector's data and compute the univariate statistics & correlation matrix for five variables: Age, Autonomy, Work hours, Interpersonal conflict, Job satisfaction.

Problems:
- Which pairs are significant? (Use the per-comparison, i.e., nominal, alpha.)
- Is the absolute value of the correlation between conflict and job satisfaction significantly different from .5?
- Is the correlation between age and conflict different from the correlation between age and job satisfaction?