Correlation: A bit about Pearson’s r


Questions
- Why does the maximum value of r equal 1.0?
- What does it mean when a correlation is positive? Negative?
- What is the purpose of the Fisher r to z transformation?
- What is range restriction? Range enhancement? What do they do to r?
- Give an example in which data properly analyzed by ANOVA cannot be used to infer causality.
- Why do we care about the sampling distribution of the correlation coefficient?
- What is the effect of reliability on r?

Basic Ideas
- Nominal vs. continuous IV
- Degree (direction) & closeness (magnitude) of linear relations
  - Sign (+ or -) for direction
  - Absolute value for magnitude
- Pearson product-moment correlation coefficient

Illustrations Positive, negative, zero

Simple Formulas
Use either N throughout or else N-1 throughout (in both the SDs and the denominator); the result is the same as long as you are consistent. Pearson’s r is the average cross product of z scores, i.e., the average product of (standardized) moments about the means:
r = Σ(zX zY) / N
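The average-cross-product definition is easy to verify directly. A quick sketch, using the height/weight data from the r = 1.0 slide below:

```python
import numpy as np

# Heights and weights from the r = 1.0 example (perfectly linear data).
ht = np.array([60.0, 62, 64, 66, 68, 70, 72, 74, 76, 78])
wt = np.array([110.0, 120, 130, 140, 150, 160, 170, 180, 190, 200])

# Standardize. Using ddof=0 in both the z scores and the averaging
# denominator (or ddof=1 in both) gives the same r -- consistency is
# what matters.
zx = (ht - ht.mean()) / ht.std(ddof=0)
zy = (wt - wt.mean()) / wt.std(ddof=0)
r = np.mean(zx * zy)  # average cross product of z scores

print(r)                          # 1.0 (up to floating-point rounding)
print(np.corrcoef(ht, wt)[0, 1])  # agrees with NumPy's built-in
```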

Graphic Representation
1. Conversion from raw scores to z scores.
2. Points & quadrants: positive & negative products.
3. Correlation is the average of the cross products. The sign & magnitude of r depend on where the points fall.
4. The product is at its maximum (average = 1) when the points fall on the line zX = zY.

r = 1.0

Descriptive Statistics
                     N    Minimum   Maximum   Mean       Std. Deviation
Ht                   10   60.00     78.00     69.0000    6.05530
Wt                   10   110.00    200.00    155.0000   30.27650
Valid N (listwise)   10

r = 1. Leave X alone and add error to Y: the correlation drops to r = .99.

r = .99. Add still more error: the correlation drops to r = .91.

With two variables, the correlation is the slope of the regression line in z-score form.

Review
- Why does the maximum value of r equal 1.0?
- What does it mean when a correlation is positive? Negative?

Sampling Distribution of r
The statistic is r; the parameter is ρ (rho). In general, r is slightly biased. The sampling variance is approximately
Var(r) ≈ (1 - ρ²)² / (N - 1),
so the sampling variance depends both on N and on ρ.

Fisher’s r to z Transformation
r:  .10  .20  .30  .40  .50  .60  .70  .80  .90
z:  .10  .20  .31  .42  .55  .69  .87  1.10 1.47
z = ½ ln[(1 + r)/(1 - r)]. The sampling distribution of z approaches the normal as N increases; the transformation pulls out the short tail to make a better (more nearly normal) distribution. The sampling variance of z, 1/(N - 3), does not depend on ρ.
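The transformation is just the inverse hyperbolic tangent, so the table above can be reproduced in a couple of lines. A sketch using NumPy’s `arctanh`:

```python
import numpy as np

# Fisher's r to z: z = 0.5 * ln((1 + r) / (1 - r)) = arctanh(r)
r = np.array([.10, .20, .30, .40, .50, .60, .70, .80, .90])
z = np.arctanh(r)
print(np.round(z, 2))  # .10 .20 .31 .42 .55 .69 .87 1.10 1.47, matching the table
```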

Hypothesis test 1: t = r·sqrt(N - 2) / sqrt(1 - r²); the result is compared to t with (N - 2) df for significance. Say r = .25, N = 100: t = .25·sqrt(98)/sqrt(1 - .25²) ≈ 2.56 > t(.05, 98) = 1.984, so p < .05.
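The t statistic is simple to compute by hand or in code. A stdlib-only sketch:

```python
import math

# t test of H0: rho = 0, given r = .25 and N = 100.
r, N = 0.25, 100
t = r * math.sqrt(N - 2) / math.sqrt(1 - r ** 2)
print(round(t, 2))  # 2.56 > t(.05, 98) = 1.984, so p < .05
```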

Hypothesis test 2: One-sample z test, where r is the sample value and ρ is the hypothesized population value: z = (z_r - z_ρ)·sqrt(N - 3). Say N = 200, r = .54, and ρ = .30: z = 4.13. Compare to the unit normal: 4.13 > 1.96, so it is significant. Our sample was not drawn from a population in which rho is .30.
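The same arithmetic in Python (`math.atanh` is the r-to-z transform):

```python
import math

# One-sample z test of H0: rho = .30, given r = .54 and N = 200.
r, rho0, N = 0.54, 0.30, 200
z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(N - 3)
print(round(z, 2))  # 4.14 at full precision; the slide's 4.13 uses rounded z values
```

Either way the result clearly exceeds 1.96, so the conclusion is unchanged.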

Hypothesis test 3: Testing equality of correlations from two INDEPENDENT samples: z = (z1 - z2) / sqrt(1/(N1 - 3) + 1/(N2 - 3)). Say N1 = 150, r1 = .63, N2 = 175, r2 = .70: z ≈ -1.12, n.s.
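The two-sample version divides the difference in z values by the pooled standard error. A sketch:

```python
import math

# Test of H0: rho1 = rho2 for two independent samples.
n1, r1 = 150, 0.63
n2, r2 = 175, 0.70
se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
z = (math.atanh(r1) - math.atanh(r2)) / se
print(round(z, 2))  # -1.12; |z| < 1.96, so n.s.
```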

Hypothesis test 4: Testing equality of any number of independent correlations. Q = Σ(n_i - 3)(z_i - z̄)², where z̄ = Σ(n_i - 3)z_i / Σ(n_i - 3); compare Q to chi-square with k - 1 df.

Study   r    n     z     (n-3)z   (z-zbar)²   (n-3)(z-zbar)²
1       .2   200   .20   39.94    .0441       8.69
2       .5   150   .55   80.75    .0196       2.88
3       .6   75    .69   49.91    .0784       5.64
Sum          425         170.6                17.21 = Q

zbar = 170.6 / 416 = .41. Chi-square at .05 with 2 df = 5.99, so not all rho are equal.
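The whole table can be generated programmatically. A sketch; at full precision Q comes out near 17.1 rather than the table’s 17.21, which uses z rounded to two places:

```python
import math

# Homogeneity test for k independent correlations.
studies = [(0.2, 200), (0.5, 150), (0.6, 75)]  # (r, n) from the table
z = [math.atanh(r) for r, n in studies]
w = [n - 3 for _, n in studies]

# Weighted mean of z, weights n - 3.
zbar = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
Q = sum(wi * (zi - zbar) ** 2 for wi, zi in zip(w, z))

print(round(zbar, 2), round(Q, 1))  # 0.41 17.1; Q > 5.99, so reject equality
```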

Hypothesis test 5: dependent correlations (Hotelling-Williams test). Say N = 101, r12 = .4, r13 = .6, r23 = .3; compare the result to t(.05, 98) = 1.98. See my notes for the formula.
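The statistic is easy to script once you have a formula for it. The version below uses the Williams form of the test as summarized by Steiger (1980); treat the formula as an assumption here and check it against the course notes:

```python
import math

# Hotelling-Williams test of H0: rho12 = rho13 (two dependent correlations
# sharing variable 1). Formula per Williams/Steiger -- an assumption here.
n, r12, r13, r23 = 101, 0.4, 0.6, 0.3
detR = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23  # |R|, det of the 3x3 correlation matrix
rbar = (r12 + r13) / 2
t = (r12 - r13) * math.sqrt(
    ((n - 1) * (1 + r23))
    / (2 * ((n - 1) / (n - 3)) * detR + rbar**2 * (1 - r23) ** 3)
)
print(round(t, 2))  # -2.1; |t| > t(.05, 98) = 1.98, so significant
```

Note the df, N - 3 = 98, matches the critical value quoted on the slide.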

Review
- What is the purpose of the Fisher r to z transformation?
- Test the hypothesis that ρ1 = ρ2, given that r1 = .50, N1 = 103, r2 = .60, N2 = 128, and the samples are independent.
- Why do we care about the sampling distribution of the correlation coefficient?

Range Restriction/Enhancement

Reliability
Reliability sets the ceiling for validity: measurement error attenuates correlations. If the correlation between true scores is .70 and the reliabilities of X and Y are both .80, the observed correlation is .7·sqrt(.8·.8) = .7·.8 = .56.
Disattenuated correlation: if our observed correlation is .56 and the reliabilities of both X and Y are .8, our estimate of the correlation between true scores is .56/sqrt(.8·.8) = .56/.8 = .70.
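Both directions of the computation in a few lines (a sketch):

```python
import math

# Attenuation: observed r = true r * sqrt(rxx * ryy).
true_r, rxx, ryy = 0.70, 0.80, 0.80
observed = true_r * math.sqrt(rxx * ryy)
print(round(observed, 2))  # 0.56

# Disattenuation: estimated true r = observed r / sqrt(rxx * ryy).
estimated_true = observed / math.sqrt(rxx * ryy)
print(round(estimated_true, 2))  # 0.7
```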

Review
- What is range restriction? Range enhancement? What do they do to r?
- What is the effect of reliability on r?

SAS Power Estimation

proc power;
  onecorr dist=fisherz
    corr = 0.35
    nullcorr = 0.2
    sides = 1
    ntotal = 100
    power = .;
run;

Computed Power: actual alpha = .05, Power = .486

proc power;
  onecorr
    corr = 0.35
    nullcorr = 0
    sides = 2
    ntotal = .
    power = .8;
run;

Computed N Total: alpha = .05, actual power = .801, Ntotal = 61

Power for Correlations
Sample sizes required for powerful conventional significance tests against the null hypothesis rho = 0, for typical values of the correlation coefficient in psychology (power = .8, two tails, alpha = .05):

Rho   N required
.10   782
.15   346
.20   193
.25   123
.30   84
.35   61
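These sample sizes can be approximated without SAS via the Fisher z method, N ≈ ((z_α/2 + z_β)/arctanh(ρ))² + 3. The approximation runs a point or two above PROC POWER’s exact values, so treat it as a sketch, not a replacement:

```python
import math

Z_ALPHA = 1.959964  # standard normal quantile for two-sided alpha = .05
Z_BETA = 0.841621   # standard normal quantile for power = .80

def approx_n(rho):
    """Approximate N to detect rho vs. H0: rho = 0 (Fisher z method)."""
    return math.ceil(((Z_ALPHA + Z_BETA) / math.atanh(rho)) ** 2 + 3)

for rho in (0.10, 0.15, 0.20, 0.25, 0.30, 0.35):
    print(rho, approx_n(rho))  # within a point or two of the SAS table above
```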