Correlation & Regression (Chapter 5)


Correlation & Regression, Chapter 5. Correlation: do you have a relationship between two quantitative variables (measured on the same person)? (1) Is there a relationship (p < 0.05)? (2) What is its direction (+ vs. −)? (3) What is its strength (r: from −1 to +1)? Regression: if you have a significant correlation, how well can you predict a subject's Y-score if you know their X-score (and vice versa)? Are predictions for members of the population as good as predictions for sample members?

Correlations measure LINEAR relationships. No relationship (r = 0.0): Y-scores have no tendency to go up or down as X-scores go up; you cannot predict a person's Y-value from his X-value any better than if you didn't know his X-score. Positive linear relationship: Y-scores tend to go up as X-scores go up.

Correlations measure LINEAR relationships, cont. There can BE a relationship that is not linear: r = 0.0, but that DOESN'T mean that the two variables are unrelated.

Interpreting r-values. Coefficient of determination, r²: the square of the r-value. r² × 100 = percent of shared variance; the rest of the variance is independent of the other variable. (Example scatterplots on the slide: r = 0.50 and r = 0.6928.)

Interpreting r-values. If the coefficient of determination between height and weight is r² = 0.3 (r ≈ 0.55): 30% of the variability in people's weight can be related to their height; the other 70% of the differences between people in weight is independent of their height. Remember: this does not mean that weight is partially CAUSED by height. Arm and leg length have a high coefficient of determination, but a growing leg does not cause your arm to grow.
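To make the shared-variance idea concrete, here is a minimal Python sketch. The height and weight values are invented for illustration; they are not data from the slides.

```python
import numpy as np

# Invented height (cm) and weight (kg) values for eight people.
height = np.array([150, 160, 165, 170, 175, 180, 185, 190], dtype=float)
weight = np.array([52, 60, 63, 68, 70, 80, 83, 90], dtype=float)

r = np.corrcoef(height, weight)[0, 1]      # Pearson correlation
r_squared = r ** 2                         # coefficient of determination

# r^2 * 100 = percent of the variance in weight shared with height;
# the remaining percent is independent of height.
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, shared = {r_squared:.1%}")
```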

IV & DV both quantitative. Correlation: each data point represents two measures from the same person. 1. Is there a relationship? 2. What direction is the relationship? 3. How strong is the relationship? The stronger the relationship, the better you can predict one score if you know the other. (Example scatterplots: r = 0.0, 0.6, 0.8, 0.9, 0.95.)

Negative correlation. Quasi-independent variable: # of cigarettes/day. Dependent variable: physical endurance. The fatter the field, the weaker the correlation. (Example scatterplots: r = −0.30, −0.50, −0.70, −0.90, −0.95, −0.99.)

Correlations. (Scatterplot: # of cigarettes smoked per day × 10 vs. # of malformed cells in lung biopsy.)

Correlation. (Scatterplot: # of cigarettes smoked per day × 10 vs. lung capacity.)

Methodology: Restriction of Range. Restriction of range causes an artificially low (underestimated) value of r, e.g., using just the high GRE scores (represented by the open circles on the slide). Common when the same scores determine who is included in the correlational analysis, e.g., only applicants with high GRE scores get into grad school.
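The effect is easy to simulate. The sketch below uses synthetic GRE/GPA data (the 0.6 slope and the cutoff are invented assumptions): it computes r on the full applicant pool, then again after keeping only the high scorers.

```python
import numpy as np

rng = np.random.default_rng(4)
gre = rng.normal(0, 1, 1000)                    # standardized GRE scores
gpa = 0.6 * gre + rng.normal(0, 0.8, 1000)      # grad GPA, invented relationship

r_full = np.corrcoef(gre, gpa)[0, 1]

# Restriction of range: only high-GRE applicants get into grad school.
admitted = gre > 0.5
r_restricted = np.corrcoef(gre[admitted], gpa[admitted])[0, 1]

print(f"full range r = {r_full:.2f}, restricted range r = {r_restricted:.2f}")
```

The restricted-range r comes out noticeably smaller even though the underlying relationship is unchanged.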

Computing r. Deviation scores: r = Σxy / √(Σx² · Σy²), where x = X − X̄ and y = Y − Ȳ. Raw scores: r = [nΣXY − (ΣX)(ΣY)] / √{[nΣX² − (ΣX)²][nΣY² − (ΣY)²]}.

Computing r, cont. Z-scores: r = Σ(z_X · z_Y) / n.
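The deviation-score, raw-score, and z-score formulas are algebraically equivalent, which a short sketch can confirm (the X and Y values are arbitrary illustrative scores):

```python
import numpy as np

x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])   # arbitrary illustrative X scores
y = np.array([1.0, 3.0, 4.0, 8.0, 9.0])   # arbitrary illustrative Y scores
n = len(x)

# Deviation-score formula: r = sum(xy) / sqrt(sum(x^2) * sum(y^2))
dx, dy = x - x.mean(), y - y.mean()
r_dev = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Raw-score formula
num = n * (x * y).sum() - x.sum() * y.sum()
den = np.sqrt((n * (x ** 2).sum() - x.sum() ** 2) *
              (n * (y ** 2).sum() - y.sum() ** 2))
r_raw = num / den

# Z-score formula: r = sum(zx * zy) / n  (population SDs, i.e. ddof=0)
r_z = ((dx / x.std()) * (dy / y.std())).sum() / n

print(r_dev, r_raw, r_z)   # all three agree
```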

Can you predict Y_i if you know X_i?

Methodology: Reliability. An instrument used to measure a trait (vs. a state) must be reliable: measurements taken twice on the same subjects should agree. Disagreement means either it is not a trait or the instrument is poor. Criterion for reliability: r = 0.80. Coefficient of stability: the correlation of measures taken more than 6 months apart.

Regression creates a line of "best fit" running through the data, using the method of least squares: the smallest squared distances between the points and the line. Y-hat = a + b·X (and, to predict X from Y, X-hat = a′ + b′·Y), where a = intercept and b = slope. The regression line (line of best fit) gives you a & b; plug in X to predict Y, or Y to predict X.
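A minimal least-squares fit in Python, using invented X/Y values; `np.polyfit` with degree 1 returns the slope and intercept of the line of best fit:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # invented X scores
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])   # invented Y scores

# Degree-1 least-squares fit: returns slope b and intercept a.
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x                          # plug in X to predict Y

print(f"Y-hat = {a:.2f} + {b:.2f} * X")
```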

Regression, cont. Method of least squares: minimizes deviations from the regression line, and therefore minimizes errors of prediction.

Regression, cont. The correlation between X & Y equals the correlation between Y & Y-hat. Error of estimation: the difference between Y and Y-hat. Standard error of estimation: √(Σ(Y − Y-hat)² / n); remember what "standard" means. The higher the correlation, the lower the standard error of estimation. Shrinkage: the reduction in size from the sample correlation to the population correlation it estimates.
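Both claims, that corr(Y, Y-hat) equals corr(X, Y) and that the standard error of estimation is the root of the mean squared prediction error, can be checked numerically on invented data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # invented X scores
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])   # invented Y scores
n = len(x)

b, a = np.polyfit(x, y, 1)
y_hat = a + b * x

# Standard error of estimation: sqrt of the mean squared error of prediction.
see = np.sqrt(((y - y_hat) ** 2).sum() / n)

# corr(X, Y) equals corr(Y, Y-hat), since Y-hat is a linear function of X.
r_xy = np.corrcoef(x, y)[0, 1]
r_y_yhat = np.corrcoef(y, y_hat)[0, 1]
print(f"SEE = {see:.3f}, r(X,Y) = {r_xy:.3f}, r(Y,Y-hat) = {r_y_yhat:.3f}")
```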

Multiple Correlation & Regression. Using several measures to predict a measure or future measure: Y-hat = a + b1X1 + b2X2 + b3X3 + b4X4. Y-hat is the dependent variable; X1, X2, X3, & X4 are the predictor (independent) variables. College GPA-hat = a + b1·H.S.GPA + b2·SAT + b3·ACT + b4·HoursWork. R = multiple correlation (range: 0 to 1). R² = coefficient of determination (R² × 100 = % of shared variance). Uses partial correlations for all but the first predictor variable.
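A sketch of multiple regression via ordinary least squares, with invented predictor values (HS GPA, SAT, hours worked) standing in for the slide's example; R is obtained as the correlation between Y and Y-hat:

```python
import numpy as np

# Invented data: HS GPA, SAT, and weekly hours worked as predictors.
X = np.array([
    [3.8, 1400, 10],
    [3.2, 1200, 20],
    [3.9, 1500,  5],
    [2.9, 1100, 25],
    [3.5, 1300, 15],
    [3.0, 1150, 30],
])
y = np.array([3.6, 3.0, 3.8, 2.7, 3.3, 2.8])   # college GPA

# Fit Y-hat = a + b1*X1 + b2*X2 + b3*X3 by ordinary least squares.
A = np.column_stack([np.ones(len(y)), X])
coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coefs

# Multiple correlation R = correlation of Y with Y-hat.
R = np.corrcoef(y, y_hat)[0, 1]
print(f"R = {R:.3f}, R^2 = {R**2:.3f}")
```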

Partial Correlations. The relationship (shared variance) between two variables when the variance they BOTH share with a third variable is removed. Used in multiple regression to subtract redundant variance when assessing the combined relationship between the predictor variables and the dependent variable, e.g., H.S. GPA and SAT scores.
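Partial correlation can be computed two equivalent ways: correlate the residuals left after regressing each variable on the third, or use the textbook formula r_xy.z = (r_xy − r_xz·r_yz) / √[(1 − r_xz²)(1 − r_yz²)]. A sketch on simulated data (the variables are synthetic, generated to share variance with z):

```python
import numpy as np

def partial_corr(x, y, z):
    """r between x and y with the variance both share with z removed:
    correlate the residuals left after regressing each on z."""
    rx = x - np.polyval(np.polyfit(z, x, 1), z)   # part of x independent of z
    ry = y - np.polyval(np.polyfit(z, y, 1), z)   # part of y independent of z
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(0)
z = rng.normal(size=200)                 # the shared third variable
x = z + rng.normal(size=200)
y = z + rng.normal(size=200)

r_xy = np.corrcoef(x, y)[0, 1]
r_xz = np.corrcoef(x, z)[0, 1]
r_yz = np.corrcoef(y, z)[0, 1]
# textbook formula for the partial correlation
r_formula = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(partial_corr(x, y, z), r_formula)  # the two methods agree
```

Here x and y correlate only because both share variance with z, so the partial correlation is much smaller than the zero-order r_xy.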

Step-wise Regression. Build your regression equation one predictor variable at a time: Start with the PV with the highest simple correlation with the DV. Compute the partial correlations between the remaining PVs and the DV. Take the PV with the highest partial correlation. Compute the partial correlations between the remaining PVs and the DV, with the redundancy with the first two PVs removed. Take the PV with the highest partial correlation. Keep going until you run out of PVs.

Step-wise Regression, cont. Simple correlations with college GPA: HS GPA = .6, SAT = .5, ACT = .48 (but highly redundant with SAT; it measures the same thing), Work = −.3. College GPA-hat = a + b1·H.S.GPA + b2·SAT + b3·HoursWork + b4·ACT.
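The redundancy point can be shown numerically. In the simulation below (all values synthetic; the 0.95 overlap between SAT and ACT is an invented assumption), adding ACT to a model that already contains SAT barely raises R², while adding an independent predictor (hours worked) raises it substantially:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
sat = rng.normal(size=n)
act = 0.95 * sat + rng.normal(0, 0.2, n)    # nearly a re-measurement of SAT
work = rng.normal(size=n)                    # independent predictor
college = sat - 0.5 * work + rng.normal(0, 0.5, n)

def r2(cols):
    """R^2 from an OLS fit of college GPA on the given predictor columns."""
    A = np.column_stack([np.ones(n)] + cols)
    coefs, *_ = np.linalg.lstsq(A, college, rcond=None)
    return np.corrcoef(college, A @ coefs)[0, 1] ** 2

print(f"SAT alone:  R^2 = {r2([sat]):.3f}")
print(f"SAT + ACT:  R^2 = {r2([sat, act]):.3f}")    # redundant: tiny gain
print(f"SAT + Work: R^2 = {r2([sat, work]):.3f}")   # unique variance: real gain
```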

Stepwise Multiple Regression

Shrinkage. Step 1: Construct the regression equation using a sample which has already graduated from college. Step 2: Use the a, b1, b2, b3, b4 from this equation to predict the college GPA (Y-hat) of high school graduates/applicants. The regression equation will do a better job of predicting the college GPA (Y-hat) of the original sample, because it capitalizes on all the idiosyncratic relationships (correlations) of the original sample. Shrinkage: the original R² will be larger than future R²s.
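A small simulation of shrinkage (all data synthetic): fit coefficients on one small sample, then apply those same coefficients to a fresh sample. The original-sample R² capitalizes on chance and typically exceeds the new-sample R².

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 25, 8                                  # small sample, many predictors

def sample():
    X = rng.normal(size=(n, p))
    # only the first two predictors truly relate to the outcome
    y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 1, n)
    return X, y

X_orig, y_orig = sample()                     # "already graduated" sample
X_new, y_new = sample()                       # future applicants

A = np.column_stack([np.ones(n), X_orig])
coefs, *_ = np.linalg.lstsq(A, y_orig, rcond=None)

def r_squared(X, y):
    y_hat = np.column_stack([np.ones(len(y)), X]) @ coefs
    return np.corrcoef(y, y_hat)[0, 1] ** 2

print(f"original sample R^2: {r_squared(X_orig, y_orig):.2f}")
print(f"new sample R^2:      {r_squared(X_new, y_new):.2f}")
```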

Forced Order of Entry. Specify the order in which PVs are added to the regression equation. Used to (1) test hypotheses and (2) control for confounding variables. E.g.: Is there gender bias in the salaries of lawyers? Point-biserial correlation (r_pb) of gender and salary: r_pb = 0.4 (a correlation between a dichotomous and a continuous variable). But females are younger, less experienced, & have fewer years on their current job. 1. Create the multiple regression formula with all the other variables. 2. Then add the test variable (Gender). 3. Look at R²: does it increase a lot or hardly at all? If R² goes up appreciably, then gender has a unique influence.

Other Types of Correlation. Pearson product-moment correlation: the standard correlation; requires two continuous variables at the interval/ratio level; r² gives the ratio of shared variance to total variance. Point-biserial correlation (r_pbs or r_pb): one truly dichotomous variable (only two values) and one continuous (interval/ratio) variable; measures the proportion of variance in the continuous variable which can be related to group membership, e.g., the point-biserial correlation between height and gender.
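Numerically, the point-biserial correlation is just the Pearson r computed with the dichotomy coded 0/1, as a quick sketch with invented two-group height data shows:

```python
import numpy as np

rng = np.random.default_rng(2)
group = np.repeat([0.0, 1.0], 50)         # dichotomous variable (e.g., gender)
# invented heights: the group means differ, with a common spread
height = np.where(group == 1, 178.0, 165.0) + rng.normal(0, 6, 100)

# Point-biserial r is just Pearson r with the dichotomy coded 0/1.
r_pb = np.corrcoef(group, height)[0, 1]
print(f"r_pb = {r_pb:.3f}, shared variance = {r_pb ** 2:.1%}")
```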

Discriminant Function Analysis / Logistic Regression. Look at the relationship between group membership (DV) and the PVs using a regression equation: Depression-hat = a + b1·hours of sleep + b2·blood pressure + b3·calories consumed. Code depression: 0 for no, 1 for yes. If Y-hat > 0.5, predict that the subject has depression. Look at: Sensitivity: percent of depressed individuals found. Selectivity: percent of positives which are correct.

Discriminant Function Analysis / Logistic Regression. Four possible outcomes for each prediction (Y-hat):

            Y-hat < 0.5          Y-hat > 0.5
Y = 0       Correct Rejection    False Positive
Y = 1       Miss                 Hit

Sensitivity = % hits among the actual positives (Y = 1). Selectivity = % correct hits among the predicted positives (Y-hat > 0.5).
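The 0/1-coding-plus-threshold procedure and the resulting sensitivity/selectivity counts can be sketched as follows. The depression/sleep data are simulated and the variable names are invented for illustration; real logistic regression would fit a logit model rather than a straight line, but the thresholding logic is the same.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
sleep = rng.normal(7, 1.5, n)                 # hypothetical hours of sleep
y = (sleep < 6.5).astype(float)               # 1 = depressed (toy rule)
flip = rng.random(n) < 0.1                    # 10% label noise
y = np.where(flip, 1 - y, y)

# Regress the 0/1 group code on the predictor; call Y-hat > 0.5 a positive.
b, a = np.polyfit(sleep, y, 1)
y_hat = a + b * sleep
pred = (y_hat > 0.5).astype(float)

hits      = np.sum((pred == 1) & (y == 1))    # predicted 1, truly 1
misses    = np.sum((pred == 0) & (y == 1))    # predicted 0, truly 1
false_pos = np.sum((pred == 1) & (y == 0))    # predicted 1, truly 0

sensitivity = hits / (hits + misses)          # % of depressed found
selectivity = hits / (hits + false_pos)       # % of positives that are correct
print(f"sensitivity = {sensitivity:.2f}, selectivity = {selectivity:.2f}")
```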

Discriminant Function Analysis / Logistic Regression. Expect shrinkage. Double cross-validation: 1. Split the sample in half. 2. Construct a regression equation for each half. 3. Use each regression equation to predict the other sample's DV; look at sensitivity and selectivity (if the DV is continuous, look at the correlation between Y and Y-hat). If the IVs are valid predictors, both equations should be good. 4. Construct a new regression equation using the combined samples.

Discriminant Function Analysis / Logistic Regression. Can have more than two groups, if they are related quantitatively. E.g.: Mania = 1, Normal = 0, Depression = −1.

Which Procedure to Use? Logistic regression produces a more efficient regression equation than does discriminant function analysis: greater sensitivity and selectivity.