Chapter 9 Correlation. 2 Chapter 9 Correlation Major Points The problemThe problem ScatterplotsScatterplots An exampleAn example The correlation coefficientThe.

Slides:



Advertisements
Similar presentations
Correlation Oh yeah!.
Advertisements

Copyright © 2010 Pearson Education, Inc. Slide
Inference for Regression
Review ? ? ? I am examining differences in the mean between groups
Correlation. The Problem Are two variables related?Are two variables related? XDoes one increase as the other increases? e. g. skills and incomee. g.
Describing Relationships Using Correlation and Regression
Chapter 15 (Ch. 13 in 2nd Can.) Association Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression.
Cal State Northridge  320 Andrew Ainsworth PhD Regression.
PSY 307 – Statistics for the Behavioral Sciences
Major Points Scatterplots The correlation coefficient –Correlations on ranks Factors affecting correlations Testing for significance Intercorrelation matrices.
Basic Statistical Concepts Psych 231: Research Methods in Psychology.
Lecture 4: Correlation and Regression Laura McAvinue School of Psychology Trinity College Dublin.
The Simple Regression Model
Basic Statistical Concepts
Statistics Psych 231: Research Methods in Psychology.
Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 6: Correlation.
Cal State Northridge  320 Andrew Ainsworth PhD Correlation.
Basic Statistical Concepts Part II Psych 231: Research Methods in Psychology.
Correlation 1. Correlation - degree to which variables are associated or covary. (Changes in the value of one tends to be associated with changes in the.
Correlation and Regression Analysis
Linear Regression 2 Sociology 5811 Lecture 21 Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
Cal State Northridge 427 Ainsworth
Chapter 9 For Explaining Psychological Statistics, 4th ed. by B. Cohen 1 What is a Perfect Positive Linear Correlation? –It occurs when everyone has the.
8/10/2015Slide 1 The relationship between two quantitative variables is pictured with a scatterplot. The dependent variable is plotted on the vertical.
Correlation and Linear Regression
Lecture 16 Correlation and Coefficient of Correlation
Understanding Research Results
This Week: Testing relationships between two metric variables: Correlation Testing relationships between two nominal variables: Chi-Squared.
Covariance and correlation
Correlation.
Chapter 15 Correlation and Regression
Inferences for Regression
Copyright © 2010 Pearson Education, Inc. Chapter 6 The Standard Deviation as a Ruler and the Normal Model.
Statistics 11 Correlations Definitions: A correlation is measure of association between two quantitative variables with respect to a single individual.
Experimental Research Methods in Language Learning Chapter 11 Correlational Analysis.
Hypothesis of Association: Correlation
Research & Statistics Looking for Conclusions. Statistics Mathematics is used to organize, summarize, and interpret mathematical data 2 types of statistics.
Figure 15-3 (p. 512) Examples of positive and negative relationships. (a) Beer sales are positively related to temperature. (b) Coffee sales are negatively.
When trying to explain some of the patterns you have observed in your species and community data, it sometimes helps to have a look at relationships between.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 6 The Standard Deviation as a Ruler and the Normal Model.
Objectives (IPS Chapter 2.1)
Copyright © 2009 Pearson Education, Inc. Chapter 6 The Standard Deviation as a Ruler and the Normal Model.
Examining Relationships in Quantitative Research
Psych 230 Psychological Measurement and Statistics Pedro Wolf September 23, 2009.
TYPES OF STATISTICAL METHODS USED IN PSYCHOLOGY Statistics.
Correlation Analysis. Correlation Analysis: Introduction Management questions frequently revolve around the study of relationships between two or more.
By: Amani Albraikan.  Pearson r  Spearman rho  Linearity  Range restrictions  Outliers  Beware of spurious correlations….take care in interpretation.
Correlation & Regression Chapter 15. Correlation It is a statistical technique that is used to measure and describe a relationship between two variables.
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
Agresti/Franklin Statistics, 1 of 88 Chapter 11 Analyzing Association Between Quantitative Variables: Regression Analysis Learn…. To use regression analysis.
Chapter Eight: Using Statistics to Answer Questions.
Data Analysis.
Modern Languages Row A Row B Row C Row D Row E Row F Row G Row H Row J Row K Row L Row M
Chapter 16: Correlation. So far… We’ve focused on hypothesis testing Is the relationship we observe between x and y in our sample true generally (i.e.
Chapter 15 The Chi-Square Statistic: Tests for Goodness of Fit and Independence PowerPoint Lecture Slides Essentials of Statistics for the Behavioral.
Chapter 7 Calculation of Pearson Coefficient of Correlation, r and testing its significance.
Linear Regression and Correlation Chapter GOALS 1. Understand and interpret the terms dependent and independent variable. 2. Calculate and interpret.
Organizing and Analyzing Data. Types of statistical analysis DESCRIPTIVE STATISTICS: Organizes data measures of central tendency mean, median, mode measures.
Copyright © 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 6 The Standard Deviation as a Ruler and the Normal Model.
Lecturer’s desk Physics- atmospheric Sciences (PAS) - Room 201 s c r e e n Row A Row B Row C Row D Row E Row F Row G Row H Row A
©The McGraw-Hill Companies, Inc. 2008McGraw-Hill/Irwin Linear Regression and Correlation Chapter 13.
Data Analysis. Qualitative vs. Quantitative Data collection methods can be roughly divided into two groups. It is essential to understand the difference.
Inferences for Regression
Inference for Regression
Chapter 15: Correlation.
Fundamental Statistics for the Behavioral Sciences, 4th edition
Inferential Statistics
Chapter Nine: Using Statistics to Answer Questions
Inferences for Regression
Presentation transcript:

Chapter 9 Correlation

2 Chapter 9 Correlation Major Points The problemThe problem ScatterplotsScatterplots An exampleAn example The correlation coefficientThe correlation coefficient –Correlations on ranks Factors affecting correlationsFactors affecting correlations Cont.

3 Chapter 9 Correlation Major Points--cont. Testing for significanceTesting for significance Intercorrelation matricesIntercorrelation matrices Other kinds of correlationsOther kinds of correlations Review questionsReview questions

4 Chapter 9 Correlation The Problem Are two variables related?Are two variables related? –Does one increase as the other increases? e. g. skills and incomee. g. skills and income –Does one decrease as the other increases? e. g. health problems and nutritione. g. health problems and nutrition How can we get a numerical measure of the degree of relationship? Remember though, that correlation does not equal causation!!!!How can we get a numerical measure of the degree of relationship? Remember though, that correlation does not equal causation!!!! Up to this point we have been comparing means on one variable. Now we are looking at the relationship between 2 or more variables.Up to this point we have been comparing means on one variable. Now we are looking at the relationship between 2 or more variables.

5 Chapter 9 CorrelationScatterplots Examples from text pg 172Examples from text pg 172 –See next three slides Every point on the plot represents a subjects scores on 2 variablesEvery point on the plot represents a subjects scores on 2 variables Typically the “predictor” goes on the abcissa or X axis, the “criterion” or outcome variable goes on the ordinate, or Y axis. Sometimes can’t distinguishTypically the “predictor” goes on the abcissa or X axis, the “criterion” or outcome variable goes on the ordinate, or Y axis. Sometimes can’t distinguish

6 Chapter 9 Correlation Each point represents two scores for a country, # physicians and mortality As the number of physicians rises per 100,000 population, so does the infant mortality rate. Notice that previously we plotted a score value and its frequency. Now EACH individual is plotted with his/her two respective scores, on two variables.

7 Chapter 9 Correlation How strongly are these related? Not strongly at all. There is virtually no relationship, the line is almost flat (horizontal). So we could say that spending more on healthcare doesn’t seem to be related to greater life expectancy, likewise living longer not related to increased health care costs for men. But what if we looked at this association for men age 20-75? Differences? Outliers in this chart?

8 Chapter 9 Correlation What is the trend here? As the amount of solar radiation increases (exposure to sunshine), what happens to the breast cancer rate? This is a negative association, and appears to be fairly strong.

9 Chapter 9 Correlation Heart Disease and Cigarettes Landwehr & Watkins report data on heart disease and cigarette smoking in 21 developed countriesLandwehr & Watkins report data on heart disease and cigarette smoking in 21 developed countries Data have been rounded to whole numbers for computational convenience.Data have been rounded to whole numbers for computational convenience. –The results were not affected.

10 Chapter 9 Correlation The Data Surprisingly, the U.S. it the first country on the list--the country with the highest consumption and highest mortality. What is the unit of analysis here? Each country.

11 Chapter 9 Correlation Scatterplot of Heart Disease CHD Mortality goes on ordinate (y axis)CHD Mortality goes on ordinate (y axis) –Why? Cigarette consumption on abscissa (X axis)Cigarette consumption on abscissa (X axis) –Why? What does each dot represent?What does each dot represent? Best fitting line included for clarityBest fitting line included for clarity

12 Chapter 9 Correlation {X = 6, Y = 11}

13 Chapter 9 Correlation What Does the Scatterplot Show? As smoking increases, so does coronary heart disease mortality.As smoking increases, so does coronary heart disease mortality. Relationship looks strongRelationship looks strong Not all data points on line.Not all data points on line. –This gives us “residuals” or “errors of prediction” To be discussed laterTo be discussed later

14 Chapter 9 Correlation Correlation Coefficient A measure of degree (strength) of relationship. Most common is Pearson rA measure of degree (strength) of relationship. Most common is Pearson r Sign refers to direction. (+ or - )Sign refers to direction. (+ or - ) Based on covarianceBased on covariance –Measure of degree to which large scores go with large scores, and small scores with small scores –As scores on variable increase, scores on the other variable tend to….

15 Chapter 9 CorrelationCovariance The formulaThe formula Basically tells if they tend to vary together, when x is above mean, does y tend to be above the mean? X below mean, is y below mean?Basically tells if they tend to vary together, when x is above mean, does y tend to be above the mean? X below mean, is y below mean? When would cov XY be large and positive?When would cov XY be large and positive? When would cov XY be large and negative?When would cov XY be large and negative?

16 Chapter 9 Correlation Correlation Coefficient Symbolized by Pearson rSymbolized by Pearson r Covariance is a function of the standard deviations, so we standardize the covariance by dividing by SD’s (just like we did with z’s, to get rid of original units)Covariance is a function of the standard deviations, so we standardize the covariance by dividing by SD’s (just like we did with z’s, to get rid of original units) Covariance ÷ (product of st. dev.)Covariance ÷ (product of st. dev.) Always a value between -1 and 1Always a value between -1 and 1

17 Chapter 9 Correlation Calculation: Heart Disease and Smoking Cov XY = 11.13, not meaningfulCov XY = 11.13, not meaningful s X = 2.33s X = 2.33 s Y = 6.69s Y = 6.69 Remember that the covariance divided by the standard deviations cannot be greater that 1.0, or less than –1.00.

18 Chapter 9 CorrelationCorrelation--cont. Correlation =.71Correlation =.71 Sign is positiveSign is positive –Why? If sign were negativeIf sign were negative –What would it mean? –Would not alter the degree of relationship. Interpretation: correlations of +/-Interpretation: correlations of +/ weak moderate, +.65 strong strong correlations were defined as an absolute value of r ranging from.65 to 1.00, moderate correlations were defined as an absolute value of r ranging from.30 to.65, and weak correlations were defined as an absolute value of r ranging from.00 to.30.

19 Chapter 9 Correlation Factors Affecting r Range restrictionsRange restrictions –See next slide Data only for countries with low consumption, greatly reduces correlationData only for countries with low consumption, greatly reduces correlation NonlinearityNonlinearity –e.g. age and size of vocabulary have a curvilinear association, not a straight line Homogeneity of variance or homoskedasticity: equal spread of data points at each point on the plot, assumed (football or loaf of french bread around line of best fit). Heteroskedasticity: uneven spread/variability (indicated by fan shaped or bowtie shaped spreadHomogeneity of variance or homoskedasticity: equal spread of data points at each point on the plot, assumed (football or loaf of french bread around line of best fit). Heteroskedasticity: uneven spread/variability (indicated by fan shaped or bowtie shaped spread Heterogeneous subsamplesHeterogeneous subsamples –Examples: ex. male and female height and weight, p 187. Different correlations for 2 groups, Combining the data may inflate or reduce the correlation Subsamples: data could be divided into two distinct sets on the basis of some other variable.

20 Chapter 9 Correlation Countries With Low Consumptions Data With Restricted Range Truncated at 5 Cigarettes Per Day Cigarette Consumption per Adult per Day CHD Mortality per 10, What did this data look like when the range was broader?

21 Chapter 9 Correlation Testing r: Statistical significance of the correlation Population parameter = Population parameter =  Null hypothesis H 0 :  = 0, correlation in population is zeroNull hypothesis H 0 :  = 0, correlation in population is zero –Test of linear independence (no relation) –What would a true null mean here? –What would a false null mean here? Alternative hypothesis (H 1 )   0Alternative hypothesis (H 1 )   0 –Two-tailed: check if r > 0 and r 0 and r < 0

22 Chapter 9 Correlation Tables of Significance We could look up the critical r, how large r needs to be to reject null, Table in Appendix D.2We could look up the critical r, how large r needs to be to reject null, Table in Appendix D.2 For N - 2 = 19 df, r crit =.433For N - 2 = 19 df, r crit =.433 Our correlation in smoking study.71 >.433Our correlation in smoking study.71 >.433 Reject H 0Reject H 0 –Correlation is significant. –More cigarette consumption associated with more CHD mortality.

23 Chapter 9 Correlation Computer Printout SPSS intercorrelation matrix Correlations **.264** ** ** Pearson Correlation Sig. (1-tailed) N Pearson Correlation Sig. (1-tailed) N Pearson Correlation Sig. (1-tailed) N mother init jointattn14 baby init jointattn14 Social play scale mother init jointattn14 baby init jointattn14Social play scale Correlation is significant at the 0.01 level (1-tailed). **. Look at pattern of correlations for a set of variables, diagonal always 1.0 Interpret strength and direction of correlations Sig or p value indicated on line 1 with**, line 2 with actual p value

24 Chapter 9 Correlation Checking the assumptions Linearity: scatterplot, line of best fit should be pretty good fit for dataLinearity: scatterplot, line of best fit should be pretty good fit for data Normally distributed variables: histograms, mean+/- 2 ruleNormally distributed variables: histograms, mean+/- 2 rule Homoskedasticty: scatterplot should have football shape or loaf of french breadHomoskedasticty: scatterplot should have football shape or loaf of french bread No Heterogenous subsamples: scatterplot not have 2 distinct patternsNo Heterogenous subsamples: scatterplot not have 2 distinct patterns No Restricted range: descriptive stats should look normal, not skewed, good rangeNo Restricted range: descriptive stats should look normal, not skewed, good range Random sample usedRandom sample used No Outliers: scatterplot should not have any points that are far away from data spreadNo Outliers: scatterplot should not have any points that are far away from data spread

25 Chapter 9 Correlation Checking scatterplot

26 Chapter 9 Correlation Review Questions What determines what goes on which axis of a scatterplot?What determines what goes on which axis of a scatterplot? What would a correlation of 0 tell us about the relationship between lab grades and exam grades?What would a correlation of 0 tell us about the relationship between lab grades and exam grades? What factors might affect the relationship between smoking and CHD Mortality?What factors might affect the relationship between smoking and CHD Mortality?

27 Chapter 9 Correlation Review Questions--cont. Indicate level (high, med., or low) and sign of the correlation for:Indicate level (high, med., or low) and sign of the correlation for: –number of guns in community and number firearm deaths –robberies and incidence of drug abuse –In incidence of protected sex and incidence of AIDS –community education level and crime rate –Frequency of solar flares (bursts of light) and suicide Why would the size of the correlation required for significance (critical r) decrease as N increases? Cont.