Correlations: testing linear relationships between two metric variables Lecture 18:

Presentation transcript:

Correlations: testing linear relationships between two metric variables Lecture 18:

Agenda
- Brief update on data for the final
- Correlations

Some Data Considerations for Final Exam:
- Two metric variables: Correlation
- One binary categorical variable, one metric variable: T-test
- Two categorical variables: Chi-Square (crosstabulations)
- Polytomous categorical IV, metric dependent variable: ANOVA
- Metric dependent variable, multiple IVs (categorical or metric): Linear Regression

Checking for simple linear relationships
- Pearson's correlation coefficient measures the extent to which two metric or interval-type variables are linearly related.
- The statistic is Pearson's r, also called the linear or product-moment correlation.
- Equivalently, the correlation coefficient is the average of the cross products of the corresponding z-scores.
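The z-score definition above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `pearson_r` and the small `hours`/`scores` data set are made up for the example, and the population SD (dividing by n) is used so that a plain average of the cross products gives r.

```python
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Pearson's r as the average cross product of z-scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)  # population SD (divide by n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has no spread")
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    return mean([a * b for a, b in zip(z_x, z_y)])


# Hypothetical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
print(round(pearson_r(hours, scores), 3))  # about 0.775
```

Dividing by n-1 in both the SDs and the average would give the same value of r, since the factors cancel.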

Scatterplots!
Three ways to summarize this data:
- Point of averages
- Horizontal SD
- Vertical SD

The correlation coefficient
r = average of (x in standard units) × (y in standard units)
The correlation coefficient is a pure number without units. Thus, it is not affected by:
- Change in scale
- Interchanging the variables
- Adding the same number to all values of one of the variables
- Multiplying all values of one variable by a positive number
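These invariance properties are easy to check numerically. The sketch below uses invented temperature and sales figures (the names `temps_c`, `sales`, and `corr` are assumptions for the example) and converts Celsius to Fahrenheit, which both multiplies by a positive number and adds a constant.

```python
from math import sqrt


def corr(xs, ys):
    """Pearson's r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, invented for illustration only.
temps_c = [10.0, 15.0, 20.0, 25.0, 30.0]
sales = [12.0, 18.0, 21.0, 30.0, 33.0]

temps_f = [c * 9 / 5 + 32 for c in temps_c]  # rescale and shift

r_original = corr(temps_c, sales)
r_rescaled = corr(temps_f, sales)    # same r after a change of units
r_swapped = corr(sales, temps_c)     # same r with variables interchanged
r_negated = corr([-c for c in temps_c], sales)  # negative multiplier flips the sign

print(r_original, r_rescaled, r_swapped, r_negated)
```

The last line shows why the slide says "positive number": multiplying by a negative number keeps the magnitude but reverses the sign.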

Sidenote: Why N-1?
If we sample at random from a population, extreme cases (i.e., the tails) are less likely to be selected than common cases (i.e., within 1 SD of the mean). One result of this: the sample variance tends to be lower than the actual population variance. Dividing by n-1 instead of n corrects this bias when calculating sample statistics.
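A small simulation can make the bias visible. This is a sketch under stated assumptions: samples of size 5 are drawn from a standard normal population (true variance 1), and the function name, seed, and trial count are arbitrary choices for the example.

```python
import random


def average_variance_estimates(sample_size=5, trials=20000, seed=18):
    """Average the divide-by-n and divide-by-(n-1) variance over many samples."""
    rng = random.Random(seed)
    total_n = 0.0
    total_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        sample_mean = sum(sample) / sample_size
        squared_devs = sum((x - sample_mean) ** 2 for x in sample)
        total_n += squared_devs / sample_size
        total_n_minus_1 += squared_devs / (sample_size - 1)
    return total_n / trials, total_n_minus_1 / trials


biased, corrected = average_variance_estimates()
print(f"divide by n:   {biased:.3f}")     # noticeably below the true variance of 1
print(f"divide by n-1: {corrected:.3f}")  # close to the true variance of 1
```

With n = 5 the divide-by-n estimate averages about (n-1)/n = 0.8 of the true variance, which is exactly the factor that dividing by n-1 undoes.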

Correlations
- r ranges from -1 to 1; its magnitude ranges from zero to 1, where 1 = perfect linear relationship between the two variables.
- The sign gives the direction: negative relations (r < 0) and positive relations (r > 0).
- The correlation coefficient is a measure of clustering around a single line, relative to the standard deviations.
- Remember: correlation ONLY measures linear relationships, not all relationships. This is why the scatterplot is essential; you can get a coefficient even if there is not a good linear fit.
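As a quick illustration of why the scatterplot matters, the sketch below uses an invented, perfectly deterministic curve, y = x squared. Over a symmetric range of x the correlation is zero even though y is completely determined by x; over only the positive half, r is high even though the relationship is still a curve. The function name `corr` is an assumption for the example.

```python
from math import sqrt


def corr(xs, ys):
    """Pearson's r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Perfect U-shaped relationship: r misses it entirely.
xs_full = [-3, -2, -1, 0, 1, 2, 3]
r_full = corr(xs_full, [x ** 2 for x in xs_full])

# Same curve, positive half only: r is high although the fit is not linear.
xs_half = [0, 1, 2, 3]
r_half = corr(xs_half, [x ** 2 for x in xs_half])

print(r_full, r_half)
```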

Interpretation
- Correlation is a proportional measure; it does not depend on the specific units of measurement.
- Correlation interpretation:
  - Direction (+/-)
  - Magnitude of effect (-1 to 1); shown as r
  - Statistical significance (p<.05, p<.01, p<.001)

Correlation vs. Causality
Recall that correlation is a precondition for causality, but by itself it is not sufficient to show causality.
- Fat in the diet causes cancer? (But fat and sugar are relatively expensive, and reduce grain consumption as a trade-off...)
- Education and unemployment during the Great Depression? (It turns out age is a confounding variable: education was going up for the young, and employers seemed to prefer younger job seekers.)
http://xkcd.com/552/

Factors which limit the correlation coefficient
- Homogeneity of sample group
- Non-linear relationships
- Censored or limited scales
- Unreliable measurement instrument
- Outliers

Homogeneous Groups
A limited number of education groups will give you a correlation, but how accurate is it?

Homogeneous Groups: Adding Groups

Homogeneous Groups: Adding More Groups
- The point is NOT that the line changes, because it doesn't: it's still a positive correlation every time.
- The issue is the overall spread, and whether the homogeneous groups are linear in the same way.

Separate Groups: non-homogeneous but similar linear relationship

Separate Groups: non-homogeneous and different linear relationship

Non-Linear Relationships

Censored or Limited Scales…

Censored or Limited Scales

Unreliable Instrument

Unreliable Instrument

Unreliable Instrument

Outliers

Outliers Outlier
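A single extreme point can dominate r. The sketch below uses invented data (six weakly related points plus one hypothetical outlier at (20, 20)); the names `corr`, `base_x`, and `base_y` are assumptions for the example.

```python
from math import sqrt


def corr(xs, ys):
    """Pearson's r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: a weak relationship among six ordinary points.
base_x = [1, 2, 3, 4, 5, 6]
base_y = [2, 1, 3, 2, 3, 2]
r_without = corr(base_x, base_y)

# Add one extreme point far from the rest of the cloud.
r_with = corr(base_x + [20], base_y + [20])

print(round(r_without, 2), round(r_with, 2))
```

Here the outlier inflates the coefficient; an outlier placed off the trend can just as easily shrink or reverse it, which is another reason to look at the scatterplot first.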

Ecological Correlations

Ecological Correlations
- Ecological correlations overstate the association because they eliminate the variance within the units of analysis (e.g., the classroom).
- Freedman et al. give Sociology and Poli Sci a hard time for this, which is a valid critique: when the unit of analysis is a summary statistic, you expect to lose variability.
(Figure: Average by Classroom)
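The effect of averaging can be sketched with a toy data set. The three classrooms and their values below are invented for illustration; within each classroom the individual points scatter, but the classroom averages fall exactly on a line.

```python
from math import sqrt


def corr(xs, ys):
    """Pearson's r from sums of squared and cross deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical classrooms: lists of (x, y) pairs for individual students.
classrooms = {
    "A": [(1, 3), (2, 1), (3, 2)],
    "B": [(4, 6), (5, 4), (6, 5)],
    "C": [(7, 9), (8, 7), (9, 8)],
}

# Individual-level correlation: every student is a data point.
all_points = [pt for pts in classrooms.values() for pt in pts]
r_individual = corr([x for x, _ in all_points], [y for _, y in all_points])

# Ecological correlation: each classroom is reduced to its averages.
avg_x = [sum(x for x, _ in pts) / len(pts) for pts in classrooms.values()]
avg_y = [sum(y for _, y in pts) / len(pts) for pts in classrooms.values()]
r_classroom = corr(avg_x, avg_y)

print(round(r_individual, 2), round(r_classroom, 2))  # 0.85 versus 1.0
```

Averaging removes the within-classroom scatter, so the correlation of the averages is stronger than the correlation among individuals.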

Correlation: Null and Alt Hypotheses
- Null versus alternative hypothesis: H0 versus H1, H2, etc.
- Test statistics and significance level
  - Test statistic: calculated from the data; has a known probability distribution
  - Significance level: usually reported as a p-value (the probability that a result at least this extreme would occur if the null hypothesis were true)

Example output (correlation of price and mpg, with the p-value beneath the coefficient):

              price      mpg
    price    1.0000
    mpg     -0.4686   1.0000
             0.0000
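One common test statistic for H0: "no linear relationship" is t = r * sqrt((n - 2) / (1 - r^2)), which follows a t distribution with n - 2 degrees of freedom when the null hypothesis holds. The sketch below applies it to the r = -0.4686 shown in the output. The sample size is not given on the slide, so n = 74 is purely an assumption for illustration, and the p-value uses a large-sample normal approximation rather than the exact t distribution.

```python
from math import sqrt
from statistics import NormalDist


def correlation_t_statistic(r, n):
    """t statistic for testing H0: the population correlation is zero."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt((n - 2) / (1 - r * r))


def approximate_two_sided_p(t):
    """Two-sided p-value from a normal approximation (reasonable for large n only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


r = -0.4686  # coefficient from the price/mpg output
n = 74       # ASSUMED sample size; the slide does not state it

t = correlation_t_statistic(r, n)
p = approximate_two_sided_p(t)
print(f"t = {t:.2f} on {n - 2} df, approximate p = {p:.6f}")
```

Under that assumed n, the p-value is far below .001, which is consistent with the 0.0000 printed beneath the coefficient.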