Association between 2 variables


Correlation: Association between 2 variables

Suppose we wished to graph the relationship between foot length and height of 20 subjects. In order to create the graph, which is called a scatterplot or scattergram, we need the foot length and height for each of our subjects.

[Scatterplot axes: Foot Length on the x-axis (4 to 14 inches), Height on the y-axis (58 to 74 inches)]

Assume our first subject had a 12 inch foot and was 70 inches tall. 1. Find 12 inches on the x-axis. 2. Find 70 inches on the y-axis. 3. Locate the intersection of 12 and 70. 4. Place a dot at the intersection of 12 and 70.

Assume that our second subject had an 8 inch foot and was 62 inches tall. 5. Find 8 inches on the x-axis. 6. Find 62 inches on the y-axis. 7. Locate the intersection of 8 and 62. 8. Place a dot at the intersection of 8 and 62. 9. Continue to plot points for each pair of scores.

Notice how the scores cluster to form a pattern. The more closely they cluster to a line that is drawn through them, the stronger the linear relationship between the two variables is (in this case foot length and height).

If the points on the scatterplot have an upward movement from left to right, we say the relationship between the variables is positive. If the points on the scatterplot have a downward movement from left to right, we say the relationship between the variables is negative.

A positive relationship means that high scores on one variable are associated with high scores on the other variable. It also indicates that low scores on one variable are associated with low scores on the other variable.

A negative relationship means that high scores on one variable are associated with low scores on the other variable. It also indicates that low scores on one variable are associated with high scores on the other variable.

Not only do relationships have direction (positive and negative), they also have strength (from 0.00 to 1.00 and from 0.00 to –1.00). The more closely the points cluster toward a straight line, the stronger the relationship is.

A set of scores with r = –0.60 has the same strength as a set of scores with r = 0.60 because both sets cluster similarly.
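The point about sign versus strength can be checked numerically. The sketch below uses made-up foot lengths and heights (not the class data) and a small helper, `pearson_r`, written for this illustration; it mirrors the heights so the cloud slopes downward and compares the two coefficients.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical foot lengths and heights (inches), invented for illustration.
foot = [8, 9, 10, 10, 11, 12, 12, 13]
height = [62, 64, 66, 65, 69, 70, 72, 73]

r_up = pearson_r(foot, height)

# Mirror the heights top to bottom: the points cluster just as tightly,
# but the pattern now moves downward from left to right.
flipped = [max(height) + min(height) - h for h in height]
r_down = pearson_r(foot, flipped)

print(round(r_up, 3), round(r_down, 3))
```

The two values differ only in sign, which is why strength is judged by the absolute value of r.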

For this procedure, we use Pearson’s r (also known as a Pearson Product Moment Correlation Coefficient). This statistical procedure can only be used when BOTH variables are measured on a continuous scale and you wish to measure a linear relationship.

[Figure panels: Linear Relationship; Curvilinear Relationship (NO Pearson r)]

Formula for correlations
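A common textbook way to write Pearson's r is the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The sketch below assumes that standard definition (not necessarily the exact form shown on the slide) and also the equivalent raw-score "computational" form, then checks that the two agree. The first two data pairs are the subjects from the scatterplot slides; the remaining pairs are invented for illustration.

```python
from math import sqrt

def pearson_r_deviation(xs, ys):
    """r = sum((x - mean_x)(y - mean_y)) / sqrt(sum((x - mean_x)^2) * sum((y - mean_y)^2))"""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def pearson_r_raw(xs, ys):
    """Raw-score form: (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))"""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator

# (12, 70) and (8, 62) come from the slides; the other pairs are hypothetical.
foot = [12, 8, 10, 9, 11, 13]
height = [70, 62, 66, 65, 68, 73]

r_dev = pearson_r_deviation(foot, height)
r_raw = pearson_r_raw(foot, height)
print(round(r_dev, 4), round(r_raw, 4))
```

The deviation form is easier to read; the raw-score form is the one usually used for hand calculation.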

Assumptions of the PMCC: The measures are approximately normally distributed. The variance of the two measures is similar (homoscedasticity) -- check with scatterplot. The relationship is linear -- check with scatterplot. The sample represents the population. The variables are measured on an interval or ratio scale.

Example: We’ll use data from the class questionnaire in 2005 to see if a relationship exists between the number of times per week respondents eat fast food and their weight. What’s your guess (hypothesis) about how the results of this test will turn out? .5? .8? ???

Example To get a correlation coefficient: Slide the variables over...

Example SPSS output The red is our correlation coefficient. The blue is our level of significance resulting from the test…what does that mean?

Digression - Hypotheses: Many research designs involve statistical tests, which involve accepting or rejecting a hypothesis. Null (statistical) hypotheses assume no relationship between two or more variables. Statistics are used to test null hypotheses. E.g. we assume that there is no relationship between weight and fast food consumption until we find statistical evidence that there is.

Probability: Probability is the likelihood that a certain event will occur. In research, we deal with the likelihood that patterns in data have emerged by chance vs. being representative of a real relationship. Alpha (α) is the probability level (or significance level) set, in advance, by the researcher as the acceptable risk that a finding occurs by chance.

Probability: Alpha levels (cont.) E.g. α = .05 means that, when no real relationship exists, there is a 5% chance of obtaining a significant finding by chance alone. The lower the α the better, but the α level must be set in advance.

Probability: Most statistical tests produce a p-value that is then compared to the α-level to accept or reject the null hypothesis. E.g. the researcher sets the significance level at .05 a priori; test results show p = .02. The researcher can then reject the null hypothesis and conclude that the result is unlikely to be due to chance alone and instead reflects a real relationship in the data. How about p = .051, when the α-level = .05?
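The comparison of a p-value with a preset α can be written as a one-line rule. The sketch below assumes the common convention of rejecting only when p is strictly less than α (some texts use "less than or equal to"); the function name `decide` and the wording "fail to reject" (the slides say "accept") are choices made for this illustration.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a p-value at a significance level set in advance."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Assumed convention: reject only when p is strictly below alpha.
    return "reject null" if p_value < alpha else "fail to reject null"

decision_02 = decide(0.02)    # the slide's first case
decision_051 = decide(0.051)  # the slide's borderline case
print(decision_02, "|", decision_051)
```

Under this rule p = .051 does not clear α = .05, however close it is; that is the reason α has to be fixed before the data are examined.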

Error: Significance levels (e.g. α = .05) are set in order to limit error. Type I error = rejection of the null hypothesis when it was actually true. Conclusion = relationship; there wasn’t one (false positive); its probability = α. Type II error = acceptance of the null hypothesis when it was actually false. Conclusion = no relationship; there was one (false negative).

Error – Truth Table

          Null True          Null False
Accept    correct decision   Type II error
Reject    Type I error       correct decision
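The link between α and the Type I error rate can be illustrated by simulation. The sketch below draws two unrelated variables many times (so the null hypothesis is true by construction), tests each sample at α = .05, and counts how often the null is wrongly rejected. The p-value uses the Fisher z approximation rather than the exact t-test that SPSS reports; that substitution, the sample size, and the number of trials are all assumptions made to keep the example dependency-free.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Two-sided p-value for 'no correlation' via the Fisher z approximation."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=1):
    """Fraction of null-true samples in which the null is (wrongly) rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]  # independent of xs, so the null is true
        if approx_p_value(pearson_r(xs, ys), n) < alpha:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(rate)
```

The printed fraction should land near .05: about one null-true sample in twenty produces a "significant" correlation purely by chance.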

Back to Our Example. Conclusion: No relationship exists between weight and fast food consumption with this group of respondents.

Really? Conclusion: No relationship exists between weight and fast food consumption with this group of subjects. Do you believe this? Can you critique it? Construct validity? External validity? Thinking in this fashion will help you adopt a critical stance when reading research.

Another Example: Now let’s see if a relationship exists between weight and the number of piercings a person has. What’s your guess (hypothesis) about how the results of this test will turn out? It’s fine to guess, but remember that our null hypothesis is that no relationship exists, until the data shows otherwise.

Another Example (continued): What can we conclude from this test? Does this mean that weight causes piercings, or vice versa, or what?

Correlations and causality: Correlations only describe the relationship; they do not prove cause and effect. Correlation is a necessary, but not sufficient, condition for determining causality. There are three requirements to infer a causal relationship.

Correlations and causality: 1. A statistically significant relationship between the variables. 2. The causal variable occurred prior to the other variable. 3. There are no other factors that could account for the relationship. Correlation studies do not meet the last requirement and may not meet the second requirement (go back to internal validity – 497).

Correlations and causality: If there is a relationship between weight and # piercings it could be because: weight → # piercings; weight ← # piercings; weight ← some other factor → # piercings. Which do you think is most likely here?
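The "some other factor" case can be simulated. In the sketch below neither variable influences the other; both are driven by a hidden third factor plus independent noise. All values are randomly generated for illustration and have nothing to do with the class data on weight and piercings.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)
n = 2000

# Hidden third factor that drives both observed variables.
factor = [rng.gauss(0, 1) for _ in range(n)]
var_a = [f + rng.gauss(0, 1) for f in factor]  # never depends on var_b
var_b = [f + rng.gauss(0, 1) for f in factor]  # never depends on var_a

r_observed = pearson_r(var_a, var_b)

# Remove the shared factor from each variable and correlate what is left.
rest_a = [a - f for a, f in zip(var_a, factor)]
rest_b = [b - f for b, f in zip(var_b, factor)]
r_without_factor = pearson_r(rest_a, rest_b)

print(round(r_observed, 2), round(r_without_factor, 2))
```

The observed correlation is substantial even though neither variable causes the other, and it largely disappears once the shared factor is taken out. This is why the third requirement for causality cannot be met by a correlation study alone.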

Other Types of Correlations: Other measures of correlation between two variables: Point-biserial correlation (PBC): use when one of the two variables is dichotomous. The formula for computing a PBC is actually just a mathematical simplification of the formula used to compute Pearson’s r, so to compute a PBC in SPSS, just compute r and the result is the same.
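The claim that the point-biserial correlation is just Pearson's r applied to a 0/1 variable can be verified directly. The sketch below assumes the usual textbook point-biserial formula (difference between the two group means, divided by the population standard deviation of all scores, times the square root of the product of the group proportions); the groups and scores are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def point_biserial(groups, scores):
    """Point-biserial correlation; groups must contain only 0s and 1s."""
    n = len(scores)
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    mean_1 = sum(ones) / len(ones)
    mean_0 = sum(zeros) / len(zeros)
    mean_all = sum(scores) / n
    # Population standard deviation (divide by n, not n - 1).
    sd_all = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    return (mean_1 - mean_0) / sd_all * sqrt(len(ones) * len(zeros) / n ** 2)

# Hypothetical dichotomous variable (0/1) and continuous scores.
group = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
score = [140, 155, 150, 165, 160, 175, 170, 185, 180, 190]

r_pb = point_biserial(group, score)
r = pearson_r(group, score)
print(round(r_pb, 4), round(r, 4))
```

Both routes give the same number, which is why no separate procedure is needed.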

Other Types of Correlations: Other measures of correlation between two variables (cont.): Spearman rho correlation: use with ordinal (rank) data. It is computed in SPSS the same way as Pearson’s r; simply toggle the Spearman button on the Bivariate Correlations window.
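Spearman's rho is commonly computed as Pearson's r on the ranks of the scores rather than on the scores themselves. The sketch below assumes that definition, with tied values given the average of their ranks; the helper names and data are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho as Pearson's r computed on ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y always rises with x, but along a curve, not a line.
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 3 for x in xs]

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(round(rho, 3), round(r, 3))
```

Because only the ordering matters, rho reaches 1 for any consistently increasing pattern, while Pearson's r falls short of 1 when the pattern is curved.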

Coefficient of Determination: The correlation coefficient squared (r²). It is the percentage of the variability among scores on one variable that can be attributed to differences in the scores on the other variable. The coefficient of determination is useful because it gives the proportion of the variance of one variable that is predictable from the other variable. Next week we will discuss regression, which builds upon correlation and utilizes this coefficient of determination.
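For example, r = 0.60 gives r² = 0.36, i.e. 36% of the variability. The "proportion of variance predictable" reading can be checked by fitting a least-squares line and measuring how much of the variability in one variable the line accounts for. The sketch below does this with invented foot-length and height values; the function names are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def explained_proportion(xs, ys):
    """1 - (residual variation / total variation) for the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total

# Hypothetical foot lengths and heights (inches), invented for illustration.
foot = [8, 9, 10, 10, 11, 12, 12, 13]
height = [64, 62, 68, 65, 66, 72, 69, 71]

r = pearson_r(foot, height)
r_squared = r ** 2
explained = explained_proportion(foot, height)
print(round(r_squared, 4), round(explained, 4))
```

The squared correlation and the proportion of variation accounted for by the line coincide, which is the bridge from correlation to regression.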

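Correlation in Excel: Use the function “correl”. The “arguments” (components) of the function are the two arrays, e.g. =CORREL(A2:A21, B2:B21) if the two variables for 20 subjects were stored in columns A and B.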

Applets (see applets page)
http://www.stat.uiuc.edu/courses/stat100/java/GCApplet/GCAppletFrame.html
http://www.stat.sc.edu/~west/applets/clicktest.html
http://www.stat.sc.edu/~west/applets/rplot.html