Association between 2 variables


1 Correlation
Association between 2 variables

2 Suppose we wished to graph the relationship between foot length and height for 20 subjects.
To create the graph, which is called a scatterplot or scattergram, we need the foot length and height of each subject.
[Figure: empty scatterplot axes. x-axis: Foot Length, 4 to 14; y-axis: Height, 58 to 74.]

3 Assume our first subject had a 12-inch foot and was 70 inches tall.
1. Find 12 inches on the x-axis.
2. Find 70 inches on the y-axis.
3. Locate the intersection of 12 and 70.
4. Place a dot at the intersection of 12 and 70.
[Figure: scatterplot with the first point plotted. x-axis: Foot Length; y-axis: Height.]

4 Assume that our second subject had an 8-inch foot and was 62 inches tall.
5. Find 8 inches on the x-axis.
6. Find 62 inches on the y-axis.
7. Locate the intersection of 8 and 62.
8. Place a dot at the intersection of 8 and 62.
9. Continue to plot points for each pair of scores.

5 Notice how the scores cluster to form a pattern.
The more closely they cluster to a line that is drawn through them, the stronger the linear relationship between the two variables is (in this case foot length and height).

6 If the points on the scatterplot have an upward movement from left to right, we say the relationship between the variables is positive.
If the points on the scatterplot have a downward movement from left to right, we say the relationship between the variables is negative.

7 A positive relationship means that high scores on one variable are associated with high scores on the other variable.
It also indicates that low scores on one variable are associated with low scores on the other variable.

8 A negative relationship means that high scores on one variable
are associated with low scores on the other variable. It also indicates that low scores on one variable are associated with high scores on the other variable.

9 Not only do relationships have direction (positive and negative), they also have strength (from 0.00 to 1.00 and from 0.00 to –1.00).
The more closely the points cluster toward a straight line, the stronger the relationship is.

10 A set of scores with r = –0.60 has the same strength as a set of scores with r = 0.60, because both sets cluster similarly.
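
The point about sign versus strength can be checked numerically. The sketch below is not from the original slides: it assumes the standard Pearson formula, and the scores are invented purely for illustration. Mirroring one variable flips the sign of r but leaves its size unchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores, invented for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
y_mirrored = [-v for v in y]  # same scatter, flipped top to bottom

r_pos = pearson_r(x, y)
r_neg = pearson_r(x, y_mirrored)
print(round(r_pos, 3), round(r_neg, 3))
```

The two printed values differ only in sign, which is why the strength of a relationship is read from the absolute value of r.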

11 For this procedure, we use Pearson’s r (also known as a Pearson Product Moment Correlation Coefficient).
This statistical procedure can only be used when BOTH variables are measured on a continuous scale and you wish to measure a linear relationship.
[Figure: two scatterplots. Linear relationship: Pearson r applies. Curvilinear relationship: NO Pearson r.]

12 Formula for correlations
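
The formula on this slide was an image and did not survive in the transcript. As an assumption about what it showed, the standard definition of Pearson's r for n pairs of scores (x_i, y_i) with means \bar{x} and \bar{y} is:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when high scores pair with high scores and negative when high scores pair with low scores; the denominator rescales the result so that r always falls between –1.00 and 1.00.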

13 Assumptions of the PMCC
The measures are approximately normally distributed
The variance of the two measures is similar (homoscedasticity) -- check with scatterplot
The relationship is linear -- check with scatterplot
The sample represents the population
The variables are measured on an interval or ratio scale

14 Example
We’ll use data from the class questionnaire in 2005 to see if a relationship exists between the number of times per week respondents eat fast food and their weight.
What’s your guess (hypothesis) about how the results of this test will turn out? .5? .8? ???

15 Example
To get a correlation coefficient: slide the variables over...

16 Example
SPSS output
The red is our correlation coefficient.
The blue is our level of significance resulting from the test…what does that mean?
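
One way to build intuition for a significance level is a permutation test: shuffle one variable many times and count how often chance alone produces a correlation at least as large as the observed one. This is a swapped-in technique for illustration, not the test SPSS reports, and the data, function names, and defaults below are all hypothetical.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real pairing between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical data, invented for illustration only (not the class questionnaire).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [150, 152, 155, 159, 160, 166, 170, 171]

r_observed = pearson_r(x, y)
p_value = permutation_p_value(x, y)
print(round(r_observed, 3), p_value)
```

A small p-value here means that shuffled, relationship-free data almost never produce a correlation as strong as the observed one.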

17 Digression - Hypotheses
Many research designs involve statistical tests, which involve accepting or rejecting a hypothesis
Null (statistical) hypotheses assume no relationship between two or more variables
Statistics are used to test null hypotheses
E.g. we assume that there is no relationship between weight and fast food consumption until we find statistical evidence that there is

18 Probability
Probability is the likelihood that a certain event will occur
In research, we deal with the likelihood that patterns in data have emerged by chance vs. that they are representative of a real relationship
Alpha (α) is the probability level (or significance level) set, in advance, by the researcher as the acceptable chance that a finding declared significant occurred by chance

19 Probability
Alpha levels (cont.)
E.g. α = .05 means the researcher accepts a 5% chance of declaring a finding significant when it is actually due to chance rather than a real relationship in the data
The lower the α the better, but…a level must be set in advance

20 Probability
Most statistical tests produce a p-value that is then compared to the α-level to accept or reject the null hypothesis
E.g. researcher sets significance level at .05 a priori; test results show p = .02. The researcher can then reject the null hypothesis and conclude the result is unlikely to be due to chance alone and more likely reflects a real relationship in the data
How about p = .051, when α-level = .05?
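
The comparison of a p-value with a preset alpha can be written as a small decision rule. This is a minimal sketch with a hypothetical function name; it assumes the common convention of rejecting when p is less than or equal to alpha.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with an alpha level that was fixed in advance."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(decide(0.02))   # reject the null hypothesis
print(decide(0.051))  # fail to reject the null hypothesis
```

Under this rule p = .051 does not clear an alpha of .05, however close it is, which is the reason the level has to be set before looking at the results.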

21 Error
Significance levels (e.g. α = .05) are set in order to control error
Type I error = rejection of the null hypothesis when it was actually true
Conclusion = relationship; there wasn’t one (false positive) (probability = α)
Type II error = acceptance of the null hypothesis when it was actually false
Conclusion = no relationship; there was one (false negative)

22 Error – Truth Table
              Null True       Null False
Accept null   Correct         Type II error
Reject null   Type I error    Correct
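
The truth table can also be read as a lookup from two yes/no facts to an outcome. The function below is a hypothetical illustration of that reading, not part of the original slides.

```python
def classify_decision(null_is_true, null_rejected):
    """Name the outcome of a test given the truth and the decision made."""
    if null_is_true and null_rejected:
        return "Type I error (false positive)"
    if not null_is_true and not null_rejected:
        return "Type II error (false negative)"
    return "correct decision"

# Walk through all four cells of the table.
for null_is_true in (True, False):
    for null_rejected in (False, True):
        print(null_is_true, null_rejected, classify_decision(null_is_true, null_rejected))
```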

23 Back to Our Example
Conclusion: No relationship exists between weight and fast food consumption with this group of respondents

24 Really?
Conclusion: No relationship exists between weight and fast food consumption with this group of subjects
Do you believe this? Can you critique it?
Construct validity? External validity?
Thinking in this fashion will help you adopt a critical stance when reading research

25 Another Example
Now let’s see if a relationship exists between weight and the number of piercings a person has
What’s your guess (hypothesis) about how the results of this test will turn out?
It’s fine to guess, but remember that our null hypothesis is that no relationship exists, until the data show otherwise

26 Another Example (continued)
What can we conclude from this test?
Does this mean that weight causes piercings, or vice versa, or what?

27 Correlations and causality
Correlations only describe the relationship; they do not prove cause and effect
Correlation is a necessary, but not sufficient, condition for determining causality
There are three requirements to infer a causal relationship

28 Correlations and causality
1. A statistically significant relationship between the variables
2. The causal variable occurred prior to the other variable
3. There are no other factors that could account for the relationship
Correlation studies do not meet the last requirement and may not meet the second requirement (go back to internal validity – 497)

29 Correlations and causality
If there is a relationship between weight and # piercings it could be because
weight → # piercings
weight ← # piercings
weight ← some other factor → # piercings
Which do you think is most likely here?

30 Other Types of Correlations
Other measures of correlation between two variables:
Point-biserial correlation (PBC): use when one of the two variables is dichotomous
The formula for computing a PBC is actually just a mathematical simplification of the formula used to compute Pearson’s r, so to compute a PBC in SPSS, just compute r and the result is the same
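
The claim that the point-biserial formula and Pearson's r give the same result can be checked directly. The sketch below assumes the textbook point-biserial formula (difference in group means, divided by the standard deviation computed with n in the denominator, times the square root of the product of the group proportions); the data and function names are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def point_biserial(groups, scores):
    """Point-biserial correlation; groups must be coded 0 or 1."""
    n = len(scores)
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    mean_all = sum(scores) / n
    # Standard deviation of all scores, dividing by n (not n - 1).
    sd = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * sqrt(len(ones) * len(zeros) / n ** 2)

# Hypothetical data, invented for illustration only.
groups = [0, 0, 0, 1, 1, 1, 1, 0]
scores = [12, 15, 11, 18, 20, 17, 22, 14]

r_pb = point_biserial(groups, scores)
r = pearson_r(groups, scores)
print(round(r_pb, 6), round(r, 6))
```

Both calls print the same value, which is why no separate procedure is needed for a dichotomous variable coded 0 and 1.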

31 Other Types of Correlations
Other measures of correlation between two variables: (cont.)
Spearman rho correlation: use with ordinal (rank) data
Computed in SPSS the same way as Pearson’s r…simply toggle the Spearman button on the Bivariate Correlations window
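
Outside SPSS, Spearman's rho can be obtained by converting each variable to ranks and then computing Pearson's r on the ranks. The sketch below assumes that standard definition, with tied values sharing the average of their ranks; the function names and data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y always rises with x, but not along a straight line.
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 1000]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 3))
```

Because rho looks only at rank order, a relationship that always moves in the same direction scores a perfect rho even when Pearson's r falls short of 1.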

32 Coefficient of Determination
The correlation coefficient squared
Percentage of the variability among scores on one variable that can be attributed to differences in the scores on the other variable
The coefficient of determination is useful because it gives the proportion of the variance of one variable that is predictable from the other variable
Next week we will discuss regression, which builds upon correlation and utilizes this coefficient of determination
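
As a quick arithmetic illustration (the r values are arbitrary examples and the function name is hypothetical), squaring r turns a correlation into a share of variance:

```python
def coefficient_of_determination(r):
    """Square of the correlation coefficient, as a proportion of variance."""
    return r ** 2

for r in (0.60, -0.60, 0.30):
    share = coefficient_of_determination(r)
    print(f"r = {r:+.2f} -> r squared = {share:.2f} ({share:.0%} of the variance)")
```

A correlation of 0.60 or –0.60 corresponds to 36% of the variance, while a correlation of 0.30 corresponds to only 9%, so halving r cuts the predictable variance to a quarter.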

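33 Correlation in Excel
Use the function CORREL
The “arguments” (components) of the function are the two arrays, e.g. =CORREL(A2:A21, B2:B21) for 20 pairs of scores stored in columns A and B (the cell ranges here are only illustrative)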

34

35 Applets (see applets page)

