Correlations. Renan Levine, POL 242, July 12, 2006.


1 Correlations Renan Levine POL 242 July 12, 2006

2 Association: Crosstabulation
Category | Specifics | Symmetry Specification | Measure of Association | Indication of Direction
Nominal X Nominal | Only (2 X 2) | Symmetrical | Phi | Yes
Nominal X Nominal | Greater than (2 X 2) | - | Cramer's V | No
Nominal X Ordinal | At least (2 X 3) | - | Cramer's V | No
Ordinal X Ordinal | Square (e.g., 3 X 3) | Symmetrical | Kendall's Tau-b | Yes
Ordinal X Ordinal | Rectangle (e.g., 3 X 4) | Asymmetrical | Kendall's Tau-c | Yes
Interval X Interval (not taught yet) | - | - | Pearson's r (not yet taught) | Yes

3 Today: Correlations
Correlation is a measure of a relationship between variables.
Measured with a coefficient [Pearson’s r] that ranges from -1 to 1.
Measures the strength of the relationship between interval or ratio variables.
r = Σ(Zx * Zy) / (n - 1)
  - Zx = Z scores for the X variable; Zy = Z scores for the Y variable.
  - Sum the products and divide by the number of paired cases minus one.
  - How to calculate Z scores can be found on-line.
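A minimal sketch of the Z-score calculation of Pearson's r described on this slide. It assumes the Z scores use the sample standard deviation (dividing by n - 1), which is what makes the result land between -1 and 1. The function names and the data are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values using the sample standard deviation (n - 1)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def pearson_r(xs, ys):
    """Sum the products of paired Z scores and divide by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of cases")
    n = len(xs)
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)


# Hypothetical paired observations (not from the lecture data).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(round(pearson_r(x, y), 2))  # 0.8
```

In practice a statistics package does this for you; the sketch is only meant to show that the formula is nothing more than an average of Z-score products.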

4 Correlation r
Absolute values closer to 0 indicate that there is little or no linear relationship.
  - Generally, 0.2-0.4 is weak, 0.4-0.6 is okay, 0.6 or higher is strong.
  - If the correlation is very high, then the variables are probably closely related, and you might consider indexing them or choosing just one variable.
The closer the absolute value of the coefficient is to 1, the stronger the relationship between the variables being correlated.

5 Positive Relationship
If two variables are related positively or directly:
  - r > 0
Variables “track together” – high values on Variable X are associated with high values on Variable Y.
  - Low values on X are associated with low values on Y.

6 Example
Robert D. Putnam; Robert Leonardi; Raffaella Y. Nanetti; Franco Pavoncello. “Explaining Institutional Success: The Case of Italian Regional Government.” The American Political Science Review 77:1 (Mar. 1983), pp. 55-74
More fun examples: http://www.nationmaster.com/correlations/eco_gdp-economy-gdp-nominal

7 Example II r = 0.84

8 Negative or Inverse Relationship
Variables can be inversely or negatively related.
  - High values of X are associated with low values of Y.

9 Example – Negative / Inverse
r = -0.68
Red = Republicans, blue = Democrats, grey diamonds = Independents
Time/SRBI: Oct 3-6, ‘08

10 Data
You need interval-level data.
You will find many interval-level variables in:
  - Countries / World
  - Provinces
  - Election studies (feeling thermometers, odds of party entering government, etc.)
You can often use the index you created as an interval-level variable.

11 Compare Lots more noise here. Typical of public opinion data. Most points close to a line.

12 Differences between Public Opinion and Aggregate Data
Although it is not uncommon to have one or more outliers in aggregate data, public opinion data tends to be “noisy”.
  - Feeling thermometer example: many respondents gave both candidates a 50, and quite a few respondents liked both candidates, even though most who liked McCain disliked Obama.
A Pearson’s r that counts as high for public opinion data may count as low for an association in aggregate data.

13 Guidelines for Public Opinion Data
MAGNITUDE OF ASSOCIATION | QUALIFICATION | COMMENTS
0.00 | No Relationship | Knowing the independent variable does not reduce the number of errors in predicting the dependent variable at all.
.00 to .15 | Not Useful | Not Acceptable
.15 to .20 | Very Weak | Minimally acceptable
.20 to .25 | Moderately Strong | Acceptable
.25 to .30 | Fairly Strong | Good Work
.30 to .40 | Strong | Great Work
.40 to .70 | Very Strong/Worrisomely Strong | EITHER an excellent relationship OR the two variables are measuring the same thing
.70 to .99 | Redundant (?) | Proceed with caution: are the two variables testing the same thing?
1.00 | Perfect Relationship | If we know the independent variable, we can predict the dependent variable with absolute success.
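The public opinion guidelines above can be sketched as a small lookup, shown here only to make the bands concrete. The slide does not say which band a value exactly on a boundary (for example .15) belongs to, so this sketch assumes each boundary value goes to the higher band. The function name and structure are hypothetical; the test values are coefficients from the Webstats output later in the lecture.

```python
# Upper bounds and labels from the public opinion guidelines.
# Assumption: a value exactly on a boundary falls in the higher band.
PUBLIC_OPINION_BANDS = [
    (0.15, "Not Useful"),
    (0.20, "Very Weak"),
    (0.25, "Moderately Strong"),
    (0.30, "Fairly Strong"),
    (0.40, "Strong"),
    (0.70, "Very Strong/Worrisomely Strong"),
]


def qualify_public_opinion(r):
    """Return the guideline label for a Pearson's r from survey data."""
    magnitude = abs(r)  # direction does not affect strength
    if magnitude > 1:
        raise ValueError("Pearson's r must be between -1 and 1")
    if magnitude == 0:
        return "No Relationship"
    if magnitude == 1:
        return "Perfect Relationship"
    for upper, label in PUBLIC_OPINION_BANDS:
        if magnitude < upper:
            return label
    return "Redundant (?)"


for r in (0.2916, 0.5320, -0.3163, -0.1272):
    print(r, qualify_public_opinion(r))
```

The labels are still only guidelines: the lookup cannot tell an excellent relationship from two variables that measure the same thing.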

14 Rough Guidelines for Aggregate Data
MAGNITUDE OF ASSOCIATION | QUALIFICATION | COMMENTS
0.00 | No Relationship | Knowing the independent variable does not reduce the number of errors in predicting the dependent variable at all.
.00 to .30 | Not useful, very weak | Not Acceptable
.30 to .50 | Weak | Minimally acceptable
.50 to .70 | Fairly Strong | Acceptable
.70 to .85 | Strong | Good Work
.80 to .90 | Very Strong/Worrisomely Strong | EITHER an excellent relationship OR the two variables are measuring the same thing
.90 to .99 | Redundant (?) | Proceed with caution: are the two variables testing the same thing?
1.00 | Perfect Relationship | If we know the independent variable, we can predict the dependent variable with absolute success.

15 Very Strong or Worrisome??
Public Opinion: above |0.40|
Aggregate: above |0.80|
But these are just guidelines. It depends on how good the data is:
  - Lots of variation in data
  - Large scale (10, 20, 100 pts – like prediction odds, physicians per 100,000 people, feeling thermometer scales)
  - Number of observations (N)
Provinces dataset is small.

16 Outstanding or the same?
You either have an outstanding relationship OR the variables may be measuring the same idea.
  - Ex. unemployment and GDP both measure economic health
  - Ex. Feeling thermometer for Barack Obama and feeling thermometer for Joe Biden both measure attitudes towards the Democratic ticket
This also applies to inverse relationships.
  - Example above: Obama and McCain feeling thermometers – different sides of the same coin, as both seem to measure partisanship.

17 Use Yo’ Brain
The computer cannot tell you whether it’s a good, strong relationship or two measures looking at the same thing.
You need to understand what each variable is measuring.
  - Same thought process as for index creation.
Use your knowledge of the world and theory to decide whether two variables measure the same thing or two different things.
Example (above): Putnam’s relationship between civic culture and government performance.
Failed states survey - it appears that the higher an indicator value, the worse off the country is in that particular field.
http://www.fundforpeace.org/web/index.php?option=com_content&task=view&id=99&Itemid=140

18 Flip side
A relationship you expected to be strong surprisingly is not?!
Make certain both variables are interval.
Double check that you cleaned up the data.
  - Missing values are coded as missing
  - Next week: there may be a need to qualify the relationship, as some sub-group of the data may not be like the others and needs to be identified.
Think about the relationship – maybe it’s not linear, so the relationship is only present for part of the range.
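A small sketch of the last point on this slide, using made-up numbers: a perfectly regular but U-shaped relationship gives a Pearson's r of zero over the full range, while the same data restricted to part of the range gives a very high r. The data and the helper function are hypothetical illustrations, not course data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from sums of squared deviations and cross-products."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical U-shaped relationship: Y depends entirely on X, but not linearly.
x_full = [-3, -2, -1, 0, 1, 2, 3]
y_full = [x ** 2 for x in x_full]

# Same relationship, looking only at the upper half of the range of X.
x_part = [x for x in x_full if x >= 0]
y_part = [x ** 2 for x in x_part]

print(round(pearson_r(x_full, y_full), 2))  # 0.0
print(round(pearson_r(x_part, y_part), 2))  # 0.96
```

A scatterplot is usually the quickest way to spot this kind of pattern before trusting a low coefficient.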

19 Usefulness
Quick, easy way to look at several variables to see if they are related.
With a strong association, you can begin to think about predicting values of Y based on a value of X.
  - Ex. Positive correlation – you know a high value of X is associated with a high value of Y!

20 Webstats Output
- - Correlation Coefficients - -
            Q375A1      Q305        Q375A3      Q1005
Q375A1      1.0000      .2916       .5320       -.3163
            (  686)     (  666)     (  667)     (  672)
            P= .        P= .000     P= .000     P= .000
Q305        .2916       1.0000      .2679       -.1272
            (  666)     ( 2776)     (  660)     ( 2721)
            P= .000     P= .        P= .000     P= .000
Q375A3      .5320       .2679       1.0000      -.2020
            (  667)     (  660)     (  682)     (  666)
            P= .000     P= .000     P= .        P= .000
Q1005       -.3163      -.1272      -.2020      1.0000
            (  672)     ( 2721)     (  666)     ( 3181)
            P= .000     P= .000     P= .000     P= .
In each cell: coefficient (Pearson’s r), then N in parentheses, then P.

21 Significance?
Webstats will tell you whether or not the correlation coefficient is significant.
Remember that this is just telling you whether the relationship may be due to chance.
  - Not the strength of the relationship
It is almost unheard of to have a strong relationship that is insignificant when using survey data.
So, don’t spend any time discussing significance.
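To see why significance says little about strength, here is a rough sketch using the weakest coefficient in the Webstats output (Q305 with Q1005: r = -.1272, N = 2721). It uses the standard t statistic for a correlation, t = r * sqrt((n - 2) / (1 - r^2)), and, as a simplifying assumption, a normal approximation for the p-value, which is reasonable only because N is large. How Webstats itself computes P is not stated in the slides.

```python
from math import sqrt
from statistics import NormalDist


def t_statistic(r, n):
    """t statistic for testing whether a Pearson's r differs from zero."""
    return r * sqrt((n - 2) / (1 - r * r))


def approx_two_sided_p(t):
    """Large-sample normal approximation to the two-sided p-value."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Values taken from the Webstats output: Q305 with Q1005.
r, n = -0.1272, 2721
t = t_statistic(r, n)
p = approx_two_sided_p(t)

print(round(t, 2))   # roughly -6.7
print(p < 0.001)     # True: significant, yet the relationship is weak
```

With thousands of respondents, even a coefficient in the "Not Useful" range is statistically significant, which is why the slide says not to dwell on it.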

22 What if non-interval/non-ratio?
Usually more appropriate to use the other measures of association.
Webstats will perform a correlation.
Be ready for results to be less strong.
The program may report (instead of Pearson’s r):
  - Spearman: ordinal x ordinal
  - Point-biserial: one interval/ratio, one dichotomous
  - Phi: two dichotomous variables
All are interpreted the same way.
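One way to see why these coefficients are interpreted the same way: each is Pearson's r applied to recoded data. Spearman is Pearson's r on ranks, and phi is Pearson's r on two variables coded 0/1 (point-biserial is the same with only one variable coded 0/1). The sketch below uses hypothetical data and helper names, and it is not a description of how Webstats computes these internally.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from sums of squared deviations and cross-products."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's coefficient: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical ordinal-style data: monotonic but not linear.
print(round(spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]), 2))  # 1.0

# Hypothetical dichotomous data coded 0/1: Pearson's r here equals phi.
a = [0, 0, 1, 1]
b = [0, 1, 1, 1]
print(round(pearson_r(a, b), 3))  # 0.577
```

The phi value matches the usual 2 X 2 formula (ad - bc) / sqrt of the product of the row and column totals for the same four cases.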

