Theme 5. Association 1. Introduction. 2. Bivariate tables and graphs.

Presentation transcript:

Theme 5. Association
1. Introduction.
2. Bivariate tables and graphs.
3. Quantitative variables: covariance, Pearson correlation coefficient, variance-covariance matrix and correlation matrix.
4. Semiquantitative variables: Spearman coefficient.
5. Qualitative variables: Chi-square index and Cramer's V.
6. Association between variables of different scales.
7. Concept of nonlinear relationships.

Introduction
So far we have focused on measures of central tendency, variability, skewness and kurtosis for a single variable. In practice, however, it is common to examine two or more variables together (e.g., the relationship between performance and intelligence). Here we focus on the relationship between two variables, measured on n paired observations, and in particular on an index that gives the degree of linear relationship between them: the Pearson linear correlation coefficient.

Graphical representation
[Figure: three scatterplots of performance against IQ, showing a negative linear relation, no relation, and a positive linear relation.]
Note: The Pearson correlation coefficient measures linear correlation.

Graphical representation
[Figure: two scatterplots of performance against IQ, showing a non-linear relation and a linear relation.]
Note: The Pearson correlation coefficient measures linear correlation.

Graphical representation
[Figure: three scatterplots of performance against IQ, showing a perfect linear relation, a strong linear relation, and a weak linear relation.]
We now need an index that reports the extent to which X and Y are related, and whether the relationship is positive or negative.

Covariance and Pearson's index
Scenario 1: when the linear relationship is positive, if X is above its mean, Y is typically above its mean. [Figure: scatterplot of performance against intelligence for Scenario 1.]
Scenario 2: when the linear relationship is negative, if X is above its mean, Y is typically below its mean. [Figure: scatterplot of performance against intelligence for Scenario 2.]

Covariance
Here's the formula: $s_{xy} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$ (written here in its descriptive form, dividing by n; some texts divide by n - 1). In Scenario 1 the covariance will be positive, and in Scenario 2 it will be negative. Therefore the covariance gives us an idea of whether the relationship between X and Y is positive or negative. Problem: the covariance is not a bounded index (e.g., how should we interpret a covariance of 6 in terms of the degree of association?), and it does not take the variability of the variables into account. So we use another index.
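The sign and the unboundedness of the covariance can be checked with a small sketch. The scores below are made up for illustration, the function name `covariance` is hypothetical, and dividing by n (rather than n - 1) is an assumption about the convention used in the slides.

```python
def covariance(x, y):
    """Covariance of paired scores, dividing by n (assumed convention)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

# Hypothetical scores for five subjects (not taken from the slides).
iq = [90, 100, 110, 120, 130]
performance = [4, 5, 6, 8, 7]

# Scenario 1: performance tends to rise with IQ, so the covariance is positive.
cov_positive = covariance(iq, performance)

# Scenario 2: the same scores reversed (10 - p), so the covariance is negative.
cov_negative = covariance(iq, [10 - p for p in performance])

# Rescaling X by 10 multiplies the covariance by 10: the index is not bounded.
cov_rescaled = covariance([10 * v for v in iq], performance)

print(cov_positive, cov_negative, cov_rescaled)
```

The rescaled value shows why a covariance of, say, 6 cannot be read as "strong" or "weak" on its own: it depends on the measurement units.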

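Pearson coefficient
The Pearson correlation coefficient: $r_{xy} = \frac{s_{xy}}{s_x s_y}$, where $s_{xy}$ is the covariance of X and Y, and $s_x$ and $s_y$ are their standard deviations.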
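As a minimal sketch, the Pearson coefficient can be computed as the covariance divided by the product of the two standard deviations. The data are made up, the name `pearson_r` is hypothetical, and all three statistics divide by n (an assumption; the result is the same with n - 1 as long as it is used consistently).

```python
import math

def pearson_r(x, y):
    """Pearson r as covariance / (sd_x * sd_y), all computed dividing by n."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return s_xy / (s_x * s_y)

# Hypothetical scores for five subjects (not taken from the slides).
iq = [90, 100, 110, 120, 130]
performance = [4, 5, 6, 8, 7]

r = pearson_r(iq, performance)
print(round(r, 3))
```

Unlike the covariance, this value does not change if IQ or performance is measured in different units.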

Properties of Pearson's r
Property 1. The Pearson correlation index lies between -1 and +1. A Pearson correlation index of -1 indicates a perfect negative linear relationship. A Pearson correlation index of +1 indicates a perfect positive linear relationship. A Pearson correlation index of 0 indicates no linear relationship. (Notice that a value close to 0 does not imply that the variables are unrelated: there may be some kind of non-linear relationship, because the Pearson index only measures linear relationship.)
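The three reference values can be illustrated with a short sketch on made-up data. The name `pearson_r` is hypothetical; the last case uses Y = X squared, where Y is completely determined by X and yet the linear correlation is 0.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of cross-products (the 1/n factors cancel out)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

x = [-2, -1, 0, 1, 2]

r_perfect_positive = pearson_r(x, [2 * v + 1 for v in x])  # points on a rising line
r_perfect_negative = pearson_r(x, [-3 * v for v in x])     # points on a falling line
r_parabola = pearson_r(x, [v ** 2 for v in x])             # non-linear relation

print(r_perfect_positive, r_perfect_negative, r_parabola)
```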

Properties of Pearson's r
Property 2. The Pearson correlation index (in absolute value) does not change when we apply a linear transformation to the variables. For example, the Pearson correlation between temperature (in degrees Celsius) and the level of depression is the same as the correlation between temperature (in degrees Fahrenheit) and the level of depression. (Only the sign can change, and only when one of the variables is transformed with a negative slope.)
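A quick check of this property, as a sketch on made-up temperature and depression scores (the name `pearson_r` and the data are hypothetical):

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of cross-products (the 1/n factors cancel out)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data: temperature in Celsius and a depression score.
celsius = [10, 15, 20, 25, 30]
depression = [12, 9, 10, 6, 5]

# Linear transformation with a positive slope: F = 9/5 * C + 32.
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

r_celsius = pearson_r(celsius, depression)
r_fahrenheit = pearson_r(fahrenheit, depression)

# A negative slope keeps the absolute value but flips the sign.
r_negated = pearson_r([-c for c in celsius], depression)

print(round(r_celsius, 3), round(r_fahrenheit, 3), round(r_negated, 3))
```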

More on Pearson's r: Interpretation
To interpret the strength of the relationship between the variables under study, we have to consider what we are measuring. In any case, it is very important to draw a scatterplot. For example, in the scatterplot on the left it is clear that there is no relationship between intelligence and performance. However, if we calculate the Pearson correlation index, it will take a very high value, caused by the atypical score in the top right corner. [Figure: scatterplot of performance against IQ with one atypical point in the top right corner.]
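The effect of a single atypical score can be sketched numerically. The data are made up so that the five ordinary points have a correlation of exactly 0; the name `pearson_r` is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of cross-products (the 1/n factors cancel out)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical cloud of points with no linear relationship.
iq = [95, 95, 105, 105, 100]
performance = [4, 6, 4, 6, 5]
r_without_outlier = pearson_r(iq, performance)

# The same data plus one atypical subject in the top right corner.
r_with_outlier = pearson_r(iq + [160], performance + [20])

print(round(r_without_outlier, 3), round(r_with_outlier, 3))
```

The coefficient jumps from 0 to above 0.9 because of one point, which is why the scatterplot should be inspected before the number is interpreted.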

More on Pearson's r: Interpretation (2)
It is important to note that "correlation does not imply causation". The fact that two variables are highly correlated does not imply that X causes Y or that Y causes X.

More on Pearson's r: Interpretation (3)
It is important to note that the Pearson correlation coefficient may be affected by third variables. For example, if we went to a school, measured the children's height and gave them a test of numerical ability, the taller children would also show more numerical ability. Of course, that may be simply because the older children are both taller and more able than the younger ones. If this "third" variable, age, is controlled (by "partial correlation"), there will hardly be any relationship between height and numerical ability. There are many cases where a third variable is the cause of a high relationship between X and Y (and it is often difficult to identify). [Figure: scatterplot of numerical ability against height, with groups of points labelled by age: 6, 8, 10, 12 and 14 years.]
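A minimal sketch of controlling for a third variable, using the standard first-order partial correlation formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The data are made up and deliberately constructed so that height and ability are unrelated once age is removed; the names `pearson_r` and `partial_r` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of cross-products (the 1/n factors cancel out)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between X and Y after removing the linear effect of Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical school data: two children at each age (years, cm, test score).
age     = [6, 6, 8, 8, 10, 10, 12, 12, 14, 14]
height  = [117, 113, 127, 123, 137, 133, 147, 143, 157, 153]
ability = [13, 13, 15, 15, 21, 21, 23, 23, 29, 29]

r_height_ability = pearson_r(height, ability)
r_controlling_age = partial_r(
    r_height_ability,
    pearson_r(height, age),
    pearson_r(ability, age),
)

print(round(r_height_ability, 3), round(r_controlling_age, 3))
```

With these constructed scores the raw correlation is very high, while the partial correlation is essentially zero; real data would rarely be this clean.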

More on Pearson's r: Interpretation (4)
The value of the Pearson coefficient depends in part on the variability of the group. If we compute the Pearson coefficient between intelligence and performance with all subjects, its value is quite high. However, if we use only the individuals with low IQ (or only those with high IQ) and calculate the correlation with performance, the Pearson coefficient will be considerably lower. A heterogeneous group gives a greater degree of relationship between the variables than a homogeneous group. [Figure: scatterplot of performance against IQ, divided into a low-IQ group and a high-IQ group.]
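The effect of restricting the range can be sketched as follows. The scores are made up and chosen so that the contrast is extreme (the correlation vanishes inside each subgroup); the name `pearson_r` is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of cross-products (the 1/n factors cancel out)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical scores for eight subjects, ordered by IQ.
iq          = [70, 80, 90, 100, 110, 120, 130, 140]
performance = [2, 4, 1, 3, 7, 9, 6, 8]

r_all = pearson_r(iq, performance)             # heterogeneous group
r_low = pearson_r(iq[:4], performance[:4])     # only the low-IQ subjects
r_high = pearson_r(iq[4:], performance[4:])    # only the high-IQ subjects

print(round(r_all, 3), round(r_low, 3), round(r_high, 3))
```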

5.4 Other coefficients
Of course, it is also possible to measure the degree of relationship between variables that are not quantitative. The case in which the variables X and Y are ordinal: remember that with an ordinal scale we can establish an order among the values, but we do not know the distances between them. (If we knew the distances between the values, we would have at least an interval scale.) We can calculate the Spearman or the Kendall correlation coefficient. (We will see the first one.)

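Spearman's rank correlation coefficient
What we have is 2 sequences of ordinal values (ranks). The Spearman coefficient is a special case of the Pearson correlation coefficient (the Pearson coefficient computed on ranks): $r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$, where $d_i$ is the difference between the ordinal value on X and the ordinal value on Y for subject i.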

Spearman's rank correlation coefficient (properties)
First. It is bounded, like the Pearson coefficient, between -1 and +1. A Spearman coefficient of +1 means that the subject who is first in X is first in Y, the one who is second in X is second in Y, etc. A Spearman coefficient of -1 means that the subject who is first in X is last in Y, etc. Second. Its calculation is simple (simpler than that of the Pearson correlation coefficient). However, with computers this is irrelevant these days.
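A minimal sketch of the rank-difference computation, r_s = 1 - 6 * sum(d_i^2) / (n (n^2 - 1)), assuming the inputs are already ranks with no ties. The ranks are made up and the names `spearman_rho` and `pearson_r` are hypothetical; the last lines check that Pearson's r on the same ranks gives the same value.

```python
import math

def spearman_rho(rank_x, rank_y):
    """Spearman coefficient from rank differences (assumes no tied ranks)."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def pearson_r(x, y):
    """Pearson r from sums of cross-products (the 1/n factors cancel out)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical ranks of five subjects on X and on Y.
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]

rho_mixed = spearman_rho(rank_x, rank_y)
rho_same_order = spearman_rho(rank_x, rank_x)             # first in X is first in Y
rho_reversed = spearman_rho(rank_x, [5, 4, 3, 2, 1])      # first in X is last in Y

# With untied ranks, Pearson's r on the ranks equals the Spearman value.
r_on_ranks = pearson_r(rank_x, rank_y)

print(rho_mixed, rho_same_order, rho_reversed, round(r_on_ranks, 3))
```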

5.5 Qualitative variables
χ² test as a measure of association
The chi-square test is a nonparametric test that is used to measure the association between two variables when we have contingency tables. More generally, it is also used to assess the divergence between observed (empirical) scores and predicted (theoretical) scores. In general, the chi-square statistic is obtained as follows: $\chi^2 = \sum \frac{(f_e - f_t)^2}{f_t}$, where $f_e$ are the empirical frequencies and $f_t$ are the theoretical frequencies.

χ² test as a measure of association: The case of 2 qualitative variables
The empirical frequencies are those that appear in the contingency table. Now, how do we compute the theoretical frequencies? The process is simple: if both variables are independent, the theoretical frequency of each cell is the total frequency of its row multiplied by the total frequency of its column, divided by N. To calculate "chi-square" with crosstabs on the Internet: http://faculty.vassar.edu/lowry/newcs.html
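The two steps (theoretical frequencies under independence, then the chi-square sum of (fe - ft)^2 / ft over all cells) can be sketched as follows. The 2x2 table is made up and the function names are hypothetical.

```python
def expected_frequencies(observed):
    """Theoretical frequencies under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(observed):
    """Sum over all cells of (fe - ft)^2 / ft."""
    expected = expected_frequencies(observed)
    return sum(
        (fe - ft) ** 2 / ft
        for obs_row, exp_row in zip(observed, expected)
        for fe, ft in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 contingency table (rows and columns are two qualitative variables).
observed = [[30, 10],
            [20, 40]]

expected = expected_frequencies(observed)  # [[20.0, 20.0], [30.0, 30.0]]
chi2 = chi_square(observed)

print(expected, round(chi2, 3))
```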

χ² test as a measure of association: derived coefficients and interpretation
A number of measures of association between variables are derived from the chi-square test. They quantify the strength of the relationship between two variables. Case of 2x2 tables: the phi coefficient, $\phi = \sqrt{\chi^2 / N}$. This index is interpreted analogously to the Pearson coefficient.

χ² test as a measure of association: Other coefficients
If we have more than 2 rows or columns: Cramer's index, $V = \sqrt{\frac{\chi^2}{N m}}$, where m is the smaller of (number of rows - 1) and (number of columns - 1). This index is interpreted similarly to Pearson's r (except for the issue of the sign: V is always positive). Note that if the table is 2x2 this index matches the "phi" index (see the previous slide).
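As a sketch, phi and Cramer's V can be computed from the chi-square statistic as sqrt(chi2 / N) and sqrt(chi2 / (N * m)). The tables are made up and the function names are hypothetical; the first example checks that V matches phi on a 2x2 table.

```python
import math

def chi_square(observed):
    """Chi-square statistic of a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, fe in enumerate(row):
            ft = row_totals[i] * col_totals[j] / n  # theoretical frequency
            total += (fe - ft) ** 2 / ft
    return total

def phi_coefficient(observed):
    """Phi for a 2x2 table: sqrt(chi2 / N)."""
    n = sum(sum(row) for row in observed)
    return math.sqrt(chi_square(observed) / n)

def cramers_v(observed):
    """Cramer's V: sqrt(chi2 / (N * m)), m = min(rows - 1, columns - 1)."""
    n = sum(sum(row) for row in observed)
    m = min(len(observed) - 1, len(observed[0]) - 1)
    return math.sqrt(chi_square(observed) / (n * m))

# Hypothetical tables (not taken from the slides).
table_2x2 = [[30, 10],
             [20, 40]]
table_3x2 = [[20, 10],
             [10, 20],
             [15, 15]]

phi_2x2 = phi_coefficient(table_2x2)
v_2x2 = cramers_v(table_2x2)   # same value as phi, since m = 1
v_3x2 = cramers_v(table_3x2)

print(round(phi_2x2, 3), round(v_2x2, 3), round(v_3x2, 3))
```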