Dr. C. Ertuna
Statistical Relationship (Lesson – 02E)
As One Set of Data Moves in One Direction, What Does the Other Set of Data Do?

Statistical Relationship
The prices of TRC stock and the S&P 500 index are given on the left. Is there any relationship between the two? To answer this question we need to measure the "statistical relationship" between them.
Data: St-CE-Ch02-x1-Examples-Slide 60

Statistical Relationship
Descriptive statistics that measure the degree of relationship between two variables are called correlation coefficients. Three measures of statistical relationship are:

Data level            Measure            Assumptions
Scale                 Pearson's r        Normal distribution; linearity
Ordinal (or above)    Kendall's tau-b    Distribution free; monotonicity
Nominal (or above)    Chi-square test    Raw frequency > 5

Statistical Relationship (Cont.)
The Pearson correlation coefficient (ρ for a population, r for a sample) measures the strength of the linear relationship between two variables (X and Y), assuming both are normally distributed. {Check significance!}
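
As an illustration of this definition, the sample coefficient r can be computed as the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The following is a minimal Python sketch using only the standard library; the function name pearson_r and the small data set are made up for illustration and are not the TRC / S&P 500 data.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y tends to rise with x, but not perfectly
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))  # 0.7746
```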

Statistical Relationship (Cont.)
– The correlation coefficient ranges from -1 to +1.
– A correlation of 0 indicates that there is no linear relationship between the two variables.
– Even a high correlation can be observed just by chance; to be sure, we need to run a statistical test.
– Correlation between two variables does not imply a causal relationship between them.
– A correlation matrix provides pairwise correlations among more than two variables.
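
One common way to check whether an observed correlation could be due to chance, assuming the Pearson setting with normally distributed variables, is a t test of the hypothesis that the population correlation is zero. The statistic is t = r · sqrt((n − 2) / (1 − r²)) with n − 2 degrees of freedom. The sketch below only computes the statistic; the function name and the values r = 0.5, n = 27 are hypothetical, and the p-value would still come from a t table or statistical software.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return t, n - 2

# Hypothetical sample: r = 0.5 observed on n = 27 pairs
t, df = correlation_t_statistic(0.5, 27)
print(round(t, 3), df)  # 2.887 25
# The two-tailed 5% critical value for 25 degrees of freedom is about 2.06,
# so this hypothetical correlation would be judged significant.
```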

Statistical Relationship (Cont.)
The square of Pearson's r (r²) can be interpreted as explained variance if a dependent variable (DV) / independent variable (IV) relationship exists. For example, if r = 0.933 then r² = 0.8705, which means the IV explains 87.05% of the variation in the DV.
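
The "explained variance" reading can be checked numerically: for a least-squares line fitted to the DV on the IV, the share of the DV's variation that the line accounts for equals r². The sketch below uses a small hypothetical data set (not the lesson's data) and standard-library Python only.

```python
import math

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]   # independent variable (IV)
y = [2, 4, 5, 4, 5]   # dependent variable (DV)

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)   # total variation in the DV

r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Least-squares line: y_hat = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x
# Variation in the DV left unexplained by the line
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
explained = 1 - sse / syy

print(round(r_squared, 4), round(explained, 4))  # 0.6 0.6
```

Here the IV explains 60% of the variation in the DV, which matches r² for the same data.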

Correlation Test Assumptions
Parametric correlation test, Pearson's r:
– Interval data
– Normality
– Equal variance (not needed if n > 30)
– Linearity

Statistical Relationship (Cont.)
Kendall's tau-b
A distribution-free (nonparametric) measure of association for ordinal (or ranked) variables that takes ties into account. The sign of the coefficient indicates the direction of the relationship, and its absolute value indicates the strength, with larger absolute values indicating stronger relationships. Possible values range from -1 to 1.
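
A sketch of how tau-b can be computed directly from its usual definition: every pair of observations is classified as concordant, discordant, or tied, and tau-b = (C − D) / sqrt((C + D + Tx)(C + D + Ty)), where Tx and Ty count pairs tied on X only and on Y only. The function name and data are hypothetical, and the pairwise loop is meant for small samples rather than efficiency.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b via an explicit pairwise comparison (O(n^2))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied on both variables: not counted
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1       # both variables move in the same direction
        else:
            discordant += 1       # the variables move in opposite directions
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom

# Hypothetical ranked data with ties in y
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(kendall_tau_b(x, y), 4))  # 0.6708
```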

Statistical Relationship (Cont.)
Spearman's rho
A commonly used distribution-free (nonparametric) measure of correlation between two ordinal variables. The values of each variable are ranked from smallest to largest across all cases, and the Pearson correlation coefficient is computed on the ranks.
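
The two steps described here (rank each variable, then compute Pearson's r on the ranks) can be sketched as follows. Assigning tied values the average of their ranks is an assumption of this sketch (it is the usual convention), and all function names and data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank from smallest (1) to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1     # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Sample Pearson correlation (assumes neither input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x**3 is monotonic but not linear
x = [1, 2, 3, 4]
y = [1, 8, 27, 64]
print(spearman_rho(x, y))  # 1.0
```

Because only the ordering matters, a perfectly monotonic but nonlinear relationship still gives rho = 1, whereas Pearson's r on the raw values would be below 1.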

Correlation Test Assumptions
Non-parametric correlation tests, Kendall's tau-b and Spearman's rho:
– Ordinal data
– Monotonicity

χ²-Test
The chi-square test is suitable for analyzing nominal and ordinal data. (Interval and ratio data should be grouped first.)
The chi-square test is used for:
– Goodness-of-fit (1-way classification; 1-DV, 1-IV)
– Test for independence (2-way classification; 1-DV, 2+IV)
Categorical data in rows
Ordinal data in columns

χ²-Test Assumptions
– Categorical data
– Every cell's raw frequency > 5
– Random sampling

χ²-Test
PHStat2 / Multiple-Sample Tests / Chi-Square Test
Significance Level: to be entered
Number of Rows: to be entered
Number of Columns: to be entered
If p-value < 0.05, there is a relationship.
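
For readers without PHStat2, the statistic behind the test for independence can be sketched by hand: expected frequencies are (row total × column total) / N, and χ² sums (observed − expected)² / expected over all cells, with (rows − 1)(columns − 1) degrees of freedom. The function name and the 2 × 2 table below are hypothetical, and the sketch assumes every row and column total is positive. It stops at the statistic; the p-value would come from a chi-square table or software.

```python
def chi_square_independence(observed):
    """Chi-square statistic, degrees of freedom, and expected frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency of each cell if rows and columns were independent
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Hypothetical 2 x 2 table of raw frequencies
observed = [[20, 30],
            [30, 20]]
chi2, df, expected = chi_square_independence(observed)
print(chi2, df)  # 4.0 1
# With 1 degree of freedom the 5% critical value is 3.841, so for this
# hypothetical table the p-value is below 0.05.
```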

χ²-Test
The strength of the relationship is measured by Cramér's V:
$$V = \sqrt{\frac{\chi^2}{N\,(k - 1)}}$$
where
N = total number of observations
k = min(#rows, #columns)

χ²-Test
Cramér's V takes a value between 0 and 1, where 0 means independence (no relationship) and 1 means a perfect relationship.
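
A short sketch of the computation, using the standard definition V = sqrt(χ² / (N (k − 1))) with N the total number of observations and k = min(#rows, #columns). The function name and the input values are hypothetical; the χ² value is assumed to have been computed beforehand.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V from a chi-square statistic and the table dimensions."""
    k = min(n_rows, n_cols)
    if k < 2 or n <= 0:
        raise ValueError("need at least a 2 x 2 table and a positive N")
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical result: chi-square of 4.0 from a 2 x 2 table with N = 100
print(round(cramers_v(4.0, 100, 2, 2), 4))    # 0.2
# A 2 x 2 table with all 100 cases on the diagonal gives chi-square = 100
print(round(cramers_v(100.0, 100, 2, 2), 4))  # 1.0
```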

Interpreting the Association
Although there is no theoretical guideline on how to interpret the value of association, here are some guidelines:
Interpret the squared value of the association

1.00 – 0.80    High (strong) association
0.80 – 0.60    Moderately high association
0.60 – 0.40    Moderate association
0.40 – 0.20    Weak association
0.20 – 0.00    Very weak association
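
The guideline can be applied mechanically, as in the sketch below. The function name is hypothetical, and assigning a boundary value such as 0.80 to the higher band is an assumption, since the bands as listed share their end points.

```python
def association_label(coefficient):
    """Label an association by its squared value, following the bands above."""
    squared = coefficient ** 2
    if not 0 <= squared <= 1:
        raise ValueError("coefficient must lie between -1 and 1")
    # (lower bound of the band, label); boundary values go to the higher band
    bands = [
        (0.80, "High (strong)"),
        (0.60, "Moderately high"),
        (0.40, "Moderate"),
        (0.20, "Weak"),
    ]
    for lower, label in bands:
        if squared >= lower:
            return label + " association"
    return "Very weak association"

print(association_label(0.93))   # High (strong) association
print(association_label(-0.5))   # Weak association
```

Note that the sign is ignored: squaring means a coefficient of -0.5 and one of 0.5 receive the same label.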

Interpreting Cramér's V
Although there is no theoretical guideline on how to interpret the value of association, here are some guidelines for Cramér's V:

1.00 – 0.40    Worrisomely high (strong) association
0.40 – 0.35    Very high (strong) association
0.35 – 0.30    High (strong) association
0.30 – 0.25    Moderately high association
0.25 – 0.20    Moderate association
0.20 – 0.10    Weak association
0.10 – 0.00    Very weak association / not acceptable

SPSS – Nominal Association
Categories should be coded first:
Data / Weight Cases / (variable that stands for frequencies)
Analyze / Descriptive Statistics / Crosstabs /

Example: Statistical Relationship
The prices of TRC stock and the S&P 500 index are given on the left.
1. Compute the correlation between the S&P 500 and TRC.
2. Explain the meaning of the result.
Data: St-CE-Ch02-x1-Examples-Slide 60

Example: Statistical Relationship
Analyze / Correlate / Bivariate
Select the variables and move them to the right pane
Select Pearson, Kendall's tau-b, Spearman (2-tailed*; 1-tailed)
OK

Example: Statistical Relationship (cont.)
Data: St-CE-Ch02-x1-Examples-Slide 60

Example: Statistical Relationship (cont.)
The correlation between the Tracway stock price and the S&P 500 index is 0.93.
Explain the meaning of the result.

Meaning: Statistical Relationship (cont.)
A Pearson correlation coefficient of 0.93 indicates that there is a (1) strong (holding most of the time), (2) positive (when one changes, the other changes in the same direction), (3) linear relationship between the S&P 500 index and the Tracway stock price (assuming both are normally distributed).

Next Lesson (Lesson - 03A)
Random Variables & Probability Distribution