Correlation.


Correlation

We now investigate the relationships that can exist among continuous variables. Correlation analysis: correlation is defined as the quantification of the degree to which two random variables are related, provided that the relationship is linear.

17.1 Two-Way Scatter Plot

Suppose that we are interested in a pair of continuous random variables. As an example, consider the relationship between the percentage of children who have been immunized against the infectious diseases diphtheria, pertussis, and tetanus (DPT) and the mortality rate. Data for a random sample of 20 countries are shown in Figure 17.1 (see also Table 17.1).

X: the percentage of children immunized by age one year
Y: the under-five mortality rate

Before we do any analysis, we should create a two-way scatter plot of the data to see whether a relationship exists between x and y. Here, the mortality rate tends to decrease as the percentage of children immunized increases.

17.2 Pearson’s Correlation Coefficient

In the underlying population from which the sample of points (xi, yi) is selected, the population correlation between the variables X and Y is denoted by ρ (the Greek letter rho). It quantifies the strength of the linear relationship between the outcomes x and y. The estimator of ρ is known as Pearson’s coefficient of correlation, or simply the correlation coefficient, and is denoted by r.

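The sample correlation coefficient r is calculated as

$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right) = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\left[\sum_{i=1}^{n}(x_i-\bar{x})^2\right]\left[\sum_{i=1}^{n}(y_i-\bar{y})^2\right]}},$$

where x̄ and ȳ are the sample means and sx and sy are the sample standard deviations of the x and y values.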
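The definition of r can be sketched in a few lines of Python. This is only an illustration: the function name `pearson_r` and the five data pairs are assumptions made up for the example, not the Table 17.1 data.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient from the standardized-product definition."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample standard deviations use the divisor n - 1
    s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    # Average product of the standardized deviations, again with divisor n - 1
    total = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y) for xi, yi in zip(x, y))
    return total / (n - 1)

# Hypothetical values for illustration only (not the Table 17.1 data)
pct_immunized = [20, 40, 60, 80, 95]
mortality = [200, 150, 90, 40, 10]
r = pearson_r(pct_immunized, mortality)
print(round(r, 3))
```

With these made-up values r is strongly negative, in line with the pattern described for the scatter plot.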

The correlation coefficient is a dimensionless number; it has no units of measurement. The values r = 1 and r = -1 occur when there is an exact linear relationship between x and y (Figure 17.2 (a), (b)). If y tends to increase in magnitude as x increases, r is greater than 0, and x and y are said to be positively correlated (r > 0). If y decreases as x increases, r is less than 0, and the two variables are negatively correlated (r < 0). If r = 0, there is no linear relationship between x and y, and the variables are uncorrelated (Figure 17.2 (c), (d)).
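These properties can be checked numerically. The sketch below uses a hypothetical helper `pearson_r` and made-up data: exact increasing and decreasing lines give r = 1 and r = -1, and changing the units of x and y leaves r unchanged.

```python
import math

def pearson_r(x, y):
    """Pearson's r via sums of squares and cross-products."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Exact linear relationships
r_up = pearson_r(x, [2 * a + 1 for a in x])     # increasing line
r_down = pearson_r(x, [10 - 3 * a for a in x])  # decreasing line

# Dimensionless: rescaling either variable (a change of units) does not alter r
y = [2.0, 1.0, 4.0, 3.0, 6.0]
r_original = pearson_r(x, y)
r_rescaled = pearson_r([100 * a for a in x], [b / 1000 for b in y])

print(r_up, r_down, r_original, r_rescaled)
```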

In this sample there is a strong linear relationship and a negative association: the mortality rate decreases in magnitude as the percentage of children immunized increases. The correlation coefficient merely tells us that a linear relationship exists between two variables; it does not specify whether the relationship is cause-and-effect. We would also like to be able to draw conclusions about the unknown population correlation ρ using the sample correlation coefficient r.

H0: ρ = 0 (no association between X and Y)
H1: ρ ≠ 0 (association between X and Y)

The estimated standard error of r is

$$\widehat{\mathrm{se}}(r) = \sqrt{\frac{1-r^2}{n-2}}.$$

The test statistic (under H0) is

$$t = \frac{r-0}{\sqrt{(1-r^2)/(n-2)}} = r\sqrt{\frac{n-2}{1-r^2}}.$$

If we assume that the pairs of observations were obtained randomly and that both X and Y are normally distributed, this statistic has a t distribution with n - 2 degrees of freedom when H0 is true. If ρ is equal to some other value, represented by ρ0, the sampling distribution of r is skewed, and the test statistic no longer follows a t distribution.
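A minimal sketch of the test of H0: ρ = 0 follows. The inputs r = -0.8 and n = 20 are illustrative assumptions, not results from the slides, and the function name is hypothetical. The critical value 2.101 is the two-sided 0.05 cutoff for a t distribution with 18 degrees of freedom, taken from a standard t table.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, referred to a t distribution with n - 2 df."""
    if n <= 2:
        raise ValueError("need more than two pairs of observations")
    if abs(r) >= 1:
        raise ValueError("the statistic is undefined when |r| = 1")
    se_r = math.sqrt((1 - r ** 2) / (n - 2))  # estimated standard error of r
    return r / se_r

# Illustrative inputs (assumed, not taken from the slides)
r, n = -0.8, 20
t = correlation_t_statistic(r, n)

# Two-sided 0.05 critical value for 18 degrees of freedom, from a t table
t_crit = 2.101
reject_h0 = abs(t) > t_crit
print(round(t, 3), reject_h0)
```

The rejection decision relies on the normality and random-sampling assumptions stated above.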

The coefficient of correlation r has several limitations:

1. It quantifies only the strength of the linear relationship between two variables.
2. Care must be taken when the data contain any outliers, or pairs of observations that lie considerably outside the range of the other data points.
3. The estimated correlation should never be extrapolated beyond the observed ranges of the variables; the relationship between X and Y may change outside of this region.
4. A high correlation between two variables does not imply a cause-and-effect relationship.

In short: 1. if the relationship is nonlinear, r cannot detect the association; 2. if several extreme values are present, they may lead to erroneous results; 3. the correlation coefficient cannot be estimated outside the range of the variables.
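The first two limitations are easy to reproduce. In the sketch below, all data are made up for illustration and `pearson_r` is a hypothetical helper. An exact quadratic relationship gives r = 0, and a single extreme pair turns a weak correlation into one close to 1.

```python
import math

def pearson_r(x, y):
    """Pearson's r via sums of squares and cross-products."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Limitation 1: y is completely determined by x, but not linearly
x_sym = [-3, -2, -1, 0, 1, 2, 3]
r_quadratic = pearson_r(x_sym, [a ** 2 for a in x_sym])

# Limitation 2: one outlying pair dominates the estimate
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_without_outlier = pearson_r(x, y)
r_with_outlier = pearson_r(x + [30], y + [30])

print(r_quadratic, round(r_without_outlier, 3), round(r_with_outlier, 3))
```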

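17.3 Spearman’s Rank Correlation Coefficient

Pearson’s correlation coefficient is very sensitive to outlying values, so we may be interested in calculating a measure of association that is more robust. One approach is to rank the two sets of outcomes x and y separately and calculate the correlation between the ranks. The result is known as Spearman’s rank correlation coefficient, denoted rs; it is a nonparametric method.

$$r_s = \frac{\sum_{i=1}^{n}(x_{ri}-\bar{x}_r)(y_{ri}-\bar{y}_r)}{\sqrt{\left[\sum_{i=1}^{n}(x_{ri}-\bar{x}_r)^2\right]\left[\sum_{i=1}^{n}(y_{ri}-\bar{y}_r)^2\right]}},$$

where xri and yri are the ranks associated with the ith subject rather than the actual observations, and x̄r and ȳr are the mean ranks.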

An equivalent method for computing rs is provided by

$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2-1)},$$

where n is the number of data points in the sample and di is the difference between the rank of xi and the rank of yi.

Like Pearson’s r, -1 ≤ rs ≤ 1. Values of rs close to -1 or 1 indicate a high degree of correlation between x and y; values close to 0 indicate a lack of linear association between the two variables. When the data are ordinal, or when the conditions required for Pearson’s r do not hold, we should use rs.
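The two ways of computing rs can be compared directly. The sketch below assumes made-up data with no tied values; the shortcut based on di agrees exactly with the correlation of the ranks only when there are no ties. The helper names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank from 1 (smallest) to n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_from_ranks(x, y):
    """Pearson correlation applied to the ranks of x and y."""
    xr, yr = average_ranks(x), average_ranks(y)
    n = len(xr)
    xr_bar, yr_bar = sum(xr) / n, sum(yr) / n
    sxy = sum((a - xr_bar) * (b - yr_bar) for a, b in zip(xr, yr))
    sxx = sum((a - xr_bar) ** 2 for a in xr)
    syy = sum((b - yr_bar) ** 2 for b in yr)
    return sxy / math.sqrt(sxx * syy)

def spearman_from_differences(x, y):
    """Shortcut formula 1 - 6 * sum(d_i^2) / (n (n^2 - 1)); assumes no ties."""
    xr, yr = average_ranks(x), average_ranks(y)
    n = len(xr)
    d_squared = sum((a - b) ** 2 for a, b in zip(xr, yr))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data without ties (not the Table 17.1 data)
x = [15, 30, 45, 60, 75, 90]
y = [210, 120, 150, 40, 55, 5]
rs_ranks = spearman_from_ranks(x, y)
rs_diff = spearman_from_differences(x, y)
print(round(rs_ranks, 4), round(rs_diff, 4))
```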

Spearman’s rank correlation coefficient may also be thought of as a measure of the concordance (agreement) of the ranks for the outcomes x and y.

Case I

Nation          Percentage immunized  Rank  Mortality rate  Rank  di
Ethiopia        13                    1     6
Cambodia        32                    2     7
Senegal         47                    3     8
…
Czech Republic  99                    20    208

Case II

Nation          Percentage immunized  Rank  Mortality rate  Rank  di
Ethiopia        13                          208             20    -19
Cambodia        32                    2     184             19    -17
Senegal         47                    3     145             18    -15
…
Czech Republic  99                          6
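The two cases can be checked with the rank-difference formula. This sketch assumes that Case I ranks the 20 countries in the same order on both variables and that Case II ranks them in exactly opposite order, which is what the differences -19, -17, -15 suggest. The function name is hypothetical.

```python
def spearman_from_rank_pairs(x_ranks, y_ranks):
    """r_s = 1 - 6 * sum(d_i^2) / (n (n^2 - 1)) for untied ranks."""
    n = len(x_ranks)
    d_squared = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

n = 20
x_ranks = list(range(1, n + 1))

# Assumed Case I: identical ordering, so every d_i is 0
same_order = list(x_ranks)
rs_same = spearman_from_rank_pairs(x_ranks, same_order)

# Assumed Case II: exactly reversed ordering
reverse_order = list(reversed(x_ranks))
d_reverse = [a - b for a, b in zip(x_ranks, reverse_order)]
rs_reverse = spearman_from_rank_pairs(x_ranks, reverse_order)

print(rs_same, rs_reverse, d_reverse[:3])
```

Under these assumptions, perfect agreement of the ranks gives rs = 1 and perfect disagreement gives rs = -1.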

If n is not too small and if we can assume that the pairs of ranks are chosen randomly, we can test the null hypothesis H0: ρ = 0. The test statistic is

$$t_s = r_s\sqrt{\frac{n-2}{1-r_s^2}},$$

which is compared with a t distribution with n - 2 degrees of freedom. This testing procedure does not require that X and Y be normally distributed.

About rs: it is much less sensitive to outlying values than Pearson’s correlation coefficient; it can be used when one or both of the relevant variables are ordinal; and it relies on ranks rather than on actual observations.
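The rank-based test statistic can be sketched in the same way as the one for Pearson’s r. The values rs = -0.5 and n = 20 are assumptions chosen for illustration, the function name is hypothetical, and 2.101 is the two-sided 0.05 critical value of a t distribution with 18 degrees of freedom from a standard t table.

```python
import math

def spearman_t_statistic(r_s, n):
    """t_s = r_s * sqrt((n - 2) / (1 - r_s^2)), referred to t with n - 2 df."""
    if n <= 2:
        raise ValueError("need more than two pairs of ranks")
    if abs(r_s) >= 1:
        raise ValueError("the statistic is undefined when |r_s| = 1")
    return r_s * math.sqrt((n - 2) / (1 - r_s ** 2))

# Illustrative inputs (assumed, not taken from the slides)
r_s, n = -0.5, 20
t_s = spearman_t_statistic(r_s, n)

t_crit = 2.101  # two-sided 0.05 critical value, 18 degrees of freedom
reject_h0 = abs(t_s) > t_crit
print(round(t_s, 3), reject_h0)
```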