Principles of Biostatistics, Chapter 17: Correlation. Yu Chuanhua (宇传华), Free Online Statistics Resources (8)



Terminology
scatter plot (散点图)
correlation (相关)
linear correlation (直线相关)
correlation coefficient (相关系数)
Pearson's correlation coefficient (Pearson 相关系数)
Spearman's rank correlation coefficient (Spearman 等级相关系数)

CONTENTS
§17.1 The Two-Way Scatter Plot
§17.2 Pearson's Correlation Coefficient: r
§17.3 Spearman's Correlation Coefficient: r_s
§17.4 Further Application

Correlation (coefficient)
The correlation between two random variables, X and Y, is a measure of the degree of linear association between the two variables. The population correlation is denoted by the Greek letter ρ (rho). The sample correlation is denoted by the Latin letter r. ρ (r) can take on any value from -1 to 1.
The absolute value of ρ indicates the strength of the relationship:
ρ = -1 indicates a perfect negative linear relationship
ρ = 1 indicates a perfect positive linear relationship
ρ = 0 indicates no linear relationship
The sign of ρ indicates the direction of the relationship:
-1 < ρ < 0 indicates a negative linear relationship
0 < ρ < 1 indicates a positive linear relationship

Before conducting a correlation analysis, we should always create a two-way scatter plot (scatter diagram). The X variable is plotted on the horizontal axis and the Y variable on the vertical axis; each point on the graph represents a pair of values (X_i, Y_i). From the scatter plot, we can often determine whether a linear relationship exists between X and Y. One statistical technique often used to measure such an association is correlation analysis.

§17.1 The Two-Way Scatter Plot
Table: Relationship between thrombin concentration (X) and clotting time (Y)

Scatter Plot

Figure panels: perfect positive correlation (r = 1); strong positive correlation (r = 0.99); positive correlation (r = 0.80); strong negative correlation; no correlation (r = 0.00); non-linear correlation.

The importance of a scatter plot
In the next chapter (simple linear regression), we also need a scatter plot to see whether the relationship between X and Y is linear and, if so, whether it is positive. So, before any correlation or regression analysis, we should make a scatter plot.

§17.2 Pearson's Correlation Coefficient (r)
Synonyms: product-moment correlation coefficient; simple linear correlation coefficient.
Definition: r is a statistical index that describes the intensity (strength) and the direction of the association between two variables (X, Y). r is a dimensionless number; it has no units of measurement. -1 ≤ r ≤ 1

X, Y: random variables that follow a bivariate normal distribution. Both X_i and Y_i are measured on the same subject i.

How do we calculate r?
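The formula on this slide is not preserved in the transcript. As a hedged reconstruction, the standard product-moment definition, written in the l_XX, l_YY, l_XY notation that the following slides use, is:

```latex
\[
l_{XX} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n},
\qquad
l_{YY} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}
\]
\[
l_{XY} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n},
\qquad
r = \frac{l_{XY}}{\sqrt{l_{XX}\, l_{YY}}}
\]
```

The right-hand forms use only the column sums Σx, Σy, Σx², Σy² and Σxy, which is why the worksheet table carries those columns.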

Worksheet columns: subject i; thrombin concentration x (u/ml); clotting time y (seconds); x²; y²; x×y. Sum row: Σx, Σy, Σx², Σy², Σxy.

Calculation of r: l_XX = 0.404, l_YY = …, l_XY = …
X, Y: strong negative relationship
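A minimal sketch of the worksheet calculation, assuming the usual sums-of-squares shortcut (l_XX = Σx² - (Σx)²/n, and likewise for l_YY and l_XY). The function name `pearson_from_sums` and the five data pairs are hypothetical; they are not the textbook's thrombin data, which are not preserved here.

```python
from math import sqrt

def pearson_from_sums(x, y):
    """Pearson's r computed from the column sums of the worksheet table."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    # Corrected sums of squares and cross-products
    l_xx = sum_x2 - sum_x ** 2 / n
    l_yy = sum_y2 - sum_y ** 2 / n
    l_xy = sum_xy - sum_x * sum_y / n
    return l_xy / sqrt(l_xx * l_yy)

# Hypothetical illustration only (NOT the textbook data):
# concentration in u/ml, clotting time in seconds.
x = [0.8, 0.9, 1.0, 1.1, 1.2]
y = [16, 15, 14, 14, 12]
r = pearson_from_sums(x, y)
print(f"r = {r:.3f}")
```

Because r is dimensionless, rescaling either variable (for example, expressing time in minutes) should leave the result unchanged.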

Inference about the correlation coefficient r: hypothesis test
1) State the hypotheses and choose the significance level α
H0: ρ = 0, no linear association between X and Y
H1: ρ ≠ 0, a linear association between X and Y exists
α = 0.05 (two-sided probability of a type I error)

2) Calculate the test statistic: $t = \dfrac{r}{\sqrt{(1 - r^2)/(n - 2)}}$, with $\nu = n - 2$ degrees of freedom. For the above example, ν = 15 - 2 = 13. From the t distribution table (Table A4, Appendix), the critical value is t_{0.05/2(13)} = 2.160 < |t| = 8.874, so P < 0.05. The correlation coefficient is statistically significant at α = 0.05: thrombin concentration and clotting time are negatively related.
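A short sketch of this test, assuming the usual statistic t = r·sqrt(n - 2)/sqrt(1 - r²) with n - 2 degrees of freedom. The slide reports |t| = 8.874 and the critical value 2.160 but the value of r is not preserved, so the code back-calculates the r implied by t; that implied value is an inference, not a figure from the source. The helper names are hypothetical.

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

def r_from_t(t, df):
    """Invert the formula: the r that produces a given t at df = n - 2."""
    return t / sqrt(t ** 2 + df)

n = 15
t_reported = -8.874   # slide gives |t| = 8.874; the sign is negative because the relationship is negative
t_critical = 2.160    # two-sided 5% critical value for 13 degrees of freedom, from the slide

r_implied = r_from_t(t_reported, n - 2)
print(f"implied r = {r_implied:.3f}")
print("reject H0:", abs(t_reported) > t_critical)
```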

§17.3 Spearman's Rank Correlation Coefficient: r_s
(In Chinese: Spearman 等级相关系数; "rank" may be translated as 秩 or 等级.)
Spearman's rank correlation (a nonparametric method) is applied when the two variables are distributed far from normal, i.e. when the normality requirement is not satisfied.

The steps of the hypothesis test
1. Rank the values of each of the two variables by magnitude: (X_i, Y_i) → (X_ir, Y_ir).
2. Calculate Spearman's rank correlation coefficient from the ranks.

Table: Degree of hemorrhage and platelet (thrombocyte) counts (10^9/L) in 12 children with acute leukemia
Columns: (1) patient i; (2) platelet count X_i; (3) rank X_ir; (4) (X_ir)²; (5) bleeding grade Y_i; (6) rank Y_ir; (7) (Y_ir)²; (8) X_ir × Y_ir; last row: total.
For tied (equal) values, the mean rank is used instead. Six patients have bleeding grade "–": mean rank = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5

Calculation of r_s (numerical values are from the table above, columns (1) to (8) and their totals)

Because there are tied ranks in Y, we cannot use the latter formula.
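The formulas themselves are not preserved in this transcript. A common approach when ties are present, sketched here under that assumption, is to assign mean ranks to tied values and then apply the Pearson formula to the ranks; the "latter formula" is presumably the shortcut 1 - 6Σd²/(n(n² - 1)), which assumes no ties. All function names and the data below are hypothetical, with bleeding grades coded 0 for "–".

```python
from math import sqrt

def average_ranks(values):
    """Rank from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + 1 + end + 1) / 2  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    l_xx = sum((a - mean_x) ** 2 for a in x)
    l_yy = sum((b - mean_y) ** 2 for b in y)
    l_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return l_xy / sqrt(l_xx * l_yy)

def spearman_rs(x, y):
    """Spearman's r_s as Pearson's r computed on mean ranks (valid with ties)."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical illustration only (NOT the textbook data).
platelets = [12, 14, 17, 31, 43, 54, 74, 106, 126, 123, 144, 200]
bleeding = [3, 3, 2, 2, 1, 1, 0, 0, 0, 0, 0, 0]  # six patients with grade "–" coded as 0
print(average_ranks(bleeding))
print(spearman_rs(platelets, bleeding))
```

The six tied zeros each receive rank 3.5, matching the mean-rank rule stated on the table slide.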

Explanation of Spearman's rank correlation coefficient: r_s
(1) -1 ≤ r_s ≤ 1, with a meaning similar to that of r.
(2) Difference between r_s and r (r_s ≠ r): r_s is calculated from ranks, whereas r is calculated from the original data values.

Statistical inference about r_s
1) Set up the hypotheses and choose the significance level: H0: ρ_s = 0, H1: ρ_s ≠ 0, α = 0.05
2) Calculate the test statistic
3) Conclusion: no association between platelet count and bleeding.

Notes on application
1. r = 0 does not mean that there is no association (the relationship might be non-linear).
(Figure: three scatter plots of Y against X for which H0: ρ = 0 holds.)
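A small illustration of this point, using made-up numbers: Y is an exact function of X (a parabola), yet the linear correlation is zero. The helper `pearson_r` is a plain implementation of the product-moment formula and is not taken from the source.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    l_xx = sum((a - mean_x) ** 2 for a in x)
    l_yy = sum((b - mean_y) ** 2 for b in y)
    l_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return l_xy / sqrt(l_xx * l_yy)

# Hypothetical data: y depends perfectly on x, but not linearly.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]
r = pearson_r(x, y)
print(r)
```

The positive and negative cross-products cancel exactly, which is why a scatter plot is needed before reading r as "no relationship".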

2. When the levels of either variable X or Y are artificially selected, Pearson's correlation analysis is not suitable (but Spearman's rank correlation analysis can be used). Pearson's correlation analysis requires that both X and Y follow a normal distribution.

3. Outliers can heavily affect the correlation coefficient.
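To make the effect of an outlier concrete, the sketch below compares r with and without a single extreme point. The data are invented for illustration, and `pearson_r` is a plain implementation of the product-moment formula rather than anything from the source.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    l_xx = sum((a - mean_x) ** 2 for a in x)
    l_yy = sum((b - mean_y) ** 2 for b in y)
    l_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return l_xy / sqrt(l_xx * l_yy)

# Hypothetical data with a weak negative association.
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 2, 3, 1, 2]
r_without = pearson_r(x, y)

# Add one extreme point far from the rest of the cloud.
r_with = pearson_r(x + [20], y + [20])

print(f"without outlier: {r_without:.2f}")
print(f"with outlier:    {r_with:.2f}")
```

In this made-up case a single point flips a weak negative correlation into a strong positive one.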

4. Correlation ≠ cause-effect association; correlation ≠ intrinsic association.
5. Statistical significance (P value) ≠ intensity of correlation (absolute value of r). A statistically significant correlation coefficient means that the probability of obtaining such an r when ρ = 0 is small (the P value is small). The intensity of the correlation is measured by the absolute value of r.
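Point 5 can be illustrated with the t statistic for H0: ρ = 0, assuming the usual form t = r·sqrt(n - 2)/sqrt(1 - r²). The two (r, n) pairs below are hypothetical: a weak correlation in a large sample turns out significant, while a stronger one in a small sample does not. The critical values are the standard two-sided 5% points (about 1.962 for 998 degrees of freedom and 2.306 for 8).

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Hypothetical cases, chosen only to contrast significance with strength.
t_weak_large = t_from_r(0.10, 1000)   # weak correlation, large sample
t_moderate_small = t_from_r(0.50, 10)  # moderate correlation, small sample

print(f"r = 0.10, n = 1000: t = {t_weak_large:.2f}, significant: {t_weak_large > 1.962}")
print(f"r = 0.50, n = 10:   t = {t_moderate_small:.2f}, significant: {t_moderate_small > 2.306}")
```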

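§17.4 Further Application
SAS code for the textbook's Table 17.1 and Table 17.2

/* Data lines (one X Y pair per line) go between CARDS; and the closing semicolon. */
DATA EXP17_12;
  INPUT X Y;
  CARDS;
;
/* Request both Pearson and Spearman correlation coefficients for X and Y. */
PROC CORR PEARSON SPEARMAN;
  VAR X Y;
RUN;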

The CORR Procedure
2 Variables: X Y
Simple Statistics: Variable, N, Mean, Std Dev, Median, Minimum, Maximum (values for X and Y not shown)
Pearson Correlation Coefficients, N = 20; Prob > |r| under H0: Rho=0: <.0001
Spearman Correlation Coefficients, N = 20; Prob > |r| under H0: Rho=0 (values not shown)

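SAS code for the textbook's Table 17.3 and Table 17.4

/* Data lines (one X Y pair per line) go between CARDS; and the closing semicolon. */
DATA EXP17_34;
  INPUT X Y;
  CARDS;
;
/* Request both Pearson and Spearman correlation coefficients for X and Y. */
PROC CORR PEARSON SPEARMAN;
  VAR X Y;
RUN;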

The CORR Procedure
2 Variables: X Y
Simple Statistics: Variable, N, Mean, Std Dev, Median, Minimum, Maximum (values for X and Y not shown)
Pearson Correlation Coefficients, N = 20; Prob > |r| under H0: Rho=0: <.0001
Spearman Correlation Coefficients, N = 20; Prob > |r| under H0: Rho=0: <.0001

SUMMARY
1. Simple linear correlation coefficient: r. Condition: both X and Y follow a normal distribution.
2. Spearman's rank correlation coefficient: r_s. It does not require that X or Y follow a normal distribution.

Assignment: Review Exercises 5 (p. 412)