17 Correlation

Presentation transcript:

17 Correlation

Chapter17 p399

Semimetric distance – Pearson correlation coefficient or Covariance
How about higher-dimensional data? It is useful to have a similar measure of how much the dimensions vary from their means with respect to each other.
Covariance is measured between two dimensions. For a 3-dimensional data set (X, Y, Z), one can calculate Cov(X,Y), Cov(X,Z) and Cov(Y,Z).
To compare heterogeneous pairs of variables, define the correlation coefficient (Pearson correlation coefficient) as the covariance divided by the product of the two standard deviations, rXY = Cov(X,Y) / (sX sY), so that -1 ≦ rXY ≦ 1.
-1 → perfect anticorrelation
0 → no linear correlation
+1 → perfect correlation
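A minimal sketch of the covariance and the Pearson coefficient described above, in plain Python. The n - 1 (sample) denominator, the function names and the toy data are assumptions made for illustration; the slide does not specify them. The choice of denominator cancels in rXY.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 denominator, assumed)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def pearson(x, y):
    """Covariance scaled by both standard deviations; lies in [-1, 1]."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical 3-dimensional data set (X, Y, Z)
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 4.0, 6.0, 8.0]   # Y = 2X: perfect correlation
Z = [8.0, 6.0, 4.0, 2.0]   # Z = 10 - 2X: perfect anticorrelation

r_xy = pearson(X, Y)
r_xz = pearson(X, Z)
print(covariance(X, Y), covariance(X, Z), covariance(Y, Z))
print(r_xy, r_xz)
```

Cov(X,Y) and Cov(X,Z) depend on the units of the variables, while rXY and rXZ are close to +1 and -1 whatever the scale, which is why the normalised coefficient is used to compare heterogeneous pairs.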

Semimetric distance – the squared Pearson correlation coefficient
The Pearson correlation coefficient is useful for examining correlations in the data, but its sign separates positive from negative relationships. One may imagine an instance, for example, in which the same TF (transcription factor) can cause both enhancement and repression of expression. In that case a better alternative is the squared Pearson correlation coefficient (pcc), rsq = rXY², which takes values in the range 0 ≦ rsq ≦ 1.
0 → uncorrelated vectors
1 → perfectly correlated or anti-correlated
Both the pcc and the squared pcc are measures of similarity. Similarity and distance have an inverse relationship: similarity↑ → distance↓. d = 1 – r is typically used as a measure of distance.
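The difference between the two distances can be sketched as follows. The profile names and values are hypothetical, chosen so that one profile rises with the regulator and the other is its mirror image.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def distance_r(x, y):
    """d = 1 - r: anticorrelated profiles end up far apart."""
    return 1.0 - pearson(x, y)


def distance_rsq(x, y):
    """d = 1 - r^2: correlated and anticorrelated profiles are both close."""
    return 1.0 - pearson(x, y) ** 2


# Hypothetical expression profiles
regulator = [1.0, 2.0, 3.0, 4.0, 5.0]
activated = [2.1, 3.9, 6.2, 8.0, 9.8]   # rises with the regulator
repressed = [9.8, 8.0, 6.2, 3.9, 2.1]   # falls as the regulator rises

print(distance_r(regulator, activated), distance_r(regulator, repressed))
print(distance_rsq(regulator, activated), distance_rsq(regulator, repressed))
```

With d = 1 - r the repressed profile sits near the maximum distance of 2, while with d = 1 - rsq both targets are almost equally close to the regulator.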

Semimetric distance – Pearson correlation coefficient or Covariance
The resulting rXY value will be larger than 0 if X and Y tend to increase together, below 0 if one tends to decrease as the other increases, and 0 if they are independent.
Remark: rXY only tests whether there is a linear dependence, Y = aX + b.
If two variables are independent → low rXY.
A low rXY may or may not → independent; the relation may be non-linear.
A high rXY is a sufficient but not necessary condition for variable dependence.
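The remark that a low rXY does not imply independence can be illustrated with a small example. The data are made up: y_parabola is completely determined by x, yet its linear correlation with x vanishes because the x values are symmetric about zero.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y_linear = [2.0 * v + 1.0 for v in x]   # Y = aX + b with a = 2, b = 1
y_parabola = [v * v for v in x]         # Y = X^2: dependent but not linear

r_linear = pearson(x, y_linear)
r_parabola = pearson(x, y_parabola)
print(r_linear, r_parabola)
```

Here r_linear is essentially 1 and r_parabola is 0, even though both y series are exact functions of x.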

Semimetric distance – the squared Pearson correlation coefficient
To test for a non-linear relation among the data, one can transform it into a linear one by variable substitution.
Suppose one wants to test the relation u(v) = a v^n.
Take the logarithm of both sides: log u = log a + n log v.
Set Y = log u, b = log a, and X = log v → a linear relation, Y = b + nX.
→ log u correlates (n > 0) or anti-correlates (n < 0) with log v.
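A sketch of the substitution, assuming noise-free synthetic data generated with a = 3 and n = 1.5 (both values are arbitrary). The least-squares line fit used to recover n and b is an addition for illustration and is not part of the slide.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((p - mx) * (q - my) for p, q in zip(x, y)) / sum((p - mx) ** 2 for p in x)
    return slope, my - slope * mx


v = [1.0, 2.0, 4.0, 8.0, 16.0]
u = [3.0 * val ** 1.5 for val in v]   # u(v) = a v^n with a = 3, n = 1.5 (assumed)

X = [math.log(val) for val in v]      # X = log v
Y = [math.log(val) for val in u]      # Y = log u

n_est, b_est = fit_line(X, Y)         # slope estimates n, intercept estimates b = log a
a_est = math.exp(b_est)
r_raw = pearson(v, u)                 # below 1: the raw relation is curved
r_log = pearson(X, Y)                 # essentially 1: linear after the substitution
print(n_est, a_est, r_raw, r_log)
```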

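Semimetric distance – Pearson correlation coefficient or Covariance matrix
A covariance matrix is simply the collection of all pairwise covariances arranged in a d × d matrix, where the entry in row i and column j is the covariance between dimensions i and j. The diagonal entries are the variances, and the matrix is symmetric.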
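A small sketch of how such a matrix could be assembled for d = 3, using plain Python lists. The n - 1 denominator and the data are assumptions for illustration.

```python
def covariance_matrix(columns):
    """Return the d x d matrix of sample covariances.

    columns is a list of d equal-length lists, one per dimension.
    """
    d = len(columns)
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    return [
        [
            sum((columns[i][k] - means[i]) * (columns[j][k] - means[j]) for k in range(n)) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


# Hypothetical 3-dimensional data set (X, Y, Z)
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 4.0, 6.0, 8.0]
Z = [4.0, 3.0, 2.0, 1.0]

C = covariance_matrix([X, Y, Z])
for row in C:
    print(row)
```

C[0][1] is Cov(X,Y), C[0][2] is Cov(X,Z) and C[1][2] is Cov(Y,Z); each off-diagonal value appears twice because Cov(X,Y) = Cov(Y,X).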

Spearman’s rank correlation (SRC)
One of the problems with using the PCC is that it is susceptible to being skewed by outliers: a single data point can result in two genes appearing to be correlated, even when all the other data points suggest that they are not.
Spearman’s rank correlation (SRC) is a non-parametric measure of correlation that is robust to outliers. It ignores the magnitude of the changes.
The idea of the rank correlation is to transform the original values into ranks, and then to compute the correlation between the two series of ranks.
First we sort the values of gene A and of gene B in ascending order, and assign rank 1 to the lowest value.
The SRC between A and B is defined as the PCC between ranked A and ranked B.
In case of ties, assign mid-ranks: if two values tie for ranks 5 and 6, both are assigned a rank of 5.5.
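The ranking step with mid-ranks can be sketched as below. The function name and the sample values are illustrative only.

```python
def midranks(values):
    """Rank values in ascending order (lowest = 1); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend end to cover the whole run of values equal to the one at pos
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1   # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# The two values of 0.9 compete for ranks 5 and 6, so each receives 5.5
sample = [0.1, 0.9, 0.3, 0.2, 0.9, 0.4, 1.5]
print(midranks(sample))   # [1.0, 5.5, 3.0, 2.0, 5.5, 4.0, 7.0]
```

The SRC is then the PCC of midranks(A) and midranks(B).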

Spearman’s rank correlation
The SRC can be calculated by the following formula, where xi and yi denote the ranks of x and y respectively and n is the number of data points:
SRC = 1 - 6 Σ (xi - yi)² / (n (n² - 1))
In case of ties, an approximate formula is used.

SRC vs. PCC
PCC(A, B) = 0.633, SRC(A, B) = -0.086

Time   Gene A ratio   Gene B ratio   Gene A rank   Gene B rank
0.5    -0.76359       -4.05957       1             1
2      2.276659       -1.7788        6             2
5      2.137332       -0.97433       5             4
7      1.900334       -1.44114       4             3
9      0.932457       -0.87574       3             5
11     0.761866       -0.52328       2             6
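The two coefficients quoted for genes A and B can be checked from the six pairs of ratios listed above. This is a sketch: the helper names are illustrative, and the rank-difference formula 1 - 6 Σ d² / (n (n² - 1)) is assumed for the SRC, which is valid here because neither gene has tied values.

```python
import math

# Expression ratios of gene A and gene B, in time order, taken from the table
gene_a = [-0.76359, 2.276659, 2.137332, 1.900334, 0.932457, 0.761866]
gene_b = [-4.05957, -1.7788, -0.97433, -1.44114, -0.87574, -0.52328]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ascending ranks starting at 1; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(x, y):
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


pcc = pearson(gene_a, gene_b)
src = spearman_no_ties(gene_a, gene_b)
print(ranks(gene_a), ranks(gene_b))   # [1, 6, 5, 4, 3, 2] [1, 2, 4, 3, 5, 6]
print(round(pcc, 3), round(src, 3))   # 0.633 -0.086
```

The first time point, where both ratios are far below the rest, dominates the PCC. Once the values are replaced by ranks that point carries no extra weight and the apparent correlation disappears.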

Chapter17 p401

Chapter17 p408