Finding associated genes in large collections of microarrays


Produce hypotheses of functional relations between genes. Positive correlation: co-regulated genes or a positive modulator. Negative correlation: co-regulated genes or an inhibitor. Used to derive networks of gene interactions.

Four simple ways of finding association: Pearson correlation coefficient. Spearman’s rank correlation coefficient. Probabilistic approach (Present/Absent). Mutual information (Present/Absent).

Pearson correlation coefficient. Varies between -1 and 1. Between 0.6 and 1: strong positive correlation. Between -0.6 and -1: strong negative correlation. -1 is a perfect negative correlation; 1 is a perfect positive correlation. Assumes a linear relation between the variables.

Pearson correlation coefficient Step 1: Prepare data. Step 2: Compute Pearson coefficient between pairs of probes of interest. Step 3: Assess significance. Step 4: Multiple testing correction.

Pearson correlation coefficient. Step 1: Prepare data. Chips are normalized with MAS 5.0 or another procedure. Scale the probes in each chip by dividing by the chip mean. Center and standardize each probe's distribution (z-scores).
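The preparation step can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: the function names and the toy intensities are hypothetical, the chips are assumed to be already normalized, and the z-scores use the sample standard deviation (n - 1).

```python
from statistics import mean, stdev

def scale_chips(chips):
    """Divide every probe value on a chip by that chip's mean intensity."""
    scaled = []
    for chip in chips:
        chip_mean = mean(chip)
        scaled.append([v / chip_mean for v in chip])
    return scaled

def zscore_probes(chips):
    """Return one z-score vector per probe (rows = probes, columns = chips)."""
    probes = list(zip(*chips))  # transpose: chips x probes -> probes x chips
    out = []
    for values in probes:
        m, s = mean(values), stdev(values)  # sample SD; fails if a probe is constant
        out.append([(v - m) / s for v in values])
    return out

# Hypothetical toy data: 4 chips with 3 probes each (assumed already normalized).
chips = [
    [100.0, 200.0, 300.0],
    [120.0, 180.0, 330.0],
    [90.0, 260.0, 250.0],
    [150.0, 210.0, 420.0],
]
z = zscore_probes(scale_chips(chips))
```

Each entry of `z` is then a centered, unit-variance profile of one probe across the chips.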

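Pearson correlation coefficient. Step 2: Compute the Pearson coefficient between pairs of probes. When z-scores are pre-computed, the coefficient is the sum over chips of the product of the two probes' z-scores, divided by n - 1 (or by n if the z-scores were computed with the population standard deviation), where n is the number of chips.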
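A small sketch of this step, assuming the z-scores were computed with the sample standard deviation so that the divisor is n - 1. The helper names and the example vectors are illustrative only.

```python
from statistics import mean, stdev

def zscores(values):
    """Center and standardize one probe's values (sample SD, n - 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def pearson_from_z(zx, zy):
    """Pearson coefficient from two pre-computed z-score vectors."""
    n = len(zx)
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Hypothetical probe profiles over 5 chips.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]   # increases linearly with x
w = [5.0, 4.0, 3.0, 2.0, 1.0]    # decreases linearly with x

r_xy = pearson_from_z(zscores(x), zscores(y))
r_xw = pearson_from_z(zscores(x), zscores(w))
```

Pre-computing the z-scores once per probe means each pairwise coefficient costs only one dot product, which matters when many probe pairs are compared.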

Pearson correlation coefficient. Step 3: Assess significance. Randomize if possible (good for fewer than 20 chips), or use Student's t distribution with n - 2 degrees of freedom: $t = \rho \sqrt{(n - 2)/(1 - \rho^2)}$, where ρ is the correlation coefficient and n is the number of chips.
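Both options can be sketched as below. "Randomize" is interpreted here as a permutation test that shuffles one probe's values across chips, which is an assumption about what the slide intends. The function names, the number of permutations, and the toy data are hypothetical. The t statistic would still need to be compared against a Student's t table or library routine with n - 2 degrees of freedom.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)); undefined when |r| = 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids reporting p = 0

# Hypothetical profiles of two probes over 8 chips.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.4]
r = pearson(x, y)
t = t_statistic(r, len(x))
p_perm = permutation_p_value(x, y)
```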

Pearson correlation coefficient Step 4: Multiple testing correction
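The slides do not name a specific correction procedure. As an illustration only, the sketch below shows two common choices, Bonferroni and Benjamini-Hochberg, applied to a hypothetical list of p-values. Which one is appropriate depends on whether family-wise error or false discovery rate should be controlled.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values from five probe-pair tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
p_bonf = bonferroni(p_values)
p_bh = benjamini_hochberg(p_values)
```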

Spearman’s rank correlation coefficient. Nonparametric method: less power but more robust. Does not assume a normal distribution. Also varies between -1 and 1.

Spearman’s rank correlation coefficient. Step 1: Prepare data. Step 2: Compute Spearman’s rank correlation coefficient between the probe of interest and the rest. Step 3: Assess significance. Step 4: Multiple testing correction.

Spearman’s rank correlation coefficient Step 1: Prepare data: Same as Pearson. Order the values of the probes by increasing hybridization values. Construct the rank vectors.

Spearman’s rank correlation coefficient. Step 2: Compute the coefficient between probe sets of interest: $\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$, where $d_i$ is the difference between the ranks of the two probes on chip i and n is the number of chips.
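The rank construction and the coefficient can be sketched together. The function names and example vectors are illustrative. Ties are given the average rank, which is a common convention rather than something the slides specify, and the rank-difference formula is exact only when there are no ties.

```python
def ranks(values):
    """Rank values in increasing order (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman_rho(x, y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), with d the rank differences."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical hybridization values over 5 chips.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]   # monotonic but non-linear in x
w = [5.0, 4.0, 3.0, 2.0, 1.0]     # strictly decreasing

rho_xy = spearman_rho(x, y)
rho_xw = spearman_rho(x, w)
```

Because only ranks are used, the non-linear but monotonic pair `x`, `y` is treated the same as a perfectly linear one.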

Spearman’s rank correlation coefficient. Step 3: Assess significance, same as for Pearson. Randomize if possible (good for fewer than 20 chips), or use Student's t distribution with n - 2 degrees of freedom: $t = \rho \sqrt{(n - 2)/(1 - \rho^2)}$, where ρ is the correlation coefficient and n is the number of chips.

Spearman’s rank correlation coefficient Step 4: Multiple testing correction.

Binary probabilistic approach based on Present/Absent. Approach adapted from: Claverie JM. “Computational methods for the identification of differential and coordinated gene expression.” Hum Mol Genet. 1999;8(10):1821-32. Use MAS 5.0 calls of Present-Marginal-Absent for each probe. Good for heterogeneous microarray collections.

Binary approach based on Present/Absent. Step 1: Prepare data. Step 2: Compute the p-value of the number of observed matches. Step 3: Multiple testing correction.

Binary approach based on Present/Absent. Step 1: Obtain P/M/A calls for probes. Each call is associated with a p-value, so a filter can be applied. Codify P/M/A calls as binary vectors: encode P as 1 and M/A as 0.
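A minimal sketch of the encoding, assuming the calls arrive as a list of the letters "P", "M" and "A". The function name and the `marginal_as_present` flag are hypothetical; the flag covers the alternative encoding (P/M as 1, A as 0) that the mutual information slides also allow.

```python
def encode_calls(calls, marginal_as_present=False):
    """Turn a list of 'P'/'M'/'A' detection calls into a 0/1 vector."""
    present = {"P", "M"} if marginal_as_present else {"P"}
    return [1 if call in present else 0 for call in calls]

# Hypothetical calls for one probe across 6 chips.
calls = ["P", "P", "A", "M", "A", "P"]
strict = encode_calls(calls)                             # P -> 1, M/A -> 0
lenient = encode_calls(calls, marginal_as_present=True)  # P/M -> 1, A -> 0
```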

Binary approach based on Present/Absent. Step 2: Compute the p-value of the number of matches.
probe x: 1 1 0 0 0 1 1 0 1 0 0 0
probe y: 1 1 0 0 0 0 1 0 1 0 0 0
probe z: 0 0 1 1 1 1 0 0 0 1 1 1
Find an improbably high number of matches (or mismatches). Probes x and y: 11 out of 12 matches. Probes x and z: 10 out of 12 mismatches.

Binary approach based on Present/Absent. Step 2: Compute the probability of observing by chance x matches or more from the binomial distribution B(n,p). First, the probability of a match: $p = p_x p_y + (1 - p_x)(1 - p_y)$, where $p_x$ is the fraction of 1s (Present) in probe x and $p_y$ is the fraction of 1s (Present) in probe y.

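Binary approach based on Present/Absent. Step 2: Compute the probability of observing by chance x matches or more from the binomial distribution: $P(X \ge x) = \sum_{k=x}^{n} \binom{n}{k} p^{k} (1 - p)^{n - k}$. For large n one can use the normal distribution with $z = (x - np)/\sqrt{np(1 - p)}$, where n is the number of chips.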
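The match-counting test can be sketched on the example vectors for probes x and y. The function names are illustrative, the match probability assumes the two probes are independent under the null hypothesis, and the normal approximation is shown without a continuity correction.

```python
import math

def count_matches(x, y):
    """Number of chips on which both probes have the same binary call."""
    return sum(1 for a, b in zip(x, y) if a == b)

def match_probability(x, y):
    """Chance of a match on one chip if the two probes were independent."""
    px, py = sum(x) / len(x), sum(y) / len(y)
    return px * py + (1 - px) * (1 - py)

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def normal_tail(k, n, p):
    """Normal approximation to P(X >= k); intended for large n."""
    z = (k - n * p) / math.sqrt(n * p * (1 - p))
    return 0.5 * math.erfc(z / math.sqrt(2))

probe_x = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0]
probe_y = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0]

n = len(probe_x)
matches = count_matches(probe_x, probe_y)
p_match = match_probability(probe_x, probe_y)
p_value = binomial_tail(matches, n, p_match)
p_value_normal = normal_tail(matches, n, p_match)
```

To test for an improbably high number of mismatches instead, the same tail can be computed on the mismatch count with probability `1 - p_match`.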

Binary approach based on Present/Absent. Step 3: Multiple testing correction.

Mutual information based on Present/Absent Step 1: Prepare data. Step 2: Compute MI value for pairs of probes. Step 3: Use of a threshold for MI

Mutual information based on Present/Absent. Step 1: Obtain P/M/A calls for probes. Each call is associated with a p-value, so a filter can be applied. Codify P/M/A calls as binary vectors: encode P/M as 1 and A as 0, or encode P as 1 and M/A as 0.

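Mutual information based on Present/Absent. Step 2: Compute the MI value for probes X and Y: $MI(X, Y) = \sum_{x} \sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}$, where p(.) are the frequencies of the observed Ps and As and p(x,y) are the frequencies of the joint distribution.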

Mutual information based on Present/Absent. Step 3: Use a threshold. Probes X and Y are correlated if $MI(X, Y) > \frac{1}{n} \log \frac{1}{P}$, where n is the number of chips and $P = 1/p^2$ (with p the number of probes). M. Andrecut and S. A. Kauffman, “A simple method for reverse engineering causal networks”, J. Phys. A: Math. Gen. 39 No 46.
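The MI computation and the threshold can be sketched as follows. The function names and the toy vectors are hypothetical, and the natural logarithm is an assumption because the slides do not state the base. The same base must be used for the MI value and the threshold.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """MI of two equal-length discrete vectors, from observed frequencies (nats)."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) / (p(a) p(b)) simplifies to c * n / (count_x[a] * count_y[b])
        mi += (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
    return mi

def mi_threshold(n_chips, n_probes):
    """Threshold (1/n) * log(1/P) with P = 1 / n_probes^2, as given on the slide."""
    big_p = 1.0 / n_probes**2
    return math.log(1.0 / big_p) / n_chips

# Hypothetical binary Present/Absent vectors over 4 chips.
a = [0, 0, 1, 1]
b = [0, 0, 1, 1]   # identical to a
c = [0, 1, 0, 1]   # carries no information about a

mi_ab = mutual_information(a, b)
mi_ac = mutual_information(a, c)
threshold = mi_threshold(n_chips=100, n_probes=1000)
```

A probe pair would then be reported as associated when its MI exceeds `threshold` for the collection's actual numbers of chips and probes.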

Try Pearson method in Stembase! Implemented by Reatha Sandie