
Quantitative analysis of 2D gels: Generalities

Applications
- Comparison of 2 images (or image groups):
  - Mutant / wild type
  - Physiological conditions
  - Tissue-specific expression
  - Disease / normal state
  - Drug effects
- Serial analysis:
  - Expression over time
  - Multiple conditions analysis

Image analysis requirements
- Labelling method → quantification
- Reproducibility of migration → matching
- Quality of separation → spot detection
- Signal / noise → accuracy

Comparison of 2 images
- Statistical analysis cannot be used
- Suitable only for large quantitative variations
- Results must be confirmed

Comparison of 2 sets
- Minimum number of images is 3
- Maximum is not limited!
- Allows detection of smaller variations
- A t-test can be used
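As an illustration of the t-test mentioned above, here is a minimal sketch using only the Python standard library. The spot volumes are invented, the function name `student_t` is hypothetical, and equal variances in the two sets are assumed (classic Student test). With 3 gels per set there are 4 degrees of freedom, for which the two-sided 5 % critical value of t is 2.776.

```python
from math import sqrt
from statistics import mean, variance

def student_t(group_a, group_b):
    """Two-sample Student t statistic (equal variances assumed) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df

# Hypothetical normalised volumes of one spot on 3 control and 3 treated gels.
control = [1.00, 1.10, 0.90]
treated = [1.50, 1.60, 1.40]

t, df = student_t(control, treated)
T_CRIT_DF4 = 2.776  # two-sided 5 % critical value of Student's t for 4 degrees of freedom
significant = abs(t) > T_CRIT_DF4
print(f"t = {t:.2f}, df = {df}, significant at 5 %: {significant}")
```

In practice the test would be repeated for every matched spot, which raises a multiple-testing issue that this sketch does not address.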

Serial analysis
- Quantitative evolution of each spot
- Need to group the spots according to their behaviour (clustering)
- Use of Michael Eisen's software package:
  - Cluster
  - TreeView
- The most frequent question is to find sets of proteins that have correlated expression profiles

Results
[Table of numerical values; not recoverable from the slide transcript]
Making sense of the data

Quantitative analysis of 2D gels: Practical tips

Comparison of 2 sets
- Image normalisation to obtain comparable spot volumes:
  - Using the matched spots
  - Using a single spot
- Data analysis:
  - Using the analysis program
  - Using Excel
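The two normalisation options can be sketched as follows. This is one possible reading of the slide, not a description of any particular analysis program: "using the matched spots" is interpreted here as dividing each volume by the total volume of the spots matched across gels, and "using a single spot" as dividing by the volume of one chosen reference spot. All names and values are hypothetical.

```python
def normalise_to_total(volumes, matched_ids):
    """Express each matched spot as a fraction of the total matched-spot volume."""
    total = sum(volumes[s] for s in matched_ids)
    return {s: volumes[s] / total for s in matched_ids}

def normalise_to_reference(volumes, matched_ids, reference_id):
    """Express each matched spot relative to one reference spot."""
    ref = volumes[reference_id]
    return {s: volumes[s] / ref for s in matched_ids}

# Hypothetical raw spot volumes for two gels; spot "s4" is detected on gel A only.
gel_a = {"s1": 200.0, "s2": 300.0, "s3": 500.0, "s4": 50.0}
gel_b = {"s1": 400.0, "s2": 600.0, "s3": 1000.0}

# Only spots present on both gels take part in the normalisation.
matched = sorted(gel_a.keys() & gel_b.keys())

norm_a = normalise_to_total(gel_a, matched)
norm_b = normalise_to_total(gel_b, matched)
ref_a = normalise_to_reference(gel_a, matched, "s1")
```

Gel B is simply twice as intense as gel A in this example, so after normalisation to the matched total the two gels give identical values. Normalising to a single spot is simpler but passes any error in that one spot on to every other value.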

Serial analysis
- Image normalisation → input data
- Find clusters of genes
- Depending on the method, the number of clusters is either fixed from the beginning (K-means) or determined after the analysis (hierarchical clustering)

Hierarchical clustering
- The result is displayed as a dendrogram
- The length of a branch = the distance between the joined genes or clusters
- The dendrogram induces a linear ordering of the data points

Hierarchical clustering
The following parameters must be defined:
1- The similarity between two genes: it measures how similar two series of numbers are. Most options are based on the Pearson correlation coefficient.
  - Centered correlation
  - Uncentered correlation
  - Absolute correlation
  - Euclidean
  - ...
  A matrix of distances between all pairs of items is computed. Agglomerative hierarchical clustering is then performed by joining the two closest items with a branch.
2- The distance between the new cluster and the others: it can be measured by different methods.
  - Average linkage: distance between cluster centers (in the standard definition, average linkage is the mean of all pairwise distances between the members of the two clusters; the distance between centers is more precisely called centroid linkage)
  - Single linkage: distance between the closest pair
  - Complete linkage: distance between the farthest pair
3- The weight of each series: it is possible to give a different weight to a particular experiment.
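The similarity measures and linkage rules above can be sketched in a few lines of plain Python. This is an illustrative toy, not the code of the Cluster package: the function names are hypothetical, the distance is assumed to be 1 minus the centered correlation, "average" is implemented as the mean of all pairwise distances, and experiment weights are left out.

```python
from math import sqrt

def uncentered_correlation(x, y):
    """Correlation with the means taken as zero (fails on an all-zero profile)."""
    num = sum(a * b for a, b in zip(x, y))
    return num / sqrt(sum(a * a for a in x) * sum(b * b for b in y))

def centered_correlation(x, y):
    """Pearson correlation: each profile is mean-centered first (fails on a flat profile)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return uncentered_correlation([a - mx for a in x], [b - my for b in y])

def absolute_correlation(x, y):
    """Treats correlated and anti-correlated profiles as equally similar."""
    return abs(centered_correlation(x, y))

def distance(x, y):
    # Assumption: distance = 1 - centered correlation, ranging from 0 to 2.
    return 1.0 - centered_correlation(x, y)

def linkage_distance(c1, c2, profiles, method):
    """Distance between two clusters, given as tuples of profile names."""
    pairwise = [distance(profiles[i], profiles[j]) for i in c1 for j in c2]
    if method == "single":
        return min(pairwise)             # closest pair
    if method == "complete":
        return max(pairwise)             # farthest pair
    return sum(pairwise) / len(pairwise)  # average of all pairs

def agglomerate(profiles, method="average"):
    """Repeatedly join the two closest clusters; return the list of merges."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        d, a, b = min(
            ((linkage_distance(a, b, profiles, method), a, b)
             for i, a in enumerate(clusters) for b in clusters[i + 1:]),
            key=lambda item: item[0],
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))  # d is the branch height in the dendrogram
    return merges

# Hypothetical expression profiles over four conditions.
profiles = {
    "p1": [1.0, 2.0, 3.0, 4.0],
    "p2": [2.0, 4.0, 6.0, 8.0],
    "p3": [4.0, 3.0, 2.0, 1.0],
    "p4": [8.0, 6.0, 4.0, 2.0],
}
merges = agglomerate(profiles, method="average")
leaf_order = merges[-1][0] + merges[-1][1]  # linear ordering induced by the tree
```

Each entry of `merges` corresponds to one node of the dendrogram, and the distance stored with it is the branch height at which the two clusters are joined. Concatenating the two sides at each merge gives one of the possible linear orderings of the leaves.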

K-means - centroid method
- iteration = 0: start with random positions of the K centroids
- iteration = 1: assign points to centroids, then move the centroids to the center of their assigned points
- iteration = n: iterate until the centroids are stable

K-means - centroid method
1. The user chooses the number of clusters
2. The result varies with each run → compare several runs
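A minimal sketch of the centroid method, assuming squared Euclidean distance and random data points as starting centroids (the slides only say "random position"). The names `kmeans`, `sq_dist` and `within_ss` and the sample points are hypothetical. Several runs with different seeds are compared by their within-cluster sum of squares, and the lowest one is kept.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, seed, max_iter=100):
    rng = random.Random(seed)
    # Iteration 0: random start (assumption: k distinct data points are used).
    centroids = rng.sample(points, k)
    groups = []
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[nearest].append(p)
        # Move each centroid to the center of its assigned points;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:  # centroids are stable: stop
            break
        centroids = new_centroids
    return centroids, groups

def within_ss(centroids, groups):
    """Total squared distance of the points to their own centroid."""
    return sum(sq_dist(p, c) for c, g in zip(centroids, groups) for p in g)

# Hypothetical 2-D points forming two well separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# The result depends on the random start, so several runs are compared.
runs = [kmeans(points, k=2, seed=s) for s in range(5)]
best_centroids, best_groups = min(runs, key=lambda r: within_ss(*r))
```

On data this clean every run ends in the same partition. On real expression profiles, different seeds can give different clusters, which is why comparing several runs matters.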