Quantitative analysis of 2D gels: Generalities
Applications
Comparison of 2 images (or image groups): mutant / wild type, physiological conditions, tissue-specific expression, disease / normal state, drug effects
Serial analysis: expression over time, multiple-conditions analysis
Image analysis requirements
Labelling method: quantification
Reproducibility of migration: matching
Quality of separation: spot detection
Signal / noise: accuracy
2 images comparison
Statistical analysis cannot be used
Suitable only for large quantitative variations
Confirmation of the results is essential
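With only two images, the comparison reduces to a per-spot volume ratio. The sketch below flags matched spots whose ratio exceeds a cut-off. The function name, the spot identifiers and the 2-fold default are illustrative assumptions, not values taken from the slides.

```python
def flag_large_changes(volumes_a, volumes_b, min_fold=2.0):
    """Return {spot_id: ratio b/a} for matched spots changing at least min_fold-fold.

    min_fold=2.0 is an arbitrary illustrative cut-off (assumption).
    """
    flagged = {}
    # Only spots present on both gels (matched spots) can be compared.
    for spot_id in volumes_a.keys() & volumes_b.keys():
        a, b = volumes_a[spot_id], volumes_b[spot_id]
        if a <= 0 or b <= 0:
            continue  # ratio undefined; such spots need manual inspection
        ratio = b / a
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            flagged[spot_id] = ratio
    return flagged


# Hypothetical normalised spot volumes for two gels.
gel_a = {"s1": 100.0, "s2": 50.0, "s3": 80.0}
gel_b = {"s1": 110.0, "s2": 150.0, "s3": 20.0}
changes = flag_large_changes(gel_a, gel_b)
```

A spot flagged this way is only a candidate, since a single pair of images gives no estimate of the gel-to-gel variability.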
2 sets comparison
Minimum number of images is 3; there is no maximum
Allows detection of smaller variations
A t-test can be used
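As an illustration of the t-test on one spot, the sketch below computes the two-sample Student t statistic with a pooled variance. The equal-variance form is an assumption (the slides do not say which variant is meant), and the volumes are made-up numbers.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student t statistic (equal variances assumed) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 values")
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    if pooled_var == 0:
        raise ValueError("no variability within the groups; t is undefined")
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical normalised volumes of one spot on 3 control and 3 treated gels.
control = [10.0, 12.0, 11.0]
treated = [15.0, 17.0, 16.0]
t_stat, dof = student_t(control, treated)
```

With 3 images per set there are 4 degrees of freedom, for which the two-sided 5% critical value is 2.776. The statistic here is about -6.12, so this spot would count as significantly changed at that level.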
Serial analysis
Quantitative evolution of each spot
Need to group the spots according to their behaviour (clustering)
Use of Michael Eisen’s software package (Cluster and TreeView)
The most frequent task is to find sets of proteins that have correlated expression profiles
Results (table of numeric values, not recoverable from the source)
Making sense of the data
Quantitative analysis of 2D gels: Practical tips
2 sets comparison
Image normalisation to obtain comparable spot volumes: using the matched spots, or using a single spot
Data analysis: using the analysis program, or using Excel
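The two normalisation options can be sketched as follows. Expressing volumes as a percentage of the matched-spot total, and as a ratio to one reference spot, is one plausible reading of the slide; the function names and numbers are hypothetical.

```python
def normalise_by_matched_total(volumes, matched_ids):
    """Express every spot volume as a percentage of the summed volume of the matched spots."""
    total = sum(volumes[s] for s in matched_ids)
    if total <= 0:
        raise ValueError("matched spots have no measurable volume")
    return {s: 100.0 * v / total for s, v in volumes.items()}


def normalise_by_reference_spot(volumes, reference_id):
    """Express every spot volume relative to a single reference spot."""
    ref = volumes[reference_id]
    if ref <= 0:
        raise ValueError("reference spot has no measurable volume")
    return {s: v / ref for s, v in volumes.items()}


# Hypothetical raw volumes; s4 was not matched on the other gels.
gel = {"s1": 200.0, "s2": 300.0, "s3": 500.0, "s4": 40.0}
matched = ["s1", "s2", "s3"]
by_total = normalise_by_matched_total(gel, matched)
by_ref = normalise_by_reference_spot(gel, "s1")
```

Normalising on a single spot transfers any measurement error on that spot to every other value, which is one reason to prefer the matched-spot total when enough spots are matched.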
Serial analysis
Image normalisation
Input data
Find clusters of genes
Depending on the method, the number of clusters is either fixed from the beginning (K-means) or determined after the analysis (hierarchical clustering)
Hierarchical clustering
Dendrogram: the length of a branch = the distance between the joined genes or clusters
The dendrogram induces a linear ordering of the data points
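To make the linear ordering concrete, the sketch below reads the leaves of a dendrogram from left to right. Storing the tree as nested tuples (left, right, height) is an assumption made for illustration, and the spot names and heights are invented.

```python
def leaf_order(node):
    """Left-to-right leaf order of a dendrogram stored as nested (left, right, height) tuples."""
    if not isinstance(node, tuple):
        return [node]  # a leaf: a single gene or spot
    left, right, _height = node
    return leaf_order(left) + leaf_order(right)


# Hypothetical dendrogram; the third element of each tuple is the joining distance.
tree = ((("spotA", "spotB", 0.1), "spotC", 0.4), ("spotD", "spotE", 0.2), 0.9)
order = leaf_order(tree)
```

Swapping the two children of any node yields another ordering that is equally consistent with the same tree, so the displayed order is one of several valid ones.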
Hierarchical clustering
Parameters to be defined:
1- The similarity between two genes: measures how similar two series of numbers are. It is based on the Pearson correlation coefficient. Options: centered correlation, uncentered correlation, absolute correlation, Euclidean...
A matrix of distances between all pairs of items is computed. Agglomerative hierarchical clustering is performed by joining the two closest items by a branch.
2- The distance between the new cluster and the others: it is measured by different methods.
Average linkage: distance between cluster centers
Single linkage: distance between closest pair
Complete linkage: distance between farthest pair
3- The weight of each series: it is possible to give a different weight to a particular experiment.
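The similarity measures and the joining procedure can be sketched in a few lines. This is an illustration, not the code of the Cluster program. Two choices are assumptions: distance is taken as 1 minus the centered correlation, and "average" is the mean of all pairwise distances (the usual textbook definition), which is not the same computation as a distance between cluster centers. Weights are not handled.

```python
from math import sqrt


def uncentered_correlation(x, y):
    """Correlation with both means taken as 0 (cosine of the angle between the series)."""
    num = sum(a * b for a, b in zip(x, y))
    den = sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den  # undefined (division by zero) for an all-zero series


def centered_correlation(x, y):
    """Pearson correlation: each series is mean-centred first."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return uncentered_correlation([a - mx for a in x], [b - my for b in y])


def absolute_correlation(x, y):
    """Treats correlated and anti-correlated profiles as equally similar."""
    return abs(centered_correlation(x, y))


def linkage_distance(cluster_a, cluster_b, dist, method):
    """Distance between two clusters given the item-to-item distance matrix."""
    pairs = [dist[i][j] for i in cluster_a for j in cluster_b]
    if method == "single":
        return min(pairs)  # closest pair
    if method == "complete":
        return max(pairs)  # farthest pair
    if method == "average":
        return sum(pairs) / len(pairs)  # mean pairwise distance (assumption, see lead-in)
    raise ValueError("unknown linkage method: " + method)


def agglomerate(profiles, method="average"):
    """Join the two closest clusters repeatedly; return [(members_a, members_b, distance)]."""
    n = len(profiles)
    dist = [[1.0 - centered_correlation(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage_distance(clusters[a], clusters[b], dist, method)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


# Hypothetical spot volumes over 4 conditions: two rising and two falling profiles.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [8, 6, 4, 3]]
merges = agglomerate(profiles, method="average")
```

Each recorded distance is the branch length at which the two groups are joined. The difference between the centered and uncentered measures shows up when a profile has an offset: `[1, 2, 3]` and `[11, 12, 13]` have a centered correlation of 1 but an uncentered correlation below 1.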
K-means - centroid method
iteration = 0: start with random positions of K centroids
iteration = 1: assign points to centroids, then move centroids to the center of the assigned points
iteration = n: iterate until centroids are stable
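The iteration described above can be sketched as follows. Starting from K randomly chosen data points, using squared Euclidean distance, and stopping when the assignments no longer change are assumptions of this sketch; the data are toy values.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed=None, max_iter=100):
    """Return (labels, centroids) for a basic K-means run."""
    rng = random.Random(seed)
    # Iteration 0: random start (here, k distinct data points; an assumption).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments unchanged, so the centroids are stable
        labels = new_labels
        # Move each centroid to the center of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data: two well-separated groups of spots measured in 2 conditions.
points = [(1.0, 1.0), (1.2, 1.2), (0.8, 0.8), (5.0, 5.0), (5.2, 5.2), (4.9, 4.9)]
labels, centroids = kmeans(points, k=2, seed=0)
```

On this toy data the two groups are recovered whatever the starting points. Which group receives label 0 depends on the random start.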
K-means - centroid method
1. The user chooses the number of clusters
2. The result varies with each run: compare several runs
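One way to compare runs is to measure how often two runs agree on whether a pair of spots belongs together. The sketch below uses the Rand index for this. That choice is an assumption of this example, not a method named in the slides, and the label lists are made up.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs treated the same way (together or apart) by both clusterings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both runs must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least 2 items")
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


# Hypothetical cluster labels of 6 spots from three K-means runs.
run_1 = [0, 0, 1, 1, 2, 2]
run_2 = [2, 2, 0, 0, 1, 1]  # same grouping, cluster numbers permuted
run_3 = [0, 0, 0, 1, 2, 2]  # one spot moved to another cluster

same = rand_index(run_1, run_2)
close = rand_index(run_1, run_3)
```

Working on pairs makes the comparison independent of the arbitrary cluster numbers, so `run_1` and `run_2` score 1.0 while `run_1` and `run_3` score 0.8.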