Cluster Overlap Distribution Map (CODM) - Software. Presenter: Jia Meng. By: Makoto Kano, Shuichi Tsutsumi, Nobutaka Kawahara, Yan Wang, Akitake Mukasa, Takaaki.


1 Cluster Overlap Distribution Map (CODM) - Software. Presenter: Jia Meng. By: Makoto Kano, Shuichi Tsutsumi, Nobutaka Kawahara, Yan Wang, Akitake Mukasa, Takaaki Kirino and Hiroyuki Aburatani

2 What is CODM? Cluster Overlap Distribution Map (CODM) is a visualization methodology. CODM compares the clustering results generated under two different conditions.

3 Background, Problem & Objective Background: Advances in microarray technologies have made it possible to comprehensively measure 30,000 genes at the same time. Nobody can handle tens of thousands of genes individually, so clustering is an important way to handle them. Problem: Although many clustering algorithms have been proposed, there are few effective methods for comparing clustering results obtained under different conditions. Objective: Compare clustering results obtained under different conditions.

4 Basic Idea: Format The two cluster sets are mapped to the X-axis and the Y-axis, respectively. The statistical evaluation value of the overlap between two clusters, one selected from each cluster set, is displayed as the height of a block.

5 Basic Idea: Compute The height of each block represents the statistical evaluation value of the overlap between clusters Xi and Yj, where: g is the total number of genes; Nxi is the number of genes in cluster Xi; Nyj is the number of genes in cluster Yj; Kij is the number of genes shared by Xi and Yj. We evaluate the number of genes common to the two clusters using the hypergeometric probability distribution. Core idea: Assuming that the generation of gene clusters is a random selection from the total set of genes, what we need is the probability of observing at least (k) overlapping genes between randomly selected (n1) genes and (n2) genes from among all (g) genes.
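The quantities defined on this slide (g, the cluster sizes, and the overlap counts) can be tallied directly from two hard clusterings of the same genes. The following is a minimal sketch, not the CODM software's own code: the function name overlap_counts, the dictionary input format, and the toy gene and cluster labels are all assumptions made for illustration.

```python
from collections import Counter

def overlap_counts(labels_x, labels_y):
    """Tally g, cluster sizes, and pairwise overlaps for two hard clusterings.

    labels_x, labels_y: dicts mapping gene id -> cluster label
    (hypothetical input format, chosen for this sketch).
    """
    # Only genes that appear in both clusterings are counted.
    genes = labels_x.keys() & labels_y.keys()
    g = len(genes)                                          # total number of genes
    n_x = Counter(labels_x[gene] for gene in genes)         # size of each cluster Xi
    n_y = Counter(labels_y[gene] for gene in genes)         # size of each cluster Yj
    k = Counter((labels_x[gene], labels_y[gene]) for gene in genes)  # overlap of (Xi, Yj)
    return g, n_x, n_y, k

# Toy data: six genes clustered under two conditions.
labels_x = {"g1": "X1", "g2": "X1", "g3": "X1", "g4": "X2", "g5": "X2", "g6": "X2"}
labels_y = {"g1": "Y1", "g2": "Y1", "g3": "Y2", "g4": "Y2", "g5": "Y2", "g6": "Y2"}

g, n_x, n_y, k = overlap_counts(labels_x, labels_y)
print(g, n_x["X1"], n_y["Y1"], k[("X1", "Y1")])
```

A Counter returns 0 for cluster pairs that share no genes, so every (Xi, Yj) block has a defined overlap count.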

6 Algorithm The probability of observing at least (k) overlapping genes between randomly selected (n1) genes and (n2) genes from among all (g) genes is the upper tail of the hypergeometric distribution. When the P-value is small, the overlap is regarded as statistically meaningful. The evaluation value of the overlap, shown as the block height, is derived from this P-value.
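The probability described on this slide is the standard hypergeometric tail sum, $P = \sum_{i=k}^{\min(n_1, n_2)} \binom{n_1}{i}\binom{g-n_1}{n_2-i} / \binom{g}{n_2}$. The sketch below computes it with exact integer arithmetic. The slide's own formula for the evaluation value is not reproduced in this transcript, so the use of -log10(P) here is an assumption, chosen only because it makes smaller P-values give taller blocks; the function names are likewise hypothetical.

```python
from math import comb, log10

def overlap_p_value(g, n1, n2, k):
    """Probability of at least k shared genes when n1 and n2 genes
    are each drawn at random from g genes (hypergeometric upper tail)."""
    total = comb(g, n2)
    # Sum the probability mass for every overlap size from k up to the maximum possible.
    tail = sum(comb(n1, i) * comb(g - n1, n2 - i)
               for i in range(k, min(n1, n2) + 1))
    return tail / total

def evaluation_value(p):
    # ASSUMPTION: -log10(P) stands in for the slide's missing formula.
    # Raises ValueError if p has underflowed to 0.0.
    return -log10(p)

# Toy case: 6 genes, clusters of size 3 and 2, sharing 2 genes.
p = overlap_p_value(6, 3, 2, 2)
print(p, evaluation_value(p))
```

For realistic inputs (tens of thousands of genes) the exact sum still works because Python integers are unbounded, although a library routine such as a hypergeometric survival function would be faster.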

7 Example Data acquired from two environments Compute CODM

8 Hidden Block Hidden blocks (When dealing with hierarchical clustering results)

9 About CODM software CODM is available on the web site (http://www.genome.rcast.u-tokyo.ac.jp/CODM). It runs on a PC with Windows 2000 or Windows XP. The memory requirement is proportional to the square of the number of genes to be analyzed. In addition, a machine with a graphics board that provides hardware acceleration for OpenGL is recommended.
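To give a feel for what quadratic memory growth means in practice, the sketch below counts the cells in a genes-by-genes table and converts that to bytes. The slide gives only the proportionality, so the 4 bytes per cell is a purely hypothetical constant and the results are an order-of-magnitude illustration, not the software's actual footprint.

```python
def estimated_bytes(num_genes, bytes_per_cell=4):
    """Rough memory estimate for a num_genes x num_genes table.

    bytes_per_cell is a HYPOTHETICAL constant; the slide states only that
    memory grows with the square of the number of genes.
    """
    return num_genes * num_genes * bytes_per_cell

for n in (5_000, 10_000, 30_000):
    print(n, estimated_bytes(n) / 1e9, "GB")
```

Whatever the true constant is, doubling the number of genes quadruples the memory needed, which is why filtering the gene list before analysis has a large effect.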

10 Future Work This method can help detect similarity between two clustering results, but how can similar structure be detected among three or more clustering results? The method is based on hard assignments. A statistical clustering method yields only membership probabilities, and this method cannot work on such soft assignments.

