RedundancyMiner: A novel method of clustering in genomic studies. Barry Zeeberg, NCI; Hongfang Liu, NCI and GU
Gene Ontology (GO): AmiGO browser. Hierarchical organization of categories and mapped genes
High-Throughput GoMiner (HTGM)
Typical HTGM result: clustered image map (CIM)
Redundancy problem: Because of the hierarchical nature of the GO structure, parent-child categories may contain partially redundant gene mappings. This can “inflate” the number of categories in the CIM and thus obscure its core information content. The redundancy itself can be studied to examine the fine-detail, nuanced associations of category clusters.
RedundancyMiner (RM) is an attempt to solve that problem. Remove the redundancy from the CIM (redundancy can inflate the CIM by, e.g., 3-fold). Place the redundancy into a META CIM, where it can be studied as nuanced themes of association among groups of GO categories.
RM paradigm: The similarity metric is a probabilistic value based on the number of genes mapped in common to two GO categories. Groups in the META CIM satisfy a “complete linkage” criterion at a selected p-value threshold.
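The slides do not say which statistical test produces the probabilistic similarity value. As a minimal sketch, assuming a one-sided hypergeometric tail (a common choice for gene-set overlap, not necessarily what RM uses), the p-value for two categories sharing at least the observed number of genes could be computed as follows. The function name `overlap_p_value` and its parameters are hypothetical.

```python
from math import comb


def overlap_p_value(genes_a, genes_b, total_genes):
    """Probability of seeing at least the observed overlap by chance.

    Assumption: a hypergeometric null in which category B is a random
    draw of len(B) genes from a universe of `total_genes` genes, of
    which len(A) belong to category A. The test used by RM is not
    stated in the slides.
    """
    a, b = set(genes_a), set(genes_b)
    n_a, n_b = len(a), len(b)
    if total_genes < max(n_a, n_b):
        raise ValueError("total_genes is smaller than a category")
    observed = len(a & b)
    upper = min(n_a, n_b)
    # Sum the hypergeometric probabilities for overlaps >= observed.
    tail = sum(
        comb(n_a, i) * comb(total_genes - n_a, n_b - i)
        for i in range(observed, upper + 1)
    )
    return tail / comb(total_genes, n_b)
```

A small p-value indicates that the two categories share more genes than expected by chance, so the pair is a candidate for the same redundancy group.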
RM overcomes two problems of traditional hierarchical clustering: (1) every object is put into one cluster or another, even if the object truly is an outlier; (2) each object can appear in only one cluster, even though it may be related to several clusters.
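One way to read the complete linkage criterion is that every pair of categories within a group must meet the p-value threshold. Under that reading, groups correspond to maximal cliques of the "similar enough" graph, which naturally allows a category to appear in several groups and leaves outliers in none. The sketch below uses the basic Bron-Kerbosch enumeration; the function name, the `p_value` callback, and the minimum group size of 2 are assumptions, not details taken from RM.

```python
from itertools import combinations


def complete_linkage_groups(items, p_value, threshold):
    """Return all maximal groups in which every pair has p <= threshold.

    Assumption: groups are modeled as maximal cliques (size >= 2) of
    the graph linking pairs that pass the threshold.
    """
    neighbors = {x: set() for x in items}
    for x, y in combinations(items, 2):
        if p_value(x, y) <= threshold:
            neighbors[x].add(y)
            neighbors[y].add(x)

    groups = []

    def expand(clique, candidates, excluded):
        # A clique is maximal when nothing can extend it.
        if not candidates and not excluded:
            if len(clique) >= 2:  # singletons are outliers, not groups
                groups.append(frozenset(clique))
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & neighbors[v], excluded & neighbors[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(items), set())
    return groups


# Toy example: A, B, C are mutually similar; C is also similar to D; E is an outlier.
similar = {frozenset(p) for p in ("AB", "AC", "BC", "CD")}
toy_p = lambda x, y: 0.001 if frozenset((x, y)) in similar else 0.9
toy_groups = complete_linkage_groups("ABCDE", toy_p, threshold=0.05)
```

In the toy example, C belongs to two groups and E belongs to none, which are the two behaviors that single-assignment hierarchical clustering cannot produce.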
CIM after RM
META CIM
Additional example: gene expression in NCI-60 cell lines. The NCI-60 is a set of 60 well-studied cancer cell lines, composed of around 5 or 6 lines from each of around 8 or 9 different cancer types.
Problem: The full CIM of 60 cell lines x 20,000 gene expression values is too dense to allow meaningful viewing. Solution: select a sub-portion of the CIM based on RM analysis.
NCI-60 META CIM based on correlation threshold = 0.20
Sub-CIM of highest correlating genes from group 33. Gene expression values are adjusted z-scores. Red = positive z-score; green = negative z-score.
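For reference, a plain z-score rescales each gene's expression values across the cell lines to mean 0 and standard deviation 1. The sketch below shows that computation and the red/green sign convention from the slide. The specific "adjustment" applied to the z-scores is not described in the slides and is not reproduced here; the function names are hypothetical.

```python
from statistics import mean, stdev


def row_z_scores(values):
    """Standardize one gene's expression values across cell lines.

    Assumption: plain sample z-scores; any further adjustment used
    for the sub-CIM is not specified in the slides.
    """
    m = mean(values)
    s = stdev(values)
    if s == 0:
        return [0.0] * len(values)  # constant row carries no signal
    return [(v - m) / s for v in values]


def cim_color(z):
    """Map a z-score to the slide's color convention."""
    if z > 0:
        return "red"
    if z < 0:
        return "green"
    return "neutral"


z = row_z_scores([1, 2, 3])
colors = [cim_color(v) for v in z]
```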
Sub-CIM of highest correlating genes from group 32
Conclusions: RM can remove redundancy from the primary CIM. RM can display the nuanced themes of redundancy structure in the META CIM. The META CIM can be used as the basis of further investigation.