A Combinatorial Approach to the Analysis of Differential Gene Expression Data: The Use of Graph Algorithms for Disease Prediction and Screening

The Goal
To classify patients based on expression profiles
  – Presence of cancer
  – Type of cancer
  – Response to treatment
To identify the genes required for accurate classification
  – Too many = unnecessary noise
  – Too few = insufficient information

Classic Clustering Problem
Current techniques:
  – Hierarchical Clustering
  – K-Means Clustering
  – Self-Organizing Maps
  – Others
Drawbacks:
  – Determining cluster boundaries difficult with diffuse data
  – Objects can only belong to one group

[Flowchart labels: Eliminate Poorly Covering Genes; Raw Data; Set of Discriminatory Genes; Gene Scores; Verify by Classification; Calculate Sample Similarities; Apply Threshold; Eliminate Poorly Discriminating Genes; Algorithmic Training; Dominating Set; Maximal Cliques; Gene Scoring]

[Flowchart labels: Raw Data; Eliminate Poorly Discriminating Genes; Algorithmic Training]

The Gene Scoring Function: Identifying Discriminators vs.

[Flowchart labels: Eliminate Poorly Covering Genes; Raw Data; Eliminate Poorly Discriminating Genes; Algorithmic Training]

Eliminate Poorly Covering Genes
[Figure labels: Samples; Genes; Class 1; Class 2]

[Flowchart labels: Eliminate Poorly Covering Genes; Raw Data; Calculate Sample Similarities; Apply Threshold; Eliminate Poorly Discriminating Genes; Algorithmic Training]

Create Unweighted Graph
Complete, edge-weighted graph
  – Vertices = samples
  – Edge weight = similarity metric
Remove edge weights
  – If edge weight < threshold, remove edge from graph
  – Otherwise, keep edge, ignore weight
Result: incomplete unweighted graph
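The thresholding step above can be sketched in a few lines of Python. This is a minimal illustration, not the group's code: the function name `threshold_graph`, the dictionary layout for edge weights, the sample names, and the threshold value 0.7 are all assumptions made for the example.

```python
def threshold_graph(weights, threshold):
    """Turn a complete edge-weighted graph into an unweighted one.

    weights: dict mapping a pair (u, v) of sample names to a similarity
             value (hypothetical layout chosen for this sketch).
    Returns a dict mapping each sample to the set of its neighbours.
    """
    adjacency = {}
    for (u, v), weight in weights.items():
        # Every sample stays in the graph, even if it loses all its edges.
        adjacency.setdefault(u, set())
        adjacency.setdefault(v, set())
        if weight >= threshold:
            # Keep the edge and forget its weight.
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


# Made-up similarities between three samples.
weights = {("s1", "s2"): 0.9, ("s1", "s3"): 0.4, ("s2", "s3"): 0.75}
graph = threshold_graph(weights, 0.7)
```

With these made-up numbers the edge between s1 and s3 falls below the threshold and is removed, so the result is no longer a complete graph.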

The Edge Weight Function
[Equation not preserved in this transcript]
where expression value ij = the expression value of gene i for sample j

[Flowchart labels: Eliminate Poorly Covering Genes; Raw Data; Set of Discriminatory Genes; Gene Scores; Verify by Classification; Calculate Sample Similarities; Apply Threshold; Eliminate Poorly Discriminating Genes; Algorithmic Training]

What is a Clique?
A completely connected subset of vertices in a graph
Maximal clique = local optimization
NP-complete
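To make the definition concrete, the sketch below enumerates the maximal cliques of a small graph with the Bron-Kerbosch algorithm with pivoting. The slides do not say which clique algorithm the group uses, so this is only one standard choice, and the function name and toy graph are assumptions for illustration.

```python
def maximal_cliques(adjacency):
    """Yield every maximal clique of an undirected graph as a frozenset.

    adjacency: dict mapping each vertex to the set of its neighbours.
    Uses Bron-Kerbosch with pivoting (an assumed, standard choice).
    """
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            # Nothing can extend the clique, so it is maximal.
            yield frozenset(clique)
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(candidates | excluded,
                    key=lambda v: len(adjacency[v] & candidates))
        for v in list(candidates - adjacency[pivot]):
            yield from expand(clique | {v},
                              candidates & adjacency[v],
                              excluded & adjacency[v])
            # v has been fully explored: move it from candidates to excluded.
            candidates = candidates - {v}
            excluded = excluded | {v}

    yield from expand(set(), set(adjacency), set())


# Toy graph: a triangle a-b-c plus a pendant edge c-d.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
cliques = set(maximal_cliques(toy))
```

In the toy graph the two maximal cliques are {a, b, c} and {c, d}. Note that vertex c belongs to both, which is the kind of overlapping membership that the clustering methods listed earlier do not allow.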

Classification Using Clique
[Figure labels: GRAPH; Class 1; Class 2 (twice); Class 3]

A Selection of Discriminators
Gene     Name                                          Function
ADH1B    alcohol dehydrogenase IB                      alcohol dehydrogenase activity
FHL1     four and a half LIM domains 1                 cell growth, cell differentiation
HBB      hemoglobin, beta                              oxygen transport
CYP4B1   cytochrome P450 4B1                           electron transport
TNA      tetranectin                                   plasminogen binding protein
TGFBR2   transforming growth factor, beta receptor II  transmembrane receptor protein serine/threonine kinase signaling pathway

The Algorithm - Unsupervised
[Flowchart labels: Raw Data; Classify Unknown Samples; Calculate Sample Similarities; Apply Threshold; Set of Discriminatory Genes, Scores]

Summary
Intersection of clique and dominating set techniques improves results
Combined orthogonal scoring identifies limited number of discriminatory genes
Clique offers means of validating obtained scores and weights
Our technique identifies differing set of discriminatory genes from original paper
Clique-based classification a viable complement to present clustering methods

Ongoing and Future Research
Reverse Training
Train to distinguish among types of cancer
Experiment with different weight functions (e.g., Pearson’s coefficient)
Investigate using less stringent techniques
  – Near-cliques
  – Neighborhood search
  – K-dense subgraphs
Port codes to SGI Altix supercomputer
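As an illustration of the alternative weight function mentioned above, Pearson's coefficient between the expression vectors of two samples could be computed as follows. This is a sketch only: the function name and the expression values are invented, and the slides do not say how the coefficient would be mapped onto an edge weight or a threshold.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


# Invented expression values of three genes in two samples.
sample_a = [1.0, 2.0, 3.0]
sample_b = [2.0, 4.0, 6.0]
similarity = pearson(sample_a, sample_b)
```

The coefficient lies between -1 and 1, so a threshold applied to it would have to account for negative values.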

Our Research Group
Mike Langston, Ph.D.
Lan Lin
Chris Symons
Xinxia Peng
Bing Zhang, Ph.D.
