TORQUE: Topology-Free Querying of Protein Interaction Networks


1 TORQUE: Topology-Free Querying of Protein Interaction Networks
Sharon Bruckner (1), Falk Hüffner (1), Richard M. Karp (2), Ron Shamir (1), and Roded Sharan (1)
(1) School of Computer Science, Tel Aviv University
(2) International Computer Science Institute, Berkeley, CA
To appear in RECOMB 09

2 The problem
Input: Graph G=(V,E), |V|=n, |E|=m
Color set C={1,2,...,k}
A function c: V → C assigning each vertex v the color c(v).

3 The problem We seek: Is there a connected subgraph of G that has exactly one vertex of each color? Call such a subgraph “colorful”.
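To make the problem statement concrete, here is a minimal brute-force sketch that tries one vertex per color and checks connectivity. The adjacency-dictionary input and the function names are assumptions made for illustration; this is exponential in k and is not the algorithm of the talk.

```python
from itertools import product

def is_connected(vertices, adj):
    """Return True if the subgraph induced by `vertices` is connected."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices

def find_colorful_subgraph(adj, color, k):
    """Try every choice of one vertex per color 1..k; return the first
    connected choice as a set of vertices, or None if there is none."""
    by_color = {c: [v for v in adj if color[v] == c] for c in range(1, k + 1)}
    for choice in product(*by_color.values()):
        if is_connected(choice, adj):
            return set(choice)
    return None
```

For example, on the path a-b-c with colors 1, 2, 3 the function returns the whole path, while removing the edge b-c leaves no colorful subgraph.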

4 But why? Our graph = A protein-protein interaction network of some species. Our colors = set of proteins from another species that constitute a complex. Each network vertex is given the color of the protein in that set most similar to it.

5

6 But why? Our graph = A protein-protein interaction network of some species. Our colors = set of proteins from another species that constitute a complex. Each network vertex is given the color of the protein in that set most similar to it. What is the meaning of a match? It hints at an evolutionarily conserved region. We may infer the functionality of the matched subgraph from that of the complex.

7 ABOUT THE PROBLEM
NP-complete. Hard even when the graph is a tree with max degree 3 (by reduction from 3SAT [FFHV07]). But! We know the number of colors k is relatively small. Solution: A fixed parameter algorithm! A problem is fixed-parameter tractable with respect to a parameter k if an instance of size n can be solved in time f(k) · n^O(1), where f is an arbitrary function (see e.g. [N06]).

8 Defining The Basic algorithm
Every connected subgraph has a spanning tree, so every colorful connected subgraph has a colorful spanning tree. Instead of looking for a colorful subgraph, look for a colorful tree.
Decision version. Input: A graph where each vertex is colored by one of k colors. Output: Is there a colorful tree?
Scored version. Input: A graph where each vertex is colored by one of k colors. Output: What is the highest scoring colorful tree?
(Speaker note: Mention here that we’re using scoring, but it doesn’t change the algorithm.)

9 Dynamic Programming Algorithm
IDEA: Instead of looking at all n^k possible subgraphs, look only at all 2^k color sets. The table has a row for each vertex and a column for each subset of colors, in increasing size. Each entry holds a score; for example, the entry for v3 and S3 is the score of the best tree rooted in v3 that is colored exactly by S3.
[Table: rows v1 to v5, columns S1 to S4; entries shown include None, 3.4, 2.3, 2, 3.15, 13.5, 7.42, 6.4, 8.1]

10 Dynamic Programming Algorithm
The last column contains, for every vertex v, the highest scoring tree rooted in v colored by all the colors of the query! Running time: O(3^k m).
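The table-filling idea can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes the score of a tree is the sum of its edge weights, that colors are numbered 0..k-1, and that color sets are stored as bitmasks. A tree rooted in v with color set S is built by attaching a tree rooted in a neighbor u with color set S2 to a tree rooted in v with the disjoint set S1.

```python
def best_colorful_tree(n, edges, color, k):
    """edges: list of (u, v, weight); color[v] in 0..k-1 (assumed encoding).
    Returns the best score of a tree using every color exactly once, or None."""
    NEG = float("-inf")
    full = (1 << k) - 1
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # B[v][S]: best score of a tree rooted in v colored exactly by S.
    B = [[NEG] * (full + 1) for _ in range(n)]
    for v in range(n):
        B[v][1 << color[v]] = 0.0
    # Increasing numeric order works: every proper submask is smaller than its mask.
    for S in range(1, full + 1):
        for v in range(n):
            own = 1 << color[v]
            if not (S & own) or S == own:
                continue
            best = NEG
            S1 = (S - 1) & S  # enumerate proper non-empty submasks of S
            while S1:
                if (S1 & own) and B[v][S1] > NEG:
                    S2 = S ^ S1
                    for u, w in adj[v]:
                        if B[u][S2] > NEG:
                            best = max(best, B[v][S1] + B[u][S2] + w)
                S1 = (S1 - 1) & S
            B[v][S] = best
    answer = max(B[v][full] for v in range(n))
    return None if answer == NEG else answer
```

Each pair (S1, S2) is one way of splitting S, and there are at most 3^k such splits over all S, which is where the 3^k factor in the running time comes from; the inner loop over neighbors contributes the factor m.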

11 Example B(v, { }) [Figure: vertices u, v, w]

12 Allowing deletions – matching with fewer colors

13 Allowing deletions – matching with fewer colors
Simply look at all columns with color sets of size at least k - num_dels.
[Table: rows v1 to v5, columns S1 to S4, as on slide 9]
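A minimal sketch of this column scan, assuming the table is stored as a list of rows indexed by vertex, each row indexed by a color-set bitmask, with negative infinity marking "None" entries (this layout is an assumption, not something the slides specify):

```python
def best_with_deletions(B, k, num_dels):
    """Scan every column whose color set has at least k - num_dels colors.
    Returns (score, vertex, color_set_mask) of the best entry, or None."""
    NEG = float("-inf")
    best = (NEG, None, None)
    for v, row in enumerate(B):
        for S, score in enumerate(row):
            if bin(S).count("1") >= k - num_dels and score > best[0]:
                best = (score, v, S)
    return None if best[1] is None else best
```

No extra table entries are needed: deletions only change which of the already computed columns are inspected.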

14 Allowing Insertions: Special non-colored vertices or arbitrary vertices

15 Allowing non-colored insertions
For j insertions, we would expect: Running time: O(3^(k+j) m). Actually, Running time: O(3^k m j). Simply make j copies of each column, and answer the question: B(v, S, j’) = What is the highest scoring tree, rooted in v, colored by S, using exactly j’ insertions?
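To see how much is gained over the naive bound, the ratio of the two running times can be worked out directly; the value j = 3 below is only an illustration.

```latex
\frac{3^{k+j}\, m}{3^{k}\, m\, j} = \frac{3^{j}}{j},
\qquad \text{e.g. } j = 3:\quad \frac{3^{3}}{3} = 9 .
```

The saving grows exponentially in the number of insertions and does not depend on k or m.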

16 Formula & Example
[Figure: example graph on vertices a, b, c, d, e, f, g] Give example on this graph. Running time: O(3^k · m · ins)

17 Details For every vertex v, color subset S, the algorithm will accurately find the best tree of those having the minimal number of insertions. Once B(v,S,j) < ∞ for some j, the value for j+i will never be computed! Cannot guarantee that B(v,S,j+i) will have exactly j+i insertions.

18 Allowing multiple colors per vertex – use color-coding
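The slide names color-coding [AYZ95] without details. One way to read it, sketched here purely as an assumption, is to repeatedly pick one color at random for each vertex from its list of candidate colors, solve the resulting single-color instance, and keep the best answer. The `solve` callback and all parameter names are hypothetical.

```python
import random

def query_with_color_lists(color_lists, solve, trials, seed=0):
    """color_lists[v]: non-empty list of candidate colors of vertex v.
    solve(color): score (or None) for one single-color-per-vertex instance;
    this callback is a stand-in for the dynamic programming step.
    Returns the best score found over `trials` random colorings, or None."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        # Randomly commit every vertex to one of its candidate colors.
        color = [rng.choice(options) for options in color_lists]
        score = solve(color)
        if score is not None and (best is None or score > best):
            best = score
    return best
```

A fixed solution survives a single trial only if every one of its vertices happens to draw the right color, so the number of trials has to grow with the desired success probability.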

19 Implementation, Experiments & Results

20 Experiments We applied our method to query complexes within: yeast (5430 proteins, interactions), fly (6650 proteins, interactions), and human (7915 proteins, interactions). Queries: yeast, fly, human, bovine, mouse, and rat.

21 Implementation comments
We color the graph according to the similarity between the network and query proteins. In practice, in some problem instances the number of colors was not significantly smaller than the graph size. This is a result of data reduction in the cases where many network vertices were not sufficiently similar to any query vertex. Therefore, the dynamic programming algorithm is supplemented by an ILP algorithm and some heuristics to handle these instances!
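The coloring step can be sketched as follows, assuming similarities are given as a nested dictionary and that "sufficiently similar" means reaching some cutoff. Both the input layout and the `threshold` parameter are hypothetical; the slides do not say how similarity is measured or thresholded.

```python
def color_network(similarity, threshold):
    """similarity[v][q]: similarity between network protein v and query protein q.
    Each network protein gets the color (query protein) most similar to it;
    proteins with no query protein at or above `threshold` stay uncolored."""
    color = {}
    for v, scores in similarity.items():
        if not scores:
            continue
        q, s = max(scores.items(), key=lambda item: item[1])
        if s >= threshold:
            color[v] = q
    return color
```

Vertices left out of the returned dictionary are the ones that data reduction can remove, which is how the graph can shrink to a size close to the number of colors.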

22 Comparison with other methods
Most previous work tested queries with a known topology. We compare our results with those of QNet [DSGRBS08], designed to tackle topology-based queries. QNet is also based on dynamic programming and color coding.

23 Selected results All our other results follow the same trends (show tables if anyone insists)

24 Summary The colorful connected subgraph problem is motivated by the PPI network querying problem. A fixed parameter dynamic programming algorithm, allowing insertions, deletions, and multiple colors per vertex, along with an ILP formulation and heuristics, obtains good results. Thanks: The ACGT group (Igor, Ofer, Chaim, Seagull, Guy…), Nir Yosef. Israel Science Foundation, Edmond J. Safra Bioinformatics Program, Tel Aviv Univ.

25 References
[FFHV07] M. R. Fellows, G. Fertin, D. Hermelin, and S. Vialette. Borderlines for finding connected motifs in vertex-colored graphs. In Proc. ICALP’07, volume 4596, pages 340–351. Springer-Verlag, 2007.
[N06] R. Niedermeier. Invitation to Fixed-Parameter Algorithms. Number 31 in Oxford Lecture Series in Mathematics and Its Applications. Oxford University Press, 2006.
[BFKN08] N. Betzler, M. R. Fellows, C. Komusiewicz, and R. Niedermeier. Parameterized algorithms and hardness results for some graph motif problems. In Proc. 19th CPM, volume 5029 of LNCS, pages 31–43. Springer, 2008.
[AYZ95] N. Alon, R. Yuster, and U. Zwick. Color coding. Journal of the ACM, 42:844–856, 1995.
[DSGRBS08] B. Dost, T. Shlomi, N. Gupta, E. Ruppin, V. Bafna, and R. Sharan. QNet: A tool for querying protein interaction networks. Journal of Computational Biology, 15(7):913–925, 2008.

