Graph Matching Simulation based approach Shang Zechao 1010161920
Introduction What is graph matching? When does one graph match another?
Introduction (cont.) Graphs: G = (V, E) and GQ = (VQ, EQ). Both can easily be extended with labels. Exact matching: isomorphism. Find a bijection f between V and VQ such that (u, v) in E iff (f(u), f(v)) in EQ.
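The isomorphism condition can be illustrated with a minimal Python sketch. The function names and the edge-list representation are assumptions made for illustration, and the brute-force search over all bijections is only meant to make the definition concrete for tiny graphs; it is not a practical algorithm.

```python
from itertools import permutations

def is_isomorphism(f, edges, edges_q):
    """Check (u, v) in E iff (f(u), f(v)) in EQ for a bijection f (dict V -> VQ)."""
    # Because f is a bijection, comparing the mapped edge set with EQ
    # covers both directions of the "iff".
    mapped = {(f[u], f[v]) for (u, v) in edges}
    return mapped == set(edges_q)

def find_isomorphism(nodes, edges, nodes_q, edges_q):
    """Try every bijection V -> VQ (exponential; tiny graphs only)."""
    if len(nodes) != len(nodes_q) or len(set(edges)) != len(set(edges_q)):
        return None
    for perm in permutations(nodes_q):
        f = dict(zip(nodes, perm))
        if is_isomorphism(f, edges, edges_q):
            return f
    return None
```

For example, the directed path 1 -> 2 -> 3 is isomorphic to b -> c -> a, but not to a graph where a single node points to the two others.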
Introduction (cont.) Graph isomorphism: GI class (not known to be in P, not known to be NP-complete). Sub-graph isomorphism: NP-complete. Too hard!
Simulation based approach [Henzinger95] Find a relation S over V x VQ such that (u, u') is in S if u and u' have the same label and, for every child v' of u', there exists a v in V that is a child of u with (v, v') in S.
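The definition above can be turned into a naive fixpoint computation. The following Python sketch is an illustration only: the function name, the dictionary-based labels, and the toy graph are assumptions, and it is a straightforward refinement loop, not the optimized algorithm of [Henzinger95].

```python
def simulation(nodes, edges, labels, nodes_q, edges_q, labels_q):
    """Return the largest relation S satisfying the simulation condition."""
    children = {u: set() for u in nodes}
    for u, v in edges:
        children[u].add(v)
    children_q = {u: set() for u in nodes_q}
    for u, v in edges_q:
        children_q[u].add(v)

    # Start from all pairs with equal labels, then remove violating pairs
    # until nothing changes.
    sim = {(u, uq) for u in nodes for uq in nodes_q if labels[u] == labels_q[uq]}
    changed = True
    while changed:
        changed = False
        for (u, uq) in list(sim):
            for vq in children_q[uq]:
                # Every child vq of uq needs some child v of u with (v, vq) in S.
                if not any((v, vq) in sim for v in children[u]):
                    sim.discard((u, uq))
                    changed = True
                    break
    return sim

# Hypothetical toy data: am2 has no field worker below it, so it is pruned.
result = simulation(
    ["b1", "am1", "am2", "fw1"],
    [("b1", "am1"), ("b1", "am2"), ("am1", "fw1")],
    {"b1": "B", "am1": "AM", "am2": "AM", "fw1": "FW"},
    ["B", "AM", "FW"],
    [("B", "AM"), ("AM", "FW")],
    {"B": "B", "AM": "AM", "FW": "FW"},
)
print(sorted(result))  # [('am1', 'AM'), ('b1', 'B'), ('fw1', 'FW')]
```

Each pass only removes pairs, so the loop terminates after at most |V| x |VQ| removals, which is why the computation stays polynomial.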
Simulation based approach The major difference between graph simulation and graph isomorphism: Isomorphism requires a bijection (one-to-one function). Graph simulation is based on a relation (many-to-many). Simulation can be computed in polynomial time.
An Example [Fan10] Drug dealer network. B: Boss; S: Secretary; AM: Assistant manager; FW: Field worker.
An Example (cont.) In the real world, S and AM can be the same person, and one AM maps to multiple field workers.
Bounded Simulation [Fan10] Each edge in the pattern graph has a label: either a positive integer k, or * (unbounded). The label bounds the length of the path that connects the two matched nodes.
The Example (cont.) AM should be able to reach FW within 3 hops.
Matching Algorithm Similar to the EfficientSimilarity algorithm in [Henzinger95]. Pre-compute the distance matrix between all pairs of nodes in G. Complexity: O(|V||E| + |Ep||V|^2 + |Vp||V|)
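The two steps named above (distance pre-computation, then refinement of the match relation) can be sketched in Python as follows. This is a naive illustration under stated assumptions: the function names are hypothetical, node matching uses plain label equality, pattern edges are given as a dictionary from (parent, child) to a bound, and the simple fixpoint loop does not achieve the complexity quoted on the slide.

```python
from collections import deque

def path_lengths(nodes, edges):
    """dist[u][v] = length of the shortest nonempty path from u to v (BFS per node)."""
    adj = {u: [] for u in nodes}
    for u, v in edges:
        adj[u].append(v)
    dist = {}
    for s in nodes:
        d = {}
        queue = deque()
        for v in adj[s]:          # seed with direct children at distance 1
            if v not in d:
                d[v] = 1
                queue.append(v)
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in d:
                    d[y] = d[x] + 1
                    queue.append(y)
        dist[s] = d
    return dist

def bounded_simulation(nodes, edges, labels, nodes_p, edges_p, labels_p):
    """edges_p maps (u_p, v_p) to a bound: a positive int k, or "*" for unbounded."""
    dist = path_lengths(nodes, edges)
    sim = {(u, up) for u in nodes for up in nodes_p if labels[u] == labels_p[up]}
    changed = True
    while changed:
        changed = False
        for (u, up) in list(sim):
            for (a, vp), bound in edges_p.items():
                if a != up:
                    continue
                # u needs some reachable v within the bound that matches vp.
                ok = any((v, vp) in sim and (bound == "*" or d <= bound)
                         for v, d in dist[u].items())
                if not ok:
                    sim.discard((u, up))
                    changed = True
                    break
    return sim
```

With a data path am -> x -> y -> fw, the pattern edge AM -> FW with bound 3 keeps (am, AM) in the relation, while bound 2 removes it.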
Strong Simulation [Ma12] Recall the conditions under which two nodes match: they have the same label, and their children can be matched by simulation. Two issues: parent information is not captured, and the matching size is not limited.
An Example [Ma12] Under plain simulation, Bio can match Bio1, Bio2, Bio3, and Bio4. Actually, only Bio4 makes sense.
Strong Simulation Two nodes match if: they have the same label; their children can be matched by simulation; their parents can be matched by simulation. In addition, the diameter of the matched sub-graph is constrained by the diameter of the pattern graph.
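The label, children, and parent conditions can be sketched as a fixpoint over candidate pairs. The Python below is a partial illustration only: it does not implement the diameter condition, and the function name and graph representation are assumptions, not the algorithm from [Ma12].

```python
def dual_simulation(nodes, edges, labels, nodes_q, edges_q, labels_q):
    """Largest relation that respects labels, children, and parents."""
    succ = {u: set() for u in nodes}
    pred = {u: set() for u in nodes}
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    succ_q = {u: set() for u in nodes_q}
    pred_q = {u: set() for u in nodes_q}
    for u, v in edges_q:
        succ_q[u].add(v)
        pred_q[v].add(u)

    sim = {(u, uq) for u in nodes for uq in nodes_q if labels[u] == labels_q[uq]}

    def supported(u, uq):
        # Every pattern child needs a matching data child ...
        child_ok = all(any((v, vq) in sim for v in succ[u]) for vq in succ_q[uq])
        # ... and every pattern parent needs a matching data parent.
        parent_ok = all(any((w, wq) in sim for w in pred[u]) for wq in pred_q[uq])
        return child_ok and parent_ok

    changed = True
    while changed:
        changed = False
        for pair in list(sim):
            if not supported(*pair):
                sim.discard(pair)
                changed = True
    return sim
```

On a hypothetical toy graph with hr1 -> bio1 and an isolated bio2, the pattern HR -> Bio keeps (bio1, Bio) but drops (bio2, Bio), because bio2 has no parent matching HR. Plain child-only simulation would keep both.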
An Example (cont.) Bio matches only Bio4 under strong simulation.
Comparison of different approaches. Criteria: children topology, parents topology, connectivity, cycle info. Approaches: simulation (Y, N), with parent topology, with diameter constraint, isomorphism.
Comparison of different approaches. Criteria: locality, bounded matches, bisimulation, bounded cycle. Approaches: simulation (N, Y), with parent topology, with diameter constraint, isomorphism.
But: the bounded cycle problem is intractable (NP-hard), and the bisimilar problem is intractable (coNP-hard).
References
[Henzinger95] M. R. Henzinger, T. A. Henzinger, and P. W. Kopke. 1995. Computing simulations on finite and infinite graphs. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science (FOCS '95). IEEE Computer Society, Washington, DC, USA, 453-.
[Fan10] Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Yinghui Wu, and Yunpeng Wu. 2010. Graph pattern matching: from intractable to polynomial time. Proc. VLDB Endow. 3, 1-2 (September 2010), 264-275.
[Ma12] Shuai Ma, Yang Cao, Wenfei Fan, Jinpeng Huai, and Tianyu Wo. 2012. Capturing Topology in Graph Pattern Matching. PVLDB. To appear.