
Answering pattern queries using views. Yinghui Wu (UC Santa Barbara), Wenfei Fan (University of Edinburgh), Xin Wang (Southwest Jiaotong University).


1 Answering pattern queries using views. Yinghui Wu (UC Santa Barbara), Wenfei Fan (University of Edinburgh), Xin Wang (Southwest Jiaotong University)

2 Real-life graph querying is expensive
Graph sizes: real-life scope, 100M (10^8); social scale, 100B (10^11); Web scale, 1T (10^12); brain scale, 100T (10^14).
Examples shown: US road network; Human Connectome (The Human Connectome Project, NIH); knowledge graph; BTC Semantic Web; Web graph (Google); Internet (Opte project).
Source: An NSA Big Graph experiment, P. Burkhardt et al., U.S. National Security Agency, May 2013.

3 Querying collaborative network
[Figure: a collaborative pattern (query 1, query 2) over the labels customer, developer and project manager, and a collaborative (chat) network with nodes PM 1, PM 2, customer 1 … customer n, developer 1 … developer k, and tester. Evaluating the queries directly on the network is expensive.]
Source: “Detecting Coordination Problems in Collaborative Software Development Environments”, Amrit Chintan et al, Information System management, 2010.

4 Answering query using views
[Diagram: a query Q evaluated on a database D yields the query result Q(D); alternatively, a query A evaluated on the database views V(D) yields the query result A(V).]
Timeline (1995, 2000, 2011): relational algebra, 2002 XPath, 2007 XML, 2006 tree pattern query, 1998 regular path queries, RDF/SPARQL; graph pattern query via (bounded) simulation (our work).
Questions: When? What to choose? How to evaluate?

5 Outline
Graph pattern matching using views
◦When, what and how?
When can a query be evaluated using views?
◦Pattern containment: an iff condition
How to evaluate?
◦Query answering using views
What to choose?
◦Minimum containment & minimal containment
Extension: bounded simulation
Experimental study
Conclusion

6 Graphs, patterns and views
A pattern query is a graph over node labels (here: customer, developer). Its query result is a binary relation between pattern edges and their matches:
◦(customer, developer): {(customer 2, developer 2), (customer 3, developer 3)}
◦(developer, customer): {(developer 2, customer 2), (developer 2, customer 3), (developer 3, customer 2)}
Node match: satisfies the predicates of the pattern node. Edge match: connects two node matches.
A view consists of a view definition (a pattern) and a view extension (its edges and matches).
View 1: view definition 1 over project manager, developer and customer; view extension 1:
◦(project manager, developer): {(PM 1, developer 2), (PM 2, developer 3)}
◦(project manager, customer): {(PM 1, customer 2), (PM 2, customer 2)}
View 2: view definition 2 over customer and developer; view extension 2:
◦(customer, developer): {(customer 2, developer 2), (customer 3, developer 3)}
◦(developer, customer): {(developer 2, customer 2), (developer 2, customer 3), (developer 3, customer 2)}
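The node-match and edge-match definitions above can be illustrated with a minimal sketch of graph simulation. This is not the authors' implementation: the function name `simulation_edge_matches`, the label-equality predicate, and the tiny example graph (nodes `c1`, `c2`, `d1`, `d2`) are assumptions made only for illustration.

```python
from collections import defaultdict

def simulation_edge_matches(pattern_nodes, pattern_edges, graph_labels, graph_edges):
    """Return {pattern edge: set of matching graph edges} under graph simulation.

    pattern_nodes: {pattern node: required label} (hypothetical predicate: label equality)
    pattern_edges: iterable of (u, u2) pattern edges
    graph_labels:  {graph node: label}
    graph_edges:   iterable of (v, v2) graph edges
    Returns {} when some pattern node has no match.
    """
    succ = defaultdict(set)
    for v, w in graph_edges:
        succ[v].add(w)

    # Candidate node matches: graph nodes that satisfy the pattern node's predicate.
    sim = {u: {v for v, lab in graph_labels.items() if lab == label}
           for u, label in pattern_nodes.items()}

    # Remove candidates that lack a matching child for some pattern edge,
    # and repeat until nothing changes.
    changed = True
    while changed:
        changed = False
        for u, u2 in pattern_edges:
            keep = {v for v in sim[u] if succ[v] & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True

    if any(not s for s in sim.values()):
        return {}
    # An edge match connects two node matches.
    return {(u, u2): {(v, w) for v in sim[u] for w in succ[v] & sim[u2]}
            for u, u2 in pattern_edges}

# Hypothetical data: d2 has no customer child, so d2 and then c2 are dropped.
pattern_nodes = {"customer": "customer", "developer": "developer"}
pattern_edges = [("customer", "developer"), ("developer", "customer")]
graph_labels = {"c1": "customer", "c2": "customer", "d1": "developer", "d2": "developer"}
graph_edges = [("c1", "d1"), ("d1", "c1"), ("c2", "d2")]
result = simulation_edge_matches(pattern_nodes, pattern_edges, graph_labels, graph_edges)
print(result)
```

The result has the same shape as the view extensions on the slide: one set of graph edges per pattern edge.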

7 Graph pattern matching using views
Given a pattern query Q and a set V of view definitions, find another query A s.t.
◦A is equivalent to Q (A(G) = Q(G)) for all data graphs G
◦A only refers to V and the extensions V(G)
[Diagram: query Q over data graph G yields the matches Q(G); query A over the views V yields A(G).]

8 When can a pattern query be answered using views?

9 Pattern containment
Query: a pattern over customer, developer and project manager.
View 1 (project manager, developer, customer), extension:
◦(project manager, developer): {(PM 1, developer 2), (PM 2, developer 3)}
◦(project manager, customer): {(PM 1, customer 2), (PM 2, customer 2)}
View 2 (customer, developer), extension:
◦(customer, developer): {(customer 2, developer 2), (customer 3, developer 3)}
◦(developer, customer): {(developer 2, customer 2), (developer 2, customer 3), (developer 3, customer 2)}
Query result:
◦(project manager, developer): (PM 1, developer 2)
◦(project manager, customer): (PM 1, customer 2)
◦(developer, customer): (developer 2, customer 2)
◦(customer, developer): (customer 2, developer 2)

10 Determining pattern containment

11 Pattern containment: example
[Figure: View 1 (customer, developer, project manager) and View 2 (customer, developer) are matched against the query (customer, developer, project manager), which is treated as a “data graph”. The resulting view matches define the mapping λ from query edges to view edges.]
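The example can be sketched in code under one stated assumption: the query is contained in the views exactly when every query edge appears in the view match of at least one view, where a view match is obtained by evaluating the view definition on the query as if the query were a data graph. The slides only name this as "an iff condition", so the precise condition, the function names `view_match` and `contained_in_views`, and the edge directions of the views are this sketch's reading of the slides, not a confirmed transcription.

```python
from collections import defaultdict

def view_match(view_nodes, view_edges, q_nodes, q_edges):
    """Simulate a view definition over the query, treating the query as a data graph.

    Returns {view edge: set of query edges it matches}, or {} if the view does not match.
    """
    succ = defaultdict(set)
    for a, b in q_edges:
        succ[a].add(b)
    sim = {x: {q for q, label in q_nodes.items() if label == want}
           for x, want in view_nodes.items()}
    changed = True
    while changed:
        changed = False
        for x, y in view_edges:
            keep = {q for q in sim[x] if succ[q] & sim[y]}
            if keep != sim[x]:
                sim[x], changed = keep, True
    if any(not s for s in sim.values()):
        return {}
    return {(x, y): {(q, r) for q in sim[x] for r in succ[q] & sim[y]}
            for x, y in view_edges}

def contained_in_views(q_nodes, q_edges, views):
    """Return the mapping lambda {query edge: {(view name, view edge)}}, or None.

    Assumed condition: every query edge must be covered by some view match.
    """
    lam = defaultdict(set)
    for name, (v_nodes, v_edges) in views.items():
        for v_edge, matched in view_match(v_nodes, v_edges, q_nodes, q_edges).items():
            for q_edge in matched:
                lam[q_edge].add((name, v_edge))
    return dict(lam) if all(e in lam for e in q_edges) else None

# Query and views as read from the slides (node identifiers are hypothetical).
q_nodes = {"pm": "project manager", "dev": "developer", "cust": "customer"}
q_edges = [("pm", "dev"), ("pm", "cust"), ("dev", "cust"), ("cust", "dev")]
views = {
    "view1": ({"PM": "project manager", "D": "developer", "C": "customer"},
              [("PM", "D"), ("PM", "C")]),
    "view2": ({"C": "customer", "D": "developer"},
              [("C", "D"), ("D", "C")]),
}
lam = contained_in_views(q_nodes, q_edges, views)
print(lam)
```

With both views the four query edges are all covered; with View 2 alone the two project manager edges are not, so the function returns `None`.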

12 How to answer a pattern query using views?

13 Query evaluation using views
Given Q, a set of views V and their extensions, and a mapping λ, find the query result Q(G).
Algorithm
◦Collect edge matches for each query edge e from the extensions of λ(e)
◦Iteratively remove non-matches until no change happens
◦Return Q(G)
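The three steps above can be sketched as follows. This is a minimal illustration rather than the authors' algorithm: it assumes plain simulation semantics, in which a graph node is a valid match of a query node only if it has an outgoing edge match for every outgoing query edge. The names `evaluate_with_views`, `lam` and `extensions`, and the sample data, are hypothetical.

```python
from collections import defaultdict

def evaluate_with_views(query_edges, lam, extensions):
    """Compute edge matches for each query edge using only view extensions.

    query_edges: list of (u, u2) query edges
    lam:         {query edge: list of (view name, view edge)}
    extensions:  {(view name, view edge): set of graph edges}
    """
    # Step 1: collect candidate edge matches for each query edge e from lambda(e).
    S = {e: set().union(*(extensions[key] for key in lam[e])) for e in query_edges}

    out_edges = defaultdict(list)
    for u, u2 in query_edges:
        out_edges[u].append((u, u2))

    def valid(u, v):
        # v is a valid match of u if it starts an edge match for every query edge leaving u.
        return all(any(src == v for src, _ in S[e]) for e in out_edges[u])

    # Step 2: iteratively remove non-matches until no change happens.
    changed = True
    while changed:
        changed = False
        for u, u2 in query_edges:
            keep = {(v, v2) for v, v2 in S[(u, u2)] if valid(u, v) and valid(u2, v2)}
            if keep != S[(u, u2)]:
                S[(u, u2)] = keep
                changed = True

    # Step 3: return the result; it is empty if any query edge has no match left.
    if any(not matches for matches in S.values()):
        return {e: set() for e in query_edges}
    return S

# Hypothetical data: dev2 has no customer edge, so (PM2, dev2) is removed.
query_edges = [("pm", "dev"), ("dev", "cust")]
lam = {("pm", "dev"): [("v1", ("PM", "D"))], ("dev", "cust"): [("v2", ("D", "C"))]}
extensions = {
    ("v1", ("PM", "D")): {("PM1", "dev1"), ("PM2", "dev2")},
    ("v2", ("D", "C")): {("dev1", "cust1")},
}
answer = evaluate_with_views(query_edges, lam, extensions)
print(answer)
```

The data graph itself is never accessed: only the cached view extensions are read, which is the point of answering queries using views.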

14 Query evaluation using views
Query: a pattern over customer, developer and project manager.
View 1 (project manager, developer, customer), extension:
◦(project manager, developer): {(PM 1, developer 2), (PM 2, developer 3)}
◦(project manager, customer): {(PM 1, customer 2), (PM 2, customer 2)}
View 2 (customer, developer), extension:
◦(customer, developer): {(customer 2, developer 2), (customer 3, developer 3)}
◦(developer, customer): {(developer 2, customer 2), (developer 2, customer 3), (developer 3, customer 2)}
Edge matches collected for the query (refined with a “bottom-up” strategy into the query result):
◦(project manager, developer): {(PM 1, developer 2), (PM 2, developer 3)}
◦(project manager, customer): {(PM 1, customer 2), (PM 2, customer 2)}
◦(developer, customer): {(developer 2, customer 2), (developer 2, customer 3), (developer 3, customer 2)}
◦(customer, developer): {(customer 2, developer 2), (customer 3, developer 3)}

15 What should be selected?

16 What to choose?
[Figure: a query over customer, developer, project manager, software and tester, and six candidate views (view 1 to view 6), each over a subset of these labels.]
Choose all?

17 Minimum containment

18 A log |E_p|-approximation
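The transcript does not preserve the algorithm on this slide. As a hedged illustration only, choosing the fewest views whose view matches cover all query edges has the shape of a set-cover problem, and the standard greedy set-cover heuristic is known to give a logarithmic approximation in the number of elements to cover (here the query edges). The function name `greedy_view_selection` and the sample data are hypothetical, and this sketch is not claimed to be the authors' procedure.

```python
def greedy_view_selection(query_edges, covers):
    """Greedy set cover over query edges.

    query_edges: iterable of query edges to cover
    covers:      {view name: set of query edges covered by that view's match}
    Returns a list of chosen view names, or None if the views cannot cover the query.
    """
    uncovered = set(query_edges)
    chosen = []
    while uncovered:
        # Pick the view that covers the most still-uncovered query edges.
        best = max(covers, key=lambda name: len(covers[name] & uncovered), default=None)
        if best is None or not covers[best] & uncovered:
            return None  # some query edge is covered by no view: not contained
        chosen.append(best)
        uncovered -= covers[best]
    return chosen

# Hypothetical coverage data for a query with four edges.
query_edges = ["e1", "e2", "e3", "e4"]
covers = {
    "a": {"e1", "e2"},
    "b": {"e2", "e3"},
    "c": {"e3", "e4"},
    "d": {"e1", "e2", "e3", "e4"},
}
selection = greedy_view_selection(query_edges, covers)
print(selection)
```

On this data the greedy choice is the single view `d`, since it covers all four edges in one step.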

19 Minimum containment
[Figure: the query over customer, developer, project manager, software and tester, with candidate views view 1 to view 6 and the covered edge set Ec.]

20 Minimal containment

21 Minimal containment
[Figure: the query over customer, developer, project manager, software and tester, with candidate views view 1 to view 6.]
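The slide text for minimal containment is not preserved either. One plausible way to obtain a minimal set, meaning a set from which no view can be dropped without losing coverage, is sketched below: add views that contribute new query edges, then prune any view made redundant by the others. The function name `minimal_view_selection` and the data are hypothetical, and the procedure is an assumption rather than the authors' algorithm.

```python
def minimal_view_selection(query_edges, covers):
    """Select a minimal (not necessarily minimum) covering set of views.

    query_edges: iterable of query edges to cover
    covers:      {view name: set of query edges covered by that view's match}
    Returns a list of view names, or None if the views cannot cover the query.
    """
    needed = set(query_edges)
    chosen, covered = [], set()
    # Pass 1: keep a view only if it contributes a query edge not covered so far.
    for name, edges in covers.items():
        if (edges & needed) - covered:
            chosen.append(name)
            covered |= edges & needed
        if covered == needed:
            break
    if covered != needed:
        return None
    # Pass 2: drop any view whose edges are all covered by the remaining views.
    for name in list(chosen):
        rest = set().union(*(covers[n] for n in chosen if n != name))
        if needed <= rest:
            chosen.remove(name)
    return chosen

# Hypothetical coverage data: view "b" becomes redundant once "a" and "c" are chosen.
query_edges = ["e1", "e2", "e3", "e4"]
covers = {
    "a": {"e1", "e2"},
    "b": {"e2", "e3"},
    "c": {"e3", "e4"},
    "d": {"e1", "e2", "e3", "e4"},
}
minimal = minimal_view_selection(query_edges, covers)
print(minimal)
```

Here the result is `a` and `c`: neither can be removed, so the set is minimal, yet the single view `d` would be a smaller (minimum) answer. This mirrors the trade-off between the two problems: minimal sets are cheap to find, minimum sets are smaller.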

22 Bounded pattern matching using views
Bounded pattern queries
Answering bounded pattern queries
◦Idea: “reduce” bounded pattern queries to weighted pattern queries
◦View matches: weighted edge to weighted paths
◦Complexity and algorithms carry over to bounded queries
[Figure: a collaborative pattern over customer, developer and project manager with edge labels 2 and 2; a collaborative (chat) network with nodes PM, customer 1, customer 2, developer 1, developer 2 and tester; View 1 (customer, developer, project manager) and View 2 (customer, developer) with edge labels 2, 3 and 2.]
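To make the bounded case concrete: under bounded simulation a pattern edge with bound k is matched by a path of at most k edges in the data graph rather than by a single edge. The helper below is a minimal sketch of that edge-match test using breadth-first search. The name `within_hops` and the sample network are hypothetical, and the reading of the figure's edge labels as hop bounds is an assumption.

```python
from collections import deque

def within_hops(succ, source, k):
    """Return {node: shortest distance} for nodes reachable from source by 1..k edges.

    succ: {node: iterable of successor nodes}
    """
    dist = {}
    frontier = deque([(source, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == k:
            continue  # do not expand beyond the bound
        for nxt in succ.get(node, ()):
            if nxt not in dist:
                dist[nxt] = d + 1
                frontier.append((nxt, d + 1))
    return dist

# Hypothetical chat network: PM1 reaches dev1 only through a tester.
succ = {"PM1": ["tester1"], "tester1": ["dev1"], "dev1": ["cust1"]}
reach = within_hops(succ, "PM1", 2)
print(reach)
```

With bound 2, `PM1` reaches `dev1` through the tester, so a pattern edge (project manager, developer) with bound 2 could be matched by the pair (`PM1`, `dev1`) even though no direct edge exists; `cust1` lies three hops away and is excluded.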

23 Putting everything together
Semantics | Problem | Complexity | Algorithm
Simulation | containment | PTIME | O(card(V)|Q|^2 + |V|^2 + |Q||V|)
Simulation | minimum containment | NP-c/APX-hard | log |E_p|-approximable, O(card(V)|Q|^2 + |V|^2 + |Q||V| + |Q|card(V)^(3/2))
Simulation | minimal containment | PTIME | O(card(V)|Q|^2 + |V|^2 + |Q||V|)
Simulation | evaluation | PTIME | O(|Q||V(G)| + |V(G)|^2)
Bounded simulation | containment | PTIME | O(|Q|^2|V|)
Bounded simulation | minimum containment | NP-c/APX-hard | log |E_p|-approximable, O(|Q|^2|V| + |Q|card(V)^(3/2))
Bounded simulation | minimal containment | PTIME | O(|Q|^2|V|)
Bounded simulation | evaluation | PTIME | O(|Q||V(G)| + |V(G)|^2)
Comparison of containment across query classes
Classes: Relational; XML; graph/RDF
Languages: conjunctive query, relational algebra; XPath (XQuery); RPQs, ECRPQs, (P)SPARQL, (bounded) pattern query
Containment, in slide order: NP-c, undecidable, coNP-c - undecidable, undecidable, EXPTIME, PTIME

24 Experimental study

25 Efficiency: pattern queries
Youtube views, with predicates such as: “Music”, < 7 days; Comedy, View > 10k; “Sports”, Rate > 4
◦2.2 times and 1.75 times faster
◦Greater improvement over denser graphs (|E| = |V|^a)

26 Efficiency: bounded pattern queries
Amazon views, with predicates such as: “Books”, rating > 4; “Music CD”, sales rank > 5000; “DVD”, reviews > 1000
◦10 times and 7.1 times faster
◦Greater improvement over larger graphs

27 Minimum vs. Minimal
Minimum takes slightly more time to find substantially smaller sets of views

28 Conclusion
Pattern containment is tractable for (bounded) pattern queries
Query evaluation using views is much more efficient for large graphs than “batch” counterparts
Journey just starts…
◦More features to select good views to cache?
◦When a query is not contained in existing views?
◦View-based subgraph queries?

29 Thank you! Answering pattern query using views

