Towards Efficient Query Processing on Massive Evolving Graphs (C-Big2012) Arash Fard, Amir Abdolrashidi, Lakshmish Ramaswamy and John A. Miller UGA Presentation.

Presentation transcript:

Towards Efficient Query Processing on Massive Evolving Graphs (C-Big2012). Arash Fard, Amir Abdolrashidi, Lakshmish Ramaswamy and John A. Miller, UGA. Presentation by: Charith Wickramaarachchi

Time Evolving Graph: a paradigm for modeling dynamic relationships in networks. TEG: a series of snapshots of a graph that evolves over time. Examples: the Web graph, the relationship structure of social networks, communication flow networks, and the evolution history of genome families.

TEG and Scalability. Additional dimension: time. New queries: historical, inverse temporal, continuous. Data volume. Indexing.

Overview. Data distribution strategies for TEGs. Answering reachability queries. Subgraph queries in large TEGs.

TEG distribution. Objectives: improve node utilization; minimize the communication cost. Strategies: random distribution (improves node utilization, high communication); connected sub-graph distribution (low communication, low node utilization).
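The trade-off between the two strategies can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy graph, and the use of "cut edges" as a stand-in for communication cost are all assumptions made for illustration.

```python
import random
from collections import deque

def random_partition(vertices, k, seed=0):
    """Assign every vertex to one of k partitions uniformly at random."""
    rng = random.Random(seed)
    return {v: rng.randrange(k) for v in vertices}

def component_partition(vertices, edges, k):
    """Place each connected component, whole, on the least-loaded partition."""
    adj = {v: set() for v in vertices}
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    part, load, seen = {}, [0] * k, set()
    for v in vertices:
        if v in seen:
            continue
        # Collect the component containing v with a breadth-first search.
        comp, queue = [], deque([v])
        seen.add(v)
        while queue:
            x = queue.popleft()
            comp.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        target = load.index(min(load))
        for x in comp:
            part[x] = target
        load[target] += len(comp)
    return part

def cut_edges(edges, part):
    """Edges whose endpoints sit on different partitions (communication proxy)."""
    return sum(1 for x, y in edges if part[x] != part[y])

def loads(part, k):
    """Number of vertices per partition (utilization proxy)."""
    counts = [0] * k
    for p in part.values():
        counts[p] += 1
    return counts

# Hypothetical toy graph: one component of four vertices, one of two.
VERTICES = ["a", "b", "c", "d", "e", "f"]
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")]

by_component = component_partition(VERTICES, EDGES, 2)
at_random = random_partition(VERTICES, 2)
print(cut_edges(EDGES, by_component), loads(by_component, 2))
print(cut_edges(EDGES, at_random), loads(at_random, 2))
```

On this toy input the component-based placement cuts no edges but leaves the partitions unevenly loaded, which is the low-communication, low-utilization corner described above; the random placement depends on the seed.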

Type of algorithm: high communication, low computation (PageRank, HCC): min-cut; low communication (SSSP): random distribution. Dynamic nature of the graph: additions and deletions of nodes; repartitioning cost (data transfer cost, re-wiring cost). Data node configuration: more partitions than compute nodes (Partition : CC); smaller-sized partitions; small stragglers.

Reachability queries in TEGs. {G_1, G_2, ..., G_q, ..., G_r}: the snapshots of the TEG G. Diff(G_q, G_{q-1}): the changes between snapshots G_q and G_{q-1} (vertex addition, edge addition). Reach(v, w, q): TRUE or FALSE, depending on whether w is reachable from v in snapshot G_q.

Reachability Queries in Static Graphs. On-demand traversal: O(M+N). Pre-indexing: O(1) with a precomputed spanning tree; high indexing time; index table. Limitations for TEGs: high indexing cost (each snapshot needs its own index), high storage overhead, low cost-benefit ratio.
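For reference, the on-demand option is an ordinary graph traversal. The following is a minimal sketch using breadth-first search over a directed adjacency list; the function name and the toy graph are assumptions, not taken from the slides.

```python
from collections import deque

def reach(adj, u, v):
    """Return True if v is reachable from u; visits each vertex and edge at most once."""
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False

# Hypothetical directed graph given as an adjacency list.
GRAPH = {"A": ["B"], "B": ["C"], "C": [], "D": ["A"]}
print(reach(GRAPH, "A", "C"))
print(reach(GRAPH, "C", "A"))
```

Every query pays the full traversal cost, which is what a precomputed index avoids at the price of indexing time and storage.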

Approach: interval-based indexing.

Approach: steps (assume the query is Reach(u,v,q), where q > p and G_p is indexed). First, answer Reach(u,v,p) from the index. Then ask: does Diff(G_p, G_q) change that answer? Naïve approach: process Diff(G_p, G_q) in chronological order. A better approach: check whether the changes affect the reachability at all.
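The naïve approach can be sketched as follows, under some assumptions that are not in the slides: the graph is directed, a snapshot is a set of edges, and a diff is a chronological list of hypothetical ("add", x, y) and ("del", x, y) records. The toy edges are made up for illustration.

```python
from collections import deque

def apply_diffs(edges_p, diffs):
    """Replay the diff records in chronological order on a copy of G_p."""
    edges = set(edges_p)
    for op, x, y in diffs:
        if op == "add":
            edges.add((x, y))
        elif op == "del":
            edges.discard((x, y))
    return edges

def reach_in(edges, u, v):
    """Breadth-first search over a directed edge set."""
    adj = {}
    for x, y in edges:
        adj.setdefault(x, []).append(y)
    seen, queue = {u}, deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False

def naive_reach(edges_p, diffs, u, v):
    """Rebuild G_q from G_p and the diffs, then traverse it."""
    return reach_in(apply_diffs(edges_p, diffs), u, v)

# Hypothetical indexed snapshot G_p and the diffs leading to G_q.
G_P = {("A", "B"), ("G", "H")}
DIFFS = [("add", "B", "E"), ("add", "F", "G"), ("add", "E", "F")]
print(reach_in(G_P, "A", "H"))
print(naive_reach(G_P, DIFFS, "A", "H"))
```

This ignores the index entirely and pays for both the replay and a full traversal on every query, which is the cost the better approach tries to avoid.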

Approach (example): Reach(A,H,3). Add(E,F)? Related? Add(B,E) & Add(F,G) & Add(E,F).

Observations. If Reach(u,v,p) = true: the diffs need to be processed only if the diff stack contains at least one delete(x,y) where (x,y) is an edge on a path from u to v in G_p. If Reach(u,v,p) = false: the diffs need to be processed only if the stack contains at least one Add(x,y), with x reachable from u and y reachable from v.
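These observations can be turned into a conservative filter that decides whether the diffs need to be looked at. The sketch below is one reading of the slide, not the authors' algorithm: it assumes a directed graph, hypothetical ("add", x, y) and ("del", x, y) diff records, and it recomputes reachable sets by traversal where a real system would consult the index. In the false case it only tests two necessary conditions, so it may say "process" when the answer does not actually change.

```python
from collections import deque

def reachable_set(edges, start):
    """All vertices reachable from start over a directed edge set."""
    adj = {}
    for x, y in edges:
        adj.setdefault(x, []).append(y)
    seen, queue = {start}, deque([start])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, ()):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

def must_process_diffs(edges_p, diffs, u, v):
    """Return False only when Reach(u, v) provably has the same answer in G_q as in G_p."""
    from_u = reachable_set(edges_p, u)
    # Vertices that can reach v: traverse the reversed edges starting at v.
    to_v = reachable_set({(y, x) for x, y in edges_p}, v)
    if v in from_u:
        # True at G_p: only deleting an edge that lies on some u-to-v route can matter.
        return any(op == "del" and x in from_u and y in to_v
                   for op, x, y in diffs)
    # False at G_p: a new route needs an added edge leaving the part reachable
    # from u and an added edge entering the part that reaches v.
    leaves_u = any(op == "add" and x in from_u for op, x, y in diffs)
    enters_v = any(op == "add" and y in to_v for op, x, y in diffs)
    return leaves_u and enters_v

# Hypothetical snapshots and diffs.
G_P = {("A", "B"), ("G", "H")}
CHAIN = [("add", "B", "E"), ("add", "F", "G"), ("add", "E", "F")]
print(must_process_diffs(G_P, CHAIN, "A", "H"))
print(must_process_diffs(G_P, [("add", "X", "Y")], "A", "H"))
```

When the filter returns False the indexed answer for G_p can be returned directly; otherwise the relevant diffs still have to be examined.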

Graph Pattern Matching. Subgraph isomorphism: a bijective mapping between the query graph Q(V_q, E_q) and a subgraph G'(V', E') of the target graph G. There exists a bijection f : V' --> V_q such that for all v', w' in V', (v', w') in E' ↔ (f(v'), f(w')) in E_q. Simulation: G(V, E) matches Q(V_q, E_q) if there exists a relation R, a subset of V_q x V, such that (u, u') in R implies that u and u' have the same label; for every u in V_q there is a u' in V with (u, u') in R; and for every (u, u') in R and every edge (u, v) in E_q there is an edge (u', v') in E with (v, v') in R.
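The simulation relation can be computed by starting from all label-compatible pairs and repeatedly discarding pairs that violate the edge condition. The following is a minimal centralized sketch of that fixed-point idea; the function name, the dictionary-based graph encoding, and the toy data are assumptions, and this is not the distributed algorithm from the talk.

```python
def graph_simulation(q_labels, q_edges, g_labels, g_edges):
    """Return the largest simulation as {query vertex: set of graph vertices}, or {} if none."""
    g_children = {}
    for x, y in g_edges:
        g_children.setdefault(x, set()).add(y)
    q_children = {}
    for a, b in q_edges:
        q_children.setdefault(a, set()).add(b)

    # Start from every label-compatible pair.
    sim = {a: {x for x, lab in g_labels.items() if lab == q_labels[a]}
           for a in q_labels}

    changed = True
    while changed:
        changed = False
        for a in q_labels:
            for b in q_children.get(a, ()):
                # Keep x for a only if some child of x still matches b.
                keep = {x for x in sim[a] if g_children.get(x, set()) & sim[b]}
                if keep != sim[a]:
                    sim[a] = keep
                    changed = True

    # Every query vertex needs at least one match.
    if any(not matches for matches in sim.values()):
        return {}
    return sim

# Hypothetical query a -> b and data graph.
Q_LABELS = {"a": "A", "b": "B"}
Q_EDGES = [("a", "b")]
G_LABELS = {1: "A", 2: "B", 3: "A", 4: "C"}
G_EDGES = [(1, 2), (3, 4)]
print(graph_simulation(Q_LABELS, Q_EDGES, G_LABELS, G_EDGES))
```

In the toy data, vertex 3 carries the right label but has no child labeled B, so it is dropped from the match of a while vertex 1 is kept.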

Vertex-centric approach. Input: graph G(V, E, l) and query Q(V_q, E_q, l_q). Output: M, a maximum match in G for Q. Uses GPS features: a master for global operations.

Vertex-centric approach, step by step. 1st: the master broadcasts the query. 2nd: each vertex whose label is the same as a label in Q gets flagged; S is the set of matched nodes (note that a vertex v in G can be matched to two vertices in Q). Each vertex keeps a set of lists of labels for its possible children; if its number of outgoing edges is less than that of any list of children, that entry is removed. The vertex sends its id to its children. 3rd: the children reply with their id and label. 4th: if the received child labels are a superset of the matched children's labels in Q, keep the match; otherwise remove it. Pass the removal report to the parents. 5th: remove the child from the list and check for validity in S; if no longer valid, remove yourself from S and report to the parents. Next: go to the 5th step.

Conclusion. TEG processing is an emerging research area with many applications. There is a need for new partitioning techniques and graph query techniques. Do TEG processing applications benefit more from an EDA-based model than from a traditional query processing model?