PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs Joseph E. Gonzalez, Yucheng Low, Haijie Gu, and Danny Bickson, Carnegie Mellon University


1 PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
Joseph E. Gonzalez, Yucheng Low, Haijie Gu, and Danny Bickson, Carnegie Mellon University; Carlos Guestrin, University of Washington

2 Current State
1. Many machine learning and data mining (MLDM) problems are represented as graphs
2. Graph-structured computation is important
3. Graphs are big
4. Current systems provide graph-parallel computation
   - Pregel
   - GraphLab

3 Solution 1: Pregel Vertex Program

4 Solution 2: GraphLab Shared Distributed Graph

5 Problem
Many graphs have a skewed degree distribution
Issue: Natural Graphs
[Figure labels: Machine 1, Machine 3]

6 What Is a Natural Graph?

7 GraphLab and Pregel on Natural Graphs
- Work imbalance
- Random partitioning
- Storage is linear in degree
- Expensive communication
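The cost of random partitioning under an edge cut can be illustrated with a small simulation. This is a minimal sketch, not code from either system: `cut_fraction` and the star-shaped toy graph are hypothetical names and data chosen for illustration. It assigns each vertex to a uniformly random machine and counts the edges whose endpoints land on different machines; for p machines that fraction should be close to 1 - 1/p, so nearly every edge requires communication once p is large.

```python
import random

def cut_fraction(edges, num_machines, seed=0):
    """Fraction of edges cut when vertices are placed on machines at random.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    owner = {}  # vertex -> machine that owns it
    cut = 0
    for u, v in edges:
        for x in (u, v):
            if x not in owner:
                owner[x] = rng.randrange(num_machines)
        if owner[u] != owner[v]:
            cut += 1  # endpoints live on different machines
    return cut / len(edges)

# Toy "natural" graph: one very high-degree hub (assumed data).
edges = [(0, i) for i in range(1, 10001)]
for p in (2, 8, 64):
    print(p, cut_fraction(edges, p), "vs. 1 - 1/p =", 1 - 1 / p)
```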

8 Solution
Pregel             PowerGraph
Edge Cut           Vertex Cut
Replicate Edges    Replicate Vertices
Parallelize the vertex program across all machines with that vertex

9 Balanced P-way Vertex Cut
[Figure: vertices labeled V]
Idea: Distribute edges while minimizing vertex replication

10 Distributing Edges: Random
Idea: Randomly assign edges to machines
- Why is this better than Pregel?
Theorem: For a given edge cut with g ghosts, any vertex cut along the same partition boundary has fewer than g mirrors.
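Random edge placement is simple enough to simulate directly. The sketch below is an illustration under stated assumptions, not PowerGraph's implementation: the function names and the star graph are hypothetical. Each edge goes to a uniformly random machine, and a vertex gets a replica on every machine that holds one of its edges. The closed form in `expected_replication` is the standard balls-into-bins expectation: a vertex of degree d spans p(1 - (1 - 1/p)^d) machines on average.

```python
import random
from collections import defaultdict

def random_vertex_cut(edges, p, seed=0):
    """Place each edge on a random machine; return vertex -> set of machines."""
    rng = random.Random(seed)
    replicas = defaultdict(set)
    for u, v in edges:
        m = rng.randrange(p)
        replicas[u].add(m)  # both endpoints need a copy where the edge lives
        replicas[v].add(m)
    return replicas

def replication_factor(replicas):
    """Average number of copies per vertex."""
    return sum(len(ms) for ms in replicas.values()) / len(replicas)

def expected_replication(degrees, p):
    """Balls-into-bins expectation of the replication factor."""
    return sum(p * (1 - (1 - 1 / p) ** d) for d in degrees) / len(degrees)

# Toy graph (assumed data): a hub with 1000 neighbors, on 8 machines.
star = [(0, i) for i in range(1, 1001)]
reps = random_vertex_cut(star, p=8)
print(replication_factor(reps), expected_replication([1000] + [1] * 1000, 8))
```

Only the high-degree hub is replicated widely; every degree-1 vertex keeps a single copy, so the average stays close to 1.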

11 Distributing Edges: Greedy
- Further minimize replication of vertices
- Idea: Place the next edge on the machine that minimizes vertex replication
- Greedy approaches
  - Coordinated
  - Oblivious
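The greedy idea can be sketched as a single-pass heuristic. This is a simplified illustration, not the exact rule set used by PowerGraph: `greedy_vertex_cut`, the `slack` parameter, and the capacity cap are assumptions introduced here to keep the cut balanced. For each edge it prefers a machine that already holds both endpoints, then a machine that holds either endpoint, then any machine, always picking the least-loaded candidate that still has room.

```python
import math
from collections import defaultdict

def greedy_vertex_cut(edges, p, slack=1.1):
    """Greedy edge placement with a per-machine capacity cap (assumed rule)."""
    capacity = math.ceil(slack * len(edges) / p)  # balance constraint
    replicas = defaultdict(set)  # vertex -> machines holding a copy
    load = [0] * p               # edges stored on each machine
    placement = []
    for u, v in edges:
        a_u, a_v = replicas[u], replicas[v]
        # Prefer machines that add no replica, then one replica, then any.
        preferred = (a_u & a_v) or (a_u | a_v) or set(range(p))
        candidates = [m for m in preferred if load[m] < capacity]
        if not candidates:  # preferred machines are full: fall back to any
            candidates = [m for m in range(p) if load[m] < capacity]
        m = min(candidates, key=lambda i: (load[i], i))
        replicas[u].add(m)
        replicas[v].add(m)
        load[m] += 1
        placement.append(m)
    return placement, replicas, load

# Two disjoint triangles on two machines (assumed toy data).
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
placement, replicas, load = greedy_vertex_cut(triangles, p=2, slack=1.0)
print(placement, load)
```

A coordinated variant would share the `replicas` table across all machines placing edges, while an oblivious variant would let each machine keep its own local estimate; the sketch above corresponds to a single placer.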

12 Edge Distribution

13 Implementations
- Synchronous (Pregel)
- Asynchronous
- Asynchronous and Serializable (GraphLab)

14 Discussion: Edge Placement and Run Time

15 Discussion: GAS Decomposition
- Gather: Collect information about neighboring vertices
- Apply: Vertex updates its value based on the gathered data
- Scatter: Vertex shares its new value with its neighbors
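The three phases can be made concrete with PageRank written in gather/apply/scatter form. This is a minimal single-machine sketch run in synchronous rounds, not PowerGraph's API: `pagerank_gas`, its parameters, and the toy graph are hypothetical. The gather step is a sum, which is commutative and associative, and that is what would let partial sums be computed separately on each machine holding a replica of the vertex.

```python
def pagerank_gas(out_edges, num_iters=20, damping=0.85):
    """PageRank in gather/apply/scatter form (illustrative sketch)."""
    vertices = set(out_edges) | {v for vs in out_edges.values() for v in vs}
    in_edges = {v: [] for v in vertices}
    for u, vs in out_edges.items():
        for v in vs:
            in_edges[v].append(u)
    out_deg = {v: len(out_edges.get(v, [])) for v in vertices}
    rank = {v: 1.0 for v in vertices}
    for _ in range(num_iters):
        new_rank = {}
        for v in vertices:
            # Gather: sum the contributions of in-neighbors.
            acc = sum(rank[u] / out_deg[u] for u in in_edges[v])
            # Apply: compute the vertex's new value from the gathered sum.
            new_rank[v] = (1 - damping) + damping * acc
        # Scatter: in this synchronous sketch, the new values simply become
        # visible to all neighbors at the start of the next round.
        rank = new_rank
    return rank

# Toy graph (assumed data): a directed 3-cycle.
print(pagerank_gas({"a": ["b"], "b": ["c"], "c": ["a"]}))
```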

16 What About Alpha?
- PowerGraph is a solution for natural graphs
- Can we do better if alpha (the power-law exponent of the degree distribution) is always around 2?

17 Fully Characterizing Natural Graphs
Conclusions:
- Out-degree grows over time, changing the value of alpha
- Vertex diameters often decrease as a graph grows
What does this mean when graphs are constantly changing in PowerGraph?

18 Takeaways
- Vertex-cut implementation allows for greater parallelization of vertex programs and reduced replication of mirrors
- GAS decomposition is not fundamental to PowerGraph’s implementation

