Published by Madeleine Hart. Modified over 9 years ago.
1
PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
Joseph E. Gonzalez, Yucheng Low, Haijie Gu, and Danny Bickson, Carnegie Mellon University; Carlos Guestrin, University of Washington
2
Current State
1. Many MLDM problems are represented as graphs
2. Graph-structured computation is important
3. Graphs are big
4. Current systems provide graph-parallel computation
   - Pregel
   - GraphLab
3
Solution 1: Pregel Vertex Program
4
Solution 2: GraphLab Shared Distributed Graph
5
Problem
Many graphs have a skewed degree distribution
Issue: Natural Graphs
(Figure labels: Machine 1, Machine 3)
6
What is a Natural Graph?
7
GraphLab and Pregel on Natural Graphs
- Work imbalance
- Random partitioning
- Storage is linear in degree
- Expensive communication
8
Solution
- Pregel: Edge Cut, Replicate Edges
- PowerGraph: Vertex Cut, Replicate Vertices
Parallelize the vertex program across all machines with that vertex
9
Balanced P-way Vertex Cut
Idea: Distribute edges while minimizing vertex replication
10
Distributing Edges: Random
Idea: Randomly assign edges to machines
- Why is this better than Pregel?
Theorem: For a given edge cut with g ghosts, any vertex cut along the same partition boundary has fewer than g mirrors.
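A minimal sketch of random edge placement, assuming edges are given as (u, v) tuples and machines are numbered 0 to p-1. The function names, the `seed` parameter, and the star-graph example are illustrative and do not come from the slides. `replication_factor` measures the average number of machines holding a copy of each vertex, and `expected_replication` uses the standard occupancy argument that a vertex of degree d lands on p(1 - (1 - 1/p)^d) distinct machines on average when each edge is placed independently.

```python
import random
from collections import defaultdict


def random_vertex_cut(edges, p, seed=0):
    """Assign every edge to one of p machines uniformly at random.

    Returns a list of machine ids, parallel to `edges`.
    """
    rng = random.Random(seed)
    return [rng.randrange(p) for _ in edges]


def replication_factor(edges, assignment):
    """Average number of machines that hold a replica of each vertex."""
    machines_of = defaultdict(set)
    for (u, v), m in zip(edges, assignment):
        machines_of[u].add(m)
        machines_of[v].add(m)
    return sum(len(ms) for ms in machines_of.values()) / len(machines_of)


def expected_replication(degrees, p):
    """Expected replication factor under independent uniform placement."""
    return sum(p * (1 - (1 - 1 / p) ** d) for d in degrees) / len(degrees)


# Illustrative skewed graph: one hub connected to 1000 leaves.
star = [(0, leaf) for leaf in range(1, 1001)]
placement = random_vertex_cut(star, p=4)
measured = replication_factor(star, placement)
predicted = expected_replication([1000] + [1] * 1000, p=4)
```

In the star example only the hub is replicated, so the measured and predicted values stay close to 1 even though the hub's edges are spread over every machine.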
11
Distributing Edges: Greedy
- Further minimize replication of vertices
- Idea: Place the next edge on the machine that minimizes vertex replication
- Greedy approaches
  - Coordinated
  - Oblivious
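The greedy idea can be sketched as follows. This is a simplified single-process heuristic, not PowerGraph's exact rule set: each edge goes to the machine that would create the fewest new vertex replicas, with ties broken by current load and then by machine id. The function name and the tie-breaking rule are assumptions made for illustration, and the sketch does not model the difference between the coordinated and oblivious variants.

```python
from collections import defaultdict


def greedy_vertex_cut(edges, p):
    """Place edges one at a time, minimizing newly created vertex replicas.

    Returns (assignment, replicas): a list of machine ids parallel to
    `edges`, and a dict mapping each vertex to the machines that hold it.
    """
    replicas = defaultdict(set)  # vertex -> machines holding a replica
    load = [0] * p               # number of edges placed on each machine
    assignment = []

    for u, v in edges:
        def cost(m):
            # Replicas that placing (u, v) on machine m would add.
            new_replicas = (m not in replicas[u]) + (m not in replicas[v])
            # Tie-break by load, then machine id (illustrative choice).
            return (new_replicas, load[m], m)

        best = min(range(p), key=cost)
        assignment.append(best)
        replicas[u].add(best)
        replicas[v].add(best)
        load[best] += 1

    return assignment, replicas


triangle, _ = greedy_vertex_cut([(0, 1), (1, 2), (0, 2)], p=2)
disjoint, _ = greedy_vertex_cut([(0, 1), (2, 3)], p=2)
```

Because load only breaks ties here, a connected graph tends to pile onto one machine (the triangle ends up entirely on machine 0), while unrelated edges spread out. A practical placement would also need to weigh balance against replication.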
12
Edge Distribution
13
Implementations
- Synchronous (Pregel)
- Asynchronous
- Asynchronous and Serializable (GraphLab)
14
Discussion: Edge Placement and Run Time
15
Discussion: GAS Decomposition
- Gather: collect information about surrounding vertices
- Apply: vertex updates its value based on the gathered data
- Scatter: vertex shares its new value with its neighbors
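The three phases can be illustrated with a toy synchronous engine running a PageRank-style update. This is a sketch under stated assumptions rather than PowerGraph's API: the function names, the `out_edges` adjacency-dict input, the damping factor, and the tolerance-based reactivation rule are all chosen for illustration.

```python
def gather(rank, out_degree, u):
    """Gather: contribution flowing in from in-neighbor u."""
    return rank[u] / out_degree[u]


def apply(acc, damping):
    """Apply: compute the vertex's new value from the gathered sum."""
    return (1 - damping) + damping * acc


def scatter(old, new, tol):
    """Scatter: decide whether out-neighbors must see the new value."""
    return abs(new - old) > tol


def run_gas(out_edges, damping=0.85, tol=1e-6, max_supersteps=100):
    """Run gather/apply/scatter in synchronous supersteps."""
    vertices = set(out_edges)
    for targets in out_edges.values():
        vertices.update(targets)
    in_nbrs = {v: [] for v in vertices}
    for u, targets in out_edges.items():
        for v in targets:
            in_nbrs[v].append(u)
    out_degree = {v: len(out_edges.get(v, ())) for v in vertices}

    rank = {v: 1.0 for v in vertices}
    active = set(vertices)
    for _ in range(max_supersteps):
        if not active:
            break
        # Gather and apply read only the previous superstep's values.
        updates = {}
        for v in active:
            acc = sum(gather(rank, out_degree, u) for u in in_nbrs[v])
            updates[v] = apply(acc, damping)
        # Commit, then scatter: changed vertices reactivate out-neighbors.
        next_active = set()
        for v, new in updates.items():
            if scatter(rank[v], new, tol):
                next_active.update(out_edges.get(v, ()))
            rank[v] = new
        active = next_active
    return rank


cycle = run_gas({0: [1], 1: [2], 2: [0]})
chain = run_gas({0: [1]})
```

Splitting the update this way is what lets the gather sum for a high-degree vertex be computed piecewise on each machine holding a replica and then combined, since the sum is commutative and associative.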
16
What About Alpha?
- PowerGraph is a solution for natural graphs
- Can we do better if alpha (the power-law exponent of the degree distribution) is always around 2?
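To make the role of alpha concrete: in a power-law graph the probability that a vertex has degree d is proportional to d^(-alpha). The sketch below is illustrative only. It assumes a distribution truncated at a hypothetical maximum degree `d_max` and computes what share of all edge endpoints sits on vertices at or above a chosen degree threshold; the function names and parameters are not from the slides.

```python
def power_law_pmf(alpha, d_max):
    """P(d) proportional to d**(-alpha) for d = 1..d_max, normalized."""
    weights = [d ** -alpha for d in range(1, d_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def edge_share_above(alpha, d_max, threshold):
    """Fraction of edge endpoints on vertices with degree >= threshold."""
    pmf = power_law_pmf(alpha, d_max)
    total = sum(d * prob for d, prob in enumerate(pmf, start=1))
    heavy = sum(d * prob for d, prob in enumerate(pmf, start=1)
                if d >= threshold)
    return heavy / total


# Compare how much of the edge mass high-degree vertices carry.
share_alpha_2 = edge_share_above(2.0, d_max=1000, threshold=100)
share_alpha_3 = edge_share_above(3.0, d_max=1000, threshold=100)
```

A smaller alpha gives a heavier tail, so more of the edges concentrate on a few high-degree vertices; those are the vertices that benefit most from being split by a vertex cut.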
17
Fully Characterizing Natural Graphs
Conclusions:
- Out-degree grows over time, changing the value of alpha
- Graph diameter often decreases as a graph grows
What does this mean when graphs are constantly changing in PowerGraph?
18
Takeaways
- The vertex-cut implementation allows for greater parallelization of vertex programs and reduced replication (fewer mirrors)
- GAS decomposition is not fundamental to PowerGraph's implementation