GraphLab A New Parallel Framework for Machine Learning Carnegie Mellon Based on Slides by Joseph Gonzalez Mosharaf Chowdhury.


1 GraphLab: A New Parallel Framework for Machine Learning
Carnegie Mellon
Based on Slides by Joseph Gonzalez
Mosharaf Chowdhury

2 The Need for a New Abstraction
Data-Parallel (Map Reduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics
Graph-Parallel (Pregel (Giraph)): Belief Propagation, SVM, Kernel Methods, Deep Belief Networks, Neural Networks, Tensor Factorization, PageRank, Lasso

3 GraphLab wants to support
1. Sparse Computational Dependencies
2. Asynchronous Iterative Computation
3. Sequential Consistency
4. Prioritized Ordering
5. Rapid Development

4 The GraphLab Framework
– Scheduler
– Consistency Model
– Graph Based Data Representation
– Update Functions (User Computation)

5 Data Graph
A graph with arbitrary data (C++ Objects) associated with each vertex and edge.
Graph: Social Network
Vertex Data: User profile text, Current interests estimates
Edge Data: Similarity weights
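
The data graph idea can be sketched in a few lines of Python. This is only an illustration, not GraphLab's API (the slide describes C++ objects): the class name `DataGraph`, its methods, and the sample users and values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataGraph:
    """Hypothetical undirected graph with user data on vertices and edges."""
    vertex_data: dict = field(default_factory=dict)  # vertex id -> arbitrary data
    edge_data: dict = field(default_factory=dict)    # frozenset({u, v}) -> arbitrary data
    adjacency: dict = field(default_factory=dict)    # vertex id -> set of neighbor ids

    def add_vertex(self, vid, data):
        self.vertex_data[vid] = data
        self.adjacency.setdefault(vid, set())

    def add_edge(self, u, v, data):
        if u not in self.vertex_data or v not in self.vertex_data:
            raise KeyError("both endpoints must be added before the edge")
        # The edge is undirected, so it is keyed by the unordered pair.
        self.edge_data[frozenset((u, v))] = data
        self.adjacency[u].add(v)
        self.adjacency[v].add(u)

    def neighbors(self, vid):
        return sorted(self.adjacency[vid])

    def edge(self, u, v):
        return self.edge_data[frozenset((u, v))]


# Made-up social network: profile text and interest estimates on vertices,
# a similarity weight on the edge.
graph = DataGraph()
graph.add_vertex("alice", {"profile": "hiking, jazz", "interests": {"music": 0.5}})
graph.add_vertex("bob", {"profile": "jazz, chess", "interests": {"music": 0.9}})
graph.add_edge("alice", "bob", {"similarity": 0.8})
```

Keying edges by an unordered pair means `graph.edge("bob", "alice")` and `graph.edge("alice", "bob")` return the same data.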

6 Update Functions
An update function is a user-defined program which, when applied to a vertex, transforms the data in the scope of the vertex.
label_prop(i, scope){
  // Get Neighborhood data
  (Likes[i], W_ij, Likes[j]) ← scope;
  // Update the vertex data
  // Reschedule Neighbors if needed
  if Likes[i] changes then
    reschedule_neighbors_of(i);
}
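
A runnable sketch of such an update function follows. The slide does not spell out the update rule, so the weighted average of neighbor values used here is an assumption. The argument layout (`likes`, `weights`, `neighbors`), the tolerance `tol`, and the sample data are hypothetical as well.

```python
def label_prop(i, likes, weights, neighbors, tol=1e-3):
    """Recompute likes[i] from the scope of vertex i.

    Returns the neighbors that should be rescheduled (empty if likes[i]
    did not change by more than tol). The weighted-average rule is an
    assumption; the slide only says "update the vertex data".
    """
    total = sum(weights[frozenset((i, j))] for j in neighbors[i])
    if total == 0:
        return []  # isolated vertex or all-zero weights: nothing to do
    new_value = sum(weights[frozenset((i, j))] * likes[j] for j in neighbors[i]) / total
    changed = abs(new_value - likes[i]) > tol
    likes[i] = new_value
    return list(neighbors[i]) if changed else []


# Tiny example: vertex "a" with two neighbors of different similarity weight.
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
weights = {frozenset(("a", "b")): 1.0, frozenset(("a", "c")): 3.0}
likes = {"a": 0.0, "b": 1.0, "c": 0.0}

first = label_prop("a", likes, weights, neighbors)   # likes["a"] becomes 0.25
second = label_prop("a", likes, weights, neighbors)  # no change the second time
```

The first call changes `likes["a"]` and asks for both neighbors to be rescheduled. The second call finds nothing to change and returns an empty list, which is what lets an iterative run terminate.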

7 The Scheduler
The scheduler determines the order in which vertices are updated. The process repeats until the scheduler is empty.
[Figure: a graph with vertices a through k, a Scheduler queue of vertices, and CPU 1 and CPU 2.]
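
The scheduling loop can be sketched as follows, under simplifying assumptions: a single thread stands in for the two CPUs, the scheduler is a plain FIFO queue, and the update function (propagating the minimum value along edges) is made up for the example. `FifoScheduler` and `run` are hypothetical names, not GraphLab's API.

```python
from collections import deque


class FifoScheduler:
    """Hypothetical FIFO scheduler that never holds a vertex twice."""

    def __init__(self, initial=()):
        self._queue = deque()
        self._pending = set()
        for v in initial:
            self.add(v)

    def add(self, v):
        if v not in self._pending:
            self._pending.add(v)
            self._queue.append(v)

    def pop(self):
        v = self._queue.popleft()
        self._pending.discard(v)
        return v

    def __bool__(self):
        return bool(self._queue)


def run(scheduler, update):
    """Apply update to scheduled vertices until the scheduler is empty."""
    count = 0
    while scheduler:
        v = scheduler.pop()
        for u in update(v):  # update returns the vertices to reschedule
            scheduler.add(u)
        count += 1
    return count


# Example update: every vertex adopts the smallest value in its neighborhood.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
value = {"a": 3, "b": 1, "c": 2}


def min_update(v):
    best = min([value[v]] + [value[u] for u in neighbors[v]])
    if best < value[v]:
        value[v] = best
        return neighbors[v]  # value changed, so neighbors may need another look
    return []


scheduler = FifoScheduler(["a", "b", "c"])
updates = run(scheduler, min_update)
```

Because update functions only reschedule neighbors when something changed, the queue drains once the values have converged.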

8 Sequential Consistency Models
– Full Consistency
– Edge Consistency
[Figure: Canonical Lock Ordering, with Read and Write locks marked on the vertices.]
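
The canonical lock ordering idea can be sketched with Python threads. The slide distinguishes read and write locks, but Python's standard library has no reader-writer lock, so this sketch uses exclusive `threading.Lock` objects for every vertex in the scope. That is stricter than either model on the slide and is meant only to show the ordering trick. The graph, `locked_update`, and the counter workload are hypothetical.

```python
import threading

neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
value = {"a": 0, "b": 0, "c": 0}
locks = {v: threading.Lock() for v in neighbors}  # one exclusive lock per vertex


def locked_update(v, rounds):
    """Repeatedly update the scope of v (v and its neighbors) under locks."""
    # Canonical ordering: every thread acquires locks in sorted vertex-id
    # order, so two overlapping scopes can never wait on each other in a cycle.
    scope = sorted([v] + neighbors[v])
    for _ in range(rounds):
        for u in scope:
            locks[u].acquire()
        try:
            for u in scope:
                value[u] += 1  # read-modify-write on the whole scope
        finally:
            for u in reversed(scope):
                locks[u].release()


threads = [threading.Thread(target=locked_update, args=(v, 1000)) for v in neighbors]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the locks in place every increment is preserved: vertex "b" lies in all three scopes and ends at 3000, while "a" and "c" each lie in two scopes and end at 2000.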

9 Consistency Through Scheduling
Edge Consistency Model:
– Two vertices can be updated simultaneously if they do not share an edge.
Graph Coloring:
– Two vertices can be assigned the same color if they do not share an edge.
[Figure: Phase 1, Phase 2, and Phase 3, each delimited by a Barrier.]
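
The link between coloring and phases can be sketched as follows. The slide does not say which coloring algorithm is used, so the greedy heuristic here is an assumption. The helper names and the four-vertex graph are hypothetical.

```python
def greedy_coloring(neighbors):
    """Give each vertex the smallest color not used by an already-colored neighbor."""
    color = {}
    for v in sorted(neighbors):
        used = {color[u] for u in neighbors[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def phases(color):
    """Group vertices by color; each group can be updated in one phase."""
    grouped = {}
    for v, c in color.items():
        grouped.setdefault(c, []).append(v)
    return [sorted(grouped[c]) for c in sorted(grouped)]


# Triangle a-b-c with an extra vertex d attached to c.
neighbors = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
color = greedy_coloring(neighbors)
schedule = phases(color)  # [['a', 'd'], ['b'], ['c']]
```

No two vertices in the same phase share an edge, so all vertices of a phase could be updated simultaneously under the edge consistency model, with a barrier before the next color starts.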

10 Algorithms Implemented
– PageRank
– Loopy Belief Propagation
– Gibbs Sampling
– CoEM
– Graphical Model Parameter Learning
– Probabilistic Matrix/Tensor Factorization
– Alternating Least Squares
– Lasso with Sparse Features
– Support Vector Machines with Sparse Features
– Label-Propagation
– …


12 The Table

