Graph-Based Parallel Computing William Cohen 1

Announcements Thursday 4/23: student presentations on projects – come with a tablet/laptop/etc Fri 4/24: more sample questions Tuesday 4/28: review session for exam Thursday 4/30: exam – 80 min, in this room – closed book, but you can bring one 8 ½ x 11” sheet (front and back) of notes Tues May 5th: project writeups due 2

After next Thursday: next fall! The course will switch to fall, so it’s happening F2015. If you’re wondering – what looks even better on my c.v. than acing a class called “Machine Learning from Large Datasets”? – how you can get even more expert in big ML – how you can help your fellow students take a really great course on big ML – consider… TA-ing for me next semester 3

Graph-Based Parallel Computing William Cohen 4

Outline Motivation/where it fits Sample systems (c. 2010) – Pregel: and some sample programs Bulk synchronous processing – Signal/Collect and GraphLab Asynchronous processing GraphLab descendants – PowerGraph: partitioning – GraphChi: graphs w/o parallelism – GraphX: graphs over Spark 5

Problems we’ve seen so far Operations on sets of sparse feature vectors: – Classification – Topic modeling – Similarity joins Graph operations: – PageRank, personalized PageRank – Semi-supervised learning on graphs 6

Architectures we’ve seen so far Stream-and-sort: limited-memory, serial, simple workflows + parallelism: Map-reduce (Hadoop) + abstract operators like join, group: PIG, Hive, … + caching in memory and efficient iteration: Spark, Flink, … + parameter servers (Petuum, …) + …..? one candidate: architectures for graph processing 7

Architectures we’ve seen so far Large immutable data structures on (distributed) disk, processing by sweeping through them and creating new data structures: – stream-and-sort, Hadoop, PIG, Hive, … Large immutable data structures in distributed memory: – Spark – distributed tables Large mutable data structures in distributed memory: – parameter server: structure is a hashtable – today: large mutable graphs 8

Outline Motivation/where it fits Sample systems (c. 2010) – Pregel: and some sample programs Bulk synchronous processing – Signal/Collect and GraphLab Asynchronous processing GraphLab descendants – PowerGraph: partitioning – GraphChi: graphs w/o parallelism – GraphX: graphs over Spark 9

GRAPH ABSTRACTIONS: PREGEL (SIGMOD 2010*) *Used internally at least 1-2 years before 10

Many ML algorithms tend to have Sparse data dependencies Local computations Iterative updates Typical example: Gibbs sampling 11

Example: Gibbs Sampling [Guestrin UAI 2010] [figure: a pairwise graphical model over variables X1 … X9] 1) Sparse Data Dependencies 2) Local Computations 3) Iterative Updates For LDA: Z_d,m for X_d,m depends on the other Z’s in doc d, and on the topic assignments to copies of word X_d,m 12

Pregel (Google, Sigmod 2010) Primary data structure is a graph Computations are a sequence of supersteps, in each of which – a user-defined function is invoked (in parallel) at each vertex v, and can get/set its value – the UDF can also issue requests to get/set edges – the UDF can read messages sent to v in the last superstep and schedule messages to send in the next superstep – Halt when every vertex votes to halt Output is a directed graph Also: aggregators (like ALLREDUCE) Bulk synchronous processing (BSP) model: all vertex operations happen simultaneously [diagram: each superstep alternates vertex value changes with communication] 13
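
To make the superstep model concrete, here is a minimal single-machine sketch of a Pregel-style vertex program for PageRank. This is plain Scala, not Google's Pregel API; the toy graph, the damping factor, and the fixed ten-superstep stopping rule (used instead of vote-to-halt) are all invented for illustration.

    import scala.collection.mutable

    object PregelStylePageRank {
      // toy directed graph: vertex -> out-neighbors (illustrative data)
      val outEdges = Map(1 -> List(2, 3), 2 -> List(3), 3 -> List(1))
      val n = outEdges.size
      val beta = 0.85

      def main(args: Array[String]): Unit = {
        var rank = outEdges.keys.map(v => v -> 1.0 / n).toMap
        // "superstep 0": every vertex sends rank/outdegree along its out-edges
        var inbox: Map[Int, List[Double]] = {
          val out = mutable.Map[Int, List[Double]]()
          for ((v, nbrs) <- outEdges; u <- nbrs) out(u) = (rank(v) / nbrs.size) :: out.getOrElse(u, Nil)
          out.toMap
        }
        for (superstep <- 1 to 10) {   // fixed superstep count instead of vote-to-halt
          val outbox = mutable.Map[Int, List[Double]]()
          rank = rank.map { case (v, _) =>
            // the user-defined compute() runs at every vertex "in parallel":
            // read messages from the last superstep, update the vertex value,
            // and schedule messages for the next superstep
            val newRank = (1 - beta) / n + beta * inbox.getOrElse(v, Nil).sum
            outEdges(v).foreach(u => outbox(u) = (newRank / outEdges(v).size) :: outbox.getOrElse(u, Nil))
            v -> newRank
          }
          inbox = outbox.toMap   // barrier: messages become visible only in the next superstep
        }
        rank.toSeq.sortBy(_._1).foreach { case (v, r) => println(f"vertex $v: rank $r%.4f") }
      }
    }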

Pregel (Google, Sigmod 2010) One master: partitions the graph among workers Workers keep graph “shard” in memory Messages to other partitions are buffered Communication across partitions is expensive, within partitions is cheap – quality of partition makes a difference! 14

simplest rule: stop when everyone votes to halt; everyone computes in parallel 15

Streaming PageRank: with some long rows Repeat until converged: – Let v^(t+1) = c·u + (1-c)·W·v^(t) Store A as a list of edges: each line is “i d(i) j” Store v’ and v in memory: v’ starts out as c·u For each line “i d j”: v’[j] += (1-c)·v[i]/d We need to get the degree of i and store it locally 16 (recap from 3/17 – note we need to scan through the graph each time)
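
That inner loop can be sketched directly (plain Scala; the in-memory edge list stands in for the edge file, u is taken to be the uniform vector, and c, the vertex count, the iteration count, and the sample edges are illustrative):

    object StreamingPageRank {
      // each entry mirrors one line "i d(i) j" of the edge file (illustrative data)
      val edgeLines = Seq((1, 2, 2), (1, 2, 3), (2, 1, 3), (3, 1, 1))
      val numVertices = 3
      val c = 0.15

      def main(args: Array[String]): Unit = {
        var v = Array.fill(numVertices + 1)(1.0 / numVertices)        // v, indexed from 1
        for (_ <- 1 to 10) {                                          // "repeat until converged"
          val vNext = Array.fill(numVertices + 1)(c / numVertices)    // v' starts out as c*u (u uniform)
          for ((i, d, j) <- edgeLines)                                // one scan over the edge list
            vNext(j) += (1 - c) * v(i) / d
          v = vNext
        }
        (1 to numVertices).foreach(i => println(f"v[$i] = ${v(i)}%.4f"))
      }
    }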

edge weight Another task: single source shortest path 18

a little bit of a cheat 19

Many Graph-Parallel Algorithms Collaborative Filtering – Alternating Least Squares – Stochastic Gradient Descent – Tensor Factorization Structured Prediction – Loopy Belief Propagation – Max-Product Linear Programs – Gibbs Sampling Semi-supervised ML – Graph SSL – CoEM Community Detection – Triangle-Counting – K-core Decomposition – K-Truss Graph Analytics – PageRank – Personalized PageRank – Shortest Path – Graph Coloring Classification – Neural Networks 20

Low-Rank Matrix Factorization: Recommending Products [figure: the sparse Netflix ratings matrix (Users × Movies, entries r_ij) is approximated as the product of User Factors (U) and Movie Factors (M); iterate to estimate the factor vectors f(i), f(j)] 21
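
A minimal sketch of the “iterate” step, using stochastic gradient descent on a handful of made-up ratings (the data, rank k, step size, and regularization are all illustrative; the ALS variant named on the slide would instead solve a small least-squares problem per user and per movie):

    import scala.util.Random

    object MatrixFactorizationSketch {
      // observed ratings (user, movie, rating) -- illustrative data
      val ratings = Seq((1, 1, 5.0), (1, 2, 3.0), (2, 1, 4.0), (2, 3, 1.0), (3, 2, 2.0), (3, 3, 5.0))
      val k = 2                // rank of the factorization
      val step = 0.05          // SGD step size
      val lambda = 0.01        // regularization

      def main(args: Array[String]): Unit = {
        val rnd = new Random(0)
        val userF  = scala.collection.mutable.Map[Int, Array[Double]]()   // user factors f(i)
        val movieF = scala.collection.mutable.Map[Int, Array[Double]]()   // movie factors f(j)
        def factor(tbl: scala.collection.mutable.Map[Int, Array[Double]], id: Int) =
          tbl.getOrElseUpdate(id, Array.fill(k)(rnd.nextDouble() * 0.1))

        for (_ <- 1 to 500; (i, j, r) <- ratings) {
          val (fi, fj) = (factor(userF, i), factor(movieF, j))
          val err = r - fi.zip(fj).map { case (a, b) => a * b }.sum       // r_ij - f(i) . f(j)
          for (d <- 0 until k) {
            val (ud, md) = (fi(d), fj(d))
            fi(d) += step * (err * md - lambda * ud)                      // update both factor vectors
            fj(d) += step * (err * ud - lambda * md)
          }
        }
        ratings.foreach { case (i, j, r) =>
          val pred = userF(i).zip(movieF(j)).map { case (a, b) => a * b }.sum
          println(f"user $i, movie $j: rating $r%.1f, predicted $pred%.2f")
        }
      }
    }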

Outline Motivation/where it fits Sample systems (c. 2010) – Pregel: and some sample programs Bulk synchronous processing – Signal/Collect and GraphLab Asynchronous processing GraphLab descendants – PowerGraph: partitioning – GraphChi: graphs w/o parallelism – GraphX: graphs over Spark 22

GRAPH ABSTRACTIONS: SIGNAL/COLLECT (SEMANTIC WEB CONFERENCE, 2010) Stutz, Strebel, Bernstein, Univ Zurich 23

Signal/collect model vs Pregel Integrated with RDF/SPARQL Vertices can be non-uniform types Vertex: – id, mutable state, outgoing edges, most recent received signals (map: neighbor id → signal), uncollected signals – user-defined collect function Edge: id, source, dest – user-defined signal function Allows asynchronous computations… via v.scoreSignal, v.scoreCollect For “data-flow” operations On multicore architecture: shared memory for workers 24

Signal/collect model: signals are made available in a list and a map; the next state for a vertex is the output of the collect() operation (we’ll relax “num_iterations” soon) 25

Signal/collect examples Single-source shortest path 26
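
The slide’s figure isn’t reproduced in this transcript, so here is a minimal sketch of single-source shortest path in the signal/collect style: each edge signals (source state + edge weight) to its target, and each vertex collects the minimum of its old state and the incoming signals. Plain Scala with a made-up weighted graph; the real framework schedules signal/collect per vertex via scoreSignal/scoreCollect rather than running a fixed number of synchronous rounds.

    object SignalCollectSSSP {
      val Inf = Double.PositiveInfinity
      // weighted directed edges (source, target, weight) -- illustrative data
      val edges = Seq((1, 2, 1.0), (2, 3, 2.0), (1, 3, 5.0), (3, 4, 1.0))
      val vertices = Seq(1, 2, 3, 4)

      // the edge's signal(): what it sends to its target, given the source's state
      def signal(sourceState: Double, weight: Double): Double = sourceState + weight
      // the vertex's collect(): fold the received signals into a new state
      def collect(oldState: Double, signals: Iterable[Double]): Double =
        (Iterable(oldState) ++ signals).min

      def main(args: Array[String]): Unit = {
        var state = vertices.map(v => v -> (if (v == 1) 0.0 else Inf)).toMap  // source = vertex 1
        for (_ <- 1 to vertices.size) {   // enough synchronous rounds to propagate everywhere
          val received = edges.groupBy(_._2).map { case (tgt, es) =>
            tgt -> es.map { case (src, _, w) => signal(state(src), w) }
          }
          state = state.map { case (v, s) => v -> collect(s, received.getOrElse(v, Nil)) }
        }
        state.toSeq.sortBy(_._1).foreach { case (v, d) => println(s"dist($v) = $d") }
      }
    }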

Signal/collect examples PageRank Life 27

PageRank + Preprocessing and Graph Building 28

Signal/collect examples Co-EM/wvRN/Harmonic fields 29

Signal/collect examples Matching path queries: dept(X) -[member]-> postdoc(Y) -[received]-> grant(Z) [figure: a small graph with nodes MLD, LTI, wcohen, partha, NSF378, InMind7] 30

Signal/collect examples: data flow Matching path queries: dept(X) -[member]-> postdoc(Y) -[received]-> grant(Z) [figure: the same graph; partially evaluated signals dept(X=MLD) -[member]-> postdoc(Y) -[received]-> grant(Z) and dept(X=LTI) -[member]-> postdoc(Y) -[received]-> grant(Z) flow along the member edges] note: can be multiple input signals 31

Signal/collect examples Matching path queries: dept(X) -[member]-> postdoc(Y) -[received]-> grant(Z) [figure: the same graph; the signal is now dept(X=MLD) -[member]-> postdoc(Y=partha) -[received]-> grant(Z)] 32

Signal/collect model vs Pregel Integrated with RDF/SPARQL Vertices can be non-uniform types Vertex: – id, mutable state, outgoing edges, most recent received signals (map: neighbor id → signal), uncollected signals – user-defined collect function Edge: id, source, dest – user-defined signal function Allows asynchronous computations… via v.scoreSignal, v.scoreCollect For “data-flow” operations 33

Asynchronous Parallel Computation Bulk-Synchronous: all vertices update in parallel – need to keep a copy of “old” and “new” vertex values Asynchronous: – Reason 1: if two vertices are not connected, we can update them in any order → more flexibility, less storage – Reason 2: not all updates are equally important: parts of the graph converge quickly, parts slowly 34

using: v.scoreSignal, v.scoreCollect 35

[figures: SSSP and PageRank in signal/collect] 37

Outline Motivation/where it fits Sample systems (c. 2010) – Pregel: and some sample programs Bulk synchronous processing – Signal/Collect and GraphLab Asynchronous processing GraphLab descendants – PowerGraph: partitioning – GraphChi: graphs w/o parallelism – GraphX: graphs over Spark 38

GRAPH ABSTRACTIONS: GRAPHLAB (UAI, 2010) Guestrin, Gonzalez, Bickson, etc. Many slides below pilfered from Carlos or Joey…. 39

GraphLab Data in graph, UDF vertex function Differences: – some control over scheduling: the vertex function can insert new tasks in a queue – messages must follow graph edges: can access adjacent vertices only – “shared data table” for global data – library algorithms for matrix factorization, coEM, SVM, Gibbs, … – GraphLab → now Dato 40
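
The scheduling idea – a vertex update can put neighboring vertices back on a work queue, so only stale parts of the graph get recomputed – can be sketched as follows. Plain Scala, not the GraphLab API; the graph, the PageRank-style update, and the convergence tolerance are illustrative.

    import scala.collection.mutable

    object ScheduledAsyncPageRank {
      val outEdges = Map(1 -> List(2), 2 -> List(3), 3 -> List(1, 2))   // illustrative graph
      val inEdges  = Map(1 -> List(3), 2 -> List(1, 3), 3 -> List(2))
      val n = outEdges.size
      val beta = 0.85
      val tol = 1e-8

      def main(args: Array[String]): Unit = {
        val rank   = mutable.Map(outEdges.keys.map(_ -> 1.0 / n).toSeq: _*)
        val queue  = mutable.Queue(outEdges.keys.toSeq: _*)   // initial tasks: every vertex
        val queued = mutable.Set(outEdges.keys.toSeq: _*)

        while (queue.nonEmpty) {
          val v = queue.dequeue(); queued -= v
          // asynchronous update: read neighbors' CURRENT values, write in place
          val newRank = (1 - beta) / n + beta * inEdges(v).map(u => rank(u) / outEdges(u).size).sum
          if (math.abs(newRank - rank(v)) > tol) {
            rank(v) = newRank
            // the vertex function schedules more work: out-neighbors are now stale
            for (u <- outEdges(v) if !queued(u)) { queue.enqueue(u); queued += u }
          }
        }
        rank.toSeq.sortBy(_._1).foreach { case (v, r) => println(f"vertex $v: rank $r%.4f") }
      }
    }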

Graphical Model Learning [plot: speedup vs. number of CPUs for the approx. priority schedule and the splash schedule, compared to optimal] 15.5x speedup on 16 cpus On multicore architecture: shared memory for workers 41

Gibbs Sampling Protein-protein interaction networks [Elidan et al. 2006] – Pair-wise MRF – 14K Vertices – 100K Edges 10x Speedup – scheduling reduces locking overhead [plot: colored schedule vs. round-robin schedule, compared to optimal] 42

CoEM (Rosie Jones, 2005) Named Entity Recognition Task: Is “Dog” an animal? Is “Catalina” a place? [figure: bipartite graph linking noun phrases (the dog, Australia, Catalina Island) to contexts (ran quickly, travelled to, is pleasant)] Graph sizes – Small: 0.2M vertices, 20M edges; Large: 2M vertices, 200M edges Hadoop: 95 cores, 7.5 hrs 43

CoEM (Rosie Jones, 2005) [plot: speedup on the Small and Large graphs, compared to optimal] Hadoop: 95 cores, 7.5 hrs GraphLab: 16 cores, 30 min – 15x faster with 6x fewer CPUs! 44

GRAPH ABSTRACTIONS: GRAPHLAB CONTINUED…. 45

Outline Motivation/where it fits Sample systems (c. 2010) – Pregel: and some sample programs Bulk synchronous processing – Signal/Collect and GraphLab Asynchronous processing GraphLab descendants – PowerGraph: partitioning – GraphChi: graphs w/o parallelism – GraphX: graphs over Spark 46

GraphLab’s descendants PowerGraph GraphChi GraphX On multicore architecture: shared memory for workers On cluster architecture (like Pregel): different memory spaces What are the challenges moving away from shared memory? 47

Natural Graphs → Power Law Top 1% of vertices is adjacent to 53% of the edges! Altavista Web Graph: 1.4B Vertices, 6.7B Edges [log-log degree distribution: “power law” with slope α ≈ 2] GraphLab group/Aapo 48

Problem: High Degree Vertices Limit Parallelism Touches a large fraction of the graph (GraphLab 1) Produces many messages (Pregel, Signal/Collect) Edge information too large for a single machine Asynchronous consistency requires heavy locking (GraphLab 1) Synchronous consistency is prone to stragglers (Pregel) GraphLab group/Aapo 49

PowerGraph Problem: GraphLab’s localities can be large – “all neighbors of a node” can be large for hubs, high-indegree nodes Approach: – new graph partitioning algorithm can replicate data – gather-apply-scatter API: finer-grained parallelism gather ~ combiner apply ~ vertex UDF (for all replicas) scatter ~ messages from vertex to edges 50

Factorized Vertex Updates Split the update into 3 phases: Gather – a parallel sum over the vertex’s scope accumulates Δ (data-parallel over edges); Apply – locally apply the accumulated Δ to the vertex; Scatter – update neighbors (data-parallel over edges) GraphLab group/Aapo 51

PageRank in PowerGraph
PageRankProgram(i)
  Gather(j → i): return w_ji * R[j]
  sum(a, b): return a + b
  Apply(i, Σ): R[i] = β + (1 – β) * Σ
  Scatter(i → j): if (R[i] changes) then activate(j)
52 GraphLab group/Aapo
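
A plain-Scala rendition of that pseudo-code, to show the gather/sum/apply/scatter phases end to end. The toy weighted graph, the β value, and the single-machine driver loop around the phases are illustrative; in PowerGraph the gathers would run in parallel on the machines holding a vertex’s replicas.

    object GasPageRank {
      // weighted in-edges: i -> list of (j, w_ji); w_ji = 1/outdegree(j)  (illustrative graph)
      val inEdges = Map(1 -> List((3, 0.5)), 2 -> List((1, 0.5), (3, 0.5)), 3 -> List((1, 0.5), (2, 1.0)))
      val outNbrs = Map(1 -> List(2, 3), 2 -> List(3), 3 -> List(1, 2))
      val beta = 0.15

      def gather(rankJ: Double, w: Double): Double = w * rankJ          // Gather(j -> i)
      def sum(a: Double, b: Double): Double = a + b                     // combine partial gathers
      def applyPhase(sigma: Double): Double = beta + (1 - beta) * sigma // Apply(i, Σ)

      def main(args: Array[String]): Unit = {
        var rank = Map(1 -> 1.0, 2 -> 1.0, 3 -> 1.0)
        var active: Set[Int] = rank.keySet
        while (active.nonEmpty) {
          val nextActive = scala.collection.mutable.Set[Int]()
          val updated = for (i <- active) yield {
            val sigma = inEdges(i).map { case (j, w) => gather(rank(j), w) }.foldLeft(0.0)(sum)
            val newR  = applyPhase(sigma)
            if (math.abs(newR - rank(i)) > 1e-8) nextActive ++= outNbrs(i)  // Scatter: activate(j)
            i -> newR
          }
          rank = rank ++ updated
          active = nextActive.toSet
        }
        println(rank.toSeq.sortBy(_._1).map { case (i, r) => f"R[$i] = $r%.4f" }.mkString(", "))
      }
    }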

Distributed Execution of a PowerGraph Vertex-Program [figure: a vertex Y replicated across Machines 1–4; partial sums Σ1…Σ4 are gathered to one replica, Apply produces Y’, and Scatter pushes the updated value back out] 53 GraphLab group/Aapo

Minimizing Communication in PowerGraph A vertex-cut minimizes the number of machines each vertex spans Percolation theory suggests that power-law graphs have good vertex cuts [Albert et al. 2000] Communication is linear in the number of machines each vertex spans 54 GraphLab group/Aapo
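
Since communication grows with the number of machines each vertex spans, the quality of an edge placement can be measured directly as a replication factor. A small sketch with made-up edges and a naive hash placement (PowerGraph’s oblivious and coordinated strategies place edges more carefully):

    object VertexCutSketch {
      val edges = Seq((1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 1))   // illustrative graph
      val numMachines = 2

      def main(args: Array[String]): Unit = {
        // vertex-cut: every EDGE is assigned to exactly one machine (here, by hashing)
        val placement = edges.map(e => e -> math.abs(e.hashCode) % numMachines).toMap
        // a vertex "spans" every machine that holds at least one of its edges
        val spans = edges
          .flatMap { case e @ (u, v) => Seq(u -> placement(e), v -> placement(e)) }
          .groupBy(_._1)
          .map { case (vtx, ms) => vtx -> ms.map(_._2).toSet.size }
        val replication = spans.values.sum.toDouble / spans.size
        spans.toSeq.sortBy(_._1).foreach { case (v, s) => println(s"vertex $v spans $s machine(s)") }
        println(f"average replication factor = $replication%.2f (communication grows with this)")
      }
    }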

Partitioning Performance Twitter Graph: 41M vertices, 1.4B edges [plots: replication cost and construction time for Random, Oblivious, and Coordinated edge placement] Oblivious balances partition quality and partitioning time. 55 GraphLab group/Aapo

Partitioning matters… GraphLab group/Aapo 56

Outline Motivation/where it fits Sample systems (c. 2010) – Pregel: and some sample programs Bulk synchronous processing – Signal/Collect and GraphLab Asynchronous processing GraphLab descendants – PowerGraph: partitioning – GraphChi: graphs w/o parallelism – GraphX: graphs over Spark 57

GraphLab’s descendants PowerGraph GraphChi GraphX 58

GraphLab cont’d PowerGraph GraphChi – Goal: use the graph abstraction on-disk, not in-memory, on a conventional workstation [diagram: the GraphLab stack – a general-purpose API over MPI/TCP-IP, PThreads, Hadoop/HDFS, and Linux cluster services (Amazon AWS), supporting graph analytics, graphical models, computer vision, clustering, topic modeling, and collaborative filtering] 59

GraphLab cont’d GraphChi – Key insight: some algorithms on graphs are streamable (e.g., PageRank-Nibble) in general we can’t easily stream the graph because neighbors will be scattered but maybe we can limit the degree to which they’re scattered … enough to make streaming possible? – “almost-streaming”: keep P cursors in a file instead of one 60

PSW: Shards and Intervals Vertices are numbered from 1 to n – P intervals, each associated with a shard on disk – sub-graph = interval of vertices [figure: the vertex range 1…n split into interval(1)…interval(P), with shard(1)…shard(P)] 61 1. Load 2. Compute 3. Write

PSW: Layout Shard: the in-edges for an interval of vertices, sorted by source-id Shards small enough to fit in memory; balance size of shards [figure: Shards 1–4, each holding its interval’s in-edges sorted by source_id] 1. Load 2. Compute 3. Write 62

PSW: Loading Sub-graph Load the subgraph for an interval of vertices: load all of its in-edges (one shard) into memory What about out-edges? Arranged in sequence in the other shards, sorted by source_id [figure: one shard loaded fully, with blocks to read from the other shards] 1. Load 2. Compute 3. Write 63

PSW: Loading Sub-graph (cont’d) Load all in-edges in memory; the interval’s out-edge blocks from the other shards are also loaded [figure: one full shard plus one out-edge block per other shard in memory] 1. Load 2. Compute 3. Write 64

PSW Load-Phase Only P large reads for each interval; P² reads on one full pass 1. Load 2. Compute 3. Write 65

PSW: Execute Updates The update-function is executed on the interval’s vertices Edges have pointers to the loaded data blocks – Changes take effect immediately → asynchronous [figure: edge data pointers into Block X and Block Y] 1. Load 2. Compute 3. Write 66

PSW: Commit to Disk In the write phase, the blocks are written back to disk – the next load-phase sees the preceding writes → asynchronous 1. Load 2. Compute 3. Write In total: P² reads and writes per full pass on the graph → performs well on both SSD and hard drive. To make this work: the size of a vertex state can’t change when it’s updated (at least, as stored on disk). 67
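
A sketch of the parallel-sliding-windows layout: vertices are split into P intervals, shard k stores the in-edges of interval k sorted by source id, so the out-edges of any interval sit in one contiguous block of every other shard. Plain Scala with an invented edge list and P = 2; real GraphChi stores edge values in the shards and memory-maps disk blocks rather than filtering lists in memory.

    object PswShardSketch {
      // directed edges (src, dst) of a toy graph -- illustrative data
      val edges = Seq((1, 3), (2, 1), (2, 4), (3, 2), (4, 1), (4, 3))
      // P = 2 intervals of vertex ids; shard(k) holds the in-edges of interval(k)
      val intervals = Seq(1 to 2, 3 to 4)

      // each shard stores its interval's in-edges SORTED BY SOURCE id
      val shards: Seq[Seq[(Int, Int)]] =
        intervals.map(iv => edges.filter { case (_, dst) => iv.contains(dst) }.sortBy(_._1))

      def main(args: Array[String]): Unit = {
        for ((iv, k) <- intervals.zipWithIndex) {
          // load phase for interval k:
          val inEdges = shards(k)                                 // one full shard read, plus...
          val outEdges = shards.zipWithIndex.collect {
            case (shard, j) if j != k =>
              // ...one contiguous block per other shard: because shards are sorted by
              // source id, this interval's out-edges sit next to each other
              shard.filter { case (src, _) => iv.contains(src) }
          }.flatten
          println(s"interval ${iv.head}..${iv.last}: in-edges ${inEdges.mkString(" ")} | out-edge blocks ${outEdges.mkString(" ")}")
        }
      }
    }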

Experiment Setting Mac Mini (Apple Inc.) – 8 GB RAM – 256 GB SSD, 1 TB hard drive – Intel Core i5, 2.5 GHz Experiment graphs (vertices / edges / P shards / preprocessing time): live-journal 4.8M / 69M / 3 / 0.5 min; netflix 0.5M / 99M / 20 / 1 min; twitter …M / 1.5B / 20 / 2 min; uk …M / 3.7B / 40 / 31 min; uk-union 133M / 5.4B / 50 / 33 min; yahoo-web 1.4B / 6.6B / 50 / 37 min 68

Comparison to Existing Systems [plots: PageRank, WebGraph Belief Propagation (U Kang et al.), Matrix Factorization (Alt. Least Sqr.), Triangle Counting] On a Mac Mini, GraphChi can solve problems as big as existing large-scale systems, with comparable performance. Notes: comparison results do not include time to transfer the data to the cluster, preprocessing, or the time to load the graph from disk; GraphChi computes asynchronously, while all but GraphLab compute synchronously. See the paper for more comparisons. 69

Outline Motivation/where it fits Sample systems (c. 2010) – Pregel: and some sample programs Bulk synchronous processing – Signal/Collect and GraphLab Asynchronous processing GraphLab “descendants” – PowerGraph: partitioning – GraphChi: graphs w/o parallelism – GraphX: graphs over Spark (Gonzalez) 70

GraphLab’s descendants PowerGraph GraphChi GraphX – implementation of the GraphLab API on top of Spark – Motivations: avoid transfers between subsystems leverage larger community for common infrastructure – What’s different: Graphs are now immutable and operations transform one graph into another (RDD → RDG, resilient distributed graph) 71

Idea 1: Graph as Tables [figure: a property graph and its two tables – a Vertex Property Table with ids rxin, jegonzal, franklin, istoica and properties (Stu., Berk.), (PstDoc, Berk.), (Prof., Berk.), and an Edge Property Table with edges rxin→jegonzal (Friend), franklin→rxin (Advisor), istoica→franklin (Coworker), and a PI edge involving jegonzal] Under the hood things can be split even more finely: e.g. a vertex map table + vertex data table. Operators maximize structure sharing and minimize communication. 72

Operators Table (RDD) operators are inherited from Spark: map filter groupBy sort union join leftOuterJoin rightOuterJoin reduce count fold reduceByKey groupByKey cogroup cross zip sample take first partitionBy mapWith pipe save... 73

Graph Operators

    class Graph[V, E] {
      def Graph(vertices: Table[(Id, V)], edges: Table[(Id, Id, E)])
      // Table Views
      def vertices: Table[(Id, V)]
      def edges: Table[(Id, Id, E)]
      def triplets: Table[((Id, V), (Id, V), E)]
      // Transformations
      def reverse: Graph[V, E]
      def subgraph(pV: (Id, V) => Boolean, pE: Edge[V, E] => Boolean): Graph[V, E]
      def mapV(m: (Id, V) => T): Graph[T, E]
      def mapE(m: Edge[V, E] => T): Graph[V, T]
      // Joins
      def joinV(tbl: Table[(Id, T)]): Graph[(V, T), E]
      def joinE(tbl: Table[(Id, Id, T)]): Graph[V, (E, T)]
      // Computation
      def mrTriplets(mapF: (Edge[V, E]) => List[(Id, T)], reduceF: (T, T) => T): Graph[T, E]
    }

Idea 2: mrTriplets: a low-level routine similar to scatter-gather-apply. Evolved into aggregateNeighbors, aggregateMessages. 74
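
For comparison with the API listing above, a small usage sketch against the published Spark GraphX API. It assumes a SparkContext named sc (as provided by spark-shell), and the toy users and edges are invented; aggregateMessages is the descendant of mrTriplets.

    import org.apache.spark.graphx.{Edge, Graph, VertexRDD}
    import org.apache.spark.rdd.RDD

    // assumes a SparkContext named sc (e.g. the one provided by spark-shell)
    val users: RDD[(Long, String)] =
      sc.parallelize(Seq((1L, "rxin"), (2L, "jegonzal"), (3L, "franklin"), (4L, "istoica")))
    val relations: RDD[Edge[String]] =
      sc.parallelize(Seq(Edge(1L, 2L, "friend"), Edge(3L, 1L, "advisor"),
                         Edge(4L, 3L, "coworker"), Edge(3L, 2L, "pi")))
    val graph: Graph[String, String] = Graph(users, relations)

    // aggregateMessages: compute each vertex's in-degree
    val inDeg: VertexRDD[Int] = graph.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _)

    // a library algorithm built from the same operators
    val ranks = graph.pageRank(0.0001).vertices
    ranks.collect().sortBy(_._1).foreach { case (id, r) => println(s"$id -> $r") }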

The GraphX Stack (Lines of Code) GraphX (3575) on Spark Pregel (28) + GraphLab (50) PageRank (5) Connected Comp. (10) Shortest Path (10) ALS (40) LDA (120) K-core (51) Triangle Count (45) SVD (40) 75

Performance Comparisons GraphX is roughly 3x slower than GraphLab Live-Journal: 69 Million Edges 76

Summary Large immutable data structures on (distributed) disk, processing by sweeping through them and creating new data structures: – stream-and-sort, Hadoop, PIG, Hive, … Large immutable data structures in distributed memory: – Spark – distributed tables Large mutable data structures in distributed memory: – parameter server: structure is a hashtable – Pregel, GraphLab, GraphChi, GraphX: structure is a graph 77

Summary APIs for the various systems vary in detail but have a similar flavor – Typical algorithms iteratively update vertex state – Changes in state are communicated with messages which need to be aggregated from neighbors Biggest wins are – on problems where graph is fixed in each iteration, but vertex data changes – on graphs small enough to fit in (distributed) memory 78

Some things to take away Platforms for iterative operations on graphs – GraphX: if you want to integrate with Spark – GraphChi: if you don’t have a cluster – GraphLab/Dato: if you don’t need free software and performance is crucial – Pregel: if you work at Google – Giraph, Signal/collect, … Important differences – Intended architecture: shared-memory and threads, distributed cluster memory, graph on disk – How graphs are partitioned for clusters – If processing is synchronous or asynchronous 79