Pregel: A System for Large-Scale Graph Processing


Pregel: A System for Large-Scale Graph Processing
Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski (Google, Inc.), SIGMOD ’10
Presented by Taewhi Lee, 7 July 2010

Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work

Introduction (1/2)
[figure omitted; source: SIGMETRICS ’09 Tutorial – MapReduce: The Programming Model and Practice, by Jerry Zhao]

Introduction (2/2)
Many practical computing problems concern large graphs, but MapReduce is ill-suited for graph processing:
- Parallel graph algorithms need many iterations
- Materializing intermediate results at every MapReduce iteration harms performance

Large graph data: web graphs, transportation routes, citation relationships, social networks
Graph algorithms: PageRank, shortest paths, connected components, clustering techniques

Single Source Shortest Path (SSSP)
Problem: find the shortest paths from a source node to all target nodes.
Solution on a single-processor machine: Dijkstra’s algorithm.

Example: SSSP – Dijkstra’s Algorithm
[diagram sequence: Dijkstra’s algorithm run step by step on the example graph, settling one node per step until the distances from the source are final: A=0, B=8, C=9, D=5, E=7]

Single Source Shortest Path (SSSP)
Problem: find the shortest paths from a source node to all target nodes.
Solutions:
- Single-processor machine: Dijkstra’s algorithm (see the sketch below)
- MapReduce/Pregel: parallel breadth-first search (BFS)
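For comparison with the parallel versions that follow, here is a minimal, self-contained sketch of the single-machine baseline: Dijkstra's algorithm on the example graph from these slides, with nodes A–E encoded as 0–4. Illustrative code, not from the paper.

```cpp
#include <functional>
#include <iostream>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

int main() {
  const int INF = std::numeric_limits<int>::max();
  // Out-edges (target, weight) matching the slides' example graph.
  std::vector<std::vector<std::pair<int, int>>> adj = {
      {{1, 10}, {3, 5}},         // A -> B(10), D(5)
      {{2, 1}, {3, 2}},          // B -> C(1),  D(2)
      {{4, 4}},                  // C -> E(4)
      {{1, 3}, {2, 9}, {4, 2}},  // D -> B(3),  C(9), E(2)
      {{0, 7}, {2, 6}},          // E -> A(7),  C(6)
  };
  std::vector<int> dist(adj.size(), INF);
  using Item = std::pair<int, int>;  // (tentative distance, node)
  std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
  dist[0] = 0;  // A is the source
  pq.push({0, 0});
  while (!pq.empty()) {
    auto [d, u] = pq.top();
    pq.pop();
    if (d > dist[u]) continue;  // skip stale queue entries
    for (auto [v, w] : adj[u])
      if (dist[u] + w < dist[v]) {
        dist[v] = dist[u] + w;
        pq.push({dist[v], v});
      }
  }
  for (int i = 0; i < (int)adj.size(); ++i)  // prints A:0 B:8 C:9 D:5 E:7
    std::cout << char('A' + i) << ": " << dist[i] << "\n";
}
```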

MapReduce Execution Overview
[figure: MapReduce execution diagram]

Example: SSSP – Parallel BFS in MapReduce
Adjacency list:
A: (B, 10), (D, 5)
B: (C, 1), (D, 2)
C: (E, 4)
D: (B, 3), (C, 9), (E, 2)
E: (A, 7), (C, 6)
[diagram: the example graph and its adjacency matrix]

Example: SSSP – Parallel BFS in MapReduce
Map input: <node ID, <dist, adj list>>
<A, <0, <(B, 10), (D, 5)>>>
<B, <inf, <(C, 1), (D, 2)>>>
<C, <inf, <(E, 4)>>>
<D, <inf, <(B, 3), (C, 9), (E, 2)>>>
<E, <inf, <(A, 7), (C, 6)>>>
Map output: <dest node ID, dist>
<B, 10> <D, 5> <C, inf> <D, inf> <E, inf> <B, inf> <C, inf> <E, inf> <A, inf> <C, inf>
plus the node records themselves, passed through so the reducer can rebuild the adjacency lists — all flushed to local disk!

Example: SSSP – Parallel BFS in MapReduce
Reduce input: <node ID, dist>
<A, <0, <(B, 10), (D, 5)>>>, <A, inf>
<B, <inf, <(C, 1), (D, 2)>>>, <B, 10>, <B, inf>
<C, <inf, <(E, 4)>>>, <C, inf>, <C, inf>, <C, inf>
<D, <inf, <(B, 3), (C, 9), (E, 2)>>>, <D, 5>, <D, inf>
<E, <inf, <(A, 7), (C, 6)>>>, <E, inf>, <E, inf>
(the reducer keeps the minimum distance for each node)

Example: SSSP – Parallel BFS in MapReduce
Reduce output: <node ID, <dist, adj list>> = map input for the next iteration
<A, <0, <(B, 10), (D, 5)>>>
<B, <10, <(C, 1), (D, 2)>>>
<C, <inf, <(E, 4)>>>
<D, <5, <(B, 3), (C, 9), (E, 2)>>>
<E, <inf, <(A, 7), (C, 6)>>>
— flushed to DFS!
Next map output: <dest node ID, dist>
<B, 10> <D, 5> <C, 11> <D, 12> <E, inf> <B, 8> <C, 14> <E, 7> <A, inf> <C, inf>
plus the pass-through node records — flushed to local disk!

Example: SSSP – Parallel BFS in MapReduce
Reduce input: <node ID, dist>
<A, <0, <(B, 10), (D, 5)>>>, <A, inf>
<B, <10, <(C, 1), (D, 2)>>>, <B, 10>, <B, 8>
<C, <inf, <(E, 4)>>>, <C, 11>, <C, 14>, <C, inf>
<D, <5, <(B, 3), (C, 9), (E, 2)>>>, <D, 5>, <D, 12>
<E, <inf, <(A, 7), (C, 6)>>>, <E, inf>, <E, 7>
(again, the minimum per node wins)

Example: SSSP – Parallel BFS in MapReduce
Reduce output: <node ID, <dist, adj list>> = map input for the next iteration
<A, <0, <(B, 10), (D, 5)>>>
<B, <8, <(C, 1), (D, 2)>>>
<C, <11, <(E, 4)>>>
<D, <5, <(B, 3), (C, 9), (E, 2)>>>
<E, <7, <(A, 7), (C, 6)>>>
… the rest omitted … — flushed to DFS!
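To make one of these rounds concrete, here is a self-contained sketch that simulates a single map/shuffle/reduce relaxation round of the parallel BFS above in memory. In a real MapReduce job the map output is flushed to local disk and the reduce output to the DFS between rounds — exactly the materialization overhead Pregel avoids. The structure and names are illustrative; this is not a MapReduce program.

```cpp
#include <algorithm>
#include <iostream>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

const int INF = std::numeric_limits<int>::max();
struct Node { int dist; std::vector<std::pair<char, int>> adj; };

int main() {
  std::map<char, Node> g = {  // map input: <node ID, <dist, adj list>>
      {'A', {0,   {{'B', 10}, {'D', 5}}}},
      {'B', {INF, {{'C', 1}, {'D', 2}}}},
      {'C', {INF, {{'E', 4}}}},
      {'D', {INF, {{'B', 3}, {'C', 9}, {'E', 2}}}},
      {'E', {INF, {{'A', 7}, {'C', 6}}}}};

  // "Map" + shuffle: every node emits <dest node ID, candidate dist> for
  // each of its out-edges; the node records pass through unchanged.
  std::multimap<char, int> shuffled;
  for (const auto& [id, n] : g)
    for (const auto& [dst, w] : n.adj)
      shuffled.insert({dst, n.dist == INF ? INF : n.dist + w});

  // "Reduce": each node keeps the minimum of its old and candidate dists.
  for (auto& [id, n] : g) {
    auto [lo, hi] = shuffled.equal_range(id);
    for (auto it = lo; it != hi; ++it) n.dist = std::min(n.dist, it->second);
  }

  // Reduce output = map input of the next round.
  // After round 1 this prints: A:0 B:10 C:inf D:5 E:inf
  for (const auto& [id, n] : g)
    std::cout << id << ": "
              << (n.dist == INF ? std::string("inf") : std::to_string(n.dist))
              << "\n";
}
```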

Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work

Computation Model (1/3)
Input → Supersteps (a sequence of iterations) → Output

Computation Model (2/3) “Think like a vertex” Inspired by Valiant’s Bulk Synchronous Parallel model (1990) Source: http://en.wikipedia.org/wiki/Bulk_synchronous_parallel

Computation Model (3/3)
Superstep: the vertices compute in parallel. Each vertex:
- Receives messages sent in the previous superstep
- Executes the same user-defined function
- Modifies its value or that of its outgoing edges
- Sends messages to other vertices (to be received in the next superstep)
- Mutates the topology of the graph
- Votes to halt if it has no further work to do
Termination condition:
- All vertices are simultaneously inactive
- There are no messages in transit
(see the toy simulation below)
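To make these rules concrete — delivery of messages at superstep boundaries, reactivation of a halted vertex by an incoming message, and the two-part termination condition — here is a toy, self-contained simulation written against the model rather than the Pregel API. It runs a maximum-value propagation in the spirit of the paper's own superstep illustration; all names here are illustrative.

```cpp
#include <algorithm>
#include <iostream>
#include <map>
#include <vector>

struct Vertex {
  int value;
  std::vector<int> out;  // targets of outgoing edges
  bool active = true;
};

int main() {
  // A small chain of vertices with values 3, 6, 2, 1; all start active.
  std::map<int, Vertex> g = {
      {0, {3, {1}}}, {1, {6, {0, 2}}}, {2, {2, {1, 3}}}, {3, {1, {2}}}};
  std::map<int, std::vector<int>> inbox;  // messages visible this superstep

  bool first = true;
  while (first || !inbox.empty()) {  // halt: all inactive, no messages in transit
    first = false;
    std::map<int, std::vector<int>> outbox;
    for (auto& [id, v] : g) {
      auto it = inbox.find(id);
      if (it != inbox.end()) v.active = true;  // a message reactivates a vertex
      if (!v.active) continue;
      int old = v.value;
      if (it != inbox.end())
        for (int m : it->second) v.value = std::max(v.value, m);
      if (v.value != old || it == inbox.end())  // improved, or first superstep
        for (int dst : v.out) outbox[dst].push_back(v.value);
      v.active = false;  // vote to halt; only a new message wakes it again
    }
    inbox.swap(outbox);  // superstep barrier: deliver messages for the next round
  }
  for (auto& [id, v] : g)  // every vertex ends with the maximum, 6
    std::cout << id << ": " << v.value << "\n";
}
```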

Example: SSSP – Parallel BFS in Pregel
[diagram sequence: the example graph processed superstep by superstep; in each superstep, every vertex whose distance improved sends (dist + edge weight) along its out-edges, receiving vertices keep the minimum of their current value and the incoming candidates, and all vertices vote to halt — after a few supersteps no messages remain and the distances settle at A=0, B=8, C=9, D=5, E=7]

Differences from MapReduce
Graph algorithms can be written as a series of chained MapReduce invocations, but:
Pregel
- Keeps vertices and edges on the machine that performs the computation
- Uses network transfers only for messages
MapReduce
- Passes the entire state of the graph from one stage to the next
- Needs to coordinate the steps of a chained MapReduce

Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work

C++ API
Writing a Pregel program means subclassing the predefined Vertex class and overriding its Compute() method, which consumes the incoming messages and can send outgoing ones.
[figure: the Vertex class declaration, annotated “Override this!” at Compute(), with the incoming- and outgoing-message types highlighted]
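The class declaration the figure shows is given in the paper; paraphrased, it looks like this (the three template parameters are the vertex value, edge value, and message value types):

```cpp
template <typename VertexValue, typename EdgeValue, typename MessageValue>
class Vertex {
 public:
  virtual void Compute(MessageIterator* msgs) = 0;  // "Override this!"

  const string& vertex_id() const;
  int64 superstep() const;

  const VertexValue& GetValue();
  VertexValue* MutableValue();
  OutEdgeIterator GetOutEdgeIterator();

  void SendMessageTo(const string& dest_vertex, const MessageValue& message);
  void VoteToHalt();
};
```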

Example: Vertex Class for SSSP
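The code figure for this slide did not survive the transcript; the SSSP vertex below is reproduced from the Pregel paper (IsSource() is a user-supplied predicate, and INF a sentinel distance larger than any real path):

```cpp
class ShortestPathVertex : public Vertex<int, int, int> {
  void Compute(MessageIterator* msgs) {
    int mindist = IsSource(vertex_id()) ? 0 : INF;
    for (; !msgs->Done(); msgs->Next())
      mindist = min(mindist, msgs->Value());
    if (mindist < GetValue()) {
      *MutableValue() = mindist;
      OutEdgeIterator iter = GetOutEdgeIterator();
      for (; !iter.Done(); iter.Next())
        SendMessageTo(iter.Target(), mindist + iter.GetValue());
    }
    VoteToHalt();  // reactivated only if a shorter distance arrives later
  }
};
```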

Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work

System Architecture
The Pregel system also uses the master/worker model.
Master:
- Maintains the workers
- Recovers from worker faults
- Provides a Web-UI monitoring tool for job progress
Worker:
- Processes its task
- Communicates with the other workers
Persistent data is stored as files on a distributed storage system (such as GFS or Bigtable); temporary data is stored on local disk.

Execution of a Pregel Program
1. Many copies of the program begin executing on a cluster of machines.
2. The master assigns a partition of the input to each worker; each worker loads its vertices and marks them as active.
3. The master instructs each worker to perform a superstep: each worker loops through its active vertices and calls Compute() for each one. Messages are sent asynchronously but are delivered before the end of the superstep. This step repeats as long as any vertices are active or any messages are in transit.
4. After the computation halts, the master may instruct each worker to save its portion of the graph.

Fault Tolerance
Checkpointing: the master periodically instructs the workers to save the state of their partitions (e.g., vertex values, edge values, incoming messages) to persistent storage.
Failure detection: via regular “ping” messages.
Recovery: the master reassigns graph partitions to the currently available workers; they all reload their partition state from the most recent available checkpoint.

Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work

Experiments
Environment:
- H/W: a cluster of 300 multicore commodity PCs
- Data: binary trees and log-normal random graphs (general graphs)
Naïve SSSP implementation:
- All edge weights set to 1
- No checkpointing

Experiments SSSP – 1 billion vertex binary tree: varying # of worker tasks

Experiments SSSP – binary trees: varying graph sizes on 800 worker tasks

Experiments SSSP – Random graphs: varying graph sizes on 800 worker tasks

Outline Introduction Computation Model Writing a Pregel Program System Implementation Experiments Conclusion & Future Work

Conclusion & Future Work
Pregel is a scalable and fault-tolerant platform with an API that is sufficiently flexible to express arbitrary graph algorithms.
Future work:
- Relaxing the synchronicity of the model, so as not to wait for slower workers at inter-superstep barriers
- Assigning vertices to machines to minimize inter-machine communication
- Handling dense graphs in which most vertices send messages to most other vertices

Thank You! Any questions or comments?