Fast, Exact Graph Diameter Computation with Vertex Programming Corey Pennycuff and Tim Weninger SIGKDD Workshop on High Performance Graph Mining August.

Presentation transcript:

Fast, Exact Graph Diameter Computation with Vertex Programming. Corey Pennycuff and Tim Weninger. SIGKDD Workshop on High Performance Graph Mining, August 10, 2015. Vertex-Centric Computing for Large Scale Graph Analytics.

Dijkstra’s Single Source Shortest Path. [Figure: example graph with vertices A to G and a distance table with columns A to G for source A]

Medium Graphs: 4 million nodes, 200 million edges.

Bigger Graphs. Solution: Hadoop. [Figure: data, mappers, shuffle and sort, reducers, result; stages labeled DISK]

Graph Diameter: HADI, Reverse Cuthill-McKee, Random BFS.

Bulk Synchronous Parallel (BSP): created in 1990 by Les Valiant and Bill McColl at Oxford. Data kept in memory. [Figure: data, Superstep 0, barrier, Superstep 1, Superstep 2, Superstep 3, result; DISK]

Graph Analytics with BSP: requires the programmer to “think like a vertex”. [Figure: example graph with vertices A to F …]

The Vertex. Each vertex can: receive messages from the previous superstep; modify its value/datum; send messages.

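BSP Single Source Shortest Path
    compute(MessageIterator* msgs) {
        bool changed = false;
        // Keep the smallest distance received in this superstep.
        foreach (msg : msgs) {
            if (msg < datum) {
                datum = msg;
                changed = true;
            }
        }
        if (changed) {
            // Propagate the improved distance along every out-edge.
            foreach (edge : GetOutEdgeIterator()) {
                sendMessageTo(edge.dest, datum + edge.weight);
            }
        } else {
            voteToHalt();
        }
    }
[Figure: example graph with vertices A to G]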
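The vertex program on this slide depends on a framework that delivers messages between supersteps. As a rough illustration, the sketch below simulates that loop in a single process. The names `Edge`, `Graph`, and `bspShortestPaths` are hypothetical and do not come from the talk or from any real BSP framework, and the sketch assumes non-negative edge weights so that the loop terminates.

```cpp
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical types for this sketch; not part of any real framework.
struct Edge {
    std::size_t dest;
    int weight;
};

using Graph = std::vector<std::vector<Edge>>;  // adjacency list, one entry per vertex

constexpr int kInfinity = std::numeric_limits<int>::max();

// Runs single-source shortest path as a sequence of supersteps and
// returns the final distance (datum) of every vertex.
// Assumes non-negative edge weights.
std::vector<int> bspShortestPaths(const Graph& graph, std::size_t source) {
    std::vector<int> datum(graph.size(), kInfinity);
    // inbox[v] holds the messages delivered to v for the current superstep.
    std::vector<std::vector<int>> inbox(graph.size());
    inbox[source].push_back(0);  // seed message: the source is at distance 0

    bool anyMessages = true;
    while (anyMessages) {
        std::vector<std::vector<int>> outbox(graph.size());
        anyMessages = false;
        for (std::size_t v = 0; v < graph.size(); ++v) {
            // Same logic as the compute() function on the slide.
            bool changed = false;
            for (int msg : inbox[v]) {
                if (msg < datum[v]) {
                    datum[v] = msg;
                    changed = true;
                }
            }
            if (changed) {
                for (const Edge& edge : graph[v]) {
                    outbox[edge.dest].push_back(datum[v] + edge.weight);
                    anyMessages = true;
                }
            }
            // A vertex without an improvement sends nothing (it votes to halt).
        }
        // Barrier: messages sent in this superstep are read in the next one.
        inbox = std::move(outbox);
    }
    return datum;
}
```

Swapping `outbox` into `inbox` at the end of each pass plays the role of the barrier: no vertex sees a message in the same superstep in which it was sent. A real BSP system would distribute the inner loop across workers, which this sketch does not attempt.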

Dijkstra’s Single Source Shortest Path: Superstep 0. [Figure: master and example graph with vertices A to G, source A labeled 0; distance table (header ABCDEFG, row A0)]

Dijkstra’s Single Source Shortest Path: Superstep 1. [Figure: example graph with vertices A to G, source A labeled 0; distance table (header ABCDEFG, row A0112)]

Dijkstra’s Single Source Shortest Path: Superstep 2. [Figure: example graph with vertices A to G, source A labeled 0; distance table (header ABCDEFG, row A0112)]

Supersteps - 1 = Node Eccentricity. [Figure: example graph with vertices A to G, source A labeled 0; distance table (header ABCDEFG, row A0112)]
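The relation on this slide can be checked with a small single-process sketch. It assumes an unweighted graph and counts only the supersteps in which at least one vertex first learns its distance from the source (superstep 0 included); how the talk counts supersteps is not stated, so that counting rule is an assumption. `Graph` and `eccentricityBySupersteps` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<std::size_t>>;  // unweighted adjacency list

// Counts the supersteps in which at least one vertex first learns its
// distance from `source` (superstep 0 included) and returns that count
// minus one. Unreachable vertices are ignored.
std::size_t eccentricityBySupersteps(const Graph& graph, std::size_t source) {
    std::vector<bool> reached(graph.size(), false);
    std::vector<std::size_t> frontier{source};  // vertices messaged this superstep
    std::size_t supersteps = 0;

    while (!frontier.empty()) {
        std::vector<std::size_t> next;
        bool anyChanged = false;
        for (std::size_t v : frontier) {
            if (reached[v]) continue;  // a duplicate message cannot improve v
            reached[v] = true;         // the first message fixes v's distance
            anyChanged = true;
            for (std::size_t neighbor : graph[v]) {
                if (!reached[neighbor]) next.push_back(neighbor);
            }
        }
        if (anyChanged) ++supersteps;
        frontier = std::move(next);    // barrier between supersteps
    }
    return supersteps - 1;
}
```

On an undirected path with four vertices, starting from an end vertex, values change in supersteps 0 through 3, so the function reports an eccentricity of 3.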

Diameter Measurement. [Figure: seven copies of the example graph with vertices A to G]
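The diameter is the largest eccentricity over all vertices. One way to obtain it in a vertex-centric style is to let every vertex act as a source in the same run. The sketch below is an assumption about how this could be done, not a transcription of the authors' implementation: each vertex remembers which source IDs it has seen and forwards only new ones. `Graph` and `diameterBySupersteps` are hypothetical names, and the graph is assumed to be unweighted.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<std::size_t>>;  // unweighted adjacency list

// Every vertex is a source at once. In each superstep a vertex records the
// source IDs it has not seen before and forwards only those to its
// neighbors. The last superstep in which any vertex learns a new source ID
// equals the largest eccentricity, i.e. the diameter over reachable pairs.
std::size_t diameterBySupersteps(const Graph& graph) {
    const std::size_t n = graph.size();
    if (n == 0) return 0;
    // seen[v][s] is true once vertex v has heard from source s.
    std::vector<std::vector<bool>> seen(n, std::vector<bool>(n, false));
    // inbox[v] lists the source IDs delivered to v for this superstep.
    std::vector<std::vector<std::size_t>> inbox(n);
    for (std::size_t v = 0; v < n; ++v) inbox[v].push_back(v);  // superstep 0 seed

    std::size_t supersteps = 0;
    bool anyChanged = true;
    while (anyChanged) {
        anyChanged = false;
        std::vector<std::vector<std::size_t>> outbox(n);
        for (std::size_t v = 0; v < n; ++v) {
            for (std::size_t src : inbox[v]) {
                if (seen[v][src]) continue;
                seen[v][src] = true;
                anyChanged = true;
                for (std::size_t neighbor : graph[v]) {
                    outbox[neighbor].push_back(src);
                }
            }
        }
        if (anyChanged) ++supersteps;
        inbox = std::move(outbox);  // barrier between supersteps
    }
    return supersteps - 1;
}
```

The `seen` table needs memory quadratic in the number of vertices, so this form only suits small graphs. It is meant to show the counting argument rather than a scalable design.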

Limitations: must be synchronous; designed for unweighted graphs.

Performance Results: ER-Graphs (p=32%)

Performance Results: SF-Graphs (k=3)

Performance Results: Real World Graphs

Thank you