Mayank Bhatt, Jayasi Mehar

Presentation transcript:

Topology-Aware Distributed Graph Processing for Tightly-Coupled Clusters. Mayank Bhatt, Jayasi Mehar. DPRG: http://dprg.cs.uiuc.edu

Our work explores graph partitioning, with a focus on reducing communication cost on tightly coupled clusters.

Why? Experimenting with cloud frameworks on HPC systems. Interest in supercomputing as a service. More big data jobs running on supercomputers.

Tightly-Coupled Clusters: Supercomputers. Compute nodes are embedded inside the network topology. Messages are routed via compute nodes. Communication patterns can influence performance. "Hop count" is an approximate measure of the cost of communication.

Blue Waters Interconnect: 3D torus. A subset of nodes is returned for running a job. Static routing: the number of hops between two nodes remains constant.
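As a rough illustration of hop count on a 3D torus, the sketch below computes the shortest wraparound distance between two node coordinates. The function name `torus_hops` and the coordinate layout are assumptions made for this example; real routing on Blue Waters may not follow the shortest path, so treat the result as an approximate cost only.

```python
def torus_hops(a, b, dims):
    """Shortest hop count between coordinates a and b on a torus.

    a, b: node coordinates, e.g. (x, y, z)
    dims: torus size along each axis, e.g. (4, 4, 4)
    Assumes shortest-path routing with wraparound links on every axis.
    """
    total = 0
    for x, y, size in zip(a, b, dims):
        direct = abs(x - y) % size
        # Going the other way around the ring may be shorter.
        total += min(direct, size - direct)
    return total


if __name__ == "__main__":
    # Nodes at opposite ends of one axis are neighbours via the wraparound link.
    print(torus_hops((0, 0, 0), (3, 0, 0), (4, 4, 4)))
```

Because routing is static, such a distance can be computed once per node pair and reused throughout partitioning.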

Graph Processing Systems: A lot of real-world data is expressed in the form of graphs. Billions of vertices and trillions of edges, so the graph needs to be distributed. Algorithms: e.g. shortest path, PageRank. Two stages: ingress and processing.

Types of Partitioning: Vertex cuts and edge cuts. System of choice: PowerGraph. Masters and mirrors. Masters communicate with all mirrors. Our hypothesis: placing masters and mirrors close by should reduce communication cost.

Master-mirror placement: Either place the replicas of a vertex first and then decide where to place the master, or place the master of each vertex first and then decide where to place the replicas (hashing). (Diagram: masters M and replicas R.)

Random Partitioning: Fast ingress. Communication cost between master and mirrors can be high. Replication factor could be high.
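To make the replication-factor concern concrete, here is a minimal sketch of random edge placement (a vertex cut) and the resulting average number of replicas per vertex. The names `random_vertex_cut` and `replication_factor` are hypothetical, and a seeded random generator stands in for whatever hashing the real system uses.

```python
import random
from collections import defaultdict


def random_vertex_cut(edges, num_machines, seed=0):
    """Assign each edge to a random machine; return placement and replica sets."""
    rng = random.Random(seed)
    placement = {}
    replicas = defaultdict(set)  # vertex -> machines holding a copy of it
    for u, v in edges:
        machine = rng.randrange(num_machines)
        placement[(u, v)] = machine
        replicas[u].add(machine)
        replicas[v].add(machine)
    return placement, replicas


def replication_factor(replicas):
    """Average number of machines that hold a copy of each vertex."""
    return sum(len(machines) for machines in replicas.values()) / len(replicas)


if __name__ == "__main__":
    # A star graph: vertex 0 is connected to every other vertex.
    star = [(0, i) for i in range(1, 200)]
    _, reps = random_vertex_cut(star, num_machines=16)
    print(len(reps[0]), replication_factor(reps))
```

In the star example the hub vertex tends to end up replicated on most machines, which is the situation where the master has many mirrors to talk to.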

Oblivious Partitioning: Slower ingress. Heuristic-based partitioning. Leads to a smaller replication factor than random. Starting point for optimizing master-mirror communication.

Grid Partitioning: Intersecting constraint sets. Leads to a controlled replication factor. Master-mirror communication is not optimized.
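The idea of intersecting constraint sets can be sketched as follows, assuming machines are arranged in a square `side` x `side` grid and integer vertex IDs are mapped to a home machine with a simple modulo. Both the layout and the function names are assumptions for illustration, not PowerGraph's actual code.

```python
def constraint_set(vertex_id, side):
    """Machines in the same grid row or column as the vertex's home machine."""
    home = vertex_id % (side * side)  # stand-in for a hash function
    row, col = divmod(home, side)
    same_row = {row * side + c for c in range(side)}
    same_col = {r * side + col for r in range(side)}
    return same_row | same_col


def grid_candidates(u, v, side):
    """Machines allowed to hold edge (u, v): the intersection of both sets."""
    return constraint_set(u, side) & constraint_set(v, side)


if __name__ == "__main__":
    side = 3  # 9 machines in a 3 x 3 grid
    print(sorted(constraint_set(0, side)))
    print(sorted(grid_candidates(0, 4, side)))
```

Each vertex can only appear on the machines of its own constraint set, which holds 2 * side - 1 machines, so the replication factor is bounded by that value. The intersection is never empty, because the machine at (row of u, column of v) belongs to both sets.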

Topology-Aware Variants: Make the partitioning step aware of the underlying network topology. Place masters and mirrors such that communication cost is minimized.

Choosing a master: Pick the master such that the total number of hops is minimal. Geometric centroid. The edge degrees of each replica can differ, which motivates a weighted centroid.
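One way to read the weighted-centroid idea is sketched below: among the nodes holding replicas of a vertex, pick the one that minimizes the sum of hop counts to the other replicas, with each replica weighted by the number of edges it holds. The candidate set, the weighting, and the names `hops` and `pick_master` are assumptions for this example rather than the authors' exact rule.

```python
def hops(a, b, dims):
    """Shortest wraparound hop count between two torus coordinates."""
    total = 0
    for x, y, size in zip(a, b, dims):
        direct = abs(x - y) % size
        total += min(direct, size - direct)
    return total


def pick_master(replica_edges, dims):
    """Return the replica node with the lowest edge-weighted total hop count.

    replica_edges: dict mapping node coordinate -> number of edges of the
                   vertex stored on that node (the weight).
    """
    def cost(candidate):
        return sum(weight * hops(candidate, node, dims)
                   for node, weight in replica_edges.items())

    # Sorting first makes tie-breaking deterministic.
    return min(sorted(replica_edges), key=cost)


if __name__ == "__main__":
    dims = (4, 4, 4)
    weighted = {(0, 0, 0): 1, (1, 0, 0): 1, (3, 0, 0): 10}
    unweighted = {node: 1 for node in weighted}
    print(pick_master(unweighted, dims), pick_master(weighted, dims))
```

With equal weights the choice is the plain centroid, while the heavy replica pulls the master toward itself once edge counts are taken into account.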

Grid Centroid: Edges are placed using the Grid partitioning strategy first. Terms used: load (number of masters on the candidate); number of edges on the mirror; number of hops between the mirror and the candidate.

Restricted Oblivious

Restricted Oblivious: Terms used: number of edges on the candidate; maximum number of edges on a node; minimum number of edges on a node; number of hops between the candidate and the master.

Experiments: Cluster size: 36 nodes. Algorithm: approximate diameter. Graph: power-law, 20 million vertices.

Tradeoff between runtime and ingress

Graph Algorithms: Data-intensive algorithms benefit more.

Graph Type: Improvements depend on the type of graph.

Network Data Transfer

Other System Optimizations: Controlling how frequently data is injected into the network affects runtime in certain algorithms. Smaller network buffers are flushed more frequently.
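The relationship between buffer size and flush frequency can be illustrated with a toy send buffer that flushes once its pending bytes reach a threshold. The class `SendBuffer` and its `send` callback are hypothetical and do not correspond to any real PowerGraph or network API.

```python
class SendBuffer:
    """Toy outgoing buffer that flushes when it reaches capacity_bytes."""

    def __init__(self, capacity_bytes, send):
        self.capacity_bytes = capacity_bytes
        self.send = send  # callback that receives one flushed batch of bytes
        self.pending = []
        self.pending_bytes = 0
        self.flushes = 0

    def write(self, message):
        self.pending.append(message)
        self.pending_bytes += len(message)
        if self.pending_bytes >= self.capacity_bytes:
            self.flush()

    def flush(self):
        if self.pending:
            self.send(b"".join(self.pending))
            self.pending = []
            self.pending_bytes = 0
            self.flushes += 1


if __name__ == "__main__":
    sent = []
    small = SendBuffer(capacity_bytes=4, send=sent.append)
    large = SendBuffer(capacity_bytes=1024, send=sent.append)
    for _ in range(10):
        small.write(b"ab")
        large.write(b"ab")
    large.flush()  # leftover data still has to go out at the end
    print(small.flushes, large.flushes)
```

The small buffer injects data into the network many times in small batches, while the large one holds everything until the final flush.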

Buffer Sizes (PageRank, Approximate Diameter): Algorithms with small computation and little network data benefit from frequent flushing.

Decisions, decisions

Conclusions: Two new topology-aware algorithms for graph partitioning. There is no "one size fits all" approach to graph partitioning. We propose a decision tree that can help decide which partitioning algorithm is best. System optimizations complement performance. DPRG: http://dprg.cs.uiuc.edu

Questions and Feedback? DPRG: http://dprg.cs.uiuc.edu