Lu Qin Center of Quantum Computation and Intelligent Systems, University of Technology, Australia Jeffrey Xu Yu The Chinese University of Hong Kong, China.

Presentation transcript:

Lu Qin, Center of Quantum Computation and Intelligent Systems, University of Technology, Australia; Jeffrey Xu Yu, The Chinese University of Hong Kong, China; Lijun Chang, The University of New South Wales, Australia; Hong Cheng, The Chinese University of Hong Kong, China; Xuemin Lin, The University of New South Wales, Australia, and East China Normal University, China. Mahmoud Agbareya, 13 January 2015

Agenda Introduction The Scalable Graph Processing Class ( SGC ) SGC Algorithms Performance Studies 2

Introduction What is MapReduce? MapReduce Class ( MRC ) Minimal MapReduce Class ( MMC ) 4

What is MapReduce: a programming model for processing large data sets in distributed systems. Data is processed as (key, value) pairs. Execution may proceed in rounds, and each round has three phases: map, shuffle, and reduce. Each round runs on many machines, with each machine dedicated to one task (map or reduce). Introduced by two researchers from Google. The most popular implementation is Hadoop. 6
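
The round structure described above (map, then shuffle, then reduce over (key, value) pairs) can be sketched in a few lines of single-process Python. This is only an illustration under stated assumptions: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_round`) and the word-count job are hypothetical and are not taken from the slides or from Hadoop's API.

```python
from collections import defaultdict

def map_phase(records, mapper):
    # Apply the mapper to every input (key, value) pair.
    pairs = []
    for key, value in records:
        pairs.extend(mapper(key, value))
    return pairs

def shuffle_phase(pairs):
    # Group all intermediate values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    # Apply the reducer to each key and its list of values.
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

def run_round(records, mapper, reducer):
    # One MapReduce round: map, shuffle, reduce.
    return reduce_phase(shuffle_phase(map_phase(records, mapper)), reducer)

# Hypothetical word-count job used only to exercise the round.
def wc_mapper(doc_id, text):
    return [(word, 1) for word in text.split()]

def wc_reducer(word, counts):
    return [(word, sum(counts))]

docs = [(1, "a b a"), (2, "b c")]
counts = run_round(docs, wc_mapper, wc_reducer)
```

In a real deployment the three phases run on different machines and the shuffle moves data over the network; here everything runs in one process so that the data flow is easy to follow. A multi-round algorithm would feed the output of one `run_round` call into the next.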

What is MapReduce (cont.) Example 7

Introduction What is MapReduce? MapReduce Class ( MRC ) Minimal MapReduce Class ( MMC ) 8

MapReduce Class ( MRC ) Definition 9

10 MapReduce Class ( MRC ) (Graph version) Definition

Introduction What is MapReduce? MapReduce Class ( MRC ) Minimal MapReduce Class ( MMC ) 11

Minimal MapReduce Class ( MMC ) Definition 12

13 Minimal MapReduce Class ( MMC ) Definition

Agenda Introduction The Scalable Graph Processing Class ( SGC ) SGC Algorithms Performance Studies 14

The Scalable Graph Processing Class ( SGC ) Motivation Preliminaries SGC Definition Two graph operators in SGC : NE Join EN Join 15

Motivation 17

Motivation We aim to define a MapReduce class in which a graph algorithm has the following three properties: Scalability: the algorithm can always be sped up by adding more machines. Stability: the algorithm stops in a bounded number of rounds. Robustness: the algorithm never fails, regardless of how much memory each machine has. 18

The Scalable Graph Processing Class ( SGC ) Motivation Preliminaries SGC Definition Two graph operators in SGC : NE Join EN Join 19

Preliminaries 20

Preliminaries 21

Preliminaries 22

The Scalable Graph Processing Class ( SGC ) Motivation Preliminaries SGC Definition Two graph operators in SGC : NE Join EN Join 23

24 Scalable Graph Processing Class ( SGC ) definition

25 Scalable Graph Processing Class ( SGC ) definition

The Scalable Graph Processing Class ( SGC ) Motivation Preliminaries SGC Definition Two graph operators in SGC : NE Join EN Join 26

Two graph operators in SGC 27

The Scalable Graph Processing Class ( SGC ) Motivation Preliminaries SGC Definition Two graph operators in SGC : NE Join EN Join 28

NE Join 29

NE Join 30

NE Join in MapReduce 31

NE Join in MapReduce 32

The Scalable Graph Processing Class ( SGC ) Motivation Preliminaries SGC Definition Two graph operators in SGC : NE Join EN Join 33

EN Join 34

EN Join 35

EN Join 36

EN Join in MapReduce 37

Agenda Introduction The Scalable Graph Processing Class ( SGC ) SGC Algorithms Performance Studies 38

SGC Algorithms Basic Graph Algorithms: Breadth First Search Page Rank Graph Keyword Search Advanced Algorithms: Connected Component Minimum Spanning Forest 39

Breadth First Search 41

SGC Algorithms Basic Graph Algorithms: Breadth First Search Page Rank Graph Keyword Search Advanced Algorithms: Connected Component Minimum Spanning Forest 42

Page Rank 43

Page Rank 44

SGC Algorithms Basic Graph Algorithms: Breadth First Search Page Rank Graph Keyword Search Advanced Algorithms: Connected Component Minimum Spanning Forest 45

Graph Keyword Search 46

Graph Keyword Search 47

SGC Algorithms Basic Graph Algorithms: Breadth First Search Page Rank Graph Keyword Search Advanced Algorithms: Connected Component Minimum Spanning Forest 48

49 Forest Initializing Conditional Star Hooking Unconditional Star Hooking Pointer Jumping Star Detection Procedure

Connected Component 50 Forest Initializing: Line 1: find the minimum neighbor of each node and set it as that node's parent.
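
The initialization step above can be sketched as follows. This is a minimal in-memory sketch, not the MapReduce implementation from the talk. The adjacency-dictionary input and the name `init_forest` are assumptions. So is the second step: when every node picks its minimum neighbor, two nodes can pick each other, so the sketch makes the smaller of the two a root. The transcript does not preserve how the slides handle that case.

```python
def init_forest(adj):
    # adj maps each node id to the set of its neighbors (undirected graph).
    # Step 1: each node takes its minimum neighbor as parent;
    # an isolated node becomes its own parent.
    first = {v: (min(nbrs) if nbrs else v) for v, nbrs in adj.items()}

    # Step 2 (assumption): break mutual-parent pairs by turning the
    # smaller node into a root, so the parent pointers form a forest.
    parent = dict(first)
    for v, u in first.items():
        if u != v and first[u] == v and v < u:
            parent[v] = v
    return parent

# Path 1 - 2 - 3 - 4 plus an isolated node 5.
graph = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}, 5: set()}
forest = init_forest(graph)
```

With the minimum-neighbor rule on an undirected graph, the only cycles that can appear are such mutual pairs, which is why a single local check is enough in this sketch.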

Connected Component 51 Forest Initializing:

Connected Component 52

Connected Component 53 Forest Initializing:

Connected Component 54 Star Detection: Rules to detect that a node is not in a star (applied in order)
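
The rules themselves are not preserved in this transcript. As a rough guide, a star check in the style of classic parallel connected-component algorithms can be sketched as below. The three rules in the comments and the name `detect_stars` are assumptions and may differ from the ones on the slide. A star here means a tree of depth at most one, that is, a root with only direct children.

```python
def detect_stars(parent):
    # parent maps each node to its parent; roots point to themselves.
    star = {v: True for v in parent}

    for v in parent:
        grand = parent[parent[v]]
        if parent[v] != grand:
            # Rule 1 (assumed): a node whose parent differs from its
            # grandparent is not in a star.
            star[v] = False
            # Rule 2 (assumed): the grandparent of such a node is not
            # in a star either.
            star[grand] = False

    # Rule 3 (assumed): a node whose parent is not in a star is not in
    # a star. The right-hand side reads only values from rules 1 and 2.
    return {v: star[v] and star[parent[v]] for v in parent}

# Tree 1 <- 2 <- 3 has depth two; tree 5 <- 6 is a star.
flags = detect_stars({1: 1, 2: 1, 3: 2, 5: 5, 6: 5})
```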

Connected Component 55

Connected Component 56

Connected Component 57

Connected Component 58

Connected Component 59 Conditional Star Hooking (inside the loop):

Connected Component 60 Conditional Star Hooking (inside the loop): After Conditional Star Hooking, it is guaranteed that there are no edges between two stars.

Connected Component 61

Connected Component 62 Unconditional Star Hooking (inside the loop):

Connected Component 63

Connected Component 64 Pointer Jumping (inside the loop):
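
Pointer jumping is usually described as replacing every node's parent with its grandparent in one synchronous step, which roughly halves the depth of each tree. A minimal in-memory sketch follows. The names `pointer_jump` and `jump_to_fixpoint` are hypothetical, and the helper that repeats the step until nothing changes is included only to show the effect. The slide places a pointer-jumping step inside the main loop of the algorithm.

```python
def pointer_jump(parent):
    # One synchronous step: every node reads the old parent map,
    # so the new parent of v is the old grandparent of v.
    return {v: parent[parent[v]] for v in parent}

def jump_to_fixpoint(parent):
    # Repeat the step until no pointer changes; return the final map
    # and the number of steps that changed something.
    steps = 0
    while True:
        nxt = pointer_jump(parent)
        if nxt == parent:
            return parent, steps
        parent = nxt
        steps += 1

# Chain 4 -> 3 -> 2 -> 1 (root) plus a singleton root 5.
start = {1: 1, 2: 1, 3: 2, 4: 3, 5: 5}
after_one = pointer_jump(start)
final, steps = jump_to_fixpoint(start)
```

Once no pointer changes, every tree has depth at most one, so each component built so far is a star rooted at its representative.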

Connected Component 65

SGC Algorithms Basic Graph Algorithms: Breadth First Search Page Rank Graph Keyword Search Advanced Algorithms: Connected Component Minimum Spanning Forest 66

Minimum Spanning Forest 67

Minimum Spanning Forest 68

69 Forest Initializing Cycle Breaking Edge Hooking Pointer Jumping

Minimum Spanning Forest 70

Minimum Spanning Forest Forest Initialization 71

Minimum Spanning Forest 72

Minimum Spanning Forest Cycle Breaking 73

Minimum Spanning Forest Pointer Jumping 74

Minimum Spanning Forest 75

Minimum Spanning Forest 76

Minimum Spanning Forest 77

Minimum Spanning Forest 78

Minimum Spanning Forest Edge Hooking 79

Minimum Spanning Forest 80

Agenda Introduction The Scalable Graph Processing Class ( SGC ) SGC Algorithms Performance Studies 81
