Graph Algorithms. Topics: introduction to graph algorithms and graph representations; Single Source Shortest Path (SSSP) problem.

Presentation transcript:

Graph Algorithms

Graph Algorithms: Topics
- Introduction to graph algorithms and graph representations
- Single Source Shortest Path (SSSP) problem
- Refresher: Dijkstra’s algorithm
- Breadth-First Search with MapReduce
- PageRank

What’s a graph?
- G = (V, E), where
  - V represents the set of vertices (nodes)
  - E represents the set of edges (links)
  - Both vertices and edges may contain additional information
- Different types of graphs:
  - Directed vs. undirected edges
  - Presence or absence of cycles
  - ...

Some Graph Problems
- Finding shortest paths
  - Routing Internet traffic and UPS trucks
- Finding minimum spanning trees
  - Telco laying down fiber
- Finding Max Flow
  - Airline scheduling
- Identify “special” nodes and communities
  - Breaking up terrorist cells, spread of avian flu
- Bipartite matching
  - Monster.com, Match.com
- And of course... PageRank

Representing Graphs
- G = (V, E)
- Two common representations
  - Adjacency matrix
  - Adjacency list

Adjacency Matrices
Represent a graph as an n x n square matrix M
- n = |V|
- M_ij = 1 means a link from node i to j

Adjacency Lists
Take adjacency matrices… and throw away all the zeros
  1: 2, 4
  2: 1, 3, 4
  3: 1
  4: 1, 3
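
A minimal Python sketch of "throwing away the zeros". The matrix `M` below is not on the slide; it is reconstructed from the adjacency list shown there, assuming the first entry belongs to node 1. The function name `matrix_to_adjacency_list` is made up for illustration.

```python
# Adjacency matrix for the 4-node example (nodes are numbered 1..4).
# Row i, column j holds 1 when there is a link from node i+1 to node j+1.
# Assumption: this matrix is rebuilt from the slide's adjacency list.
M = [
    [0, 1, 0, 1],  # 1 -> 2, 4
    [1, 0, 1, 1],  # 2 -> 1, 3, 4
    [1, 0, 0, 0],  # 3 -> 1
    [1, 0, 1, 0],  # 4 -> 1, 3
]


def matrix_to_adjacency_list(matrix):
    """Keep only the non-zero entries of each row."""
    return {
        i + 1: [j + 1 for j, cell in enumerate(row) if cell]
        for i, row in enumerate(matrix)
    }


adjacency = matrix_to_adjacency_list(M)
print(adjacency)  # {1: [2, 4], 2: [1, 3, 4], 3: [1], 4: [1, 3]}
```

The list form stores one entry per edge rather than one cell per node pair, which is why it tends to suit sparse graphs.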

Single Source Shortest Path
- Problem: find shortest path from a source node to one or more target nodes
- First, a refresher: Dijkstra’s Algorithm

Dijkstra’s Algorithm Example
[Figure sequence, six slides: step-by-step run of Dijkstra’s algorithm, starting with distance 0 at the source and ∞ at every other node. Example from CLR.]
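
As a refresher in code, here is a hedged sketch of Dijkstra's algorithm using Python's standard `heapq` module. The small graph is a hypothetical example, not the CLR figure from the slides, and the function assumes every node appears as a key and all weights are non-negative.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from source to every node.

    graph maps each node to a list of (neighbor, weight) pairs.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: a shorter path to u was already found
        for v, w in graph[u]:
            candidate = d + w
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Hypothetical graph, chosen only for illustration.
example = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
print(dijkstra(example, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```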

Single Source Shortest Path
- Problem: find shortest path from a source node to one or more target nodes
- Single processor machine: Dijkstra’s Algorithm
- MapReduce: parallel Breadth-First Search (BFS)

Finding the Shortest Path
- Consider simple case of equal edge weights
- Solution to the problem can be defined inductively
- Here’s the intuition:
  - DistanceTo(startNode) = 0
  - For all nodes n directly reachable from startNode, DistanceTo(n) = 1
  - For all nodes n reachable from some other set of nodes S, DistanceTo(n) = 1 + min(DistanceTo(m), m ∈ S)
[Figure: nodes m1, m2, m3, … each linking to node n, labeled cost 1, cost 2, cost 3]
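
The inductive definition can be sketched on a single machine as an ordinary breadth-first search. This is only an illustration with equal edge weights; `distance_to` is a hypothetical helper name, and the graph is the 4-node adjacency-list example (assuming its first row belongs to node 1).

```python
from collections import deque


def distance_to(adjacency, start_node):
    """Hop count from start_node to every reachable node."""
    distance = {start_node: 0}  # DistanceTo(startNode) = 0
    frontier = deque([start_node])
    while frontier:
        m = frontier.popleft()
        for n in adjacency.get(m, []):
            if n not in distance:
                # First time n is reached, so m is at minimum distance:
                # DistanceTo(n) = 1 + DistanceTo(m)
                distance[n] = distance[m] + 1
                frontier.append(n)
    return distance


adjacency = {1: [2, 4], 2: [1, 3, 4], 3: [1], 4: [1, 3]}
print(distance_to(adjacency, 1))  # {1: 0, 2: 1, 4: 1, 3: 2}
```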

Dijkstra Shortest Path Algorithm

From Intuition to Algorithm
- Mapper input
  - Key: node n
  - Value: D (distance from start), adjacency list (list of nodes reachable from n)
- Mapper output
  - ∀ p ∈ targets in adjacency list: emit(key = p, value = D + 1)
- The reducer gathers possible distances to a given p and selects the minimum one
  - Additional bookkeeping needed to keep track of actual path
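
One way to picture a single pass of this mapper and reducer, simulated in plain Python rather than on a real MapReduce framework. The `shuffle` helper stands in for the framework's group-by-key step and is an assumption of this sketch, as is skipping nodes whose distance is still infinite. Path bookkeeping is left out.

```python
from collections import defaultdict

INF = float("inf")


def mapper(node, value):
    """Input: key = node n, value = (D, adjacency list)."""
    distance, adjacency_list = value
    if distance != INF:  # assumption: unreached nodes emit nothing
        for p in adjacency_list:
            yield p, distance + 1


def reducer(node, candidate_distances):
    """Select the minimum of all distances proposed for this node."""
    return node, min(candidate_distances)


def shuffle(pairs):
    """Stand-in for the framework: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


records = {1: (0, [2, 4]), 2: (INF, [1, 3, 4]), 3: (INF, [1]), 4: (INF, [1, 3])}
emitted = [pair for node, value in records.items() for pair in mapper(node, value)]
reduced = dict(reducer(node, ds) for node, ds in shuffle(emitted).items())
print(reduced)  # {2: 1, 4: 1}
```

After this one pass only the start node's direct neighbors have a distance, and the adjacency lists are no longer part of the output.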

MapReduce for shortest path

Multiple Iterations Needed
- Each MapReduce iteration advances the “known frontier” by one hop
  - Subsequent iterations include more and more reachable nodes as frontier expands
  - Multiple iterations are needed to explore entire graph
  - Feed output back into the same MapReduce task
- Preserving graph structure:
  - Problem: Where did the adjacency list go?
  - Solution: mapper emits (n, adjacency list) as well
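
A hedged sketch of the iteration loop, again simulated in plain Python. The mapper re-emits each node's adjacency list so the structure survives into the next pass. The tags `"graph"` and `"dist"`, the function names, and the stopping rule (stop when a pass changes nothing) are assumptions of this sketch, not something the slides specify.

```python
from collections import defaultdict

INF = float("inf")


def mapper(node, distance, adjacency_list):
    yield node, ("graph", adjacency_list)  # pass the structure along
    yield node, ("dist", distance)         # keep the node's own distance
    if distance != INF:
        for p in adjacency_list:
            yield p, ("dist", distance + 1)


def reducer(node, values):
    adjacency_list, best = [], INF
    for kind, payload in values:
        if kind == "graph":
            adjacency_list = payload
        else:
            best = min(best, payload)
    return node, best, adjacency_list


def run_iteration(records):
    groups = defaultdict(list)  # simulated shuffle: group by key
    for node, distance, adjacency_list in records:
        for key, value in mapper(node, distance, adjacency_list):
            groups[key].append(value)
    return [reducer(node, values) for node, values in sorted(groups.items())]


def parallel_bfs(adjacency, start_node):
    records = [(n, 0 if n == start_node else INF, adj)
               for n, adj in sorted(adjacency.items())]
    iterations = 0
    while True:
        updated = run_iteration(records)
        iterations += 1
        if updated == records:  # assumed termination test: nothing changed
            return {n: d for n, d, _ in updated}, iterations
        records = updated


chain = {1: [2], 2: [3], 3: [4], 4: []}
print(parallel_bfs(chain, 1))  # ({1: 0, 2: 1, 3: 2, 4: 3}, 4)
```

On the 4-node chain the frontier moves one hop per pass, so three passes reach node 4 and a fourth confirms that nothing changes.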

Weighted Edges
- Now add positive weights to the edges
- Simple change: adjacency list in map task includes a weight w for each edge
  - emit (p, D + w_p) instead of (p, D + 1) for each node p
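
A compact sketch of the weighted variant. For brevity the driver here keeps the graph in memory instead of passing it through the mapper, which is a simplification; the graph, the function names, and the "stop when nothing changes" rule are all assumptions. Every node is assumed to appear as a key.

```python
from collections import defaultdict

INF = float("inf")


def weighted_mapper(node, distance, weighted_adjacency):
    yield node, distance  # keep the node's own current distance
    if distance != INF:
        for p, w in weighted_adjacency:
            yield p, distance + w  # D + w_p instead of D + 1


def shortest_distances(graph, start_node):
    dist = {n: (0 if n == start_node else INF) for n in graph}
    while True:
        candidates = defaultdict(list)
        for node, edges in graph.items():
            for key, value in weighted_mapper(node, dist[node], edges):
                candidates[key].append(value)
        # Reduce step: minimum candidate per node.
        new_dist = {node: min(values) for node, values in candidates.items()}
        if new_dist == dist:
            return dist
        dist = new_dist


# Hypothetical weighted graph: node -> list of (target, weight).
graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
print(shortest_distances(graph, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```

In this example node d is first reached at cost 5 and only later improved to 4 through a path with more hops, so with weights a node's distance is not final the first time the frontier touches it.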

Comparison to Dijkstra
- Dijkstra’s algorithm is more efficient
  - At any step it only pursues edges from the minimum-cost path inside the frontier
- MapReduce explores all paths in parallel

Random Walks Over the Web
- Model:
  - User starts at a random Web page
  - User randomly clicks on links, surfing from page to page
- PageRank = the amount of time that will be spent on any given page

PageRank: Defined
Given page x with in-bound links t_1 … t_n, where
- C(t) is the out-degree of t
- α is probability of random jump
- N is the total number of nodes in the graph
[Figure: pages t_1, t_2, …, t_n each linking to page X]
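
The defining formula itself did not survive in this transcript. With the symbols listed on the slide, the commonly used form of PageRank with a random jump would read as follows; treat it as a reconstruction under that assumption, where PR(t_i) denotes the PageRank of in-bound page t_i.

```latex
PR(x) = \alpha \left( \frac{1}{N} \right) + (1 - \alpha) \sum_{i=1}^{n} \frac{PR(t_i)}{C(t_i)}
```

The first term is the share a page receives from random jumps; the second is the credit passed along in-bound links, with each source t_i splitting its rank evenly across its C(t_i) out-links.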

PageRank Example

Computing PageRank
- Properties of PageRank
  - Can be computed iteratively
  - Effects at each iteration are local
- Sketch of algorithm:
  - Start with seed PR_i values
  - Each page distributes PR_i “credit” to all pages it links to
  - Each target page adds up “credit” from multiple in-bound links to compute PR_{i+1}
  - Iterate until values converge
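
The sketch of the algorithm can be written on a single machine roughly as below. This is a minimal illustration, not the slides' code: it assumes the usual random-jump form of PageRank with α as the jump probability, a uniform seed of 1/N, a default α of 0.15, an L1-change stopping test, and a graph in which every page has at least one out-link and every target appears as a key (no dangling links).

```python
def pagerank(links, alpha=0.15, tolerance=1e-6, max_iterations=100):
    """links maps each page to the list of pages it links to."""
    nodes = list(links)
    n = len(nodes)
    pr = {node: 1.0 / n for node in nodes}  # seed values
    for _ in range(max_iterations):
        credit = {node: 0.0 for node in nodes}
        for node, targets in links.items():
            share = pr[node] / len(targets)  # split evenly over out-links
            for t in targets:
                credit[t] += share           # target adds up incoming credit
        new_pr = {node: alpha / n + (1 - alpha) * credit[node] for node in nodes}
        change = sum(abs(new_pr[node] - pr[node]) for node in nodes)
        pr = new_pr
        if change < tolerance:               # assumed convergence test
            break
    return pr


links = {1: [2, 4], 2: [1, 3, 4], 3: [1], 4: [1, 3]}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))  # 1
```

Because no page is dangling, the ranks keep summing to 1 from one iteration to the next.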

PageRank in MapReduce
- Map: distribute PageRank “credit” to link targets
- ...
- Reduce: gather up PageRank “credit” from multiple sources to compute new PageRank value
- Iterate until convergence
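
A hedged simulation of one map and reduce pass for PageRank in plain Python. The tags `"graph"` and `"credit"`, the function names, the default α of 0.15, and the random-jump update in the reducer are assumptions of this sketch. The mapper also re-emits the adjacency list so the next iteration can use it, and pages without out-links are not handled.

```python
from collections import defaultdict


def pagerank_mapper(node, rank, targets):
    yield node, ("graph", targets)  # pass the link structure along
    for t in targets:
        yield t, ("credit", rank / len(targets))


def pagerank_reducer(node, values, alpha, n):
    targets, total = [], 0.0
    for kind, payload in values:
        if kind == "graph":
            targets = payload
        else:
            total += payload  # gather credit from all sources
    return node, alpha / n + (1 - alpha) * total, targets


def pagerank_iteration(records, alpha=0.15):
    n = len(records)
    groups = defaultdict(list)  # simulated shuffle: group by key
    for node, rank, targets in records:
        for key, value in pagerank_mapper(node, rank, targets):
            groups[key].append(value)
    return [pagerank_reducer(node, values, alpha, n)
            for node, values in sorted(groups.items())]


records = [(1, 0.25, [2, 4]), (2, 0.25, [1, 3, 4]), (3, 0.25, [1]), (4, 0.25, [1, 3])]
after_one = pagerank_iteration(records)
print(after_one[0][0], round(after_one[0][1], 4))  # 1 0.4271
```

Feeding `after_one` back into `pagerank_iteration` repeats the process; a driver would loop until the values stop changing by more than some chosen tolerance.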

MapReduce for PageRank

PageRank: Issues
- Is PageRank guaranteed to converge? How quickly?
- What is the “correct” value of α, and how sensitive is the algorithm to it?
- What about dangling links?
- How do you know when to stop?

Graph Algorithms in MapReduce
- General approach:
  - Store graphs as adjacency lists
  - Each map task receives a node and its adjacency list
  - Map task computes some function of the link structure, emits value with target as the key
  - Reduce task collects keys (target nodes) and aggregates
- Perform multiple MapReduce iterations until some termination condition
  - Remember to “pass” graph structure from one iteration to next