The Shape of the Web
So, the Web is a directed graph, but what does it look like?

What is the shape of the web? Broder et al.: Graph structure in the web (2000)

The Bow-Tie Shape of the Web
CORE: A giant strongly connected component (SCC)
IN: Pages leading to CORE
OUT: Pages reachable from CORE
TENDRILS: Not part of CORE, leading out of IN, into OUT, or bypassing CORE (TUBES)
ISLANDS: Disconnected pages

Within an SCC, a path can be found from any node to any other node: 1 → 2, 2 → 3, 3 → 1. How do you compute an SCC?

Strongly Connected Component
An SCC is defined on a digraph G = (V, E) only! (Why?)
An SCC is a subset C of V with the following properties:
1. ∀ u, v ∈ C, u is reachable from v in G and v is reachable from u in G
2. If C is a proper subset of another subset D of V, then D does not satisfy property 1
Translation: C is a maximal subset of vertices with mutual reachability
Which graph traversal algorithm(s) produce reachability information?
How many SCCs does G have?
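
Property 1 can be checked directly with a graph traversal. The following is a minimal sketch, assuming the digraph is stored as a Python dict of adjacency lists; the names `reachable`, `mutually_reachable`, and the extra page 4 in the example are illustrative assumptions, not part of the slides.

```python
def reachable(graph, start):
    """Return the set of vertices reachable from start (iterative DFS)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def mutually_reachable(graph, u, v):
    """True if there is a path from u to v and a path from v to u."""
    return v in reachable(graph, u) and u in reachable(graph, v)


# The 3-cycle 1 -> 2 -> 3 -> 1, plus a hypothetical page 4 linked from page 3.
example = {1: [2], 2: [3], 3: [1, 4], 4: []}
print(mutually_reachable(example, 1, 3))  # pages on the cycle
print(mutually_reachable(example, 1, 4))  # page 4 has no path back
```

Checking every pair this way is far slower than the two-traversal method, so it is only useful for understanding the definition on small graphs.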

Transpose of a Digraph
If G = (V, E) is a digraph, then its transpose G^T is the digraph G^T = (V, E^T), where E^T = { (v, u) | (u, v) ∈ E }
Which graph has more SCCs? G or G^T?
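
As a small illustration of the definition, the sketch below reverses every edge of a digraph stored as a dict of adjacency lists. The representation and the function name `transpose` are assumptions made for this example.

```python
def transpose(graph):
    """Return the transpose: every edge (u, v) becomes (v, u)."""
    gt = {u: [] for u in graph}           # keep vertices that have no in-links
    for u, neighbors in graph.items():
        for v in neighbors:
            gt.setdefault(v, []).append(u)
    return gt


cycle = {1: [2], 2: [3], 3: [1]}
print(transpose(cycle))
```

Transposing twice gives back the original edge set, which is a quick sanity check on any implementation.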

A fact of life… A directed graph and its transpose have exactly the same strongly connected components Why?
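
One way to see this, written as a short sketch in LaTeX (the symbol \leadsto for "has a path to" is notation assumed here and needs the amssymb package): reversing every edge reverses every path, so mutual reachability is unchanged.

```latex
u \leadsto_{G} v \iff v \leadsto_{G^T} u
\quad\Longrightarrow\quad
\bigl(u \leadsto_{G} v \,\wedge\, v \leadsto_{G} u\bigr)
\iff
\bigl(u \leadsto_{G^T} v \,\wedge\, v \leadsto_{G^T} u\bigr)
```

Since an SCC is a maximal set of mutually reachable vertices, the same sets are maximal in both graphs.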

Detecting the strongly connected components requires 2 DFS traversals
1. Run DFS to compute the finishing times ƒ[u].
2. Compute the graph's transpose.
3. Run a 2nd DFS on the transpose, considering vertices in order of decreasing ƒ[u].
4. Output the vertices in each tree of the resulting forest as a separate SCC.
(See the CLRS Algorithms textbook.)
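
The four steps can be sketched in Python as follows. This is a minimal illustration, not the textbook's code: it assumes a dict-of-lists digraph with sortable vertex labels, uses explicit stacks instead of recursion, and the function name and example graph are made up for this sketch.

```python
def strongly_connected_components(graph):
    """Two-pass DFS: finish order on G, then trees of a DFS on the transpose."""
    # Collect every vertex, including ones that only appear as link targets.
    vertices = set(graph)
    for neighbors in graph.values():
        vertices.update(neighbors)

    # Step 1: DFS on G, appending each vertex when it finishes.
    order, seen = [], set()
    for root in sorted(vertices):
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            u, neighbors = stack[-1]
            for v in neighbors:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, iter(graph.get(v, ()))))
                    break
            else:                      # no unseen neighbor left: u is finished
                order.append(u)
                stack.pop()

    # Step 2: build the transpose.
    gt = {u: [] for u in vertices}
    for u, neighbors in graph.items():
        for v in neighbors:
            gt[v].append(u)

    # Step 3: DFS on the transpose in order of decreasing finishing time.
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, stack = [], [root]
        while stack:
            u = stack.pop()
            component.append(u)
            for v in gt[u]:
                if v not in assigned:
                    assigned.add(v)
                    stack.append(v)
        # Step 4: each tree of the second DFS is one SCC.
        components.append(sorted(component))
    return components


web = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []}
print(sorted(strongly_connected_components(web)))
```

Both traversals touch each vertex and edge a constant number of times, so the running time is linear in the size of the graph.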

SCC identified. How about the rest? IN, OUT, TENDRILS, ISLANDS
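
Once the CORE is known, the other regions follow from reachability. The sketch below is one possible way to label them, under stated assumptions: the graph is a dict of adjacency lists, `core` really is a full SCC, and everything attached to the bow tie but outside IN, CORE, and OUT is lumped together as TENDRILS (tubes included). The helper names and the example graph are hypothetical.

```python
def reach(adjacency, sources):
    """Set of vertices reachable from any vertex in sources."""
    seen = set(sources)
    stack = list(sources)
    while stack:
        u = stack.pop()
        for v in adjacency.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def bowtie_parts(graph, core):
    """Split the vertices into CORE, IN, OUT, TENDRILS, and ISLANDS."""
    vertices = set(graph)
    for neighbors in graph.values():
        vertices.update(neighbors)

    reverse = {}                                   # transpose of the graph
    for u, neighbors in graph.items():
        for v in neighbors:
            reverse.setdefault(v, []).append(u)
    undirected = {u: list(graph.get(u, ())) + reverse.get(u, []) for u in vertices}

    core = set(core)
    out_part = reach(graph, core) - core           # reachable from CORE
    in_part = reach(reverse, core) - core          # can reach CORE
    attached = reach(undirected, core)             # connected if direction is ignored
    return {
        "CORE": core,
        "IN": in_part,
        "OUT": out_part,
        "TENDRILS": attached - core - in_part - out_part,
        "ISLANDS": vertices - attached,
    }


small_web = {1: [2], 2: [3], 3: [1, 5], 4: [1, 6], 5: [], 6: [], 7: [5], 8: []}
parts = bowtie_parts(small_web, {1, 2, 3})
print(parts)
```

In `small_web`, page 6 hangs off an IN page and page 7 points into an OUT page, so both land in TENDRILS, while page 8 has no links at all.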

Visualizing a small Web

But why is it a Bowtie?
Maybe it is a teapot, a daisy? A bugle? A cauliflower?
It is a collection of Bowties, because. (It could not be anything else.)
Proof by construction
M: Why the Shape of the Web is a Bowtie? (2010)

Bowtie Web: Proof by Construction
Start by considering one link per page
Pseudo-trees appear
The cycles of pseudo-trees are like budding COREs
INs created, no OUTs
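
The first step of the construction can be tried on a toy example. The sketch below assumes every page has exactly one out-link, stored as a dict `link` in which every link target is also a key; the function name and the seven-page example are invented for illustration.

```python
def cycle_nodes(link):
    """Return the pages that lie on a cycle when each page has one out-link."""
    on_cycle = set()
    state = {}                       # page -> "active" during a walk, then "done"
    for start in link:
        path = []
        u = start
        while u not in state:        # follow links until a known page is hit
            state[u] = "active"
            path.append(u)
            u = link[u]
        if state[u] == "active":     # the walk ran into itself: a new cycle
            on_cycle.update(path[path.index(u):])
        for w in path:
            state[w] = "done"
    return on_cycle


# Two pseudo-trees: cycle 0 -> 1 -> 2 -> 0 fed by pages 3 and 4, and cycle 5 <-> 6.
one_link = {0: 1, 1: 2, 2: 0, 3: 0, 4: 3, 5: 6, 6: 5}
budding_cores = cycle_nodes(one_link)
in_pages = set(one_link) - budding_cores
print(budding_cores, in_pages)
```

Every page off a cycle eventually leads into one, which is why this stage produces INs but no OUTs: a cycle page's only link stays on its cycle.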

The Second link creates a Bowtie
Consider the 2nd link
It will reduce the number of components, enlarge the CORE, create IN and TENDRILS
OUTs may appear as smaller cycles (than CORE)

Nodes without links and Third links
Now include nodes without links (as possible targets for Bowtie nodes). They start off as ISLANDS
Consider the effect of the 3rd link. What happens when you link:
IN node to IN, CORE, OUT, ISLAND
CORE node to IN, CORE, OUT, ISLAND
OUT node to IN, CORE, OUT, ISLAND
ISLAND node to IN, CORE, OUT, ISLAND

Third links enlarge Bowties!
(rows: source of the 3rd link; columns: its target)
Columns: IN, CORE, OUT, ISLAND
IN row: uninteresting, TUBE, TENDRIL
CORE row: Enlarged CORE, uninteresting, Enlarged OUT
OUT row:
ISLAND row:

Web is many Bowties!

So, why is the Web shaped as a Bowtie? Because. That’s the only thing it could be. A randomly generated digraph is a bowtie (or a daisy, or a teapot, or a bugle, or a ____)

How do you explore the Web Graph? Depth-first search? Breadth-first search? Best-first search? Random walk? How do you even know you have explored the whole graph? Where do you store it?

Our Random Selections: 1st Link
CORE: 5 IN: 12 OUT: 0 ISLAND: 1 TENDRIL: 0

Our Random Selections: 2nd Link
CORE: 14 IN: 4 OUT: 0 ISLAND: 0 TENDRIL: 0