1
Models of Web-Like Graphs: Integrated Approach
Igor Kanovsky, Shaul Mazor
Emek Yezreel College, Israel
University of Haifa, Israel
© Igor Kanovsky & Shaul Mazor @ SCI2003, Orlando, FL, July 2003
2
The Web as a graph
A huge digraph whose statistical characteristics are similar to those of the Web graph is called a Web-like graph. The known significant properties of the Web as a graph are power-law distributions, small-world topology, bipartite cliques, and a "bow-tie" shape.
3
Power-Law distributions (PLD)
PLD of in- and out-degrees of vertices. The number of web pages having $k_{in}$ links to the page or $k_{out}$ links from the page is proportional to $k^{-\gamma}$ for some constants $\gamma_{in}, \gamma_{out} > 2$.
Andrei Broder, Ravi Kumar and others. Graph structure in the web. 2001.
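As an illustration only, the exponent of such a distribution can be estimated from a list of vertex degrees by fitting a straight line to the log-log degree histogram. The sketch below is a minimal version of that idea. The names `degree_histogram` and `loglog_slope` are hypothetical, and a plain least-squares fit is a rough estimator, not necessarily the method behind the measurements cited above.

```python
import math
from collections import Counter

def degree_histogram(degrees):
    """Map each positive degree k to the number of vertices with that degree."""
    return Counter(k for k in degrees if k > 0)

def loglog_slope(hist):
    """Least-squares slope of log(count) against log(k).

    For a power law count(k) ~ k^(-gamma) the slope approximates -gamma.
    `hist` maps degree -> count and needs at least two distinct degrees.
    """
    xs = [math.log(k) for k in hist]
    ys = [math.log(c) for c in hist.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

On real data the tail of the histogram is noisy, so the fit is usually restricted to a range of degrees where the log-log plot looks linear.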
4
The Small World
Small diameter of the graph. The average distance between any two connected Web graph vertices is bounded by log N, where N is the number of vertices in the graph.
Large clustering coefficient. The clustering coefficient C(v) of a vertex v is the fraction of pairs of neighbours of v that are connected to each other. For the whole graph, C = <C(v)>, the average over all vertices. The clustering coefficient of the Web graph is significantly larger than that of a random graph.
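Both small-world quantities can be computed directly from an adjacency structure. The following is a minimal sketch, assuming an undirected graph stored as a dictionary that maps each vertex to the set of its neighbours; the function names are hypothetical and the code ignores edge directions, which a study of the Web digraph would have to handle explicitly.

```python
from collections import deque
from itertools import combinations

def clustering(adj, v):
    """Fraction of pairs of neighbours of v that are adjacent to each other."""
    nbrs = adj[v]
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(nbrs, 2))
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)

def average_clustering(adj):
    """C = <C(v)>, the mean of C(v) over all vertices."""
    return sum(clustering(adj, v) for v in adj) / len(adj)

def average_distance(adj):
    """Mean shortest-path length over all connected ordered pairs (BFS from each vertex)."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0
```

For a graph with N vertices, comparing `average_distance` with log N and `average_clustering` with the value for a random graph of the same size and density gives the two small-world checks named above.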
5
The Small World (2)
Lada A. Adamic. The Small World Web.
6
Bipartite Small Cores
A bipartite core Ci,j is a graph on i+j nodes that contains at least one bipartite clique Ki,j as a subgraph. The Web graph contains many small bipartite cores Ci,j (with i,j ≥ 3), whereas a random graph does not have such small cliques. These small cliques (for example K3,3) are the cores of Web communities: sets of connected sites with a common content topic.
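To make the definition concrete, the sketch below counts bipartite cliques in a small directed graph by brute force. It assumes one particular reading of Ki,j: i source pages ("fans") that each link to the same j target pages ("centres"), with the two sides disjoint. The function name `count_bipartite_cliques` is hypothetical, and the enumeration is exponential in j, so it is only suitable for small graphs.

```python
from itertools import combinations
from math import comb

def count_bipartite_cliques(out_links, i, j):
    """Count directed K_{i,j}: i sources that all link to the same j targets.

    `out_links` maps each vertex to the set of vertices it links to.
    """
    # Invert the graph: for each target, the set of pages linking to it.
    in_links = {}
    for src, targets in out_links.items():
        for dst in targets:
            in_links.setdefault(dst, set()).add(src)

    total = 0
    for targets in combinations(in_links, j):
        # Pages that link to every one of the j chosen targets.
        common = set.intersection(*(in_links[t] for t in targets))
        common -= set(targets)  # keep the two sides disjoint
        total += comb(len(common), i)
    return total
```

For example, three pages that all link to the same three other pages contribute exactly one K3,3.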
7
Bipartite Small Cores (2)
Number of Ci,j as a function of i and j.
Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan and Andrew Tomkins. Extracting large-scale knowledge bases from the web. 2000.
8
"Bow-tie" shape
The Web graph has a "bow-tie" shape. The major part of Web pages can be divided into four sets: a core formed by the strongly connected component (SCC), i.e. pages that can all reach one another; two sets (upstream and downstream) made up of pages that can only reach, or can only be reached from, the pages in the core; and a set (tendrils) containing pages that can neither reach nor be reached from the core.
9
Web-like Graph Modeling
The aim is to find a stochastic process that yields a Web-like graph. Our integrated approach is based on well-known Web graph models, extended to satisfy all the statistical properties mentioned above. We try to keep the Web-like graph model as simple as possible, so it should have a minimal set of parameters.
10
Web-like graph models
Erdős-Rényi model. The classical purely random graph model.
“Small world” model. A regular lattice with a small number of random long-range links. Has no power-law distributions (PLD).
Preferential-attachment (PA) model. At every time step, a new node is added and linked to existing nodes chosen at random with probability proportional to each node's in-degree. Has an in-degree PLD (the slope is -3).
Copying model. At every time step, a new node is added and linked to existing nodes either by copying an existing link from a randomly chosen node (with probability 1-p) or at random (with probability p). Has an in-degree PLD (the slope is -(2-p)/(1-p)).
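The copying model is easy to simulate. The sketch below is one possible reading of the description above, not a reference implementation: it assumes every new node creates a fixed number d of out-links, copies link by link from a single randomly chosen prototype node, and starts from a small seed graph of d + 1 mutually linked vertices. The name `copying_model`, the parameter d and the seed graph are all assumptions made for illustration.

```python
import random

def copying_model(n, d, p, seed=None):
    """Return out-link lists for an n-vertex graph grown by link copying.

    Assumes n >= d + 1. Vertex v's out-links are result[v]; duplicates are allowed.
    """
    rng = random.Random(seed)
    # Seed graph (assumption): d + 1 vertices, each linking to the d others.
    out = [[w for w in range(d + 1) if w != v] for v in range(d + 1)]
    for new in range(d + 1, n):
        prototype = rng.randrange(new)  # existing vertex to copy from
        links = []
        for i in range(d):
            if rng.random() < p:
                links.append(rng.randrange(new))   # uniformly random target
            else:
                links.append(out[prototype][i])    # copy the prototype's i-th link
        out.append(links)
    return out
```

Copying makes already popular targets more likely to gain further links, which is the intuition behind the in-degree power law quoted for this model.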
11
Extended scale-free model (1)
At each time step, a new vertex is added and connected to existing vertices through a random number m (≤ z) of new edges, where the average number of edges per node, z, is constant for a growing graph. The probability that an existing vertex gains an edge is proportional to its in-degree.
12
Extended scale-free model (2)
Simultaneously, z-m directed edges are distributed among all the vertices in the graph by the following rules: (i) the source is chosen with probability proportional to its out-degree, (ii) the target is chosen with probability proportional to its in-degree. The model has three parameters: the average degree z, and the initial attractiveness of a vertex for gaining in- and out-edges, Ain and Aout.
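The two growth rules can be combined into a short simulation. The sketch below is a hedged interpretation of the slides, not the authors' code. It assumes that the initial attractiveness is added to the degree when computing selection weights, that m is drawn uniformly from 1..z, that the m new edges point from the new vertex to existing ones, and that growth starts from a single isolated vertex. The name `extended_scale_free` and these choices are assumptions; self-loops and multi-edges are not filtered out.

```python
import random

def extended_scale_free(n, z, a_in, a_out, seed=None):
    """Grow a directed graph on n vertices; returns a list of (source, target) edges."""
    rng = random.Random(seed)
    in_deg, out_deg = [0], [0]   # single isolated start vertex (assumption)
    edges = []

    def pick(degrees, attractiveness, count):
        # Choose among the first `count` vertices with weight degree + attractiveness.
        weights = [degrees[v] + attractiveness for v in range(count)]
        return rng.choices(range(count), weights=weights)[0]

    def add_edge(source, target):
        edges.append((source, target))
        out_deg[source] += 1
        in_deg[target] += 1

    for new in range(1, n):
        in_deg.append(0)
        out_deg.append(0)
        m = rng.randint(1, z)            # assumed distribution of m
        for _ in range(m):               # m edges from the new vertex to older ones
            add_edge(new, pick(in_deg, a_in, new))
        for _ in range(z - m):           # z - m edges among all current vertices
            add_edge(pick(out_deg, a_out, new + 1), pick(in_deg, a_in, new + 1))
    return edges
```

Each step adds exactly z edges, so the average degree stays at z as the graph grows. Recomputing the weights for every edge costs quadratic time overall, which is acceptable for small experiments but would need a faster sampling scheme at larger sizes.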
13
Simulation results. In-degree distribution.
Our model: N = 30 K, <k> = 8, Ain = 2, Aout = 6. Web: N = 500 M.
14
Simulation results. Out-degree distribution.
Our model: N = 30 K, <k> = 8, Ain = 2, Aout = 6. Web: N = 500 M.
15
Simulation results. Degree distribution for various average degree
16
Degree distribution for various initial “in” attractiveness Ain
17
Degree distribution for various initial “out” attractiveness Aout
18
Diameter for various average degree.
19
Diameter for various initial “in” attractiveness Ain
20
Clustering coefficient for various average degree
21
Clustering coefficient for various initial “in” attractiveness Ain
22
Number of bipartite cliques K3,3 for various average degree.
23
Number of bipartite cliques K3,3 for various initial “in” attractiveness Ain.
24
Largest SCC for various average degree
25
Largest SCC for various initial “in” attractiveness Ain.
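The largest SCC is the core of the bow-tie structure described earlier, so measuring it on a simulated graph amounts to a bow-tie decomposition. The sketch below is one simple way to do that, assuming the graph is given as a dictionary from each vertex to the set of vertices it links to. The names `reachable` and `bow_tie` are hypothetical, and finding components by intersecting forward and backward reachability is chosen for brevity; it takes quadratic time, unlike a linear-time algorithm such as Tarjan's.

```python
def reachable(adj, starts):
    """All vertices reachable from `starts` by following edges in `adj`."""
    seen = set(starts)
    stack = list(starts)
    while stack:
        u = stack.pop()
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def bow_tie(out_links):
    """Split a digraph into (core, upstream, downstream, other)."""
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    # Reverse graph, used for "can reach" queries.
    rev = {v: set() for v in nodes}
    for u, targets in out_links.items():
        for w in targets:
            rev[w].add(u)

    # Largest strongly connected component: forward set ∩ backward set.
    core, assigned = set(), set()
    for v in nodes:
        if v in assigned:
            continue
        scc = reachable(out_links, [v]) & reachable(rev, [v])
        assigned |= scc
        if len(scc) > len(core):
            core = scc

    upstream = reachable(rev, core) - core          # can reach the core
    downstream = reachable(out_links, core) - core  # reachable from the core
    other = nodes - core - upstream - downstream    # neither
    return core, upstream, downstream, other
```

The size of `core` relative to the number of vertices is the quantity plotted as "Largest SCC" on these slides.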
26
Characteristics of several web-like models
NA: not applicable
27
Advantages of our approach
Only our extended scale-free model captures all known statistical properties of the Web graph. The model is very simple: it has only three parameters. The model may be used for developing and testing different algorithms for the Web (such as search, ranking, and site promotion).
28
Thank you. For contacts: Igor Kanovsky, igork@yvc.ac.il, http://www.yvc.ac.il/ik/