Download presentation
Presentation is loading. Please wait.
Published bySara May Modified over 8 years ago
1
Complex Networks Analysis Information Systems Engineering 372-2-5503 (2013 A) Instructor: Rami Puzis (puzis@bgu.ac.il) TA: Luiza Nahshon(lu.nacshon@gmail.com) Web: https://piazza.com/bgu.ac.il/fall2013/37225503 Coursera: https://class.coursera.org/sna-003/class Office hours: Schedule via email Slides are taken in part from Network Science class 2012 ( http://barabasilab.neu.edu/courses/phys5116/)
2
Course schedule TopicWeek Introduction, Graph Theory, 1 Properties of networks 2-3 Centrality measures 4-7 Community structure 8-9 Generative models 10-11 Diffusion in networks 11-13
3
Exercise 1 (due 31/10/2013) – Find and download four different networks from four different web sites. The web sites should reside in different domains. At least one network should have less than 1,000 vertices and at least one network should have more than 100,000 vertices. – Use gephi of other network analysis software for a nice layout of the networks and submit PDF of the network visualization. – Fill the course questioner.
4
WHY COMPLEX NETWORKS
5
TRANSPORTATION:
6
Communication networks domain 2 domain 1 domain 3 rout er
7
Thex A SIMPLE STORY (2) Predicting the H1N1 pandemic Network Science: Introduction 2012
8
RealProjected EPIDEMIC FORECAST Predicting the H1N1 pandemic Network Science: Introduction 2012
9
Thex Network Science: Introduction January 10, 2011 The August 14, 2003 outage Network Science: Introduction 2012
10
HUMANS GENES Humans have only about three times as many genes as the fly, so human complexity seems unlikely to come from a sheer quantity of genes. Rather, some scientists suggest, each human has a network with different parts like genes, proteins and groups. Network Science: Introduction 2012
11
Complex systems Made of many non-identical elements connected by diverse interactions. NETWORK HUMANS GENES Drosophila Melanogaster Homo Sapiens Network Science: Introduction 2012
12
Drosophila Melanogaster Homo Sapiens In the generic networks shown, the points represent the elements of each organism’s genetic network, and the dotted lines show the interactions between them. HUMANS GENES Network Science: Introduction 2012
13
HARDWARE
14
SOFTWARE:
15
Network Science: Introduction 2012
16
THE LIFE OF NETWORKS Network Science: Introduction 2012
17
THE LIFE OF NETWORKS Network Science: Introduction 2012
18
Graph theory: 1735, Euler Social Network Research: 1930s, Moreno Communication networks/internet: 1960s Ecological Networks: May, 1979. THE HISTORY OF NETWORK ANALYSIS Network Science: Introduction 2012
19
NETWORK SCIENCE The science of the 21 st century Network Science: Introduction 2012
20
Data Availability: Universality: The (urgent) need to understand complexity: THE EMERGENCE OF NETWORK SCIENCE Movie Actor Network, 1998; World Wide Web, 1999. C elegans neural wiring diagram 1990 Citation Network, 1998 Metabolic Network, 2000; PPI network, 2001 The architecture of networks emerging in various domains of science, nature, and technology are more similar to each other than one would have expected. Despite the challenges complex systems offer us, we cannot afford to not address their behavior, a view increasingly shared both by scientists and policy makers. Networks are not only essential for this journey, but during the past decade some of the most important advances towards understanding complexity were provided in context of network theory. Network Science: Introduction 2012
21
Thex If you were to understand the spread of diseases, can you do it without networks? If you were to understand the WWW structure, searchability, etc, hopeless without invoking the Web’s topology. If you want to understand human diseases, it is hopeless without considering the wiring diagram of the cell. MOST IMPORTANT Networks Really Matter Network Science: Introduction 2012
22
GRAPH THEORY AND BASIC TERMINOLOGY
23
Can one walk across the seven bridges and never cross the same bridge twice? THE BRIDGES OF KONIGSBERG http://www.numericana.com/answer/graphs.htm Euler PATH or CIRCUIT: return to the starting point by traveling each link of the graph once and only once. Network Science: Graph Theory 2012
24
COMPONENTS OF A COMPLEX SYSTEM components: nodes, vertices N interactions: links, edges L system: network, graph (N,L) Network Science: Graph Theory 2012
25
network often refers to real systems www, social network metabolic network. Language: (Network, node, link) graph: mathematical representation of a network web graph, social graph (a Facebook term) Language: (Graph, vertex, edge) We will try to make this distinction whenever it is appropriate, but in most cases we will use the two terms interchangeably. NETWORKS OR GRAPHS? Network Science: Graph Theory 2012
26
A COMMON LANGUAGE Peter Mary Albert co-worker friend brothers friend Protein 1 Protein 2 Protein 5 Protein 9 Movie 1 Movie 3 Movie 2 Actor 3 Actor 1 Actor 2 Actor 4 N=4 L=4 Network Science: Graph Theory 2012
27
The choice of the proper network representation determines our ability to use network theory successfully. In some cases there is a unique, unambiguous representation. In other cases, the representation is by no means unique. For example,, the way we assign the links between a group of individuals will determine the nature of the question we can study. CHOOSING A PROPER REPRESENTATION Network Science: Graph Theory 2012
28
If you connect individuals that work with each other, you will explore the professional network. CHOOSING A PROPER REPRESENTATION Network Science: Graph Theory 2012
29
CHOOSING A PROPER REPRESENTATION If you connect those that have a romantic and sexual relationship, you will be exploring the sexual networks. HOWEVER Could you investigate Sexually Transmitted Diseases without time series data?
30
CHOOSING A PROPER REPRESENTATION Grey arrows indicate STD propagation chance in one Direction Blue lines indicate STD propagation chance In both directions
31
If you connect individuals based on their first name (all Peters connected to each other), you will be exploring what? It is a network, nevertheless. CHOOSING A PROPER REPRESENTATION Network Science: Graph Theory 2012
32
Joe Mary Lisa Bob Ted Dina Ed An n Mia Jim Steve Brad Ellie Edna Mind y Ben LuisLea Anna Annotated networks
33
Joe Mary Lisa Bob Ted Dina Ed An n Mia Jim Steve Brad Ellie Edna Mind y Ben LuisLea Anna poor average rich teen old adult young adult Colleague Friend Child Annotated networks
34
Network Science: Graph Theory 2012
35
A ij =1 if there is a link between node i and j A ij =0 if nodes i and j are not connected to each other. ADJACENCY MATRIX Note that for a directed graph (right) the matrix is not symmetric. 4 2 3 1 2 3 1 4 Network Science: Graph Theory 2012
36
a b c d e f g h a 0 1 0 0 1 0 1 0 b 1 0 1 0 0 0 0 1 c 0 1 0 1 0 1 1 0 d 0 0 1 0 1 0 0 0 e 1 0 0 1 0 0 0 0 f 0 0 1 0 0 0 1 0 g 1 0 1 0 0 0 0 0 h 0 1 0 0 0 0 0 0 ADJACENCY MATRIX b e g a c f h d Network Science: Graph Theory 2012
37
3 GRAPHOLOGY 1 UndirectedDirected 1 4 2 3 2 1 4 Actor network, protein-protein interactionsWWW, citation networks Network Science: Graph Theory 2012
38
GRAPHOLOGY 2 Unweighted (undirected) Weighted (undirected) 3 1 4 2 3 2 1 4 protein-protein interactions, wwwCall Graph, metabolic networks Network Science: Graph Theory 2012
39
What do the numbers mean? Semantics! Link weights are used to quantify semantics length, strength, capacity, … Network Science: Graph Theory 2012 0.5 1 2 2 2 3 3 5 1 2 2 2 3 3 5 WeightsLinks Distance, capacityRoad network Number of emails, size of the emails Email network Number of publications Coauthoirship
40
GRAPHOLOGY 3 Self-interactionsMultigraph (undirected) 3 1 4 2 3 2 1 4 Protein interaction network, wwwSocial networks, collaboration networks Network Science: Graph Theory 2012
41
GRAPHOLOGY 4 Complete Graph (undirected) 3 1 4 2 Actor networks Network Science: Graph Theory 2012
42
GRAPHOLOGY: Real networks can have multiple characteristics WWW > directed multigraph with self-interactions Protein Interactions > undirected unweighted with self-interactions Collaboration network > undirected multigraph or weighted. Mobile phone calls > directed, weighted. Facebook Friendship links > undirected, unweighted. Network Science: Graph Theory 2012
43
bipartite graph (or bigraph) is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V; that is, U and V are independent sets.graph disjoint setsindependent sets Examples: Hollywood actor network Collaboration networks Disease network (diseasome) BIPARTITE GRAPHS Network Science: Graph Theory 2012 U V U V
44
Gene network GENOME PHENOME DISEASOME Disease network Goh, Cusick, Valle, Childs, Vidal & Barabási, PNAS (2007) GENE NETWORK – DISEASE NETWORK Network Science: Graph Theory 2012
46
ADJACENCY MATRIX AND NODE DEGREES Undirected 2 3 1 4 Directed 4 2 3 1 Network Science: Graph Theory 2012
47
Node degree: the number of links connected to the node. NODE DEGREES Undirected In directed networks we can define an in-degree and out-degree. The (total) degree is the sum of in- and out-degree. Source: a node with k in = 0; Sink: a node with k out = 0. Directed A G F B C D E A B
48
We have a sample of values x 1,..., x N Average (a.k.a. mean) : typical value = (x 1 + x 1 +... + x N )/N = Σ i x i /N A BIT OF STATISTICS Network Science: Graph Theory 2012
49
N – the number of nodes in the graph AVERAGE DEGREE Undirected Directed A F B C D E j i Network Science: Graph Theory 2012
50
Most networks observed in real systems are sparse: WWW (ND Sample): N=325,729;L=1.4 10 6 L max =10 12 =4.51 Protein (S. Cerevisiae): N= 1,870;L=4,470L max =10 7 =2.39 Coauthorship (Math): N= 70,975; L=2 10 5 L max =3 10 10 =3.9 Movie Actors: N=212,250; L=6 10 6 L max =1.8 10 13 =28.78 (Source: Albert, Barabasi, RMP2002) REAL NETWORKS ARE SPARSE Network Science: Graph Theory 2012 Average degree 0N2Log(N) O(N)O(1) TreesComplete graph Dense graphsSparse graphs Disconnected
51
Network Science: Graph Theory 2012
52
A path is a sequence of nodes in which each node is adjacent to the next one P i0,in of length n between nodes i 0 and i n is an ordered collection of n+1 nodes and n links A path can intersect itself and pass through the same link repeatedly. Each time a link is crossed, it is counted separately A legitimate path on the graph on the right: ABCBCADEEBA In a directed network, the path can follow only the direction of an arrow. PATHS A B C D E Network Science: Graph Theory 2012
53
PATHOLOGY: 2 2 5 5 4 4 3 3 1 1 Cycle 2 2 5 5 4 4 3 3 1 1 Path A path with the same start and end node. A sequence of links/nodes. Network Science: Graph Theory 2012
54
PATHOLOGY: 2 2 5 5 4 4 3 3 1 1 Simple Cycle 2 2 5 5 4 4 3 3 1 1 Simple Path A cycle that does not intersect itself. A path that does not intersect itself. Network Science: Graph Theory 2012
55
PATHOLOGY: 2 2 5 5 4 4 3 3 1 1 2 2 5 5 4 4 3 3 1 1 Eulerian Path Hamiltonian Path A path that visits each node exactly once. A path that traverses each link exactly once. Network Science: Graph Theory 2012
56
N ij,number of paths between any two nodes i and j: Length n=1: If there is a link between i and j, then A ij =1 and A ij =0 otherwise. Length n=2: If there is a path of length two between i and j, then A ik A kj =1, and A ik A kj =0 otherwise. The number of paths of length 2: Length n: In general, if there is a path of length n between i and j, then A ik …A lj =1 and A ik …A lj =0 otherwise. The number of paths of length n between i and j is * * holds for both directed and undirected networks. NUMBER OF PATHS BETWEEN TWO NODES Adjacency Matrix Network Science: Graph Theory 2012
57
The distance (shortest path, geodesic path) between two nodes is defined as the number of edges along the shortest path connecting them. *If the two nodes are disconnected, the distance is infinity. In directed graphs each path needs to follow the direction of the arrows. Thus in a digraph the distance from node A to B (on an AB path) is generally different from the distance from node B to A (on a BCA path). DISTANCE IN A GRAPH Shortest Path, Geodesic Path D C A B D C A B Network Science: Graph Theory 2012
58
Distance between node 1 and node 4: 1.Start at 1. FINDING DISTANCES: BREADTH FIRST SEATCH 1 Network Science: Graph Theory 2012
59
Distance between node 1 and node 4: 1.Start at 1. 2.Find the nodes adjacent to 1. Mark them as at distance 1. Put them in a queue. FINDING DISTANCES: BREADTH FIRST SEATCH 111 1 Network Science: Graph Theory 2012
60
Distance between node 1 and node 4: 1.Start at 1. 2.Find the nodes adjacent to 1. Mark them as at distance 1. Put them in a queue. 3.Take the first node out of the queue. Find the unmarked nodes adjacent to it in the graph. Mark them with the label of 2. Put them in the queue. FINDING DISTANCES: BREADTH FIRST SEATCH 111 1 2 2 22 2 11 1 Network Science: Graph Theory 2012
61
Distance between node 1 and node 4: 1.Repeat until you find node 4 or there are no more nodes in the queue. 2.The distance between 1 and 4 is the label of 4 or, if 4 does not have a label, infinity. FINDING DISTANCES: BREADTH FIRST SEATCH Network Science: Graph Theory 2012
62
Diameter: d max the maximum distance between any pair of nodes. What is the diameter of a disconnected graph? Average distance,, for a connected directed graph: where d ij is the distance from node i to node j NETWORK DIAMETER AND AVERAGE DISTANCE Network Science: Graph Theory 2012
63
PATHOLOGY: summary 2 2 5 5 4 4 3 3 1 1 Path 2 2 5 5 4 4 3 3 1 1 Shortest Path A sequence of nodes such that each node is connected to the next node along the path by a link. The path with the shortest length between two nodes (distance). Network Science: Graph Theory 2012
64
PATHOLOGY: summary 2 2 5 5 4 4 3 3 1 1 Diameter 2 2 5 5 4 4 3 3 1 1 Average Path Length The longest shortest path in a graph The average of the shortest paths for all pairs of nodes. Network Science: Graph Theory 2012
65
Connected (undirected) graph: any two vertices can be joined by a path. A disconnected graph is made up by two or more connected components. Bridge: if we erase it, the graph becomes disconnected. Largest Component: Giant Component The rest: Isolates CONNECTIVITY OF UNDIRECTED GRAPHS D C A B F F G D C A B F F G Network Science: Graph Theory 2012
66
The adjacency matrix of a network with several components can be written in a block- diagonal form, so that nonzero elements are confined to squares, with all other elements being zero: Figure after Newman, 2010 CONNECTIVITY OF UNDIRECTED GRAPHS Adjacency Matrix Network Science: Graph Theory 2012
67
Strongly connected directed graph: has a path from each node to every other node and vice versa (e.g. AB path and BA path). Weakly connected directed graph: has a path between each pair of nodes in either direction (e.g. AB path or BA path). Strongly connected components can be identified, but not every node is part of a nontrivial strongly connected component. In-component : nodes that can reach the scc, Out-component : nodes that can be reached from the scc. CONNECTIVITY OF DIRECTED GRAPHS D C A B F G E E C A B G F D Network Science: Graph Theory 2012
68
Degree distribution p k THREE CENTRAL QUANTITIES IN NETWORK SCIENCE Average path length Clustering coefficient C Network Science: Graph Theory 2012
70
We have a sample of values x 1,..., x N Distribution of x (a.k.a. PDF): probability that a randomly chosen value is x P(x) = (# values x) / N Σ i P(x i ) = 1 always! STATISTICS REMINDER Histograms >>> Network Science: Graph Theory 2012
71
Degree distribution P(k): probability that a randomly chosen vertex has degree k N k = # nodes with degree k P(k) = N k / N k P(k) 1234 0.1 0.2 0.3 0.4 0.5 0.6 DEGREE DISTRIBUTION Network Science: Graph Theory 2012
72
discrete representation: p k is the probability that a node has degree k. continuum description: p(k) is the pdf of the degrees, where represents the probability that a node’s degree is between k 1 and k 2. Normalization condition: where K min is the minimal degree in the network. DEGREE DISTRIBUTION Network Science: Graph Theory 2012
73
Local Clustering Coefficient: what portion of your neighbors are connected? Node i with degree k i C i in [0,1] CLUSTERING COEFFICIENT Network Science: Graph Theory 2012
74
Degree distribution: P(k) Path length: Clustering coefficient: THREE CENTRAL QUANTITIES IN NETWORK SCIENCE Network Science: Graph Theory 2012
75
A. Degree distribution: p k B. Path length: C. Clustering coefficient: THREE CENTRAL QUANTITIES IN NETWORK SCIENCE Network Science: Graph Theory 2012
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.