1
Centrality Spring 2012
2
Why do we care?
Diffusion (practices, information, disease)
Structure, status, prestige
Seeing, perspective, worldview
Power as relational, constraints as relational
Network location as dependent variable
Explaining outcomes
Supporting strategic “networking”
3
Example: 2-Step Flow of Communication*
Micro-macro link in communications theory.
Lazarsfeld on mass media and voting (1940s): high-centrality nodes (opinion leaders) mediate the flow of broadcast information.
Later formalized (Katz & Lazarsfeld, 1955) as the two-step flow of communication model: mass media messages are filtered through the more-exposed, central members of social groups.
*Remix of
4
The Question: Which Vertices Are Most Important?
5
Everyday Understandings
Important = prominent
Important = admired
Important = linchpin
Important = listened to
Important = in the know
Important = gate keeper
Important = involved
6
Translations
Ordinary Description: Possible Network Interpretation
prominent: Vertex is “visible” to many other vertices
admired: Vertex is “chosen” by many other vertices
listened to: Vertex is “received” by many other vertices
in the know: Vertex is a short distance from many sources of information
linchpin: Vertex is an irreplaceable part of the graph
gate keeper: Vertex stands between one part of the graph and another
involved: Vertex is connected to many parts of the graph
7
Davis Southern Women Degree Centrality
8
Davis’ Southern Women - Centrality
9
A Simple Network
[figure: graph on vertices A through G with its vertex-by-vertex matrix; entries not recoverable]
10
Degree centrality
$C_D(v_i) = k_i$, where $k_i$ is the degree (number of ties) of vertex $v_i$
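A minimal sketch of degree centrality as defined on this slide. The four-vertex graph and the function name `degree_centrality` are assumptions made for illustration; they are not the network drawn in the slides.

```python
# Hypothetical undirected graph stored as an adjacency list.
# This is NOT the slide's network; it only illustrates the formula.
graph = {
    "A": ["B", "C", "D"],
    "B": ["A"],
    "C": ["A", "D"],
    "D": ["A", "C"],
}


def degree_centrality(adj):
    """Return C_D(v) = k_v, the number of neighbors of each vertex."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


degrees = degree_centrality(graph)
for vertex, k in sorted(degrees.items()):
    print(vertex, k)
```

Because the measure only counts direct ties, C and D receive the same score here even though they could sit in quite different positions in a larger graph.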
11
Degree Centrality can Fail to Differentiate
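[table residue: adjacency matrix for vertices A through G with a degree column $C_D$; the surviving entry is $C_D(A) = 4$, other entries not recoverable]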
12
Degree Centrality Can Mislead
13
Closeness centrality
$C_c(v_i) = \dfrac{1}{\sum_{j=1}^{n} d(v_i, v_j)}$
14
Closeness Centrality
Closeness = 1 / total distance to other vertices
$C_c(v) = \dfrac{1}{\sum_{i=1}^{n} d(v, v_i)}$
$C_c(A) = \dfrac{1}{d_{AB} + d_{AC} + d_{AD} + d_{AE} + d_{AF} + d_{AG}}$
$C_c(A) = \dfrac{1}{8} = 0.125$
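The closeness formula can be sketched with a breadth-first search that collects distances from one vertex. The seven-vertex graph below is an assumption: it was chosen so that the total distance from A is 8, matching the arithmetic on the slide, but it is not claimed to be the slide's actual network. The helper names are also hypothetical, and the sketch assumes a connected, unweighted, undirected graph.

```python
from collections import deque

# Hypothetical connected graph: A has four neighbors, and F and G are two steps away.
graph = {
    "A": ["B", "C", "D", "E"],
    "B": ["A", "F"],
    "C": ["A", "G"],
    "D": ["A"],
    "E": ["A"],
    "F": ["B"],
    "G": ["C"],
}


def bfs_distances(adj, source):
    """Return the shortest-path distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj, v):
    """C_c(v) = 1 / (sum of distances from v to all other vertices)."""
    total = sum(bfs_distances(adj, v).values())  # d(v, v) = 0 adds nothing
    return 1 / total


print(closeness(graph, "A"))
```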
15
Compare Two Graphs: Compute Closeness Centrality of a Vertex
In one graph: $C_c(A) = \dfrac{1}{3} \approx 0.33$
In the other: $C_c(A) = \dfrac{1}{5} = 0.2$
What is the problem here? How would you fix it?
16
Normalization
Adjusting a formula to take into account things like graph size
Usually by “mapping” values to the range 0…1 or -1…+1
For closeness centrality: $C'_c(v_i) = \dfrac{n-1}{\sum_{j=1}^{n} d(v_i, v_j)}$
where n is the number of vertices in the graph
17
Compare Two Graphs
In one graph: $C'_c(A) = \dfrac{3}{3} = 1$
In the other: $C'_c(A) = \dfrac{5}{5} = 1$
Intuitively, both blue vertices should have the same closeness centrality, since both are 1 step away from all other vertices.
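A small sketch of why normalization helps. It assumes the two graphs are stars whose hub A is one step from 3 and from 5 other vertices; that assumption reproduces the raw values 0.33 and 0.2, but the actual figures are not available. The helper names `star`, `total_distance`, and `closeness` are hypothetical.

```python
from collections import deque


def star(n_leaves):
    """Build a star graph: hub "A" tied to n_leaves leaves (an assumed shape)."""
    leaves = [f"v{i}" for i in range(1, n_leaves + 1)]
    adj = {"A": list(leaves)}
    for leaf in leaves:
        adj[leaf] = ["A"]
    return adj


def total_distance(adj, source):
    """Sum of shortest-path distances from source (connected graph assumed)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(dist.values())


def closeness(adj, v, normalized=False):
    """Raw closeness 1/total, or normalized closeness (n - 1)/total."""
    total = total_distance(adj, v)
    return (len(adj) - 1) / total if normalized else 1 / total


for leaves in (3, 5):
    g = star(leaves)
    print(leaves, closeness(g, "A"), closeness(g, "A", normalized=True))
```

The raw scores differ only because the graphs differ in size; after multiplying by n - 1, both hubs score 1.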
18
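Betweenness centrality
$C_b(v_i) = \sum_{s,t} \dfrac{n^{i}_{st}}{g_{st}}$, where $g_{st}$ is the number of shortest paths between s and t and $n^{i}_{st}$ is the number of those paths that pass through $v_i$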
19
Betweenness Centrality
Fraction of shortest paths that include vertex
[table residue: pairwise shortest-path table for vertices A through G; surviving entries: 1,1; 2,4; 2,1; 3,4]
20
Betweenness Centrality
Fraction of shortest paths that include vertex
[table residue: pairwise shortest-path table for vertices A through G; surviving entries: 1,1; 2,4; 2,1; 3,4]
Example: Calculate betweenness centrality of vertex A
1 shortest path of 4 goes through A (this annotation appears three times on the slide)
$C_b(v_i) = \sum_{s,t} \dfrac{n^{i}_{st}}{g_{st}}$, giving $C_b(A) = 0.75$
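The "fraction of shortest paths" idea can be sketched by counting shortest paths with breadth-first search and summing, over every pair of other vertices, the share of their shortest paths that pass through the vertex of interest. The square graph and the function names are assumptions for illustration; the slide's own network is not recoverable. The sketch assumes an undirected, unweighted graph and counts each unordered pair once.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Return (dist, sigma): distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]  # every shortest path to u extends to w
    return dist, sigma


def betweenness(adj, v):
    """Sum over pairs s, t (both different from v) of n_st^v / g_st."""
    dist_v, sigma_v = bfs_counts(adj, v)
    total = 0.0
    others = [u for u in adj if u != v]
    for s, t in combinations(others, 2):
        dist_s, sigma_s = bfs_counts(adj, s)
        if t not in dist_s or v not in dist_s or t not in dist_v:
            continue  # unreachable pair contributes nothing
        if dist_s[v] + dist_v[t] == dist_s[t]:
            through_v = sigma_s[v] * sigma_v[t]  # paths s..v times paths v..t
            total += through_v / sigma_s[t]
    return total


# Hypothetical 4-cycle: opposite corners are joined by two shortest paths.
square = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
print(betweenness(square, "a"))
```

In the square, only the pair (b, d) can route through a, and one of its two shortest paths does, so a's score is one half.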
21
Normalizing Betweenness
Middle vertices should have the same $C_b$?
Since the number of paths a vertex COULD be on is (n-1)(n-2)/2, we can use this as our denominator:
$C'_b = \dfrac{C_b}{(n-1)(n-2)/2}$
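A short worked example of the normalization, using an assumed graph rather than the one on the slide: the hub of a star with n = 5 vertices. Each pair of the 4 leaves has exactly one shortest path, and it runs through the hub.

```latex
% Hypothetical example: hub of a star graph with n = 5 vertices (4 leaves).
% Raw betweenness counts every pair of leaves once.
C_b(\text{hub}) = \binom{4}{2} = 6
\qquad
C'_b(\text{hub}) = \frac{C_b(\text{hub})}{(n-1)(n-2)/2}
                 = \frac{6}{(4)(3)/2}
                 = \frac{6}{6} = 1
```

The hub lies on every path it could possibly lie on, so its normalized score is 1 regardless of how many leaves the star has.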
22
Calculate $C_b(F)$
[table residue: matrix for vertices A through G; surviving entries: 1, 4]
23
Vertex Centrality Comparison
Usually centrality metrics are positively correlated. When they are not, something interesting is going on.
High Degree, Low Closeness: Ego is embedded in a cluster that is far from the rest of the network
High Degree, Low Betweenness: Ego's connections are redundant; communication bypasses him/her
High Closeness, Low Degree: Ego is tied to important or active alters
High Closeness, Low Betweenness: There are probably multiple paths in the network; ego is near many people, but so are many others
High Betweenness, Low Degree: Ego's few ties are crucial for network flow
High Betweenness, Low Closeness: Very rare cell. Would mean that ego monopolizes the ties from a small number of people to many others.
24
Information Centrality
Betweenness only uses geodesic paths.
Information can also flow on longer paths.
Sometimes we hear it through the grapevine.
While betweenness focuses just on the geodesic, information centrality focuses on how information might flow through many different paths, weighted by strength of tie and distance. (Moody)
25
Information Centrality
Chapter 2 Resistance Distance, Information Centrality, Node Vulnerability and Vibrations in Complex Networks by Ernesto Estrada and Naomichi Hatano
26
Diagrams by J Moody, Duke U.
27
Eigenvector centrality
28
Consider this Example
The two red nodes have similar amounts of “local” centrality, but different amounts of “global” centrality.
29
Power/Eigenvector Centrality
Weakness of degree centrality: it counts your neighbors but not whether or not they count.
Basic idea: ego’s centrality is a function of its neighbors’ centrality.
C(ego) = f(C(ego’s neighbors))
30
Algorithm
Assume all vertices have centrality C = 1
Recalculate C by summing the C of each vertex's neighbors
Repeat the process
Each time we are “taking into account” the centralities of yet another “layer” of the vertices around us
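The steps above can be sketched as a loop that repeatedly replaces each vertex's score with the sum of its neighbors' scores. The four-vertex graph and the function names are assumptions for illustration. The `rescale` helper is an addition not stated in the slides; it is included only because the raw sums keep growing with every round.

```python
def iterate_centrality(adj, rounds):
    """Start every vertex at C = 1, then sum neighbors' C for the given rounds."""
    c = {v: 1 for v in adj}
    for _ in range(rounds):
        # Build the new scores from the previous round's scores only.
        c = {v: sum(c[w] for w in adj[v]) for v in adj}
    return c


def rescale(c):
    """Divide by the largest score so values stay comparable across rounds."""
    top = max(c.values())
    return {v: value / top for v, value in c.items()}


# Hypothetical graph: triangle A-B-C with a pendant vertex D attached to C.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

for rounds in (1, 2, 3):
    print(rounds, iterate_centrality(graph, rounds))
print(rescale(iterate_centrality(graph, 3)))
```

After one round the scores are just the degrees; later rounds fold in the scores of vertices two, three, and more steps away.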
31
[figure: network with every vertex labeled 1]
32
[figure: network with vertex values 2, 3, 5, 4]
33
[figure: network with vertex values 6, 7, 13, 10, 18, 9]
34
[figure: network with vertex values 15, 22, 33, 16, 20, 52, 36, 46, 25]
35
[figure: network with vertex values 40, 58, 126, 52, 72, 139, 94, 209, 92]
36
Consider the xy coordinate plane, where a line from (0,0) to (x,y) is the vector.
And consider the matrix [matrix lost in extraction].
What does this matrix “do” to the vector?
[figure: vector from (0,0) to (x,y) drawn on x and y axes]
37
Matrix Multiplication as Distortion
[figure: two matrix-vector products contrasted with “BUT”; entries not recoverable]
38
So, what is an Eigenvector?
39
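Eigenvector
The adjacency matrix redistributes vertex contents.
Some vector of contents is in equilibrium: multiplying it by the adjacency matrix changes its scale but not its proportions, $Ax = \lambda x$.
The entries of that vector are the eigenvector centralities.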
40
What is an Eigenvector? Consider a graph & its 5x5 adjacency matrix, A
41
And then consider a vector, x…
a 5x1 vector of values, one for each vertex in the graph. In this case, we've used the degree centrality of each vertex.
42
What happens when… …we multiply the vector x by the matrix A?
The result, of course, is another 5x1 vector.
43
A × x diffuses the vertex values
Look at the first element of the resulting vector.
The 1s in the A matrix "pick up" the values of each vertex to which the first vertex is connected.
The resulting value is the sum of the values of these vertices.
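A minimal sketch of the multiplication described here, in plain Python so that it needs no libraries. The 5x5 adjacency matrix is an assumption (the slide's matrix is not recoverable), and `matvec` is a hypothetical helper name. As on the slide, x starts as the degree of each vertex.

```python
# Hypothetical 5x5 adjacency matrix with edges 0-1, 0-2, 1-2, 2-3, 3-4.
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

# x holds one value per vertex; here, its degree (the row sum of A).
x = [sum(row) for row in A]


def matvec(matrix, vector):
    """Multiply a square matrix by a column vector, returning a list."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


y = matvec(A, x)

# First element, computed the way the slide describes it: the 1s in row 0
# pick up the values of the vertices that vertex 0 is connected to.
first = sum(x[j] for j, a in enumerate(A[0]) if a == 1)

print(x)
print(y)
print(first)
```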
44
Intuitiveness Visible on Rearrangement
45
Eigenvector vs. Power (Bonacich)
46
Centrality in Social Networks
Power / Eigenvalue
In recent work, Borgatti (2003; 2005) discusses centrality in terms of two key dimensions. Substantively, the key question for centrality is knowing what is flowing through the network. The key features are:
Whether the actor retains the good to pass to others (information, diseases) or passes the good and then loses it (physical objects)
Whether the key factor for spread is distance (disease with low $p_{ij}$) or multiple sources (information)
The off-the-shelf measures do not always match the social process of interest, so researchers need to be mindful of this.
47
What Can We Study with Centrality?
City systems
Illegal networks
Marketing targets
Opinion formation/spread
Epidemiology
48
Directed Networks: Centrality v. Prestige
An actor is considered central if her ties make her visible to others.
Visibility comes from direct ties AND from indirect ties through intermediaries.
Many social and economic phenomena, such as access to and control over resources and brokerage of information, involve centrality, since simply participating in interaction is what counts.
But the number of ties alone does not determine importance, so we distinguish a second type of visibility: prestige.
Prestige takes into account the direction of the tie.
Generally, more incoming ties mean more prestige.
And more incoming ties from higher-prestige vertices mean more prestige.
Based on Aliseya Wright HCC Spring 2005 Wednesday, April 13,
49
Google(PageRank): Overview
Pre-computes a rank vector
Provides a priori (offline) importance estimates for all pages on the Web
Independent of search query
In-degree prestige
Not all votes are worth the same
Prestige of a page is the sum of the prestige of citing pages: $p = E^{T} p$ (taking E[u,v] = 1 when page u links to page v)
Pre-compute query-independent prestige score
Query time: prestige scores used in conjunction with query-specific IR scores
Mining the Web, Chakrabarti and Ramakrishnan
50
Google(PageRank)
Assumption: the prestige of a page is proportional to the sum of the prestige scores of pages linking to it
Random surfer on a strongly connected web graph
E is the adjacency matrix of the Web
No parallel edges
Matrix L is derived from E by normalizing all row-sums to one: $L[u,v] = E[u,v] / \sum_{w} E[u,w]$
51
The PageRank
After the ith step: $p_i = L^{T} p_{i-1}$
Convergence to the stationary distribution of L: $p \to$ principal eigenvector of $L^{T}$, called the PageRank
Convergence criteria:
L is irreducible: there is a directed path from every node to every other node
L is aperiodic: for all u and v, there are paths with all possible numbers of links on them, except for a finite set of path lengths
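A minimal sketch of the iteration described here: row-normalize the link matrix, then repeatedly multiply the rank vector by the transpose until it stops changing. The three-page web, the function names, and the tolerance are assumptions for illustration. The example graph is strongly connected and aperiodic, so the convergence criteria on this slide hold. No damping or teleportation term is included, since the slide text does not mention one.

```python
def row_normalize(E):
    """Divide each row of E by its sum (assumes every page has an out-link)."""
    return [[e / sum(row) for e in row] for row in E]


def pagerank(E, tol=1e-12, max_steps=10_000):
    """Iterate p <- L^T p from a uniform start until the change is below tol."""
    L = row_normalize(E)
    n = len(E)
    p = [1.0 / n] * n
    for _ in range(max_steps):
        # Entry v of L^T p collects rank from every page u that links to v.
        new_p = [sum(L[u][v] * p[u] for u in range(n)) for v in range(n)]
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p


# Hypothetical web: page 0 links to 1 and 2, page 1 links to 2, page 2 links to 0.
E = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]

ranks = pagerank(E)
print([round(r, 4) for r in ranks])
```

Page 1 has a single in-link that carries only half of page 0's rank, so it ends up with the lowest score, while pages 0 and 2 pass rank back and forth and tie for the highest.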