Department of Computer and IT Engineering University of Kurdistan

Department of Computer and IT Engineering University of Kurdistan
Social Network Analysis Centrality By: Dr. Alireza Abdollahpouri

Power in social networks
Certain positions within the network give nodes more power Directly affect/influence others Control the flow of information Avoid control of others

Centrality in social networks
Centrality encodes the relationship between structure and power in groups Certain positions within the network give nodes more power or importance How do we measure importance? Who can directly affect/influence others? Highest degree nodes are “in the thick of it” Who controls information flow? Nodes that fall on shortest paths between others can disrupt the flow of information between them Who can quickly inform most others? Nodes who are close to other nodes can quickly get information to them

Measures and Metrics Knowing the structure of a network, we can calculate various useful quantities or measures that capture particular features of the network topology. basis of most of such measures are from social network analysis Degree distribution, Average path length, Density Centrality Degree, Eigenvector, Katz, PageRank, Hubs, Closeness, Betweenness, Several other graph metrics Clustering coefficient, Assortativity, Modularity, …

Characterizing networks: Who is most central?

Network centrality Which nodes are most ‘central’?
Definition of ‘central’ varies by context/purpose Local measure: degree Relative to rest of network: closeness, betweenness, eigenvector (Bonacich power centrality), Katz, PageRank, … How evenly is centrality distributed among nodes? Centralization, hubs and authorities, …

Centrality: who’s important based on their network position
In each of the following networks, X has higher centrality than Y according to a particular measure indegree outdegree betweenness closeness

He who has many friends is most important.
Degree centrality (undirected) He who has many friends is most important. When is the number of connections the best centrality measure? people who will do favors for you people you can talk to (influence set, information access, …) influence of an article in terms of citations (using in-degree)

divide by the max. possible, i.e. (N-1)
Degree: normalized degree centrality divide by the max. possible, i.e. (N-1)

Degree centrality The number of others a node is connected to
Node with high degree has high potential communication activity 1 2 3 4 5 node In-degree Out-degree Total degree 1 2 3 5 4 1 2 3 4 5

Extensions of undirected degree centrality - prestige
indegree centrality a paper that is cited by many others has high prestige a person nominated by many others for a reward has high prestige

How much variation is there in the centrality scores among the nodes?
Centralization: how equal are the nodes? How much variation is there in the centrality scores among the nodes? Freeman’s general formula for centralization: (can use other metrics, e.g. gini coefficient or standard deviation) maximum value in the network

Degree Centrality in Social Networks
Freeman: 0.0 Variance: 0.0 Freeman: 1.0 Variance: 3.9 Freeman: .02 Variance: .17 Freeman: .07 Variance: .20

Centralization= Σ(C*-Ci) / Max Σ(C*-Ci)
Degree centralization examples CD = 0.167 CD = 1.0 CD = 0.167 Centralization= Σ(C*-Ci) / Max Σ(C*-Ci)

Degree centralization examples
example financial trading networks high centralization: one node trading with many others low centralization: trades are more evenly distributed

When degree isn’t everything
In what ways does degree fail to capture centrality in the following graphs? ability to broker between groups likelihood that information originating anywhere in the network reaches you…

Betweenness: another centrality measure
intuition: how many pairs of individuals would have to go through you in order to reach one another in the minimum number of hops? who has higher betweenness, X or Y? Y X

Betweenness centrality
Number of shortest paths (geodesics) connecting all pairs of other nodes that pass through a given node Node with highest betweenness can potentially control or distort communication 1 2 3 4 5 12 123 124 1245 23 24 245 32 34 35 1 2 3 4 5 452 4523 45 52 523 524

Betweenness on toy networks
non-normalized version: A B C D E A lies between no two other vertices B lies between A and 3 other vertices: C, D, and E C lies between 4 pairs of vertices (A,D),(A,E),(B,D),(B,E) note that there are no alternate paths for these pairs to take, so C gets full credit

betweenness centrality: definition
paths between j and k that pass through i betweenness of vertex i all paths between j and k Where gjk = the number of geodesics connecting j-k, and gjk = the number that actor i is on. Usually normalized by: number of pairs of vertices excluding the vertex itself For directed graph: (N-1)*(N-2)

betweenness on toy networks
non-normalized version:

non-normalized version: broker

non-normalized version: why do C and D each have betweenness 1? They are both on shortest paths for pairs (A,E), and (B,E), and so must share credit: ½+½ = 1 Can you figure out why B has betweenness 3.5 while E has betweenness 0.5? C A E B D

Betweenness centrality
If you add lines from C to D and from D to H, you remove the high betweenness centrality of E and F

Centrality vs. Centralization
Centrality is a characteristic of an actor’s position in a network Centralization is a characteristic of a network Centralization indicates: - how unequal the distribution of centrality is in a network or - how much variance there is in the distribution of centrality in a network Centrality is a micro-level measure Centralization is a macro-level measure

Betweenness Centralization (examples)

Comparison Nodes are sized by degree, and colored by betweenness.
What about high degree but relatively low betweenness? Can you spot nodes with high betweenness but relatively low degree?

Extending betweenness centrality to directed networks
We now consider the fraction of all directed paths between any two vertices that pass through a node paths between j and k that pass through i betweenness of vertex i all paths between j and k Only modification: when normalizing, we have (N-1)*(N-2) instead of (N-1)*(N-2)/2, because we have twice as many ordered pairs as unordered pairs

Directed geodesics A node does not necessarily lie on a geodesic from j to k if it lies on a geodesic from k to j j k

closeness: another centrality measure
What if it’s not so important to have many direct friends? Or be “between” others But one still wants to be in the “middle” of things, not too far from the center

Closeness centrality: definition
Closeness is based on the length of the average shortest path between a vertex and all vertices in the graph Closeness Centrality: depends on inverse distance to other vertices Normalized Closeness Centrality

closeness centrality: toy example
B C D E

closeness centrality: more toy examples

Closeness Centrality (examples)
Distance Closeness normalized Distance Closeness normalized

Distance Closeness normalized
Closeness Centrality in Social Networks Distance Closeness normalized

Closeness Centrality in Social Networks
Distance Closeness normalized

how closely do degree and betweenness correspond to closeness?
number of connections denoted by size closeness length of shortest path to all others denoted by color

Closeness centrality Values tend to span a rather small dynamic range
typical distance increases logarithmically with network size In a typical network the closeness centrality C might span a factor of five or less It is difficult to distinguish between central and less central vertices a small change in network might considerably affect the centrality order Alternative computations exist but they have their own problems

Influence range The influence range of i is the set of vertices who are reachable from the node i

Extensions of undirected closeness centrality
closeness centrality usually implies all paths should lead to you paths should lead from you to everywhere else usually consider only vertices from which the node i in question can be reached

Eigenvector Centrality
Idea: A central actor is connected to other central actors A natural extension of the degree centrality For a given graph G:=(V,E) with |V| number of vertices let A be the adjacency matrix. The centrality score of vertex v can be defined as:

Eigenvector Centrality
Google's PageRank is a variant of the eigenvector centrality measure Node B is more popular in the network if we only extend our vision out to a distance of 1 from each node. But A is connected to nodes that are connected to many other nodes, while B is connected to less-popular nodes. A has a higher eigenvector centrality.

Degree vs. Eigenvector Centrality

Comparing centralization of different networks
Empirical study: Comparing centralization of different networks Comparison of centralization metrics across three networks: • butland ppi: binding interactions among 716 yeast proteins • addhealth9: friendships among 136 boys • tribes: positive relations among 12 NZ tribes

Empirical study: Comparing centralization of different networks The protein network looks visually centralized, but • most centralization is local; • globally, somewhat decentralized. The friendship network has small degree centrality (why?). The tribes network has one particularly central node.

Centrality in Social Networks
There are other options, usually based on generalizing some aspect of those above: Random Walk Betweenness (Mark Newman). Looks at the number of times you would expect node I to be on the path between k and j if information traveled a ‘random walk’ through the network. Peer Influence based measures (Friedkin and others). Based on the assumed network autocorrelation model of peer influence. In practice it’s a variant of the eigenvector centrality measures. Subgraph centrality. Counts the number of cliques of size 2, 3, 4, … n-1 that each node belongs to. Reduces to (another) function of the eigenvalues. Similar to influence & information centrality, but does distinguish some unique positions. Fragmentation centrality – Part of Borgatti’s Key Player idea, where nodes are central if they can easily break up a network. Moody & White’s Embeddedness measure is technically a group-level index, but captures the extent to which a given set of nodes are nested inside a network Removal Centrality – effect on the rest of the (graph for any given statistic) with the removal of a given node. Really gets at the system-contribution of a particular actor.

Questions

Department of Computer and IT Engineering University of Kurdistan

Similar presentations

Presentation on theme: "Department of Computer and IT Engineering University of Kurdistan"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Department of Computer and IT Engineering University of Kurdistan

Similar presentations

Presentation on theme: "Department of Computer and IT Engineering University of Kurdistan"— Presentation transcript:

Similar presentations

About project

Feedback