Social Network Analysis (1) LING 575 Fei Xia 01/04/2011
Basic idea
Build a graph
– A node represents a person
– A link represents the relation between two persons
– Question: define what kind of relation should be used
Process the graph to answer questions such as
– What is the structure of the graph?
– Who is a key player in the graph?
Let’s start with paper #4, (Diesner and Carley, 2005), “Exploration of Communication Networks from the Enron Email Corpus”
(Diesner and Carley, 2005)
Research questions:
– What are the structure and properties of the communication networks in Enron? How do these features relate to other networks?
– Who are key players or critical individuals in the system?
– How do structure and key players change over time?
Dataset
Start with the ISI database
– 252,759 emails from 151 people
Database refinement
– Add job position and job location info: there are 15 unique job titles (CEO, president, VP, etc.)
– Normalize email addresses: on average, each person has 1.9 addresses
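The address normalization step matters because the graph has one node per person, not one per address. Below is a minimal sketch of that idea, not the paper's actual procedure: the alias table `ALIASES`, the addresses, and the helper names `normalize` and `build_graph` are all hypothetical, and a link is assumed to mean "sent at least one email to".

```python
from collections import defaultdict

# Hypothetical alias table: every known address maps to one person ID.
ALIASES = {
    "ann.lee@example.com": "ann",
    "alee@example.com": "ann",
    "bob.ray@example.com": "bob",
    "cat.wu@example.com": "cat",
}

def normalize(address):
    """Map a raw address to a person ID, or None if the address is unknown."""
    return ALIASES.get(address.strip().lower())

def build_graph(messages):
    """Return {sender: {recipient: message_count}} over normalized person IDs."""
    graph = defaultdict(lambda: defaultdict(int))
    for sender, recipient in messages:
        s, r = normalize(sender), normalize(recipient)
        # Skip unknown people and messages a person sent to their own alias.
        if s is None or r is None or s == r:
            continue
        graph[s][r] += 1
    return {s: dict(targets) for s, targets in graph.items()}

# Toy (sender, recipient) pairs; invented for illustration only.
messages = [
    ("ann.lee@example.com", "bob.ray@example.com"),
    ("ALee@example.com", "bob.ray@example.com"),
    ("bob.ray@example.com", "cat.wu@example.com"),
    ("alee@example.com", "ann.lee@example.com"),
    ("cat.wu@example.com", "stranger@example.org"),
]
graph = build_graph(messages)
```

Both of Ann's addresses collapse into the single node `ann`, so her two messages to Bob become one weighted link rather than two separate nodes with one link each.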
Communication network
[Figure: communication network in Oct 2000 (160 agents) and in Oct 2001 (174 agents)]
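Degree centrality
Given a graph G = (V, E) with n vertices, for a vertex v:
– in-degree centrality: $C_{in}(v) = \frac{\deg_{in}(v)}{n-1}$
– out-degree centrality: $C_{out}(v) = \frac{\deg_{out}(v)}{n-1}$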
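A small sketch of degree centrality on a directed graph may help here. It assumes the common normalization by n-1 (the largest possible degree of a node), and the adjacency-dict representation, the toy graph, and the function name `degree_centrality` are illustrative choices rather than anything prescribed by the paper.

```python
def degree_centrality(graph):
    """Return (in_centrality, out_centrality) for a directed graph.

    graph maps each node to the set of nodes it links to; every node
    must appear as a key, even if it has no outgoing links.
    """
    n = len(graph)
    if n < 2:
        raise ValueError("need at least two nodes to normalize by n - 1")
    in_deg = {v: 0 for v in graph}
    for v, targets in graph.items():
        for t in targets:
            in_deg[t] += 1
    # Assumed normalization: divide by n - 1, the maximum possible degree.
    in_c = {v: in_deg[v] / (n - 1) for v in graph}
    out_c = {v: len(graph[v]) / (n - 1) for v in graph}
    return in_c, out_c

# Toy graph: a -> b, a -> c, b -> c, d -> c
toy = {"a": {"b", "c"}, "b": {"c"}, "c": set(), "d": {"c"}}
in_c, out_c = degree_centrality(toy)
```

In this toy graph `c` receives a link from every other node, so its in-degree centrality is 1.0, while `a` has the highest out-degree centrality at 2/3.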
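Closeness centrality
Loosely, closeness is the inverse of the average distance in the network between the node and all other nodes.
If every node is reachable from v: $C_C(v) = \frac{n-1}{\sum_{t \in V \setminus \{v\}} d(v,t)}$, where d(v,t) is the shortest-path distance from v to t.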
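As a rough illustration, closeness can be computed with a breadth-first search on an unweighted graph: the average distance from v to the other n-1 nodes is inverted, giving (n-1) divided by the sum of distances. The function names and the toy path graph below are assumptions for the example; returning `None` when some node is unreachable is one possible convention, since the definition above only covers the fully reachable case.

```python
from collections import deque

def bfs_distances(graph, source):
    """Shortest-path distances (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness(graph, v):
    """Inverse of the average distance from v to all other nodes.

    Returns None if some node is unreachable from v or the graph has
    fewer than two nodes (cases the basic definition does not cover).
    """
    n = len(graph)
    dist = bfs_distances(graph, v)
    if n < 2 or len(dist) < n:
        return None
    return (n - 1) / sum(dist.values())

# Toy undirected path a - b - c - d, stored with symmetric adjacency sets.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

On this path, `closeness(path, "b")` is 3 / (1 + 1 + 2) = 0.75 and `closeness(path, "a")` is 3 / (1 + 2 + 3) = 0.5, so the interior node is closer to everyone than the end node.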
Betweenness centrality
Loosely, across all node pairs, the percentage that have a shortest path passing through v.
sum = 0;
For each pair of vertices (s, t), with s and t different from v:
    compute all the shortest paths between s and t
    determine the fraction of those shortest paths that go through v
    sum += fraction;
betweenness = sum / X;
X is (n-1)(n-2)/2 for an undirected graph, and (n-1)(n-2) for a directed graph
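The pseudocode can be sketched directly for an undirected, unweighted graph. This is a brute-force illustration, not an efficient algorithm: it runs one breadth-first search per node to count shortest paths, then uses the fact that the number of shortest s-t paths through v is the number of shortest s-v paths times the number of shortest v-t paths whenever d(s,v) + d(v,t) = d(s,t). The function names and toy graphs are assumptions for the example.

```python
from collections import deque
from itertools import combinations

def count_shortest_paths(graph, source):
    """BFS returning (dist, sigma): hop distance and number of shortest paths."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to w.
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(graph, v):
    """Normalized betweenness of v in an undirected graph with n >= 3 nodes."""
    n = len(graph)
    info = {s: count_shortest_paths(graph, s) for s in graph}
    dist_v, sigma_v = info[v]
    total = 0.0
    others = [u for u in graph if u != v]
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s or v not in dist_s:
            continue  # s and t (or s and v) are not connected
        if dist_s[v] + dist_v[t] == dist_s[t]:
            total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    # X = (n-1)(n-2)/2: number of unordered pairs that exclude v.
    return total / ((n - 1) * (n - 2) / 2)

# Toy graphs: a star with center c, and a 4-cycle a - b - c - d - a.
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
```

In the star, every shortest path between two leaves passes through the center, so the center scores 1 and each leaf scores 0. In the 4-cycle, node `a` lies on one of the two shortest paths between `b` and `d` and on no others, giving 0.5 / 3. For large graphs a dedicated algorithm such as Brandes' would be the usual choice instead of this per-pair loop.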
Key players per centrality measures
Email exchange per month
Emails sent to positions