Published by Gary Paul. Modified over 9 years ago.
1
Advanced Topics in On-Line Social Networks Analysis
Social networks analysis seminar: introductory lecture
Danny Hendler, Ben-Gurion University
CS20225921, Advanced Topics in On-Line Social Networks Analysis
2
Seminar requirements
1. Select a paper and notify me by Thursday, November 5, 2015.
2. Study the paper well and prepare a good presentation.
3. Meet with me to receive feedback at least 1 week before your presentation.
4. Give the seminar talk.
5. Participate in at least 80% of the seminar talks.
Recommended reading:
- “Networks, Crowds, and Markets: Reasoning about a Highly Connected World”. Easley & Kleinberg, 2010. Available online.
- “Social Media Mining: An Introduction”. Zafarani, Abbasi & Liu, 2014. Available online.
- Papers list (to be published soon).
3
Seminar schedule
[Timeline figure. Events shown: Introductory lecture #1 (25/10/15); Introductory lecture #2, papers list published; students send their 3 preferences; first student talk; 10 weeks of student talks; semester ends. Other dates shown: 1/11/15, 3/11/15, 8/11/15.]
4
Talk outline
- Social networks
- Properties of social networks
  - Small-world phenomenon
  - Power-law distribution
  - Community structure
- Community detection
  - Newman & Girvan algorithm
  - Clique Percolation Method (CPM)
5
Social networks
What is a social network?
A network represented by a graph in which nodes represent actors and edges represent interactions or relationships.
6
Social networks: an example
7
Social networks: an example
Giant component
8
Social networks: an example
Some nodes are very active
9
Types of online social media
10
Top 20 USA websites
Rank  Site             Rank  Site
1     Google.com       11    Craigslist.com
2     Facebook.com     12    Netflix.com
3     Amazon.com       13    Live.com
4     Youtube.com      14    Bing.com
5     Yahoo.com        15    Linkedin.com
6     Wikipedia.org    16    Pinterest.com
7     Ebay.com         17    Espn.go.com
8     Twitter.com      18    Imgur.com
9     Go.com           19    Tumblr.com
10    Reddit.com       20    Chase.com
Source: Alexa report, October, 2015
11
Top 20 USA websites
25% social network sites
Source: Alexa report, October, 2015
12
Top 20 USA websites
25% social network sites
25% additional sites with social network aspects
Source: Alexa report, February, 2014
13
Knowledge we may gain: identifying romantic ties in Facebook
(*) Backstrom & Kleinberg. Romantic partnerships and the dispersion of social ties: a network analysis of relationship status on Facebook. CSCW 2014, pp. 831-841.
14
Knowledge we may gain: Web structure
(*) Broder et al. Graph structure in the Web. WWW 2000, pp. 309-320.
15
Knowledge we may gain: dynamics of viral marketing
(*) Leskovec et al. The dynamics of viral marketing. Transactions on the Web, 2007.
16
Knowledge we may gain: identify “key players”, collaborations
Paul Erdős, 1913-1996
“A mathematician is a machine for turning coffee into theorems”
18
Knowledge we may gain: identify “key players”, collaborations
Erdős number, Bacon number
Paul Erdős's Bacon number is 5:
- Paul Erdős and Ronald Graham appeared in N Is a Number: A Portrait of Paul Erdős.
- Ronald Graham and Merce Cunningham appeared in Great Genius and Profound Stupidity.
- Merce Cunningham and Dennis Hopper appeared in John Cage: The Revenge of the Dead Indians.
- Dennis Hopper and Chris Penn appeared in True Romance.
- Chris Penn and Kevin Bacon appeared in Footloose.
Source: wiki
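An Erdős or Bacon number is a shortest-path distance in a collaboration graph, so it can be computed with breadth-first search. The sketch below is illustrative only: the graph contains just the five co-appearances listed on the slide, and the helper names `build_graph` and `distance` are made up for this example.

```python
from collections import deque

# Co-appearance pairs copied from the chain on the slide (toy data, not a full film database).
CO_APPEARANCES = [
    ("Paul Erdős", "Ronald Graham"),
    ("Ronald Graham", "Merce Cunningham"),
    ("Merce Cunningham", "Dennis Hopper"),
    ("Dennis Hopper", "Chris Penn"),
    ("Chris Penn", "Kevin Bacon"),
]


def build_graph(edges):
    """Build an undirected adjacency-set graph from a list of pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distance(graph, source, target):
    """Return the number of edges on a shortest path, or None if unreachable."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return hops[node]
        for neighbour in graph.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return None


graph = build_graph(CO_APPEARANCES)
bacon_number = distance(graph, "Paul Erdős", "Kevin Bacon")
print(bacon_number)
```

On a real collaboration graph the same search would return the length of the shortest chain, which may be shorter than any particular chain found by hand.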
19
Properties of social networks
20
Milgram's small-world phenomenon experiment (1967)
Six degrees of separation: “I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet.” (*)
Milgram decided to check if this is the case…
(*) John Guare. Six Degrees of Separation: A Play. Vintage Books, 1990.
21
Milgram's experiment
- Budget: $680!!!
- A set of “starters” all try to forward a letter to a single “target” person
- Starters are notified of the target’s name, address and occupation
- Each person must forward the letter to someone known on a “first-name basis”
Image taken from Wiki.
22
Milgram's experiment: results
- 64 chains arrived
- Median length: 6
Source: “Networks, Crowds, and Markets”, D. Easley & J. Kleinberg. (Book is online)
23
A slightly more modern example (2008): Microsoft instant messenger shortest paths
24
Average path length in real-world networks
Source: “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu. (Book is online)
25
Properties of social networks
26
A matter of popularity…
As a function of $k$: what fraction of Web pages have $k$ in-links?
Approximately proportional to $1/k^{2.1}$ (*)
(*) Broder et al. Graph structure in the Web. WWW 2000, pp. 309-320.
27
The power-law distribution
A.k.a. long-tail distribution, scale-free distribution
[Figure: fraction of nodes (vertical axis) as a function of degree (horizontal axis)]
- Most nodes have low degrees
- Few nodes have extremely high degrees
28
Web pages in-degree: log-log scale
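Why a log-log scale helps: a short sketch, assuming the fraction of pages with $k$ in-links follows a pure power law $f(k) = a\,k^{-c}$ (the constant $a$ is introduced here only for illustration). Taking logarithms turns the curve into a straight line whose slope is $-c$:

```latex
f(k) = a\,k^{-c}
\quad\Longrightarrow\quad
\log f(k) = \log a - c \log k
```

With the Web in-degree exponent quoted earlier, $c \approx 2.1$, the line on the log-log plot would have slope of roughly $-2.1$.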
29
Some more examples
[Figures: friendship network in Flickr; friendship network in YouTube]
30
Why is popularity power-law?
31
A simple game…
Procedure for creating Web page $j \in \{1, 2, \ldots, N\}$:
Choose a page $i < j$ uniformly at random:
a. With probability $p$, create a link to page $i$
b. With probability $1-p$, create a link to the page pointed to by page $i$
As a function of $k$: what fraction of Web pages have $k$ in-links?
$\sim 1/k^{c}$, where $\lim_{p \to 0} c = 2$
32
Rich get richer…
Procedure for creating Web page $j \in \{1, 2, \ldots, N\}$:
Choose a page $i < j$ uniformly at random:
a. With probability $p$, create a link to page $i$
b. With probability $1-p$, create a link to the page pointed to by page $i$
This is equivalent to:
a. With probability $p$, choose a page $i < j$ uniformly at random and create a link to page $i$
b. With probability $1-p$, choose a page $i < j$ with probability proportional to $i$'s current number of in-links and create a link to $i$
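The copying procedure is easy to simulate, which makes the rich-get-richer effect visible in the in-degree counts. This is a minimal sketch under stated assumptions: pages are numbered from 0, each page creates exactly one link, and when the chosen page has no out-link (only page 0) the new page links to it directly. The function name `simulate_copying_model` and the parameter values are illustrative, not taken from the lecture.

```python
import random
from collections import Counter


def simulate_copying_model(n_pages, p, seed=0):
    """Return the in-degree of every page after running the copying procedure."""
    rng = random.Random(seed)
    target = [None] * n_pages      # target[j] is the page that page j links to
    in_degree = [0] * n_pages
    for j in range(1, n_pages):
        i = rng.randrange(j)       # choose an earlier page uniformly at random
        if rng.random() < p or target[i] is None:
            dest = i               # step (a); also the assumed fallback when i has no out-link
        else:
            dest = target[i]       # step (b): copy the link of page i
        target[j] = dest
        in_degree[dest] += 1
    return in_degree


copying = simulate_copying_model(5000, p=0.1)    # mostly copying
uniform = simulate_copying_model(5000, p=1.0)    # purely uniform choices

# Fraction of pages having k in-links, for the first few values of k.
histogram = Counter(copying)
for k in sorted(histogram)[:5]:
    print(k, histogram[k] / len(copying))
print("max in-degree with copying:", max(copying))
print("max in-degree without copying:", max(uniform))
```

Comparing the two maxima gives a quick feel for how copying concentrates links on a few pages; estimating the exponent itself would need a fit on the log-log histogram.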
33
The situation in random graphs
- Nodes are connected at random
- Node degrees follow a binomial distribution
- The probability of “very popular” nodes is practically 0
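To see how small "practically 0" is, one can evaluate the binomial tail directly. The sketch below assumes a random graph in which each of the other n-1 nodes is a neighbour independently with probability p; the values n = 1000, p = 0.01 and the threshold of 100 neighbours are arbitrary illustrative choices.

```python
import math


def binomial_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)


def degree_tail(n_nodes, p, k_min):
    """P(degree >= k_min) for a single node in the random graph described above."""
    trials = n_nodes - 1           # every other node is a potential neighbour
    return sum(binomial_pmf(trials, k, p) for k in range(k_min, trials + 1))


N_NODES = 1000
EDGE_PROB = 0.01
mean_degree = (N_NODES - 1) * EDGE_PROB
tail = degree_tail(N_NODES, EDGE_PROB, 100)
print("mean degree:", mean_degree)
print("P(degree >= 100):", tail)
```

A degree ten times the mean is already astronomically unlikely here, in contrast with the heavy tail of a power-law distribution.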
34
Communities (a.k.a. clusters/modules)
- Community structure: the organization of vertices in clusters, with “many” edges joining vertices of the same community and “relatively few” edges joining vertices of different communities
- Communities often represent sets of actors sharing similar properties or roles
35
Community detection
36
Why is community detection important?
- A community “summarizes” a group of actors and is relatively easy to visualize and understand
- A partition into communities reveals high-level domain structure
- It may reveal important properties without compromising individuals' privacy
37
Community detection applications
- Clustering web clients with geographical proximity and similar access patterns: cache server positioning [Krishnamurthy & Wang, SIGCOMM 2000]
- Clustering customers with similar interests: recommendation systems [Reddy et al., DNIS 2002]
- Analysing structural positions: identifying central actors and inter-community mediators
- Following political trends
- Detecting malicious actors (e.g. spammers)
- …
39
Community detection
40
“Edge-betweenness” based detection
- A divisive method (as opposed to agglomerative methods)
- Look for an edge that is most “between” pairs of nodes, i.e. responsible for connecting many pairs
- Remove the edge and recalculate
Newman and Girvan. Finding and evaluating community structure in networks, 2003
41
Shortest-path betweenness
- Compute all-pairs shortest paths
- For each edge, compute the number of such paths it belongs to
- Remove a maximum-weight edge
- Repeat until no edges remain (more on this later)
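The loop above can be sketched in a few lines of Python. This is an illustrative implementation, not the authors' code: it assumes an undirected, unweighted graph without self-loops, stored as a dict of neighbour sets, and it computes edge betweenness with a breadth-first accumulation that splits a pair's credit equally when several shortest paths exist. The example graph (two triangles joined by a bridge) and all function names are made up for this sketch.

```python
from collections import deque


def edge_betweenness(graph):
    """Shortest-path betweenness of every edge (ties between paths share credit equally)."""
    score = {frozenset((u, v)): 0.0 for u in graph for v in graph[u]}
    for source in graph:
        # Breadth-first search from source, counting shortest paths (sigma) to each node.
        dist, sigma, preds = {source: 0}, {source: 1.0}, {source: []}
        order, queue = [], deque([source])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in graph[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Walk back from the farthest nodes, pushing credit towards the source.
        delta = {u: 0.0 for u in order}
        for v in reversed(order):
            for u in preds[v]:
                share = sigma[u] / sigma[v] * (1.0 + delta[v])
                score[frozenset((u, v))] += share
                delta[u] += share
    # Each unordered pair of nodes was counted once from each endpoint.
    return {edge: value / 2.0 for edge, value in score.items()}


def components(graph):
    """Return the connected components as a list of node sets."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(graph[u] - comp)
        seen |= comp
        result.append(comp)
    return result


def girvan_newman(graph):
    """Yield (removed_edge, components) after each removal, until no edges remain."""
    graph = {u: set(vs) for u, vs in graph.items()}   # work on a copy
    while any(graph.values()):
        scores = edge_betweenness(graph)              # recalculated after every removal
        edge = max(scores, key=scores.get)
        u, v = tuple(edge)
        graph[u].discard(v)
        graph[v].discard(u)
        yield edge, components(graph)


# Two triangles, {0, 1, 2} and {3, 4, 5}, joined by the bridge 2-3.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
steps = list(girvan_newman(example))
print(steps[0])
```

In the example the bridge carries all nine paths between the two triangles, so it is removed first and the graph splits into the two triangles. Recomputing the scores inside the loop is what makes this the "recalculate" variant.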
42
Shortest-path betweenness: an example
[Figure sequence, slides 42-56: a graph on nodes 0-9 from which edges are removed one at a time. The betweenness values shown on successive slides are 24, 9, 3 and then 1. In the later slides the nodes are drawn as two separate groups, {6, 7, 8, 9} and {0, 1, 2, 3, 4, 5}.]
57
Shortest-path betweenness: an example
What if there are several shortest paths?
[Figure: a small example graph; numeric labels as extracted: 1 4 3 2 5 4 3 3 2.5]
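The rule used by Newman and Girvan is to give each of a pair's shortest paths an equal share, so that every pair of nodes contributes a total weight of one. A sketch of that rule, where $\sigma_{st}$ (the number of shortest paths between $s$ and $t$) and $\sigma_{st}(e)$ (the number of those passing through edge $e$) are notation introduced here:

```latex
B(e) = \sum_{s < t} \frac{\sigma_{st}(e)}{\sigma_{st}}
```

For instance, on a cycle of four nodes each edge receives 1 from its own two endpoints plus $1/2$ from each of the two opposite-corner pairs, giving $B(e) = 2$. This is how fractional betweenness values can arise.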
58
Dendrograms (hierarchical trees)
- A dendrogram (hierarchical tree) illustrates the output of hierarchical clustering algorithms
- Leaves represent graph nodes; the top (root) represents the original graph
- As we move down the tree, larger communities are partitioned into smaller ones
[Figure: dendrogram with leaves labelled 1 2 3 4 5 6 7 8 9 0]
59
Shortest-path betweenness: an example
[Figure sequence, slides 59-71: the edge-removal example on nodes 0-9, shown together with the dendrogram (leaves 1 2 3 4 5 6 7 8 9 0). The betweenness values shown on successive slides are 24, 9, 3 and then 1. In the later slides the nodes are drawn as two separate groups, {6, 7, 8, 9} and {0, 1, 2, 3, 4, 5}.]
72
Evaluation: computer-generated networks
- A large number of graphs with 128 nodes and 4 communities of 32 nodes each
- Probability $p_{in}$ for intra-community edges
- Probability $p_{ext}$ for inter-community edges
- Chosen such that the expected vertex degree is 16
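Benchmark graphs of this kind can be generated as follows. This is a sketch under assumptions: the split of the expected degree 16 into an intra-community part `z_in` and an inter-community part `z_total - z_in` is a parametrization chosen for illustration, and the function name `planted_partition` is hypothetical.

```python
import random


def planted_partition(n_communities=4, community_size=32, z_in=12.0, z_total=16.0, seed=0):
    """Return (edges, membership, p_in, p_ext) for a random graph with planted communities."""
    n = n_communities * community_size
    z_out = z_total - z_in
    # Each node has community_size - 1 possible intra-community neighbours
    # and n - community_size possible inter-community neighbours.
    p_in = z_in / (community_size - 1)
    p_ext = z_out / (n - community_size)
    rng = random.Random(seed)
    membership = [node // community_size for node in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            prob = p_in if membership[u] == membership[v] else p_ext
            if rng.random() < prob:
                edges.append((u, v))
    return edges, membership, p_in, p_ext


edges, membership, p_in, p_ext = planted_partition()
average_degree = 2 * len(edges) / len(membership)
print("p_in =", p_in, "p_ext =", p_ext)
print("average degree:", average_degree)
```

Lowering `z_in` while keeping the total fixed blurs the communities, which is how such benchmarks are typically used to probe when a detection algorithm starts to fail.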
73
Results (for 64-node networks)
$z_{in} = 6$, $z_{out} = 2$
74
Evaluation: The Zachary karate club
75
Results on Zachary club network
[Figure panels: shortest-path; shortest-path, no recalculation]
- The shortest-path 2-community partition missed just a single person!
- Recalculation of betweenness is essential
76
Quality functions
- Hierarchical clustering algorithms create numerous partitions
- In general, we do not know how many communities we should seek
- How will we know that our clustering is “good”?
- We need a quality function
77
The modularity quality function
Newman and Girvan. Finding and evaluating community structure in networks, 2003
- There are no communities in random graphs: all edges are equally probable
- Check how far the intra-community and inter-community edge densities are from those you would expect in a random graph with the same nodes and the same degree distribution
78
The modularity quality function
Clauset, Newman and Moore. Finding community structure in very large networks, 2004
$$Q = \frac{1}{2m} \sum_{v,w} \left[ A_{vw} - \frac{k_v k_w}{2m} \right] \delta(c_v, c_w)$$
- $Q$: modularity value
- $m$: number of edges
- $A$: graph adjacency matrix
- $k_v$, $k_w$: degrees of the node pair
- $k_v k_w / 2m$: probability of an edge between $v$ and $w$ if degrees are fixed and edges are placed at random
- $\delta(c_v, c_w)$: in-same-cluster indicator variable
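The modularity of a given partition can be computed directly from these ingredients. The sketch below assumes an undirected, unweighted graph stored as a dict of neighbour sets and a dict `community_of` mapping each node to a label; both names and the example graph (two triangles joined by one edge) are illustrative.

```python
def modularity(graph, community_of):
    """Q = (1/2m) * sum over node pairs in the same community of [A_vw - k_v * k_w / (2m)]."""
    two_m = sum(len(neighbours) for neighbours in graph.values())   # equals 2 * number of edges
    if two_m == 0:
        return 0.0
    total = 0.0
    for v in graph:
        for w in graph:
            if community_of[v] != community_of[w]:
                continue                       # the in-same-cluster indicator is 0
            a_vw = 1.0 if w in graph[v] else 0.0
            total += a_vw - len(graph[v]) * len(graph[w]) / two_m
    return total / two_m


# Two triangles, {0, 1, 2} and {3, 4, 5}, joined by the edge 2-3.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
all_together = {node: "a" for node in example}

q_triangles = modularity(example, by_triangle)
q_single = modularity(example, all_together)
print(q_triangles, q_single)
```

Splitting the example into its two triangles gives Q = 5/14, while putting every node in one community gives Q = 0, so the quality function prefers the intuitive partition. The double loop is quadratic in the number of nodes and is meant only to mirror the formula.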
79
Computer-generated networks: modularity
Modularity is maximized at the correct partition
80
Zachary club network: modularity
One of the two local maxima is at the correct partition
81
Community detection
82
Clique Percolation Method (CPM)
Input: a parameter k and a network
Procedure:
- Find all cliques of size k in the given network
- Construct a clique graph: two cliques are adjacent if they share k-1 nodes
- Each connected component in the clique graph forms a community
Slide based on “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu.
83
Clique Percolation Method: an example
Cliques of size 3: {1, 2, 3}, {3, 4, 5}, {4, 5, 7}, {4, 5, 6}, {4, 6, 7}, {5, 6, 7}, {6, 7, 8}, {8, 9, 10}
Communities: {1, 2, 3}, {8, 9, 10}, {3, 4, 5, 6, 7, 8}
Slide based on “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu.
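The three CPM steps can be sketched as below. Two assumptions are worth flagging: the figure is not part of this transcript, so the edge list is reconstructed as the union of the edges of the eight listed triangles, and cliques are found by brute-force enumeration of all k-node subsets, which is only practical for small graphs. The function name `clique_percolation` is illustrative.

```python
from itertools import combinations

# Assumed edge list: every edge of the eight size-3 cliques listed on the slide.
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (4, 6), (4, 7),
         (5, 6), (5, 7), (6, 7), (6, 8), (7, 8), (8, 9), (8, 10), (9, 10)]


def clique_percolation(graph, k):
    """Return (cliques, communities) found by the clique percolation procedure."""
    # Step 1: all k-node subsets whose members are pairwise adjacent.
    cliques = [frozenset(c) for c in combinations(sorted(graph), k)
               if all(v in graph[u] for u, v in combinations(c, 2))]
    # Step 2: clique graph, where two cliques are adjacent if they share k-1 nodes.
    adjacent = {c: set() for c in cliques}
    for a, b in combinations(cliques, 2):
        if len(a & b) == k - 1:
            adjacent[a].add(b)
            adjacent[b].add(a)
    # Step 3: each connected component of the clique graph is one community.
    communities, seen = [], set()
    for start in cliques:
        if start in seen:
            continue
        members, stack = set(), [start]
        seen.add(start)
        while stack:
            clique = stack.pop()
            members |= clique
            for other in adjacent[clique]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        communities.append(members)
    return cliques, communities


graph = {}
for u, v in EDGES:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

cliques, communities = clique_percolation(graph, 3)
print(sorted(sorted(c) for c in communities))
```

Nodes 3 and 8 each end up in two communities, because a node may belong to cliques in different components of the clique graph.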
84
Clique Percolation Method: an example
Communities: {1, 2, 3}, {8, 9, 10}, {3, 4, 5, 6, 7, 8}
Reveals overlapping community structure
Slide based on “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu.