1 Danny Hendler Advanced Topics in on-line Social Networks Analysis
Social networks analysis seminar Introductory lecture Danny Hendler, Ben-Gurion University CS , Advanced Topics in On-Line Social Networks Analysis

2 Seminar requirements
Select a paper and notify me by Tuesday, November 8, 2016
Study the paper well and prepare a good presentation
Meet with me to receive feedback before your talk
  At least 1 week before presentation
Give the seminar talk
Participate in at least 80% of seminar talks
Recommended reading:
  “Networks, crowds, and markets: reasoning about a highly connected world”. Easley & Kleinberg, Available online.
  “Social Media Mining: an Introduction”. Zafarani, Abbasi & Liu, Available online.
  Papers list (to be published soon).

3 Seminar schedule
3/11/16   Introductory lecture #1
8/11/16   Papers list published, paper assignment period starts
10/11/16  Introductory lecture #2
13/11/16  Paper assignment period ends
15/11/16  Papers assignment published
17/11/16  Student talks start
          10 more weeks of student talks
          Semester ends

4 Talk outline
Social network concepts
Properties of social networks
  Small-world phenomenon
  Power-law distribution
  Community structure
Community detection
  Newman & Girvan algorithm
  Clique Percolation Method (CPM)

5 Social networks
What is a social network?
A network, represented by a graph where nodes represent actors and edges represent interactions / relationships

6 Social networks: an example

7 Social networks: an example
Giant component

8 Social networks: an example
Some nodes are very active

9 Social networks: an example
Others less so

10 Types of online social media

11 Top 20 USA websites
 1  Google.com        11  Craigslist.com
 2  Facebook.com      12  Netflix.com
 3  Amazon.com        13  Live.com
 4  Youtube.com       14  Bing.com
 5  Yahoo.com         15  Linkedin.com
 6  Wikipedia.org     16  Pinterest.com
 7  Ebay.com          17  Espn.go.com
 8  Twitter.com       18  Imgur.com
 9  Go.com            19  Tumblr.com
10  Reddit.com        20  Chase.com
Source: Alexa report, October, 2015

12 Top 20 USA websites
(same list as on slide 11)
25% social network sites
Source: Alexa report, October, 2015

13 Top 20 USA websites
(same list as on slide 11)
25% social network sites
25% additional sites with social network aspects
Source: Alexa report, February, 2014

14 Knowledge we may gain: Identifying romantic ties in Facebook.
(*) Backstrom & Kleinberg. Romantic partnerships and the dispersion of social ties: a network analysis of relationship status on Facebook. CSCW 2014, pp

15 Knowledge we may gain: Web structure
(*) Broder et al. Graph structure in the Web. WWW 2000, pp

16 Knowledge we may gain: Dynamics of viral marketing.
(*) Leskovec et al. The dynamics of viral marketing. Transactions on the Web, 2007.

17 Knowledge we may gain: Identify “key players”, collaborations.
Paul Erdős, “A mathematician is a machine for turning coffee into theorems”

19 Knowledge we may gain: Identify “key players”, collaborations.
Bacon number, Erdős number
Paul Erdős's Bacon number is 5:
  Paul Erdős and Ronald Graham appeared in N Is a Number: A Portrait of Paul Erdős.
  Ronald Graham and Merce Cunningham appeared in Great Genius and Profound Stupidity.
  Merce Cunningham and Dennis Hopper appeared in John Cage: The Revenge of the Dead Indians.
  Dennis Hopper and Chris Penn appeared in True Romance.
  Chris Penn and Kevin Bacon appeared in Footloose.
Source: wiki
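A Bacon number is a shortest-path distance in a co-appearance graph, so it can be computed with breadth-first search. The sketch below is only an illustration: it builds a tiny graph from the five co-appearance pairs listed on the slide (the real graph is far larger), and the names `APPEARANCES`, `build_graph` and `distance` are introduced here, not taken from the lecture.

```python
from collections import deque

# Co-appearance pairs copied from the chain listed on the slide.
APPEARANCES = [
    ("Paul Erdős", "Ronald Graham"),
    ("Ronald Graham", "Merce Cunningham"),
    ("Merce Cunningham", "Dennis Hopper"),
    ("Dennis Hopper", "Chris Penn"),
    ("Chris Penn", "Kevin Bacon"),
]


def build_graph(pairs):
    """Turn a list of (a, b) pairs into an undirected adjacency map."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def distance(graph, source, target):
    """Number of edges on a shortest path, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return None


graph = build_graph(APPEARANCES)
bacon_number = distance(graph, "Paul Erdős", "Kevin Bacon")
print(bacon_number)  # 5 on this five-link chain
```

On a real co-appearance graph the same search would return the shortest chain, which may be shorter than any single chain one happens to know.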

20 Properties of social networks

21 Milgram's small world phenomenon experiment (1967)
Six degrees of separation: “I read somewhere that everybody on this planet is separated by only six other people. Six degrees of separation between us and everyone else on this planet.” (*)
Milgram decided to check if this is the case…
(*) John Guare. Six Degrees of Separation: A Play. Vintage Books, 1990.

22 Milgram's experiment
Budget: $680!!!
A set of “starters”, all try to forward a letter to a single “target” person
Starters notified of target’s name/address/occupation
Must forward letter to someone known on “first-name basis”
Image taken from Wiki.

23 Milgram's experiment: results
64 chains arrived
Median length: 6
Source: “Networks, crowds and Markets”, D. Easley & J. Kleinberg. (Book is online)

24 A slightly more modern example (2008): Microsoft instant messenger shortest paths

25 Average path-length in Real-World networks
Source: “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu. (Book is online)

26 Properties of social networks

27 A matter of popularity…
As a function of k: what fraction of Web pages have k in-links?
$\sim 1/k^{2.1}$ (*)
(*) Broder et al. Graph structure in the Web. WWW 2000, pp

28 The power law distribution
A.k.a. long tail distribution, scale-free distribution
Most nodes have low degrees
Few nodes do have extremely high degrees
[Figure axes: Degrees; Fraction of nodes]

29 Web pages in-degree: log-log scale
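A short note on why a log-log scale is the natural choice: if the fraction $f(k)$ of pages with $k$ in-links follows a power law, taking logarithms turns it into a straight line whose slope is minus the exponent. The symbols $f$, $a$ and $c$ are notation introduced here for illustration ($c \approx 2.1$ for the Web in-degree measurement quoted from Broder et al.).

```latex
f(k) = \frac{a}{k^{c}}
\quad\Longrightarrow\quad
\log f(k) = \log a - c \,\log k
```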

30 Some more examples
Friendship Network in Flickr
Friendship Network in YouTube

31 Why is popularity power-law?

32 A simple game…
Procedure for creating Web page $j \in \{1, 2, \ldots, N\}$:
Choose page i<j randomly & uniformly:
  With probability p, create a link to page i
  With probability 1-p, create a link to the page pointed to by page i
As a function of k: what fraction of Web pages have k in-links?
$\sim 1/k^{c}$, $\lim_{p \to 0} c = 2$
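The game is easy to simulate. The following is a minimal sketch, not the lecturer's code: the function names, the fixed seed, and the parameter values `n=20000`, `p=0.2` are all illustrative choices. The slide does not say how the first page behaves, so the sketch assumes page 1 creates no link and that "copying" from a page without a link falls back to linking to that page directly.

```python
import random
from collections import Counter


def copying_model(n, p, seed=0):
    """Simulate the link-copying game; return the in-degree of pages 1..n."""
    rng = random.Random(seed)
    target = [0] * (n + 1)  # target[j] = page that j links to (0 = no link)
    indeg = [0] * (n + 1)
    for j in range(2, n + 1):
        i = rng.randrange(1, j)  # uniform over the earlier pages 1..j-1
        if rng.random() < p or target[i] == 0:
            dest = i             # link to page i itself (also the assumed fallback)
        else:
            dest = target[i]     # copy: link to the page that i points to
        target[j] = dest
        indeg[dest] += 1
    return indeg[1:]


def indegree_fractions(indeg):
    """Map each in-degree k to the fraction of pages having exactly k in-links."""
    counts = Counter(indeg)
    n = len(indeg)
    return {k: c / n for k, c in sorted(counts.items())}


degrees = copying_model(n=20000, p=0.2)
fractions = indegree_fractions(degrees)
for k in (1, 2, 4, 8, 16):
    print(k, fractions.get(k, 0.0))
```

Plotting `fractions` on a log-log scale should show a roughly straight, heavy-tailed curve; how close the slope comes to the limiting exponent depends on `p` and on the number of pages.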

33 Rich get richer…
Procedure for creating Web page $j \in \{1, 2, \ldots, N\}$:
Choose page i<j randomly & uniformly:
  With probability p, create a link to page i
  With probability 1-p, create a link to the page pointed to by page i
=
With probability p, choose page i<j uniformly and create a link to page i
With probability 1-p, choose a page i<j with probability proportional to i's number of incoming links and create a link to i
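One way to see why the two descriptions coincide, assuming (as in the game) that every earlier page has created exactly one link and ignoring the boundary case of the very first page: in the copy step, page $j$ ends up linking to page $\ell$ exactly when the uniformly chosen page $i$ is one of the pages that point to $\ell$. The notation $d_\ell$ for the current in-degree of $\ell$ is introduced here.

```latex
\Pr[\text{copy step links to } \ell]
  = \frac{\#\{\, i < j : i \to \ell \,\}}{j-1}
  = \frac{d_\ell}{j-1}
  \;\propto\; d_\ell
```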

34 What behaves like power law?
Fraction of telephone numbers that receive k calls per day ($\sim a/k^{2}$)
Fraction of books bought by k people ($\sim a/k^{3}$)
Fraction of papers receiving k citations ($\sim a/k^{3}$)
Fraction of cities with population k
(*) Broder et al. Graph structure in the Web. WWW 2000, pp

35 The situation in random graphs
Nodes connected at random
Node degrees follow a binomial distribution
Probability of “very popular” nodes practically 0
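To make "practically 0" concrete, the binomial tail can be evaluated directly. This is a sketch under assumed numbers: a graph with `n = 1000` nodes where each pair is linked independently with probability `q = 0.01` (about 10 expected neighbours per node); both values and the function names are hypothetical.

```python
from math import comb


def binomial_degree_pmf(n, q, k):
    """P(degree = k) when each of the other n-1 nodes is linked independently with probability q."""
    m = n - 1
    return comb(m, k) * q**k * (1 - q) ** (m - k)


def binomial_tail(n, q, k):
    """P(degree >= k) for a single node."""
    return sum(binomial_degree_pmf(n, q, d) for d in range(k, n))


n, q = 1000, 0.01  # hypothetical graph: roughly 10 expected neighbours per node
for k in (10, 20, 40, 80):
    print(k, binomial_tail(n, q, k))
```

The tail collapses far faster than any power law: a node with eight times the average degree is essentially impossible here, whereas under a $1/k^{2}$-type law such nodes are rare but expected.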

36 Communities (a.k.a. clusters/modules)
Community structure: the organization of vertices in clusters, with “many” edges joining vertices of the same community and “relatively few” edges joining different communities
Communities often represent sets of actors sharing similar properties/roles.

37 Community detection

38 Why is community detection important?
A community “summarizes” a group of actors and is relatively easy to visualize/understand
Partitioning into communities reveals high-level domain structure
May reveal important properties without compromising individuals' privacy

39 Community detection applications
Clustering web clients with geographical proximity and similar access patterns → cache servers positioning [Krishnamurthy & Wang, SIGCOMM 2000]
Clustering customers with similar interests → recommendation systems [Reddy et al., DNIS 2002]
Analysing structural positions → identifying central actors and inter-community mediators
Follow political trends
Detect malicious actors (e.g. spammers)

41 Community detection

42 “Edge-betweenness” based detection
A divisive method (as opposed to agglomerative methods)
Look for an edge that is most “between” pairs of nodes
  Responsible for connecting many pairs
Remove edge and recalculate
Newman and Girvan. Finding and evaluating community structure in networks, 2003

43 Shortest-path betweenness
Compute all-pairs shortest paths
For each edge, compute the number of such paths it belongs to
Remove a maximum-weight edge
Repeat until no edges (more on this later)
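The steps above can be sketched as follows. This is an illustrative implementation, not the authors' code: betweenness is accumulated with a breadth-first search from every node (a Brandes-style accumulation, swapped in for a literal all-pairs path enumeration), and when a pair has several shortest paths each path receives an equal share of that pair's credit. The example graph (two triangles joined by one edge) and all function names are hypothetical.

```python
from collections import deque


def edge_betweenness(adj):
    """Shortest-path betweenness of each edge of an undirected, unweighted graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0.0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, pushing credit onto edges.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += credit
                delta[v] += credit
    # Every unordered pair was counted once from each endpoint.
    return {edge: c / 2.0 for edge, c in score.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def split_once(adj):
    """Remove maximum-betweenness edges, recalculating after every removal,
    until the graph falls apart into one more component."""
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Hypothetical example: triangles {1,2,3} and {4,5,6} joined by the edge 3-4.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
scores = edge_betweenness(example)
parts = split_once(example)
print(scores[frozenset((3, 4))], parts)
```

Calling `split_once` repeatedly on the resulting pieces produces the successive splits of the divisive hierarchy. Recomputing betweenness after every removal is the expensive part of the method.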

44-58 Shortest-path betweenness: an example
[Figure sequence: a graph on nodes 1-9, redrawn on each slide. Edge values shown: 24 (slide 45), 9 (slide 47), 3 (slide 48), 1 (slides 49-57).]

59 Shortest-path betweenness: an example
What if there are several shortest paths?
[Figure: example graph; labels shown on the slide: 1, 4, 3, 2, 5, 2.5]

60 Dendrograms (hierarchical trees)
A dendrogram (hierarchical tree) illustrates the output of hierarchical clustering algorithms
Leaves represent graph nodes, top represents original graph
As we move down the tree, larger communities are partitioned to smaller ones
[Figure: dendrogram with leaves 1-9]

61-73 Shortest-path betweenness: an example
[Figure sequence: a graph on nodes 1-9 shown together with a dendrogram with leaves 1-9. Edge values shown: 24 (slide 61), 9 (slide 62), 3 (slide 63), 1 (slides 64-72).]

74 Evaluation: computer-generated networks
Large number of graphs with 128 nodes and 4 communities of 32 nodes each
Probability $p_{in}$ for intra-community edges
Probability $p_{ext}$ for inter-community edges
Chosen such that expected vertex degree is 16
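A generator for such benchmark graphs might look like the sketch below. The function name and its parameters are hypothetical. `z_in` and `z_out` denote the expected number of neighbours a vertex has inside and outside its own community; the defaults 12 and 4 are an arbitrary split that sums to the expected degree of 16 stated on the slide.

```python
import random


def planted_partition(n_groups=4, group_size=32, z_in=12.0, z_out=4.0, seed=0):
    """Random graph with equally sized planted communities.

    Returns the community label of every vertex, the edge list, and the
    two edge probabilities derived from z_in and z_out.
    """
    n = n_groups * group_size
    p_in = z_in / (group_size - 1)    # 31 possible neighbours inside the community
    p_ext = z_out / (n - group_size)  # 96 possible neighbours outside it
    rng = random.Random(seed)
    community = [v // group_size for v in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if community[u] == community[v] else p_ext
            if rng.random() < p:
                edges.append((u, v))
    return community, edges, p_in, p_ext


community, edges, p_in, p_ext = planted_partition()
mean_degree = 2 * len(edges) / len(community)
print(p_in, p_ext, mean_degree)
```

Raising `z_out` at the expense of `z_in` blurs the planted communities and makes them harder for any detection method to recover.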

75 Results (for 64-nodes networks)
$z_{in}=6$, $z_{out}=2$

76 Evaluation: The Zachary karate club

77 Results on Zachary club network
The shortest-path 2-communities partition missed just a single person!
Re-calculation of betweenness is essential
[Figure panels: Shortest-path; Shortest-path no recalculation]

78 Quality functions
Hierarchical clustering algorithms create numerous partitions
In general, we do not know how many communities we should seek.
How will we know that our clustering is “good”?
We need a quality function

79 The modularity quality function
No communities in random graphs
  Equal probabilities for all edges
Check how far intra-community and inter-community densities are from those you would expect in a random graph with identical nodes and same degree-distribution
Newman and Girvan. Finding and evaluating community structure in networks, 2003

80 The modularity quality function
[Formula; its annotated parts:]
  Modularity value
  # edges
  Graph adjacency matrix
  Degrees of nodes-pair
  Probability of an edge if degrees are set and edges placed at random
  In-same-cluster indicator variable
Clauset, Newman and Moore. Finding community structure in very large networks, 2004
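The formula itself did not survive in the transcript. The labels match the standard definition of modularity given in the cited paper by Clauset, Newman and Moore, reproduced here on that assumption: $Q$ is the modularity value, $m$ the number of edges, $A$ the graph adjacency matrix, $k_i$ and $k_j$ the degrees of the node pair, $k_i k_j / 2m$ the probability of an edge if degrees are set and edges placed at random, and $\delta(c_i, c_j)$ the in-same-cluster indicator.

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```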

81 Computer-generated networks: modularity
Modularity maximized at correct partition
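To illustrate how modularity separates a good partition from a poor one, here is a small sketch that evaluates the standard modularity definition, $Q = \frac{1}{2m}\sum_{i,j}[A_{ij} - k_i k_j / 2m]\,\delta(c_i, c_j)$, on a toy graph. The graph (two triangles joined by one edge), the three candidate partitions and the function name are hypothetical examples, not data from the lecture.

```python
def modularity(edges, community):
    """Modularity of a partition of a simple undirected graph.

    edges is a list of (u, v) pairs; community maps each node to its label.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Number of edges whose two endpoints share a community.
    intra = sum(1 for u, v in edges if community[u] == community[v])
    # Total degree per community; its square sums k_i * k_j over same-community pairs.
    total = {}
    for node, k in degree.items():
        label = community[node]
        total[label] = total.get(label, 0) + k
    return intra / m - sum(t * t for t in total.values()) / (4.0 * m * m)


# Hypothetical graph: triangles {1,2,3} and {4,5,6} joined by the edge 3-4.
toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
by_triangle = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
lopsided = {1: "a", 2: "a", 3: "b", 4: "b", 5: "b", 6: "b"}
everything = {v: "a" for v in range(1, 7)}

q_good = modularity(toy_edges, by_triangle)
q_bad = modularity(toy_edges, lopsided)
q_all = modularity(toy_edges, everything)
print(q_good, q_bad, q_all)
```

The partition that follows the two triangles scores highest, and putting every node in a single community scores zero, which is the behaviour a quality function for choosing a level of the dendrogram needs.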

82 Zachary club network: modularity
One of two local maxima at correct partition

83 Community detection

84 Clique Percolation Method (CPM)
Input: A parameter k, and a network
Procedure:
  Find all cliques of size k in the given network
  Construct a clique graph: two cliques are adjacent if they share k-1 nodes
  Each connected component in the clique graph forms a community
Slide based on “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu.

85 Clique Percolation Method: an example
Cliques of size 3: {1, 2, 3}, {3, 4, 5}, {4, 5, 7}, {4, 5, 6}, {4, 6, 7}, {5, 6, 7}, {6, 7, 8}, {8, 9, 10}
Communities: {1, 2, 3}, {8, 9, 10}, {3, 4, 5, 6, 7, 8}
Slide based on “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu.
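The CPM procedure can be sketched in a few lines. The slide's figure is not in the transcript, so the edge list below is an assumption: it was chosen so that its triangles are exactly the eight cliques of size 3 listed above. Clique enumeration is done by brute force over all node subsets of size k, which is only reasonable for tiny graphs; all names are illustrative.

```python
from itertools import combinations


def k_cliques(adj, k):
    """All node sets of size k in which every pair is connected (brute force)."""
    return [frozenset(c) for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def clique_percolation(adj, k):
    """Communities as unions of k-cliques connected through shared (k-1)-node overlaps."""
    cliques = k_cliques(adj, k)
    # Clique graph: two k-cliques are adjacent if they share k-1 nodes.
    neighbours = {c: [] for c in cliques}
    for a, b in combinations(cliques, 2):
        if len(a & b) == k - 1:
            neighbours[a].append(b)
            neighbours[b].append(a)
    # Each connected component of the clique graph yields one community.
    communities, seen = [], set()
    for start in cliques:
        if start in seen:
            continue
        seen.add(start)
        members, stack = set(), [start]
        while stack:
            c = stack.pop()
            members |= c
            for nb in neighbours[c]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        communities.append(members)
    return communities


# Assumed edge list, reconstructed from the clique list (not given on the slide).
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5), (4, 6), (4, 7),
         (5, 6), (5, 7), (6, 7), (6, 8), (7, 8), (8, 9), (8, 10), (9, 10)]
adj = {}
for u, v in EDGES:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

found = clique_percolation(adj, 3)
print(sorted(sorted(c) for c in found))
```

Nodes 3 and 8 each end up in two communities, since communities are unions of cliques rather than a partition of the nodes.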

86 Clique Percolation Method: an example
Communities: {1, 2, 3}, {8, 9, 10}, {3, 4, 5, 6, 7, 8}
Reveals overlapping community structure
Slide based on “Social Media Mining, an Introduction”, R. Zafarani, M. A. Abbasi & H. Liu.

