Presentation is loading. Please wait.

Presentation is loading. Please wait.

M EASUREMENT AND A NALYSIS OF O NLINE S OCIAL N ETWORKS Professor : Dr Sheykh Esmaili Presenters: Pourya Aliabadi Boshra Ardallani Paria Rakhshani 1.

Similar presentations


Presentation on theme: "M EASUREMENT AND A NALYSIS OF O NLINE S OCIAL N ETWORKS Professor : Dr Sheykh Esmaili Presenters: Pourya Aliabadi Boshra Ardallani Paria Rakhshani 1."— Presentation transcript:

1 M EASUREMENT AND A NALYSIS OF O NLINE S OCIAL N ETWORKS Professor : Dr Sheykh Esmaili Presenters: Pourya Aliabadi Boshra Ardallani Paria Rakhshani 1

2 INTRODUCTION The Internet has spawned different types of information sharing systems, including the Web. MySpace (over 190 million users) Orkut (over 62 million) LinkedIn (over 11 million) LiveJournal (over 5.5 million) Unlike the Web, which is largely organized around content, online social networks are organized around users. 2

3 INTRODUCTION( CONT.) Users join a network, publish their profile and create links to any other users with whom they associate. The resulting social network provides a basis for maintaining social relationships, for finding users with similar interests, and for locating content. Understanding of the graph structure of online social networks is necessary to evaluate current systems, to design future online social network based systems. 3

4 INTRODUCTION( CONT.) Recent work has proposed the use of social networks to mitigate email spam, to improve Internet search, and to defend against Sybil attacks. We obtained our data by crawling publicly accessible information on these sites 4

5 INTRODUCTION( CONT.) This differs from content graphs like the graph formed by Web hyperlinks, where the popular pages (authorities) and the pages with many references (hubs) are distinct. We find that online social networks contain a large, strongly connected core of high-degree nodes, surrounded by many small clusters of low- degree nodes. Flow of information in these networks. 5

6 O NLINE SOCIAL NETWORKING SITES Online social networking sites. are usually run by individual corporations. Users. must register with a site, possibly under a pseudonym. Some sites allow browsing of public data without explicit. Links. The social network is composed of user accounts and links between users. Some sites (e.g. Flickr, LiveJournal) allow users to link to any other user, without consent from the link target. 6

7 O NLINE SOCIAL NETWORKING SITES ( CONT.) Groups. Most sites enable users to create and join special interest groups. Users can post messages to groups and upload shared content to the group. Certain groups are moderated. admission to such a group and postings to a group are controlled by a user designated as the group’s moderator. Other groups are unrestricted, allowing any member to join and post messages or content. 7

8 I S THE SOCIAL NETWORK USED IN LOCATING CONTENT ? Only Orkut is a “ pure ” social networking site, in the sense that the primary purpose of the site is finding and connecting to new users. Flickr, YouTube, and LiveJournal are used for sharing photographs, videos, and blogs, respectively. 8

9 W HY STUDY SOCIAL NETWORKS ? Are already at the heart of some very popular Web sites. Play an important role in future personal and commercial online interaction. Help us understand the impact of online social networks on the future Internet. We speculate how our data might be of interest to researchers in other disciplines. 9

10 S HARED INTEREST AND TRUST Adjacent users in a social network tend to trust each other. A number of research systems have been proposed to exploit this trust. Adjacent users in a social network also tend to have common interests. Users browse neighboring regions of their social network because they are likely to find content that is of interest to them. 10

11 I MPACT ON FUTURE I NTERNET Impact on future Internet Impact on other disciplines Sociologists can examine our data to test existing theories Studying the structure of online social networks may help improve the understanding of online campaigning and viral marketing. Political campaigns have realized the importance of blogs in elections. 11

12 H OW TO GET DATASETS ? Sites reluctant to give out data Cannot enumerate user list Performed crawls of user graph Crawled using cluster of 58 machines Used APIs where available Otherwise, used HTML screen scraping 12

13 C HALLENGES IN CRAWLING LARGE GRAPHS Need to crawl quickly Underlying social networks changing rapidly Need to crawl completely Social networks aren’t necessarily connected, some users have no links, or small clusters Need to estimate the crawl coverage 13

14 H OW TO VERIFY SAMPLES Obtain a random user sample Conduct a crawl using these random users as seeds See if these random nodes connect to the original WCC (weakly connected component) 14

15 D ATASET FROM F LICKR Used API to conduct the crawl Obtained random users by guessing usernames (########@N00) to evaluate coverage Covered 27% of user population, but remaining users have very few links 15

16 D ATASET FROM L IVE J OURNAL Used API to conduct the crawl Obtained random users using special URL http://www.livejournal.com/random.bml Crawl covered 95% of user population 16

17 D ATASET FROM O RKUT Used HTML screen-scraping to conduct the crawl 17

18 D ATASET FROM Y OU T UBE Used API to conduct the crawl Could not obtain random users Usernames user-specified strings Unable to estimate fraction of users covered 18

19 H IGH - LEVEL DATA CHARACTERISTICS Metrics vary by orders of magnitude However, networks share many key properties FlickrLiveJournalOrkutYouTube Number of Users1.8M5.3M3.1M1.2M Avg. friends per User12.217.01064.3 Number of User Groups0.1M7.5M8.7M30K Avg. Group per User4.621.21060.25 19

20 A NALYSIS OF NETWORK STRUCTURE Characterize the structural properties of the four network and compare them Link symetry Power-law node degrees Correlation of indegree and outdegree Path lengths and diameter Link degree correlations Densely connected core Tightly clustered fringe 20

21 H OW ARE THE LINKS DISTRIBUTED ? Distribution of indegree and outdegree is similar ●Underlying cause is link symmetry 21

22 L INK SYMMETRY YouTubeFlickrLiveJournalOrkut symmetry79.1%62.0%73.5%100.0% Possibly contributed by informing users of new incoming links Unlike other complex networks, such as the Web ●Sites like cnn.com receive much links more than they give makes it harder to identify reputable sources 22

23 P OWER - LAW NODE DEGREES All social networks show properties consistent with power-law networks. ●The majority of the nodes have small degree, and a few nodes have significantly higher degree 23

24 C ORRELATION OF INDEGREE AND OUTDEGREE outdegree vs. indegree in web outdegree vs. indegree in social networks 24 PWCNN OSN

25 P ATH LENGTHS AND DIAMETER all four networks have short path length from 4.25 – 5.88 networkAvg.path lendiameter web16.12905 flickr5.6727 livejournal5.8820 orkat4.259 youtube5.1021 25

26 L INK DEGREE CORRELATIONS Examine which users tend to connect to each other Focus on: Joint degree distribution How often nodes of different degrees connect to each other Scale free behavior A value calculated directly from the joint degree distribution of graph Assortativity A measure of the likelihood for nodes to connect to other nodes with similar degrees 26

27 27 J OINT DEGREE DISTRIBUTION AND S CALE - FREE BEHAVIOUR

28 D ENSELY CONNECTED CORE comprising of between 1% and 10% of the highest degree nodes removing 10% of core nodes results in breaking up graph into millions of very small SCCs graphs below show results as nodes are removed starting with highest-degree nodes (left) and path length as graph is constructed beginning with highest-degree nodes(right) 28 Sub logarithmic growth 28

29 C LUSTERING COEFFICIENT Clustering coefficient C is a metric of cliquishness Online social networks are tightly clustered 10,000 times more clustered than random graphs 5-50 times more clustered than random power-law graphs 29

30 T IGHTLY CLUSTERED FRINGE Low-degree users show high degree of clustering Social network graphs show stronger clustering 30

31 G ROUPS NetworkGroupsUsageAvg. SizeAvg. C Flickr103M21%820.47 LiveJournal7M61%150.81 Orkut9M13%370.52 YouTube30M8%100.34 31

32 W HAT DOES THE STRUCTURE LOOK LIKE o the networks contain a densely connected core of high-degree nodes; o and that this core links small groups of strongly clustered, low- degree nodes at the fringes of the network. octopus 32

33 C ONCLUSIONS Structure of OSNs is significantly different from the Web Higher degree symmetry in OSNs Much higher levels of local clustering in OSNs Privacy controls make graph crawling very difficult Pure social networks different from content sharing networks 33

34 Thanks 34


Download ppt "M EASUREMENT AND A NALYSIS OF O NLINE S OCIAL N ETWORKS Professor : Dr Sheykh Esmaili Presenters: Pourya Aliabadi Boshra Ardallani Paria Rakhshani 1."

Similar presentations


Ads by Google