Presentation is loading. Please wait.

Presentation is loading. Please wait.

Social Computing and Big Data Analytics 社群運算與大數據分析 1 1042SCBDA12 MIS MBA (M2226) (8628) Wed, 8,9, (15:10-17:00) (B309) Social Network Analysis ( 社會網絡分析.

Similar presentations


Presentation on theme: "Social Computing and Big Data Analytics 社群運算與大數據分析 1 1042SCBDA12 MIS MBA (M2226) (8628) Wed, 8,9, (15:10-17:00) (B309) Social Network Analysis ( 社會網絡分析."— Presentation transcript:

1 Social Computing and Big Data Analytics 社群運算與大數據分析 1 1042SCBDA12 MIS MBA (M2226) (8628) Wed, 8,9, (15:10-17:00) (B309) Social Network Analysis ( 社會網絡分析 ) Min-Yuh Day 戴敏育 Assistant Professor 專任助理教授 Dept. of Information Management, Tamkang University Dept. of Information ManagementTamkang University 淡江大學 淡江大學 資訊管理學系 資訊管理學系 http://mail. tku.edu.tw/myday/ 2016-05-18 Tamkang University

2 週次 (Week) 日期 (Date) 內容 (Subject/Topics) 1 2016/02/17 Course Orientation for Social Computing and Big Data Analytics ( 社群運算與大數據分析課程介紹 ) 2 2016/02/24 Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data ( 資料科學與大數據分析: 探索、分析、視覺化與呈現資料 ) 3 2016/03/02 Fundamental Big Data: MapReduce Paradigm, Hadoop and Spark Ecosystem ( 大數據基礎: MapReduce 典範、 Hadoop 與 Spark 生態系統 ) 課程大綱 (Syllabus) 2

3 週次 (Week) 日期 (Date) 內容 (Subject/Topics) 4 2016/03/09 Big Data Processing Platforms with SMACK: Spark, Mesos, Akka, Cassandra and Kafka ( 大數據處理平台 SMACK : Spark, Mesos, Akka, Cassandra, Kafka) 5 2016/03/16 Big Data Analytics with Numpy in Python (Python Numpy 大數據分析 ) 6 2016/03/23 Finance Big Data Analytics with Pandas in Python (Python Pandas 財務大數據分析 ) 7 2016/03/30 Text Mining Techniques and Natural Language Processing ( 文字探勘分析技術與自然語言處理 ) 8 2016/04/06 Off-campus study ( 教學行政觀摩日 ) 課程大綱 (Syllabus) 3

4 週次 (Week) 日期 (Date) 內容 (Subject/Topics) 9 2016/04/13 Social Media Marketing Analytics ( 社群媒體行銷分析 ) 10 2016/04/20 期中報告 (Midterm Project Report) 11 2016/04/27 Deep Learning with Theano and Keras in Python (Python Theano 和 Keras 深度學習 ) 12 2016/05/04 Deep Learning with Google TensorFlow (Google TensorFlow 深度學習 ) 13 2016/05/11 Sentiment Analysis on Social Media with Deep Learning ( 深度學習社群媒體情感分析 ) 課程大綱 (Syllabus) 4

5 週次 (Week) 日期 (Date) 內容 (Subject/Topics) 14 2016/05/18 Social Network Analysis ( 社會網絡分析 ) 15 2016/05/25 Measurements of Social Network ( 社會網絡量測 ) 16 2016/06/01 Tools of Social Network Analysis ( 社會網絡分析工具 ) 17 2016/06/08 Final Project Presentation I ( 期末報告 I) 18 2016/06/15 Final Project Presentation II ( 期末報告 II) 課程大綱 (Syllabus) 5

6 Social Computing 6

7 Social Network Analysis 7

8 Social Computing Social Network Analysis Link mining Community Detection Social Recommendation 8

9 Business Insights with Social Analytics 9

10 Analyzing the Social Web: Social Network Analysis 10

11 11 Source: http://www.amazon.com/Analyzing-Social-Web-Jennifer-Golbeck/dp/0124055311http://www.amazon.com/Analyzing-Social-Web-Jennifer-Golbeck/dp/0124055311 Jennifer Golbeck (2013), Analyzing the Social Web, Morgan Kaufmann

12 Social Network Analysis (SNA) Facebook TouchGraph 12

13 Social Network Analysis 13 Source: http://www.fmsasg.com/SocialNetworkAnalysis/http://www.fmsasg.com/SocialNetworkAnalysis/

14 Social Network Analysis A social network is a social structure of people, related (directly or indirectly) to each other through a common relation or interest Social network analysis (SNA) is the study of social networks to understand their structure and behavior 14 Source: (c) Jaideep Srivastava, srivasta@cs.umn.edu, Data Mining for Social Network Analysis

15 Centrality Prestige 15 Social Network Analysis (SNA)

16 Graph Theory 16

17 Graph 17 A A B B D D E E C C

18 Graph g = (V, E) 18

19 Vertex (Node) 19

20 Vertices (Nodes) 20

21 Edge 21

22 Edges 22

23 Arc 23

24 Arcs 24

25 Undirected Graph 25 Source: https://www.youtube.com/watch?v=89mxOdwPfxA A A B B D D E E C C

26 Directed Graph 26 Source: https://www.youtube.com/watch?v=89mxOdwPfxA A A B B D D E E C C

27 Degree 27 Source: https://www.youtube.com/watch?v=89mxOdwPfxA A A B B D D E E C C

28 Degree 28 Source: https://www.youtube.com/watch?v=89mxOdwPfxA A A B B D D E E C C A: 2 B: 4 C: 2 D:1 E: 1

29 Density 29 Source: https://www.youtube.com/watch?v=89mxOdwPfxA A A B B D D E E C C

30 Density 30 Source: https://www.youtube.com/watch?v=89mxOdwPfxA A A B B D D E E C C Edges (Links): 5 Total Possible Edges: 10 Density: 5/10 = 0.5

31 Density 31 Nodes (n): 10 Edges (Links): 13 Total Possible Edges: (n * (n-1)) / 2 = (10 * 9) / 2 = 45 Density: 13/45 = 0.29 A A B B D D C C E E F F G G H H I I J J

32 Which Node is Most Important? 32 A A B B D D C C E E F F G G H H I I J J

33 Centrality Important or prominent actors are those that are linked or involved with other actors extensively. A person with extensive contacts (links) or communications with many other people in the organization is considered more important than a person with relatively fewer contacts. The links can also be called ties. A central actor is one involved in many ties. 33 Source: Bing Liu (2011), “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data”

34 Social Network Analysis (SNA) Degree Centrality Betweenness Centrality Closeness Centrality 34

35 Degree Centrality 35

36 36 Social Network Analysis: Degree Centrality A A B B D D C C E E F F G G H H I I J J

37 37 Social Network Analysis: Degree Centrality A A B B D D C C E E F F G G H H I I J J NodeScore Standardized Score A2 2/10 = 0.2 B2 C5 5/10 = 0.5 D3 3/10 = 0.3 E3 F2 2/10 = 0.2 G4 4/10 = 0.4 H3 3/10 = 0.3 I1 1/10 = 0.1 J1

38 Betweenness Centrality 38

39 Betweenness centrality: Connectivity Number of shortest paths going through the actor 39

40 Betweenness Centrality 40 Where g jk = the number of shortest paths connecting jk g jk (i) = the number that actor i is on. Normalized Betweenness Centrality Number of pairs of vertices excluding the vertex itself Source: https://www.youtube.com/watch?v=RXohUeNCJiU

41 Betweenness Centrality 41 A A B B D D E E C C A: B  C: 0/1 = 0 B  D: 0/1 = 0 B  E: 0/1 = 0 C  D: 0/1 = 0 C  E: 0/1 = 0 D  E: 0/1 = 0 Total: 0 A: Betweenness Centrality = 0

42 Betweenness Centrality 42 A A B B D D E E C C B: A  C: 0/1 = 0 A  D: 1/1 = 1 A  E: 1/1 = 1 C  D: 1/1 = 1 C  E: 1/1 = 1 D  E: 1/1 = 1 Total: 5 B: Betweenness Centrality = 5

43 Betweenness Centrality 43 A A B B D D E E C C C: A  B: 0/1 = 0 A  D: 0/1 = 0 A  E: 0/1 = 0 B  D: 0/1 = 0 B  E: 0/1 = 0 D  E: 0/1 = 0 Total: 0 C: Betweenness Centrality = 0

44 Betweenness Centrality 44 A A B B D D E E C C A: 0 B: 5 C: 0 D: 0 E: 0

45 45 A A G G D D B B J J C C H H I I E E F F A A D D B B J J C C H H E E F F Which Node is Most Important?

46 46 A A G G D D B B J J C C H H I I E E F F A A G G D D J J H H I I E E F F Which Node is Most Important?

47 47 A A D D B B C C E E Betweenness Centrality

48 48 A A B B D D E E C C A: B  C: 0/1 = 0 B  D: 0/1 = 0 B  E: 0/1 = 0 C  D: 0/1 = 0 C  E: 0/1 = 0 D  E: 0/1 = 0 Total: 0 A: Betweenness Centrality = 0

49 Closeness Centrality 49

50 50 Social Network Analysis: Closeness Centrality A A B B D D C C E E F F G G H H I I J J C  A: 1 C  B: 1 C  D: 1 C  E: 1 C  F: 2 C  G: 1 C  H: 2 C  I: 3 C  J: 3 Total=15 C: Closeness Centrality = 15/9 = 1.67

51 51 Social Network Analysis: Closeness Centrality A A B B D D C C E E F F G G H H I I J J G  A: 2 G  B: 2 G  C: 1 G  D: 2 G  E: 1 G  F: 1 G  H: 1 G  I: 2 G  J: 2 Total=14 G: Closeness Centrality = 14/9 = 1.56

52 52 Social Network Analysis: Closeness Centrality A A B B D D C C E E F F G G H H I I J J H  A: 3 H  B: 3 H  C: 2 H  D: 2 H  E: 2 H  F: 2 H  G: 1 H  I: 1 H  J: 1 Total=17 H: Closeness Centrality = 17/9 = 1.89

53 53 Social Network Analysis: Closeness Centrality A A B B D D C C E E F F G G H H I I J J H: Closeness Centrality = 17/9 = 1.89 C: Closeness Centrality = 15/9 = 1.67 G: Closeness Centrality = 14/9 = 1.56 1 1 2 2 3 3

54 Eigenvector centrality: Importance of a node depends on the importance of its neighbors 54

55 55 where d(p j, p k ) is the geodesic distance (shortest paths) linking p j, p k Social Network Analysis: Closeness Centrality Sum of the reciprocal distances

56 56 Source: Abbasi, A., Hossain, L., & Leydesdorff, L. (2012). Betweenness centrality as a driver of preferential attachment in the evolution of research collaboration networks. Journal of Informetrics, 6(3), 403-412. where g ij is the geodesic distance (shortest paths) linking p i and p j and g ij (p k ) is the geodesic distance linking p i and p j that contains p k. Social Network Analysis: Betweenness Centrality

57 57 where a (p i, p k ) = 1 if and only if p i and p k are connected by a line 0 otherwise Social Network Analysis: Degree Centrality

58 Social Network Analysis: Hub and Authority 58 Source: http://www.fmsasg.com/SocialNetworkAnalysis/http://www.fmsasg.com/SocialNetworkAnalysis/ Hubs are entities that point to a relatively large number of authorities. They are essentially the mutually reinforcing analogues to authorities. Authorities point to high hubs. Hubs point to high authorities. You cannot have one without the other.

59 Application of SNA Social Network Analysis of Research Collaboration in Information Reuse and Integration 59 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

60 Example of SNA Data Source 60 Source: http://www.informatik.uni-trier.de/~ley/db/conf/iri/iri2010.htmlhttp://www.informatik.uni-trier.de/~ley/db/conf/iri/iri2010.html

61 Research Question RQ1: What are the scientific collaboration patterns in the IRI research community? RQ2: Who are the prominent researchers in the IRI community? 61 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

62 Methodology Developed a simple web focused crawler program to download literature information about all IRI papers published between 2003 and 2010 from IEEE Xplore and DBLP. – 767 paper – 1599 distinct author Developed a program to convert the list of coauthors into the format of a network file which can be readable by social network analysis software. UCINet and Pajek were used in this study for the social network analysis. 62 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

63 Top10 prolific authors (IRI 2003-2010) 1.Stuart Harvey Rubin 2.Taghi M. Khoshgoftaar 3.Shu-Ching Chen 4.Mei-Ling Shyu 5.Mohamed E. Fayad 6.Reda Alhajj 7.Du Zhang 8.Wen-Lian Hsu 9.Jason Van Hulse 10.Min-Yuh Day 63 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

64 Data Analysis and Discussion Closeness Centrality – Collaborated widely Betweenness Centrality – Collaborated diversely Degree Centrality – Collaborated frequently Visualization of Social Network Analysis – Insight into the structural characteristics of research collaboration networks 64 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

65 Top 20 authors with the highest closeness scores RankIDClosenessAuthor 130.024675Shu-Ching Chen 210.022830Stuart Harvey Rubin 340.022207Mei-Ling Shyu 460.020013Reda Alhajj 5610.019700Na Zhao 62600.018936Min Chen 71510.018230Gordon K. Lee 8190.017962Chengcui Zhang 910430.017962Isai Michel Lombera 1010270.017962Michael Armella 114430.017448James B. Law 121570.017082Keqi Zhang 132530.016731Shahid Hamid 1410380.016618Walter Z. Tang 159590.016285Chengjun Zhan 169570.016285Lin Luo 179560.016285Guo Chen 189550.016285Xin Huang 199430.016285Sneh Gulati 209600.016071Sheng-Tun Li 65 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

66 Top 20 authors with the highest betweeness scores RankIDBetweennessAuthor 110.000752Stuart Harvey Rubin 230.000741Shu-Ching Chen 320.000406Taghi M. Khoshgoftaar 4660.000385Xingquan Zhu 540.000376Mei-Ling Shyu 660.000296Reda Alhajj 7650.000256Xindong Wu 8190.000194Chengcui Zhang 9390.000185Wei Dai 10150.000107Narayan C. Debnath 11310.000094Qianhui Althea Liang 121510.000094Gordon K. Lee 1370.000085Du Zhang 14300.000072Baowen Xu 15410.000067Hongji Yang 162700.000060Zhiwei Xu 1750.000043Mohamed E. Fayad 181100.000042Abhijit S. Pandya 191060.000042Sam Hsu 2080.000042Wen-Lian Hsu 66 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

67 Top 20 authors with the highest degree scores RankIDDegreeAuthor 130.035044Shu-Ching Chen 210.034418Stuart Harvey Rubin 320.030663Taghi M. Khoshgoftaar 460.028786Reda Alhajj 580.028786Wen-Lian Hsu 6100.024406Min-Yuh Day 740.022528Mei-Ling Shyu 8170.021277Richard Tzong-Han Tsai 9140.017522Eduardo Santana de Almeida 10160.017522Roumen Kountchev 11400.016896Hong-Jie Dai 12150.015645Narayan C. Debnath 1390.015019Jason Van Hulse 14250.013767Roumiana Kountcheva 15280.013141Silvio Romero de Lemos Meira 16240.013141Vladimir Todorov 17230.013141Mariofanna G. Milanova 1850.013141Mohamed E. Fayad 19 0.012516Chengcui Zhang 20180.011890Waleed W. Smari 67 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

68 Visualization of IRI (IEEE IRI 2003-2010) co-authorship network (global view) 68 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

69 69 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

70 Visualization of Social Network Analysis 70 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration"

71 71 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration" Visualization of Social Network Analysis

72 72 Source: Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration" Visualization of Social Network Analysis

73 Social Network Analysis (SNA) Tools NetworkX igraph Gephi UCINet Pajek 73

74 NetworkX 74 https://networkx.github.io/

75 igraph 75 http://igraph.org/

76 Gephi The Open Graph Viz Platform 76 Source: https://gephi.org/

77 SNA Tool: UCINet 77 https://sites.google.com/site/ucinetsoftware/home

78 SNA Tool: Pajek 78 http://vlado.fmf.uni-lj.si/pub/networks/pajek/

79 SNA Tool: Pajek 79 http://pajek.imfm.si/doku.php

80 80 Source: http://vlado.fmf.uni-lj.si/pub/networks/doc/gd.01/Pajek9.pnghttp://vlado.fmf.uni-lj.si/pub/networks/doc/gd.01/Pajek9.png

81 81 Source: http://vlado.fmf.uni-lj.si/pub/networks/doc/gd.01/Pajek6.pnghttp://vlado.fmf.uni-lj.si/pub/networks/doc/gd.01/Pajek6.png

82 References Bing Liu (2011), “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data,” 2 nd Edition, Springer. http://www.cs.uic.edu/~liub/WebMiningBook.html http://www.cs.uic.edu/~liub/WebMiningBook.html Jennifer Golbeck (2013), Analyzing the Social Web, Morgan Kaufmann. http://analyzingthesocialweb.com/course-materials.shtmlhttp://analyzingthesocialweb.com/course-materials.shtml Sentinel Visualizer, http://www.fmsasg.com/SocialNetworkAnalysis/http://www.fmsasg.com/SocialNetworkAnalysis/ Min-Yuh Day, Sheng-Pao Shih, Weide Chang (2011), "Social Network Analysis of Research Collaboration in Information Reuse and Integration," The First International Workshop on Issues and Challenges in Social Computing (WICSOC 2011), August 2, 2011, in Proceedings of the IEEE International Conference on Information Reuse and Integration (IEEE IRI 2011), Las Vegas, Nevada, USA, August 3-5, 2011, pp. 551-556. 82


Download ppt "Social Computing and Big Data Analytics 社群運算與大數據分析 1 1042SCBDA12 MIS MBA (M2226) (8628) Wed, 8,9, (15:10-17:00) (B309) Social Network Analysis ( 社會網絡分析."

Similar presentations


Ads by Google