Tutorial 2: Using Matlab for network construction, ranking, clustering, topic modeling, and path finding Erjia Yan.


1 Tutorial 2: Using Matlab for network construction, ranking, clustering, topic modeling, and path finding Erjia Yan

2  Outline: Network construction; Ranking; Clustering; Topic modeling; Path finding

3  Outline: Network construction; Ranking; Clustering; Topic modeling; Path finding (current section: Network construction)

4  Bibliographical data

5
- The paper-to-paper citation network is the base.
- Web of Science cited-reference format: First Author, Year of Publication, Abbreviated Journal Name, Volume Number, Beginning Page Number
- Example: AANESTAD M, 2011, J STRATEGIC INF SYST, V20, P161
- All fields can be found in the “full record + cited references” download option.
- Some of the newer records may also have a DOI. For a better match, remove the DOI from the cited references.
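The DOI removal mentioned above could be done with a regular expression. This is a minimal sketch, assuming the DOI appears as a trailing ", DOI ..." field at the end of the cited reference; the DOI value shown is a made-up placeholder, not a real record.

```matlab
% Hypothetical cited reference with a trailing DOI field (placeholder DOI).
ref = 'AANESTAD M, 2011, J STRATEGIC INF SYST, V20, P161, DOI 10.1000/example';

% Drop everything from the ", DOI" field to the end of the string.
% Assumption: the DOI is always the last field of the reference.
clean = regexprep(ref, ',\s*DOI\s.*$', '', 'ignorecase');

disp(clean)
```

References without a DOI field pass through unchanged, so the same call can be applied to every cited reference.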

6
- For citing papers, extract these fields and format them into the Web of Science cited-reference format.
- Now the citing papers and the cited references have the same format.
- Using these two fields, construct an internal citation network that contains only those cited references that are themselves citing papers in the data set.
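Formatting a citing paper into the cited-reference format might look like the sketch below. The variable names are illustrative assumptions, and real records may need extra handling (for example, missing volume or page numbers).

```matlab
% Fields of one citing paper (values taken from the example reference).
author  = 'Aanestad M';
year    = 2011;
journal = 'J Strategic Inf Syst';
volume  = 20;
page    = 161;

% Build "First Author, Year, Abbreviated Journal, Vxx, Pxx" in upper case
% so that it can be compared directly with cited-reference strings.
key = upper(sprintf('%s, %d, %s, V%d, P%d', author, year, journal, volume, page));

disp(key)
```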

7
- If you can write an app for this, it would be great!
- Otherwise, you can follow these instructions:
- Convert
      CP1    CR1; CR2; CR3
  into
      CP1    CR1
      CP1    CR2
      CP1    CR3
- Use Access to construct the network
- Have a table for citing papers
- Import the converted citation pairs into Access
- Use a query to extract those pairs whose papers are in the table
- Now you have the node info and link info
- Import both into Matlab
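As one possible alternative to the Access route, the conversion and the filtering query can be sketched directly in Matlab. The labels CP1..CP3 and CR1, CR2, CR5, CR9 are hypothetical, and the sketch assumes cited references are separated by semicolons.

```matlab
% One entry per citing paper, with its cited references (hypothetical labels).
citing = {'CP1', 'CP2', 'CP3'};
cited  = {'CR1; CR2', 'CP1; CR5', 'CP1; CP2; CR9'};

% Expand "CP  CR1; CR2; ..." into one (citing, cited) pair per reference.
src = {};
dst = {};
for i = 1:numel(citing)
    refs = strtrim(strsplit(cited{i}, ';'));
    for r = 1:numel(refs)
        src{end+1} = citing{i};
        dst{end+1} = refs{r};
    end
end

% Keep only pairs whose cited reference is itself a citing paper.
keep = ismember(dst, citing);
src  = src(keep);
dst  = dst(keep);

% Map labels to node indices and build the citation matrix:
% C(i,j) = 1 means paper i cites paper j.
[~, si] = ismember(src, citing);
[~, di] = ismember(dst, citing);
C = sparse(si, di, 1, numel(citing), numel(citing));

disp(full(C))
```

Here `citing` plays the role of the node table and the filtered pairs play the role of the link table.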

8
- We now have a paper-to-paper citation network. To construct, for instance, an author-to-author citation network or an author co-citation network, we need adjacency matrices.
- In the author-paper adjacency matrix, rows are papers and columns are authors: a cell value of 1 at (i,j) indicates that paper i is written by author j.
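One common way to combine the two matrices is by matrix multiplication. The sketch below uses a toy data set of 4 papers and 3 authors; the formulas are a standard construction and are an assumption here, not necessarily the exact procedure used in the tutorial.

```matlab
% C(i,j) = 1 means paper i cites paper j (hypothetical 4-paper data set).
C = [0 0 0 0;
     0 0 0 0;
     1 1 0 0;
     0 1 0 0];

% A(i,j) = 1 means paper i is written by author j (3 authors).
A = [1 0 0;
     0 1 0;
     0 0 1;
     1 0 1];

% Author-to-author citation counts: AC(a,b) is the number of
% (citing paper by a, cited paper by b) pairs.
AC = A' * C * A;

% Author co-citation: B(p,b) = 1 if paper p cites at least one paper by b;
% CoC(a,b) counts the papers that cite both a and b.
B   = double(C * A > 0);
CoC = B' * B;
CoC = CoC - diag(diag(CoC));   % drop self co-citation on the diagonal

disp(AC)
disp(CoC)
```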

9
- Convert
      ID1    AU1; AU2; AU3
  into
      ID1    AU1
      ID1    AU2
      ID1    AU3
- Add the paper ID list (ID1, ID2, …, IDn) to the beginning of the file
- Use Txt2Pajek on the linkage file
- Import the edge section of the .net file into Matlab
- Select M(1:n,n+1:m), where n is the number of papers and m is the column size of M. The selection is our author-paper adjacency matrix.
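The selection step can be illustrated with a small edge list. This sketch assumes the imported edge section is a two-column list of node numbers in which papers occupy numbers 1..n and authors occupy n+1..m; the numbers themselves are made up.

```matlab
% Hypothetical edge list: [paper node, author node].
% Papers are nodes 1..3, authors are nodes 4..6.
edges = [1 4;
         1 5;
         2 5;
         3 6];
n = 3;   % number of papers
m = 6;   % total number of nodes (column size of M)

% Full node-by-node matrix built from the edge list.
M = sparse(edges(:,1), edges(:,2), 1, m, m);

% Rows 1..n are papers, columns n+1..m are authors.
A = M(1:n, n+1:m);

disp(full(A))
```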

10

11

12

13  Outline: Network construction; Ranking; Clustering; Topic modeling; Path finding (current section: Ranking)

14
- By David Gleich of Purdue University
- http://www.mathworks.com/matlabcentral/fileexchange/11613-pagerank
- pagerank(M,options)
- options.c: the teleportation coefficient [double | {0.85}]
- options.v: the personalization vector [vector | {uniform: 1/n}]
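To show what the two options control, here is a power-iteration sketch of PageRank in base Matlab. It is not the File Exchange implementation; the toy graph, the convention that M(i,j) = 1 means node i links to node j, and the handling of dangling nodes are all assumptions of this sketch.

```matlab
% Toy directed graph (hypothetical): 1->2, 2->3, 3->1, 3->2.
M = sparse([1 2 3 3], [2 3 1 2], 1, 3, 3);
n = size(M, 1);

c = 0.85;               % teleportation coefficient (role of options.c)
v = ones(n, 1) / n;     % uniform personalization vector (role of options.v)

% Row-normalize M to get transition probabilities.
outdeg   = full(sum(M, 2));
dangling = (outdeg == 0);              % nodes without outgoing links
invdeg   = zeros(n, 1);
invdeg(~dangling) = 1 ./ outdeg(~dangling);
P = spdiags(invdeg, 0, n, n) * M;

% Power iteration: follow links with probability c; teleportation and the
% mass sitting on dangling nodes are redistributed according to v.
x = v;
for iter = 1:1000
    xnew = c * (P' * x) + (c * sum(x(dangling)) + (1 - c)) * v;
    done = norm(xnew - x, 1) < 1e-10;
    x = xnew;
    if done
        break;
    end
end

disp(x')
```

A non-uniform `v` biases the ranking toward chosen nodes, which is what a personalization vector is for.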

15  Outline: Network construction; Ranking; Clustering; Topic modeling; Path finding (current section: Clustering)

16
- K-means
- IDX = kmeans(X,k)
- http://www.mathworks.com/help/stats/kmeans.html
- Hierarchical clustering
- http://www.mathworks.com/help/stats/hierarchical-clustering.html
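A small usage sketch for both methods follows, assuming the Statistics Toolbox is installed. The data points, the choice of average linkage, and the use of replicates are illustrative assumptions.

```matlab
rng(0);   % fix the random seed so the toy run is repeatable

% Six points forming two well-separated groups (hypothetical data).
X = [0 0; 0.1 0; 0 0.1; 5 5; 5.1 5; 5 5.1];
k = 2;

% K-means: IDX(i) is the cluster index of row i of X.
IDX = kmeans(X, k, 'Replicates', 5);

% Hierarchical clustering: build the linkage tree, then cut it into k clusters.
Z = linkage(X, 'average');
T = cluster(Z, 'maxclust', k);

disp([IDX T])
```

The cluster labels themselves are arbitrary; only the grouping of rows is meaningful, so the two methods can agree while using different numbers.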

17
- By MIT Strategic Engineering
- http://strategic.mit.edu/downloads.php?page=matlab_networks
- [modules,module_hist,Q] = newmangirvan(adj,k)
- [groups_hist,Q]=newman_comm_fast(adj)
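Both functions report a modularity value Q. As a reference point, the sketch below computes Newman's modularity for a given partition of an undirected graph in base Matlab. It is independent of the MIT toolbox code, and the graph and partition are made up for illustration.

```matlab
% Two triangles joined by one edge (hypothetical undirected graph).
edges = [1 2; 1 3; 2 3; 4 5; 4 6; 5 6; 3 4];
adj = zeros(6);
for e = 1:size(edges, 1)
    adj(edges(e,1), edges(e,2)) = 1;
    adj(edges(e,2), edges(e,1)) = 1;
end

% Candidate partition: one community per triangle.
groups = [1 1 1 2 2 2]';

% Q = (1/2m) * sum_ij (A_ij - k_i*k_j/2m) * delta(g_i, g_j)
k    = sum(adj, 2);                       % node degrees
twoM = sum(k);                            % twice the number of edges
same = bsxfun(@eq, groups, groups');      % delta(g_i, g_j)
Q    = sum(sum((adj - (k * k') / twoM) .* same)) / twoM;

disp(Q)
```

For this graph Q equals 5/14, about 0.357; higher values indicate a partition with more within-community edges than expected by chance.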

18
- By Nees van Eck and Ludo Waltman of Leiden University
- http://www.vosviewer.com/relatedsoftware/
- A variant of the modularity-based clustering technique
- [X, cluster_size, V] = VOS_clustering(A, P)

19  Outline: Network construction; Ranking; Clustering; Topic modeling; Path finding (current section: Topic modeling)

20
- By Mark Steyvers of University of California Irvine
- http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm
- Input: a bag-of-words representation containing the number of times each word occurs in each document.
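A bag-of-words count matrix can be built in base Matlab as sketched below. The two toy documents are made up, and the exact input layout expected by the toolbox is not given here, so treat the word-by-document matrix as one plausible intermediate form rather than the toolbox's required format.

```matlab
% Two toy documents (hypothetical), already lower-cased and space-separated.
docs = {'network ranking network', 'topic model topic topic'};

% Collect every token together with the index of its document.
tokens = {};
docIdx = [];
for d = 1:numel(docs)
    w = strsplit(docs{d}, ' ');
    tokens = [tokens, w];
    docIdx = [docIdx, d * ones(1, numel(w))];
end

% Map tokens to vocabulary indices (vocab is sorted alphabetically).
[vocab, ~, wordIdx] = unique(tokens);

% counts(w,d) = number of times word w occurs in document d.
counts = accumarray([wordIdx(:), docIdx(:)], 1, [numel(vocab), numel(docs)]);

disp(vocab)
disp(counts)
```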

21  Outline: Network construction; Ranking; Clustering; Topic modeling; Path finding (current section: Path finding)

22
- http://www.mathworks.com/help/bioinfo/ref/graphshortestpath.html
- [dist, path, pred]=graphshortestpath(G,S,T)
- Finds the shortest path from S to T in graph G
- [dist] = graphallshortestpaths(G)
- Finds all shortest paths in graph G; dist is a matrix of shortest-path distances between each pair of nodes
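A usage sketch for the two calls follows, assuming a Matlab release in which these Bioinformatics Toolbox functions are available. The weighted graph is a toy example, and G is given as a sparse matrix whose nonzero entry (i,j) is the weight of the directed edge from i to j.

```matlab
% Toy directed graph (hypothetical): 1->2 (1), 2->3 (1), 1->3 (5), 3->4 (1).
G = sparse([1 2 1 3], [2 3 3 4], [1 1 5 1], 4, 4);

% Shortest path from node 1 to node 4.
[dist, path, pred] = graphshortestpath(G, 1, 4);

% Distance matrix for all pairs; unreachable pairs are Inf.
D = graphallshortestpaths(G);

disp(dist)
disp(path)
disp(D)
```

The direct edge 1->3 costs 5, so the shortest route to node 4 goes through node 2 with a total distance of 3.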

