Published by Chloe Cleopatra Ross. Modified over 9 years ago.
1
Computing & Information Sciences Kansas State University 2015-10-25 Laboratory for Knowledge Discovery in Databases PhD Research Proficiency Exam Jing Xia Laboratory for Knowledge Discovery in Databases Department of Computing and Information Sciences Kansas State University http://www.kddresearch.org http://www.cis.ksu.edu/~xiajing Social Network Analysis using Link Mining
2
Outline
  Social Network Introduction
  Networks in Biological Systems
  Mining on Social Network
  Link Mining
  Multi-Relational Mining
  Problem Specification
  Proposed Approach
3
Social Network Introduction
What is a social network?
  A social network is a heterogeneous and multi-relational data set represented by a graph.
Characteristics of social networks
  “Natural” networks and universality
  Quantitative measures
Mining social networks
  Link mining: tasks and challenges
4
Society
Nodes: individuals
Links: social relationships (family, work, friendship, etc.)
S. Milgram (1967), Six Degrees of Separation: this “natural” network appears to be universal.
Social networks: many individuals with diverse social interactions between them.
5
Communication
The Earth is developing an electronic system, a network whose diverse nodes and links are:
-computers
-routers
-satellites
-phone lines
-TV cables
-EM waves
Communication networks: many non-identical components with diverse connections between them.
6
Epidemiology
Nodes: doctors, patients, geographical locations
Links: contact relationships (direct or indirect infectiousness)
7
Characteristics of Social Network
Consider many kinds of networks: social, technological, business, economic, content, ...
These networks tend to share certain informal properties:
  multi-relational interaction
  temporal (time-evolving)
  large scale; continual growth
  distributed, organic growth: vertices “decide” whom to link to
  a mixture of local and long-distance connections
  abstract notions of distance: geographical, content, social, ...
8
Social Network Theory
Do natural networks share more quantitative universals?
What would these “universals” be?
How can we make them precise and measure them?
How can we explain their universality?
This is the domain of social network theory, sometimes also referred to as link analysis.
9
Quantitative Measure
Connected components: how many, and how large?
Network diameter:
  maximum (worst-case) or average?
  exclude infinite distances (disconnected components)?
  the small-world phenomenon
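The two measures on this slide can be made concrete with a small sketch. It assumes an undirected, unweighted graph stored as a dict of neighbour sets; the toy graph and all function names are hypothetical and do not come from the presentation. Infinite distances between disconnected components are excluded, which is one of the options the slide raises.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def connected_components(adj):
    """Return the connected components as a list of node sets."""
    seen, components = set(), []
    for node in adj:
        if node not in seen:
            component = set(bfs_distances(adj, node))
            seen |= component
            components.append(component)
    return components


def diameter_stats(adj):
    """Maximum and average finite shortest-path length over ordered node pairs.

    Pairs in different components (infinite distance) are skipped.
    """
    longest, total, pairs = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                longest = max(longest, d)
                total += d
                pairs += 1
    return longest, (total / pairs if pairs else 0.0)


# Hypothetical toy graph: a path a-b-c-d plus a separate edge e-f.
toy = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"},
    "e": {"f"}, "f": {"e"},
}
components = connected_components(toy)
max_distance, avg_distance = diameter_stats(toy)
print(sorted(len(c) for c in components), max_distance, round(avg_distance, 3))
```

Running one breadth-first search per node is quadratic in the number of nodes, so on large networks the average distance is usually estimated from a sample of source nodes.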
10
Quantitative Measure
Clustering:
  to what extent do links tend to cluster “locally”?
  what is the balance between local and long-distance connections?
  what roles do the two types of links play?
Degree distribution:
  what is the typical degree in the network?
  what is the overall distribution?
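As an illustration of these two measures, the sketch below computes the local clustering coefficient (the fraction of a node's neighbour pairs that are themselves linked) and the empirical degree distribution. The dict-of-sets graph representation, the toy graph, and the function names are assumptions made for this example only.

```python
from collections import Counter
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are directly linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * linked / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficient over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


def degree_distribution(adj):
    """Map each degree k to the fraction of nodes that have degree k."""
    counts = Counter(len(neighbours) for neighbours in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}


# Hypothetical toy graph: triangle a-b-c with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
clustering = average_clustering(toy)
distribution = degree_distribution(toy)
print(round(clustering, 3), distribution)
```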
11
Outline
  Social Network Introduction
  Networks in Biological Systems
  Problem Specification
  Mining on Social Network
  Link Mining
  Multi-Relational Mining
12
Bio-Map
[Figure labels: GENOME, PROTEOME, METABOLISM; protein-gene interactions, protein-protein interactions, bio-chemical reactions; Citrate Cycle]
13
Protein-Protein Interaction Network
[Figure labels: PROTEOME; protein-protein interactions]
14
Protein-Protein Interaction Network
Nodes: proteins
Links: multi-relational
  physical interactions (binding)
  complex membership
  pathway
P. Uetz et al., Nature 403, 623-7 (2000).
15
Outline
  Social Network Introduction
  Networks in Biological Systems
  Mining on Social Network
  Link Mining
  Multi-Relational Mining
  Problem Specification
  Proposed Approach
16
Link Mining
Traditional machine learning and data mining approaches assume that data is flat.
In a typical real data set, the instances form linked networks.
Link mining: a newly emerging research area at the intersection of research in social network and link analysis, hypertext and web mining, graph mining, relational learning, and inductive logic programming.
17
Link Mining Tasks
Object-related tasks
  Link-based object ranking
  Link-based object classification
  Object clustering (group detection)
  Object identification (entity resolution)
Link-related tasks
  Link prediction
Graph-related tasks
  Subgraph discovery
  Graph classification
18
Multi-relational Link Mining
Traditional link mining assumes there is only one kind of relation in the network: the link is flat.
In practice there exist multiple, heterogeneous social networks, each representing a particular kind of relationship: multi-relational and heterogeneous.
19
Multi-relational Network
Multi-relational and heterogeneous network: multiple object and link types
Example networks
  Medical network: patients, doctors, diseases, contacts, treatments
  Bibliographic network: years, publications, authors, venues
  Epidemic transmission network: involves temporal data; multi-relational (airborne, patients’ contacts)
20
Outline
  Social Network Introduction
  Networks in Biological Systems
  Mining on Social Network
  Link Mining
  Multi-Relational Mining
  Problem Specification
  Proposed Approach
21
Problem Specification
Phenomenon: heterogeneity and multiple relationships exist in many real networks.
Rationale: they might be useful for link mining.
Problem
  Can we utilize multiple relationships to help link analysis?
  How do we extract relations as relation networks (RN)?
  How do we identify the relationships among relation networks (correlated, independent, etc.)?
  Are RNs time-evolving?
  Which relation plays an important role?
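One possible reading of "extracting relations as relation networks" is sketched below: a multi-relational edge list is split into one edge set per relation type, and two relation networks are then compared by the Jaccard overlap of their edges. The overlap measure is an assumption chosen for illustration, not a method stated in the slides, and the relation names, people, and function names are hypothetical.

```python
from collections import defaultdict


def relation_networks(typed_edges):
    """Split (u, v, relation) triples into one undirected edge set per relation."""
    networks = defaultdict(set)
    for u, v, relation in typed_edges:
        networks[relation].add(frozenset((u, v)))
    return dict(networks)


def edge_overlap(net_a, net_b):
    """Jaccard overlap of two edge sets: 0.0 means disjoint, 1.0 means identical."""
    union = net_a | net_b
    return len(net_a & net_b) / len(union) if union else 0.0


# Hypothetical multi-relational data: each triple is (person, person, relation).
edges = [
    ("ann", "bob", "coauthor"),
    ("bob", "cat", "coauthor"),
    ("ann", "bob", "same_venue"),
    ("bob", "cat", "same_venue"),
    ("cat", "dan", "same_venue"),
]
networks = relation_networks(edges)
overlap = edge_overlap(networks["coauthor"], networks["same_venue"])
print(sorted(networks), round(overlap, 3))
```

A high overlap suggests two relations carry largely the same links, while a low overlap suggests they are closer to independent; applying the same split to edges from different time windows would give a first look at whether a relation network is time-evolving.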
22
Problem Example 1
Application domain: epidemic disease
Pre-condition 1: given multiple relations: patients’ contact networks along a timeline
Pre-condition 2: sequential relationship among relations
Pre-condition 3: another medium of disease transmission
Problem: Can we predict whether a given person will be infected, based on mining these multi-relational networks?
23
Problem Example 2
Application domain: bibliographic network
Pre-condition 1: given multiple relations: the co-author relation networks of a conference in some years
Problem 1: What is the relationship among these relation networks?
Problem 2: How can we utilize the relationship to answer the user’s query?
Reference: Deng Cai, Zheng Shao, Xiaofei He, Xifeng Yan, and Jiawei Han, Mining Hidden Community in Heterogeneous Social Networks, March, Report No. UIUCDCS-R-2005-2538, UILU-ENG-2005-1731
24
Problem Example 3
Application domain: bibliographic network
Pre-condition 1: given multiple relations: the co-author networks of a conference in some years
Pre-condition 2: topics of publications
Problem: Can we predict whether two researchers will be co-authors in the future, based on two types of networks?
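To make the prediction problem concrete, here is a deliberately simple baseline, not the approach proposed in this presentation: each pair of researchers who have not yet co-authored is scored by a weighted mix of shared co-authors and shared topics. The Jaccard scoring, the `weight` parameter, and the toy data are all assumptions introduced for this sketch.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coauthor_score(coauthors, topics, x, y, weight=0.5):
    """Blend structural similarity (shared co-authors) with topical similarity."""
    structural = jaccard(coauthors[x] - {y}, coauthors[y] - {x})
    topical = jaccard(topics[x], topics[y])
    return weight * structural + (1 - weight) * topical


def rank_candidates(coauthors, topics, weight=0.5):
    """Score every pair that is not yet linked, highest score first."""
    authors = sorted(coauthors)
    scored = []
    for i, x in enumerate(authors):
        for y in authors[i + 1:]:
            if y not in coauthors[x]:
                scored.append((coauthor_score(coauthors, topics, x, y, weight), x, y))
    return sorted(scored, reverse=True)


# Hypothetical data: a co-author network and the topics each author publishes on.
coauthors = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"cat"},
}
topics = {
    "ann": {"graphs", "mining"},
    "bob": {"graphs"},
    "cat": {"mining"},
    "dan": {"graphs", "mining"},
}
ranking = rank_candidates(coauthors, topics)
print(ranking)
```

In this toy data ann and dan share one co-author and both topics, so they rank above bob and dan. A proximity measure such as random walk with restart could replace the structural term.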
25
Outline
  Social Network Introduction
  Networks in Biological Systems
  Mining on Social Network
  Link Mining
  Multi-Relational Mining
  Problem Specification
  Proposed Approach
26
Proposed approach
Random walk with restart
[Figure: example graph with nodes 1 to 12 and query Node 4, with a table of relevance scores for Node 1 to Node 12. Scores shown: 0.13, 0.10, 0.13, 0.22, 0.13, 0.05, 0.08, 0.04, 0.03, 0.04, 0.02. The scores are repeated as labels on the graph nodes.]
More red, more relevant
Nearby nodes, higher scores
27
Proposed approach
Basic idea
  RWR serves as a measure of proximity between two nodes in a network
  Model the relationships among multiple relations using RWR
Purpose
  Facilitate mining more interesting patterns
  Increase prediction accuracy
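The proximity measure can be sketched as a power iteration: at each step the walker follows a random outgoing link with probability 1 - c and jumps back to the query node with probability c, and the steady-state visiting probability of a node is its relevance score. The restart probability c = 0.15, the handling of nodes without links, the toy graph, and the function name are assumptions made for this example; the slides do not specify them.

```python
def random_walk_with_restart(adj, query, restart=0.15, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at `query`.

    adj maps each node to the set of its neighbours (undirected, unweighted).
    """
    nodes = list(adj)
    scores = {n: 1.0 if n == query else 0.0 for n in nodes}
    for _ in range(max_iter):
        updated = {n: 0.0 for n in nodes}
        for u in nodes:
            if not adj[u]:
                # Assumption: a walker stuck on an isolated node returns to the query.
                updated[query] += (1 - restart) * scores[u]
                continue
            share = (1 - restart) * scores[u] / len(adj[u])
            for v in adj[u]:
                updated[v] += share
        updated[query] += restart  # restart mass always flows back to the query
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical toy graph: a path a-b-c-d, queried from node a.
toy = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
scores = random_walk_with_restart(toy, "a")
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, round(score, 3))
```

Scores sum to one and fall off with distance from the query, although a well-connected neighbour can outrank the query node itself when the restart probability is small.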
28
Measure Relationship
Q: What is the most related conference to ICDM?
A: RWR!
Neighborhood formulation [Sun ICDM2005]
29
Multi-Relational Model
[Figure labels: ICDM author network, KDD author network, PKDD author network, ICML author network; relation network]
30
Other Applications
  Content-based image retrieval [He]
  Personalized PageRank [Jeh], [Widom], [Haveliwala]
  Anomaly detection (for nodes and links) [Sun]
  Link prediction [Getoor], [Jensen]
  Semi-supervised learning [Zhu], [Zhou]
  …
31
Summary
  Social network analysis
  Link mining
  Problem: multi-relational
  Proposed approach
32
Thank you