Copyright 2006, Data Mining Research Laboratory An Event-based Framework for Characterizing the Evolutionary Behavior of Interaction Graphs Sitaram Asur,

Presentation transcript:

An Event-based Framework for Characterizing the Evolutionary Behavior of Interaction Graphs
Sitaram Asur, Srinivasan Parthasarathy and Duygu Ucar
Department of Computer Science, The Ohio State University
Copyright 2006, Data Mining Research Laboratory

Motivation
- Interaction networks
  - Represent scientific data from various domains
  - Nodes represent entities
  - Edges represent interactions among entities
  - Examples:
    - Biological networks: protein-protein interaction (PPI) networks, gene expression networks
    - Collaboration networks
    - Social networks, online communities, blog networks
Figures: protein-protein interactions in yeast (Jeong et al, 2001); physicist collaboration network (Newman and Girvan, 2004)

Motivation
- Mining interaction networks is important
  - Gain insight into the structure, properties and behavior of these networks [Newman, 2001]
- The modular nature of interaction networks is important
  - Co-expression networks: dense components -> functional modules
  - Social networks: clusters -> community structure

Motivation
- A large number of earlier approaches focused on mining static interaction networks
- Many important real-world networks are dynamic
Figure: temporal protein interaction network of the yeast mitotic cell cycle (Ulrik de Lichtenberg et al., Science 307, 724 (2005))

Motivation
- Dynamic interaction networks
  - Nodes and interactions change over time
  - Structure changes in the network
- Need for a structured method to characterize and model evolution
  - Understand the nature of change (evolution) in networks
  - Consider the evolution of individuals and communities
  - Develop models for reasoning and inference of future events

Workflow
- Temporal Snapshots
- Clustering
- Event Detection
- Behavioral Patterns
- Analysis and Inference
Figure: an evolving graph is split into snapshots S_i, S_{i+1} with clusterings C_i, C_{i+1}; the steps iterate over i.

Temporal Snapshots
- Split the graph data into non-overlapping temporal snapshots
  - Each snapshot corresponds to a graph
  - Consists of all nodes and interactions active in that time period
  - Nodes are active if they have an interaction in a particular time period
Figure: example graph on nodes A to G, shown at T1 and T2.
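The snapshot construction above can be sketched in a few lines. This is a minimal illustration, assuming interactions arrive as `(u, v, timestamp)` triples and that snapshots are fixed-width time windows; the function name `split_into_snapshots` and the `window` parameter are hypothetical and do not come from the slides.

```python
from collections import defaultdict

def split_into_snapshots(interactions, window):
    """Group (u, v, t) interactions into non-overlapping windows of width `window`.

    Returns a dict: snapshot index -> {"nodes": set, "edges": set}.
    A node appears in a snapshot only if it has an interaction in that window.
    """
    snapshots = defaultdict(lambda: {"nodes": set(), "edges": set()})
    for u, v, t in interactions:
        idx = int(t // window)                 # each timestamp falls in exactly one window
        snap = snapshots[idx]
        snap["edges"].add(frozenset((u, v)))   # undirected interaction
        snap["nodes"].update((u, v))
    return dict(snapshots)

# Illustrative data (not from the presentation).
interactions = [("A", "B", 0), ("B", "C", 1), ("A", "B", 5), ("D", "E", 6)]
snaps = split_into_snapshots(interactions, window=5)
```

Here node C is active only in the first snapshot and nodes D and E only in the second, which is the kind of change the later event-detection step picks up.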

Clustering
- Represent the snapshot graphs using clusters
  - Clusters of a graph can provide structure information
  - Examine the evolution of clusters over time
  - Can provide insight on corresponding changes to the graph
  - MCL clustering algorithm employed in this work
  - Ensemble clustering approaches can be employed to obtain robust clusters (Asur et al, ISMB 2007)
Figure: the example graph on nodes A to G at T1 and T2, with clusters marked.
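For readers unfamiliar with MCL (Markov clustering), the sketch below shows its two alternating steps, expansion and inflation, on a dense adjacency matrix. It is a generic, unoptimized illustration and not the authors' implementation; the parameter defaults (`inflation=2.0`, a fixed `iterations` count instead of a convergence test, the `eps` cut-off) are assumptions chosen for brevity.

```python
def normalize_columns(m):
    """Scale every column of a square matrix so that it sums to 1."""
    n = len(m)
    col_sums = [sum(m[r][c] for r in range(n)) for c in range(n)]
    return [[m[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n)]
            for r in range(n)]

def mcl(adjacency, inflation=2.0, iterations=50, eps=1e-6):
    """Cluster an undirected graph given as a 0/1 adjacency matrix (list of lists)."""
    n = len(adjacency)
    # Add self-loops, then make the matrix column-stochastic.
    m = [[float(adjacency[r][c]) + (1.0 if r == c else 0.0) for c in range(n)]
         for r in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix, spreading flow along two-step walks.
        m = [[sum(m[r][k] * m[k][c] for k in range(n)) for c in range(n)]
             for r in range(n)]
        # Inflation: raise entries to a power and renormalize, boosting strong flows.
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    # Every row that still carries mass lists the members of one cluster.
    clusters = {frozenset(c for c in range(n) if row[c] > eps) for row in m}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)

# Two disjoint triangles: nodes 0-2 and nodes 3-5 (illustrative graph).
triangle_pair = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(triangle_pair)
```

On this toy input no flow can cross between the two components, so each triangle comes out as its own cluster. Real interaction graphs would call for a sparse-matrix implementation.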

Community-based Event Detection
- Continue
- Merge
- Split
- Form
- Dissolve
Figure: example clusters across snapshots T=1 to T=6 illustrating the five events.

Entity-based Event Detection
- Appear
- Disappear
- Join
- Leave
Figure: nodes A and B moving between clusters across snapshots T=1 to T=4.
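The slide names the four entity events without defining them, so the sketch below uses working definitions that are assumptions: a node appears or disappears when it is present in only one of two consecutive snapshots, and a node joins or leaves a cluster when the cluster at time i+1 shares more than a fraction `min_overlap` of the members of a cluster at time i. The function names and the 0.5 default are illustrative.

```python
def appear(nodes_prev, nodes_next):
    """Nodes present in the later snapshot but not in the earlier one."""
    return nodes_next - nodes_prev

def disappear(nodes_prev, nodes_next):
    """Nodes present in the earlier snapshot but not in the later one."""
    return nodes_prev - nodes_next

def join_leave(cluster_prev, cluster_next, min_overlap=0.5):
    """Return (joined, left) for two clusters treated as the same community.

    The clusters are matched only if they share more than `min_overlap`
    of the earlier cluster's members (an assumed matching rule).
    """
    shared = len(cluster_prev & cluster_next)
    if shared <= min_overlap * len(cluster_prev):
        return set(), set()          # not the same community; no Join/Leave events
    return cluster_next - cluster_prev, cluster_prev - cluster_next

# Illustrative clusters at two consecutive snapshots.
joined, left = join_leave({"A", "B", "C"}, {"B", "C", "D"})
new_nodes = appear({"A", "B", "C"}, {"B", "C", "D"})
```

Note that under these definitions a node that leaves a cluster may also have disappeared from the snapshot altogether; separating the two cases needs the snapshot node sets as well.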

Event Detection
- Represent each set of snapshot clusters as a k × N binary cluster-membership matrix
- Use bitwise operators to compute the events between each successive pair of matrices (snapshots)
- Example: Continue event
  $\mathrm{Continue}(C_j, C_k) \iff \mathrm{AND}(S_i(j), S_{i+1}(k)) = \mathrm{OR}(S_i(j), S_{i+1}(k))$
- The event detection algorithm is linear in the number of nodes in the graph: O(N)
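The Continue test can be illustrated with Python integers used as bit vectors. The sketch reads S_i(j) as row j of the membership matrix for snapshot i (an interpretation of the slide's notation); the helper names `row_to_bits`, `is_continue` and `continue_events` are hypothetical.

```python
def row_to_bits(members, node_index):
    """Encode one cluster as a bit vector: bit p is set if node p is a member."""
    bits = 0
    for node in members:
        bits |= 1 << node_index[node]
    return bits

def is_continue(row_a, row_b):
    """AND equals OR exactly when the two membership rows are identical."""
    return (row_a & row_b) == (row_a | row_b)

def continue_events(clusters_i, clusters_next, node_index):
    """List (j, k) pairs where cluster j at time i continues as cluster k at time i+1."""
    rows_i = [row_to_bits(c, node_index) for c in clusters_i]
    rows_next = [row_to_bits(c, node_index) for c in clusters_next]
    return [(j, k)
            for j, a in enumerate(rows_i)
            for k, b in enumerate(rows_next)
            if is_continue(a, b)]

# Illustrative snapshots over five nodes.
node_index = {name: pos for pos, name in enumerate("ABCDE")}
clusters_i = [{"A", "B", "C"}, {"D", "E"}]
clusters_next = [{"D", "E"}, {"A", "B"}]
events = continue_events(clusters_i, clusters_next, node_index)
```

Only the cluster {D, E} keeps exactly the same members, so it is the single Continue event; {A, B, C} lost node C and does not qualify. This sketch compares every pair of rows for clarity and makes no attempt to reproduce the O(N) bound stated on the slide.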

Temporal Analysis
- Use critical events for analysis
- Form and Dissolve events
  - Used to study group formation and dissipation
- Merge and Split events
  - Evolution of groups
- Continue events
  - Stability of clusters/groups
  - Evolution of topics in a collaboration network

Behavioral Analysis
- Use the entity-based critical events discovered to compose incremental measures for capturing behavioral patterns
- Behavioral measures can then be used to analyze the evolutionary behavior of nodes and clusters
- Four behavioral measures
  - Stability Index
  - Sociability Index
  - Popularity Index
  - Influence Index

Case Study 1: DBLP Collaboration Network
- Data from 28 key conferences in databases/data mining/AI over 10 years
- Authors (nodes) connected by collaborations (edges)
- nodes and edges (counts not preserved in the transcript)
- Collaboration networks display many of the structural features of social networks (Kempe, Kleinberg and Tardos 2003, Newman 2001)

Case Study 2: Clinical Trials Network
- Clinical trials
  - Can provide information on risks, benefits and optimal dosage levels
  - Consist of observations of patients under drug use as well as some under placebo
  - Generally represented as a set of multivariate time series
- Evolving clinical trials network
  - Nodes represent patients
  - Correlations among patients are modeled as edges
  - Edges change over time as correlations change
- Motivation: use the evolution of correlation to identify potential toxic effects of drugs

Stability Index
- Propensity of a node to interact with the same group of people over time
- Stability for a node over time is incrementally computed based on the stability of the clusters it belongs to

Stability for Clinical Trials Data
- Nodes with low Stability Index values represent patients with fluctuating correlation values (outliers)
- Null hypothesis:
  - If the drug does not result in toxicity, then outliers are likely to be flagged at random from each group (drug and placebo)
- Experiment on a clinical trials network for diabetes patients
  - 19 nodes (patients) found to have a Stability Index below threshold
  - The drug under study was discontinued due to possible toxic effects
  - 18 out of the 19 were on the drug!
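To get a feel for how unlikely "18 out of 19" is under the null hypothesis, the following sketch computes a one-sided binomial tail probability. It assumes that each flagged patient is equally likely to come from the drug group or the placebo group (p = 0.5); the slides do not state the group sizes, so this value of p is an assumption made only for illustration.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 19 flagged patients, at least 18 of them on the drug, assumed p = 0.5.
p_value = binomial_tail(19, 18, 0.5)   # equals 20 / 2**19, about 3.8e-05
```

Under that assumption, random flagging would produce such a lopsided split with a probability of roughly 4 in 100,000, which is why the result counts as evidence against the null hypothesis. A different drug-to-placebo ratio would change the number.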

Sociability Index
- Incremental measure of the different interactions a node participates in
- Opposite of the Stability Index
- Does not represent degree!

Sociability Index for Community Prediction
- Goal: identify future cluster co-occurrences based on history data for the DBLP dataset
- Key intuition: if two authors have high sociability and have not yet collaborated (not been clustered together), there is a high chance they will
- Setup: use the data from an earlier period to predict cluster co-occurrences for a later period (the year ranges are not preserved in the transcript)

Experimental Results
- Comparison with other measures (Liben-Nowell and Kleinberg, CIKM 2003)
  - Common Neighbors
  - Adamic-Adar
  - Jaccard
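The three baseline measures are standard link-prediction scores and can be written down directly. The sketch below uses their usual textbook definitions on an adjacency map of neighbor sets; the toy graph is illustrative, and nothing here reproduces the presentation's actual experiment or results.

```python
from math import log

def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    """Shared neighbors weighted by 1 / log(degree), favoring rare neighbors.

    For distinct u and v every shared neighbor has degree >= 2, so log(degree) > 0.
    """
    return sum(1.0 / log(len(adj[z])) for z in adj[u] & adj[v])

# Toy undirected graph with edges a-c, a-d, b-c, b-d, b-e, d-e.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d", "e"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"b", "d"},
}
scores = (common_neighbors(adj, "a", "b"), jaccard(adj, "a", "b"), adamic_adar(adj, "a", "b"))
```

For the unlinked pair (a, b) the shared neighbors are c and d, giving a common-neighbor count of 2, a Jaccard score of 2/3, and an Adamic-Adar score of 1/ln 2 + 1/ln 3.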

Popularity Index
- Measure of attraction of nodes to a cluster
- Influence measure of a cluster
- Does not reflect the size of the cluster
- DBLP dataset
  - Can be used to identify hot topics
  - If a large number of nodes join a cluster and they are all working on a similar topic, it indicates a buzz around that topic for that year

Application of Popularity Index
- Example: XML
- Year 1999: 3 authors (XML and web applications)
- Year 2000: 50 joins
  - 30 of these authors published papers on XML

Influence Index
- Measure of influence of a node on others
- Influence in terms of participation in critical events
- Influence of a node initially computed as (formula not preserved in the transcript)
- Follower nodes need to be pruned! unless (condition not preserved in the transcript)

Top Influential Authors: DBLP Dataset

Diffusion Models
- Study the spread of information in an evolving interaction network (Kempe et al, 2003, 2005)
  - Nodes activated with information
  - Newly activated nodes become contagious briefly
  - Information propagates through the network
  - Activation function maps the weights of the links of a node to determine if it is activated
    - SUM activation: if the sum of weights > threshold, activate
    - MAX activation: if any single weight > threshold, activate
Figure: activation spreading through a network at times t1, t2, t3, t4.
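The SUM and MAX activation rules can be sketched as a small simulation. The sketch makes several assumptions not stated on the slide: the network is held fixed during the spread, a node is contagious only in the step right after it is activated, and only links from currently contagious neighbors count toward the activation score. The names `activates`, `diffuse` and `in_links` are hypothetical.

```python
def activates(weights, threshold, mode="sum"):
    """Apply the SUM or MAX activation rule to the incoming contagious weights."""
    if not weights:
        return False
    score = sum(weights) if mode == "sum" else max(weights)
    return score > threshold

def diffuse(in_links, seeds, threshold, mode="sum"):
    """Spread activation from `seeds`; in_links[v] maps each neighbor u to the weight of u -> v."""
    active = set(seeds)
    frontier = set(seeds)            # nodes activated in the previous step (contagious now)
    while frontier:
        newly = set()
        for v, links in in_links.items():
            if v in active:
                continue
            weights = [w for u, w in links.items() if u in frontier]
            if activates(weights, threshold, mode):
                newly.add(v)
        active |= newly
        frontier = newly             # only the newly activated nodes stay contagious
    return active

# Illustrative weighted network: a and b both point to c, and c points to d.
in_links = {"a": {}, "b": {}, "c": {"a": 0.3, "b": 0.3}, "d": {"c": 0.6}}
sum_result = diffuse(in_links, {"a", "b"}, threshold=0.5, mode="sum")
max_result = diffuse(in_links, {"a", "b"}, threshold=0.5, mode="max")
```

With SUM activation the two weak links into c add up past the threshold, so c and then d are activated. With MAX activation no single link into c is strong enough, and the spread stops at the seeds.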

Diffusion Models: Influence Maximization
- Influence maximization problem: find the initial set of nodes that can activate the largest number of nodes over a time period
  - Critical in applications such as viral marketing and for epidemiological research
  - Complicated in the case of dynamic interaction networks, as the network changes over time
- Need for dynamic measures that reflect the current status of the network
  - Sociability Index used to weight links
    - Highly sociable nodes have a high propensity to pass on information
  - Influence Index used to determine the initial set of active nodes
  - Comparison with random choice of nodes and degree-based selection (Wasserman and Faust, 1994)

Conclusions
- Most real-world graphs are dynamic in nature
  - Need for analysis, reasoning and inference
  - Proposed an event-based framework
    - Clusters to capture structure at different snapshots
    - Critical events over clusters to identify dynamic properties of graphs
    - Behavioral patterns incrementally composed from critical events
  - Proposed method useful in many application domains
    - Protein function prediction, drug design, recommender systems, viral marketing, epidemiology
Figure: workflow recap (Temporal Snapshots, Clustering, Event Detection, Behavioral Patterns, Analysis and Inference).

Future Directions
- Extensions to large interaction graphs
- Use of semantic information for reasoning and inference
  - Merge and Split events
    - If two clusters have high semantic similarity, the probability of a Merge is high
  - Continue events
    - Track the evolution of topics
    - Sequences of Form, Continue, Continue ...
- Multi-scale temporal modeling
  - Analyze snapshots of different granularity

Poster #36, this evening (Mon 13th Aug, 6:15-9:15 pm)
This work was supported by the following grants:
- DOE Early Career Principal Investigator Award No. DE-FG02-04ER25611
- NSF CAREER Grant IIS
Contacts:
- Sitaram Asur
- Dr Srinivasan Parthasarathy
- Duygu Ucar
Group Webpage:
Thanks!
