Published by Austin Bradford. Modified over 8 years ago.
1
Outlier Detection for Information Networks
Manish Gupta
15th Jan 2013
2
Problem 1: TopK Outlier Cuboid Detection for Graph OLAP
Consider the DBLP co-authorship network.
One can store it in OLAP with research areas and years as dimensions.
Query: In which research area and in which set of years were there exceptionally high collaborations between Stanford, IITBombay and Berkeley authors?
Research area and years determine different levels of cuboids.
Given: A subgraph query and a weighted network (like DBLP)
Find: TopK outlier cuboids from graph OLAP such that the percentage of edge weight covered by matches is exceptionally high
A possible result: (DM+DB, 2001-2004)
We hope to explore an application of genetic algorithms in this project.
[Figure: nodes labeled Stanford, IITBombay, Berkeley]
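The "Given / Find" statement above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy edge list, the affiliation table, the reading of the query as a triangle over the three affiliations, and the names `cuboid_graph`, `coverage` and `top_k_cuboids`. The sketch simply enumerates a handful of candidate cuboids and ranks them by raw coverage; it does not implement the genetic-algorithm search the slide mentions, and it does not model what counts as "exceptionally" high.

```python
from itertools import combinations, product

# Hypothetical toy data: (author_u, author_v, area, year, weight)
EDGES = [
    ("s1", "i1", "DM", 2001, 3.0),
    ("i1", "b1", "DM", 2002, 2.0),
    ("s1", "b1", "DB", 2003, 4.0),
    ("s1", "x1", "DM", 2002, 1.0),
    ("s1", "i1", "IR", 2006, 1.0),
    ("x1", "x2", "IR", 2006, 5.0),
    ("i1", "b1", "IR", 2007, 1.0),
]
# Hypothetical author -> affiliation table
AFFIL = {"s1": "Stanford", "i1": "IITBombay", "b1": "Berkeley",
         "x1": "Other", "x2": "Other"}
# Assumed query: a triangle with one author from each affiliation
QUERY = ("Stanford", "IITBombay", "Berkeley")


def cuboid_graph(edges, areas, years):
    """Aggregate edge weights inside a cuboid (areas, inclusive year range)."""
    lo, hi = years
    graph = {}
    for u, v, area, year, w in edges:
        if area in areas and lo <= year <= hi:
            key = frozenset((u, v))
            graph[key] = graph.get(key, 0.0) + w
    return graph


def coverage(graph, affil, query):
    """Fraction of the cuboid's edge weight lying on edges of some match."""
    total = sum(graph.values())
    if total == 0:
        return 0.0
    by_affil = {q: [a for a, f in affil.items() if f == q] for q in query}
    covered = set()
    for triple in product(*(by_affil[q] for q in query)):
        tri_edges = [frozenset(p) for p in combinations(triple, 2)]
        if all(e in graph for e in tri_edges):
            covered.update(tri_edges)
    return sum(graph[e] for e in covered) / total


def top_k_cuboids(edges, affil, query, cuboids, k):
    """Score each candidate cuboid and return the k with highest coverage."""
    scored = [(coverage(cuboid_graph(edges, areas, years), affil, query),
               areas, years) for areas, years in cuboids]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:k]


CUBOIDS = [
    (("DM", "DB"), (2001, 2004)),
    (("DM",), (2001, 2004)),
    (("IR",), (2005, 2008)),
    (("DM", "DB", "IR"), (2001, 2008)),
]

if __name__ == "__main__":
    for score, areas, years in top_k_cuboids(EDGES, AFFIL, QUERY, CUBOIDS, 2):
        print("+".join(areas), years, round(score, 3))
```

On this toy data the (DM+DB, 2001-2004) cuboid ranks first because the three matched edges carry 9 of its 10 units of edge weight. Exhaustive enumeration is only feasible for a few cuboids, which is one reason a heuristic search might be explored for realistic lattices.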
3
Problem 2: Outlier Substructures in an Information Network
Given: A heterogeneous information network and a heterogeneous query
Consider the DBLP network of authors, conferences and title terms.
Consider a simple query: A Data Mining researcher
Patterns: Most data mining researchers
– Are connected to other data mining authors, conferences or terms
– Are connected to very few very popular authors
– Etc.
Given the query, one can find all matches in the network.
For a match, given the usual connectivity patterns and the neighborhood for the match
– One can compute p = probability of generation of the match
– Outlier score can then be computed as 1-p
We hope to explore a generative way of modeling a subgraph neighborhood (like Block models) in this project.