Published by Kristopher Sharp. Modified over 9 years ago.
Modeling, Searching, and Explaining Abnormal Instances in Multi-Relational Networks
Chapter 1. Introduction
Speaker: Cheng-Te Li
2007.7.9

Outline
Introduction
Problem Definition
– Multi-relational Networks
– The Importance of Abnormal Instances
– Explanation
Design Considerations
Objective and Challenges
Approach
Contributions

Introduction
"A discovery is said to be an accident meeting a prepared mind." – Albert Szent-Györgyi
For computer science: model the discovery process with AI
Motivation: "Natural Selection"
The discovery process

Problem Definition
Essentially, how can the discovery process be modeled with AI?
– Our general framework
Three key features
– Multi-relational network (MRN)
– Abnormal instances
– Human-understandable explanation

Multi-relational Networks
Definition
– Nodes: objects of different types
– Links: binary relationships between objects
– Multi-relational: multiple different types of links
– Attributes
An MRN encodes the semantic relationships between different types of objects
E.g. a bibliography network

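The definition above can be sketched as a small data structure. This is only an illustrative sketch, not the speaker's implementation: the class name `MultiRelationalNetwork`, its methods, and the toy bibliography data are all assumptions made for the example, and attributes are left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MultiRelationalNetwork:
    """Typed nodes joined by typed, directed binary links (hypothetical sketch)."""
    # node -> object type, e.g. "author" or "paper"
    node_types: dict = field(default_factory=dict)
    # source node -> link type -> list of target nodes
    links: dict = field(
        default_factory=lambda: defaultdict(lambda: defaultdict(list))
    )

    def add_node(self, node, node_type):
        self.node_types[node] = node_type

    def add_link(self, source, link_type, target):
        self.links[source][link_type].append(target)

    def neighbors(self, node, link_type):
        """Targets reachable from `node` over one link of the given type."""
        return list(self.links.get(node, {}).get(link_type, []))


# Invented bibliography network: two authors, two papers, one citation.
mrn = MultiRelationalNetwork()
mrn.add_node("alice", "author")
mrn.add_node("bob", "author")
mrn.add_node("paper1", "paper")
mrn.add_node("paper2", "paper")
mrn.add_link("alice", "writes", "paper1")
mrn.add_link("bob", "writes", "paper2")
mrn.add_link("paper2", "cites", "paper1")

print(mrn.neighbors("paper2", "cites"))  # ['paper1']
```

Keeping the link type as part of the key is the point of the sketch: the same pair of nodes can be connected by several different relations, and each one stays distinguishable.
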
Multi-relational Networks (cont'd)
More examples
– Kinship network (親屬網絡)
– WWW: incoming, outgoing, and email links
– WordNet: lexical relationships between concepts
Multiple relationship types carry different kinds of semantic information that can be compared and contrasted
PageRank, Centrality Theory
– Cannot deal with relation types in a network

Abnormal Instances
Discovery from a network
– Identify central nodes, recognize frequent subgraphs, learn interesting properties
Our goal is to discover the instances that look different!
– The attraction of the "light bulb"
– A novel kind of anomaly detection, based on relational data
– Potential applications:
  Information Awareness and Homeland Security
  Fraud Detection and Law Enforcement
  General Scientific Discovery
  Data Cleaning

Explanation
The difficulty of verification
– The goal is to find something previously unknown
– False positives may exist even with high precision and high recall, as is typical of unsupervised discovery
Explanation-based discovery
– Human-understandable explanation
– Intuitive validation by the user
– Further investigation

Design Considerations
Three strategies to identify abnormal instances
Rule-based learning
– Pattern matching, e.g. "abnormal if it doesn't cite any other people's papers"
Supervised learning
– Manual labeling for training and classification
– Merit: high precision
– Demerits: domain dependent; expensive to create; sensitive to human bias; can only find expected abnormalities, not novel ones
Unsupervised learning
– Comparison-based, due to our definition
– Property: easily adapted to a new domain without training
– More suitable for security-related problems

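To make the rule-based strategy concrete, the example rule quoted above ("abnormal if it doesn't cite any other people's papers") could be hand-coded roughly as follows. This is a sketch under assumptions: the dictionaries `authors_of` and `cites`, the function name, and all data are invented for illustration. It also shows why such rules only find what their author already expected.

```python
# Hypothetical toy data: who wrote which paper, and which paper cites which.
authors_of = {
    "p1": {"alice"},
    "p2": {"bob"},
    "p3": {"carol"},
    "p4": {"carol"},
}
cites = {
    "p1": ["p2"],  # alice cites bob
    "p2": ["p1"],  # bob cites alice
    "p3": ["p4"],  # carol cites only her own paper
    "p4": [],
}


def cites_nobody_else(author):
    """Rule: True when none of the author's papers cites another person's paper."""
    own_papers = [p for p, names in authors_of.items() if author in names]
    for paper in own_papers:
        for cited in cites.get(paper, []):
            if authors_of[cited] - {author}:  # someone else (co-)wrote the cited paper
                return False
    return True


all_authors = sorted(set().union(*authors_of.values()))
flagged = [a for a in all_authors if cites_nobody_else(a)]
print(flagged)  # ['carol']
```
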
Design Considerations (cont'd)
System requirements
– Utilize the information in an MRN, e.g. the types of links
– Adapt to different domains without training
– Explainable
– Scalable
– Provide high-level bias
– Support different levels of detail for explanations

Objectives & Challenges
Objectives
– Discovery stage: identify abnormal nodes
– Explanation stage: produce descriptions for the nodes found
– E.g. an organized crime network
Challenges
– Make anomaly detection meet the system requirements listed earlier
  Existing methods for identifying suspicious instances in MRNs are rule-based or supervised
  Conventional unsupervised algorithms target propositional or numerical data
  PageRank, HITS, Random Walk: do not consider link types
– Treat understandable explanations as part of discovery
  This needs a model that is complex enough but not overcomplicated

Approach
Design a model that captures the semantics of nodes
– Select a set of relevant path types as semantic features
– Compute the statistical dependency between nodes and path types as feature values
Find nodes with abnormal semantics
– Distance-based outlier detection on the semantic profiles
Explain them!
– Apply a classifier to separate abnormal nodes from the others
– Translate the generated rules into natural language

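The first two steps above (semantic profiles, then distance-based outlier detection) might look roughly like the sketch below. Several things here are assumptions rather than the speaker's method: the slides do not say which statistical dependency measure is used, so the sketch substitutes the relative frequency of each path type; the outlier score is taken to be the distance to the k-th nearest neighbor; and the toy edges and all names are invented. The explanation step (classifier plus rule translation) is not covered.

```python
import math
from collections import Counter

# Hypothetical typed links (source, link type, target); all names are invented.
edges = [
    ("a1", "writes", "p1"), ("a2", "writes", "p2"),
    ("a3", "writes", "p3"), ("a4", "writes", "p4"),
    ("p1", "cites", "p2"), ("p2", "cites", "p3"), ("p3", "cites", "p1"),
    ("p4", "published_in", "j1"),
]
out = {}
for source, link_type, target in edges:
    out.setdefault(source, []).append((link_type, target))


def path_type_counts(node, max_len=2):
    """Count the link-type sequences of all paths of length 1..max_len from node."""
    counts = Counter()
    frontier = [(node, ())]
    for _ in range(max_len):
        next_frontier = []
        for current, types in frontier:
            for link_type, target in out.get(current, []):
                path = types + (link_type,)
                counts[path] += 1
                next_frontier.append((target, path))
        frontier = next_frontier
    return counts


authors = ["a1", "a2", "a3", "a4"]
raw = {a: path_type_counts(a) for a in authors}
features = sorted(set().union(*raw.values()))  # one feature per path type


def profile(node):
    """Semantic profile: relative frequency of each path type (a stand-in measure)."""
    total = sum(raw[node].values()) or 1
    return [raw[node][f] / total for f in features]


def kth_neighbor_distance(node, k=1):
    """Outlier score: Euclidean distance to the k-th nearest other profile."""
    dists = sorted(
        math.dist(profile(node), profile(other))
        for other in authors if other != node
    )
    return dists[k - 1]


scores = {a: kth_neighbor_distance(a) for a in authors}
most_abnormal = max(scores, key=scores.get)
print(most_abnormal)  # a4
```

In this toy data `a4` stands out because its paper is linked to a journal but cites nothing, a difference that only shows up when link types are kept in the features.
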
Contributions
An unsupervised way to identify abnormal instances in MRNs
Outperforms state-of-the-art algorithms by a large margin
Generates understandable explanations
Performs complex data analysis accurately and efficiently
Generality and applicability

Q & A
Thank you for listening!