Modeling, Searching, and Explaining Abnormal Instances in Multi-Relational Networks Chapter 1. Introduction Speaker: Cheng-Te Li 2007. 7. 9.

Presentation transcript:

Modeling, Searching, and Explaining Abnormal Instances in Multi-Relational Networks Chapter 1. Introduction Speaker: Cheng-Te Li

2 Outline
Introduction
Problem Definition
– Multi-relational Networks
– The Importance of Abnormal Instances
– Explanation
Design Considerations
Objective and Challenges
Approach
Contributions

3 Introduction
“A discovery is said to be an accident meeting a prepared mind.” – Albert Szent-Györgyi
For CS: to model the discovery process via AI
Motivation: “Natural Selection”
The discovery process

5 Problem Definition
Essentially, how can the discovery process be modeled through AI?
– Our general framework
Three key features
– Multi-relational network (MRN)
– Abnormal instances
– Human-understandable explanation

6 Multi-relational Networks
Definition
– Nodes: objects of different types
– Links: binary relationships between objects
– Multi-relational: multiple different types of links
– Attributes
Encode semantic relationships between different types of objects
E.g. a bibliography network
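
The definition above can be sketched as a small data structure. This is a minimal illustration, not the speaker's implementation: the class name `MultiRelationalNetwork`, its methods, and the toy bibliography data (`alice`, `paper1`, `kdd`, the link types `writes`, `cites`, `published_in`) are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MultiRelationalNetwork:
    """Hypothetical container: typed nodes plus typed, directed binary links."""

    # node -> node type (e.g. "author", "paper")
    node_types: dict = field(default_factory=dict)
    # source node -> list of (link_type, target node)
    out_links: dict = field(default_factory=lambda: defaultdict(list))

    def add_node(self, node, node_type):
        self.node_types[node] = node_type

    def add_link(self, source, link_type, target):
        # A link is a binary relationship between two known objects.
        if source not in self.node_types or target not in self.node_types:
            raise KeyError("both endpoints must be added as nodes first")
        self.out_links[source].append((link_type, target))

    def link_types(self):
        # "Multi-relational" means this set has more than one element.
        return {lt for links in self.out_links.values() for lt, _ in links}

    def neighbors(self, node, link_type):
        # Follow only links of the requested type.
        return [t for lt, t in self.out_links.get(node, []) if lt == link_type]


# Toy bibliography network (made-up data).
mrn = MultiRelationalNetwork()
mrn.add_node("alice", "author")
mrn.add_node("paper1", "paper")
mrn.add_node("paper2", "paper")
mrn.add_node("kdd", "venue")
mrn.add_link("alice", "writes", "paper1")
mrn.add_link("alice", "writes", "paper2")
mrn.add_link("paper1", "cites", "paper2")
mrn.add_link("paper1", "published_in", "kdd")

print(mrn.link_types())
print(mrn.neighbors("alice", "writes"))
```

Keeping the link type on every edge is the point of the sketch: a plain graph would store only `(source, target)` and lose the semantic distinction between, say, writing and citing a paper.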

7 Multi-relational Networks (cont’d)
More examples
– Kinship network
– WWW: incoming, outgoing, and links
– WordNet: lexical relationships between concepts
Multiple relationship types carry different kinds of semantic information to compare and contrast
PageRank, Centrality Theory
– Cannot deal with relation types in a network

8 Abnormal Instances
Discovery from a network
– Identify central nodes, recognize frequent subgraphs, learn interesting properties
Our goal is to discover the instances that look different!
– Attraction of the “light bulb”
– A so far unheard-of kind of anomaly detection, via relational data
– Potential applications:
Information Awareness and Homeland Security
Fraud Detection and Law Enforcement
General Scientific Discovery
Data Cleaning

9 Explanation
The difficulty of verification
– The goal is to find something previously unknown
– False positives may exist even with high precision and high recall, as in any unsupervised discovery
Explanation-based discovery
– Human-understandable explanation
– Intuitive validation by the user
– Further investigation

11 Design Considerations
Three strategies to identify abnormal instances
Rule-based learning
– Pattern matching, e.g. “abnormal if it doesn’t cite any other people’s papers”
Supervised learning
– Manual labeling for training and classification
– Merit: high precision
– Demerits: domain dependent; expensive to create; sensitive to human bias; can only find expected anomalies, not novel ones
Unsupervised learning
– Comparison-based, due to our definition
– Properties: easily adapted to a new domain without training; more suitable for security-related problems
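
To make the rule-based strategy concrete, the quoted pattern ("abnormal if it doesn’t cite any other people’s papers") can be written as a hand-coded check. This is only a sketch under assumptions: the function name `authors_citing_no_one_else`, the two input dictionaries, and the toy data are invented for illustration, and the reading of the rule (an author is flagged when none of their papers cites a paper with another author on it) is one possible interpretation.

```python
def authors_citing_no_one_else(authored, cites):
    """Flag authors whose papers never cite a paper by anyone else.

    authored: author -> set of papers they wrote (hypothetical input format)
    cites:    paper  -> set of papers it cites   (hypothetical input format)
    Assumption: every cited paper appears in `authored` for at least one author.
    """
    # Invert the authorship relation: paper -> set of its authors.
    paper_authors = {}
    for author, papers in authored.items():
        for paper in papers:
            paper_authors.setdefault(paper, set()).add(author)

    flagged = []
    for author, papers in authored.items():
        # Everything cited by any of this author's papers.
        cited = set()
        for paper in papers:
            cited |= cites.get(paper, set())
        # Does at least one cited paper have an author other than this one?
        cites_others = any(paper_authors.get(c, set()) - {author} for c in cited)
        if not cites_others:
            flagged.append(author)
    return sorted(flagged)


# Made-up data: alice only cites herself, bob cites alice, carol cites nothing.
authored = {"alice": {"p1", "p2"}, "bob": {"p3"}, "carol": {"p4"}}
cites = {"p1": {"p2"}, "p2": set(), "p3": {"p1"}}

flagged = authors_citing_no_one_else(authored, cites)
print(flagged)
```

The sketch also shows the weakness listed on the slide: the rule is tied to the bibliography domain and can only flag the one pattern somebody thought of in advance.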

12 Design Considerations (cont’d)
System requirements
– Utilize the information in the MRN, e.g. the types of links
– Adapt to different domains, with no training
– Explainable
– Scalable
– Provide high-level bias
– Support different levels of detail for explanations

14 Objectives & Challenges
Objectives
– Discovery stage: identify abnormal nodes
– Explanation stage: produce descriptions for the nodes found
– E.g. an organized crime network
Challenges
– Make anomaly detection meet the system requirements
Existing ways to identify suspicious instances in an MRN: rule-based, supervised
Conventional unsupervised algorithms: for propositional or numerical data
PageRank, HITS, Random Walk: do not consider link types
– Treat understandable explanations as part of discovery
Need a model that is complex enough but not over-complicated

15 Approach
Design a model capturing the semantics of nodes
– Select a set of relevant path types as semantic features
– Compute the statistical dependency between nodes and path types as feature values
Find nodes with abnormal semantics
– Distance-based outlier detection with semantic profiles
Explain them!
– Apply a classification method to separate abnormal nodes from the others
– Translate the generated rules into natural language
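
The three steps on this slide can be illustrated end to end on a toy network. Everything here is an assumption made for the sketch: the slide does not say which dependency measure, outlier detector, or classifier is used, so the code substitutes a normalized path-type frequency for the "statistical dependency", the distance to the k-th nearest neighbor for the outlier score, and a single-feature rule for the classifier. All function names and data are hypothetical.

```python
import math
from collections import Counter


def path_type_counts(links, start, max_len=2):
    """Count paths of up to max_len links from start, keyed by their sequence of link types."""
    counts = Counter()
    frontier = [(start, ())]
    for _ in range(max_len):
        next_frontier = []
        for node, types in frontier:
            for link_type, target in links.get(node, []):
                path = types + (link_type,)
                counts[path] += 1
                next_frontier.append((target, path))
        frontier = next_frontier
    return counts


def semantic_profile(links, node, path_types, max_len=2):
    """Feature vector: share of the node's paths that fall under each path type (stand-in measure)."""
    counts = path_type_counts(links, node, max_len)
    total = sum(counts.values())
    return [counts[p] / total if total else 0.0 for p in path_types]


def knn_outlier_scores(profiles, k=1):
    """Distance-based score: Euclidean distance to the k-th nearest other profile."""
    scores = {}
    for node, vec in profiles.items():
        dists = sorted(math.dist(vec, other) for name, other in profiles.items() if name != node)
        scores[node] = dists[k - 1]
    return scores


def explain(profiles, path_types, target):
    """One-rule stand-in for the classifier: the path type that best separates target from the rest."""
    best = None
    for i, path in enumerate(path_types):
        others = [vec[i] for name, vec in profiles.items() if name != target]
        gap_above = profiles[target][i] - max(others)
        gap_below = min(others) - profiles[target][i]
        gap, direction = max((gap_above, "larger"), (gap_below, "smaller"))
        if best is None or gap > best[0]:
            best = (gap, direction, path)
    _, direction, path = best
    return f"{target} has a {direction} share of {' -> '.join(path)} paths than every other node"


# Made-up network: node -> list of (link_type, target).
links = {
    "a1": [("writes", "p1"), ("writes", "p2")],
    "a2": [("writes", "p3"), ("writes", "p4")],
    "a3": [("writes", "p5"), ("reviews", "p1"), ("reviews", "p3")],
    "p1": [("cites", "p3")], "p2": [("cites", "p4")],
    "p3": [("cites", "p1")], "p4": [("cites", "p2")], "p5": [],
}
authors = ["a1", "a2", "a3"]

# Step 1: semantic features are the path types observed from the nodes of interest.
path_types = sorted({p for a in authors for p in path_type_counts(links, a)})
profiles = {a: semantic_profile(links, a, path_types) for a in authors}

# Step 2: the node whose profile lies farthest from its nearest neighbor is the most abnormal.
scores = knn_outlier_scores(profiles, k=1)
most_abnormal = max(scores, key=scores.get)

# Step 3: turn the separating feature into a sentence.
explanation = explain(profiles, path_types, most_abnormal)
print(most_abnormal, explanation)
```

Because the features are sequences of link types, the resulting explanation can be phrased directly in terms of relations in the network, which is what makes the output readable by a person.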

16 Contributions
– An unsupervised way to identify abnormal instances in an MRN
– Outperforms state-of-the-art algorithms by a large margin
– Generates understandable explanations
– Performs complex data analysis accurately and efficiently
– Generality and applicability

17 Q & A
Thanks for listening!