A simple method for multi-relational outlier detection. Sarah Riahi and Oliver Schulte, School of Computing Science, Simon Fraser University, Vancouver, Canada.


A simple method for multi-relational outlier detection. Sarah Riahi and Oliver Schulte, School of Computing Science, Simon Fraser University, Vancouver, Canada. With tools that you probably have around the house (or lab).

2/13 A simple method for multi-relational outlier detection

3/13 System Flow
Flach, P. A. (1999), 'Knowledge representation for inductive learning', in Symbolic and Quantitative Approaches to Reasoning and Uncertainty, Springer.
[Flow diagram: Complete Database + Model → Parameter Learning Algorithm → Population Parameter Values; restrict to target individual → Individual Profile + Model → Parameter Learning Algorithm → Individual Parameter Values; vector norm of the parameter difference → outlier score.]
Input: Model, database, target individual. Output: an outlier score.
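A minimal sketch of this flow in Python, assuming hypothetical helpers learn_parameters (estimates one parameter value per model formula from a data slice) and restrict (keeps only the facts involving the target individual); neither helper is from the paper:

```python
# Sketch of the system flow above (hypothetical helper names, not the authors' code).
import numpy as np

def outlier_score(model, database, target_individual, learn_parameters, restrict):
    """Input: model, database, target individual. Output: an outlier score."""
    # 1. Learn parameter values for the entire population.
    population_params = learn_parameters(database, model)
    # 2. Restrict the database to the target individual, then learn
    #    parameter values for that individual profile.
    individual_profile = restrict(database, target_individual)
    individual_params = learn_parameters(individual_profile, model)
    # 3. Outlier score = vector norm of the parameter difference
    #    (here the average L1 distance, i.e. the ELD score on slide 7).
    diff = np.asarray(population_params) - np.asarray(individual_params)
    return float(np.mean(np.abs(diff)))
```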

4/13 Example
Model = Markov Logic Network learned for a Premier League season.

Formulas                                                               | Estimated Population Parameters | Estimated Parameters for P = van Persie
SavesMade(P,M)=med AND shotsOnTarget(P,M)=low AND ShotEff(P,M)=low     | …                               | …
SavesMade(P,M)=med AND shotsOnTarget(P,M)=high AND ShotEff(P,M)=high   | …                               | …
… (331 formulas)
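As an illustration only, the per-formula parameters in a table like this could be read as satisfaction frequencies over (player, match) groundings, estimated once for the population and once for the target individual; the toy rows and resulting numbers below are invented, and the actual system may use learned weights rather than frequencies:

```python
# Toy data: one row per (player, match) grounding. Values are made up.
rows = [
    # (player, savesMade, shotsOnTarget, shotEff)
    ("van Persie", "med", "high", "high"),
    ("van Persie", "med", "high", "high"),
    ("van Persie", "med", "low",  "low"),
    ("Rooney",     "med", "low",  "low"),
    ("Rooney",     "med", "low",  "low"),
    ("Rooney",     "low", "low",  "low"),
]

def formula(row):
    # SavesMade(P,M)=med AND shotsOnTarget(P,M)=low AND ShotEff(P,M)=low
    _, saves, on_target, eff = row
    return saves == "med" and on_target == "low" and eff == "low"

def frequency(rows, pred):
    """Fraction of groundings that satisfy the formula."""
    return sum(pred(r) for r in rows) / len(rows) if rows else 0.0

population_param = frequency(rows, formula)                                       # 0.50
individual_param = frequency([r for r in rows if r[0] == "van Persie"], formula)  # ~0.33
```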

5/13 Evaluation: Synthetic Data
Two features. Designed so that outliers are easy to distinguish from normals (sanity check).
1. Normals have a strong correlation, outliers none.
2. Outliers have a strong correlation, normals none.
3. Correlations are the same, but marginals are very different.

6/13 Bayesian Network Representation
[Figure: two panels, (a) and (b), showing Bayesian networks over F1 and F2 for a normal individual (Striker, F1 = ShotEfficiency, F2 = Match_Result) and an outlier (Midfielder, F1 = TackleEfficiency, F2 = Match_Result). In the correlated networks the parameters are P(F1=1) = 50%, P(F2=0|F1=0) = 90%, P(F2=1|F1=1) = 90%; in the independent networks they are P(F1=1) = 50%, P(F2=1) = 50%.]
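A small sketch of sampling synthetic individuals from the two parameterizations above (correlated features for normals, independent features for outliers, as in scenario 1 of the previous slide); the sample sizes are arbitrary:

```python
# Sample binary features F1, F2 from the two CPT settings shown on this slide.
import numpy as np

rng = np.random.default_rng(0)

def sample_normal(n):
    """Striker model: F1 ~ Bernoulli(0.5), P(F2 = F1) = 0.9 (strong correlation)."""
    f1 = rng.random(n) < 0.5
    f2 = np.where(rng.random(n) < 0.9, f1, ~f1)
    return np.column_stack([f1, f2]).astype(int)

def sample_outlier(n):
    """Midfielder model: F1 and F2 independent Bernoulli(0.5) (no correlation)."""
    f1 = rng.random(n) < 0.5
    f2 = rng.random(n) < 0.5
    return np.column_stack([f1, f2]).astype(int)

normals, outliers = sample_normal(1000), sample_outlier(50)
```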

7/13 Results
AD = Breunig, M.; Kriegel, H.-P.; Ng, R. T. & Sander, J. (2000), 'LOF: Identifying Density-Based Local Outliers', in ACM SIGMOD.
LOG = Riahi, F.; Schulte, O. & Liang, Q. (2014), 'A Proposal for Statistical Outlier Detection in Relational Structures', AAAI StarAI Workshop on Statistical Relational AI.
Metric = Area Under Curve.
ELD = average L1-norm.
KLD = average difference.
AD = use single-feature marginals only (unit clauses).
LOG = outlier score = log-likelihood.
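A sketch of the evaluation protocol suggested by this slide: each method produces an outlier score per individual, and methods are compared by the area under the ROC curve against the known synthetic labels. The ELD helper and all score values below are illustrative, not the paper's numbers:

```python
# Compare outlier scores against ground-truth labels with AUC (scikit-learn).
import numpy as np
from sklearn.metrics import roc_auc_score

def eld_score(pop_params, ind_params):
    """ELD: average L1 distance between population and individual parameter vectors."""
    return float(np.mean(np.abs(np.asarray(pop_params) - np.asarray(ind_params))))

# labels: 1 = planted outlier, 0 = normal; scores: higher = more anomalous (made up).
labels = np.array([0, 0, 0, 1, 1])
scores = np.array([0.05, 0.08, 0.04, 0.31, 0.27])
print("AUC =", roc_auc_score(labels, scores))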

8/13 Case Study: Single Features
Which formulas/rules influence the outlier score the most? → interpretability.
Which unit clauses influence the outlier score the most?
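One way to answer this question, sketched below with invented formula names and parameter values: rank each unit clause by the absolute gap between its population parameter and its individual parameter, since that gap is its contribution to an L1-style outlier score:

```python
# Rank unit clauses by their contribution to the parameter-difference score.
# Clause names and parameter values are illustrative, not from the paper.
pop = {"ShotEff(P,M)=high": 0.30, "SavesMade(P,M)=med": 0.25, "TackleEff(P,M)=low": 0.40}
ind = {"ShotEff(P,M)=high": 0.62, "SavesMade(P,M)=med": 0.28, "TackleEff(P,M)=low": 0.35}

contributions = sorted(((abs(pop[f] - ind[f]), f) for f in pop), reverse=True)
for delta, clause in contributions:
    print(f"{clause}: |population - individual| = {delta:.2f}")
```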

9/13 Case Study: Correlations
Novak, P. K.; Webb, G. I. & Wrobel, S. (2009), 'Supervised descriptive rule discovery: A unifying survey of contrast set, emerging pattern and subgroup mining', Journal of Machine Learning Research.
Maervoet, J.; Vens, C.; Vanden Berghe, G.; Blockeel, H. & De Causmaecker, P. (2012), 'Outlier Detection in Relational Data: A Case Study', Expert Systems with Applications.
Which formulas/rules influence the outlier score the most? → interpretability.
Which associations influence the outlier score the most? Related to exception mining (Novak et al. 2009).

Individual  | Rule                                                           | Confidence (Individual) | Confidence (Class)
Edin Dzeko  | ShotEff = high AND TackleEff = medium → DribbleEff = low       | 50%                     | 38%
Van Persie  | ShotEff = high AND TimePlayed = high → ShotsOnTarget = high    | 70%                     | 50%

Confidence = conditional probability.
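A toy sketch of the confidence computation behind the table: the confidence of a rule body → head is the conditional probability of the head given the body, estimated once from the individual's matches and once from all matches of the individual's class; the rows below are invented:

```python
# Estimate rule confidence P(head | body) from toy (player, match) rows.
rows = [
    # (player, shotEff, timePlayed, shotsOnTarget)
    ("van Persie", "high", "high", "high"),
    ("van Persie", "high", "high", "low"),
    ("Rooney",     "high", "high", "low"),
    ("Rooney",     "low",  "high", "low"),
]

def confidence(rows, body, head):
    matching = [r for r in rows if body(r)]
    return sum(head(r) for r in matching) / len(matching) if matching else 0.0

body = lambda r: r[1] == "high" and r[2] == "high"   # ShotEff=high AND TimePlayed=high
head = lambda r: r[3] == "high"                      # ShotsOnTarget=high

conf_individual = confidence([r for r in rows if r[0] == "van Persie"], body, head)
conf_class = confidence(rows, body, head)
print(conf_individual, conf_class)
```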

10/13 Distribution Divergence Perspective
Halpern, J. Y. (1990), 'An analysis of first-order logics of probability', Artificial Intelligence Journal.
De Raedt, L. (2008), Logical and Relational Learning, Springer, Ch. 9.

Joint Value Assignments                                                | Frequency for Random Striker | Frequency for P = van Persie
SavesMade(P,M)=low AND shotsOnTarget(P,M)=low AND ShotEff(P,M)=low     | 22%                          | 10%
SavesMade(P,M)=low AND shotsOnTarget(P,M)=high AND ShotEff(P,M)=high   | 30%                          | 62%
…                                                                      | …                            | …

Outlier score = dissimilarity measure between the random individual and the target individual. In our work, the dissimilarity measure = distribution divergence. Other distance-type metrics could be leveraged as well.
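A minimal sketch of this divergence score, treating the two frequency columns as distributions over joint value assignments and computing a KL divergence (one possible choice of divergence); the remaining assignments are lumped into a single catch-all entry for illustration:

```python
# Outlier score as divergence between the individual's and the class's
# distributions over joint value assignments.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two frequency vectors over the same joint value assignments."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# Frequencies from the table, with all other assignments lumped together (illustrative).
van_persie = [0.10, 0.62, 0.28]
random_striker = [0.22, 0.30, 0.48]
print("outlier score =", kl_divergence(van_persie, random_striker))
```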

11/13 Propositionalization for Outlier Detection
Lippi, M.; Jaeger, M.; Frasconi, P. & Passerini, A. (2011), 'Relational information gain', Machine Learning 83(2), 219-239.

Players       | SavesMade(P,M)=med AND shotsOnTarget(P,M)=low AND ShotEff(P,M)=low | SavesMade(P,M)=med AND shotsOnTarget(P,M)=high AND ShotEff(P,M)=high | (331 more)
Wayne Rooney  | 13%                                                                 | 10%                                                                  | …
van Persie    | 50%                                                                 | 62%                                                                  | …
…

Construct a 331-dimensional attribute vector for each individual: one frequency/count value for each formula → a pseudo-i.i.d. data view (like n-grams). Apply standard single-table analysis methods. Could also use learned weights instead of sufficient statistics.
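A sketch of this propositionalized pipeline using scikit-learn's LOF (one of the methods cited on slide 7) as the standard single-table detector; only two formula columns and a handful of invented feature values are shown instead of the full 331-dimensional vectors:

```python
# Each individual becomes one row of per-formula frequencies; a standard
# single-table outlier detector is then applied to that pseudo-i.i.d. view.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

players = ["Wayne Rooney", "van Persie", "player C", "player D", "player E", "player F"]
X = np.array([
    [0.13, 0.10],
    [0.50, 0.62],   # van Persie stands out on both formula frequencies
    [0.15, 0.12],
    [0.11, 0.09],
    [0.14, 0.13],
    [0.12, 0.11],
])

lof = LocalOutlierFactor(n_neighbors=3)
lof.fit_predict(X)                       # -1 = outlier, 1 = inlier
scores = -lof.negative_outlier_factor_   # higher = more anomalous
for player, s in sorted(zip(players, scores), key=lambda t: -t[1]):
    print(f"{player}: LOF score = {s:.2f}")
```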

12/13 Propositionalization Results
LowCor = normals have low correlation. HighCor = normals have high correlation.

13/13 Summary
Outlier detection based on a statistical-relational model. Basic idea: compare an individual's profile to the entire population.
Leverage parameter learning:
1. Learn parameter values for the individual.
2. Learn parameter values for the entire population.
3. Outlier score = parameter vector difference, e.g. average L1-distance.
Leverage relational distance between individuals. In our work, distance ≈ distribution divergence; outlier score = divergence between the individual distribution and the population distribution.
Another approach: model-based propositionalization for outlier detection. Attribute values = frequency counts for patterns in the model structure.