Patterns extraction from process executions

Presentation transcript:

Patterns extraction from process executions
19 Feb 2015
Laura Genga

Outline
Introduction
Approach
  Building Instance Graphs
  Patterns Extraction
Experiments
  BPI2013 challenge
  CoseLog
Conclusion and future works

Introduction
Many real-world domains are characterized by processes with little structure
Typical process discovery approaches have problems when dealing with such processes: they produce «spaghetti» models

Spaghetti processes analysis
Schema simplification
Trace clustering
Patterns discovery

Patterns discovery
Existing approaches: mining on traces
  Patterns abstraction, e.g. P1: <Start,b,c,d,g>, P2: <e,f,h>, …
  Episodes discovery (Figure: episodes P1, P2, P3)

Proposed Approach: Mining on Graphs
Event Log → Instance Graphs Set → Patterns Set

Case Id   Trace
1         <Start,b,c,d,g,End>
2         <Start,a,b,d,c,g,i,End>
3         <Start,a,e,f,h,i,End>
4         <Start,b,c,d,g,e,f,h,End>

Building the Instance Graphs (IGs) set
Example trace: abcdgi
The parallelism is hidden in the trace
We need to know the causal relations between events
Use of process discovery approaches

Deriving causal relations from process discovery outcome
The causal relations (CR) set can be derived by means of some process discovery approach
The mining techniques must be chosen carefully
(Table: CR set with Source and Target columns over activities A, B, C, D, E, F, …; e.g. A→B, A→E)

Instance Graphs building
For each pair of events e_i, e_j for which e_i → e_j holds, add an edge if, in the trace between e_i and e_j, there are:
  (1) no successors of e_i, OR
  (2) no predecessors of e_j
(Table: CR set with Source and Target columns over activities A, B, I, C, D, G, K)
Example, T1: a b i c d g k
  a→b: (1) ok
  a→i: (1) no, (2) ok
  b→c: (1) ok
  b→d: (1) no, (2) ok
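The edge rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_instance_graph` is made up here, and the CR set below is hypothetical, since the CR table on the slide did not survive extraction. It was chosen so that the checks listed for T1 (for example a→i failing condition 1 but passing condition 2) come out the same way.

```python
def build_instance_graph(trace, causal_relations):
    """Return the edges of the instance graph of one trace.

    trace: list of activity labels, in order of occurrence.
    causal_relations: set of (source, target) activity pairs (the CR set).
    Edges are (i, j) position pairs, so repeated activities stay distinct.
    """
    edges = set()
    for i, source in enumerate(trace):
        for j in range(i + 1, len(trace)):
            target = trace[j]
            if (source, target) not in causal_relations:
                continue
            between = trace[i + 1:j]
            # (1) no event between e_i and e_j is a successor of e_i
            no_successor_of_source = all(
                (source, x) not in causal_relations for x in between)
            # (2) no event between e_i and e_j is a predecessor of e_j
            no_predecessor_of_target = all(
                (x, target) not in causal_relations for x in between)
            if no_successor_of_source or no_predecessor_of_target:
                edges.add((i, j))
    return edges


# Hypothetical CR set (an assumption, not taken from the slides).
CR = {("a", "b"), ("a", "i"), ("b", "c"), ("b", "d"),
      ("c", "g"), ("d", "g"), ("g", "k"), ("i", "k")}

trace = list("abicdgk")  # T1 from the slide
edges = build_instance_graph(trace, CR)
labelled_edges = sorted((trace[i], trace[j]) for i, j in edges)
print(labelled_edges)
```

With this CR set, c and d both hang off b and both lead to g, so the parallelism that the flat trace hides becomes visible in the graph.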

Flower models problem
Representing all possible behaviors can generate a flower model
Using a flower model we obtain only sequence graphs
Look only for the most frequent relations
Some traces will then turn out to be “anomalous”, e.g. t1: <Start,a,e,f,h,End>
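To make "look only for the most frequent relations" concrete, here is a rough sketch that keeps the directly-follows pairs seen in at least a given fraction of the traces. Directly-follows frequency is used here only as a crude stand-in for a causal relation (it is not how the Inductive Miner derives the model in the experiments), and the names `frequent_relations` and `min_fraction` are invented for this example. The log is the four-trace event log shown earlier in the deck.

```python
from collections import Counter


def frequent_relations(log, min_fraction):
    """Keep directly-follows pairs that occur in at least
    min_fraction of the traces (each trace counted once per pair)."""
    counts = Counter()
    for trace in log:
        # A set, so a pair repeated inside one trace is counted once.
        counts.update(set(zip(trace, trace[1:])))
    return {pair for pair, n in counts.items()
            if n / len(log) >= min_fraction}


log = [
    ["Start", "b", "c", "d", "g", "End"],
    ["Start", "a", "b", "d", "c", "g", "i", "End"],
    ["Start", "a", "e", "f", "h", "i", "End"],
    ["Start", "b", "c", "d", "g", "e", "f", "h", "End"],
]

kept = frequent_relations(log, min_fraction=0.5)
print(sorted(kept))
```

Pairs that fall below the threshold, such as (a, e) in the third trace, are dropped, which is why the traces containing them show up as anomalous with respect to the filtered relations.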

Graphs with anomalies
t1: bacdg
t2: afehi
(Figure: CR table with Source and Target columns, and the instance graphs of t1 and t2 over activities A, B, E, C, D, G, K, F, H, I)

Use of conformance checking techniques
Conformance checking techniques provide precise information about the occurrence of an anomaly in a trace
The corresponding graph explicitly represents the anomaly occurrence:
  insertion
  deletion

Updated graphs with anomalies
t1: bacdg
t2: afehi
(Figure: CR table with Source and Target columns, and the updated graphs of t1 and t2 over activities A, B, E, C, D, G, K, F, H, I)

Proposed Approach: Mining on Graphs
Event Log → Instance Graphs Set → Patterns Set

Patterns extraction
Frequent subgraph mining techniques
  Extraction of subgraphs whose “support” is above a threshold
Support of a subgraph
  Transaction-based (TB): number of graphs involving the subgraph
  Occurrence-based (OB): number of occurrences of the subgraph
Example: TB supp: 2, OB supp: 3
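The difference between the two support notions can be shown with a small brute-force sketch. Everything here is an assumption made for illustration: the dictionary-based graph encoding, the function names, and the three toy graphs (the figure behind "TB supp: 2, OB supp: 3" was lost, so the graphs were simply chosen to give the same two numbers). An occurrence is taken to be a distinct label-preserving embedding of the pattern; real miners may count overlapping occurrences differently, and the exhaustive search is only workable for very small graphs.

```python
from itertools import permutations


def find_occurrences(pattern, graph):
    """Return the distinct embeddings of pattern in graph.

    Both are dicts: {"labels": {node: label}, "edges": {(u, v), ...}}.
    Brute force over node assignments; fine for tiny examples only.
    """
    pattern_nodes = list(pattern["labels"])
    found = set()
    for image in permutations(graph["labels"], len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        if any(pattern["labels"][p] != graph["labels"][g]
               for p, g in mapping.items()):
            continue
        mapped_edges = {(mapping[u], mapping[v]) for u, v in pattern["edges"]}
        if mapped_edges <= graph["edges"]:
            # Node set + edge set identify an occurrence, so symmetric
            # mappings of the same subgraph are not counted twice.
            found.add((frozenset(image), frozenset(mapped_edges)))
    return found


def transaction_based_support(pattern, graphs):
    """Number of graphs that contain the pattern at least once."""
    return sum(1 for g in graphs if find_occurrences(pattern, g))


def occurrence_based_support(pattern, graphs):
    """Total number of occurrences of the pattern over all graphs."""
    return sum(len(find_occurrences(pattern, g)) for g in graphs)


pattern = {"labels": {0: "a", 1: "b"}, "edges": {(0, 1)}}
graphs = [
    {"labels": {0: "a", 1: "b", 2: "a", 3: "b"}, "edges": {(0, 1), (2, 3)}},
    {"labels": {0: "a", 1: "b", 2: "c"}, "edges": {(0, 1), (1, 2)}},
    {"labels": {0: "a", 1: "c"}, "edges": {(0, 1)}},
]

tb = transaction_based_support(pattern, graphs)
ob = occurrence_based_support(pattern, graphs)
print(tb, ob)
```

The first toy graph contains the pattern twice, so it adds one to the transaction-based count and two to the occurrence-based count.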

SUBDUE Algorithm
Support computed by using frequency and size
Discovered patterns are arranged into a hierarchy

Experiments
Two experiments:
  Log of BPI2013 (Incident Management)
  Wabo4 (CoseLog project)
CR set derived by the Inductive Miner algorithm
Patterns evaluation:
  Support: transaction-based
  Domain knowledge (DK)

BPI2013: Model mined by IMi

SUB1
Supp: 47%
DK: the event “queued + awaiting assignment” is undesired

SUB7
Supp: 19%
DK: high rate of incident management delegation

SUB12
Supp: 8%
DK: this should be the “ideal” activities order

Wabo4: Process Model Mined by IMi

SUB1
Supp: 41%
DK: Starting activities of an application management

SUB12
Supp: 16%
DK: Final part of an application management. Unexpected parallelism

SUB3
Supp: 21%
DK: can be considered a meaningful sub-process

Summing up
The proposed method was able to detect interesting patterns, providing an alternative way to analyze complex, spaghetti processes
The method is flexible; it can be used with any process discovery / frequent subgraph mining technique
Limits:
  The reliability of the results depends on the process discovery approach adopted
  The pattern interpretation support can be improved

Future works
Improving the pattern interpretation support
  Providing each pattern together with its context
Adding a performance evaluation based on patterns
  Single pattern evaluation: average costs, throughput time, …
  Analyzing the pattern impact on the overall process performance

Thank you for your attention!