Multi-phase process mining

Presentation transcript:

Multi-phase process mining: EPC representations of process instances, and the aggregation of those into one overall specification.

Outline: Introduction to Process Mining; Introduction to process performance monitoring; Using process mining to deploy a monitoring system; Extending; Conclusion.

Overview: Process Mining. 1) basic performance metrics, 2) control flow rediscovery, 3) organizational model, 4) social network, 5) performance characteristics, 6) auditing/security. If … then … Next: Process Log Requirements (Staffware example). www.processmining.org

Log Files: Information systems typically log all kinds of events. We use an XML format for storing event logs. The basic assumption is that the log contains information about specific tasks executed for specific process instances (cases, event lists, audit trails). We do not assume any knowledge of the underlying process.

Log-based ordering relations: If there is a process instance (trace, event list) in which two functions A and B follow each other directly, we say A > B. Causal relation: A → B if and only if A > B and not B > A. Parallel relation: A || B if and only if A > B and B > A. Note: if two tasks A and B are in a loop of length two (i.e. there is a trace that contains A B A), then A → B and B → A, and not A || B.
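The definitions above can be turned into a few lines of code. The following is a minimal sketch, not the authors' implementation: the function name `ordering_relations` and the representation of a log as a list of tuples are assumptions made for illustration. The example log is the three-case log used in the later slides.

```python
def ordering_relations(log):
    """Return (follows, causal, parallel) for a log given as a list of traces."""
    follows = set()      # (A, B) means A > B: A is directly followed by B somewhere
    short_loops = set()  # (A, B) means some trace contains A B A or B A B
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            follows.add((a, b))
        for a, b, c in zip(trace, trace[1:], trace[2:]):
            if a == c and a != b:
                short_loops.add((a, b))
                short_loops.add((b, a))

    causal, parallel = set(), set()
    for a, b in follows:
        if (a, b) in short_loops:
            causal.add((a, b))    # loop of length two: causal in both directions
        elif (b, a) in follows:
            parallel.add((a, b))  # A > B and B > A
        else:
            causal.add((a, b))    # A > B and not B > A
    return follows, causal, parallel


# Hypothetical encoding of the three cases A,B,C,D,E / A,C,B,D,E / A,D,E.
log = [("A", "B", "C", "D", "E"), ("A", "C", "B", "D", "E"), ("A", "D", "E")]
follows, causal, parallel = ordering_relations(log)
print(sorted(causal))
print(sorted(parallel))
```

For this log, B and C end up parallel because both B > C and C > B are observed, while every other directly-follows pair becomes a causal relation.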

Process Mining, the α-algorithm: Using the → and || relations, we construct places for the resulting Petri net, for example {E} → {G, F}, {F} → {C}, {F} → {D}, {B, D} → {T}. [Figure: Petri net over the tasks S, A, B, C, D, E, F, G, T.]

The problems of the α-algorithm: Known issues of the α-algorithm:
- It requires its input to come from a certain class of models.
- Tasks that can be skipped in the original process (AC, ABC).
- Tasks that appear twice in the same process, under the same name.
- Tasks that do not appear in the log, but belong in the process (invisible tasks).
- Errors in the input data (e.g. one trace containing AB versus one million traces containing BA).

Calculating causal relations: Assume we have the following cases: A,B,C,D,E; A,C,B,D,E; A,D,E. This leads to these relations: A>B, B>C, C>D, D>E, A>C, C>B, B>D, A>D. And then to: A → B, A → C, A → D, C → D, B → D, D → E. Using the causal relation, we construct a general base graph over the tasks A, B, C, D, E for all cases.

Creating instance graphs: For each instance, we walk through the base graph and make an instance graph for each case. [Figures: one instance graph for the process instances A,B,C,D,E and A,C,B,D,E; one instance graph for the process instance A,D,E.]
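The slide does not spell out the rule used while walking through the base graph. The sketch below assumes one plausible rule: two events in a trace are connected when their tasks are causally related and the later event is the closest causal successor of the earlier one, or the earlier event is the closest causal predecessor of the later one. The function name `instance_graph` and this rule are assumptions for illustration, and the code presumes that no task occurs twice in a trace.

```python
def instance_graph(trace, causal):
    """Build the edges of an instance graph for one trace.

    Assumed rule (not stated on the slide): connect positions i < j when
    trace[i] -> trace[j] is causal and no event in between is already a
    causal successor of trace[i], or none is a causal predecessor of trace[j].
    """
    edges = set()
    for i in range(len(trace)):
        for j in range(i + 1, len(trace)):
            if (trace[i], trace[j]) not in causal:
                continue
            between = range(i + 1, j)
            closest_successor = all((trace[i], trace[k]) not in causal for k in between)
            closest_predecessor = all((trace[k], trace[j]) not in causal for k in between)
            if closest_successor or closest_predecessor:
                edges.add((trace[i], trace[j]))
    return edges


# Causal relations of the running example: A->B, A->C, A->D, C->D, B->D, D->E.
causal = {("A", "B"), ("A", "C"), ("A", "D"), ("C", "D"), ("B", "D"), ("D", "E")}

graph_abcde = instance_graph(("A", "B", "C", "D", "E"), causal)
graph_acbde = instance_graph(("A", "C", "B", "D", "E"), causal)
graph_ade = instance_graph(("A", "D", "E"), causal)
print(sorted(graph_abcde))
print(sorted(graph_ade))
```

Under this assumed rule, the cases A,B,C,D,E and A,C,B,D,E yield the same instance graph, and the direct edge from A to D only appears for the case A,D,E, which agrees with the grouping of cases on the slide.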

Converting to EPCs: Each instance graph is then converted into a human-readable format, such as an EPC. [Figures: EPC for the process instances A,B,C,D,E and A,C,B,D,E; EPC for the process instance A,D,E.]

Aggregating instance graphs: Assume we have a set of instance graphs: 2x the graph over A, B, C, D, E and 1x the graph over A, D, E. Can we aggregate them into one model? Our log file: A,B,C,D,E; A,C,B,D,E; A,D,E.

Aggregating instance graphs: First, we project the instances onto themselves, to distinguish between “task executions” and “tasks that were executed”. In the projections every node carries the count 1: A1, B1, C1, D1, E1 for the first graph and A1, D1, E1 for the second. Our log file: A,B,C,D,E; A,C,B,D,E; A,D,E.

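Aggregating instance graphs: Combine the graphs based on node labels, then decide on the split/join type. The aggregated node counts are A3, B2, C2, D3, E3. [Figure: aggregation graph over A, B, C, D, E with counts on nodes and edges.] Our log file: A,B,C,D,E; A,C,B,D,E; A,D,E.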
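Combining graphs on node labels amounts to counting how often each node and each edge occurs across the instance graphs. The sketch below illustrates that counting step only; the function name `aggregate` and the representation of an instance graph as a pair (set of nodes, set of edges) are assumptions, and the edge sets are hard-coded from the running example.

```python
from collections import Counter


def aggregate(instance_graphs):
    """Sum node and edge occurrences over a list of (nodes, edges) pairs."""
    node_counts, edge_counts = Counter(), Counter()
    for nodes, edges in instance_graphs:
        node_counts.update(nodes)  # each node counts once per instance graph
        edge_counts.update(edges)  # each edge counts once per instance graph
    return node_counts, edge_counts


# Instance graph shared by the cases A,B,C,D,E and A,C,B,D,E (occurs 2x).
full = ({"A", "B", "C", "D", "E"},
        {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")})
# Instance graph of the case A,D,E (occurs 1x).
short = ({"A", "D", "E"}, {("A", "D"), ("D", "E")})

node_counts, edge_counts = aggregate([full, full, short])
print(dict(node_counts))
print(dict(edge_counts))
```

The node counts reproduce the labels A3, B2, C2, D3, E3 from the slide. The edge counts are what the later split/join decision compares against the node counts.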

Aggregating instance graphs: What type of split/join to choose? For a node A with count x and edges with counts a, b and c: x = a = b = c implies AND; x = a + b + c implies XOR; otherwise OR.
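The rule on this slide is a small decision function. The sketch below uses a hypothetical name, `connector_type`, and takes the node count together with the counts on its outgoing edges (for a split) or incoming edges (for a join). It assumes there are at least two edges, since a single edge needs no connector.

```python
def connector_type(node_count, edge_counts):
    """Pick the connector type from a node count and its edge counts."""
    if all(count == node_count for count in edge_counts):
        return "AND"  # every branch is taken each time the node occurs
    if sum(edge_counts) == node_count:
        return "XOR"  # exactly one branch is taken each time
    return "OR"       # anything in between


# Node A in the running example: count 3, outgoing edge counts 2, 2 and 1.
split_after_a = connector_type(3, [2, 2, 1])
print(split_after_a)
print(connector_type(3, [3, 3, 3]))
print(connector_type(3, [1, 1, 1]))
```

For node A in the three-case log, the counts are neither all equal to 3 nor summing to 3, so the rule yields an OR split.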

Conversion to EPC: Conversion to an EPC is done in the same way as for instance graphs. [Figure: the aggregation graph over A, B, C, D, E becomes an EPC.] Our log file: A,B,C,D,E; A,C,B,D,E; A,D,E.

Example: Consider the following instance graphs: (a) over the tasks A, B, C, D, E, G, H, I; (b) over the tasks A, B, C, D, E; (c) over the tasks A, B, C, D, F.

Example: That will lead to the following instance graph projections. [Figures (a), (b), (c): projections of the three instance graphs, with nodes labelled s, t and f alongside the tasks and counts on nodes and edges.]

Example: Which results in this aggregation graph. [Figure: aggregation graph over the tasks A, B, C, D, E, F, G, H, I and the nodes s, t, f, with counts on nodes and edges.]

Example (Not implemented yet)

Comparison to α-algorithm:
- The α-algorithm requires its input to come from a certain class of models. The multi-phase system does not have this requirement. The result is always an executable specification.
- The α-algorithm cannot deal with tasks that can be skipped (AC, ABC). The multi-phase system deals with this easily, using connectors (invisible steps).
The following issues still stand:
- Tasks that appear twice in the same process, under the same name.
- Tasks that do not appear in the log, but belong in the process (invisible tasks).
- Errors in the input data (e.g. one trace containing AB versus one million traces containing BA).

Conclusions: The multi-phase system is an improvement over the α-algorithm. It improves the practical applicability of the mining research, since it can be used as an interface between ARIS PPM and the ProM framework. Further research is needed to deal with duplicate tasks, invisible tasks and noisy log files.