Jorge Munoz-Gama Advisor: Josep Carmona December 2014 CONFORMANCE CHECKING AND DIAGNOSIS IN PROCESS MINING.

Presentation transcript:

Jorge Munoz-Gama Advisor: Josep Carmona December 2014 CONFORMANCE CHECKING AND DIAGNOSIS IN PROCESS MINING

PRECISION DECOMPOSITION CONFORMANCE CHECKING CONCLUSIONS


Conformance Checking in a Nutshell. [Figure labels: MODEL, REALITY, PROCESS, DOMAIN EXPERTS]

Biased Vision

Conformance Checking in a Nutshell. [Figure labels: MODEL, REALITY, PROCESS, LOGS, DOMAIN EXPERTS]

Conformance Checking in a Nutshell. [Figure labels: MODEL, REALITY, PROCESS, LOGS]

Structure and Outline
Structure of the Presentation: Problem, Context, Contributions
Outline of the Presentation:
Precision: Precision based on the Log; Qualitative Analysis of Precision Checking; Precision based on Alignments
Fitness Decomposition: Decomposed Conformance Checking; Topological Conformance Diagnosis; Data-aware Decomposed Conformance Checking; Event-based Real-time Decomposed Conformance Checking

PRECISION DECOMPOSITION CONFORMANCE CHECKING CONCLUSIONS

Precision: Precision based on the Log

Problem: “Low Criticality Diagnosis” Process. [Figure labels: Hospital; Hospital Staff; Process-aware Information System; “Low Criticality Diagnosis” Process Model with activities Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care]

Problem: “Low Criticality Diagnosis” Process. [Figure: two activity sequences. (1) Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care. (2) Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Home Care.]

Problem: “Low Criticality Diagnosis” Process. [Figure: three activity sequences. (1) Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Home Care. (2) Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care. (3) Initial Examination, Radiology Test, …]

Context: The Importance of Precision. A good model must be fitting, but it must also be precise.

Context: Efficient and Comprehensive Approach to measure precision. Based on potential points of improvement. Does not require an exhaustive exploration of the model state space (previous works require model exploration or simulation). Identifies precision problems with a fine granularity. Produces results for analysis and process improvement.

Contributions: Precision based on Escaping Arcs. Exploring the model's behavior is costly, possibly infinite, or requires simulation. [Figure labels: MODEL BEHAVIOR, LOG BEHAVIOR]

Contributions: Precision based on Escaping Arcs. The traversal of the model behavior is restricted by the log behavior. Escaping arcs are the points where the model allows more behavior than that observed in the log. [Figure labels: LOG BEHAVIOR, ESCAPING ARC]

Contributions: Outline of Precision based on Escaping Arcs. Steps: Observed Behavior (from the log), Modeled Behavior (from the model), Compute Precision. Outputs: ETC Precision (etcp) and Minimal Imprecise Traces. Log traces: ⟨a, b, d, g, i⟩, ⟨a, c, d, e, f, h, i⟩, ⟨a, c, e, d, f, h, i⟩, ⟨a, c, e, f, d, h, i⟩. Model activities: a, b, c, d, e, f, h, g, i. Minimal imprecise traces: ⟨a, c, f⟩, ⟨a, c, d, f⟩, ⟨a, c, d, e, e⟩, ⟨a, c, e, d, e⟩, ⟨a, c, e, e⟩.

Contributions: Observed Behavior. Log traces: ⟨a, b, d, g, i⟩, ⟨a, c, d, e, f, h, i⟩, ⟨a, c, e, d, f, h, i⟩, ⟨a, c, e, f, d, h, i⟩. [Figure: prefix automaton built from the log one trace at a time, starting with ⟨a, b, d, g, i⟩; each state is annotated with the number of traces that reach it]


Contributions: Modeled Behavior. [Figure: the automaton of observed behavior extended with the arcs that the model allows; model activities: a, b, c, d, e, f, h, g, i]


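Contributions: Compute Precision. For each observed state $s$ of the automaton we take into account its weight $\omega(s)$, its observed arcs, and its allowed arcs $A(s)$; the escaping arcs $E(s)$ are the allowed arcs that are not observed: $etc_p = 1 - \frac{\sum_{s} \omega(s) \cdot |E(s)|}{\sum_{s} \omega(s) \cdot |A(s)|}$

Contributions: Computing Precision. A state with weight 4, 0 escaping arcs, and 2 allowed arcs contributes: $etc_p = 1 - \frac{\dots + 4 \cdot 0 + \dots}{\dots + 4 \cdot 2 + \dots}$

Contributions: Computing Precision. A state with weight 1, 1 escaping arc, and 2 allowed arcs contributes: $etc_p = 1 - \frac{\dots + 1 \cdot 1 + \dots}{\dots + 1 \cdot 2 + \dots}$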

Contributions: Challenges Addressed. Precision based on escaping arcs does not require a complete exploration of the model behavior; the exploration is restricted to the behavior observed in the log. Escaping arcs pinpoint the situations that need to be fixed to achieve a completely precise system. Imprecisions are collected in the form of an event log, the Minimal Imprecise Log: ⟨a, c, f⟩, ⟨a, c, d, f⟩, ⟨a, c, d, e, e⟩, ⟨a, c, e, d, e⟩, ⟨a, c, e, e⟩.
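As an illustration of the escaping-arcs idea, the following is a minimal sketch, not the thesis implementation. It assumes the log fits the model and that the model is available through a hypothetical callback `allowed(prefix)` returning the activities enabled after a prefix; `toy_model` and the three-trace log are invented for the example. States are log prefixes, the weight of a state is the number of traces reaching it, and precision is one minus the weighted share of escaping arcs among allowed arcs.

```python
from collections import defaultdict


def etc_precision(log, allowed):
    """Return (precision, escaping) for a log replayed as a prefix automaton.

    log     -- list of traces, each a tuple of activity names
    allowed -- hypothetical callback: prefix (tuple) -> set of activities
               the model enables after replaying that prefix
    """
    weight = defaultdict(int)    # state (prefix) -> number of traces reaching it
    observed = defaultdict(set)  # state (prefix) -> activities the log takes next
    for trace in log:
        for i in range(len(trace) + 1):
            prefix = tuple(trace[:i])
            weight[prefix] += 1
            if i < len(trace):
                observed[prefix].add(trace[i])

    escaping = {}
    numerator = denominator = 0
    for state, w in weight.items():
        enabled = allowed(state)
        escaped = enabled - observed[state]  # allowed by the model, never seen in the log
        if escaped:
            escaping[state] = escaped
        numerator += w * len(escaped)
        denominator += w * len(enabled)
    precision = 1.0 if denominator == 0 else 1 - numerator / denominator
    return precision, escaping


def toy_model(prefix):
    """Toy model (an assumption for this example): a, then b or c, then d."""
    table = {(): {"a"}, ("a",): {"b", "c"}, ("a", "b"): {"d"}, ("a", "c"): {"d"}}
    return table.get(prefix, set())


log = [("a", "b", "d")] * 3
precision, escaping = etc_precision(log, toy_model)
# A minimal imprecise trace is an observed prefix followed by one escaping activity.
minimal_imprecise_traces = [
    state + (act,) for state, acts in escaping.items() for act in sorted(acts)
]
print(precision, minimal_imprecise_traces)
```

Only the states that the log visits are ever queried, which is the point made on the slide: the model is never explored beyond the observed behavior.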

Precision: Qualitative Analysis of Precision Checking

Problem: The Effects of Exceptional Behavior. [Figure: the model activities Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care, next to the activity sequence Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Home Care]

Problem: Variability of Precision in the Future. [Figure: timeline from present to future; ETC Precision is 0.81 at the current moment and unknown (?) in the close future and in the far future]

Problem: Limited Resources and Imprecision Points. [Figure labels: Hospital Process, Imprecision Points, Limited Analysts and Resources]

Context: Robustness, Confidence and Severity. Make precision based on escaping arcs more robust to exceptional behavior. Estimate the possible variability of the metric in the future. Assess the severity of imprecision points and compare them.

Contributions: Robustness on Escaping Arcs. [Figure: automaton of observed and modeled behavior; additional branch labels: i, h, f, e]

Contributions: Robustness on Escaping Arcs. A threshold parameter cuts exceptional behavior. The threshold is parametric: a high cut factor keeps only the main behavior; a low cut factor removes only extreme cases. The cut is local-context, not global-context.
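A rough sketch of a local-context cut follows. The exact rule is an assumption made for illustration: an observed arc is kept only if its frequency is at least `gamma` times the weight of its source state, so the decision depends on the local state rather than on the size of the whole log. The function name `cut_exceptional` and the sample log are hypothetical.

```python
from collections import Counter


def cut_exceptional(log, gamma):
    """Split observed arcs into (kept, cut) using a local threshold.

    An arc (state, activity) is kept when its frequency is at least
    gamma * weight(state); this rule is an assumption for illustration.
    """
    state_weight = Counter()  # prefix -> number of traces reaching it
    arc_weight = Counter()    # (prefix, activity) -> number of traces taking the arc
    for trace in log:
        for i in range(len(trace) + 1):
            prefix = tuple(trace[:i])
            state_weight[prefix] += 1
            if i < len(trace):
                arc_weight[(prefix, trace[i])] += 1

    kept, cut = set(), set()
    for (state, activity), w in arc_weight.items():
        if w >= gamma * state_weight[state]:
            kept.add((state, activity))
        else:
            cut.add((state, activity))
    return kept, cut


# Nine traces follow a-b, one exceptional trace follows a-c.
sample_log = [("a", "b")] * 9 + [("a", "c")]
kept_high, cut_high = cut_exceptional(sample_log, gamma=0.2)
kept_low, cut_low = cut_exceptional(sample_log, gamma=0.05)
print(cut_high, cut_low)
```

With the higher cut factor the rare branch after `a` is treated as exceptional; with the lower one it is kept as regular behavior.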

Contributions: Robustness on Escaping Arcs. [Figure: automaton of observed and modeled behavior; additional branch labels i, h, f, e shown twice]

Contributions: Confidence on Escaping Arcs Metric. [Figure labels: log, K, Low Confidence, High Confidence]

Contributions: Upper Estimation of Precision. Example with K = 3. Best scenario = covering escaping arcs. [Figure: automaton of observed and modeled behavior]

Contributions: Upper Estimation of Precision. This is an optimization problem: cover escaping arcs with the given k so as to maximize the metric. Cost of covering an escaping arc: the number of traces needed to overpass the threshold. Gain of covering an escaping arc: the weight of the state. A BIP formulation yields the upper estimation.
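The cost/gain description above has the shape of a 0/1 knapsack problem. The sketch below is a simplification under stated assumptions, not the BIP formulation of the thesis: it maximizes the total gain directly instead of the precision metric, it enumerates subsets by brute force, and the arc names, costs, and gains are invented.

```python
from itertools import combinations


def best_cover(arcs, k):
    """Pick escaping arcs to cover with at most k new traces.

    arcs -- list of (name, cost, gain): cost is the number of traces needed
            to pass the threshold, gain is the weight of the state
    Returns (total_gain, names_of_chosen_arcs). Brute force, small inputs only.
    """
    best_gain, best_subset = 0, ()
    for size in range(1, len(arcs) + 1):
        for subset in combinations(arcs, size):
            cost = sum(arc[1] for arc in subset)
            gain = sum(arc[2] for arc in subset)
            if cost <= k and gain > best_gain:
                best_gain, best_subset = gain, subset
    return best_gain, [arc[0] for arc in best_subset]


# Hypothetical escaping arcs: (name, cost, gain).
candidate_arcs = [("x", 1, 4), ("y", 2, 3), ("z", 3, 5)]
gain, chosen = best_cover(candidate_arcs, k=3)
print(gain, chosen)
```

For realistic sizes an integer programming solver would replace the enumeration; the brute force is only there to make the objective and the budget constraint explicit.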

Contributions: Lower Estimation of Precision. Example with K = 1. Worst scenario = new escaping arcs. This yields the lower estimation. [Figure labels: avg, A-1]

Contributions: Severity of the Escaping Arcs. Are all imprecisions equally important? Severity is subjective and multifactor: Weight, Alternation, Stability, Criticality. [Figure: escaping arcs rated severe, mid, or low]

Contributions: Weight of an Escaping Arc. Escaping arcs in parts of the process with more weight are more severe.

Contributions: Alternation of an Escaping Arc. More chances to make a mistake: more severe.

Contributions: Stability of an Escaping Arc. Apply a perturbation: increase the number of instances at that point, proportionally to the current number of occurrences. Measure how easy it is to overpass the threshold. An imprecision that is stable under perturbation is more severe.

Contributions: Criticality of an Escaping Arc. Depends on the importance of the task involved in the escaping arc. [Figure labels: Check Date Format, Bank Transfer]

Contributions: Challenges Addressed. Robustness of the precision based on escaping arcs. Confidence interval for the precision metric. Severity assessment of the precision problems.

Precision: Precision based on Alignments

Problem: Precision in Unfitting Scenarios. Perfect fitness is uncommon in real life. [Figure: the model activities Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care, next to the activity sequence Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Home Care]

Contributions: Unfitting Observed Behavior. [Figure labels: Log Trace, Model Behavior, ?]

Problem: Effects of Fitness on Precision based on the Log. Example trace: ⟨a, a, b, b, d⟩. Which state does the model reach when the trace does not fit? One option is not to consider the unfitting part, but then the position of the fitting problem influences the precision.

Context: Precision Independent of Fitness. Unfitting scenarios are common in real life, so precision should be independent of fitness. The proposed precision is not based directly on the log but on a pre-alignment between the observed behavior and the modeled behavior.

Context: Aligning Observed and Modeled Behavior. [Figure labels: Log Trace, Model Behavior]

Context: Aligning Observed and Modeled Behavior. Find the closest model trace in the model behavior for a given log trace. Works from a global perspective. Able to deal with unfitting behavior. Optimality is guaranteed. Time-consuming problem, based on A* search algorithms. Reference: Adriansyah, A.: Aligning Observed and Modeled Behavior. PhD Thesis. Eindhoven University of Technology. 2014

Contributions: Precision based on Alignments. Steps: Alignments, Observed Behavior, Modeled Behavior, Compute Precision. Outputs: ETC Precision (etcp) and Minimal Imprecise Traces.

Contributions: Aligning Observed and Modeled Behavior. [Figure: Process Model over activities a, b, c, d; Log Trace ⟨a, d, a, b⟩ and its Alignment]

Contributions: Aligning Observed and Modeled Behavior. Log trace ⟨a, d, a, b⟩ is aligned to ⟨a, a, b, d⟩, the fitting trace closest to the original. Deviations appear in the alignment as log moves and model moves.

Contributions: Aligning Observed and Modeled Behavior. Log trace ⟨a, d⟩ can be aligned to ⟨a, b, d⟩ (Alignment 1) or to ⟨a, c, d⟩ (Alignment 2). Both alignments are optimal.
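The last example can be reproduced with a small sketch. It is a simplification under stated assumptions: the model behavior is given as a finite list of model traces, every log move and every model move costs 1, and the cost is computed through the longest common subsequence rather than through the A*-based search mentioned earlier. The function names are hypothetical.

```python
def alignment_cost(log_trace, model_trace):
    """Minimum number of log moves plus model moves between two traces.

    With unit costs this equals len(log) + len(model) - 2 * LCS(log, model).
    """
    n, m = len(log_trace), len(model_trace)
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            if log_trace[i] == model_trace[j]:
                lcs[i + 1][j + 1] = lcs[i][j] + 1  # synchronous move
            else:
                lcs[i + 1][j + 1] = max(lcs[i][j + 1], lcs[i + 1][j])
    return n + m - 2 * lcs[n][m]


def optimal_alignments(log_trace, model_traces):
    """Return (best_cost, sorted list of model traces achieving it)."""
    costs = {mt: alignment_cost(log_trace, mt) for mt in model_traces}
    best = min(costs.values())
    return best, sorted(mt for mt, cost in costs.items() if cost == best)


# Traces are written as strings, one character per activity.
best_cost, closest = optimal_alignments("ad", ["abd", "acd"])
print(best_cost, closest)
```

Both model traces are reached with a single model move, which is why a precision metric can be built either on one optimal alignment per trace or on all of them.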

Contributions: Precision based on Alignments. Steps: Alignments, Observed Behavior, Modeled Behavior, Compute Precision with a new weight function. Outputs: ETC Precision (etcp) and Minimal Imprecise Traces.

Contributions: Observed Behavior from 1-Alignment. Event log: ⟨a, a, b, d⟩, ⟨a, b, d⟩, ⟨a, d, a, b⟩, ⟨a, d⟩. Fitting traces listed: ⟨a, a, b, d⟩, ⟨a, b, d⟩, ⟨a, a, b, d⟩, ⟨a, b, d⟩, ⟨a, c, d⟩. [Figure: Process Model over a, b, c, d and the automaton built from the fitting traces]

Contributions: Observed Behavior from All-Alignment. Event log: ⟨a, a, b, d⟩, ⟨a, b, d⟩, ⟨a, d, a, b⟩, ⟨a, d⟩. Fitting traces listed: ⟨a, a, b, d⟩, ⟨a, b, d⟩, ⟨a, a, b, d⟩, ⟨a, b, d⟩, ⟨a, c, d⟩. [Figure: Process Model over a, b, c, d and the automaton built from the fitting traces]


Contributions: Challenges Addressed. Precision based on alignments. Precision for unfitting cases. Precision independent of fitness. Precision based on 1-alignment or all-alignments.

Contributions: Extensions to Precision based on Alignments. Extensions to represent the modeled behavior: use of representative alignments; multi-sets to represent automaton states; backwards use of the alignments. [Figure: example with traces ⟨a, c, d, e⟩ and ⟨b, c, d, e⟩]

PRECISION DECOMPOSITION CONFORMANCE CHECKING CONCLUSIONS

Fitness Decomposition: Decomposed Conformance Checking

Problem: Fitness in Large Models. [Figures: the model activities Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care with the activity sequence Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Home Care, followed by three figure-only slides]

Context: Fast, Comprehensible and Guaranteed. Decompose the fitness checking problem. Comprehensible decomposition and understandable diagnosis results. Formal guarantees: the decomposition preserves the fitness, i.e., there is a fitness problem in the original net iff there is a fitness problem in one or more of the components. Fast compared to the monolithic approach.

Contributions: Alignment Fitness Checking. [Figure labels: Log Trace, Model Behavior]

Contributions: Decomposing Alignment Fitness Checking. [Figure labels: Log Trace, Model Behavior]

Contributions: Decomposition based on Graphs. The decomposition is based on Single-Entry Single-Exit components (SESE) and the Refined Process Structure Tree (RPST). [Figure: net with transitions t1 to t7] References: Artem Polyvyanyy: Structuring Process Models. PhD Thesis. University of Potsdam (Germany), January 2012. Hopcroft, J., Tarjan, R.E.: Dividing a graph into triconnected components. SIAM J. Comput. 2(3), 1973

Contributions: Interior, Boundary, Entry, and Exit nodes. Entry node: a boundary node of the component such that none of its incoming edges, or all of its outgoing edges, belong to the component. Exit node: a boundary node of the component such that none of its outgoing edges, or all of its incoming edges, belong to the component.

Contributions: SESE and RPST. SESE: a set of edges whose graph has a single entry node and a single exit node. Refined Process Structure Tree (RPST): a tree containing non-overlapping SESEs; it is unique, modular, and computable in linear time. [Figure: example of SESE and RPST]

Contributions: SESE and RPST. Why SESE? Only one entry and only one exit; SESEs represent subprocesses within the process; intuitive for conformance diagnosis. Why RPST? Partitioning is done over the RPST; any cut is a partitioning; there is an algorithm to partition by size (k).

Contributions: Preserving the Fitness. Does a decomposition based on SESEs preserve the fitness? Fitness preservation: a model/log is perfectly fitting if and only if all the components are perfectly fitting.

Contributions: SESE Decomposition does not Preserve Fitness. SESEs (per se) do not preserve fitness. [Figure: net S with transitions a, b, c, d, e, f, decomposed into components S1 and S2 that share place p] With 0 tokens in p: for abcdef, S2 is blocked. With 1 token in p: abcdef fits S but not S2. With 2 tokens in p: abdecf fits S1 and S2 but not S.

Contributions: Valid Decomposition. The problem is in the shared places: they have no reflection in the log, and therefore there is no synchronization. Valid decomposition: a partition where only transitions are shared among components, neither places nor arcs. Theorem: a valid decomposition preserves the fitness, i.e., there is a fitness problem in the original net iff there is a fitness problem in one or more of the components. Reference: W.M.P. van der Aalst: Decomposing Petri nets for process mining: A generic approach. Distributed and Parallel Databases, 2013
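The sharing condition of a valid decomposition can be checked mechanically. The sketch below is only an illustration: components are plain dictionaries with hypothetical place and arc names, and the check covers only the "no shared places, no shared arcs" part of the definition, not whether the components together cover the original net.

```python
from itertools import combinations


def shares_only_transitions(components):
    """True if no two components share a place or an arc.

    components -- list of dicts with keys "places" and "arcs" (sets);
                  transitions may be shared freely.
    """
    for first, second in combinations(components, 2):
        if first["places"] & second["places"]:
            return False
        if first["arcs"] & second["arcs"]:
            return False
    return True


# Hypothetical components that both contain place p: not a valid decomposition.
s1 = {"places": {"p1", "p"}, "arcs": {("a", "p1"), ("p1", "b"), ("b", "p")}}
s2 = {"places": {"p", "p2"}, "arcs": {("p", "d"), ("d", "p2"), ("p2", "f")}}

# The same net with place p moved into its own component: only the
# transitions b and d are now shared between neighbours.
s1_prime = {"places": {"p1"}, "arcs": {("a", "p1"), ("p1", "b")}}
bridge = {"places": {"p"}, "arcs": {("b", "p"), ("p", "d")}}
s2_prime = {"places": {"p2"}, "arcs": {("d", "p2"), ("p2", "f")}}

print(shares_only_transitions([s1, s2]))
print(shares_only_transitions([s1_prime, bridge, s2_prime]))
```

Under the theorem above, once the check passes, the original net has a fitness problem exactly when at least one component has one, so per-component results can be combined with a simple "any".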

Contributions: Bridging a SESE Decomposition. Create a ‘bridge’ for each shared place. Notice that the result is not a SESE anymore. [Figure labels: S1’, S2’, bridge B1; transitions a, b, c, d, e, f; place p]

Contributions: SESE + Bridging. Theorem: SESE decomposition with bridging is a valid decomposition. Theorem: SESE decomposition with bridging post-processing preserves the fitness.

Contributions: Decomposition Fitness Results. Monolithic: 1h 15min. Decomposition(7): 2min.

Fitness Decomposition: Topological Conformance Diagnosis

Problem: Locate Fitness Problems in Large Models

Context: Problematic Components. Do more than just report the list of model components with fitness problems. Provide a structure among the components. Visualize the structure of the decomposition. Use the structure to detect highly related conflictive components.

Contributions: Topological Fitness Checking. [Figure labels: Non-Fitting (Weakly) Connected Components, Non-Fitting Subnet]
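Grouping related non-fitting components can be sketched as a connected-components search. The representation is an assumption made for illustration: the topology is a list of edges between component names (for instance, components that share a transition), edge direction is ignored to obtain weak connectivity, and the component names are invented.

```python
def nonfitting_groups(edges, nonfitting):
    """Group non-fitting components that are (weakly) connected to each other.

    edges      -- iterable of (component, component) pairs from the topology
    nonfitting -- set of components reported as having fitness problems
    Returns a list of sorted groups, ordered by their first member.
    """
    # Keep only edges between two non-fitting components, ignoring direction.
    adjacency = {component: set() for component in nonfitting}
    for source, target in edges:
        if source in nonfitting and target in nonfitting:
            adjacency[source].add(target)
            adjacency[target].add(source)

    seen, groups = set(), []
    for start in sorted(nonfitting):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first search over non-fitting neighbours
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(adjacency[node] - seen)
        groups.append(sorted(group))
    return groups


topology = [("S1", "S2"), ("S2", "S3"), ("S3", "S4"), ("S4", "S5")]
groups = nonfitting_groups(topology, {"S1", "S2", "S4"})
print(groups)
```

Each group points the analyst to one region of the model, instead of a flat list of failing components.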


Fitness Decomposition: Data-aware Decomposed Conformance Checking

Problem: Fitness in Data-aware Models. [Figure: the model activities Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care, next to the activity sequence Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Home Care]

Problem: Fitness in Data-aware Models. [Figure: the same model with data labels tests and diagnosis; activity sequence with data: Initial Examination, Allergy Test - FAIL, Blood Test - PASS, Radiology Test - PASS, Diagnosis - HOME, Home Care]

Problem: Fitness in Data-aware Models. Large medical data-aware models.

Context: Data-aware Conformance Checking. Existing techniques for data-aware fitness checking are time-consuming: they are based on A* (control flow) + ILP (data). Decompose the data-aware fitness problem. Meaningful decomposition and diagnostic results. Formal guarantees on the fitness correctness. Fast compared with the monolithic approach.

Contributions: Valid Decomposition of Data-aware Models. Shared places can be out of synchronization during the fitness checking (no synchronization). Valid decompositions (no places or arcs shared) preserve the fitness. [Figure: net with transitions t1 to t7 split at a shared place p]

Contributions: Valid Decomposition of Data-aware Models. Theorem: a valid decomposition of Petri nets with data (no shared places, arcs, or data variables) preserves the fitness. Details in the thesis. [Figure: components t1 to t4 and t4 to t7, each with its own data; no synchronization]

Contributions: Valid Decomposition of Data-aware Models. Petri nets with data are graphs. Decomposition based on SESEs for comprehensive results. [Figure: components t1 to t4 and t4 to t7, each with its own data]

Contributions: Valid Decomposition of Data-aware Models. Improvement in the control flow + improvement in the data. [Plot: average computation time (s) against average number of events per event-log trace] Real case: Dutch municipality. From seconds to 52 seconds (99%)

Fitness Decomposition: Event-based Real-time Decomposed Conformance Checking

Problem: Real-life Monitoring of Hospital Processes. [Figure labels: Hospital Processes running, Large Process Model, Process-aware Monitoring System, Conformance Reports, Conformance Alarms]

Context: Event-based, Fast, and Comprehensible. Architecture for real-life fitness monitoring of large process models. Based on events, not on complete traces. Real time requires time efficiency. Comprehensive results as part of the monitoring procedure.

Contributions: Event-based Real-time Decomposed Fitness. [Figure labels: Decomposed Model, Stream of Events]

Contributions: Decomposition based on SESE

Contributions: Event-based Real-time Decomposed Fitness. Heuristic replay: faster compared with alignments; consequences of bad decisions are limited to the fragment; event based; not optimal, but heuristic. Uses a look-ahead heuristic. [Figure: replay of log trace ⟨a, c, f⟩ on a model with activities a, b, c, d, e, f]
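The event-based idea can be sketched as follows. This is a strong simplification and not the replay technique of the thesis: each fragment is a small deterministic transition table instead of a Petri net fragment, there is no look-ahead heuristic, and the fragment names and tables are invented. Each incoming event is dispatched only to the fragments whose alphabet contains its activity, so a deviation is reported per fragment as soon as the event arrives.

```python
class Fragment:
    """A model fragment as a deterministic transition table (an assumption)."""

    def __init__(self, name, start, delta):
        self.name = name
        self.state = start
        self.delta = delta  # {(state, activity): next_state}
        self.alphabet = {activity for (_, activity) in delta}


def monitor(fragments, stream):
    """Replay a stream of activities event by event; return the alarms raised.

    An alarm is (position_in_stream, activity, fragment_name).
    """
    alarms = []
    for position, activity in enumerate(stream):
        for fragment in fragments:
            if activity not in fragment.alphabet:
                continue  # the event is invisible to this fragment
            next_state = fragment.delta.get((fragment.state, activity))
            if next_state is None:
                alarms.append((position, activity, fragment.name))
            else:
                fragment.state = next_state
    return alarms


def make_fragments():
    """Two hypothetical fragments sharing the activities b and c."""
    f1 = Fragment("F1", 0, {(0, "a"): 1, (1, "b"): 2, (1, "c"): 2})
    f2 = Fragment("F2", 0, {(0, "b"): 1, (0, "c"): 1, (1, "f"): 2})
    return [f1, f2]


print(monitor(make_fragments(), ["a", "c", "f"]))
print(monitor(make_fragments(), ["a", "f"]))
```

Because every fragment keeps its own state, a wrong replay decision in one fragment cannot corrupt the state of another, which mirrors the slide's remark that the consequences of bad decisions are limited to the fragment.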

Contributions: Example of Real-time Decomposed Fitness

PRECISION DECOMPOSITION CONFORMANCE CHECKING CONCLUSIONS

Contributions of the Thesis
Precision:
- Approach to quantify and analyze the precision between a log and a model based on escaping arcs.
- Robustness and confidence interval for precision based on escaping arcs.
- Severity assessment of the imprecision points detected.
- Precision checking based on aligning observed and modeled behavior.
- Abstraction and directionality in precision based on alignments.
Fitness Decomposition:
- Decomposed conformance checking based on SESE components.
- Hierarchical and topological decomposition based on SESE components for conformance diagnosis.
- Decomposed conformance checking for data-aware models.
- Decomposed conformance checking for real-time scenarios.

Publications of the Thesis (Precision)
- Jorge Munoz-Gama, Josep Carmona: A Fresh Look at Precision in Process Conformance. BPM 2010 – pp
- Jorge Munoz-Gama, Josep Carmona: Enhancing Precision in Process Conformance: Stability, Confidence and Severity. CIDM 2011 – pp
- Jorge Munoz-Gama, Josep Carmona: A General Framework for Precision Checking. Journal of Innovative Computing, Information and Control – vol.8 no.7B
- Arya Adriansyah, Jorge Munoz-Gama, Josep Carmona, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Alignment Based Precision Checking. BPM Workshops 2012 – pp
- Arya Adriansyah, Jorge Munoz-Gama, Josep Carmona, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Measuring Precision of Modeled Behavior. Information Systems and e-Business Management

Publications of the Thesis (Decomposition)
- Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst: Conformance Checking in the Large: Partitioning and Topology. BPM 2013 – pp – Best Student Paper Award
- Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst: Hierarchical Conformance Checking of Process Models Based on Event Logs. Petri Nets 2013 – pp
- Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst: Single-Entry Single-Exit Decomposed Conformance Checking. Information Systems – vol.46 pp
- Massimiliano de Leoni, Jorge Munoz-Gama, Josep Carmona, Wil M. P. van der Aalst: Decomposing Conformance Checking on Petri Nets with Data. CoopIS 2014 – pp
- Seppe K.L.M. vanden Broucke, Jorge Munoz-Gama, Josep Carmona, Bart Baesens, Jan Vanthienen: Event-based Real-time Decomposed Conformance Analysis. CoopIS 2014 – pp

Impact of the Thesis
Published in international journals and international conferences. Best Student Paper Award at BPM 2013 (acceptance rate 14%). Extensively used in the field: 150 citations.
Used to:
- measure precision and fitness in models
- evaluate discovery algorithms
- guide discovery techniques based on genetic algorithms
- CoBeFra framework
- train recommender systems

Directions for Future Work
- New metrics, new dimensions
- Decomposed alignment of observed and modeled behavior
- Decomposed conformance for other dimensions
- Visualization and diagnosis
- Model repair

Thesis and Acknowledgements. More details in: … and to all the people that made this work possible, THANKS!


Backup Slides

Contributions: Precision based on Escaping Arcs. Escaping arcs are the points where the model allows more behavior than that observed in the log.


Problem: “Low Criticality Diagnosis” Process. [Figure labels: “Low Criticality Diagnosis” Process Model with activities Initial Examination, Allergy Test, Blood Test, Radiology Test, Diagnosis, Hospital Treatment, Home Care; Process Simulation Software; Simulation Results]