Download presentation
Presentation is loading. Please wait.
Published byDania January Modified over 9 years ago
1
CONFORMANCE CHECKING IN THE LARGE: PARTITIONING AND TOPOLOGY Jorge Munoz-Gama, Josep Carmona and Wil M.P. van der Aalst
2
Motivation 2 THEORYREALITY REALITY REFLECTION LOGS
3
BPM 2013 (Tuesday) BPI Workshop 2013 Community Motivates Itself 3 Process Mining Conformance Checking Conformance Dimensions Alignment based Conformance Big Data Conformance Checking Applications Decomposition RPST BPM 2013 (Wednesday)
4
Conformance General Idea 4 model Log trace mismatches
5
Conformance in a Nutshell 5 Log Model A B C D E A B B C Alignment E FitnessPrecision How much behavior of the log is captured by the model? How accurate is the model describing the log? Conformance mismatch on the Log Conformance mismatch on the Model
6
Conformance in the Large How easy is to diagnose a conformance problem here? How much time it takes? 6
7
General Idea: Decomposition 7
8
The 4 Challenges Comprehensible Guaranties Fast Diagnosis 8 ?
9
SESE and RPST for decomposing 9 SESERPST Single Entry Single Exit components Refined Process Structure Tree * Artem Polyvyanyy: Structuring Process Models. PhD Thesis. University of Potsdam (Germany), January 2012 * Hopcroft, J., Tarjan, R.E.: Dividing a graph into triconnected components. SIAM J. Com- put. 2(3), 1973 Based on graph decomposition
10
Interior, Boundary, Entry, and Exit nodes Given a subgraph and a node of it: Interior node: connected only to nodes of the subgraph. Boundary node: not interior Entry node: boundary where no incoming edge in subgraph or all outgoing edges in Exit node: boundary where no outgoing edge in subgraph or all incoming edges in 10
11
Nodes examples 11
12
Structural Decomposition 12 transition place token A B D E F C
13
Example of SESE and RPST 13 SESE: set of edges which graph has a Single Entry node and a Single Exit node Refined Process Structure Tree (RPST) containing non overlapping SESEs Unique Modular Linear Time
14
Why SESE and RPST? Why SESE? Only one entry; only one exit Represent subprocesses within the process Intuitive for conformance diagnosis Why RPST? Partitioning over the RPST Any cut is a partitioning Algorithm to partitioning by size (k) 14
15
Why SESE and RPST? Why SESE? Only one entry; only one exit Represent subprocesses within the process Intuitive for conformance diagnosis Why RPST? Partitioning over the RPST Any cut is a partitioning Algorithm to partitioning by size (k) 15 K<5 16 4 8 44
16
The 4 Challenges Comprehensible Guaranties Fast Diagnosis 16 ?
17
Properties of the Partitioning What about the guaranties in conformance checking? Decomposed Perfectly Fitting Checking: A model/log is perfectly fitting if and only if all the components are perfectly fitting 17 * W.M.P. van der Aalst : Decomposing Petri nets for process mining: A generic approach. Distributed and Parallel Databases, 2013
18
SESE and Decomposed Perfectly Fitting SESEs (per se) do not satisfy the Decomposed Perfectly Fitting Checking property 18 1 token in p => abcdef fits S but not S2 2 tokens in p => abdecf fits S1 and S2 but not S
19
Valid Decomposition The problem is in the boundary places No reflection on the log A partition with only transitions shared among components (no places neither arcs) Transitions have reflect on the log Use that reflection to sync the components This is known as a valid decomposition Details in: 19 * W.M.P. van der Aalst : Decomposing Petri nets for process mining: A generic approach. Distributed and Parallel Databases, 2013 * J. Munoz-Gama, J. Carmona, and W.M.P. van der Aalst : Conformance checking in the large: partitioning and topology. BPM 2013
20
SESE to Valid Decomposition Create a ‘bridge’ for each shared place 20
21
Results 21 SESEs and Bridges Few with non negligible size
22
SESE + Bridging Theorem Theorem: SESE decomposition with Bridging post- processing satisfies the Decomposed Perfectly Fitting Checking 22
23
The 4 Challenges Comprehensible Guaranties Fast Diagnosis 23 ?
24
Results 1 Net – 1h 15min 7 Subnets – 2min 24
25
Results 25 Better performance with large, concurrency, long traces and concentrated conformance problems Not always faster: short traces, fitting. Overhead of the decomposition Results even in cases that the original approach can handle
26
The 4 Challenges Comprehensible Guaranties Fast Diagnosis 26 ?
27
Results 27 Conformance problems spread Conformance problems concentrated
28
Topology 28 S1 S4 S2 S3 B1 B2 S7S8 S5 S6
29
Topology Enhancement 29 S1 S4 S2 S3 B1 B2 S7S8 S5 S6 t1,t3,t4,t5,t7,t7,t9
30
Topological Diagnosis Algorithms Non-Fitting (Weakly) Connected Components 30 S1 S4 S2 S3 B1 B2 S7S8 S5 S6 Non-Fitting Subnet
31
Topological Diagnosis in Large 31
32
Topological Results 32 From almost 700 nodes … … to 70
33
The 4 Challenges Comprehensible Guaranties Fast Diagnosis 33
34
Future Work Estimate fitness Not decisional but metric Divide-and-Conquer Alignment Algorithms Reconstruct the alignment New decompositions Less trivial components New conformance dimensions Precision, generalization, … 34
35
Conclusions Partitioning Technique for Conformance Checking Based on SESE and RPST May be faster, distributed, and help on the diagnosis Topology for diagnosis Implemented in ProM framework. 35
36
Thank You
37
Process Mining in a Nutshell 37 THEORYREALITY
38
Process Mining in a Nutshell 38 THEORYREALITY REALITY REFLECTION LOGS
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.