Published by Antony Hodge. Modified over 9 years ago.
1
Jianmin Wang (1), Shaoxu Song (1), Xuemin Lin (2), Xiaochen Zhu (1), Jian Pei (3)
(1) Tsinghua University, China
(2) University of New South Wales, Australia
(3) Simon Fraser University, Canada
ICDE 2015
2
Motivation, Exact Algorithm, Approximation, Experiments, Conclusion
3
Information systems record the business history in their event logs.
Huge amount of event data:
  Corporation Products   No. of Event Traces
  Power Generator        1,230,000
  Machinery              3,260,000
  Train                  2,600,000
  Event  Name              Operator  Successor
  t1     submit            M. Liu    F. Kang
  t2     design            F. Kang   J. Zhe & O. Chu
  t3     insulation proof  J. Zhe    X. Feng
  t4     check inventory   O. Chu    X. Feng
  t5     evaluate          X. Feng   System2
  t6     archive           System2   -------
4
Structural information does exist among events.
Task passing relationships:
  Event  Name              Operator  Successor
  t1     submit            A         B
  t2     design            B         C & D
  t3     insulation proof  C         E
  t4     check inventory   D         E
  t5     evaluate          E         F
  t6     archive           F         -------
[Figure: the structured event log drawn as an execution graph, distinguishing human tasks from service tasks. Nodes: submit, design, insulation proof, check inventory, evaluate, archive.]
5
Business events often follow certain business rules or constraints.
Process specification: execution constraints expressed by a Petri net:
  Sequence
  Parallel
  Choice
[Figure: the process specification as a Petri net. Places: start, a, b, c, d, e, f, g, h, s, end. Transitions: submit, design, insulation proof, electrician proof, check inventory, evaluate, revise, proof check, merge, re-evaluate, archive. Marked constructs: XOR split, XOR join, AND split, AND join.]
[Figure: three executions that follow the specification. Their transitions are:
  submit, design, insulation proof, check inventory, evaluate, archive
  submit, design, electrician proof, check inventory, evaluate, archive
  submit, revise, proof check, merge, re-evaluate, archive]
6
Representing an execution as a Causal Net (a Petri net without XOR).
[Figure: the process specification (places start, a, b, c, d, e, f, g, h, s, end) next to the causal net of one execution (places p0 to p7, transitions t1 to t6).]
Mapping of the causal net to the specification:
  Places: p0: start, p1: a, p2: b, p3: c, p4: d, p5: e, p6: s, p7: end
  Transitions: t1: submit, t2: design, t3: insulation proof, t4: check inventory, t5: evaluate, t6: archive
7
Two types of dirty event data, according to the specification:
Inconsistent Labeling:
  Places: p0: start, p1: a, p2: b, p3: c, p4: d, p5: e, p6: s, p7: end
  Transitions: t1: submit, t2: revise, t3: proof, t4: --------, t5: evaluate, t6: archive
Unsound Structure:
  Places: p0: start, p1: a, p2: b, p3: end
  Transitions: t1: submit, t2: design, t3: archive
[Figure: the process specification (places start, a, b, c, d, e, f, g, h, s, end), with the transitions check inventory, electrician proof, insulation proof, revise, proof check, merge and re-evaluate highlighted next to the dirty labels t2: revise, t3: proof, t4: --------.]
8
The causes of dirty events:
  Man-made errors (typos);
  System failures (power down).
Survey at a bus manufacturer:
  82% of executions are dirty;
  77.62% show inconsistent labeling, 4.45% show unsound structure.
Dirty event data may:
  Return wrong provenance answers;
  Mislead aggregation profiling;
  Obstruct finding interesting process patterns.
9
Inconsistent Labeling:
  1. Find all consistent mappings.
  2. Choose the one with the minimum repairing cost.
  [Figure: two labelings of the same causal net (places p0: start, p1: a, p2: b, p3: c, p4: d, p5: e, p6: s, p7: end; transitions t1: submit, t2: design, t4: check inventory, t5: evaluate, t6: archive), one with t3: electrician proof and one with t3: insulation proof.]
Unsound Structure:
  No valid repair is found.
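The two steps for inconsistent labeling can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the slides do not define the repair cost, so the sketch assumes it is the Levenshtein edit distance between the recorded and the repaired event name, and it assumes the consistent labelings have already been enumerated. The names edit_distance, repair_cost and min_cost_repair are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (assumed cost model)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def repair_cost(observed, candidate):
    """Total cost of relabeling every event of the trace."""
    return sum(edit_distance(o, c) for o, c in zip(observed, candidate))


def min_cost_repair(observed, consistent_labelings):
    """Step 2: among the consistent labelings, pick the cheapest one."""
    return min(consistent_labelings, key=lambda c: repair_cost(observed, c))


# Toy trace with one misspelled event name (hypothetical data).
observed = ["submit", "desgin", "insulation proof"]
candidates = [
    ["submit", "design", "insulation proof"],
    ["submit", "design", "electrician proof"],
]
best = min_cost_repair(observed, candidates)
```

Enumerating every consistent labeling up front is exponential in general, which is what motivates the branch and bound search later in the talk.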
10
Hardness: owing to choices and parallelization of flows, there is a vast number of possible repairs.
Existing methods:
  Event Log Alignment [1]: does not exploit structural information.
  Graph Repair [2]: does not consider AND and XOR constraints.
  Event  Name
  t1     submit
  t2     do revise
  t3     proof
  t4     -----------
  t5     evaluate
  t6     archive
  Event  Name
  t1     submit
  t2     revise
  t3     proof check
  t4     merge
  t5     re-evaluate
  t6     archive
  Event  Name              Operator  Successor
  t1     submit            A         B
  t2     design            B         C & D
  t3     insulation proof  C         E
  t4     check inventory   D         E
  t5     evaluate          E         F
  t6     archive           F         -------
[1] M. de Leoni, F. M. Maggi, and W. M. P. van der Aalst. Aligning event logs and declarative process models for conformance checking. In BPM, pages 82–97, 2012.
[2] S. Song, H. Cheng, J. X. Yu, and L. Chen. Repairing vertex labels under neighborhood constraints. PVLDB, 7(11):987–998, 2014.
11
Motivation, Exact Algorithm, Approximation, Experiments, Conclusion
12
Branch: try all the possible repairs, branching at each XOR split according to the specification.
Lower Bound: simple bound = current repair cost.
[Figure: the search tree, with the bound of each node:
  bound=0   t1:submit
  bound=6   t1:submit, t2:design
  bound=3   t1:submit, t2:revise
  bound=17  t1:submit, t2:design, t3:electrician proof
  bound=16  t1:submit, t2:design, t3:insulation proof
  bound=30  t1:submit, t2:design, t3:insulation proof, t4:check inventory
  bound=30  t1:submit, t2:design, t3:insulation proof, t4:check inventory, t5:evaluate
  cost=30   t1:submit, t2:design, t3:insulation proof, t4:check inventory, t5:evaluate, t6:archive
  bound=31  t1:submit, t2:design, t3:electrician proof, t4:check inventory
  invalid   t1:submit, t2:revise]
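A minimal sketch of this branch and bound scheme with the simple bound is given below. It makes simplifying assumptions that the slides do not: the trace is treated as a plain sequence (no parallel flows), the specification is given as a hypothetical function next_labels(prefix) returning the labels allowed after a prefix (more than one label at an XOR split), and the per-event cost function is supplied by the caller.

```python
import math


def branch_and_bound(observed, next_labels, cost):
    """Return (minimum cost, repaired labels), or (inf, None) if no valid repair exists."""
    best_cost, best_repair = math.inf, None

    def expand(prefix, bound):
        nonlocal best_cost, best_repair
        # Simple bound: the cost already paid can only grow, so a branch
        # that is no cheaper than the best complete repair is cut off.
        if bound >= best_cost:
            return
        if len(prefix) == len(observed):
            best_cost, best_repair = bound, list(prefix)
            return
        # Branch on every label the specification allows at this point.
        for label in next_labels(prefix):
            step = cost(observed[len(prefix)], label)
            expand(prefix + (label,), bound + step)

    expand((), 0)
    return best_cost, best_repair


# Toy specification with one XOR split after "submit" (hypothetical data).
spec = {
    (): ["submit"],
    ("submit",): ["design", "revise"],
    ("submit", "design"): ["insulation proof", "electrician proof"],
    ("submit", "revise"): ["proof check"],
}
unit_cost = lambda seen, label: 0 if seen == label else 1
result = branch_and_bound(["submit", "revise", "proof"],
                          lambda prefix: spec.get(prefix, []),
                          unit_cost)
```

A branch for which next_labels returns nothing before the trace is fully labeled simply dies out; if every branch does, the function reports that no valid repair was found.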
13
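Pruning Rule: a branch is invalid when
  the longest path length in the causal net < the shortest path length in the specification.
[Figure: a process specification (places start, a, b, c, d, end; transitions A, B, C, D, E, F) and a causal net (places p0: start, p1: a, p2, p3; transitions t1: A, t2: C, t3). The slide compares path lengths of 2 transitions and 4 transitions and marks the branch Invalid.]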
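The pruning test can be sketched as below, under assumptions of my own: both nets are given as plain adjacency dictionaries, path length is counted in edges, and the causal net is acyclic (as a causal net should be). The function names are hypothetical.

```python
from collections import deque
from functools import lru_cache


def shortest_path_len(graph, src, dst):
    """Breadth-first search; returns None if dst is unreachable."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def longest_path_len(dag, src, dst):
    """Longest path in an acyclic graph via memoized depth-first search."""
    @lru_cache(maxsize=None)
    def go(node):
        if node == dst:
            return 0
        lengths = [go(nxt) for nxt in dag.get(node, ())]
        lengths = [n for n in lengths if n is not None]
        return max(lengths) + 1 if lengths else None

    return go(src)


def can_prune(causal, c_src, c_dst, spec, s_src, s_dst):
    """True when the causal net is too short to fit the specification."""
    longest = longest_path_len(causal, c_src, c_dst)
    shortest = shortest_path_len(spec, s_src, s_dst)
    if longest is None or shortest is None:
        return False  # make no pruning claim when a path is missing
    return longest < shortest
```

For instance, a causal net whose longest path has 2 steps cannot be mapped onto a specification whose shortest path has 4 steps, so that branch can be discarded without expanding it.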
14
Example: to estimate a lower bound for t1:submit (naïve bound=0), given the causal net
  Places: p0: start, p1: a, p2: b, p3: c, p4: d, p5: e, p6: s, p7: end
  Transitions: t1: submit, t2: do revise, t3: proof, t4: -------, t5: evaluate, t6: archive
1. Build a conflict graph, where w(t) is the minimum cost over all possible repairs of t:
   w(t2)=3, w(t3)=5, w(t4)=5, w(t5)=0
2. Remove edges (with their vertices) until the conflict graph becomes empty:
   remove (t2, t3) and (t4, t5)
3. For each removed edge, add the minimum w(t) on the edge to the lower bound:
   Advanced Bound = min{w(t2), w(t3)} + min{w(t4), w(t5)} = 3 + 0 = 3
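Steps 2 and 3 can be sketched as follows, using the numbers from the slide. The function name advanced_bound and the choice to process edges in the order given are assumptions; the slide does not say in which order edges are removed.

```python
def advanced_bound(weights, conflict_edges):
    """Lower bound from a conflict graph.

    weights: dict mapping each transition to w(t).
    conflict_edges: iterable of (u, v) pairs of conflicting transitions.
    """
    remaining = set(weights)
    bound = 0
    for u, v in conflict_edges:
        # An edge is still in the graph only if both end points are.
        if u in remaining and v in remaining:
            bound += min(weights[u], weights[v])
            # Removing the edge also removes its two vertices, and with
            # them every other edge that touches either vertex.
            remaining -= {u, v}
    return bound


# Values taken from the slide.
w = {"t2": 3, "t3": 5, "t4": 5, "t5": 0}
edges = [("t2", "t3"), ("t4", "t5")]
bound = advanced_bound(w, edges)
```

Because the removed edges never share a vertex, each contributes an independent minimum cost, so the sum stays a valid lower bound whichever order is used; the order only affects how tight the bound is.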
15
Motivation, Exact Algorithm, Approximation, Experiments, Conclusion
16
Heuristic: pass through the causal net from the start to the end only once, determining the mapping for each place and transition.
1. The start place in the causal net is mapped to the start place in the specification.
2. Candidates for the mapping of transition t_k: the places in pre(t_k) have already been determined; choose candidates that introduce no inconsistency on pre(t_k).
3. Choose the candidate that introduces less inconsistency on post(t_k).
[Figure: a causal net fragment (p2: b, p4: d, t2: design, t3) and the matching specification fragment (b, d, design, electrician proof, insulation proof). Candidates for t3: t3:insulation proof, t3:electrician proof.]
The One Pass algorithm may report false positive unsound structure!
17
Motivation, Exact Algorithm, Approximation, Experiments, Conclusion
18
Real Life Data Set:
  Employed from a bus manufacturer:
    Places in Process Specification       24
    Transitions in Process Specification  22
    No. of Event Traces                   4722
    Maximum size of pre/post set          3
  Employed from a telecom company:
    Places in Process Specification       31
    Transitions in Process Specification  32
    No. of Event Traces                   1040
    Maximum size of pre/post set          3
Setting:
  We randomly change event names in execution traces as faults;
  then apply the repair methods to modify the execution trace (find a new mapping).
Criteria: to evaluate the accuracy of recovery, the F-measure of precision and recall.
Baseline: Event Log Alignment and Graph Repair.
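The slide does not spell out how precision and recall are counted, so the sketch below rests on an assumption: precision is the share of the method's repairs that restore the original name, and recall is the share of injected faults that are restored. The function name f_measure and the dictionary layout (event id to name) are hypothetical.

```python
def f_measure(truth, found):
    """Harmonic mean of precision and recall over event-name repairs.

    truth: dict event id -> original name, for every injected fault.
    found: dict event id -> name chosen by the repair method.
    """
    correct = sum(1 for event, name in found.items() if truth.get(event) == name)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(truth) if truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# One of two faults is recovered, and one repair is spurious (toy data).
truth = {"t2": "design", "t3": "insulation proof"}
found = {"t2": "design", "t4": "merge"}
score = f_measure(truth, found)
```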
19
[Figures: results on the bus manufacturer data set and on the telecom company data set.]
Low time cost.
High accuracy.
20
[Figure: results on the synthetic data set.]
Pruning Invalid Branch + Advanced Bound
21
Motivation, Exact Algorithm, Approximation, Experiments, Conclusion
22
Define the Minimum Repair Problem on Structured Event Logs.
A Branch and Bound Repair framework:
  Find the minimum repair;
  Detect unsound structure.
Pruning and Advanced Bounding Function.
A PTIME Approximate Algorithm.
23
Thanks!