Published by Troy Twining. Modified over 10 years ago.
1
Department of Technology Management / Department of Mathematics and Computer Science
Process mining: Discovering Process Models from Event Logs
Prof.dr.ir. Wil van der Aalst
Eindhoven University of Technology, P.O. Box 513, NL-5600 MB, Eindhoven, The Netherlands.
2
Outline
Who we are...
–I&T group
–selected research projects
Process mining
–purpose
–basic idea
–(re)discovery problem
–mining algorithm $\alpha(W)$
–comparison
–example/tools
–case study
Conclusion
3
Who we are...
4
Information & Technology (I&T) group at EUT
I&T group (35 persons), Department of Technology Management, Eindhoven University of Technology.
Three subgroups:
–Business Process Management (workflow management, Petri nets, mining, ...)
–ICT Architectures (agents, transactions, ...)
–Software Engineering (software quality, ...)
5
Selected research projects
–process mining
–workflow verification
–workflow patterns
–web services composition languages
–case handling
–XRL/flower
–business process improvement
–...
In most cases using/extending Petri net theory!
6
Workflow verification: Woflan
Can interface with Staffware, Protos, COSA, Meteor.
Can handle Event-driven Process Chains (ARIS).
7
Workflow patterns
The academic response
A quest for the basic requirements
20 basic patterns
20+ systems evaluated
Joint work with QUT, ATOS, etc.
http://www.tm.tue.nl/it/research/patterns
About 150 page views per working day (more than 25,000 in total)
8
Web services composition languages
Also process support.
Standards considered are BPML, BPEL4WS, XLANG, WSFL, WSCI.
Joint work with QUT (Brisbane, Australia).
9
Process mining
Team members: Wil van der Aalst, Ton Weijters, Laura Maruster, Ana-Karla Medeiros, Boudewijn van Dongen, Eric Verbeek
10
Business Process Management
11
No feedback loop
12
The basic idea of process mining
13
Toy example
Event log (in order of occurrence):
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task A
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task E
case 5 : task D
case 4 : task D
Resulting traces:
ABCD {cases 1,3}
ACBD {cases 2,4}
AED {case 5}
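The grouping step on this slide can be sketched in a few lines of Python. This is only an illustration: the names `EVENTS`, `traces_by_case`, and `cases_by_trace` are hypothetical, and the events are simply the slide's log typed in as (case, task) pairs.

```python
from collections import defaultdict

# The slide's event log, in order of occurrence: (case id, task).
EVENTS = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "A"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "E"), (5, "D"),
    (4, "D"),
]

def traces_by_case(events):
    """Turn an interleaved event stream into one task sequence per case."""
    traces = defaultdict(list)
    for case, task in events:
        traces[case].append(task)
    return {case: "".join(tasks) for case, tasks in traces.items()}

def cases_by_trace(traces):
    """Group case ids by the trace they produced."""
    variants = defaultdict(list)
    for case, trace in sorted(traces.items()):
        variants[trace].append(case)
    return dict(variants)

print(cases_by_trace(traces_by_case(EVENTS)))
```

Only the order of events within a case matters here; the interleaving between cases is discarded.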
14
Result: A Petri net model
ABCD
ACBD
AED
$\alpha(W)$
Petri nets are used as a formalism; the target language can be different, e.g., Event-driven Process Chains.
15
The focus of this presentation is on the following theoretical question:
16
Assumption: complete workflow logs without noise.
Let $T$ be a set of tasks. $\sigma \in T^*$ is a workflow trace and $W \subseteq T^*$ is a workflow log.
Let $W$ be a workflow log over $T$, i.e., $W \subseteq T^*$. Let $a,b \in T$:
–$a >_W b$ if and only if there is a trace $\sigma = t_1 t_2 t_3 \ldots t_{n-1}$ and $i \in \{1, \ldots, n-2\}$ such that $\sigma \in W$ and $t_i = a$ and $t_{i+1} = b$,
–$a \rightarrow_W b$ if and only if $a >_W b$ and not $(b >_W a)$,
–$a \#_W b$ if and only if not $(a >_W b)$ and not $(b >_W a)$, and
–$a \parallel_W b$ if and only if $a >_W b$ and $b >_W a$.
Let $N = (P,T,F)$ be a sound WF-net, i.e., $N \in \mathcal{W}$. $W$ is a workflow log of $N$ if and only if $W \subseteq T^*$ and every trace $\sigma \in W$ is a firing sequence of $N$ starting in state $[i]$, i.e., $(N,[i])[\sigma\rangle$.
$W$ is a complete workflow log of $N$ if and only if (1) for any workflow log $W'$ of $N$: $>_{W'} \subseteq\, >_W$ and (2) for any $t \in T$ there is a $\sigma \in W$ such that $t \in \sigma$.
17
Example 1
Event log (in order of occurrence):
case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task A
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task E
case 5 : task D
case 4 : task D
$W = \{ABCD, ACBD, AED\}$
$A >_W B$, $A >_W C$, $A >_W E$, $B >_W C$, $B >_W D$, $C >_W B$, $C >_W D$, $E >_W D$
$A \rightarrow_W B$, $A \rightarrow_W C$, $A \rightarrow_W E$, $B \rightarrow_W D$, $C \rightarrow_W D$, $E \rightarrow_W D$
$B \parallel_W C$, $C \parallel_W B$
$\#_W$: rest
Log is complete if this relation ($>_W$) cannot be extended.
$X \rightarrow_W Y$ xor $Y \rightarrow_W X$ xor $X \parallel_W Y$ xor $X \#_W Y$
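The four ordering relations of Example 1 can be recomputed mechanically. The following Python sketch is an illustration only: the function name `ordering_relations` and the representation of traces as strings of single-letter task names are assumptions made for the example.

```python
from itertools import product

def ordering_relations(log):
    """Return (>_W, ->_W, ||_W, #_W) as sets of task pairs."""
    tasks = sorted({t for trace in log for t in trace})
    # a >_W b: b directly follows a in at least one trace.
    follows = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}
    # a ->_W b: a >_W b and not b >_W a.
    causal = {(a, b) for (a, b) in follows if (b, a) not in follows}
    # a ||_W b: a >_W b and b >_W a.
    parallel = {(a, b) for (a, b) in follows if (b, a) in follows}
    # a #_W b: neither a >_W b nor b >_W a.
    unrelated = {(a, b) for a, b in product(tasks, repeat=2)
                 if (a, b) not in follows and (b, a) not in follows}
    return follows, causal, parallel, unrelated

W = ["ABCD", "ACBD", "AED"]
follows, causal, parallel, unrelated = ordering_relations(W)
print(sorted(causal))
print(sorted(parallel))
```

For every ordered pair of tasks exactly one of "causal", "reverse causal", "parallel", or "unrelated" holds, which is the xor statement at the bottom of the slide.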
18
Example 2
$W = \{ABCD, ACBD\}$ is complete
$A >_W B$, $A >_W C$, $B >_W C$, $B >_W D$, $C >_W B$, $C >_W D$
$A \rightarrow_W B$, $A \rightarrow_W C$, $B \rightarrow_W D$, $C \rightarrow_W D$
$B \parallel_W C$, $C \parallel_W B$
$\#_W$: rest
19
Example 3
$W = \{ABD, ACD\}$ is complete
$A >_W B$, $A >_W C$, $B >_W D$, $C >_W D$
$A \rightarrow_W B$, $A \rightarrow_W C$, $B \rightarrow_W D$, $C \rightarrow_W D$
$\parallel_W$: none
$\#_W$: rest
20
Causal relations imply connecting places
Let $N = (P,T,F)$ be a sound WF-net and let $W$ be a complete workflow log of $N$. For any $a,b \in T$: $a \rightarrow_W b$ implies $a\bullet \cap \bullet b \neq \emptyset$.
I.e., if there is a causal relation between two transitions according to the workflow log, then there has to be a place connecting these two transitions.
Surprisingly, this holds for any sound WF-net!
$A \rightarrow_W B$, $A \rightarrow_W C$, $B \rightarrow_W D$, $C \rightarrow_W D$
21
Connecting places “often” imply causal relations
Let $N = (P,T,F)$ be a sound SWF-net and let $W$ be a complete workflow log of $N$. For any $a,b \in T$: $a\bullet \cap \bullet b \neq \emptyset$ and $b\bullet \cap \bullet a = \emptyset$ implies $a \rightarrow_W b$.
No “short loops” (i.e., loops of length 1 or 2).
Structured Workflow Nets (SWF-nets) have no implicit places and the following two constructs cannot be used:
[Figure: the two forbidden constructs]
22
Example 4: loops of length 1 are harmful
$A \rightarrow_W B$, $A \rightarrow_W D$, $B \rightarrow_W D$
There is a place connecting B to B but not $B \rightarrow_W B$.
23
Example 5: loops of length 2 are harmful
$A \rightarrow_W B$, $B \rightarrow_W D$
There is a place connecting B to C but not $B \rightarrow_W C$ (because C can be followed directly by B).
There is a place connecting C to B but not $C \rightarrow_W B$ (because B can be followed directly by C).
24
Example 6: Implicit places remain undetected
$A \rightarrow_W B$, $B \rightarrow_W C$
More complex examples can be given showing that the two other requirements for non-SWF-nets are needed.
25
Parallelism can “often” be detected
Let $N = (P,T,F)$ be a sound SWF-net such that for any $a,b \in T$: $a\bullet \cap \bullet b = \emptyset$ or $b\bullet \cap \bullet a = \emptyset$, and let $W$ be a complete workflow log of $N$.
1. If $a,b \in T$ and $a\bullet \cap b\bullet \neq \emptyset$, then $a \#_W b$.
2. If $a,b \in T$ and $\bullet a \cap \bullet b \neq \emptyset$, then $a \#_W b$.
3. If $a,b,t \in T$, $a \rightarrow_W t$, $b \rightarrow_W t$, and $a \#_W b$, then $a\bullet \cap b\bullet \cap \bullet t \neq \emptyset$.
4. If $a,b,t \in T$, $t \rightarrow_W a$, $t \rightarrow_W b$, and $a \#_W b$, then $\bullet a \cap \bullet b \cap t\bullet \neq \emptyset$.
This is a complex way of stating that for sound SWF-nets without short loops, it is possible to distinguish XOR-splits from AND-splits and XOR-joins from AND-joins.
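The practical reading of this result can be illustrated on Examples 2 and 3. The Python sketch below is a simplification, not the theorem itself: it works purely on the log relations, and the names `relations`, `split_kind`, and `join_kind` are hypothetical. Two causal successors of a task that are unrelated ($\#_W$) are reported as an XOR-split; two that are parallel ($\parallel_W$) as an AND-split.

```python
def relations(log):
    """Return (>_W, ->_W) for a log given as strings of task letters."""
    follows = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}
    causal = {(a, b) for (a, b) in follows if (b, a) not in follows}
    return follows, causal

def _kind(follows, a, b):
    """Classify the pair (a, b) as XOR (a # b) or AND (a || b)."""
    if (a, b) in follows and (b, a) in follows:
        return "AND"
    if (a, b) not in follows and (b, a) not in follows:
        return "XOR"
    return None  # a and b are causally ordered themselves

def split_kind(log, t, a, b):
    """Kind of split after t towards a and b, or None if t does not cause both."""
    follows, causal = relations(log)
    if (t, a) in causal and (t, b) in causal:
        return _kind(follows, a, b)
    return None

def join_kind(log, a, b, t):
    """Kind of join of a and b before t, or None if they do not both cause t."""
    follows, causal = relations(log)
    if (a, t) in causal and (b, t) in causal:
        return _kind(follows, a, b)
    return None

print(split_kind(["ABCD", "ACBD"], "A", "B", "C"))  # Example 2
print(split_kind(["ABD", "ACD"], "A", "B", "C"))    # Example 3
```

The classification is only trustworthy under the slide's assumptions: a complete, noise-free log of a sound SWF-net without short loops.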
26
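Mining algorithm $\alpha(W)$
Let $W$ be a workflow log over $T$. $\alpha(W)$ is defined as follows.
1. $T_W = \{ t \in T \mid \exists_{\sigma \in W}\; t \in \sigma \}$,
2. $T_I = \{ t \in T \mid \exists_{\sigma \in W}\; t = \mathit{first}(\sigma) \}$,
3. $T_O = \{ t \in T \mid \exists_{\sigma \in W}\; t = \mathit{last}(\sigma) \}$,
4. $X_W = \{ (A,B) \mid A \subseteq T_W \wedge B \subseteq T_W \wedge \forall_{a \in A} \forall_{b \in B}\; a \rightarrow_W b \wedge \forall_{a_1,a_2 \in A}\; a_1 \#_W a_2 \wedge \forall_{b_1,b_2 \in B}\; b_1 \#_W b_2 \}$,
5. $Y_W = \{ (A,B) \in X_W \mid \forall_{(A',B') \in X_W}\; A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \}$,
6. $P_W = \{ p_{(A,B)} \mid (A,B) \in Y_W \} \cup \{ i_W, o_W \}$,
7. $F_W = \{ (a, p_{(A,B)}) \mid (A,B) \in Y_W \wedge a \in A \} \cup \{ (p_{(A,B)}, b) \mid (A,B) \in Y_W \wedge b \in B \} \cup \{ (i_W, t) \mid t \in T_I \} \cup \{ (t, o_W) \mid t \in T_O \}$, and
8. $\alpha(W) = (P_W, T_W, F_W)$.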
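The definition above translates fairly directly into code. The following Python sketch is a brute-force illustration, not the authors' implementation: the function name `alpha` and the encoding of a place $p_{(A,B)}$ as the tuple `(A, B)` are assumptions, traces are assumed to be non-empty strings of single-letter tasks, and, as a deliberate deviation from step 4 as written, only non-empty sets A and B are considered so that no place without input or output arcs is produced.

```python
from itertools import combinations

def alpha(log):
    """Return (P_W, T_W, F_W) for a log of non-empty task strings."""
    tasks = {t for trace in log for t in trace}      # step 1: T_W
    first = {trace[0] for trace in log}              # step 2: T_I
    last = {trace[-1] for trace in log}              # step 3: T_O
    follows = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}

    def causal(a, b):      # a ->_W b
        return (a, b) in follows and (b, a) not in follows

    def unrelated(a, b):   # a #_W b
        return (a, b) not in follows and (b, a) not in follows

    # Non-empty task sets whose members are pairwise unrelated
    # (assumption: empty sets are excluded, see the lead-in).
    ordered = sorted(tasks)
    groups = [frozenset(c)
              for r in range(1, len(ordered) + 1)
              for c in combinations(ordered, r)
              if all(unrelated(x, y) for x in c for y in c)]

    # step 4: X_W, every a in A causes every b in B
    x_w = [(A, B) for A in groups for B in groups
           if all(causal(a, b) for a in A for b in B)]
    # step 5: Y_W, keep only the maximal pairs
    y_w = [(A, B) for (A, B) in x_w
           if not any((A, B) != (A2, B2) and A <= A2 and B <= B2
                      for (A2, B2) in x_w)]
    # step 6: P_W, one place per maximal pair plus source and sink
    places = set(y_w) | {"i_W", "o_W"}
    # step 7: F_W
    flow = ({(a, (A, B)) for (A, B) in y_w for a in A}
            | {((A, B), b) for (A, B) in y_w for b in B}
            | {("i_W", t) for t in first}
            | {(t, "o_W") for t in last})
    return places, tasks, flow                       # step 8

P, T, F = alpha(["ABCD", "ACBD", "AED"])
print(len(P), len(T), len(F))
```

Enumerating all subsets of $T_W$ is exponential in the number of tasks, so this form is only suitable for toy logs such as the one from Example 1.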
27
Solution to the rediscovery problem
Let $N = (P,T,F)$ be a sound SWF-net and let $W$ be a complete workflow log of $N$. If for all $a,b \in T$: $a\bullet \cap \bullet b = \emptyset$ or $b\bullet \cap \bullet a = \emptyset$, then $\alpha(W) = N$ modulo renaming of places.
I.e., any sound SWF-net without short loops can be rediscovered!
28
Example 7: Sound SWF-net without short loops
29
Example 8: A WF-net with an implicit place
$\alpha(W)$
30
Example 9: Loop of length 1
$\alpha(W)$
31
Example 10: Loop of length 2
$\alpha(W)$
32
Example 11: Loop of length 3
No problem!
$\alpha(W)$
33
Example 12: Non-free-choice constructs may be harmful
$\alpha(W)$
34
Example 13: Free-choice is not enough
Behaviorally equivalent!
$\alpha(W)$
35
Example 14: Example with “hidden” tasks
?
Any suggestions?
36
Simplification!
Behaviorally equivalent!
$\alpha(W)$
37
Results and issues
Proven to be correct for a large class of processes.
Notion of completeness is needed (direct successor relation).
Can handle parallelism and time.
Open issues:
–noise
–incomplete logs
–data
–advanced process patterns (hidden tasks, non-free-choice (NFC) constructs, etc.)
–behavioral equivalence
On each of these issues we have some preliminary results.
38
Scientific competition
J.E. Cook (and A.L. Wolf) – New Mexico State University / University of Colorado, USA
J. Herbst (and D. Karagiannis) – DaimlerChrysler, Germany
R. Agrawal, D. Gunopulos, M.K. Maxeiner, K. Küspert, and F. Leymann – IBM, Germany
G. Schimm – OFFIS, Germany
S.Y. Hwang et al. – Sun Yat-Sen University, Taiwan
M. Golani and S.S. Pinter – IBM, Israel
D. Grigori, F. Casati, et al. – HP, USA
Our approach differs because we incorporate time and noise and take parallelism as a starting point.
39
Practical competition (ARIS PPM)
IDS Scheer's ARIS Process Performance Manager.
No process mining but interesting links with systems like SAP.
40
Tools/standards for process mining
41
Example: processing customer orders
Example in Staffware: 7 tasks and all basic routing constructs
42
Fragment of Staffware log
Case 21
Diractive Description  Event         User               yyyy/mm/dd hh:mm
------------------------------------------------------------------------
                       Start         swdemo@staffw_edl  2003/02/05 15:00
Register order         Processed To  swdemo@staffw_edl  2003/02/05 15:00
Register order         Released By   swdemo@staffw_edl  2003/02/05 15:00
Prepare shipment       Processed To  swdemo@staffw_edl  2003/02/05 15:00
(Re)send bill          Processed To  swdemo@staffw_edl  2003/02/05 15:00
(Re)send bill          Released By   swdemo@staffw_edl  2003/02/05 15:01
Receive payment        Processed To  swdemo@staffw_edl  2003/02/05 15:01
Prepare shipment       Released By   swdemo@staffw_edl  2003/02/05 15:01
Ship goods             Processed To  swdemo@staffw_edl  2003/02/05 15:01
Ship goods             Released By   swdemo@staffw_edl  2003/02/05 15:02
Receive payment        Released By   swdemo@staffw_edl  2003/02/05 15:02
Archive order          Processed To  swdemo@staffw_edl  2003/02/05 15:02
Archive order          Released By   swdemo@staffw_edl  2003/02/05 15:02
                       Terminated                       2003/02/05 15:02
Case 22
Diractive Description  Event         User               yyyy/mm/dd hh:mm
------------------------------------------------------------------------
                       Start         swdemo@staffw_edl  2003/02/05 15:02
Register order         Processed To  swdemo@staffw_edl  2003/02/05 15:02
Register order         Released By   swdemo@staffw_edl  2003/02/05 15:02
Prepare shipment       Processed To  swdemo@staffw_edl  2003/02/05 15:02
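A log fragment like this has to be turned into traces before mining. The Python sketch below shows one possible way to do that. It rests on several assumptions that the slide does not confirm: each audit line has the shape "description, event, user, timestamp", a "Released By" event marks the completion of a task, and the names `LINE`, `SAMPLE`, and `completed_tasks` are hypothetical. The sample text is case 21 from the slide with the Start and Terminated lines left out.

```python
import re
from datetime import datetime

# Assumed line shape: <task> <event> <user> <yyyy/mm/dd hh:mm>
LINE = re.compile(
    r"^(?P<task>.+?)\s+(?P<event>Processed To|Released By)\s+"
    r"(?P<user>\S+)\s+(?P<stamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2})$"
)

SAMPLE = """\
Case 21
Register order Processed To swdemo@staffw_edl 2003/02/05 15:00
Register order Released By swdemo@staffw_edl 2003/02/05 15:00
Prepare shipment Processed To swdemo@staffw_edl 2003/02/05 15:00
(Re)send bill Processed To swdemo@staffw_edl 2003/02/05 15:00
(Re)send bill Released By swdemo@staffw_edl 2003/02/05 15:01
Receive payment Processed To swdemo@staffw_edl 2003/02/05 15:01
Prepare shipment Released By swdemo@staffw_edl 2003/02/05 15:01
Ship goods Processed To swdemo@staffw_edl 2003/02/05 15:01
Ship goods Released By swdemo@staffw_edl 2003/02/05 15:02
Receive payment Released By swdemo@staffw_edl 2003/02/05 15:02
Archive order Processed To swdemo@staffw_edl 2003/02/05 15:02
Archive order Released By swdemo@staffw_edl 2003/02/05 15:02
"""

def completed_tasks(text):
    """Map each case id to its (task, timestamp) completions, in log order."""
    traces = {}
    case = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Case "):
            case = int(line.split()[1])
            traces[case] = []
            continue
        match = LINE.match(line)
        # Assumption: "Released By" means the task was completed.
        if match and case is not None and match["event"] == "Released By":
            stamp = datetime.strptime(match["stamp"], "%Y/%m/%d %H:%M")
            traces[case].append((match["task"], stamp))
    return traces

for task, stamp in completed_tasks(SAMPLE)[21]:
    print(stamp.isoformat(), task)
```

Because the timestamps have only minute resolution, the sketch relies on the order of lines in the log rather than on sorting by time.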
43
Fragment of XML file
Case start 05-02-2003 15:04
Register order 05-02-2003 15:04
44
EMiT
Focus on time and causality.
45
Thumb
Focus on noise.
46
Thumb is able to deal with noise (D/F-graphs)
[Figure panels: causality; no noise; 10% noise]
47
Real case: CJIB
Processing of fines
130136 cases
99 different activities
48
Process in EMiT
49
Complete process model
Validated by CJIB
50
SAP R/3
51
Conclusion
Process mining is both a scientific and practical challenge.
Preliminary results are promising.
Challenging problems:
–Finding the right data in real information systems.
–Dealing with noise and incompleteness.
–Dealing with advanced synchronization patterns.
–Dealing with hidden tasks/behavioral equivalence.