Multi-phase process mining

1 Multi-phase process mining
EPC representations of process instances, and the aggregation of those into one overall specification.

2 Outline
- Introduction to Process Mining
- Introduction to process performance monitoring
- Using process mining to deploy a monitoring system
- Extending
- Conclusion

3 Overview Process Mining
1) basic performance metrics 2) control flow rediscovery 3) organizational model 4) social network 5) performance characteristics 6) auditing/security (If … then …)
Next: Process Log Requirements (Staffware example)

4 Log Files Information systems typically log all kinds of events. We use an XML format for storing event logs. The basic assumption is that the log contains information about specific tasks executed for specific process instances (cases, event lists, audit trails). We do not assume any knowledge of the underlying process.

5 Log based ordering relations
If there is a process instance (trace, event list) in which two functions A and B follow each other directly, we say: A > B.
Causal relation: A → B if and only if A > B and ¬(B > A).
Parallel relation: A || B if and only if A > B and B > A.
Note: if two tasks A and B are in a loop of length two (i.e. there is a trace that contains A B A), then A → B and B → A, and not A || B.
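The three relations can be derived mechanically from a log. The following is a minimal sketch, assuming a log is a list of traces and each trace a list of task names; the function name `ordering_relations` and the sample log are illustrative and not taken from the slides. The length-two loop rule from the note is applied by treating any pair seen as A B A as causal in both directions.

```python
def ordering_relations(log):
    """Return (follows, causal, parallel) for a log given as a list of traces."""
    follows = set()      # (a, b) means a > b: b directly follows a in some trace
    short_loops = set()  # pairs that occur as a b a in some trace
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            follows.add((a, b))
        for a, b, c in zip(trace, trace[1:], trace[2:]):
            if a == c and a != b:
                short_loops.add((a, b))
                short_loops.add((b, a))

    causal, parallel = set(), set()
    for a, b in follows:
        if (b, a) not in follows or (a, b) in short_loops:
            causal.add((a, b))    # a -> b
        else:
            parallel.add((a, b))  # a || b
    return follows, causal, parallel


# Hypothetical log: B and C are interleaved, X and Y form a loop of length two.
sample_log = [list("ABCD"), list("ACBD"), list("XYX")]
follows, causal, parallel = ordering_relations(sample_log)
```

With this sample log, B and C end up parallel, while X and Y are causal in both directions rather than parallel.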

6 Process Mining, the α-algorithm
Using the  and || relation, we construct places for the resulting Petrinet. E G F S A B C D T {E}  {G, F } {F}  {C} {F}  {D } {B, D}  {T }

7 The problems of the α-algorithm
Known issues of the α-algorithm:
- It requires its input to come from a certain class of models.
- Tasks that can be skipped in the original process (AC, ABC).
- Tasks that appear twice in the same process, under the same name.
- Tasks that do not appear in the log, but belong in the process (invisible tasks).
- Errors in the input data (e.g. one trace containing AB versus one million traces containing BA).

8 Calculating causal relations
Assume we have the following cases: A,B,C,D,E; A,C,B,D,E; A,D,E.
This leads to these relations: A>B, B>C, C>D, D>E, A>C, C>B, B>D, A>D.
And then to: A → B, A → C, A → D, C → D, B → D, D → E.
Using the causal relation, we construct a general base graph for all cases.
[Figure: base graph over A, B, C, D, E]
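The relations for this three-case log can be checked with a few lines of Python. This is a small sketch using the basic definitions only (A → B when A > B holds and B > A does not); the variable names are illustrative.

```python
# The three cases from the slide.
log = [list("ABCDE"), list("ACBDE"), list("ADE")]

# Direct succession: (a, b) means a > b.
follows = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}

# Causal pairs are the edges of the base graph; the rest are parallel pairs.
causal = sorted((a, b) for a, b in follows if (b, a) not in follows)
parallel = sorted((a, b) for a, b in follows if (b, a) in follows)

for a, b in causal:
    print(f"{a} -> {b}")
```

B and C follow each other in both orders, so they are parallel and contribute no edge to the base graph.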

9 Creating instance graphs
For each instance, we walk through the base graph and make instance graphs for each case.
[Figure: instance graph for process instances A,B,C,D,E and A,C,B,D,E]
[Figure: instance graph for process instance A,D,E]
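The slides do not spell out how the walk through the base graph works, so the following is only a sketch under an assumed rule: connect two events of a trace when their tasks are causally related and no event between them is causally linked to both. It also assumes each task occurs at most once per trace. The name `instance_graph` is hypothetical.

```python
# Causal relation (base graph) for the log A,B,C,D,E / A,C,B,D,E / A,D,E.
CAUSAL = {("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("C", "D"), ("D", "E")}


def instance_graph(trace, causal):
    """Return the edges of the instance graph for one trace (assumed rule)."""
    edges = set()
    for i in range(len(trace)):
        for j in range(i + 1, len(trace)):
            if (trace[i], trace[j]) not in causal:
                continue
            # Skip the edge when an intermediate event already bridges i and j.
            bridged = any(
                (trace[i], trace[k]) in causal and (trace[k], trace[j]) in causal
                for k in range(i + 1, j)
            )
            if not bridged:
                edges.add((trace[i], trace[j]))
    return edges


graph_abcde = instance_graph(list("ABCDE"), CAUSAL)
graph_acbde = instance_graph(list("ACBDE"), CAUSAL)
graph_ade = instance_graph(list("ADE"), CAUSAL)
```

Under this rule the first two cases yield the same graph, with B and C in parallel between A and D, and the third case yields the chain A, D, E.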

10 Converting to EPCs
Each instance graph is then converted into a human-readable format, such as an EPC.
[Figure: EPC for process instances A,B,C,D,E and A,C,B,D,E]
[Figure: EPC for process instance A,D,E]

11 Aggregating instance graphs
Assume we have a set of instance graphs: 2x the graph over A, B, C, D, E and 1x the graph over A, D, E. Can we aggregate them into one model?
Our log file: A,B,C,D,E; A,C,B,D,E; A,D,E

12 Aggregating instance graphs
First, we project the instances onto themselves, to distinguish between “task executions” and “tasks that were executed”.
[Figure: projections with nodes A1, B1, C1, D1, E1 and A1, D1, E1, each carrying a count of 1]
Our log file: A,B,C,D,E; A,C,B,D,E; A,D,E

13 Aggregating instance graphs
Combine the graphs based on node labels.
Decide on split/join type.
[Figure: aggregation graph with nodes A3, B2, C2, D3, E3 and arc counts of 2 and 3]
Our log file: A,B,C,D,E; A,C,B,D,E; A,D,E
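Combining on node labels amounts to summing, per label and per arc, how many instance graphs contain it. The sketch below assumes each instance graph is given as a pair (nodes, edges); the arcs of the two graph shapes are an assumption, since the transcript does not preserve them. The function name `aggregate` is illustrative.

```python
from collections import Counter


def aggregate(instance_graphs):
    """Sum node and arc occurrences over a list of (nodes, edges) pairs."""
    node_counts, edge_counts = Counter(), Counter()
    for nodes, edges in instance_graphs:
        node_counts.update(nodes)
        edge_counts.update(edges)
    return node_counts, edge_counts


# Assumed instance graphs for the log A,B,C,D,E / A,C,B,D,E / A,D,E.
parallel_graph = (
    {"A", "B", "C", "D", "E"},
    {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")},
)
short_graph = ({"A", "D", "E"}, {("A", "D"), ("D", "E")})

node_counts, edge_counts = aggregate([parallel_graph, parallel_graph, short_graph])
```

The node counts come out as A3, B2, C2, D3, E3, which agrees with the labels that survive from the slide's figure.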

14 Aggregating instance graphs
What type of split / join to choose? For a node A with count x and arcs with counts a, b and c:
x = a = b = c implies AND
x = a + b + c implies XOR
else OR
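The rule translates directly into a small function. This is a sketch with an illustrative name, `connector_type`; it generalizes the three arcs a, b, c to any list of arc counts, and the sample counts are hypothetical.

```python
def connector_type(x, arc_counts):
    """Pick the split/join type for a node with count x and the given arc counts."""
    if all(count == x for count in arc_counts):
        return "AND"  # every arc is taken on every execution of the node
    if sum(arc_counts) == x:
        return "XOR"  # exactly one arc is taken per execution
    return "OR"       # any other combination


print(connector_type(3, [3, 3, 3]))  # AND
print(connector_type(3, [1, 1, 1]))  # XOR
print(connector_type(3, [2, 2, 1]))  # OR
```

A node with a single arc satisfies both conditions and needs no connector at all, so the function is only meaningful for two or more arcs.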

15 Conversion to EPC
Conversion to an EPC is done in the same way as for instance graphs.
[Figure: the aggregation graph over A, B, C, D, E and the EPC it becomes]
Our log file: A,B,C,D,E; A,C,B,D,E; A,D,E

16 Example
Consider the following instance graphs:
(a) [Figure: instance graph over A, B, C, D, E, G, H, I]
(b) [Figure: instance graph over A, B, C, D, E]
(c) [Figure: instance graph over A, B, C, D, F]

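17 Example
That will lead to the following instance graph projections:
(a) [Figure: projection over A, B, C, D, E, G, H, I with the extra nodes s, t, f]
(b) [Figure: projection over A, B, C, D, E with the extra nodes s, t, f]
(c) [Figure: projection over A, B, C, D, F with the extra nodes s, t, f]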

18 Example
Which results in this aggregation graph:
[Figure: aggregation graph over A, B, C, D, E, F, G, H, I and the extra nodes s, t, f, with counts ranging from 1 to 4]

19 Example G 1

20 Example (Not implemented yet)

21 Comparison to α-algorithm
The α-algorithm requires its input to come from a certain class of models. The multi-phase system does not have this requirement. The result is always an executable specification.
The α-algorithm cannot deal with tasks that can be skipped (AC, ABC). The multi-phase system deals with this easily, using connectors (invisible steps).
The following issues still stand:
- Tasks that appear twice in the same process, under the same name.
- Tasks that do not appear in the log, but belong in the process (invisible tasks).
- Errors in the input data (e.g. one trace containing AB versus one million traces containing BA).

22 Conclusions
- The multi-phase system is an improvement over the α-algorithm.
- The multi-phase system improves the practical applicability of the mining research, since it can be used as an interface between Aris PPM and the ProM framework.
- Further research is needed to deal with duplicate tasks, invisible tasks and noisy log files.

