/faculteit technologie management Genetic Process Mining Wil van der Aalst Ana Karla Medeiros Ton Weijters Eindhoven University of Technology Department of Information Systems
Outline
Process Mining
Genetic Algorithms
Genetic Process Mining
– Internal Representation
– Fitness Measure
– Genetic Operators
Experiments and Results
Conclusion and Future Work

Process Mining
X = apply for license
A = motorbike classes
B = car classes
C = theoretical exam
D = practical motorbike exam
E = practical car exam
Y = get result

Process Mining (cont.)
Most current techniques cannot handle
– Structural constructs: non-free choice, duplicate tasks and invisible tasks
– Noisy logs
Reason: these techniques take a local approach

Genetic Algorithms
– Global approach
[Figure labels: local optimum, global optimum]

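The global search idea can be illustrated with a generic genetic-algorithm loop. This is a minimal sketch under stated assumptions, not the algorithm used in the talk: the function name `evolve`, binary tournament selection, elitism, and the bit-string toy problem are all chosen here for illustration only.

```python
import random

def evolve(population, fitness, crossover, mutate, generations, rng):
    """Generational GA loop with elitism; returns the fittest individual."""
    def pick(pop):
        # Binary tournament (assumed scheme): the fitter of two random picks wins.
        a, b = rng.choice(pop), rng.choice(pop)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        elite = max(population, key=fitness)
        children = [elite]  # elitism keeps the best individual unchanged
        while len(children) < len(population):
            child = crossover(pick(population), pick(population), rng)
            children.append(mutate(child, rng))
        population = children
    return max(population, key=fitness)

# Toy problem (hypothetical): maximise the number of ones in a bit string.
def one_point(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def flip(individual, rng, rate=0.05):
    return [1 - g if rng.random() < rate else g for g in individual]

rng = random.Random(0)
start = [[rng.randint(0, 1) for _ in range(12)] for _ in range(20)]
best = evolve(start, sum, one_point, flip, generations=40, rng=rng)
```

Because the best individual is carried over each generation, the result is never worse than the best starting individual. In genetic process mining an individual would be a process model rather than a bit string.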
Genetic Process Mining (GPM)
Aim: Use a genetic algorithm to tackle noise, duplicate activities, non-free choice and invisible tasks
– Internal Representation
– Fitness Measure
– Genetic Operators

GPM – Internal Representation
Causal Matrix
[Figure: empty causal matrix over tasks X, A, B, C, D, E, Y, with an Input row and an Output column]

GPM – Internal Representation
Causal Matrix
Task  Output
X     A \/ B
A     C /\ D
B     C /\ E
C     D \/ E
D     Y
E     Y
Y     True

GPM – Internal Representation
Causal Matrix
Task  Input    Output
X     True     A \/ B
A     X        C /\ D
B     X        C /\ E
C     A \/ B   D \/ E
D     A /\ C   Y
E     B /\ C   Y
Y     D \/ E   True

GPM – Internal Representation
Causal Matrix
Task  Input    Output
X     True     A \/ B
A     X        C /\ D
B     X        C /\ E
C     A \/ B   D \/ E
D     A /\ C   Y
E     B /\ C   Y
Y     D \/ E   True
– Compact representation
Task  Input      Output
X     {}         {{A,B}}
A     {{X}}      {{C},{D}}
B     {{X}}      {{C},{E}}
C     {{A,B}}    {{D,E}}
D     {{A},{C}}  {{Y}}
E     {{B},{C}}  {{Y}}
Y     {{D,E}}    {}

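The compact representation maps naturally onto nested collections. The sketch below is one possible encoding, not the ProM data structure: the names `Task`, `s`, and `as_expression` are hypothetical, task names are assumed to be single characters, and Y's input is taken as {D, E}, matching the D \/ E entry in the matrix. Tasks inside one subset are OR-ed; the subsets themselves are AND-ed.

```python
from typing import Dict, FrozenSet, List, NamedTuple

Subsets = List[FrozenSet[str]]

class Task(NamedTuple):
    inputs: Subsets
    outputs: Subsets

def s(*groups: str) -> Subsets:
    """One subset per argument; each character is a task name."""
    return [frozenset(g) for g in groups]

CM: Dict[str, Task] = {
    "X": Task(s(), s("AB")),
    "A": Task(s("X"), s("C", "D")),
    "B": Task(s("X"), s("C", "E")),
    "C": Task(s("AB"), s("DE")),
    "D": Task(s("A", "C"), s("Y")),
    "E": Task(s("B", "C"), s("Y")),
    "Y": Task(s("DE"), s()),
}

def as_expression(subsets: Subsets) -> str:
    """Render subsets as the boolean expression shown in the matrix view."""
    if not subsets:
        return "True"
    parts = []
    for group in subsets:
        text = " \\/ ".join(sorted(group))          # OR inside a subset
        needs_parens = len(group) > 1 and len(subsets) > 1
        parts.append(f"({text})" if needs_parens else text)
    return " /\\ ".join(parts)                      # AND between subsets

for name, task in CM.items():
    print(name, as_expression(task.inputs), "|", as_expression(task.outputs))
```

For example, `as_expression(CM["D"].inputs)` yields `A /\ C` and `as_expression(CM["C"].inputs)` yields `A \/ B`.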
GPM – Internal Representation
Causal Matrix
– Semantics
Task  Input      Output
A     {}         {{B},{C,D}}
B     {{A}}      {{E,F}}
C     {{A}}      {{E}}
D     {{A}}      {{F}}
E     {{B},{C}}  {{G}}
F     {{B},{D}}  {{G}}
G     {{E},{F}}  {}
Invisible tasks only fire to enable visible tasks!

GPM – Internal Representation
Causal Matrix
– Semantics
Task  Input      Output
A     {}         {{B},{C,D}}
B     {{A}}      {{E,F}}
C     {{A}}      {{E}}
D     {{A}}      {{F}}
E     {{B},{C}}  {{G}}
F     {{B},{D}}  {{G}}
G     {{E},{F}}  {}
Deadlock!
Invisible tasks only fire to enable visible tasks!

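One way to read the rule that invisible tasks only fire to enable visible tasks is that a token stays with the output subset that produced it until some task in that subset actually fires. The sketch below encodes that reading for the A to G table on this slide. It is an interpretation, not the authors' implementation: the names `START`, `tokens_for`, `enabled`, `fire`, and `enabled_tasks` are hypothetical, and when several tokens could satisfy an input subset the first one found is consumed.

```python
from collections import Counter

# task -> (input subsets, output subsets), copied from the slide's table
CM = {
    "A": ([], [{"B"}, {"C", "D"}]),
    "B": ([{"A"}], [{"E", "F"}]),
    "C": ([{"A"}], [{"E"}]),
    "D": ([{"A"}], [{"F"}]),
    "E": ([{"B"}, {"C"}], [{"G"}]),
    "F": ([{"B"}, {"D"}], [{"G"}]),
    "G": ([{"E"}, {"F"}], []),
}
START = "start"  # hypothetical key holding the single initial token

def tokens_for(cm, marking, task, subset):
    """Tokens that could satisfy one input subset of `task`."""
    return [(u, frozenset(o)) for u in sorted(subset) for o in cm[u][1]
            if task in o and marking[(u, frozenset(o))] > 0]

def enabled(cm, marking, task):
    ins = cm[task][0]
    if not ins:                      # a task without inputs needs the start token
        return marking[START] > 0
    return all(tokens_for(cm, marking, task, sub) for sub in ins)

def fire(cm, marking, task):
    """Fire an enabled task: consume one token per input subset, mark outputs."""
    ins = cm[task][0]
    if not ins:
        marking[START] -= 1
    for sub in ins:
        marking[tokens_for(cm, marking, task, sub)[0]] -= 1
    for out in cm[task][1]:
        marking[(task, frozenset(out))] += 1

def enabled_tasks(cm, marking):
    return sorted(t for t in cm if enabled(cm, marking, t))

m = Counter({START: 1})
fire(CM, m, "A")
after_a = enabled_tasks(CM, m)       # ['B', 'C', 'D']
fire(CM, m, "B")
fire(CM, m, "C")
after_abc = enabled_tasks(CM, m)     # ['E']
```

After A, B, C only E is enabled: B's token sits in the subset {E,F} and is not committed to F ahead of time, which is the situation the deadlock note warns about.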
GPM – Internal Representation
Causal Matrix
– Mappings
Task  Input      Output
A     {}         {{B},{C,D}}
B     {{A}}      {{E,F}}
C     {{A}}      {{E}}
D     {{A}}      {{F}}
E     {{B},{C}}  {{G}}
F     {{B},{D}}  {{G}}
G     {{E},{F}}  {}

GPM – Internal Representation
Causal Matrix
– Mappings
Task  Input    Output
A     {}       {{C,D}}
B     {}       {{D}}
C     {{A}}    {}
D     {{A,B}}  {}

GPM – Fitness Measure
Main idea
– Benefit the individuals that can parse more frequent material in the log
Challenges
– How to assess an individual’s fitness?
– How to punish individuals that allow for undesired extra behavior?

Fitness - How to assess an individual’s fitness?
- Use a continuous semantics parser and register problems
[Formula image not preserved] where L = log and CM = causal matrix

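A continuous semantics parser can be sketched as a replay that never stops on a problem: when a task in the trace is not enabled, the problem is registered and the task fires anyway. The code below is an illustration under assumptions, using the driving-license model from the earlier slides with Y's input read as D \/ E. The function name `replay` and the three counters are hypothetical, tasks without inputs are treated as always enabled, and how the counts are combined into a fitness value is not shown.

```python
from collections import Counter

# task -> (input subsets, output subsets) for the driving-license example
CM = {
    "X": ([], [{"A", "B"}]),
    "A": ([{"X"}], [{"C"}, {"D"}]),
    "B": ([{"X"}], [{"C"}, {"E"}]),
    "C": ([{"A", "B"}], [{"D", "E"}]),
    "D": ([{"A"}, {"C"}], [{"Y"}]),
    "E": ([{"B"}, {"C"}], [{"Y"}]),
    "Y": ([{"D", "E"}], []),
}

def replay(cm, trace):
    """Replay one trace without stopping; count problems along the way."""
    marking = Counter()
    parsed = missing = 0
    for task in trace:
        ok = True
        for subset in cm[task][0]:
            hits = [(u, frozenset(o)) for u in sorted(subset) for o in cm[u][1]
                    if task in o and marking[(u, frozenset(o))] > 0]
            if hits:
                marking[hits[0]] -= 1     # consume one token for this subset
            else:
                missing += 1              # register the problem, keep going
                ok = False
        parsed += ok                      # counts tasks that were truly enabled
        for out in cm[task][1]:
            marking[(task, frozenset(out))] += 1
    return {"parsed": parsed, "missing": missing,
            "left_behind": sum(marking.values())}

good = replay(CM, list("XACDY"))  # {'parsed': 5, 'missing': 0, 'left_behind': 0}
bad = replay(CM, list("XACEY"))   # {'parsed': 4, 'missing': 1, 'left_behind': 1}
```

In the second trace E fires without a token from B (one missing token), and A's token for D is never consumed (one token left behind).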
Trace: SS, A, B, C, D, EE
For noise-free logs, fitness punishes:
– OR-split
– AND-split
– AND-join
– OR-join

Trace: SS, A, B, C, D, EE
For noise-free logs, fitness punishes:
– OR-join
– AND-join
– AND-split
– OR-split

Fitness - How to assess an individual’s fitness?

Fitness - How to punish individuals that allow for undesired extra behavior?
Fitness = 1

Fitness - How to punish individuals that allow for undesired extra behavior?
- Count the number of enabled tasks at every reachable marking

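Counting enabled tasks can be sketched as follows. As an assumption, the count is taken only at the markings reached while replaying a trace from the log, and the trace is assumed to parse without problems. The `RELAXED` model is a hypothetical over-general variant of the driving-license model in which the AND-joins of D and E are turned into OR-joins; all function names are illustrative.

```python
from collections import Counter

START = "start"  # hypothetical key holding the single initial token

STRICT = {  # task -> (input subsets, output subsets)
    "X": ([], [{"A", "B"}]),
    "A": ([{"X"}], [{"C"}, {"D"}]),
    "B": ([{"X"}], [{"C"}, {"E"}]),
    "C": ([{"A", "B"}], [{"D", "E"}]),
    "D": ([{"A"}, {"C"}], [{"Y"}]),
    "E": ([{"B"}, {"C"}], [{"Y"}]),
    "Y": ([{"D", "E"}], []),
}
# Hypothetical over-general variant: D and E accept either predecessor.
RELAXED = dict(STRICT, D=([{"A", "C"}], [{"Y"}]), E=([{"B", "C"}], [{"Y"}]))

def tokens_for(cm, marking, task, subset):
    return [(u, frozenset(o)) for u in sorted(subset) for o in cm[u][1]
            if task in o and marking[(u, frozenset(o))] > 0]

def enabled(cm, marking, task):
    ins = cm[task][0]
    if not ins:
        return marking[START] > 0
    return all(tokens_for(cm, marking, task, sub) for sub in ins)

def fire(cm, marking, task):
    ins = cm[task][0]
    if not ins:
        marking[START] -= 1
    for sub in ins:
        marking[tokens_for(cm, marking, task, sub)[0]] -= 1
    for out in cm[task][1]:
        marking[(task, frozenset(out))] += 1

def enabled_total(cm, trace):
    """Sum the number of enabled tasks at each marking visited by the trace."""
    marking = Counter({START: 1})
    total = 0
    for task in trace:
        total += sum(enabled(cm, marking, t) for t in cm)
        fire(cm, marking, task)
    return total

strict_count = enabled_total(STRICT, list("XACDY"))    # 6
relaxed_count = enabled_total(RELAXED, list("XACDY"))  # 10
```

Both models replay the trace, but the relaxed one enables more tasks along the way, which is the signal a fitness measure can use to punish extra behavior.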
Fitness Measure
[Formula image not preserved] where L = log, CM = causal matrix and CM[] = population

Genetic Operators
Crossover
– Recombines existing material in the population
– Crossover probability
– Crossover point = task
– Subsets are swapped
Mutation
– Introduces new material in the population
– Mutation probability
– Every task of an individual can be mutated

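The two operators can be sketched on the compact representation. This is a simplified illustration, not the operators from the talk: crossover here swaps all subsets of one randomly chosen task and performs no repair step afterwards, and mutation only merges or splits input subsets, so the set of causal predecessors of each task stays the same. The function names, the `rate` parameter, and the second parent `P2` are hypothetical.

```python
import copy
import random

def crossover(parent1, parent2, rng):
    """Swap the subsets of one randomly chosen task between two parents."""
    point = rng.choice(sorted(parent1))            # crossover point = a task
    child1, child2 = copy.deepcopy(parent1), copy.deepcopy(parent2)
    child1[point], child2[point] = child2[point], child1[point]
    return child1, child2, point

def mutate(individual, rate, rng):
    """With probability `rate` per task, merge or split its input subsets."""
    mutant = copy.deepcopy(individual)
    for task, (ins, _outs) in mutant.items():
        if rng.random() >= rate or not ins:
            continue
        if len(ins) > 1 and rng.random() < 0.5:
            a, b = rng.sample(range(len(ins)), 2)  # merge two subsets (AND to OR)
            merged = ins[a] | ins[b]
            ins[:] = [s for i, s in enumerate(ins) if i not in (a, b)] + [merged]
        else:
            big = [s for s in ins if len(s) > 1]
            if big:                                # split one task off (OR to AND)
                chosen = rng.choice(big)
                x = rng.choice(sorted(chosen))
                ins.remove(chosen)
                ins.extend([chosen - {x}, {x}])
    return mutant

P1 = {  # task -> (input subsets, output subsets)
    "X": ([], [{"A", "B"}]),
    "A": ([{"X"}], [{"C"}, {"D"}]),
    "B": ([{"X"}], [{"C"}, {"E"}]),
    "C": ([{"A", "B"}], [{"D", "E"}]),
    "D": ([{"A"}, {"C"}], [{"Y"}]),
    "E": ([{"B"}, {"C"}], [{"Y"}]),
    "Y": ([{"D", "E"}], []),
}
P2 = copy.deepcopy(P1)
P2["D"] = ([{"A", "C"}], [{"Y"}])   # hypothetical second parent: D has an OR-join

rng = random.Random(7)
c1, c2, point = crossover(P1, P2, rng)
m = mutate(P1, 1.0, rng)            # rate 1.0: every task is a mutation candidate
```

Both functions work on copies, so the parents stay unchanged and can be reused by a selection step.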
Experiments and Results
Experiments
– ProM framework Genetic Algorithm Plug-in
– Simulated data
Results
– The genetic algorithm found models that could parse all the traces in the log

ProM framework – Genetic Algorithm Plug-in

Conclusion and Future Work
Conclusion
– Genetic algorithms can be used to mine process models
Future Work
– Tackle duplicate tasks
– Apply genetic process mining to "real-life" logs