Profiling based on unstructured process logs
Peter Khisa Wakholi
Supervisors:
  Prof Wil Van der Aalst – Eindhoven University
  Prof Ddembe Williams – Makerere University
5/29/2018
Section 1 Introduction
Process Mining Overview
Construction of models from event logs:
  Process model
  Social networks
  Organisational model
Compare model with the event log and analyse discrepancies:
  Audit and security
Extend model with a new aspect or perspective:
  Performance characteristics
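As a rough illustration of constructing a model from an event log, the sketch below counts the directly-follows relation between activities, a common starting point for control-flow discovery. The log format (a list of traces, each a list of activity names) and the example log are assumptions made for illustration; they are not taken from the study.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is directly followed by activity b.

    `log` is a list of traces; each trace is a list of activity names.
    """
    counts = Counter()
    for trace in log:
        # Pair every event with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical website-browsing log, invented for illustration.
example_log = [
    ["home", "search", "product", "checkout"],
    ["home", "product", "checkout"],
    ["home", "search", "product"],
]
dfg = directly_follows(example_log)
```

Each key of `dfg` is one arc of a simple process graph, and the count indicates how often that arc was observed.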
Unstructured Processes
Browsing a website is an example of an unstructured process.
Other examples:
  Patient flow in a hospital
  Customer care processes
  Use of a machine
Unstructured processes lack a definite structure or organization and are not formally organized or systematized during their execution.
The execution path depends on:
  a set of factors that control the flow
  attributes and interests of the actors
The Problem
Construction of models from unstructured event logs is possible, but interpretation is difficult.
There is a need to develop a better understanding of unstructured processes.
This understanding would help in:
  Behavioural analysis to gain new insights on processes and actors
  Predicting the execution path of incomplete processes
Motivation
Profiling can help develop a better understanding of the underlying process models by:
  Extracting meaningful process models from logs
  Determining the rules that define the control flow for each case
  Determining the attributes of the actors that influence observed behaviour
Section 2 Current Research
Research Questions
How can complete and highly accurate profiles be developed from unstructured event log data?
  What techniques can be used to extract process-related profiles based on event log data?
  How can these techniques be deployed to develop a complete profile for unstructured processes?
  What interpretation or meaning can be attributed to observed behaviour in the profiles?
Research Approach
Experimentation
  Develop a concept
  Experiment based on model-generated event logs
  Experiment on real logs
  Develop a model, method, guidelines or framework
Hypothesis of DFD for Profiling Process Event Logs
[Figure: data flow diagram. Labels: Event Log File, Clustering & Filtering, Filtered Log File, Process Mining, High level Petri net, Association Rule Mining, Association rules, PROM Analysis, Profiling Data, Intra Profile, Domain Knowledge, Filtering, Profile]
Profiling Hypothesis
Event Log File – This is a log of events for an unstructured business process. It is assumed that it contains process-related data for extracting the model and case-related data for developing profiles.
Clustering and Filtering – Real-life logs contain a lot of noise. In addition, the underlying process models could be complex. The purpose of this stage is to refine the logs through filtering and clustering based on some attributes. The current PROM plug-ins will be assessed for their appropriateness in profile generation.
Process Mining – The refined log is mined to discover the underlying process model, which is used as the basis for profile generation. This research will seek to identify appropriate plug-ins for this task.
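One simple way to reduce noise before mining, shown here only as a sketch, is to drop cases whose activity sequence (variant) is rare. The function name, the `min_count` threshold and the example log are assumptions for illustration; the study itself leaves the choice of filtering and clustering technique open.

```python
from collections import Counter


def filter_infrequent_variants(log, min_count=2):
    """Keep only cases whose activity sequence occurs at least min_count times.

    `log` maps a case id to its list of activity names.
    """
    # A variant is the exact sequence of activities of a case.
    variant_counts = Counter(tuple(trace) for trace in log.values())
    return {
        case_id: trace
        for case_id, trace in log.items()
        if variant_counts[tuple(trace)] >= min_count
    }


# Hypothetical log, invented for illustration.
raw_log = {
    "c1": ["home", "search", "product"],
    "c2": ["home", "search", "product"],
    "c3": ["home", "product", "home", "help"],  # occurs once, treated as noise
}
filtered = filter_infrequent_variants(raw_log, min_count=2)
```

A frequency threshold is a blunt instrument: in a truly unstructured process many legitimate cases may be unique, so the threshold would need to be chosen with the domain in mind.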
Profiling Hypothesis ....
Association rule mining – This will play a major part in generating profiles. The idea is to map every path in the process model to the characteristics that define its users, based on association rules. The first part of this study will focus on this.
PROM Analysis – We recognise that there are many PROM plug-ins that can be used to provide some profile-related information. They will be analysed in order to determine how appropriate they are and to develop some guidelines for profiling.
Intra Profile – The association rules generated are only useful if they do not contradict one another. This stage will seek to develop a mechanism to refine the rules by removing any contradictions.
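The Intra Profile stage could be sketched as follows. The slides do not define what counts as a contradiction, so this sketch assumes one possible reading: two rules contradict when the same actor attributes predict different branches out of the same activity. The rule representation and the "keep the most confident rule" policy are likewise assumptions.

```python
def remove_contradictions(rules):
    """Resolve rules that share an antecedent and a source activity.

    Each rule is (antecedent, segment, confidence). The antecedent is a
    frozenset of (attribute, value) pairs and the segment is a pair
    (source activity, target activity). Assumption: among conflicting
    rules, only the one with the highest confidence is kept.
    """
    best = {}
    for antecedent, segment, confidence in rules:
        key = (antecedent, segment[0])
        current = best.get(key)
        if current is None or confidence > current[2]:
            best[key] = (antecedent, segment, confidence)
    return list(best.values())


# Hypothetical rules, invented for illustration.
rules = [
    (frozenset({("age", "18-25")}), ("home", "search"), 0.80),
    (frozenset({("age", "18-25")}), ("home", "product"), 0.35),
    (frozenset({("member", "yes")}), ("product", "checkout"), 0.90),
]
consistent = remove_contradictions(rules)
```

Here the two rules for `age = 18-25` disagree about where an actor goes after `home`, so only the stronger one survives.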
Profiling Hypothesis ....
Filtering – This stage draws on knowledge of the domain under study. This knowledge should be used to ensure that the profile generated clearly reflects the expected behaviour patterns. A specific domain will be identified in order to illustrate the concept.
Profile – The expected output of all the processes explained above is a complete profile. The study will explore how this can be achieved.
Section 3 Association rule mining for profile generation
The Idea
For every path (arc between two places) of an unstructured process model, develop a list of characteristics that define the attributes of the actors who follow the path.
The profile of an actor is the list of attributes defined by the path followed.
Approach
Develop an algorithm to generate association rules.
Implement the algorithm in PROM.
Develop a model using CPN tools.
Analyse the results using the plug-in.
Refine the algorithm and idea until the results are satisfactory.
Test the plug-in using real-life logs.
Refine the idea based on the results obtained.
Write a paper on the findings.
The Website Browse Model
Discovered Process Model
Association Rule Mining
Goal: Given an unstructured event log in which each case contains some events and data attributes from a given collection:
Develop a process model that defines the control flow.
For every segment in the model, generate association rules:
  Express the segment as a sequence
  Find the attributes of the actors that are associated with the segment
Develop a set of rules that govern the entire path for each case.
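The goal above can be sketched with the standard support and confidence measures from association rule mining. This is a minimal illustration, not the algorithm the study sets out to develop: the rule form `(attribute = value) -> segment`, the thresholds, the function name and the example cases are all assumptions.

```python
from collections import Counter


def mine_segment_rules(cases, min_support=0.3, min_confidence=0.7):
    """Mine rules of the form (attribute = value) -> segment.

    `cases` is a list of (attributes, trace) pairs: attributes is a dict of
    case data and trace is a list of activity names. A segment is a pair of
    consecutive activities. Support and confidence are counted per case.
    """
    item_counts = Counter()   # cases having a given (attribute, value)
    joint_counts = Counter()  # cases having both the item and the segment
    for attributes, trace in cases:
        segments = set(zip(trace, trace[1:]))
        for item in attributes.items():
            item_counts[item] += 1
            for segment in segments:
                joint_counts[(item, segment)] += 1

    rules = []
    for (item, segment), joint in joint_counts.items():
        support = joint / len(cases)            # share of all cases
        confidence = joint / item_counts[item]  # share of cases with the item
        if support >= min_support and confidence >= min_confidence:
            rules.append((item, segment, support, confidence))
    return rules


# Hypothetical cases, invented for illustration.
cases = [
    ({"member": "yes"}, ["home", "product", "checkout"]),
    ({"member": "yes"}, ["home", "search", "product", "checkout"]),
    ({"member": "no"}, ["home", "search"]),
    ({"member": "no"}, ["home", "search", "product"]),
]
rules = mine_segment_rules(cases)
```

With these example cases, two rules pass both thresholds: members are associated with the segment from `product` to `checkout`, and non-members with the segment from `home` to `search`.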
Next Steps
Develop a definitive and detailed algorithm.
Develop a plug-in in PROM to test the algorithm.