A General Framework for Correlating Business Process Characteristics
Massimiliano de Leoni, Wil M. P. van der Aalst, Marcus Dees
Introduction / Process Mining
Performance related: What do the cases that are late have in common? Do people work faster if they have more work?
Resource related: What characterizes the workers that skip such-and-such checking activity? Which types of claims are more prone to lead to wrong treatments by resources?
Cost and risk related: Which types of claims are more risky or more expensive?
[Figure labels: Information System, event log, Discovery, model, Conformance, Correlation Analysis, Process Mining]
The problem of correlating process characteristics
Relate any process characteristic (the dependent characteristic, configurable) to other process characteristics (the independent characteristics, configurable) associated with given events (event filtering, configurable).
A general framework for correlating process characteristics
Inputs: Event Log, Process Model, Additional Objects, Context Data
1. Define Analysis Use Case: Dependent Characteristic, Independent Characteristics, Trace Manipulation, Filtering
2. Manipulate and Enrich Event Log (figure labels: Conformance, Resource, Time, Data, Weather, ...)
3. Make Analysis, producing the Analysis Result
The steps are repeated if the analysis needs to be refined.
Example: Relating the cost of First Hospital Admission to the performing resource / 1
Case ID | Timestamp | Activity | Resource | Cost
1 | 30-11-2011:08.27 | First Hospital Admission | Carol | 90
1 | 2-12-2011:13.24 | Preoperative Screening | Susanne | 350
1 | 4-12-2011:8.30 | Laparoscopic Gastrectomy | Andrew | 500
1 | 4-12-2011:13.30 | Nursing | Paul | 250
Cases 2 to 4 (cells as extracted, not recoverable row by row): 2 1-12-2011:11.00 Giuseppe 2-12-2011:15.28 Simon 2-12-2011:16.35 Clare 3-12-2011:13.00 3-12-2011:15.00 4-12-2011:9.00 Victor 34 3 7-12-2011:10.00 Jane 200 8-12-2011:13.24 Giulia 9-12-2011:16.35 4 6-12-2011:14.00 Gianluca Robert 10-12-2011:16.35 13-12-2011:11.00 13-12-2011:16.00 300
Dependent characteristic: Cost
Independent characteristics: Resource
Filtering: Retain all events for First Hospital Admission
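The analysis use case on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the `AnalysisUseCase` class, the `build_instances` helper, and the dictionary keys are hypothetical names, not part of the ProM implementation, and only the four rows of case 1 are used because the other rows are not fully legible in the table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]


@dataclass
class AnalysisUseCase:
    """Hypothetical container for the three configurable parts of a use case."""
    dependent: str                        # name of the dependent characteristic
    independent: List[str]                # names of the independent characteristics
    event_filter: Callable[[Event], bool]  # which events to retain


def build_instances(log: List[Event], use_case: AnalysisUseCase) -> List[Tuple[tuple, object]]:
    """Return one (independent values, dependent value) pair per retained event."""
    instances = []
    for event in log:
        if use_case.event_filter(event):
            x = tuple(event[name] for name in use_case.independent)
            y = event[use_case.dependent]
            instances.append((x, y))
    return instances


# Rows of case 1 as read from the table; cases 2 to 4 are omitted.
log = [
    {"case": 1, "activity": "First Hospital Admission", "resource": "Carol", "cost": 90},
    {"case": 1, "activity": "Preoperative Screening", "resource": "Susanne", "cost": 350},
    {"case": 1, "activity": "Laparoscopic Gastrectomy", "resource": "Andrew", "cost": 500},
    {"case": 1, "activity": "Nursing", "resource": "Paul", "cost": 250},
]

use_case = AnalysisUseCase(
    dependent="cost",
    independent=["resource"],
    event_filter=lambda e: e["activity"] == "First Hospital Admission",
)

instances = build_instances(log, use_case)
print(instances)  # [(('Carol',), 90)]
```

Each retained event yields one learning instance; a classifier such as a decision tree would then be trained on these pairs.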
Example: Relating the cost of First Hospital Admission to the performing resource / 2
Problem: the dependent characteristic is defined over a continuous domain.
Solution: discretization techniques, namely equal-width or equal-frequency binning.
Dependent characteristic: Cost
Independent characteristics: Resource
Filtering: Retain all events for First Hospital Admission
[Decision tree figure: split on Resource = "Jane" versus Resource != "Jane"; leaf cost intervals (50,200) and (200,300)]
Example: Decision Point Analysis
What determines which activity is executed next when a decision point is reached?
[Figure labels: Decision Points; attributes salary, age, installment, amount, length]
Example: Decision Point Analysis / 2
Trace augmentation is needed: each event is extended with the next activity in its trace.
Case ID | Timestamp | Activity | Resource | Cost | Next Activity in Trace
1 | 30-11-2011:08.27 | First Hospital Admission | Carol | 90 | Preoperative Screening
1 | 2-12-2011:13.24 | Preoperative Screening | Susanne | 350 | Laparoscopic Gastrectomy
1 | 4-12-2011:8.30 | Laparoscopic Gastrectomy | Andrew | 500 | Nursing
1 | 4-12-2011:13.30 | Nursing | Paul | 250 | null
Cases 2 to 4 (cells as extracted, not recoverable row by row): 2 1-12-2011:11.00 Giuseppe 2-12-2011:15.28 Simon 2-12-2011:16.35 Clare 3-12-2011:13.00 3-12-2011:15.00 4-12-2011:9.00 Victor 34 3 7-12-2011:10.00 Jane 200 8-12-2011:13.24 Giulia 9-12-2011:16.35 4 6-12-2011:14.00 Gianluca Robert 10-12-2011:16.35 13-12-2011:11.00 13-12-2011:16.00 300 First Hospital Admission
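A minimal sketch of this trace augmentation, assuming the log is a list of event dictionaries already sorted by timestamp within each case. The function name `add_next_activity` and the keys `case`, `activity`, and `next_activity` are hypothetical; only case 1 from the table is used.

```python
from collections import defaultdict


def add_next_activity(log):
    """Return a copy of the log in which every event also carries the
    activity that follows it in the same trace (None for the last event)."""
    traces = defaultdict(list)
    for event in log:  # group events by case, preserving their order
        traces[event["case"]].append(event)

    augmented = []
    for events in traces.values():
        for i, event in enumerate(events):
            nxt = events[i + 1]["activity"] if i + 1 < len(events) else None
            augmented.append({**event, "next_activity": nxt})
    return augmented


# Case 1 as read from the table; other cases omitted.
log = [
    {"case": 1, "activity": "First Hospital Admission"},
    {"case": 1, "activity": "Preoperative Screening"},
    {"case": 1, "activity": "Laparoscopic Gastrectomy"},
    {"case": 1, "activity": "Nursing"},
]

augmented = add_next_activity(log)
next_activities = [e["next_activity"] for e in augmented]
print(next_activities)
# ['Preoperative Screening', 'Laparoscopic Gastrectomy', 'Nursing', None]
```

The new attribute can then be selected as the dependent characteristic, with the remaining event attributes available as independent characteristics.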
Example: Decision Point Analysis / 2
In the augmented event table, the dependent characteristic is Next Activity in Trace; the independent characteristics are drawn from the remaining columns.
Example: Duration of case executions
Dependent characteristic: the duration of the case execution.
Many well-studied problems are just ad-hoc instances of our framework.
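As a rough illustration, a case duration can be derived from the first and last timestamps of a trace. The sketch below uses the four timestamps of case 1 from the earlier example table; the function name and the choice of hours as the unit are assumptions, not something the slides prescribe.

```python
from datetime import datetime

# Timestamp layout used in the example table, e.g. "30-11-2011:08.27".
TS_FORMAT = "%d-%m-%Y:%H.%M"


def case_duration_hours(timestamps):
    """Duration of a case in hours: last event time minus first event time."""
    parsed = [datetime.strptime(ts, TS_FORMAT) for ts in timestamps]
    return (max(parsed) - min(parsed)).total_seconds() / 3600.0


case_1 = [
    "30-11-2011:08.27",
    "2-12-2011:13.24",
    "4-12-2011:8.30",
    "4-12-2011:13.30",
]

duration = case_duration_hours(case_1)
print(round(duration, 2))  # 101.05
```

Attaching this value to every event of the case (or to its last event) makes it available as a dependent characteristic like any other attribute.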
Implementation of the framework
Reference implementation available for ProM 6.4. Check out the FeaturePrediction package of ProM.
The main input is an event log. Additional input objects may be required for specific log manipulations.
Join the demo session on Wednesday from 16.00 to 17.30!
A step-through guide: explaining the duration of activities
Health-care process enacted in a Dutch hospital. How is the duration of doctor appointments related to the characteristics of the patient and of their treatment?
Selection of the characteristics to consider & Augmentation
Filtering on activity names
Only events of the activity Afspraak (Dutch for Appointment) should be considered.
Choosing the dependent characteristic & the parameters for constructing decision trees
[Figure label: Parameters for building decision trees]
Discretization
Equal-width binning: the entire range is divided into n intervals of equal size.
Equal-frequency binning: the entire range is divided into n intervals such that each interval contains the same number of observed values.
n = number of bins
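The two binning strategies can be contrasted with a small sketch. The function names and the sample values are made up for illustration (the values are not taken from the hospital log), and the code is not the ProM implementation.

```python
def equal_width_edges(values, n):
    """Split [min, max] into n intervals of equal size; return the n + 1 edges."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n
    edges = [lo + i * width for i in range(n + 1)]
    edges[-1] = hi  # avoid floating-point drift on the last edge
    return edges


def assign_bin(value, edges):
    """Index of the interval containing value (intervals closed on the left,
    the last interval also closed on the right)."""
    for i in range(len(edges) - 1):
        if value < edges[i + 1]:
            return i
    return len(edges) - 2


def equal_frequency_bins(values, n):
    """Split the sorted values into n groups of (nearly) equal size."""
    ordered = sorted(values)
    size = len(ordered) / n
    return [ordered[round(i * size):round((i + 1) * size)] for i in range(n)]


# Hypothetical durations with one outlier.
values = [1, 2, 3, 4, 5, 6, 7, 100]

edges = equal_width_edges(values, 2)
width_bins = [assign_bin(v, edges) for v in values]
freq_bins = equal_frequency_bins(values, 2)

print(edges)       # [1.0, 50.5, 100]
print(width_bins)  # [0, 0, 0, 0, 0, 0, 0, 1]
print(freq_bins)   # [[1, 2, 3, 4], [5, 6, 7, 100]]
```

With a skewed distribution, equal-width binning puts almost all observations into one interval, while equal-frequency binning keeps the classes balanced at the cost of intervals of very different size.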
The resulting decision tree
Correlating the duration of the execution of activity Afspraak to the other characteristics of the process.
The UWV case study
UWV is the institute that manages unemployment benefits in the Netherlands.
UWV opens a reclamation when it eventually discovers that someone has received unemployment benefits without being entitled to them. In many cases, the excess amount is not recovered.
UWV wants to predict the risk of eventual reclamations. For prediction, one needs to relate reclamations to the characteristics of the customer and of the process execution.
Customer-related question Are customer characteristics linked to the occurrence of reclamations?
Conformance-related question If the prescribed process flow is not followed, will this influence whether or not a reclamation occurs?
Conclusion
Proposed a framework to correlate business process characteristics.
Useful for operational support (prediction and recommendation) and process improvement.
Many well-studied problems are just ad-hoc instances of our framework, such as:
Decision Point Analysis
Prediction of the remaining time to the end of an instance
Decision support systems to reduce costs, risks, non-conformance, and global execution time
Implemented in ProM, customizable and extendible.
Join the demo session on Wednesday from 16.00 to 17.30!