Published by Lawrence Gaines. Modified over 9 years ago.
1
Data Envelopment Analysis: A Tool for Data Mining and Analytics
Joe Zhu, School of Business, Worcester Polytechnic Institute, Worcester, MA 01609
jzhu@wpi.edu, www.deafrontier.net
2
What is DEA?
DEA was developed and published in 1978.
It is a non-parametric approach to estimating production functions.
Thus, we have multiple inputs and multiple outputs (of a production function).
DEA tries to identify the efficient units.
3
What is DEA exactly?
DEA is more than a production efficiency estimate.
It is a form of balanced benchmarking (Sherman and Zhu, 2013) that enables companies to benchmark and locate best practices that are not visible through other commonly used management methodologies.
It helps executives study the top-performing units, identify best practices, transfer that knowledge throughout the organization to enhance performance, and test assumptions that might be counter-productive.
4
A tool for benchmarking
If one benchmarks the performance of computers, it is natural to consider different features (screen size and resolution, memory size, processing speed, hard disk size, and others).
One would then have to classify these features into “inputs” and “outputs” in order to apply a proper DEA analysis.
However, these features may not actually represent inputs and outputs at all, in the standard notion of production.
5
DEA - revisit
Multiple inputs: the smaller the better.
Multiple outputs: the larger the better.
This gives a rule for classifying metrics.
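The classification rule can be sketched in a few lines of Python. This is only an illustration: the function name `classify_metrics` and the metrics `price` and `weight` are hypothetical and do not come from the slides.

```python
def classify_metrics(directions):
    """Split metrics into DEA inputs and outputs by their preferred direction.

    `directions` maps a metric name to "smaller" (the smaller the better,
    treated as an input) or "larger" (the larger the better, treated as an output).
    """
    inputs, outputs = [], []
    for name, direction in directions.items():
        if direction == "smaller":
            inputs.append(name)
        elif direction == "larger":
            outputs.append(name)
        else:
            raise ValueError(f"unknown direction for {name!r}: {direction!r}")
    return inputs, outputs


# Hypothetical metrics for the computer-benchmarking example.
metrics = {
    "price": "smaller",
    "weight": "smaller",
    "memory_size": "larger",
    "screen_resolution": "larger",
}
inputs, outputs = classify_metrics(metrics)
```

Here `inputs` holds the metrics to be kept small and `outputs` the metrics to be made large, regardless of whether they are production inputs or outputs in the usual sense.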
6
DMU
The definition of a DMU (decision making unit) is generic and flexible.
Numerous applications are found in areas of finance, marketing, transportation, sports, accounting, energy, sustainability, fishery, insurance and others.
7
(Relative) Efficiency
The term ‘efficiency’ here represents best practice.
Under general benchmarking, it does not necessarily mean ‘production efficiency’.
We may refer to the DEA score as a form of ‘overall performance’ of an organization.
An example: measuring the quality of care in the case of treating heart-attack patients.
Some quality indicators can be used as measures in DEA to yield a composite measure of quality: Patients Given Aspirin at Arrival, Patients Given Beta Blocker at Discharge, etc.
8
Mathematical Model
Dual
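The model itself is not reproduced in this transcript. As a hedged reference point, the standard input-oriented, constant-returns-to-scale envelopment (dual) model for a DMU under evaluation, indexed by o, is commonly written as follows. It is an assumption that this matches the formulation shown on the slide; the symbols x_ij (input i of DMU j), y_rj (output r of DMU j), λ_j and θ are the usual textbook notation, not taken from the source.

```latex
\begin{aligned}
\min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{io}, && i = 1,\dots,m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{ro}, && r = 1,\dots,s, \\
& \lambda_j \ge 0, && j = 1,\dots,n.
\end{aligned}
```

A DMU with optimal θ = 1 lies on the best-practice frontier; θ < 1 indicates that its inputs could be scaled down proportionally while a combination of peer DMUs still delivers at least its outputs.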
9
Business Analytics by Data Envelopment Analysis (DEA)
Descriptive Analytics: gain insight from historical data
Predictive Analytics: forecasting
Prescriptive Analytics: recommend decisions using optimization, simulation, etc.
Decisive Analytics: support human decisions with visual analytics
10
Data Envelopment Analysis
DEA is a data analysis tool.
Data Mining and Knowledge Discovery by DEA
More than Relative Efficiency
11
Sample Size
DEA is not a form of regression model.
It is meaningless to apply a sample size requirement to DEA.
If there are too many performance metrics given the number of DMUs, it is likely that a significant portion of DMUs will be benchmarked as best practice with a ratio of 1.
One can use certain DEA approaches to reduce the number of best-practice DMUs.
12
Regression analysis
13
Numerous Models/Approaches
One modification to DEA is called stratification.
Stratification results in many efficiency frontiers.
The first represents all DMUs with the highest efficiency, and so on down each stratified level until all DMUs have been included.
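The peeling idea behind stratification can be sketched as follows. Note the substitution: instead of solving a DEA linear program at each level, this sketch uses a simple dominance check (in the spirit of free-disposal-hull efficiency), so it illustrates the level-by-level procedure rather than the exact model behind the slides. The unit names and data are hypothetical.

```python
def dominates(a, b):
    """Return True if unit a is no worse than b on every metric and better on one.

    Each unit is a pair (inputs, outputs): smaller inputs and larger outputs
    are preferred.
    """
    a_in, a_out = a
    b_in, b_out = b
    no_worse = (all(x <= y for x, y in zip(a_in, b_in))
                and all(x >= y for x, y in zip(a_out, b_out)))
    better = (any(x < y for x, y in zip(a_in, b_in))
              or any(x > y for x, y in zip(a_out, b_out)))
    return no_worse and better


def stratify(units):
    """Peel off successive frontiers until every unit is assigned a level."""
    remaining = dict(units)
    levels = []
    while remaining:
        # Units not dominated by any other remaining unit form the next frontier.
        frontier = sorted(
            name for name, unit in remaining.items()
            if not any(dominates(other, unit)
                       for other_name, other in remaining.items()
                       if other_name != name)
        )
        levels.append(frontier)
        for name in frontier:
            del remaining[name]
    return levels


# Hypothetical DMUs: one input and one output each.
units = {
    "A": ((2,), (10,)),
    "B": ((3,), (12,)),
    "C": ((3,), (9,)),
    "D": ((4,), (8,)),
    "E": ((5,), (12,)),
}
levels = stratify(units)
```

For this data the first level contains A and B, the second C and E, and the third D. Replacing `dominates` with an LP-based efficiency test would give the DEA version of the same loop.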
14
Network Structure
15
Ship Block Manufacturing Process Performance Evaluation
16
Shipbuilding process
Business & Service Computing Laboratory
The main processes of shipbuilding consist of several work stages: Design, Cutting & Forming, Assembly, Pre-Outfitting & Painting, Pre-Erection, Erection, Quay.
For effective ship construction, a ship is divided into properly sized blocks in the design stage.
All blocks are manufactured (or assembled) into the body of a ship.
17
Management of block manufacturing process (BMP)
Effective block manufacturing process (BMP) management has been regarded as one of the most important issues in the shipbuilding industry.
A large ship usually needs more than 250 different blocks, each manufactured through a different process according to the ship’s type and size.
Many blocks are assembled into a ship, and each block has a complex manufacturing process.
Thus, effective and efficient BMP performance enables a reduction of the overall shipbuilding period and thereby the cost.
For example, if any one block includes unnecessary work stages, the related inefficient resource assignment or long queuing times in the storage yard will have a negative effect on the overall shipbuilding period and productivity.
For effective management of BMP performance, a practical and accurate performance evaluation method that considers various factors reflecting real manufacturing processes and situations is crucial.
18
Practical difficulties in evaluating BMP performance
For effective BMP management, shipbuilding companies have implemented production information systems (e.g. BAMS (Block Assembly Monitoring System) or RPMS (Real-time Progress Management System)…).
But these systems focus only on work scheduling, process monitoring and work automation.
There are at least two practical difficulties in evaluating BMP performance:
1) There are many block assembly types (e.g. Sub-assembly, Unit-assembly, and Grand-assembly...), and each assembly type is in turn classified into one of three form types (e.g. Small, Curved, and Large…).
2) There are discrepancies between actual and planned work in the form of time gaps due to various problems (e.g. work delay, urgent work, and the convergence of blocks at the end of the process…). Generally, there is a 5~9 day delay between planned work and performed work.
19
Goal of this research
This research addresses the above two practical difficulties in evaluating BMP performance.
This research proposes an integrated systematic approach to evaluate the performance of BMP in the shipbuilding industry by integrating process mining (PM) and DEA.
Diagram: Database in shipbuilding company; Data extraction; Data pre-processing; Generation of block manufacturing processes (process mining, PM); Evaluation: performance evaluation of BMP (data envelopment analysis, DEA); Guideline for improving the performance of underperforming BMPs.
20
Proposed method
21
Generation of BMPs
Extract sample log data from the database based on the defined attributes.
A BMP is generated as a form of operations flow from the extracted log data.
Consider block ID 101. It includes three operations: C1, G9 and S6.
We arrange these operations by end time in ascending order.
The sequence of operations, C1 S6 G9, is the BMP of block ID 101.
The generated BMPs are then subjected to performance evaluation.
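The generation step can be sketched in Python as below. The record layout (`block_id`, `operation`, `end_time`) and the timestamps are assumptions made for illustration; the slides only state that operations are ordered by end time and that block 101 yields C1, S6, G9.

```python
from collections import defaultdict
from datetime import datetime


def generate_bmps(log):
    """Build one operations flow (BMP) per block by sorting on end time."""
    by_block = defaultdict(list)
    for event in log:
        by_block[event["block_id"]].append(event)
    return {
        block_id: [e["operation"]
                   for e in sorted(events, key=lambda e: e["end_time"])]
        for block_id, events in by_block.items()
    }


# Hypothetical log rows for block 101; the end times are invented.
log = [
    {"block_id": 101, "operation": "C1", "end_time": datetime(2013, 3, 1, 17, 0)},
    {"block_id": 101, "operation": "G9", "end_time": datetime(2013, 3, 9, 12, 0)},
    {"block_id": 101, "operation": "S6", "end_time": datetime(2013, 3, 5, 15, 30)},
]
bmps = generate_bmps(log)
```

With these invented end times, `bmps[101]` is the sequence C1, S6, G9, matching the example on the slide.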
22
Proposed method: Block clustering
Generated BMPs are heterogeneous since there are many kinds of BMPs.
For a more accurate performance evaluation, our intention is to evaluate homogeneous BMPs.
Therefore, we classify BMPs into several peer groups by their similarity.
The similarity of BMPs is measured by the similarity index, which is calculated from two vectors:
Task vector: based on the presence or absence of the same operations in two BMPs
Transition vector: based on the sequential relationship of the operations in two BMPs
The task vector and transition vector take values from 0 to 1, with values closer to 1 indicating that two BMPs are more similar.
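The slides do not give the formula for the similarity index, so the following is only one plausible sketch. It assumes Jaccard overlap for both the task part (shared operations) and the transition part (shared consecutive operation pairs), and an equal-weight average to combine them; all function names and the `task_weight` parameter are hypothetical.

```python
def task_similarity(bmp_a, bmp_b):
    """Jaccard overlap of the operations present in two BMPs (assumed measure)."""
    ops_a, ops_b = set(bmp_a), set(bmp_b)
    union = ops_a | ops_b
    return len(ops_a & ops_b) / len(union) if union else 1.0


def transitions(bmp):
    """Set of consecutive operation pairs, e.g. C1 followed by S6."""
    return set(zip(bmp, bmp[1:]))


def transition_similarity(bmp_a, bmp_b):
    """Jaccard overlap of the direct-succession pairs (assumed measure)."""
    t_a, t_b = transitions(bmp_a), transitions(bmp_b)
    union = t_a | t_b
    return len(t_a & t_b) / len(union) if union else 1.0


def similarity_index(bmp_a, bmp_b, task_weight=0.5):
    """Weighted average of the two parts; equal weights are an assumption."""
    return (task_weight * task_similarity(bmp_a, bmp_b)
            + (1 - task_weight) * transition_similarity(bmp_a, bmp_b))


a = ["C1", "S6", "G9"]
b = ["C1", "G9", "S6"]  # same operations, different order
same_ops_score = similarity_index(a, b)
```

Under these assumptions, two BMPs with the same operations in a different order score 1 on the task part but lower on the transition part, which is the distinction the two vectors are meant to capture. The resulting pairwise scores could then feed any standard clustering method to form the peer groups.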
23
Performance evaluation
Each BMP is regarded as a DMU, and only BMPs in the same group are considered for performance evaluation.
Due to the nature of our performance metrics, we use a DEA model, developed recently by Lim & Zhu (2013), where some performance metrics have target levels.
In our case, the performance metrics are selected based on the extracted log data.
We conducted a questionnaire survey of 30 shipbuilding operating experts to obtain information on which factors are most critical to BMP performance.
24
Case study from a Korean shipbuilding company
Condition of experiment
Two projects’ event logs exported from a Block Assembly Monitoring System (BAMS) were used.
Eighty-six blocks are generated from the log data, which are then classified into six clusters.
In general, production planners assign the work resources and establish the production scheduling based on the block types defined by the empirical knowledge of shipbuilding operating experts.
We refer to these defined block types in deciding the number of clusters.
25
Case study
Clustering results, including the number of blocks and the process characteristics of each cluster
We aggregate all BMPs in cluster C5 to show a concrete instance of the clustering result.
The aggregated model of all BMPs in C5 represents BMPs performed in work shop #2.
26
Case study
The performance metrics are calculated, and their descriptive statistics are listed.
27
Case study
The evaluation results are summarized.
Average performance scores of BMPs
Performance scores of BMPs in C5
Five blocks (1XXX_622, 2XXX_509, 2XXX_622, 2XXX_631, 2XXX_642) are determined as the best-practice, whereas the remaining 14 blocks are underperforming.
In particular, 1XXX_110 and 2XXX_110 are the most underperforming blocks.
Most of the best-practice blocks have the same BMPs as Comp 101-‘C’ Grand 201-‘P’ Grand 202-‘3’ Grand 203-‘3’ Grand 301-‘3’.
28
Case study
We analyze the underperforming BMPs (blocks 2XXX_110 and 1XXX_110) from the operations execution and resources utilization perspectives.
For the analysis of an underperforming block from the operations execution perspective, we compare the difference between the planned operations flow, which is managed by production schedulers, and the actual operations flow of block 2XXX_110.
The actual operations flows for all best-practice blocks are the same as the planned operations flow.
On the other hand, the actual operations flows for the underperforming BMPs are different from the planned operations flows.
Grand 201-‘3’ was chosen discretionally by the worker for its similar operation characteristics.
Grand 201-‘P’ and Grand 201-‘3’ have very similar operation characteristics, but the work shop and items for these are different.
As a result, block 2XXX_110 might have incurred a longer waiting time and execution time.
29
Conclusion
We proposed an integrated approach to BMP performance evaluation in the shipbuilding industry by using process mining (PM) and DEA.
Through application of the proposed approach, we verified its effectiveness and practicality.
Shipbuilding operations experts, moreover, agreed that the provided guidelines can be valuable in establishing additional strategies for improving the performance and productivity of block manufacturing.
It can be said that this research makes a constructive contribution to practical block performance evaluation in the shipbuilding industry.
31
United Network for Organ Sharing (UNOS)
Many variables and observations related to lung and heart transplants.
Need for fair and accurate predictions of survival time and quality of life.
Ability for medical professionals to accurately predict best donor/recipient pairings may be flawed/biased.
Variables contributing towards accurate predictions may be many, complex, and have poorly understood relationships.
Reduction of large datasets is important.
32
Dataset
Data concerning donor/recipient for lung/heart transplants.
Over 400 variables and 100,000+ observations: BIG DATA ANALYTICS
24 variables chosen by Oztekin et al. [2]
Can reduce to 12,744 observations from cleaning.

Variables        Explanation             Variable type
Donor Age        Years                   Cont.
Recipient Age    Years                   Cont.
ABO_MAT          ABO match level         Ordinal
EINT             Ethnicity match level   Binary
GINT             Gender match level      Binary
GTIME            Graft survival time     Cont.
Etc…
33
DEANN Methodology
Steps: variables are chosen according to contribution; data is preprocessed using DEA; the ANN is trained; predictions are made.
Metrics are chosen according to importance, with no need to be few in number.
Preprocessing with DEA allows better training of the ANN.
ANN is applicable for “fuzzy” situations.
34
DEANN Methodology
12,744 records
35
DEA Results
Stratification yielded 12 efficiency levels.
Individual levels yielded a higher correlation between the recipient functional status and the input variables than many (or all) levels considered together.
The ANN is trained on one or more of these levels using ten-fold cross-validation.
DEA allows efficient observations to be utilized so that outlying transplants do not result in poor training of the ANN.
DEANN allows the ANN to be trained on efficient data, which will result in accurate predictions and faster training time.
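The ten-fold cross-validation step can be sketched with the standard library alone. This shows only how the records of the chosen level(s) might be split into folds; the ANN itself is omitted, and the function name `k_fold_indices`, the seed and the record count of 103 are hypothetical.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train, test) index lists so each record is tested exactly once."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Hypothetical: 103 records in the efficiency level(s) used for training.
splits = list(k_fold_indices(103))
```

Each of the ten splits would train the network on nine folds and evaluate it on the held-out fold, with the reported accuracy averaged over the ten runs.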