
1 Xiaochen Zhu (1), Shaoxu Song (1), Xiang Lian (2), Jianmin Wang (1), Lei Zou (3)
(1) Tsinghua University, China; (2) University of Texas - Pan American, USA; (3) Peking University, China
1/21 SIGMOD 2014

2
- Motivation
- Event Matching Similarity
- Structural Similarity Function
- Iterative Computation
- Estimation
- Matching Composite Events
- Experiments
- Conclusion

3
- Information systems play an important role in large enterprises:
  - Enterprise Resource Planning (ERP)
  - Office Automation (OA)
- These systems record the business history in their event logs.
Trace ID  Trace
1         ACDEF
2         BCDFE
3         ACDFE
4         ACDFE
5         ACDEF
6         BCDEF
7         BCDFE
8         BCDEF
9         BCDFE
10        BCDFE
Event ID  Trace ID  Event Name           Timestamp
1         1         Pay by Cash (A)      04-22 13:33:34
2         1         Check Inventory (C)  04-22 15:18:11
3         1         Validate (D)         04-22 15:31:50
4         1         Ship Goods (E)       04-23 08:14:26
5         1         Email Customer (F)   04-23 08:17:18
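
As a minimal sketch of how the event table above relates to the trace table, the records can be grouped by trace ID and ordered by timestamp. The tuple layout and the function name `group_into_traces` are assumptions made for illustration; the slides do not describe how the logs are parsed.

```python
from collections import defaultdict

# Event records as (event_id, trace_id, event_name, timestamp), copied from the slide.
events = [
    (1, 1, "Pay by Cash (A)", "04-22 13:33:34"),
    (2, 1, "Check Inventory (C)", "04-22 15:18:11"),
    (3, 1, "Validate (D)", "04-22 15:31:50"),
    (4, 1, "Ship Goods (E)", "04-23 08:14:26"),
    (5, 1, "Email Customer (F)", "04-23 08:17:18"),
]


def group_into_traces(records):
    """Return {trace_id: [event_name, ...]} with events ordered by timestamp."""
    by_trace = defaultdict(list)
    for _event_id, trace_id, name, timestamp in records:
        by_trace[trace_id].append((timestamp, name))
    # Assumption: all timestamps use the same zero-padded "MM-DD HH:MM:SS" format
    # within one year, so sorting the strings sorts them chronologically.
    return {
        trace_id: [name for _timestamp, name in sorted(items)]
        for trace_id, items in by_trace.items()
    }


traces = group_into_traces(events)
```

Here `traces[1]` lists the five events of trace 1 in the order A, C, D, E, F, which is the trace written as ACDEF in the trace table.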

4
- Complex event processing
- Provenance analysis
- Decision support
Exploring the correspondence among events
[Diagram labels: Information systems; Event Logs (Beijing Subsidiary); Event Logs (Shanghai Subsidiary); Event Logs (Hong Kong Subsidiary); Business Data Warehouse]

5
- Different events may represent the same activity
ID  Trace
t1  Pay by Cash (A) → Check Inventory (C) → Validate (D) → Ship Goods (E) → Email Customer (F)
t2  Pay by Credit Card (B) → Check Inventory (C) → Validate (D) → Email Customer (F) → Ship Goods (E)
…
ID  Trace
s1  Order Accepted (1) → Pay by Cash (2) → Inventory Checking & Validation (4) → ????????? (5) → Send Notification (6)
s2  Order Accepted (1) → Pay by Credit Card (3) → Inventory Checking & Validation (4) → Send Notification (6) → ???????? (5)
…
[Annotations: Linguistic Matching; Dislocated Matching; Semantic Matching; Opaque Matching; Composite Events Matching]

6
- Text similarity fails
- Statistics and structural information
- Event Log
- Event Dependency Graph (V, E, f)
Trace ID  Trace
1         ACDEF
2         BCDFE
3         ACDFE
4         ACDFE
5         ACDEF
6         BCDEF
7         BCDFE
8         BCDEF
9         BCDFE
10        BCDFE
[Figure: event dependency graph over events A-F. Node labels give the frequency of appearance, e.g. f(A) = 0.4; edge labels give the frequency of consecutive events, e.g. f(B,C) = 0.6.]
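
The node and edge labels of the dependency graph can be recomputed from the ten traces on this slide. The sketch below assumes that "frequency" means the fraction of traces in which an event (or a pair of consecutive events) occurs; that reading reproduces f(A) = 0.4 and f(B,C) = 0.6, but the slide does not state the definition explicitly. The function name `dependency_graph` is hypothetical.

```python
from collections import Counter

# The ten traces listed on the slide, one character per event.
traces = ["ACDEF", "BCDFE", "ACDFE", "ACDFE", "ACDEF",
          "BCDEF", "BCDFE", "BCDEF", "BCDFE", "BCDFE"]


def dependency_graph(traces):
    """Return (f_node, f_edge): normalized frequencies of events and of
    consecutive event pairs, counted at most once per trace (an assumption)."""
    node_count = Counter()
    edge_count = Counter()
    for trace in traces:
        node_count.update(set(trace))                    # events appearing in this trace
        edge_count.update(set(zip(trace, trace[1:])))    # consecutive pairs in this trace
    n = len(traces)
    f_node = {v: count / n for v, count in node_count.items()}
    f_edge = {e: count / n for e, count in edge_count.items()}
    return f_node, f_edge


f_node, f_edge = dependency_graph(traces)
```

With these traces the graph has seven edges: A→C, B→C, C→D, D→E, D→F, E→F, and F→E.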

7
[Table comparing matching approaches against matching challenges; cell contents not recoverable]
Challenges (columns): Linguistic Matching, Semantic Matching, Opaque Matching, Dislocated Matching, Composite Events
Approaches (rows): Graph Edit Distance [1], Opaque Schema Matching [2], Behavioral Matching [3], Event Matching Similarity
1. R. M. Dijkman, M. Dumas, and L. García-Bañuelos. Graph matching algorithms for business process model similarity search. In BPM, pages 48–63, 2009.
2. J. Kang and J. F. Naughton. On schema matching with opaque column names and data values. In SIGMOD Conference, pages 205–216, 2003.
3. S. Nejati, M. Sabetzadeh, M. Chechik, S. M. Easterbrook, and P. Zave. Matching and merging of statecharts specifications. In ICSE, pages 54–64, 2007.

8
Framework: Event Logs → Dependency Graphs → Event Matching Similarities → Correspondences → Composite Event Matching
Event logs: log 1, trace 1: ACDEF, …; log 2, trace 1: 12456, …
Event matching similarities:
   1     2     3     4     5     6
A  0.23  0.80  0.52  0.20  0.15  0.19
B  0.38  0.53  0.76  0.24  0.20  0.23
C  0.30  0.16  0.20  0.61  0.20  0.22
D  0.34  0.15  0.20  0.37  0.24  0.25
E  0.27  0.21  0.19  0.18  0.28  0.20
F  0.30  0.19  0.23        0.20  0.72
Correspondences: A → 2, B → 3, C → 4, D → 1, E → 5, F → 6
Composite event matching: A → 2, B → 3, {C,D} → 4, E → 5, F → 6
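
The slide does not say how correspondences are selected from the similarity matrix. As one possible reading, the sketch below applies a greedy one-to-one selection (repeatedly take the highest-scoring pair whose row and column are both unused); this is an assumption, not necessarily the authors' method, though on this matrix it yields the correspondences listed on the slide. The F-4 cell is missing from the transcript and is skipped.

```python
rows = "ABCDEF"
cols = "123456"
# Similarity matrix from the slide; None marks the F-4 cell, which is
# missing in the transcript.
matrix = [
    [0.23, 0.80, 0.52, 0.20, 0.15, 0.19],
    [0.38, 0.53, 0.76, 0.24, 0.20, 0.23],
    [0.30, 0.16, 0.20, 0.61, 0.20, 0.22],
    [0.34, 0.15, 0.20, 0.37, 0.24, 0.25],
    [0.27, 0.21, 0.19, 0.18, 0.28, 0.20],
    [0.30, 0.19, 0.23, None, 0.20, 0.72],
]


def greedy_matching(rows, cols, matrix):
    """Pick pairs in descending similarity, using each row and column once."""
    pairs = [
        (matrix[i][j], r, c)
        for i, r in enumerate(rows)
        for j, c in enumerate(cols)
        if matrix[i][j] is not None   # skip unknown cells
    ]
    pairs.sort(reverse=True)
    matched, used_cols = {}, set()
    for _score, r, c in pairs:
        if r not in matched and c not in used_cols:
            matched[r] = c
            used_cols.add(c)
    return matched


correspondences = greedy_matching(rows, cols, matrix)
```

The pairs are picked in the order A-2 (0.80), B-3 (0.76), F-6 (0.72), C-4 (0.61), D-1 (0.34), E-5 (0.28). An optimal assignment algorithm could be substituted for the greedy loop without changing the interface.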

9
- Motivation
- Event Matching Similarity
- Intuition
- Iterative Computation
- Estimation
- Matching Composite Events
- Experiments
- Conclusion

10
- Intuition for evaluating the similarity of two events v1 and v2:
  1. S(v1, v2) = 1 if neither v1 nor v2 has an input neighbor;
  2. v1 is similar to v2 if they frequently share similar input neighbors.
- Problem: cannot deal with dislocated matching
* G. Jeh and J. Widom. SimRank: a measure of structural-context similarity. In KDD, pages 538–543, 2002.
[Figure: the two dependency graphs, over events A-F and 1-6]
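
The two rules above can be sketched as a plain SimRank-style recurrence across two graphs. This is an illustration of the stated intuition only, not the authors' similarity function: it ignores the frequencies f, and the decay factor 0.8, the iteration count, and the function name are assumptions. The second graph's edges are taken from the two traces s1 and s2 shown on slide 5.

```python
from itertools import product

# Input neighbors (predecessors) of each event in the two dependency graphs.
preds1 = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"],
          "E": ["D", "F"], "F": ["D", "E"]}
preds2 = {"1": [], "2": ["1"], "3": ["1"], "4": ["2", "3"],
          "5": ["4", "6"], "6": ["4", "5"]}


def simrank_two_graphs(preds1, preds2, decay=0.8, iterations=20):
    """SimRank-style similarity between the nodes of two directed graphs."""
    node_pairs = list(product(preds1, preds2))
    sim = {pair: 0.0 for pair in node_pairs}
    for _ in range(iterations):
        new_sim = {}
        for u, v in node_pairs:
            in_u, in_v = preds1[u], preds2[v]
            if not in_u and not in_v:
                new_sim[(u, v)] = 1.0      # rule 1: neither has an input neighbor
            elif not in_u or not in_v:
                new_sim[(u, v)] = 0.0      # only one of them has input neighbors
            else:
                # rule 2: average similarity over all pairs of input neighbors
                total = sum(sim[(a, b)] for a in in_u for b in in_v)
                new_sim[(u, v)] = decay * total / (len(in_u) * len(in_v))
        sim = new_sim
    return sim


sim = simrank_two_graphs(preds1, preds2)
```

Under this sketch A scores 1 against event 1 and 0 against event 2, because A has no input neighbor while 2 does. That is the dislocated-matching problem named on the slide: the second log starts with an extra event (Order Accepted), so the true counterpart of A is penalized.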

11
[Figure: the two dependency graphs, over events A-F and 1-6]

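12
[Figure: the two dependency graphs, over events A-F and 1-6]
Similarity matrices after I iterations (rows A-F, columns 1-6; the first row and first column are unlabeled on the slide):
I = 0
         1     2     3     4     5     6
   1.00  0     0     0     0     0     0
A  0     0     0     0     0     0     0
B  0     0     0     0     0     0     0
C  0     0     0     0     0     0     0
D  0     0     0     0     0     0     0
E  0     0     0     0     0     0     0
F  0     0     0     0     0     0     0
I = 1
         1     2     3     4     5     6
   1.00  0     0     0     0     0     0
A  0     0.23  0.80  0.52  0.20  0.15  0.19
B  0     0.38  0.53  0.76  0.24  0.20  0.23
C  0     0.30  0.10  0.13  0.40  0.13  0.17
D  0     0.34  0.11  0.15  0.34  0.17        (one value missing; column positions uncertain)
E  0     0.27  0.14  0.13                    (three values missing; column positions uncertain)
F  0     0.30  0.13  0.15  0.18  0.13  0.63
I = 2
         1     2     3     4     5     6
   1.00  0     0     0     0     0     0
A  0     0.23  0.80  0.52  0.20  0.15  0.19
B  0     0.38  0.53  0.76  0.24  0.20  0.23
C  0     0.30  0.16  0.20  0.61  0.19  0.22
D  0     0.34  0.15  0.20  0.36  0.21  0.22
E  0     0.27  0.21  0.19  0.17  0.26  0.19
F  0     0.30  0.19  0.23  0.22  0.19  0.70
I = 20
         1     2     3     4     5     6
   1.00  0     0     0     0     0     0
A  0     0.23  0.80  0.52  0.20  0.15  0.19
B  0     0.38  0.53  0.76  0.24  0.20  0.23
C  0     0.30  0.16  0.20  0.61  0.20  0.22
D  0     0.34  0.15  0.20  0.37  0.24  0.25
E  0     0.27  0.21  0.19  0.18  0.28  0.20
F  0     0.30  0.19  0.23        0.20  0.72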

13 Trade-off between accuracy and efficiency.

14
- Motivation
- Event Matching Similarity
- Structural Similarity Function
- Iterative Computation
- Estimation
- Matching Composite Events
- Experiments
- Conclusion

15
- Candidates of composite events:
  - C and D, E and F, …
  - Pre-defined or discovered automatically
- Heuristics:
  - Which candidate improves the average similarity
[Figure: the dependency graphs over A-F and 1-6, and the variants in which C,D or E,F are merged into one composite event]
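
The heuristic above can be sketched as follows: merge each candidate pair into a single composite event, re-score the result, and keep the candidate that improves the score most. The helper names and the `average_similarity` callback are hypothetical; the slides do not specify how the average similarity is computed, so it is passed in as a function.

```python
def merge_composite(trace, first, second):
    """Replace each consecutive occurrence of (first, second) with one composite event."""
    merged, i = [], 0
    while i < len(trace):
        if i + 1 < len(trace) and trace[i] == first and trace[i + 1] == second:
            merged.append(first + "," + second)
            i += 2
        else:
            merged.append(trace[i])
            i += 1
    return merged


def best_candidate(traces, candidates, average_similarity):
    """Return the candidate pair whose merge most improves the score, or None."""
    best, best_score = None, average_similarity(traces)
    for first, second in candidates:
        merged = [merge_composite(t, first, second) for t in traces]
        score = average_similarity(merged)
        if score > best_score:          # keep only strict improvements
            best, best_score = (first, second), score
    return best


merged_trace = merge_composite(list("ACDEF"), "C", "D")
```

Here `merged_trace` is `["A", "C,D", "E", "F"]`. In a full pipeline, `average_similarity` would rebuild the dependency graph from the merged traces and recompute the event matching similarities against the other log.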

16
- Motivation
- Event Matching Similarity
- Structural Similarity Function
- Iterative Computation
- Estimation
- Matching Composite Events
- Experiments
- Conclusion

17
- Real-life data set: obtained from a real bus manufacturer.
- The true event matching is generated manually by domain experts.
- Criterion: the accuracy of event matching is evaluated by the F-measure of precision and recall.
- Baselines: Graph Edit Distance [1], Opaque Matching [2], Behavioral Matching [3].
No. of Event Logs  149     Min Event Size  2
No. of Traces      6000    Max Event Size  11
1. R. M. Dijkman, M. Dumas, and L. García-Bañuelos. Graph matching algorithms for business process model similarity search. In BPM, pages 48–63, 2009.
2. J. Kang and J. F. Naughton. On schema matching with opaque column names and data values. In SIGMOD Conference, pages 205–216, 2003.
3. S. Nejati, M. Sabetzadeh, M. Chechik, S. M. Easterbrook, and P. Zave. Matching and merging of statecharts specifications. In ICSE, pages 54–64, 2007.
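
The evaluation criterion can be sketched with the standard definitions: precision is the fraction of found correspondences that are correct, recall is the fraction of true correspondences that were found, and the F-measure is their harmonic mean. The found set below is the correspondence list from slide 8; the ground-truth set is hypothetical, chosen only to illustrate the arithmetic.

```python
def f_measure(found, truth):
    """Return (precision, recall, F-measure) for two sets of correspondences."""
    found, truth = set(found), set(truth)
    correct = len(found & truth)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Correspondences found without composite matching (slide 8).
found = {("A", "2"), ("B", "3"), ("C", "4"), ("D", "1"), ("E", "5"), ("F", "6")}
# Hypothetical ground truth, assuming C and D both correspond to event 4.
truth = {("A", "2"), ("B", "3"), ("C", "4"), ("D", "4"), ("E", "5"), ("F", "6")}

precision, recall, f1 = f_measure(found, truth)
```

With this hypothetical ground truth, five of the six found pairs are correct and five of the six true pairs are found, so precision, recall, and F-measure all equal 5/6.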

18
[Figure: experimental results; legend label: Our Approach]


20
- Event matching framework:
  - Works well with dislocated matching.
  - Works well with opaque event names.
  - An estimation function for the trade-off between accuracy and efficiency.
  - Heuristics for matching composite events.

21 Thanks!

