Published by Felicia Hunt. Modified over 9 years ago.
1
ICDM 2004 Business Meeting 11/4/2004
Data Mining on ICDM Submission Data
Shusaku Tsumoto, Ning Zhong and Xindong Wu
2
Data Mining on ICDM Submission Data
• 38 countries, 445 submissions
• Regular Papers: 39 (9%)
• Short Papers: 66 (14.8%)
• High Acceptance Ratio (Regular)
  – Germany: 4/15 (26.7%)
  – Finland: 2/9 (22.2%)
  – USA: 20/109 (18.3%)
3
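Country

Country | Regular | Short | Total | Ratio
USA | 20 | 28 | 109 | 44.0%
China | 3 | 4 | 55 | 12.7%
UK | 1 | 6 | 39 | 17.9%
Japan | 0 | 5 | 28 | 17.9%
Canada | 3 | 3 | 25 | 24.0%
Taiwan | 0 | 1 | 18 | 5.6%
Australia | 2 | 1 | 17 | 17.6%
Germany | 4 | 5 | 15 | 60.0%
France | 0 | 2 | 14 | 14.3%
India | 1 | 0 | 14 | 7.1%
Singapore | 0 | 3 | 12 | 25.0%
Brazil | 0 | 1 | 12 | 8.3%
Italy | 2 | 1 | 10 | 30.0%
Finland | 2 | 1 | 9 | 33.3%
Spain | 0 | 1 | 7 | 14.3%
Hong Kong | 1 | 1 | 6 | 33.3%
Top 15 | 39 | 63 | 390 | 26.2%
Total | 39 | 66 | 445 | 23.8%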
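The Ratio column in the country table appears to be (Regular + Short) / Total, while the "High Acceptance Ratio (Regular)" figures on the earlier slide use Regular / Total. The following is a minimal sketch of both calculations for a few rows copied from the table; the function names are illustrative and do not come from the slides.

```python
# Rows copied from the country table: (country, regular, short, total submissions).
ROWS = [
    ("USA", 20, 28, 109),
    ("China", 3, 4, 55),
    ("Germany", 4, 5, 15),
    ("Finland", 2, 1, 9),
]


def acceptance_ratio(regular, short, total):
    """Percentage of submissions accepted as either regular or short papers."""
    return 100.0 * (regular + short) / total


def regular_ratio(regular, total):
    """Percentage of submissions accepted as regular papers only."""
    return 100.0 * regular / total


def summarize(rows):
    """Return {country: (overall %, regular-only %)}, rounded to one decimal."""
    return {
        country: (
            round(acceptance_ratio(regular, short, total), 1),
            round(regular_ratio(regular, total), 1),
        )
        for country, regular, short, total in rows
    }


if __name__ == "__main__":
    for country, (overall, regular) in summarize(ROWS).items():
        print(f"{country:8s} overall {overall:5.1f}%  regular {regular:5.1f}%")
```

For Germany this gives 60.0% overall and 26.7% for regular papers, matching the two figures reported on the slides.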
4
Data Mining on ICDM Submission Data
• Top 5 Areas of Submissions:
  – Data mining applications
  – Data mining and machine learning algorithms and methods
  – Mining text and semi-structured data, and mining temporal, spatial and multimedia data
  – Data pre-processing, data reduction, feature selection and feature transformation
  – Soft computing and uncertainty management for data mining
• High Acceptance Ratio Areas (Regular+Short)
  – Quality assessment and interestingness metrics of data mining results: 5/10 (50.0%)
  – Data pre-processing, data reduction, feature selection and feature transformation: 14/35 (40.0%)
  – Complexity, efficiency, and scalability issues in data mining: 4/11 (36.4%)
5
Topics

Topic | Regular | Short | Total | Ratio
Data mining applications | 4 | 10 | 84 | 16.7%
Data mining and machine learning algorithms and methods | 9 | 20 | 81 | 35.8%
Mining text and semi-structured data, and mining temporal, spatial and multimedia data | 3 | 8 | 44 | 25.0%
Data pre-processing, data reduction, feature selection and feature transformation | 7 | 7 | 35 | 40.0%
Soft computing and uncertainty management for data mining | - | 3 | 34 | 8.8%
Foundations of data mining | 2 | 1 | 26 | 11.5%
Mining data streams | 3 | 4 | 25 | 28.0%
Human-machine interaction and visual data mining | - | 1 | 16 | 6.3%
Security, privacy and social impact of data mining | 2 | 1 | 15 | 20.0%
Data and knowledge representation for data mining | 1 | 1 | 12 | 16.7%
Pattern recognition and trend analysis | - | 1 | 11 | 9.1%
Complexity, efficiency, and scalability issues in data mining | 2 | 2 | 11 | 36.4%
Quality assessment and interestingness metrics of data mining results | 2 | 3 | 10 | 50.0%
Statistics and probability in large-scale data mining | 1 | - | 9 | 11.1%
Integration of data warehousing, OLAP and data mining | - | 1 | 9 | 11.1%
Collaborative filtering/personalization | - | 2 | 7 | 28.6%
Post-processing of data mining results | 1 | 1 | 7 | 28.6%
Others | 2 | - | 6 | 33.3%
High performance and parallel/distributed data mining | 1 | - | 2 | 50.0%
Query languages and user interfaces for mining | - | - | 1 | 0.0%
Total | 39 | 66 | 445 | 23.8%
6
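Correspondence Analysis (Country vs Final Decision)
[Figure: correspondence analysis plot of countries against final decisions. Decision points: Reject, Regular, Short. Country points: Slovenia, Japan, Hong Kong, USA, Germany, Italy, India, Finland, UK, France, Canada, Australia. Plot annotations: r1=0.378, r2=0.177.]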
7
Correspondence Analysis (Topics vs Final Decision)
[Figure: correspondence analysis plot of topics against final decisions. Decision points: Reject, Short, Regular. Topic points: Statistics and probability; Security, privacy; Applications; Post-processing; Human-machine interaction and visualization; Preprocessing, Feature Selection; High-performance; Quality-assessment; Collaborative Filtering; Soft-computing; DM Methods. Plot annotations: r1=0.280, r2=0.184.]
8
Correspondence Analysis
• Country vs Final Decision
  – Regular: Germany, USA
  – Short: ?
  – Reject: Most of the countries are located near this region.
• Topics vs Final Decision
  – Regular: Quality Assessment, Preprocessing/Feature Selection
  – Short: DM/ML Methods, Collaborative Filtering
  – Reject: DM Applications
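The slides do not say how the analysis was configured or what r1 and r2 denote. As background only, correspondence analysis decomposes the standardized residuals of a contingency table, and the sum of their squares (the total inertia) equals the chi-square statistic divided by the number of observations. The sketch below computes those residuals for a five-country subset of the country table, taking Reject = Total - Regular - Short. That derived column, the choice of subset and the function names are all assumptions made for illustration, and the output is not expected to reproduce the values on the plots.

```python
import math

# Observed counts for a subset of the country table.
# Columns: Reject, Regular, Short (Reject derived as Total - Regular - Short).
COUNTRIES = ["USA", "China", "UK", "Japan", "Germany"]
DECISIONS = ["Reject", "Regular", "Short"]
COUNTS = [
    [109 - 20 - 28, 20, 28],
    [55 - 3 - 4, 3, 4],
    [39 - 1 - 6, 1, 6],
    [28 - 0 - 5, 0, 5],
    [15 - 4 - 5, 4, 5],
]


def standardized_residuals(counts):
    """Return (p_ij - r_i * c_j) / sqrt(r_i * c_j) for every cell.

    p_ij is the cell proportion, r_i the row mass and c_j the column mass.
    Correspondence analysis applies a singular value decomposition to this
    matrix; that step is omitted here.
    """
    n = sum(sum(row) for row in counts)
    n_cols = len(counts[0])
    row_mass = [sum(row) / n for row in counts]
    col_mass = [sum(row[j] for row in counts) / n for j in range(n_cols)]
    return [
        [
            (counts[i][j] / n - row_mass[i] * col_mass[j])
            / math.sqrt(row_mass[i] * col_mass[j])
            for j in range(n_cols)
        ]
        for i in range(len(counts))
    ]


def total_inertia(counts):
    """Sum of squared standardized residuals (chi-square divided by n)."""
    return sum(v * v for row in standardized_residuals(counts) for v in row)


if __name__ == "__main__":
    residuals = standardized_residuals(COUNTS)
    for country, row in zip(COUNTRIES, residuals):
        cells = "  ".join(f"{d}={v:+.3f}" for d, v in zip(DECISIONS, row))
        print(f"{country:8s} {cells}")
    print(f"total inertia = {total_inertia(COUNTS):.4f}")
```

A positive residual means the country has more papers in that decision category than independence would predict. In this subset Germany's Regular residual is positive, which is consistent with Germany being listed under Regular above.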
9
Rule Mining on ICDM Submission Data
• Datasets
  – Sample Size: 445
  – Attributes: 5
    Paper No. (ordered by submission date)
    # of Authors
    # of Characters in Title
    Country
    Category
  – Analyzed by Clementine 7.1 (and SPSS 12.0J)
10
Rule Mining (C5.0) on ICDM Submission Data
• C5.0
  – [Topic=Mining semi-structured data,…] & [129 Reject (Confidence 0.87, Support 10)
  – [Country=USA] & [Topic=Mining semi-structured data,…] & [Paper No.>369] & [# of Authors Accept (Confidence 0.667, Support 3)
  – [Topic=Preprocessing/Feature Selection] & [# of Authors>4] => Accept (Confidence: 1.0, Support 3)
  – Topic, Paper No., # of Authors: Important Features
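The rules above are reported with a confidence and a support. As a minimal sketch of how such figures can be computed, the code below evaluates a rule of the same shape as the third one on a small set of made-up records. It assumes that support is the number of records matching the rule's conditions and that confidence is the fraction of those records with the predicted decision; the slides do not define either term, and the records and function names are hypothetical.

```python
# Hypothetical toy records (NOT the ICDM data): (topic, number of authors, decision).
RECORDS = [
    ("Preprocessing/Feature Selection", 5, "Accept"),
    ("Preprocessing/Feature Selection", 6, "Accept"),
    ("Preprocessing/Feature Selection", 2, "Reject"),
    ("DM Applications", 5, "Reject"),
    ("DM Applications", 1, "Reject"),
]


def evaluate_rule(records, antecedent, conclusion):
    """Return (support, confidence) for the rule `antecedent => conclusion`.

    Assumed definitions: support is the count of records satisfying the
    antecedent; confidence is the share of those records whose decision
    equals the conclusion.
    """
    matched = [rec for rec in records if antecedent(rec)]
    if not matched:
        return 0, 0.0
    hits = sum(1 for rec in matched if rec[2] == conclusion)
    return len(matched), hits / len(matched)


def many_author_preprocessing(rec):
    """Antecedent: [Topic=Preprocessing/Feature Selection] & [# of Authors>4]."""
    topic, n_authors, _decision = rec
    return topic == "Preprocessing/Feature Selection" and n_authors > 4


def any_preprocessing(rec):
    """Antecedent: [Topic=Preprocessing/Feature Selection] only."""
    return rec[0] == "Preprocessing/Feature Selection"


if __name__ == "__main__":
    print(evaluate_rule(RECORDS, many_author_preprocessing, "Accept"))
    print(evaluate_rule(RECORDS, any_preprocessing, "Accept"))
```

On the toy records, adding the author-count condition lowers the support from 3 to 2 and raises the confidence from about 0.67 to 1.0. This is the usual trade-off when a rule is made more specific.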
11
Rule Mining (GRI) on ICDM Submission Data
• Generalized Rule Induction
  – [# of Authors Rejected (Confidence 96.0%, Support 24)
  – [# of Chars in Title 212] => Accepted (Confidence 100%, Support 5)
• Paper No., # of Chars in Title, # of Authors: Important Features
12
Multidimensional Scaling (2004)
[Figure: multidimensional scaling plot. Plotted variables: Decision, # of Authors, Review Score, # of Chars in Title, Topics, Paper No., Country.]
13
Summary (2004) of Mining on ICDM Submission Data
• Do not submit a paper too fast!
  – Reflect not only on the content but also on the title.
• Mining text, Web and semi-structured data is very popular.
• The number of application papers is growing (but many are rejected).
• Strong Topics
  – Preprocessing/Feature-Selection
  – Postprocessing
  – Security and Privacy
• Several topics are emerging in ICDM2004:
  – Mining Data Streams
  – Collaborative Filtering
  – Quality Assessment
14
Comparison between 02-04: Review Scores
[Figure: box plot of review scores.]
15
Comparison between 02-04: Countries

Country (2002) | Acceptance Ratio (2002) | Country (2003) | Acceptance Ratio (2003) | Country (2004) | Acceptance Ratio (2004)
Hong Kong | 64.7% | Israel | 55.0% | Germany | 60.0%
USA | 47.9% | Hong Kong | 50.0% | USA | 44.0%
Canada | 45.5% | Japan | 37.0% | Finland | 33.0%
Finland | 33.3% | USA | 33.0% | Hong Kong | 33.0%
France | 33.3% | Germany | 32.0% | Italy | 30.0%
16
Comparison between 02 and 04: Topics

Top 5 in 2002 | Acceptance Ratio | Top 5 in 2003 | Acceptance Ratio | Top 5 in 2004 | Acceptance Ratio
Graph Mining | 75.0% | Process-centric DM | 80.0% | Quality Assessment | 50.0%
Temporal Data | 52.6% | Security, privacy | 57.0% | Preprocessing, Feature Selection | 40.0%
Theory | 42.9% | Statistics and Probability | 47.0% | Complexity/Scalability | 36.4%
Text Mining | 42.1% | Visual Data Mining | 38.0% | DM and ML Methods | 35.8%
Rule | 41.7% | Post-processing | 41.7% | Collaborative Filtering | 28.6%
- | - | - | - | Post-processing | 28.6%
17
Multidimensional Scaling (2003 and 2004)
[Figure: two multidimensional scaling plots, one for 2003 and one for 2004. Plotted variables in each: Decision, # of Authors, Review Score, # of Chars in Title, Topics, Paper No., Country.]
The topological structure with respect to similarities does not appear to have changed between 2003 and 2004.
18
Data Mining on ICDM Submission Data
• Acknowledgements
  – Many thanks to
    PC chairs, Vice Chairs and PC members
    All the authors
    All the contributors to ICDM2004
  – See you again at ICDM2005!
19
Multidimensional Scaling (2004)
[Figure: multidimensional scaling plot. Plotted variables: Decision, # of Authors, Review Score, # of Chars in Title, Topics, Paper No., Country.]
20
Multidimensional Scaling (2003)
[Figure: multidimensional scaling plot. Plotted variables: Decision, # of Authors, Review Score, # of Chars in Title, Topics, Paper No., Country.]