Data Mining on ICDM Submission Data

Presentation Transcript

  1. Data Mining on ICDM Submission Data
     Shusaku Tsumoto, Ning Zhong, and Xindong Wu
     ICDM 2004 Business Meeting, 11/4/2004

  2. Data Mining on ICDM Submission Data
     • 38 countries, 445 submissions
     • Regular papers: 39 (9%)
     • Short papers: 66 (14.8%)
     • High acceptance ratio (regular):
       • Germany: 4/15 (26.7%)
       • Finland: 2/9 (22.2%)
       • USA: 20/109 (18.3%)
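
The percentages on this slide are plain accepted/submitted ratios. A minimal sketch that recomputes them from the counts shown above; the function and variable names are illustrative assumptions, not part of the original analysis:

```python
def acceptance_ratio(accepted: int, submitted: int) -> float:
    """Return the acceptance ratio as a percentage rounded to one decimal."""
    if submitted <= 0:
        raise ValueError("submitted must be positive")
    return round(100.0 * accepted / submitted, 1)

# (accepted regular papers, submissions) per country, as listed on the slide
regular_counts = {"Germany": (4, 15), "Finland": (2, 9), "USA": (20, 109)}
country_ratios = {c: acceptance_ratio(a, s) for c, (a, s) in regular_counts.items()}

# Overall ratios out of 445 submissions
overall_regular = acceptance_ratio(39, 445)
overall_short = acceptance_ratio(66, 445)

for country, pct in country_ratios.items():
    print(f"{country}: {pct}%")
print(f"Regular: {overall_regular}%, Short: {overall_short}%")
```

To one decimal place the overall regular ratio comes out at 8.8%, which the slide rounds to 9%.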

  3. Country

  4. Data Mining on ICDM Submission Data
     • Top 5 areas of submissions:
       • Data mining applications
       • Data mining and machine learning algorithms and methods
       • Mining text and semi-structured data, and mining temporal, spatial and multimedia data
       • Data pre-processing, data reduction, feature selection and feature transformation
       • Soft computing and uncertainty management for data mining
     • High acceptance ratio areas (regular + short):
       • Quality assessment and interestingness metrics of data mining results: 5/10 (50.0%)
       • Data pre-processing, data reduction, feature selection and feature transformation: 14/35 (40.0%)
       • Complexity, efficiency, and scalability issues in data mining: 4/11 (36.4%)

  5. Topics

  6. Correspondence Analysis (Country vs Final Decision)
     [Figure: two-dimensional plot with axes labeled r1=0.378 and r2=0.177. Decision points: Regular, Short, Reject. Country points: Slovenia, Finland, Italy, Australia, India, Hong Kong, Canada, Germany, USA, UK, France, Japan.]

  7. Correspondence Analysis (Topics vs Final Decision)
     [Figure: two-dimensional plot with axes labeled r1=0.280 and r2=0.184. Decision points: Regular, Short, Reject. Topic points: Collaborative Filtering, Applications, DM Methods, Quality-assessment, Soft-computing, Preprocessing/Feature Selection, Security and privacy, Statistics and probability, High-performance, Human-machine interaction and visualization, Post-processing.]

  8. Correspondence Analysis
     • Country vs Final Decision
       • Regular: Germany, USA
       • Short: ?
       • Reject: most of the countries are located near this region.
     • Topics vs Final Decision
       • Regular: Quality Assessment, Preprocessing/Feature Selection
       • Short: DM/ML Methods, Collaborative Filtering
       • Reject: DM Applications
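
For readers unfamiliar with the technique, here is a minimal sketch of correspondence analysis on a country-by-decision contingency table, computed with an SVD of the standardized residuals. The counts and row labels are hypothetical (the slides do not give the full table), and the slides do not state what r1 and r2 denote; reporting the singular value of each axis is one common convention, and that is what `sv` holds here.

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis of a two-way contingency table.

    Returns the singular values and the principal coordinates of the
    rows and columns.
    """
    n = np.asarray(table, dtype=float)
    p = n / n.sum()                      # correspondence matrix
    r = p.sum(axis=1)                    # row masses
    c = p.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # independence model
    s = (p - expected) / np.sqrt(expected)   # standardized residuals
    u, sv, vt = np.linalg.svd(s, full_matrices=False)
    row_coords = (u * sv) / np.sqrt(r)[:, None]
    col_coords = (vt.T * sv) / np.sqrt(c)[:, None]
    return sv, row_coords, col_coords

# Hypothetical counts: rows are countries, columns are Regular, Short, Reject
row_labels = ["Country A", "Country B", "Country C", "Country D"]
col_labels = ["Regular", "Short", "Reject"]
table = np.array([[5, 3, 12],
                  [2, 6, 22],
                  [10, 8, 30],
                  [1, 2, 17]])

sv, row_coords, col_coords = correspondence_analysis(table)
print("singular values of the first two axes:", sv[:2])
for label, xy in zip(row_labels, row_coords[:, :2]):
    print(label, xy)
for label, xy in zip(col_labels, col_coords[:, :2]):
    print(label, xy)
```

With three decision categories there are at most two non-trivial axes, which matches the two-dimensional plots on the preceding slides. Rows that lie close to a decision point in the plot are those whose decision profile is tilted toward that outcome.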

  9. Rule Mining on ICDM Submission Data
     • Dataset
       • Sample size: 445
       • Attributes: 5
         • Paper No. (ordered by submission date)
         • # of Authors
         • # of Characters in Title
         • Country
         • Category
     • Analyzed with Clementine 7.1 (and SPSS 12.0J)

  10. Rule Mining (C5.0) on ICDM Submission Data
     • C5.0
       • [Topic=Mining semi-structured data,…] & [129 < Paper No. <= 369] => Reject (Confidence 0.87, Support 10)
       • [Country=USA] & [Topic=Mining semi-structured data,…] & [Paper No. > 369] & [# of Authors <= 3] => Accept (Confidence 0.667, Support 3)
       • [Topic=Preprocessing/Feature Selection] & [# of Authors > 4] => Accept (Confidence 1.0, Support 3)
     • Important features: Topic, Paper No., # of Authors
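
To make the confidence and support figures concrete, here is a small sketch that evaluates an if-then rule against a list of records. It assumes that support is the number of records matching the rule's condition and that confidence is the fraction of those records with the predicted decision; the slides do not define the terms, and Clementine may compute them differently. The records and field names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]

def rule_stats(records: List[Record],
               antecedent: Callable[[Record], bool],
               consequent: Callable[[Record], bool]) -> Tuple[int, float]:
    """Return (support, confidence) of the rule antecedent => consequent.

    support: number of records that satisfy the antecedent (assumed definition).
    confidence: share of those records that also satisfy the consequent.
    """
    matched = [r for r in records if antecedent(r)]
    support = len(matched)
    if support == 0:
        return 0, 0.0
    hits = sum(1 for r in matched if consequent(r))
    return support, hits / support

# Hypothetical toy records, not the ICDM data
records: List[Record] = [
    {"topic": "Preprocessing/Feature Selection", "authors": 5, "decision": "Accept"},
    {"topic": "Preprocessing/Feature Selection", "authors": 6, "decision": "Accept"},
    {"topic": "Preprocessing/Feature Selection", "authors": 2, "decision": "Reject"},
    {"topic": "DM Applications", "authors": 5, "decision": "Reject"},
    {"topic": "Preprocessing/Feature Selection", "authors": 7, "decision": "Reject"},
]

# Rule shaped like the third one on the slide:
# [Topic=Preprocessing/Feature Selection] & [# of Authors > 4] => Accept
support, confidence = rule_stats(
    records,
    antecedent=lambda r: r["topic"] == "Preprocessing/Feature Selection" and r["authors"] > 4,
    consequent=lambda r: r["decision"] == "Accept",
)
print(f"support={support}, confidence={confidence:.3f}")
```

A support of 3 means the rule covers only three papers, so even a confidence of 1.0 on such a rule is weak evidence.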

  11. Rule Mining (GRI) on ICDM Submission Data
     • Generalized Rule Induction
       • [# of Authors < 2] & [Paper No. < 120.5] => Rejected (Confidence 96.0%, Support 24)
       • [# of Chars in Title < 27] & [Paper No. > 212] => Accepted (Confidence 100%, Support 5)
     • Important features: Paper No., # of Chars in Title, # of Authors

  12. Multidimensional Scaling (2004)
     [Figure: MDS plot of the attributes Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.]
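
The slides do not say which MDS variant or dissimilarity measure was used. As an illustration only, the sketch below implements classical (Torgerson) MDS, which embeds items in a low-dimensional space so that their pairwise distances approximate a given distance matrix. The input points are hypothetical.

```python
import numpy as np

def classical_mds(dist, k=2):
    """Classical (Torgerson) MDS: embed items in k dimensions from a distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering   # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(b)                # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]            # keep the k largest
    top_vals = np.clip(vals[order], 0.0, None)    # guard against tiny negatives
    return vecs[:, order] * np.sqrt(top_vals)

# Hypothetical items with known 2-D positions, used only to build a distance matrix
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(dist, k=2)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.round(recovered, 3))
```

The recovered coordinates are determined only up to rotation and reflection, so two MDS plots should be compared by the relative arrangement of points rather than by their absolute positions.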

  13. Summary (2004) of Mining on ICDM Submission Data
     • Do not submit a paper too early!
     • Reflect not only on the contents but also on the title.
     • Mining text/Web/semi-structured data is very popular.
     • The number of application papers is growing (but many are rejected).
     • Strong topics:
       • Preprocessing/Feature Selection
       • Postprocessing
       • Security and Privacy
     • Several topics are emerging in ICDM 2004:
       • Mining Data Streams
       • Collaborative Filtering
       • Quality Assessment

  14. Comparison between 02-04: Review Scores (box plot)

  15. Comparison between 02-04: Countries

  16. Comparison between 02 and 04: Topics

  17. Multidimensional Scaling (2003 and 2004)
     • The topological structure with respect to similarities does not seem to have changed between 2003 and 2004.
     [Figure: two MDS plots, one for 2003 and one for 2004, each showing the attributes Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.]

  18. Data Mining on ICDM Submission Data
     • Acknowledgements
     • Many thanks to:
       • PC chairs, vice chairs and PC members
       • All the authors
       • All the contributors to ICDM 2004
     • See you again at ICDM 2005!

  19. Multidimensional Scaling (2004)
     [Figure: MDS plot of the attributes Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.]

  20. Multidimensional Scaling (2003)
     [Figure: MDS plot of the attributes Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.]