Data Mining on ICDM Submission Data

Shusaku Tsumoto, Ning Zhong, and Xindong Wu. 38 countries, 445 submissions. Regular papers: 39 (9%). Short papers: 66 (14.8%). High acceptance ratio (regular papers): Germany 4/15 (26.7%).


Presentation Transcript


  1. Data Mining on ICDM Submission Data. Shusaku Tsumoto, Ning Zhong, and Xindong Wu. ICDM 2004 Business Meeting, 11/4/2004

  2. Data Mining on ICDM Submission Data
     • 38 countries, 445 submissions
     • Regular papers: 39 (9%)
     • Short papers: 66 (14.8%)
     • High acceptance ratio (regular papers):
       • Germany: 4/15 (26.7%)
       • Finland: 2/9 (22.2%)
       • USA: 20/109 (18.3%)
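The percentages on this slide can be re-derived from the raw counts. The following is a minimal sketch; the helper name `acceptance_ratio` is hypothetical and does not come from the presentation.

```python
def acceptance_ratio(accepted, submitted):
    """Return the acceptance ratio as a percentage, rounded to one decimal."""
    return round(100.0 * accepted / submitted, 1)

# Counts taken from the slide, as (accepted, submitted).
overall = {"Regular": (39, 445), "Short": (66, 445)}
by_country = {"Germany": (4, 15), "Finland": (2, 9), "USA": (20, 109)}

for label, (acc, sub) in {**overall, **by_country}.items():
    print(f"{label}: {acc}/{sub} = {acceptance_ratio(acc, sub)}%")
```

The regular-paper ratio comes out at 8.8%, which the slide shows rounded to 9%.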

  3. Country

  4. Data Mining on ICDM Submission Data
     • Top 5 areas of submissions:
       • Data mining applications
       • Data mining and machine learning algorithms and methods
       • Mining text and semi-structured data, and mining temporal, spatial and multimedia data
       • Data pre-processing, data reduction, feature selection and feature transformation
       • Soft computing and uncertainty management for data mining
     • High acceptance ratio areas (regular + short):
       • Quality assessment and interestingness metrics of data mining results: 5/10 (50.0%)
       • Data pre-processing, data reduction, feature selection and feature transformation: 14/35 (40.0%)
       • Complexity, efficiency, and scalability issues in data mining: 4/11 (36.4%)

  5. Topics

  6. Correspondence Analysis (Country vs. Final Decision). Plot axes: r1=0.378, r2=0.177. Decision points: Regular, Short, Reject. Country points: Slovenia, Finland, Italy, Australia, India, Hong Kong, Canada, Germany, USA, UK, France, Japan.

  7. Correspondence Analysis (Topics vs. Final Decision). Plot axes: r1=0.280, r2=0.184. Decision points: Regular, Short, Reject. Topic points: Collaborative Filtering, Applications, DM Methods, Quality-assessment, Soft-computing, Preprocessing/Feature Selection, Security and privacy, Statistics and probability, High-performance, Human-machine interaction and visualization, Post-processing.

  8. Correspondence Analysis
     • Country vs. Final Decision
       • Regular: Germany, USA
       • Short: ?
       • Reject: Most of the countries are located near this region.
     • Topics vs. Final Decision
       • Regular: Quality Assessment, Preprocessing/Feature Selection
       • Short: DM/ML Methods, Collaborative Filtering
       • Reject: DM Applications
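Correspondence analysis decomposes the chi-square association in a contingency table (here, country or topic against final decision) into a small number of axes. The following is a minimal sketch, assuming the r1 and r2 values on the plots are singular values (canonical correlations) of that decomposition, which the slides do not state. Under that assumption the squared singular values sum to the total inertia, which is chi-square divided by the sample size. The table is made-up illustrative data, not the ICDM counts, and `total_inertia` is a hypothetical helper.

```python
def total_inertia(table):
    """Total inertia (chi-square / n) of a contingency table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2 / n

# Made-up counts (NOT the ICDM data); columns: regular, short, reject.
toy_table = [
    [4, 3, 8],    # hypothetical country A
    [2, 5, 20],   # hypothetical country B
    [1, 2, 15],   # hypothetical country C
]
print(round(total_inertia(toy_table), 3))

# With three decision categories there are at most two axes, so under the
# assumption above the reported values imply this total inertia (slide 6).
implied_inertia = 0.378 ** 2 + 0.177 ** 2
print(round(implied_inertia, 3))
```

A table whose rows are proportional to each other has zero inertia, which is why countries or topics with an average decision profile sit near the origin of such a plot.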

  9. Rule Mining on ICDM Submission Data
     • Dataset
       • Sample size: 445
       • Attributes: 5
         • Paper No. (ordered by submission date)
         • # of Authors
         • # of Characters in Title
         • Country
         • Category
     • Analyzed with Clementine 7.1 (and SPSS 12.0J)

  10. Rule Mining (C5.0) on ICDM Submission Data
     • C5.0
       • [Topic=Mining semi-structured data,…] & [129 < Paper No. <= 369] => Reject (Confidence 0.87, Support 10)
       • [Country=USA] & [Topic=Mining semi-structured data,…] & [Paper No. > 369] & [# of Authors <= 3] => Accept (Confidence 0.667, Support 3)
       • [Topic=Preprocessing/Feature Selection] & [# of Authors > 4] => Accept (Confidence 1.0, Support 3)
     • Important features: Topic, Paper No., # of Authors

  11. Rule Mining (GRI) on ICDM Submission Data
     • Generalized Rule Induction
       • [# of Authors < 2] & [Paper No. < 120.5] => Rejected (Confidence 96.0%, Support 24)
       • [# of Chars in Title < 27] & [Paper No. > 212] => Accepted (Confidence 100%, Support 5)
     • Important features: Paper No., # of Chars in Title, # of Authors
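The rules on slides 10 and 11 pair an antecedent (conditions on the attributes) with a predicted decision, and report a support and a confidence. The following is a minimal sketch of scoring one such rule, assuming support means the number of submissions matching the antecedent and confidence means the fraction of those with the predicted decision; Clementine's exact definitions may differ. The records are invented toy data, and `evaluate_rule` is a hypothetical helper, not part of C5.0 or GRI.

```python
def evaluate_rule(records, antecedent, decision):
    """Return (support, confidence) for the rule `antecedent => decision`."""
    matched = [r for r in records if antecedent(r)]
    support = len(matched)
    if support == 0:
        return 0, 0.0
    hits = sum(1 for r in matched if r["decision"] == decision)
    return support, hits / support

# Invented toy submissions (NOT the ICDM data).
records = [
    {"paper_no": 10, "authors": 1, "decision": "Reject"},
    {"paper_no": 50, "authors": 1, "decision": "Reject"},
    {"paper_no": 100, "authors": 1, "decision": "Accept"},
    {"paper_no": 110, "authors": 3, "decision": "Reject"},
    {"paper_no": 300, "authors": 1, "decision": "Accept"},
]

# Same shape as the GRI rule: [# of Authors < 2] & [Paper No. < 120.5] => Reject
def early_single_author(record):
    return record["authors"] < 2 and record["paper_no"] < 120.5

support, confidence = evaluate_rule(records, early_single_author, "Reject")
print(support, round(confidence, 3))
```

Reporting support as a raw count, as the slides do, makes it easy to see that some high-confidence rules rest on only a handful of submissions.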

  12. Multidimensional Scaling (2004). Plotted variables: Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.

  13. Summary (2004) of Mining on ICDM Submission Data
     • Do not submit a paper too early!
     • Reflect not only on the contents but also on the title.
     • Mining text/Web/semi-structured data is very popular.
     • The number of application papers is growing (but many are rejected).
     • Strong topics:
       • Preprocessing/Feature Selection
       • Postprocessing
       • Security and Privacy
     • Several topics are emerging in ICDM 2004:
       • Mining Data Streams
       • Collaborative Filtering
       • Quality Assessment

  14. Comparison between 02-04: Review Scores (box plot)

  15. Comparison between 02-04: Countries

  16. Comparison between 02 and 04: Topics

  17. Multidimensional Scaling (2003 and 2004). The topological structure with respect to similarities does not seem to have changed between 2003 and 2004. Variables plotted for each year: Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.

  18. Data Mining on ICDM Submission Data
     • Acknowledgements
       • Many thanks to:
         • PC chairs, vice chairs, and PC members
         • All the authors
         • All the contributors to ICDM 2004
     • See you again at ICDM 2005!

  19. Multidimensional Scaling (2004). Plotted variables: Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.

  20. Multidimensional Scaling (2003). Plotted variables: Country, Decision, Paper No., Review Score, Topics, # of Authors, # of Chars in Title.
