Data mining on icdm submission data




Data Mining on ICDM Submission Data

Shusaku Tsumoto

Ning Zhong and Xindong Wu

ICDM 2004 Business Meeting 11/4/2004



Data Mining on ICDM Submission Data

  • 38 countries, 445 Submissions

  • Regular Papers: 39 (9%)

  • Short Papers: 66 (14.8%)

  • High Acceptance Ratio (Regular)

    • Germany: 4/15 (26.7%)

    • Finland: 2/9 (22.2%)

    • USA: 20/109 (18.3%)
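The percentages on this slide can be rechecked from the raw counts. The snippet below is a minimal sketch that uses only the numbers quoted above; the variable and function names are illustrative and not part of the original analysis.

```python
# Counts copied from the slide: (accepted regular papers, submissions).
regular_by_country = {
    "Germany": (4, 15),
    "Finland": (2, 9),
    "USA": (20, 109),
}

TOTAL_SUBMISSIONS = 445
REGULAR_ACCEPTED = 39
SHORT_ACCEPTED = 66


def acceptance_ratio(accepted, submitted):
    """Return the acceptance ratio as a percentage."""
    return 100.0 * accepted / submitted


# Per-country regular-paper acceptance ratios, rounded as on the slide.
ratios = {
    country: round(acceptance_ratio(accepted, submitted), 1)
    for country, (accepted, submitted) in regular_by_country.items()
}

overall_regular = round(acceptance_ratio(REGULAR_ACCEPTED, TOTAL_SUBMISSIONS), 1)
overall_short = round(acceptance_ratio(SHORT_ACCEPTED, TOTAL_SUBMISSIONS), 1)

for country, pct in ratios.items():
    print(f"{country}: {pct}%")
print(f"Regular overall: {overall_regular}%, short overall: {overall_short}%")
```

The overall regular-paper ratio computes to 8.8%, which the slide rounds to 9%.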



[Figure: Country]


Data Mining on ICDM Submission Data

  • Top 5 Areas of Submissions:

    • Data mining applications

    • Data mining and machine learning algorithms and methods

    • Mining text and semi-structured data, and mining temporal, spatial and multimedia data

    • Data pre-processing, data reduction, feature selection and feature transformation

    • Soft computing and uncertainty management for data mining

  • High Acceptance Ratio Areas (Regular+Short)

    • Quality assessment and interestingness metrics of data mining results: 5/10 (50.0%)

    • Data pre-processing, data reduction, feature selection and feature transformation: 14/35 (40.0%)

    • Complexity, efficiency, and scalability issues in data mining: 4/11 (36.4%)



[Figure: Topics]



Correspondence Analysis (Country vs Final Decision)

[Figure: correspondence analysis map. Axis annotations: r1 = 0.378, r2 = 0.177. Decision points: Regular, Short, Reject. Country points: Slovenia, Finland, Italy, Australia, India, Hong Kong, Canada, Germany, USA, UK, France, Japan.]



Correspondence Analysis (Topics vs Final Decision)

[Figure: correspondence analysis map. Axis annotations: r1 = 0.280, r2 = 0.184. Decision points: Regular, Short, Reject. Topic points: Collaborative Filtering, Applications, DM Methods, Quality-assessment, Soft-computing, Preprocessing/Feature Selection, Security/privacy, Statistics and probability, High-performance, Human-machine interaction and visualization, Post-processing.]


Correspondence Analysis

  • Country vs Final Decision

    • Regular: Germany, USA

    • Short: ?

    • Reject: Most of the countries are located near this region.

  • Topics vs Final Decision

    • Regular: Quality Assessment, Preprocessing/Feature Selection

    • Short: DM/ML Methods, Collaborative Filtering

    • Reject: DM Applications
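The maps summarised above start from a contingency table of counts (country or topic in the rows, final decision in the columns). The slides do not reproduce that table, so the counts below are invented purely for illustration. The sketch computes the total inertia (chi-square divided by the sample size) and, because there are only three decision categories, the two non-trivial singular values in closed form. Whether the r1 and r2 values printed on the plots are exactly these singular values is an assumption, not something the slides state.

```python
import math


def principal_correlations(table):
    """Correspondence analysis of an (n_rows x 3) table of counts.

    Returns (r1, r2, total_inertia), where r1 >= r2 are the two non-trivial
    singular values of the standardized residual matrix.
    """
    n = sum(sum(row) for row in table)
    row_mass = [sum(row) / n for row in table]
    col_mass = [sum(row[j] for row in table) / n for j in range(3)]

    # Standardized residuals: (p_ij - r_i * c_j) / sqrt(r_i * c_j).
    resid = [
        [
            (table[i][j] / n - row_mass[i] * col_mass[j])
            / math.sqrt(row_mass[i] * col_mass[j])
            for j in range(3)
        ]
        for i in range(len(table))
    ]

    # 3x3 cross-product matrix M = S^T S. One eigenvalue is always zero,
    # so the other two are the roots of x^2 - trace * x + minors = 0.
    m = [
        [sum(row[a] * row[b] for row in resid) for b in range(3)]
        for a in range(3)
    ]
    trace = m[0][0] + m[1][1] + m[2][2]
    minors = (
        (m[0][0] * m[1][1] - m[0][1] ** 2)
        + (m[0][0] * m[2][2] - m[0][2] ** 2)
        + (m[1][1] * m[2][2] - m[1][2] ** 2)
    )
    root = math.sqrt(max(trace * trace - 4.0 * minors, 0.0))
    lam1 = (trace + root) / 2.0
    lam2 = max((trace - root) / 2.0, 0.0)
    return math.sqrt(lam1), math.sqrt(lam2), trace


# Hypothetical counts per row: [Regular, Short, Reject]. Not the ICDM data.
toy_table = [
    [4, 2, 9],
    [20, 15, 74],
    [1, 6, 30],
    [2, 1, 6],
]

r1, r2, inertia = principal_correlations(toy_table)
print(f"r1={r1:.3f}, r2={r2:.3f}, total inertia={inertia:.3f}")
```

The squared singular values sum to the total inertia, so each axis can be read as explaining a share of the departure from independence between rows and decisions.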


Rule Mining on ICDM Submission Data

  • Datasets

    • Sample Size: 445

    • Attributes: 5

      • Paper No. : ordered by submission date

      • # of Authors

      • # of Characters in Title

      • Country

      • Category

    • Analyzed with Clementine 7.1 (and SPSS 12.0J)


Rule Mining (C5.0) on ICDM Submission Data

  • C5.0

    • [Topic = Mining semi-structured data, …] & [129 < Paper No. <= 369] => Reject (Confidence 0.87, Support 10)

    • [Country = USA] & [Topic = Mining semi-structured data, …] & [Paper No. > 369] & [# of Authors <= 3] => Accept (Confidence 0.667, Support 3)

    • [Topic = Preprocessing/Feature Selection] & [# of Authors > 4] => Accept (Confidence 1.0, Support 3)

    • Topic, Paper No., # of Authors: Important Features
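Rules such as [Paper No. > 369] come from threshold splits on a numeric attribute. The slides do not show how Clementine's C5.0 node chose its thresholds, so the following is only a sketch of the entropy-based information gain criterion commonly associated with the C4.5/C5.0 family, run on invented toy data. All names and values here are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on value <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    return (
        entropy(labels)
        - (len(left) / n) * entropy(left)
        - (len(right) / n) * entropy(right)
    )


def best_threshold(values, labels):
    """Try midpoints between consecutive distinct values; keep the best one."""
    points = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    if not candidates:
        return None
    return max(candidates, key=lambda t: information_gain(values, labels, t))


# Hypothetical toy data: paper numbers and decisions (not the ICDM data).
paper_no = [10, 50, 120, 200, 370, 400]
decision = ["Reject", "Reject", "Reject", "Accept", "Accept", "Accept"]

split = best_threshold(paper_no, decision)
gain = information_gain(paper_no, decision, split)
print(f"best split: Paper No. <= {split} (gain {gain:.2f} bits)")
```

On real submission data no split would separate the classes this cleanly; the toy values are chosen only so the mechanics are easy to follow.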


Rule Mining (GRI) on ICDM Submission Data

  • Generalized Rule Induction

    • [# of Authors < 2] & [Paper No. < 120.5] => Rejected (Confidence 96.0%, Support 24)

    • [# of Chars in Title < 27] & [Paper No. > 212] => Accepted (Confidence 100%, Support 5)

  • Paper No., # of Chars in Title, # of Authors: Important Features
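To make the confidence and support figures concrete, the sketch below evaluates a rule of the same shape against a handful of invented records. It assumes that support is the number of records matching the rule's conditions and that confidence is the fraction of those records with the predicted decision; the tools used in the slides may apply adjusted definitions. The `Submission` class, `evaluate_rule` function, and all record values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    paper_no: int      # ordered by submission date
    n_authors: int
    title_chars: int
    decision: str      # "Accepted" or "Rejected"


def evaluate_rule(records, antecedent, consequent):
    """Return (support, confidence) of `antecedent => consequent`.

    support    = number of records satisfying the antecedent
    confidence = share of those records whose decision equals `consequent`
    """
    matched = [r for r in records if antecedent(r)]
    if not matched:
        return 0, 0.0
    hits = sum(1 for r in matched if r.decision == consequent)
    return len(matched), hits / len(matched)


# Invented records for illustration only (not the ICDM data).
records = [
    Submission(10, 1, 40, "Rejected"),
    Submission(50, 1, 35, "Rejected"),
    Submission(100, 1, 60, "Accepted"),
    Submission(119, 1, 22, "Rejected"),
    Submission(300, 1, 25, "Accepted"),
    Submission(80, 3, 48, "Accepted"),
]

# Same shape as the first GRI rule: single author and an early paper number.
support, confidence = evaluate_rule(
    records,
    antecedent=lambda r: r.n_authors < 2 and r.paper_no < 120.5,
    consequent="Rejected",
)
print(f"support={support}, confidence={confidence:.2f}")
```

Four of the toy records match the conditions and three of them are rejected, which is how a pair such as "Confidence 96.0%, Support 24" should be read under these assumed definitions.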


Multidimensional Scaling (2004)

[Figure: multidimensional scaling map of the attributes Country, Decision, Paper No., Review Score, Topics, # of Authors, and # of Chars in Title.]



Summary (2004) of Mining on ICDM Submission Data

  • Do not submit a paper too hastily!

    • Reflect not only on the content but also on the title.

  • Mining text, Web, and semi-structured data is very popular.

  • The number of application papers is growing (but many are rejected).

  • Strong Topics

    • Preprocessing/Feature-Selection

    • Postprocessing

    • Security and Privacy

  • Several topics are emerging in ICDM2004:

    • Mining Data Streams

    • Collaborative Filtering

    • Quality Assessment


Comparison between 02-04 Review Scores: Box-plot


Comparison between 02-04 Countries


Comparison between 02 and 04 Topics



Multidimensional Scaling (2003 and 2004)

The topological structure with respect to similarities appears unchanged between 2003 and 2004.

[Figure: two multidimensional scaling maps, one for 2003 and one for 2004, each showing the attributes Country, Decision, Paper No., Review Score, Topics, # of Authors, and # of Chars in Title.]



Data Mining on ICDM Submission Data

  • Acknowledgements

    • Many thanks to

      • PC chairs, Vice Chairs and PC members

      • All the authors

      • All the contributors to ICDM2004

    • See you again in ICDM2005!


Multidimensional Scaling (2004)

[Figure: multidimensional scaling map of the attributes Country, Decision, Paper No., Review Score, Topics, # of Authors, and # of Chars in Title.]



Multidimensional Scaling (2003)

[Figure: multidimensional scaling map of the attributes Country, Decision, Paper No., Review Score, Topics, # of Authors, and # of Chars in Title.]

