
Presenter : Chang,Chun-Chih Authors : David Milne * , Ian H. Witten 2012, AI

An open-source toolkit for mining Wikipedia.

Presentation Transcript


  1. An open-source toolkit for mining Wikipedia • Presenter: Chang, Chun-Chih • Authors: David Milne*, Ian H. Witten • 2012, AI

  2. Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • The online encyclopedia Wikipedia is a vast, constantly evolving tapestry of interlinked articles. • For developers and researchers, it represents a giant multilingual database of concepts and semantic relations, a potential resource for natural language processing.

  4. Objectives • The Wikipedia Miner toolkit is an open-source software system that allows researchers and developers to integrate Wikipedia’s rich semantics into their own applications. • Wikipedia Miner is intended to be a platform for sharing data mining techniques.

  5. Methodology - Architecture of the Wikipedia Miner toolkit

  6. Methodology - Measuring relatedness between concepts

  7. Methodology - Measuring relatedness between concepts

  8. Methodology - Features for measuring article relatedness

  9. Experiments - Impact of thresholds for disambiguation and detection

  10. Experiments - Impact of relatedness dependencies

  11. Experiments - Impact of training data

  12. Experiments - Performance of the disambiguator

  13. Experiments - Performance of the detector

  14. Conclusions • Our aim in releasing this work open source is not to provide a complete and polished product, but rather a resource for the research community to collaborate around and continue building together.

  15. Comments • Advantages • Applications - Wikipedia - Disambiguation - Annotation
