

Mapping Between Taxonomies

Elena Eneva

27 Sep 2001

Advanced IR Seminar

Taxonomies
  • Formal systems of orderly classification of knowledge, which are designed for a specific purpose
  • Change of purpose, change of taxonomies
  • Businesses often need and keep the information in several structures

  • Important to be able to automatically map between taxonomies
Useful Mappings
  • Companies organizing information in various ways (e.g. one for marketing, another for product development)
  • Personal online bookmark classification
  • Search engines (e.g. Google <-> Yahoo)
  • EU Committee for Standardization “detailed overview of the existing taxonomies officially used in the EU, in order to derive general concepts such as: information organisation, properties, multilinguality, keywords, etc. and, last but not least, the mapping between.”
Approach

[Diagram built up over eleven slides: two example taxonomies, "By country" (German, French) and "By industry" (Textile, Automobile), with placeholder items labeled "abc".]

Learning Algorithms
  • Two separate learners for the documents
    • Old document category -> new document category
    • Document contents -> new category
  • Weighted average based on confidence
  • Final result determined by a decision tree
  • One combined learner: uses both the old category and the contents as features
  • Use the unlabeled data for bootstrapping (e.g. top 1%)
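
The confidence-weighted combination and the bootstrapping step can be sketched roughly as follows. This is only an illustrative sketch, not the presenter's implementation: the function names, the use of the top probability as the confidence score, and the hard-coded output of the content learner are all assumptions made for the example. The category names are borrowed from the Approach diagram.

```python
from collections import Counter, defaultdict


def train_category_mapper(pairs):
    """Estimate P(new category | old category) from labeled (old, new) pairs."""
    counts = defaultdict(Counter)
    for old, new in pairs:
        counts[old][new] += 1
    return {
        old: {new: n / sum(c.values()) for new, n in c.items()}
        for old, c in counts.items()
    }


def confidence(dist):
    """Assumption: use the highest probability as the learner's confidence."""
    return max(dist.values()) if dist else 0.0


def combine(dist_a, dist_b):
    """Average two distributions, weighting each by its own confidence."""
    wa, wb = confidence(dist_a), confidence(dist_b)
    total = wa + wb
    if total == 0:
        return {}
    labels = set(dist_a) | set(dist_b)
    return {
        label: (wa * dist_a.get(label, 0.0) + wb * dist_b.get(label, 0.0)) / total
        for label in labels
    }


def select_for_bootstrapping(predictions, fraction=0.01):
    """Keep the most confident fraction of (doc_id, dist) predictions."""
    ranked = sorted(predictions, key=lambda p: confidence(p[1]), reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Learner 1: old category -> new category, trained on a few labeled documents.
labeled = [
    ("German", "Textile"),
    ("German", "Automobile"),
    ("German", "Automobile"),
    ("French", "Textile"),
]
mapper = train_category_mapper(labeled)

# Learner 2: document contents -> new category (hypothetical output, hard-coded).
content_dist = {"Textile": 0.9, "Automobile": 0.1}

combined = combine(mapper["German"], content_dist)
best = max(combined, key=combined.get)
print(best, round(combined[best], 3))
```

In this toy case the content learner is the more confident of the two, so its vote dominates the average. The selected top fraction of unlabeled predictions would then be added to the training set for another round.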
Learners
  • Decision Tree (C4.5)
  • Naïve Bayes Classifier (Rainbow)
  • Support Vector Machine (SVM-Light)
  • KNN (from Yiming)
Datasets

Two classification schemes:

  • Reuters 2001
    • Topics
    • Industry categories
  • Hoovers-255 and Hoovers-28
    • 28 industry categories
    • 255 industry categories
  • Web pages from Google and Yahoo
Related Literature
  • Reconciling Schemas of Disparate Data Sources: A Machine Learning Approach, A. Doan, P. Domingos, and A. Halevy. Proceedings of the ACM SIGMOD Conf. on Management of Data (SIGMOD-2001).
  • Learning Source Descriptions for Data Integration, A. Doan, P. Domingos, and A. Levy. Proceedings of the Third International Workshop on the Web and Databases (WebDB-2000), pages 81-86, 2000. Dallas, TX: ACM SIGMOD.
  • Learning Mappings between Data Schemas, A. Doan, P. Domingos, and A. Levy. Proceedings of the AAAI-2000 Workshop on Learning Statistical Models from Relational Data, 2000, Austin, TX.
Questions and Ideas
  • Other possible datasets?
  • Other learners?
  • Other papers?

The end.