
ArchiWordNet: Integrating WordNet with Domain-Specific Knowledge


Presentation Transcript


  1. ArchiWordNet: Integrating WordNet with Domain-Specific Knowledge • Luisa Bentivogli (1), Andrea Bocco (2), Emanuele Pianta (1) • (1) ITC-irst, Trento, Italy • (2) Politecnico di Torino, Italy

  2. Outline • ArchiWordNet: a WordNet-like thesaurus • Adopting and adapting the MultiWordNet model • Integrating ArchiWordNet with MultiWordNet • Conclusion and future work GWC 2004 - Brno, January 20-23, 2004


  4. ArchiWordNet: a WordNet-like thesaurus • A bilingual English/Italian thesaurus for the “Architecture and Construction” domain • structured according to the WordNet model • fully integrated with MultiWordNet • MultiWordNet: a multilingual lexical database in which the Italian WordNet is strictly aligned with Princeton’s English WordNet

  5. Motivation • Still Image Server, an architecture image archive available at the Polytechnic of Turin • Need for a thesaurus: • image cataloguing (minimize subjectivity) • image retrieval (minimize ambiguity) • No exhaustive thesauri for the architecture domain are available

  6. Why the (Multi)WordNet model? • A rich and rigorous structure • synonyms • many relations explicitly and homogeneously encoded • Allows for a more powerful and expressive retrieval mechanism • no ambiguities • extended search with related concepts • Is more suitable for educational purposes

  7. Why integrate it with MultiWN? • A general and multilingual framework for the specialized knowledge • Integrated access allows for more flexible retrieval of information • Information already existing in the generic (Multi)WordNet can be exploited in the creation of the specialized one


  9. Adopting the MultiWN model • Sources: • Specialized sources • Art and Architecture Thesaurus (AAT) • Construction Indexing Manual of CI/SfB • International and national standards (ISO, CEN, UNI) • Architecture and building dictionaries • Domain literature • MultiWN itself • Issues: • Reorganize specialized sources to make them compatible with the MultiWN model • Modify MultiWN synsets to make them suitable for representing the specialized domain

  10. Reorganizing domain-specific sources • [Figure: AAT hierarchy and ArchiWN hierarchy]

  11. Tailoring MultiWN synsets • MultiWN synsets considered appropriate by the domain experts are included in ArchiWN • Several options are available: • add synonyms to or delete synonyms from MultiWN synsets • modify the MultiWN definitions of the synsets • delete and add relations between synsets

  12. New relations for ArchiWN • HAS-FORM (n/n) • {tympanum} HAS-FORM {triangle, trigon, …} • HAS-ROLE (n/n) • {metal section} HAS-ROLE {upright, vertical} • HAS-FUNCTION (n/v) • {beam} HAS-FUNCTION {to hold, to support, …}
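A minimal sketch of how the three new relations and their part-of-speech signatures could be encoded. The `Synset` class, the `RELATION_SIGNATURES` table, and `make_relation` are hypothetical names introduced here for illustration; they are not part of MultiWN or ArchiWN. Only the relation names, the n/n and n/v signatures, and the three example pairs come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """Hypothetical synset record: a tuple of synonyms plus a part of speech."""
    lemmas: tuple
    pos: str  # "n" for noun, "v" for verb


# (source POS, target POS) for each new relation, as listed on the slide.
RELATION_SIGNATURES = {
    "HAS-FORM": ("n", "n"),
    "HAS-ROLE": ("n", "n"),
    "HAS-FUNCTION": ("n", "v"),
}


def make_relation(source, name, target):
    """Return a (source, relation, target) triple after checking the POS signature."""
    expected = RELATION_SIGNATURES[name]
    actual = (source.pos, target.pos)
    if actual != expected:
        raise ValueError(f"{name} expects {expected}, got {actual}")
    return (source.lemmas, name, target.lemmas)


# The three examples given on the slide.
tympanum = Synset(("tympanum",), "n")
triangle = Synset(("triangle", "trigon"), "n")
metal_section = Synset(("metal section",), "n")
upright = Synset(("upright", "vertical"), "n")
beam = Synset(("beam",), "n")
support = Synset(("hold", "support"), "v")

relations = [
    make_relation(tympanum, "HAS-FORM", triangle),
    make_relation(metal_section, "HAS-ROLE", upright),
    make_relation(beam, "HAS-FUNCTION", support),
]
```

Checking the signature at creation time keeps a noun-to-verb relation such as HAS-FUNCTION from being applied to two nouns by mistake.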


  14. Integrating ArchiWN with MultiWN • 5,000 terms grouped in 13 semantic areas => the main ArchiWN hierarchies • Architectural styles • Materials • Construction products • Techniques • Tools • Components of buildings • Single buildings and building complexes • Physical properties • Conditions • Disciplines • People • Documents • Drawings and representations

  15. Integration issues • Identify the MultiWN nodes at which to insert the ArchiWN hierarchies • Include the ArchiWN hierarchies in MultiWN • Handle the overlaps between terms present in both MultiWN and ArchiWN • Handle possible inconsistencies in the hierarchies

  16. The integration methodology • Basic operations • performed on single MultiWN synsets • Complex procedures (plug-in) • apply to entire hierarchies

  17. Basic operations • eclipse a synset • tag a synset with the “architecture and construction” domain label • add relations to or delete relations from a synset • add or delete synonyms in a synset • modify the synset definition
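The basic operations listed above can be sketched as functions over a single synset record. This is an illustrative sketch only: the `Synset` fields and all function names are assumptions, and "eclipse" is modelled here as a flag that hides the synset without deleting it, which is one plausible reading that the slides do not spell out. The two wall definitions used in the example are the ones quoted on slide 38.

```python
from dataclasses import dataclass, field

DOMAIN_LABEL = "architecture and construction"


@dataclass
class Synset:
    """Hypothetical synset record; field names are illustrative."""
    synonyms: set
    definition: str
    relations: set = field(default_factory=set)  # (relation name, target id) pairs
    domains: set = field(default_factory=set)
    eclipsed: bool = False  # assumption: eclipsing hides, it does not delete


def eclipse(synset):
    synset.eclipsed = True


def tag_domain(synset):
    synset.domains.add(DOMAIN_LABEL)


def add_relation(synset, name, target_id):
    synset.relations.add((name, target_id))


def delete_relation(synset, name, target_id):
    synset.relations.discard((name, target_id))


def add_synonym(synset, lemma):
    synset.synonyms.add(lemma)


def delete_synonym(synset, lemma):
    synset.synonyms.discard(lemma)


def modify_definition(synset, text):
    synset.definition = text


# Example: tailoring the WordNet synset {wall} as on slide 38.
wall = Synset(
    synonyms={"wall"},
    definition=(
        "an architectural partition with a height and length greater than "
        "its thickness; used to divide or enclose an area or to support "
        "another structure"
    ),
)
tag_domain(wall)
modify_definition(
    wall,
    "an architectural partition with a height and length greater than "
    "its thickness; used to divide or enclose an area",
)
```

Each function touches exactly one synset, which matches the slide's distinction between basic operations (single synsets) and complex procedures (entire hierarchies).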

  18. Complex procedures • Substitutive plug-in • Integrative plug-in • Hyponymic plug-in • Inverse plug-in

  19-26. Complex procedures • [Diagrams: hierarchies of MultiWN (MWN) and ArchiWN (AWN) nodes illustrating, in turn, the substitutive, integrative, hyponymic, and inverse plug-ins]
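The slides name the four plug-in types without defining them in text. The sketch below illustrates only the two hierarchy-attaching ones under a stated assumption: a hyponymic plug-in attaches the root of an ArchiWN hierarchy as a hyponym of a MultiWN synset, and an inverse plug-in moves a MultiWN sub-hierarchy under an ArchiWN synset. The child-to-hypernym dictionary, the function names, and the node identifiers are all hypothetical and are not taken from the authors' implementation.

```python
def hyponymic_plug_in(hypernym_of, awn_root, mwn_synset):
    """Assumed reading: attach an ArchiWN hierarchy root under a MultiWN synset."""
    hypernym_of[awn_root] = mwn_synset


def inverse_plug_in(hypernym_of, mwn_subroot, awn_synset):
    """Assumed reading: move a MultiWN sub-hierarchy under an ArchiWN synset."""
    hypernym_of[mwn_subroot] = awn_synset


def ancestors(hypernym_of, node):
    """Follow hypernym links from a node up to the top of the hierarchy."""
    chain = []
    while node in hypernym_of:
        node = hypernym_of[node]
        chain.append(node)
    return chain


# Toy hierarchies (placeholder identifiers): each key maps to its hypernym.
hypernym_of = {
    "mwn:b": "mwn:a",
    "mwn:c": "mwn:b",
    "awn:y": "awn:x",
}

# Plug the ArchiWN hierarchy rooted at awn:x under the MultiWN synset mwn:a.
hyponymic_plug_in(hypernym_of, "awn:x", "mwn:a")

# Move the MultiWN sub-hierarchy rooted at mwn:b under the ArchiWN synset awn:y.
inverse_plug_in(hypernym_of, "mwn:b", "awn:y")

chain = ancestors(hypernym_of, "mwn:c")
```

After both operations, the hypernym chain of `mwn:c` passes through the ArchiWN nodes before reaching `mwn:a`, which is the kind of interleaving of MWN and AWN nodes the diagrams depict.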

  27. Results • 13 ArchiWN semantic areas plugged into 18 MultiWN synsets • 11 ArchiWN semantic areas (12 hierarchies) plugged directly into MultiWN • 4 substitutive plug-ins • 8 integrative plug-ins • 2 ArchiWN semantic areas (6 hierarchies) required a reorganization of some MultiWN sub-hierarchies • 4 hyponymic plug-ins • 2 inverse plug-ins • large-scale synset eclipsing
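As a quick consistency check, the counts on this slide add up. The snippet assumes that each plug-in corresponds to one hierarchy and one MultiWN synset, which the slide suggests but does not state; the variable names are illustrative.

```python
# Plug-in counts as reported on the Results slide.
direct_plug_ins = {"substitutive": 4, "integrative": 8}
reorganizing_plug_ins = {"hyponymic": 4, "inverse": 2}

direct_total = sum(direct_plug_ins.values())              # hierarchies plugged in directly
reorganized_total = sum(reorganizing_plug_ins.values())   # hierarchies needing reorganization
total_plug_ins = direct_total + reorganized_total         # assumed: one MultiWN synset each
semantic_areas = 11 + 2                                   # direct areas + reorganized areas
```

Under that assumption the totals match the 12 and 6 hierarchies, the 18 MultiWN synsets, and the 13 semantic areas reported.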

  28. ArchiWN so far • “Single buildings and building complexes” sub-hierarchy • 900 synsets • Italian and English synonyms • accurate definitions • Work done manually using the MultiWN graphical interface, which allows the user • to modify existing synsets and relations • to create new synsets


  30. Conclusions • It is possible to integrate ArchiWN with MultiWN • MultiWN itself can be widely exploited in the creation of ArchiWN hierarchies • Advantages of interdisciplinary cooperation • with respect to specialized thesauri • formalized structure • inheritance of linguistically oriented information from the generic WordNet • with respect to lexical resources • many synsets will be associated with images

  31. Future work • Continue enriching the “Single buildings and building complexes” hierarchy and populate the remaining hierarchies • Industrial applications: a multilingual specialized lexicon of approximately 1,000 synsets for the window and curtain wall industry • Agreement for the future use of ArchiWN by the Piemonte region in cataloguing its architectural cultural heritage

  32. Details

  33. Direct plug-ins

  34. Reorganizations

  35. Term overlapping • ITC-irst provides the Polytechnic with lists of terms: • synsets tagged with the “architecture” label in WN-Domains • hyponyms of WordNet plug-in synsets • WN-Domains: 2,595 synsets in total • Architecture = 155 synsets • Town planning = 444 synsets • Building industry = 1,541 synsets • Furniture = 455 synsets

  36. Hyponyms of plug-in synsets

  37. Reorganization of: • Components of buildings • Single buildings and building complexes • [Diagram: nodes entity/1, object/1, artifact/1, part/4, location/1, structure/1, component/3, region/1, structure (AWN), architectural component, architectural space, building/1, building complex/1, building element, “room, area, building space”, and open space, linked by four hyponymic (hypo) and two inverse plug-ins, with eclipsing]

  38. Modifying MultiWN definition • WordNet: {wall: “an architectural partition with a height and length greater than its thickness; used to divide or enclose an area or to support another structure”} • Modified definition of wall: “an architectural partition with a height and length greater than its thickness; used to divide or enclose an area” • structural_wall, bearing_wall: “any wall supporting a floor or the roof of a building” • [Diagram: ISA links among partition, divider, support, wall, structural_wall, and bearing_wall]
