Explore the integration of OntoSelect with SWSE to enhance ontology discovery and facilitate efficient search functionalities. Discover the architecture, experiments, and benefits of this collaboration for improved ontology ranking. Visit the OntoSelect platform for ontology publishing, browsing, and searching, enriched with rich metadata. Take advantage of multilingual search capabilities and statistical insights to enhance your ontology exploration experience.
From OntoSelect to OntoSelect-SWSE
Paul Buitelaar, Andreas Harth
DERI – NUIG, Galway
February 2009
Outline
• OntoSelect @ DFKI
• Recap of OntoSelect Functionality
• OntoSelect @ DERI
• SWSE: Semantic Web Search Engine Architecture
• OntoSelect-SWSE
• OntoSelect-SWSE Experiments
• Ranked List of Ontologies with Rich Metadata
OntoSelect
• Ontology Library and Ontology Search Service: http://olp.dfki.de/OntoSelect
• OntoSelect monitors the web for ontologies (indexing/updates)
• Ontology browse and search (by keyword, topic, document)
• Class, property and (multilingual) label browse and search
• Ontology publishing (submit your ontology)
• Statistics on
  • Formats
  • Human languages
  • Frequently used labels
  • Ontology publishing
OntoSelect Statistics - Multilinguality
Distribution of languages in the 136 ontologies with multilingual labels, out of 1530 ontologies currently collected (~9%)
OntoSelect Statistics - Labels
Most frequently used labels (‘words’, ‘terms’) in 1530 ontologies
SWSE Architecture
• Distributed shared-nothing architecture
• Implementation scales to billions of triples
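A shared-nothing design means each node owns its own slice of the data and no storage is shared between nodes. The slides do not say how SWSE assigns triples to nodes, so the following is only a minimal sketch of one common approach, hash partitioning by subject. The function names `node_for_subject` and `partition_triples`, the node count, and the sample triples are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def node_for_subject(subject: str, num_nodes: int) -> int:
    """Map a subject URI to a node index with a stable hash.

    hashlib is used instead of the built-in hash() so the mapping
    stays the same across processes and machines.
    """
    digest = hashlib.sha1(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition_triples(triples, num_nodes: int) -> dict:
    """Group (subject, predicate, object) triples by owning node.

    All triples with the same subject end up on the same node, so a
    lookup by subject only has to contact one node.
    """
    shards = defaultdict(list)
    for s, p, o in triples:
        shards[node_for_subject(s, num_nodes)].append((s, p, o))
    return dict(shards)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the experiments.
    sample = [
        ("ex:a", "rdf:type", "ex:C"),
        ("ex:a", "rdfs:label", "A"),
        ("ex:b", "rdf:type", "ex:C"),
    ]
    for node, shard in sorted(partition_triples(sample, 4).items()):
        print(node, shard)
```

Partitioning by subject is only one possible choice; a real system might keep several indexes with different partitioning keys.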
OntoSelect-SWSE Experiments
• Experiment Onto
  • Seed set from Google & Yahoo (*.owl, *.daml, *.rdfs)
  • 27,519 data sources (ontologies)
  • 6.5 million statements
• Experiment Web
  • Crawling six degrees from seed URI http://www.w3.org/People/Berners-Lee/card
  • 100,555 data sources (instance data + ontologies)
  • 11.9 million statements
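Crawling "six degrees" from a seed can be read as a breadth-first traversal that stops expanding once a source is six link hops away from the seed. The sketch below illustrates that reading only; it is not the SWSE crawler. The `crawl` function and its `get_links` callback are hypothetical names, and the link graph is toy data held in memory, so nothing is fetched over HTTP.

```python
from collections import deque


def crawl(seed, get_links, max_depth=6):
    """Breadth-first traversal limited to max_depth hops from the seed.

    get_links(uri) must return the URIs linked from uri. The result
    maps every reached URI to its hop distance from the seed.
    """
    depth_of = {seed: 0}
    queue = deque([seed])
    while queue:
        uri = queue.popleft()
        if depth_of[uri] == max_depth:
            continue  # reached the depth limit, do not follow links further
        for target in get_links(uri):
            if target not in depth_of:  # visit each source only once
                depth_of[target] = depth_of[uri] + 1
                queue.append(target)
    return depth_of


if __name__ == "__main__":
    seed = "http://www.w3.org/People/Berners-Lee/card"
    # Hypothetical link graph; the example.org URIs are placeholders.
    graph = {
        seed: ["http://example.org/a", "http://example.org/b"],
        "http://example.org/a": ["http://example.org/c"],
    }
    reached = crawl(seed, lambda uri: graph.get(uri, []))
    for uri, depth in reached.items():
        print(depth, uri)
```

Passing the link lookup as a callback keeps the traversal logic separate from how links are obtained, so the same function works on a stored graph or on a live fetcher.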
Conclusion
• SWSE framework facilitates web data experiments
• Taking into account real-world instance usage improves ontology ranking
• Web data is noisy (more data providers == more noise)
• Ontology registry should include rating facility to “vote out” data sources that do not provide consensus view
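The point about instance usage can be made concrete with a small sketch. The slides do not give the ranking function used in the experiments, so the scoring here is an assumption: an ontology scores one point for every `rdf:type` statement in the crawled data whose class lies in that ontology's namespace. The function name `rank_by_instance_usage` and the sample data are hypothetical.

```python
from collections import Counter

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rank_by_instance_usage(triples, ontology_namespaces):
    """Rank ontology namespaces by how often their classes are instantiated.

    Assumed scoring: one point per rdf:type statement whose object
    starts with the namespace. Ties are broken alphabetically so the
    output order is deterministic.
    """
    usage = Counter({ns: 0 for ns in ontology_namespaces})
    for _subject, predicate, obj in triples:
        if predicate != RDF_TYPE:
            continue
        for ns in ontology_namespaces:
            if obj.startswith(ns):
                usage[ns] += 1
                break
    return sorted(usage.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    foaf = "http://xmlns.com/foaf/0.1/"
    other = "http://example.org/onto#"  # placeholder namespace
    data = [
        ("ex:alice", RDF_TYPE, foaf + "Person"),
        ("ex:bob", RDF_TYPE, foaf + "Person"),
        ("ex:thing", RDF_TYPE, other + "Widget"),
    ]
    for namespace, count in rank_by_instance_usage(data, [foaf, other]):
        print(count, namespace)
```

A usage count like this could be combined with other signals, such as the ratings proposed in the last bullet, but how to weight them is left open here.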