
The Semantic Web: Has the DB Community Missed the Bus (again?)

Vipul Kashyap, National Library of Medicine, NIH, kashyap@nlm.nih.gov. NSF Workshop on DB & IS Research for Semantic Web and Enterprises, April 3, 2002.

Presentation Transcript


  1. The Semantic Web: Has the DB Community Missed the Bus (again?)
     Vipul Kashyap, National Library of Medicine, NIH, kashyap@nlm.nih.gov
     NSF Workshop on DB & IS Research for Semantic Web and Enterprises, April 3, 2002

  2. What Makes the “Syntactic” Web Click?
     • Technology?
       • Yes, but …
       • Why wasn’t the Internet (telnet, ftp, gopher) as successful?
       • Why were DBMS servers and CORBA/RMI not as successful?
     • Multimedia?
       • Probably …
       • Better “cognitive compatibility” than text
     • Ease of use?
       • We are getting there!
       • Just point and click; easy to publish information
     • People?
       • “For the people/by the people”
       • A “primitive but useful” mechanism for people to “socialize” with each other!
     • Questions:
       • What is the Semantic Web? How can it make the syntactic Web better?
       • How can DB research help?

  3. Semantic “Networking”
     It is crucial for the interoperability layer to migrate from the syntactic to the semantic!

  4. The Semantic Web Fabric: A Collection of Metadata Descriptions and Ontologies
     [Architecture diagram] Components shown:
     • User Query / Information Request (three instances)
     • Inter-Ontology Relationships Manager
     • Metadata Servers and Ontology Servers
     • Metadata Repositories
     • Distributed Computing Infrastructure (J2EE, .NET, CORBA, Agents)
     • Data Repositories

  5. Components of the Semantic Web Fabric
     • Bootstrapping, Creation and Maintenance of Semantic Knowledge
       • Collaborative and Sociological Processes, Statistical Techniques
       • Ontology Building, Maintenance and Versioning Tools
       • Re-use of Existing Semantic Knowledge (Ontologies)
     • Annotation/Association/Extraction of Knowledge with/from Underlying Data
     • Information Retrieval and Analysis (Distributed Querying/Search/Inference Middleware)
     • Semantic Discovery and Composition of Services
     • Distributed Computing/Communication Infrastructures
       • Component-based technologies, Agent-based systems, Web Services
     • Repositories for managing data and semantic knowledge
       • Relational Databases, Content Management Systems, Knowledge Base Systems

  6. What Have DB Researchers Done?
     • Semantic Data Models
     • Multi-database Schema Heterogeneity
     • Multi-database/Federated Database Schema Integration
     • Schema Evolution
     • Object Oriented/XML/Deductive Databases/Rule Based Systems
     • Mediators and Wrappers
     • Multidatabase/Federated Database Query Processing
     • Data Mining
     • Probabilistic Databases
     • Workflow-based Coordination Systems
     • Security in Database Systems
     • Multimedia Databases
     • Text and Information Retrieval Systems
     • Image Databases
     DB research is well positioned to contribute to the Semantic Web, but:
     • there has been little interest in semantics-related issues in the DB community
     • the Semantic Web can be the underlying theme that ties together all the disparate pieces of work

  7. What Are the Gaps?
     • Ontology Integration/Interoperation
       • The problem is different from schema integration
       • Need to address the “semantics” of relationships such as “synonyms”, “hyponyms”, etc.
     • Ontology Impedance/Mismatch
       • Relax the requirements of consistency and completeness
       • Should be able to characterize the “information error/loss” that occurs
     • Dynamic Ontologies
       • Need to relax the assumption that database schemas are static
     • Inferences based on Semantics of the Data
       • Relatively ignored by the DB community
     • Semantics of Multimedia Data
       • Need to focus more on non-traditional data such as text, images, etc.
       • Need to focus on “annotation mechanisms” in addition to wrappers/mediators
     • Performance/Scalability
       • A traditional strong point of DB research
     The next wave of research (esp. in the context of the Semantic Web) will focus on re-use of pre-existing data models/schemas/ontologies that describe the content of information sources.

  8. Bibliography Data Ontology: The Blue Ontology
     [Ontology diagram] Concepts shown: Biblio-Thing, Document, Technical-Report, Book, Miscellaneous-Publication, Proceedings, Edited-Book, Technical-Manual, Cartographic-Map, Computer-Program, Doctoral-Thesis, Multimedia-Document, Newspaper, Artwork, Journal, Master-Thesis, Magazine, Conference, Agent, Person, Organization, Author, Publisher, University, Thesis, Periodical-Publication
     Source: http://www-ksl.stanford.edu/knowledge-sharing/ontologies/html/bibliographic-data/

  9. A Subset of WordNet 1.5: The Red Ontology
     [Ontology diagram] Concepts shown: Print-Media, Journalism, Press, Publication, Periodical, Newspaper, Magazine, Book, Journals, Pictorial, Series, Trade-Book, Brochure, TextBook, SongBook, Reference-Book, PrayerBook, CookBook, Encyclopedia, WordBook, HandBook, Directory, Instruction-Book, Annual, GuideBook, Manual, Bible, Reference-Manual, Instructions
     Source: http://www.cogsci.princeton.edu/~wn/w3wn.html

  10. Inter-ontological relationships
      • Synonyms
        • lead to semantics-preserving translations
      • Hyponyms/Hypernyms
        • lead to semantics-altering translations
        • typically result in loss of recall and precision
      • List of hyponyms
        • technical-manual hyponym manual
        • book hyponym book
        • proceedings hyponym book
        • thesis hyponym book
        • misc-publication hyponym book
        • technical-reports hyponym book
        • press hyponym periodical-publication
        • periodical hyponym periodical-publication
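
The hyponym list on this slide can be read as a small lookup table. The following is a minimal sketch, not the author's implementation: the pairs are copied from the slide, while the names `HYPONYM_PAIRS`, `broader_terms` and `narrower_terms` are hypothetical helpers introduced only for illustration. Each pair is read as (narrower term, broader term), with the two terms coming from different ontologies.

```python
from typing import List, Tuple

# (narrower term, broader term) pairs copied from the slide.
# Each pair links a term in one ontology to a term in the other.
HYPONYM_PAIRS: List[Tuple[str, str]] = [
    ("technical-manual", "manual"),
    ("book", "book"),
    ("proceedings", "book"),
    ("thesis", "book"),
    ("misc-publication", "book"),
    ("technical-reports", "book"),
    ("press", "periodical-publication"),
    ("periodical", "periodical-publication"),
]


def broader_terms(term: str) -> List[str]:
    """Terms in the other ontology that `term` is a hyponym of."""
    return sorted({broad for narrow, broad in HYPONYM_PAIRS if narrow == term})


def narrower_terms(term: str) -> List[str]:
    """Terms in the other ontology that are hyponyms of `term`."""
    return sorted({narrow for narrow, broad in HYPONYM_PAIRS if broad == term})


if __name__ == "__main__":
    print(broader_terms("press"))
    print(narrower_terms("book"))
```

Translating a term to one of its broader terms, or to the union of its narrower terms, changes the meaning of the query, which is why the slide associates these relationships with a loss of recall and precision.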

  11. Ontology Integration and Query Rewriting
      [Diagram of the integrated ontology] Concepts shown, in transcript order: Document, Publication, Periodical-Publication, Periodical, Book, Pictorial, Technical-Report, Journal, Series, Book, SongBook, Trade-Book, Thesis, TextBook, Proceedings, PrayerBook, Brochure, Misc-Publication, Reference-Book, CookBook, Instruction-Book, Encyclopedia, Directory, HandBook, Annual, WordBook, Manual, Bible, Instructions, Technical-Manual, Reference-Manual, GuideBook
      Constraints shown: (ATLEAST 1 place), (ATLEAST 1 ISBN)
      Sets of candidate rewritings shown:
      • { union(Journal, union(Book, Proceedings, ..., Misc-Publication)), union(Periodical-Publication, union(Book, ..., Misc-Publication)), Document }
      • { Journal, Periodical-Publication }
      • { union(Book, Proceedings, ..., Misc-Publication) }
      • { Technical-Manual }
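
The rewriting sets on this slide suggest two ways to translate a term that the target ontology lacks: replace it by the union of its nearest narrower terms, or by one of its nearest broader terms. The following is a minimal sketch of that idea under stated assumptions. The concept names come from the slide, but the subclass edges in `PARENTS`, the target vocabulary, and the function names are hypothetical, since the diagram's edges did not survive in the transcript.

```python
from typing import Dict, List, Set

# Illustrative subclass edges (child -> parents). These edges are an
# assumption made for the example, not a reconstruction of the slide.
PARENTS: Dict[str, List[str]] = {
    "Book": ["Publication"],
    "Proceedings": ["Publication"],
    "Periodical": ["Publication"],
    "Journal": ["Periodical"],
    "Publication": ["Document"],
}


def children_of(term: str) -> List[str]:
    """Direct subclasses of `term` in the integrated hierarchy."""
    return sorted(child for child, parents in PARENTS.items() if term in parents)


def nearest_narrower(term: str, target: Set[str]) -> Set[str]:
    """Closest descendants of `term` that exist in the target ontology."""
    found: Set[str] = set()
    for child in children_of(term):
        if child in target:
            found.add(child)
        else:
            found |= nearest_narrower(child, target)
    return found


def nearest_broader(term: str, target: Set[str]) -> Set[str]:
    """Closest ancestors of `term` that exist in the target ontology."""
    found: Set[str] = set()
    for parent in PARENTS.get(term, []):
        if parent in target:
            found.add(parent)
        else:
            found |= nearest_broader(parent, target)
    return found


def rewrite(term: str, target: Set[str]) -> Dict[str, List[str]]:
    """Candidate rewritings of `term` over the target ontology's vocabulary."""
    if term in target:
        return {"exact": [term]}
    return {
        "union_of_narrower": sorted(nearest_narrower(term, target)),
        "any_broader": sorted(nearest_broader(term, target)),
    }


if __name__ == "__main__":
    target_terms = {"Book", "Proceedings", "Journal", "Document"}
    print(rewrite("Publication", target_terms))
```

Under these assumed edges, a query on `Publication` can be answered either by the union of `Book`, `Journal` and `Proceedings` (possibly missing some publications) or by `Document` (possibly returning non-publications); choosing between such candidates is where an estimate of information loss becomes useful.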

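  12. Estimating Loss of Information based on Term Extensions
      [Diagram labels: Ext(Term), Ext(Translation), Loss in Precision, Loss in Recall]

      $$\mathrm{Precision} = \frac{|\mathrm{Ext}(\mathrm{Term}) \cap \mathrm{Ext}(\mathrm{Translation})|}{|\mathrm{Ext}(\mathrm{Translation})|}$$

      $$\mathrm{Recall} = \frac{|\mathrm{Ext}(\mathrm{Term}) \cap \mathrm{Ext}(\mathrm{Translation})|}{|\mathrm{Ext}(\mathrm{Term})|}$$

      $$\text{Percentage Loss} = \frac{|\mathrm{Ext}(\mathrm{Term}) \,\triangle\, \mathrm{Ext}(\mathrm{Translation})|}{|\mathrm{Ext}(\mathrm{Term})| + |\mathrm{Ext}(\mathrm{Translation})|} = 1 - \frac{1}{\frac{1}{2}\left(\frac{1}{\mathrm{Precision}}\right) + \frac{1}{2}\left(\frac{1}{\mathrm{Recall}}\right)}$$

      $$\Rightarrow\quad 1 - \frac{1}{\alpha\left(\frac{1}{\mathrm{Precision}}\right) + (1-\alpha)\left(\frac{1}{\mathrm{Recall}}\right)}, \qquad 0 < \alpha < 1$$

      ($\triangle$ denotes symmetric difference.)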
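
The measures on this slide can be checked on small sets. The following is a minimal sketch, assuming extensions are available as Python sets of instance identifiers. The set operators did not survive in the transcript, so the sketch assumes intersection in Precision and Recall and symmetric difference in the Percentage Loss ratio, which is the reading under which that ratio equals the α = 1/2 form. The function names and the sample data are hypothetical.

```python
from typing import Set


def precision(term_ext: Set[str], translation_ext: Set[str]) -> float:
    """Share of the translation's instances that also belong to the term."""
    return len(term_ext & translation_ext) / len(translation_ext)


def recall(term_ext: Set[str], translation_ext: Set[str]) -> float:
    """Share of the term's instances that the translation retrieves."""
    return len(term_ext & translation_ext) / len(term_ext)


def loss(term_ext: Set[str], translation_ext: Set[str], alpha: float = 0.5) -> float:
    """1 - 1 / (alpha/Precision + (1 - alpha)/Recall), with 0 < alpha < 1.

    Both extensions are assumed to be non-empty.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    p = precision(term_ext, translation_ext)
    r = recall(term_ext, translation_ext)
    if p == 0 or r == 0:
        # Disjoint extensions: treated here as total loss (an assumption).
        return 1.0
    return 1 - 1 / (alpha / p + (1 - alpha) / r)


def symmetric_difference_loss(term_ext: Set[str], translation_ext: Set[str]) -> float:
    """|A symmetric-difference B| / (|A| + |B|), the alpha = 1/2 special case."""
    return len(term_ext ^ translation_ext) / (len(term_ext) + len(translation_ext))


if __name__ == "__main__":
    # Hypothetical extensions, invented for the example.
    term = {"d1", "d2", "d3", "d4"}
    translation = {"d3", "d4", "d5", "d6", "d7", "d8"}
    print(precision(term, translation), recall(term, translation))
    print(loss(term, translation), symmetric_difference_loss(term, translation))
```

Raising `alpha` toward 1 weights precision more heavily; lowering it toward 0 weights recall more heavily.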

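  13. Semantic Adaptation of Precision and Recall
      • Term subsumes Translation
        • $\mathrm{Ext}(\mathrm{Translation}) \subseteq \mathrm{Ext}(\mathrm{Term}) \;\Rightarrow\; \mathrm{Ext}(\mathrm{Term}) \cap \mathrm{Ext}(\mathrm{Translation}) = \mathrm{Ext}(\mathrm{Translation})$
        • $\mathrm{Precision} = 1$
        • $\mathrm{Recall} = \dfrac{|\mathrm{Ext}(\mathrm{Translation})|}{|\mathrm{Ext}(\mathrm{Term})|}$
      • However: Term and Translation belong to different ontologies
        • $\mathrm{Ext}(\mathrm{Term}) = \mathrm{Ext}(\mathrm{Term}) \cup \mathrm{Ext}(\mathrm{Translation})$
        • $\mathrm{Recall.low} = \dfrac{|\mathrm{Ext}(\mathrm{Translation})|.\mathrm{low}}{|\mathrm{Ext}(\mathrm{Translation})|.\mathrm{low} + |\mathrm{Ext}(\mathrm{Term})|}$
        • $\mathrm{Recall.high} = \dfrac{|\mathrm{Ext}(\mathrm{Translation})|.\mathrm{high}}{\max\left(|\mathrm{Ext}(\mathrm{Translation})|.\mathrm{high},\, |\mathrm{Ext}(\mathrm{Term})|\right)}$
      • Need to evolve a common framework for relating subsumption and information loss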
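
The recall bounds on this slide follow from a standard fact about unions: for any two sets, the size of the union is at least the larger of the two sizes and at most their sum. If the term's extension is taken to be the union of the two extensions, recall therefore lies between |Ext(Translation)| / (|Ext(Translation)| + |Ext(Term)|) and |Ext(Translation)| / max(|Ext(Translation)|, |Ext(Term)|). The following is a minimal sketch of that calculation, reading `.low` and `.high` as lower and upper estimates of the translation's extension size; the function name, parameter names and sample numbers are hypothetical.

```python
from typing import Tuple


def recall_bounds(term_size: int, translation_low: int, translation_high: int) -> Tuple[float, float]:
    """Lower and upper bounds on recall when Term subsumes Translation
    but the two extensions come from different ontologies.

    term_size        -- |Ext(Term)| as recorded in the term's own ontology
    translation_low  -- lower estimate of |Ext(Translation)|
    translation_high -- upper estimate of |Ext(Translation)|
    """
    if term_size <= 0 or not 0 <= translation_low <= translation_high:
        raise ValueError("expected term_size > 0 and 0 <= low <= high")
    # Worst case: the extensions are disjoint, so the union is their sum.
    low = translation_low / (translation_low + term_size)
    # Best case: one extension contains the other, so the union is the larger one.
    high = translation_high / max(translation_high, term_size)
    return low, high


if __name__ == "__main__":
    # Hypothetical sizes, invented for the example.
    print(recall_bounds(term_size=100, translation_low=20, translation_high=40))
```

The gap between the two bounds reflects how little is known about the overlap between extensions held in different ontologies.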

  14. Conclusions
      • Data models/schemas/ontologies will form the critical infrastructure for the Semantic Web
      • Re-use of pre-existing data models/schemas/ontologies is crucial in describing the semantics of various information sources
      • There is a need to relax consistency and completeness requirements and to estimate the “error” in the results returned
      • Semantics of information should be used to minimize “error” in the information obtained
      • DB research is well positioned to participate in the Semantic Web if it “adapts” to these new requirements … otherwise it is in danger of missing the “bus” again!
