Design and Creation of Ontologies for Environmental Information Retrieval

Vipul Kashyap. AOS Workshop, Rome, November 2001. vipul_kashyap@yahoo.com

Presentation Transcript


  1. Design and Creation of Ontologies for Environmental Information Retrieval
     Vipul Kashyap
     AOS Workshop, Rome, November 2001
     vipul_kashyap@yahoo.com

  2. Outline
     • Ontologies for Information Retrieval: The InfoSleuth System
     • Sources for Ontology Construction
     • The Ontology Design Process:
       • “Reverse Engineering” from a database schema
       • Ontology refinement based on user queries
     • Enhancing the ontology
       • Using a data dictionary
       • Using a Thesaurus
     • Conclusions and Future Work

  3. Ontologies for Information Retrieval: The InfoSleuth System
     [Figure labels: User Agent (three instances), Resource Agent (three instances), KQML/KIF agents, Domain Ontology]

  4. Ontologies for Information Retrieval
     • Provide a concise, uniform, declarative description of semantic information
     • Independent of the syntactic representations and conceptual models of the underlying information bases
     • Domain models provide wider access by supporting multiple world views on the same underlying data
     • EDEN ontology defined in the context of the InfoSleuth system:
       • crucial to capture the elements of environmental information

  5. Sources for Ontology Construction
     • Pre-existing database schemas
       • data-directed component
     • A collection of representative queries, possibly parameterized, based on the application user interface
       • application-directed component
     • Thesauri and vocabularies (e.g., EEA Thesaurus)
       • knowledge-directed component
     • Ontology = knowledge-based middle ground between applications and data!

  6. The Ontology Design Process
     [Flowchart with two phases: "Ontology from Database Schema" and "Ontology from Queries"]
     Box labels:
     • Choose new Database Schema
     • Abstract details from Database Schema
     • Group information, analyze foreign keys and dependencies
     • Determine entities and attributes
     • Determine relationships
     • Choose new query
     • Add new entities and attributes
     • Add new subclasses and superclasses
     • Drop entities and attributes
     • Evaluate Ontology
     • Implement and Test
     • No more queries

  7. Reverse Engineering from a Database Schema
     • Abstraction of details related to:
       • data organization
       • local keys
     • Grouping information in multiple tables
     • Identifying relationships
     • Incorporating new concepts suggested by a new schema

  8. Environmental Databases
     • CERCLIS 3
       • http://www.epa.gov/enviro/html/cerclis/cerclis_overview.html
     • ITT
     • HAZDAT
       • http://www.atsdr.cdc.gov/hazdat.html
     • ERPIMS
       • http://www.resdyn.com/erpims
     • Basel Convention Database
       • http://www.unep.ch/basel

  9. Abstracting out details related to local keys
     Database Schema:
     • Site: site_id (PK), site_name, site_ifms_ssid_code, site_rcra_id, site_epa_id
     • Site_Characteristic: site_id (PK, FK to Site), rsic_code (PK, FK to Ref_Sic), sc_date
     Ontology:
     • Site: Id, code, name, date
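
The abstraction step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's tooling: the slide does not say which column feeds which ontology attribute, so the `COLUMN_TO_ATTRIBUTE` mapping below (including the choice to drop the source-specific identifiers) is an assumption made for the example.

```python
# Column lists transcribed from the slide's database schema.
SCHEMA = {
    "Site": ["site_id", "site_name", "site_ifms_ssid_code",
             "site_rcra_id", "site_epa_id"],
    "Site_Characteristic": ["site_id", "rsic_code", "sc_date"],
}

# ASSUMED mapping from (table, column) to an ontology attribute.
# None means the column is treated as a local detail and abstracted away.
COLUMN_TO_ATTRIBUTE = {
    ("Site", "site_id"): "Id",
    ("Site", "site_name"): "name",
    ("Site", "site_ifms_ssid_code"): None,   # source-specific identifier
    ("Site", "site_rcra_id"): None,          # source-specific identifier
    ("Site", "site_epa_id"): None,           # source-specific identifier
    ("Site_Characteristic", "site_id"): None,  # join key, repeats Site.site_id
    ("Site_Characteristic", "rsic_code"): "code",
    ("Site_Characteristic", "sc_date"): "date",
}


def abstract_entity(schema, mapping):
    """Collect the ontology attributes that survive the abstraction."""
    attributes = []
    for table, columns in schema.items():
        for column in columns:
            attribute = mapping.get((table, column))
            if attribute is not None and attribute not in attributes:
                attributes.append(attribute)
    return attributes


site_attributes = abstract_entity(SCHEMA, COLUMN_TO_ATTRIBUTE)
print(site_attributes)
```

Keeping the mapping as plain data means a domain expert can review or correct it without touching the function.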

  10. Grouping Information in Multiple Tables
      Database Schema:
      • Site: site_id (PK), site_name, site_ifms_ssid_code, site_rcra_id, site_epa_id
      • Site_Characteristic: site_id (PK, FK to Site), rsic_code (PK, FK to Ref_Sic), sc_date
      • Ref_Sic: rsic_code (PK), rsic_code_desc
      • Site_Alias: site_id (PK, FK to Site), site_alias_id (PK), sa_name
      Ontology:
      • Site: code, name, date, alias_name, description
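
One way to picture this grouping is as a database view that joins the tables into a single Site entity. The sketch below uses Python's built-in `sqlite3` module with a cut-down version of the slide's schema. The view name `SiteEntity`, the sample rows, and the column-to-attribute assignment are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Site (site_id INTEGER PRIMARY KEY, site_name TEXT);
CREATE TABLE Ref_Sic (rsic_code TEXT PRIMARY KEY, rsic_code_desc TEXT);
CREATE TABLE Site_Characteristic (
    site_id   INTEGER REFERENCES Site(site_id),
    rsic_code TEXT REFERENCES Ref_Sic(rsic_code),
    sc_date   TEXT,
    PRIMARY KEY (site_id, rsic_code)
);
CREATE TABLE Site_Alias (
    site_id       INTEGER REFERENCES Site(site_id),
    site_alias_id INTEGER,
    sa_name       TEXT,
    PRIMARY KEY (site_id, site_alias_id)
);

-- Made-up sample rows, for illustration only.
INSERT INTO Site VALUES (1, 'Example Landfill');
INSERT INTO Ref_Sic VALUES ('4953', 'Refuse systems');
INSERT INTO Site_Characteristic VALUES (1, '4953', '1999-05-01');
INSERT INTO Site_Alias VALUES (1, 1, 'Old Town Dump');

-- Hypothetical view exposing the grouped ontology attributes of Site.
CREATE VIEW SiteEntity AS
SELECT sc.rsic_code     AS code,
       s.site_name      AS name,
       sc.sc_date       AS "date",
       sa.sa_name       AS alias_name,
       r.rsic_code_desc AS description
FROM Site s
LEFT JOIN Site_Characteristic sc ON sc.site_id = s.site_id
LEFT JOIN Ref_Sic r ON r.rsic_code = sc.rsic_code
LEFT JOIN Site_Alias sa ON sa.site_id = s.site_id;
""")

rows = conn.execute(
    'SELECT code, name, "date", alias_name, description FROM SiteEntity'
).fetchall()
print(rows)
```

The left joins keep a site visible even when it has no alias or characteristic row, which seems the safer default for an entity-level view.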

  11. Identifying Relationships
      Database Schema:
      • Site: site_id (PK), site_name, site_ifms_ssid_code, site_rcra_id, site_epa_id
      • Action: site_id (PK, FK to Site), rat_code (PK, FK to ref_action_type), act_code_id (PK)
      • Ref_action_type: rat_code (PK), rat_name, rat_def
      • Remedial_Response: site_id, act_code_id, rat_code
      • Waste_Src_Media_Contaminated: wsmrc_nmbr (PK), site_id (PK, FK to Action), rat_code (FK to Action), act_code_id (FK to Action)
      Ontology (diagram labels):
      • Site, RemedialResponse, Contaminant, PerformedAt, actionName
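
Foreign keys are a natural starting point for spotting relationships. The sketch below lists candidate relationships from the foreign-key declarations on this slide. The naming step at the end, which maps the Action-to-Site link onto PerformedAt between RemedialResponse and Site, is an assumption about how the diagram was derived and would need a designer's confirmation.

```python
# Foreign keys transcribed from the slide: table -> {column: referenced table}.
FOREIGN_KEYS = {
    "Action": {"site_id": "Site", "rat_code": "Ref_action_type"},
    "Waste_Src_Media_Contaminated": {
        "site_id": "Action",
        "rat_code": "Action",
        "act_code_id": "Action",
    },
}


def candidate_relationships(foreign_keys):
    """Return each distinct (referencing table, referenced table) pair."""
    pairs = set()
    for table, columns in foreign_keys.items():
        for target in columns.values():
            pairs.add((table, target))
    return sorted(pairs)


candidates = candidate_relationships(FOREIGN_KEYS)

# ASSUMED naming chosen by the ontology designer for one candidate.
NAMED = {("Action", "Site"): ("RemedialResponse", "PerformedAt", "Site")}

for pair in candidates:
    print(pair, "->", NAMED.get(pair, "unnamed candidate"))
```

Composite foreign keys collapse into a single candidate pair, so a three-column reference to Action still yields one relationship to review.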

  12. Incorporation of new concepts from a different database schema
      Database Schema:
      • Sends_Export: uncode, toxics_code, toxics_description, exporter, importer
      • Receives_Export: uncode, toxics_code, toxics_description, exporter, importer
      Ontology (diagram labels):
      • Country, Contaminant, Site, RemedialResponse, ExportsTo, ImportsFrom, PerformedAt

  13. Ontology refinement based on user queries
      • Addition of new attributes
        • At NPL sites with a land use category of INDUSTRIAL, what is the cleanup level range for LEAD ….
        • Add an attribute landUseCategory to the entity Site in the ontology
      • Addition of new relationships
        • What is the range of concentrations for ARSENIC as a contaminant of concern in the SURFACE SOIL at NPL sites?
        • Add a relationship HasContaminant between the entities Site and Contaminant in the ontology
      • Addition of class-subclass relationships and new entities
        • How many Superfund sites are in Edison County, New Jersey?
        • Add an entity SuperFundSite as a subclass of Site in the ontology
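
The three kinds of refinement on this slide can be sketched as operations on a small in-memory ontology. The `Ontology` class and its method names are hypothetical, introduced only for this example. The entity, attribute, and relationship names come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology: entities with attributes, relationships, subclass links."""
    attributes: dict = field(default_factory=dict)    # entity -> set of attributes
    relationships: set = field(default_factory=set)   # (source, name, target)
    superclass: dict = field(default_factory=dict)    # subclass -> superclass

    def add_entity(self, name):
        self.attributes.setdefault(name, set())

    def add_attribute(self, entity, attribute):
        self.add_entity(entity)
        self.attributes[entity].add(attribute)

    def add_relationship(self, source, name, target):
        self.add_entity(source)
        self.add_entity(target)
        self.relationships.add((source, name, target))

    def add_subclass(self, subclass, parent):
        self.add_entity(parent)
        self.add_entity(subclass)
        self.superclass[subclass] = parent


ontology = Ontology()
ontology.add_entity("Site")

# Query about land use category -> new attribute on Site.
ontology.add_attribute("Site", "landUseCategory")
# Query about contaminants of concern -> new relationship.
ontology.add_relationship("Site", "HasContaminant", "Contaminant")
# Query about Superfund sites -> new entity as a subclass of Site.
ontology.add_subclass("SuperFundSite", "Site")

print(ontology)
```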

  14. Using a data dictionary (EDR) to enhance the ontology
      Tables:
      • Site: state
      • Map: coding_scheme1, coding_scheme2, coding_scheme3
      Diagram labels: StateName, StateCode, StateAbbr
      Example values: { "Texas", "California" }, { "TX", "CA" }
      Queries:
      • select * from Site where state = 'TX' or state = 'California'
      • select coding_scheme1 from Map where coding_scheme3 = 'TX'
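
The sketch below shows how a mapping table of this kind could let one query match sites regardless of which coding scheme a source used for the state. It runs on Python's built-in `sqlite3` module. Several details are assumptions: that coding_scheme1 holds state names and coding_scheme3 holds abbreviations (suggested by the slide's queries), that coding_scheme2 holds numeric codes (the values used here are illustrative), the sample Site rows, and the helper name `sites_in_state`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Data dictionary table; column meanings are ASSUMED (name, code, abbreviation).
CREATE TABLE Map (coding_scheme1 TEXT, coding_scheme2 TEXT, coding_scheme3 TEXT);
INSERT INTO Map VALUES ('Texas', '48', 'TX'), ('California', '06', 'CA');

-- Made-up sites whose state column mixes coding schemes.
CREATE TABLE Site (site_name TEXT, state TEXT);
INSERT INTO Site VALUES
    ('Site A', 'TX'), ('Site B', 'Texas'),
    ('Site C', 'California'), ('Site D', 'CA');
""")


def sites_in_state(connection, value):
    """Find sites in the state denoted by `value`, whatever coding it uses."""
    query = """
        SELECT s.site_name
        FROM Site s
        JOIN Map m
          ON s.state IN (m.coding_scheme1, m.coding_scheme2, m.coding_scheme3)
        WHERE ? IN (m.coding_scheme1, m.coding_scheme2, m.coding_scheme3)
        ORDER BY s.site_name
    """
    return [row[0] for row in connection.execute(query, (value,))]


# The translation query from the slide, run against the sample Map table.
texas_name = conn.execute(
    "select coding_scheme1 from Map where coding_scheme3 = 'TX'"
).fetchone()[0]

print(texas_name)
print(sites_in_state(conn, "TX"))
print(sites_in_state(conn, "California"))
```

With the mapping in place, the user can ask for a state in any of the known codings and the join takes care of the rest.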

  15. Enhancing the Ontology by using a Thesaurus
      Thesaurus entry:
      • abandoned site
        • THEME POLLUTION
        • BT land setup
        • NT disused military site
      Ontology (diagram labels):
      • LandSetup, Site, AbandonedSite, DisusedMilitarySite, SuperfundSite
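
A thesaurus entry's broader-term (BT) and narrower-term (NT) links can be read as candidate class-subclass edges. The sketch below applies that reading to the entry on this slide. Treating every BT/NT link as a subclass relation is a simplifying assumption, and the dictionary layout and function names are hypothetical.

```python
# Thesaurus entry transcribed from the slide; the dict layout is an assumption.
ENTRY = {
    "term": "abandoned site",
    "THEME": ["POLLUTION"],
    "BT": ["land setup"],               # broader terms
    "NT": ["disused military site"],    # narrower terms
}


def to_class_name(term):
    """Turn a thesaurus term such as 'land setup' into 'LandSetup'."""
    return "".join(word.capitalize() for word in term.split())


def subclass_edges(entry):
    """Return (subclass, superclass) pairs implied by the BT and NT links."""
    cls = to_class_name(entry["term"])
    edges = [(cls, to_class_name(broader)) for broader in entry.get("BT", [])]
    edges += [(to_class_name(narrower), cls) for narrower in entry.get("NT", [])]
    return edges


edges = subclass_edges(ENTRY)
for subclass, superclass in edges:
    print(f"{subclass} is a subclass of {superclass}")
```

Edges produced this way are suggestions: a thesaurus hierarchy is not always a strict is-a hierarchy, so each edge still deserves a quick review before it enters the ontology.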

  16. Conclusions and Future Work
      • Role of semantic content in handling data/information overload
      • Domain-specific ontologies: an approach for capturing semantic content
      • Design and construction of domain ontologies
        • labor-intensive, time-consuming, difficult endeavor
      • Re-use readily available information: schemas, queries, data dictionaries, thesauri
        • minimize the involvement of the domain expert
      • Extrapolate this technique into other domains:
        • telecommunication
        • IP networks (use of CIM information model by DMTF)
      • Apply these techniques to Knowledge Management and Acquisition
