Modularizing OWL Ontologies Bernardo Cuenca Grau, Bijan Parsia, Evren Sirin and Aditya Kalyanpur University of Maryland, College Park
Multiplicity in Web ontologies • Key features of the Web • Mostly global identifiers • URIs/IRIs • Identifier-based information manipulation • HTTP GET and POST/PUT • Distributed representations - Representations are (somewhat) modular • Progressive disclosure and graceful degradation • How do we get these things into OWL?
URIs in OWL Ontologies • Three key modalities of URI use • URIs as data (mention) • URIs as identifiers (use) • URIs as values of owl:imports (use++) • owl:imports supports transclusion • The transclusion is flat • I.e., an include; imported axioms just asserted • HTML has these modalities
owl:imports • Problems with owl:imports • Does not support information hiding or filtering • None of the imported axioms retain their context • No effective difference from copying and pasting the imported ontology locally • OWL gives us either ALL or NOTHING
owl:imports(2) • Without inclusion, nothing gets in • There are (conditional) effects: • If you import or merge, “same” URIs get merged • With inclusion, everything gets in • Even things which are intuitively irrelevant • NCI: You want Gene, you also get Belief System • The resulting ontology’s logic gets complex • The resulting ontology itself is very complex
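The all-or-nothing behaviour can be illustrated with a minimal sketch. It assumes a toy model in which an ontology is a set of axiom strings plus a list of imported ontology names; the ontology names, the axioms, and the function `imports_closure` are hypothetical and only mimic flat inclusion, not an OWL parser.

```python
# Toy model (assumption): an ontology is a set of axiom strings plus a list
# of imported ontology names. All names and axioms here are hypothetical.
ONTOLOGIES = {
    "my-genes": {"imports": ["nci"], "axioms": {"MyGene subClassOf Gene"}},
    "nci": {
        "imports": [],
        "axioms": {"Gene subClassOf Kind", "BeliefSystem subClassOf Kind"},
    },
}


def imports_closure(name, ontologies, seen=None):
    """Return every axiom asserted by `name` after flat, imports-style inclusion."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against import cycles
        return set()
    seen.add(name)
    axioms = set(ontologies[name]["axioms"])
    for imported in ontologies[name]["imports"]:
        # Flat transclusion: imported axioms are simply added, with no filtering.
        axioms |= imports_closure(imported, ontologies, seen)
    return axioms


closure = imports_closure("my-genes", ONTOLOGIES)
print(sorted(closure))
```

In this sketch the importing ontology only wanted Gene, yet the BeliefSystem axiom arrives too, because nothing in a flat inclusion can filter it out.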
Modularity in Web Ontologies • Knowledge sharing and re-use are crucial research issues. • Modularity worthy of the name should provide advantages in the following aspects: • Maintenance and Evolution • Understandability for Humans • Knowledge Re-use and sharing • Processability
E-Connections: The Basics • An E-Connection is a knowledge representation language defined as a combination of other logical formalisms. • Notation: C(…)(L1,…,Ln) • L1,…,Ln: the component logics (DLs, modal logics, spatial logics, temporal logics, …) • C(…): the new constructors and axioms
E-connections: Combined KBs • A Combined KB is a set of ontologies written in the language of an E-Connection • Each component ontology can be written in any of the component logics • Each component ontology is interpreted in a different logical context
Example components • OWL Ontology (SHOIN(D)): Person sub Animal. Pets sub Animal. Cars… Trees… States… Countries… … • Spatial KB (RCC8): Region1 DC R2. R3 EC R2. R4 PPTP R2. … • Temporal KB (Interval Temporal Logic): T1 before T2. T3 overlaps T4. …
E-connections: Connected KBs • Components are disjoint, but related through “links” • Individuals in the source are linked to individuals in the target • Concepts can be defined in terms of the links • A class can be defined as all those individuals who are linked to a specific set of individuals in another component • That other set might be defined in terms of links!
Example Applications • A set of independently developed KBs that are now required to inter-operate. • An organization wants to represent knowledge about intuitively disjoint, but connected domains. • Migration from a monolithic to a distributed representation. • Understanding of an ontology by humans. • Reuse of the knowledge about a given domain within an ontology.
What can we gain? • Expressivity & Decidability • Description Logic with temporal or spatial logic • Expressivity & practical algorithms • ex: C(SHIQ,SHOQ,SHIO) merges to SHOIQ • Modularity: Even in C(SHIF)! • Ability to integrate ontologies as reusable modules • Ability to split up large ontologies
Integration: People and Pets example • [Figure: a People component (Person, DogOwner, UnhappyPetOwner) and a Pets component (Pet, Dog, Cat, UnhappyCat, UnfriendlyPet), connected by the links owns, ownedBy, and lovesToPlayWith]
People and Pets example (continued) • DogOwner = Person ⊓ ∃owns.Dog • (“owns” is a link)
People and Pets example (continued) • UnhappyPetOwner = Person ⊓ ∃owns.UnfriendlyPet
People and Pets example (continued) • UnhappyCat = Cat ⊓ ∃ownedBy.DogOwner • ownedBy is inverseOf(owns)
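The link-based definitions in the People and Pets example can be checked on a small model. This is a sketch only: it assumes each definition is read as an intersection with an existential restriction on the link, and the individuals (`ann`, `bob`, `rex`, `tom`) and the helper `some_values_from` are hypothetical.

```python
# Two disjoint domains (all individuals are hypothetical).
people = {"ann", "bob"}
pets = {"rex", "tom"}

# Class extensions inside each component.
Person = {"ann", "bob"}
Dog = {"rex"}
Cat = {"tom"}

# "owns" is a link: pairs (individual in people, individual in pets).
owns = {("ann", "rex"), ("ann", "tom"), ("bob", "tom")}
# "ownedBy" is the inverse link, from pets back to people.
owned_by = {(pet, person) for (person, pet) in owns}


def some_values_from(link, target_class):
    """Sources that have at least one link successor in target_class."""
    return {src for (src, dst) in link if dst in target_class}


# DogOwner: persons linked by "owns" to some Dog.
DogOwner = Person & some_values_from(owns, Dog)
# UnhappyCat: cats linked by "ownedBy" to some DogOwner.
UnhappyCat = Cat & some_values_from(owned_by, DogOwner)
print(DogOwner, UnhappyCat)
```

Note that UnhappyCat is defined through a class (DogOwner) that is itself defined through a link, which is the nesting the Connected KBs slide points out.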
Families of E-Connection Languages • Two ways of defining new combination languages C(…)(L1,…,Ln): • Fix the component languages; vary the expressivity of links • Fix the expressivity of links; vary the combination languages
Extensions of E-Connections • Basic E-Connections: CI(L1,…,Ln) • Inverses on links • Extensions with number restrictions: FanaticPetOwner = PetOwner ⊓ ≥20 owns.Pet • Extensions with link hierarchies: lovesToPlayWith ⊑ lovesActivity • Extensions with Booleans on links: FrustratedPetOwner = PetOwner ⊓ ∃(owns & ~likes).Pet
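A small model can make the number-restriction and Boolean-link extensions concrete. The sketch below assumes the same set-of-pairs reading of links; the data, the helper `at_least`, and the class name `KeenPetOwner` are hypothetical, and a threshold of 2 stands in for the slide's 20 to keep the data small.

```python
from collections import defaultdict

# Hypothetical link extensions: (person, pet) pairs.
owns = {("ann", "rex"), ("ann", "tom"), ("bob", "tom")}
likes = {("ann", "rex"), ("bob", "tom")}
Pet = {"rex", "tom"}
PetOwner = {"ann", "bob"}


def at_least(n, link, target_class):
    """Sources with at least n distinct link successors in target_class."""
    successors = defaultdict(set)
    for src, dst in link:
        if dst in target_class:
            successors[src].add(dst)
    return {src for src, dsts in successors.items() if len(dsts) >= n}


# Number restriction on a link (threshold 2 here, 20 on the slide).
KeenPetOwner = PetOwner & at_least(2, owns, Pet)

# Boolean on links: (owns & ~likes) holds for pairs in owns but not in likes.
owns_not_likes = owns - likes
# An existential restriction is the "at least 1" case.
FrustratedPetOwner = PetOwner & at_least(1, owns_not_likes, Pet)
print(KeenPetOwner, FrustratedPetOwner)
```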
Reasoning with E-Connections of DLs • Depends on: • Which are the component logics • Expressivity of the links
Algorithms • Direct tableau algorithm • Intuitive extension: • Color the nodes • Apply standard techniques to source, links, and target “separately” • Does not perturb optimizations. • Actually implemented quickly
Integrating E-Connections in OWL
  <owl:Class rdf:ID="PetOwner">
    <rdfs:comment>A Person who owns at least one pet</rdfs:comment>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="#Person"/>
      <owl:Restriction>
        <!-- "owns" is a link property pointing into the pets ontology -->
        <owl:onProperty>
          <owl:LinkProperty rdf:about="#owns">
            <owl:foreignOntology rdf:resource="&pets;"/>
          </owl:LinkProperty>
        </owl:onProperty>
        <!-- the filler class lives in the foreign (pets) ontology -->
        <owl:someValuesFrom>
          <owl:ForeignClass rdf:about="&pets;#Pet">
            <owl:foreignOntology rdf:resource="&pets;"/>
          </owl:ForeignClass>
        </owl:someValuesFrom>
      </owl:Restriction>
    </owl:intersectionOf>
  </owl:Class>
Partitioning Problem • The task: • Input: An OWL ontology • Output (Intuition): A set of “semantically coherent modules”
Partitioning Problem (II) • Issue: How to precisely define the output • What is a module? • What exactly does it mean for a module to be “semantically coherent”?
Intuitions • A module of an ontology O should be • a subset of the axioms of O • semantically self-contained • reusable • evolvable • But what does this mean? • It should have the “same essential” meaning • It should have no “extra” meaning
Formalizing the problem • OWL entities • Classes, properties, individuals (datatypes, etc.) • Typically with a URI • As terms • A subset of O has a vocabulary • Simply, all the URIs used as identifiers in it • I.e., all the entities it talks about
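The vocabulary of a subset of O can be sketched in a few lines. The encoding is an assumption made for illustration: each axiom is a tuple whose first element is the axiom kind and whose remaining elements are entity names standing in for URIs. The axioms and the function `vocabulary` are hypothetical.

```python
# Assumption: an axiom is a tuple (kind, *entity_names). Names stand in for URIs.
O = [
    ("SubClassOf", "Dog", "Pet"),
    ("SubClassOf", "Cat", "Pet"),
    ("SubClassOf", "Person", "Animal"),
]


def vocabulary(axioms):
    """All entity names used as identifiers in the given axioms."""
    return {name for axiom in axioms for name in axiom[1:]}


subset = O[:2]  # a subset of the axioms of O
print(sorted(vocabulary(subset)))
```

Here the subset talks about Cat, Dog, and Pet only; Person and Animal are outside its vocabulary.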
Formalizing the problem • Semantic Encapsulation • A subset of O encapsulates an entity if it preserves certain essential entailments about it • A subset of O is strongly encapsulating if it encapsulates all the entities in its vocabulary • Strong encapsulation formalizes “Self-Containment” • Which entailments (e.g., for classes)? • Subsumption • Satisfiability • Instantiation
Formalizing the Problem • A module is for an entity in an ontology • So relative to an ontology • A module for X in O: • Is a subset of O • Encapsulates X • Is strongly encapsulating • I.e., encapsulates all of the entities in its vocabulary • Is minimal • No strongly encapsulating encapsulation of X encapsulates fewer entities • Say that 5 times fast!
Formalizing the problem • Partitioning Problem (formal) • Input: An OWL Ontology, O • Output: A module for each entity in O
How to tackle the problem • Generate an E-Connection, K by • Dividing the axioms of O into disjoint sets • Respecting the E-Connection constraints • Preserve certain key entailments • With the maximal number of components • Some axioms must go together • Disjointness of domains enforces that
Constraints and Choices • Recall: an E-Connection is a set of ontologies where: • The ontologies are pairwise disjoint • (they share no classes, properties, or individuals) • There are link relations • between individuals in different ontologies • Classes in an ontology can be defined in terms of restrictions on link relations.
The Algorithm in a Nutshell • Order the axioms, then • Move the first axiom into a new component • Find all the related axioms and move them • Suppose C ⊑ D moves to the first component • Suppose the ontology has D ⊑ E • D ⊑ E must go into the first component • (Primarily) structural tracing • Make link relations whenever possible • Repeat until the original list is empty. • Surprising Result: Obtained partitions are natural from a modeling perspective
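The structural idea can be approximated with a heavily simplified sketch. It is not the algorithm from the talk: it swaps the ordered move-and-trace loop for a union-find grouping, it handles only two hypothetical axiom shapes, `("sub", C, D)` for a subclass axiom and `("some", C, P, D)` for C being a subclass of an existential restriction on P with filler D, and it makes no claim about preserving entailments. The function `partition` and the sample axioms are assumptions.

```python
def partition(axioms):
    """Group axioms into components; return (components, link_properties)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    sources, targets = {}, {}  # property -> representative source / target class
    for axiom in axioms:
        if axiom[0] == "sub":
            _, c, d = axiom
            union(c, d)  # subclass axioms force both classes together
        elif axiom[0] == "some":
            _, c, p, d = axiom
            # All sources of p share one component, all targets share one,
            # but sources and targets may stay apart (p can become a link).
            union(c, sources.setdefault(p, c))
            union(d, targets.setdefault(p, d))

    components = {}
    for axiom in axioms:
        # Each axiom goes to the component of its left-hand class.
        components.setdefault(find(axiom[1]), []).append(axiom)
    # p is a link when its sources and targets ended up in different components.
    links = {p for p in sources if find(sources[p]) != find(targets[p])}
    return list(components.values()), links


axioms = [
    ("sub", "Dog", "Pet"),
    ("sub", "Cat", "Pet"),
    ("sub", "DogOwner", "Person"),
    ("some", "DogOwner", "owns", "Dog"),
]
components, links = partition(axioms)
print(len(components), links)
```

On this input the pet axioms and the people axioms land in separate components, and `owns` is kept as a link because nothing forces its source and target classes together.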
From Components to Modules • Once the E-Connection has been obtained: How do we get the module for an entity (say, X)? • Find the component which contains X • Trace the transitive closure of all the link properties of that component • Import all the found components, converting link properties back to object properties • That’s the module we’ve found!
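The tracing step can be sketched as a reachability computation over the partitioning graph. The graph, the entity-to-component mapping, and the function `module_components` below are hypothetical; converting link properties back to object properties and merging the axioms is left out.

```python
# Hypothetical partitioning graph: component -> components reached by its links.
LINKS = {"people": {"pets"}, "pets": {"toys"}, "toys": set(), "wines": set()}
# Hypothetical assignment of entities to components.
COMPONENT_OF = {
    "DogOwner": "people",
    "Dog": "pets",
    "Ball": "toys",
    "Merlot": "wines",
}


def module_components(entity, component_of, links):
    """Components reachable from the entity's component by following links."""
    start = component_of[entity]
    found, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for linked in links[current]:
            if linked not in found:
                found.add(linked)
                stack.append(linked)
    return found


print(sorted(module_components("DogOwner", COMPONENT_OF, LINKS)))
```

In this sketch the module for DogOwner pulls in three components, while the module for Merlot is its own component alone; the axioms of the returned components, taken together, would form the module.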
The Partitioning Graph • The partitioning graph generated from the E-Connection. • Picture: Graph for OWL-S
Output • For each entity X in the ontology • A subset of the axioms and assertions • which is strongly encapsulating • and is smaller than the ontology (usually) But… Isn’t minimal!
Advantages of “E-modules” • Correctness is ensured • Partitioning to an E-Connection is a worst-case quadratic problem. • The process is entirely structural • But the theory of partitioning provides us semantic guarantees! • Results in practice are good
The Wines Ontology • Canonical Example in the OWL Guide • Contains information about Wines (central domain), but also about: • Wines • Regions • Wineries • Grapes • …
Wines Ontology • Very simple tree-like decomposition. • Easy to see that wines are central to the ontology
The NCI Ontology • Huge ontology about the bio-medical domain. • Designed around a set of well-defined sub-domains, called kinds: • Genes • Drugs • Biological processes,…
The NCI Ontology • Nice decomposition • Components correspond to NCI kinds. • Large proportion of leaf and independent components. • For every entity, each E-module is strictly smaller than the whole ontology.
OWL-S Ontologies • Describe Web Services • Services: Core of the ontologies • Profiles: Capabilities of the services • Processes: internal structure of the service • Grounding: concrete message structures
OWL-S Ontologies • Intuitive domains are further decomposed • Presence of unused information
NASA’s SWEET-JPL • NASA’s terminology in Earth and Environmental Sciences • Ontologies: • EarthRealm: ocean, earth, atmosphere,… • Non-Living substances: particles, radiation,… • Living substances: plants, animals,.. • Physical Processes: diffusion, evaporation,… • Physical Properties: temperature, pressure,… • Units: pounds, miles,… • Time: • Space: • Human Activities: Commerce, Research,… • Data: Data storage, format,…
NASA’s SWEET-JPL • Each ontology within SWEET-JPL models a domain which is conceptually disjoint from the rest. • Intuitive scenario for E-Connections