
Guide to a Repeatable Process for Ontology Creation (v 0.1)

Draft Copy FOUO. Point of Contact: Bill Mandrick, Ph.D., MBO Partners, (585) 721-7599, william.mandrick@us.army.mil.


Presentation Transcript


  1. Guide to a Repeatable Process for Ontology Creation (v 0.1). Draft Copy FOUO

  2. Guide to a Repeatable Process for Ontology Creation. Point of Contact: Bill Mandrick, Ph.D., MBO Partners, (585) 721-7599, william.mandrick@us.army.mil

  3. Repeatable Process for Ontology Creation

     Purpose

     The purpose of this guide is to provide a standardized process for creating an accurate and consistent domain representation, also known as an ontology. An accurate and consistent ontology, aimed at representing some portion of reality, necessarily contains accurate and consistent semantics.

     So what is an ontology used for? Most uses relate in some way to the problem of coping with the large amount of information being generated in a given field (e.g. Civil Information Management, Stability Operations, Logistics, Position Reporting, Contact Reporting). Another reason to create an ontology is to achieve a better understanding of some domain (e.g. Command and Control, Intelligence Preparation of the Operational Environment, Enemy Situation Reporting).

     A common set of ontologies, maintained by domain experts committed to tested best practices and vetted by a community of authorities under a well-documented governance process, can contribute to understanding and coping with prolific information in a number of ways. They include:

     • Improved understanding of the domain itself: Using ontologies to represent the types of entities and events in a domain greatly improves the IT development community's understanding of that domain.
     • Improved reusability: A perspective-neutral approach, as contrasted with an application-centric approach, means that the ontologies are designed to be reusable by a large and varied community of users. Perspective-centric approaches to creating ontologies or models result in data silos (e.g. to a Targeting Officer everything in the operational environment is a “Target”).

  4. Repeatable Process for Ontology Creation

     • Improved discoverability: A repeatable process for ontology creation makes it easier for groups to discover and understand the data assets of other groups, thereby reducing the number of redundant efforts and increasing the collaborative use, and thus the value, of data and software tools.
     • Semantic interoperability (semantic consistency): A repeatable process for ontology creation, along with an effective governance process, can bring about a network effect in which the value of each ontology increases exponentially as more people use it to describe their respective data. A standard process for ontology creation is absolutely necessary for consistent semantics. Fortuitous interoperability (i.e. interoperability by good fortune or chance) is the best we can hope for when disparate communities employ their own idiosyncratic techniques for creating domain semantics and representations.

     Defining Ontology

     Ontologies are often mischaracterized as a type of Conceptual Data Model, when in fact they are intended to represent (i.e. model) portions of reality, not concepts or data about reality. Although the proposed process relies upon subject matter expertise in a given domain (e.g. Logistics, Operations, Tactics, Intelligence, Forensics), it does not result in an idiosyncratic (perspective-centric) product. Instead, the role of the Subject Matter Expert (SME) is to provide accurate (true) statements about the domain at hand, statements which are perspective-neutral.

     Who practices ontology?

     Everyone practices ontology, especially when faced with a new and unfamiliar situation where a response is necessary. As we observe an unfolding situation and try to make sense of it (i.e. orient ourselves to it), we naturally look for the relations that exist between the entities and events that make up that situation. In some cases we have to create a new lexicon (e.g. the Improvised Explosive Device Lexicon). In short, the practice of ontology is a form of sense-making, and is a natural human activity.

  5. Ontology Defined

     Good ontology and good modeling...can be advanced by the cultivation of a discipline that is devoted precisely to the representation of entities as they exist in reality...[1]

     An ontology is a representation of some part of reality (e.g. medicine, social reality, physics). Smith states that: “Ontology is the science of what is, of the kinds and structures of objects, properties, events, processes and relations in every area of reality...Ontology seeks to provide a definitive and exhaustive classification of entities in all spheres of being.”[2]

     Ontologies enable the formulation of robust and shareable descriptions of a given domain by providing a common controlled vocabulary for doctrine writers, IT developers, and war-fighters alike, thereby allowing these disparate communities to communicate with each other. An ontology should be a shared resource between communities, and its continued collaborative development should support the integration of information and facilitate knowledge discovery.[3] These two goals are realized by ensuring wide dissemination of the ontology, so that it will be used by many stakeholders and its terms will be correspondingly familiar and readily used for search.

     [1] Barry Smith, Beyond Concepts: Ontology as Reality Representation, forthcoming in Achille Varzi and Laure Vieu (eds.), Proceedings of FOIS 2004, International Conference on Formal Ontology and Information Systems, Turin, 4-6 November 2004.
     [2] Preprint version of chapter “Ontology”, in L. Floridi (ed.), Blackwell Guide to the Philosophy of Computing and Information, Oxford: Blackwell, 2003, 155–166. http://ontology.buffalo.edu/smith/articles/ontology_pic.pdf
     [3] Judith Blake, Bio-Ontologies - Fast and Furious, Nature Biotechnology, volume 22, number 6, June 2004. http://www.nature.com/nbt/index.html

  6. Ontology Defined

     One challenge that faces many ontology development projects is that there is little guidance on how best to develop ontologies. In this guide, we sketch out a repeatable ontology modeling process that is designed to encapsulate ontology best practices and design patterns in order to improve the quality of ontology development efforts and transfer ontology development knowledge and skills to a broader base of modelers. The process is broken down into five major activities:

     1) Scope the domain
     2) Create initial lexicon
     3) Create initial ontology
     4) Verify and revise ontology
     5) Publish ontology to potential users

     The result of this process is a reality-centric ontology, developed using the Web Ontology Language (OWL)[4], that extends from a Common Upper Ontology such as BFO or UCore-SL. In what follows, these activities will be broken down and explained so that the ontology developer(s) can proceed with confidence, even into a domain that may be unfamiliar to them. An accurate ontology not only provides superior semantics and an accurate representation of a given domain; it also instills in developers and users the confidence that comes from truly understanding a domain.

     [4] http://www.w3.org/TR/owl2-overview/

  7. Perspective Neutrality & Realism

     The Repeatable Process for Ontology Creation begins with a “Perspective Neutral” view of the domain being represented or modeled. The intent is to avoid an idiosyncratic (i.e. perspective-laden) approach to representing a domain. The basic idea is that different perspectives describe different portions of the same reality, and often end up creating stove-pipe ontologies that are semantically incompatible with other ontologies. For example, an infantry perspective may describe an armored infantry vehicle very differently from a logistics perspective or a targeting perspective: the logistics planner may describe an armored vehicle as “cargo” while the targeting officer may describe the same armored vehicle as a “target”. This happens because community-specific ontologies are often built bottom-up, with little thought to how others outside the community might reuse or understand them.

     In order to make community-specific ontologies interoperable with one another, it is practical to use a perspective-neutral common upper level from which the community-specific ontologies extend. The upper-level ontology provides the basis for a shared understanding across multiple communities and makes it possible to identify inconsistencies. For example, a Targeting Officer maintains a targeting perspective, which results in the categorization of buildings, vehicles, and people as all being “targets”. Likewise, a logistics planner maintains a logistics perspective, which results in the categorization of buildings as “facilities”, vehicles as “cargo”, and people as “passengers”. It is precisely the perspective-centric approach to creating data models and taxonomies that results in data silos.

     The repeatable process for ontology creation employs a number of ontological distinctions from the outset that are designed to overcome many of the limitations of perspective-centric approaches. For example, the proposed process fastidiously distinguishes between roles and types. “Building,” “vehicle” and “person” are types. “Target,” “cargo” and “passenger” are roles. Roles are context- and time-sensitive. In one context a vehicle may be a target, and in another context the same vehicle may be cargo. The way to represent these sorts of facts is to say that a vehicle is in the target role for some temporal period (see next page for a graphic representation which distinguishes between types and roles).

  8. Types and Roles

     [Figure: the type Person linked by has_role to the roles Civilian, Key Leader, Combatant, Insurgent and Commander.]

     Figure 1: It is important to distinguish between Types and Roles. Person is a Type and Key Leader is a Role that some person can be in. A Person can be in a variety of Roles, and in some cases a Person can be in several Roles at the same time.
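The type/role distinction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entity` and `RoleAssignment` classes, the integer time stamps, and the instance names are hypothetical and do not come from the guide, BFO, or UCore-SL. The point it shows is that an entity's type is fixed while its roles are attached for bounded periods and may overlap.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RoleAssignment:
    """A role held for a bounded period (hypothetical structure)."""
    role: str
    start: int                  # inclusive start, e.g. a day number
    end: Optional[int] = None   # exclusive end; None means still held


@dataclass
class Entity:
    """An instance of a type. The type is fixed; roles come and go."""
    name: str
    type_name: str
    roles: List[RoleAssignment] = field(default_factory=list)

    def add_role(self, role: str, start: int, end: Optional[int] = None) -> None:
        self.roles.append(RoleAssignment(role, start, end))

    def roles_at(self, t: int) -> List[str]:
        """Return the roles this entity is in at time t, sorted by name."""
        return sorted(
            r.role for r in self.roles
            if r.start <= t and (r.end is None or t < r.end)
        )


# Illustrative instances (names and periods are made up).
vehicle = Entity("Vehicle-1", "Vehicle")
vehicle.add_role("Cargo", start=0, end=10)     # shipped during days 0..9
vehicle.add_role("Target", start=20, end=25)   # targeted during days 20..24

person = Entity("Person-1", "Person")
person.add_role("Civilian", start=0)
person.add_role("Key Leader", start=5)         # two roles at the same time

print(vehicle.type_name, vehicle.roles_at(3))
print(vehicle.type_name, vehicle.roles_at(22))
print(person.type_name, person.roles_at(7))
```

Because the role lives on the assignment rather than on the type, the same vehicle can be cargo in one period and a target in another without ever being reclassified.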

  9. The Repeatable Process for Ontology Creation

     [Figure: process table with the columns Inputs, Activities and Outputs. Items shown: DoD Directives, SME Guidance, User Requirements, Doctrinal Descriptions, Doctrinal Definitions, SME Feedback, Authoritative Descriptions, Doctrinal Models, IERs, Domain Definition, Domain Description, Initial List of Domain Terms, Statement of Metrics, Iterative List of Terms, List of Relations, Domain Lexicon, OWL file, Relations Schematics, SME Update, Revised/Versioned OWL files, Revised Final Briefings, Semantic Conformance Testing Summary, Revised Domain Lexicon, Domain Repository, Lexicon, Lessons Learned, Finalized OWL Files, Final Briefings, Change Request Process.]

  10. 1. Scope the Domain

  11. 1. Scope the Domain

      Scoping a specific domain is essentially a defining and boundary-setting process. The scoping activity encloses the domain within boundaries to determine which entities and events should be included. This is done by addressing very basic questions about the domain at hand. Ten example questions:

      • What is the baseline description or definition of this domain?
      • What entities make up the domain?
      • What properties do they have?
      • What are the baseline definitions for these entities?
      • What events do they participate in?
      • What are the baseline definitions for these events?
      • What outcomes are there in this domain?
      • What are the relations between entities?
      • What are the relations between events?
      • What are the relations between entities and events?

      This activity will result in a Domain Questionnaire that can be presented to the Subject Matter Experts (SMEs) for the domain. The answers to these questions will result in a baseline document as part of the Domain Scope.

  12. 1.1 SME Interaction

      • Subject Matter Experts (SMEs) are critical to the Scoping, Verification, and Revision activities in the Repeatable Process for Ontology Creation.
      • In order to properly scope a domain it is important to interact with the Subject Matter Experts. For example, to properly scope a Disease Domain the developer would need to consult an epidemiologist or some other expert in that domain. Another example would be consulting a Command and Control (C2) doctrine writer for the C2 Domain. SMEs can answer the baseline questions pertaining to the domain at hand, making them an indispensable asset for the Scoping activity.
      • Later in the Repeatable Process the SMEs will play an important role in the Verification and Revision activities.

      1.2 Identify Authoritative References (Doctrine)

      • It is also important to refer to authoritative references (doctrine) in the Scoping activity. The authoritative references and doctrine are the written expression of subject matter expertise and serve as the primary source for the ontology’s content. Therefore, it is a best practice to refer to the SME descriptions of the domain at hand and start to compile the source materials in a repository. Below is an example of a collection of authoritative references used in the creation of a Joint Operations Ontology.

  13. 1.3 Survey Authoritative References (Doctrine)

      • Because Subject Matter Expertise is expressed in various authoritative references and doctrine, it is important to conduct a thorough survey of these documents. The Repeatable Process for Ontology Development does not require the developer to become a Subject Matter Expert. However, the developer must have a good understanding of the composition of the domain, which is gained by SME interaction and doctrinal investigation.

      1.4 Create or Identify Domain Definition(s)

      • This activity starts by identifying the most basic entities and events for the domain at hand. For example, the Joint Operations Planning Domain would start with the doctrinal definition for Joint Operation Planning, which is defined as:
      • Planning activities associated with joint military operations by combatant commanders and their subordinate joint force commanders in response to contingencies and crises. Joint operation planning includes planning for the mobilization, deployment, employment, sustainment, redeployment, and demobilization of joint forces. (Joint Publication 3-0 Joint Operations)
      • This definition then needs to be decomposed into its constituent elements, which results in a rapidly expanding Joint Operation Planning Lexicon (see next page for decomposition).

  14. The above definition for Joint Operation Planning is decomposed into the following 15 elements:

      • Each of these new elements is added to the Joint Operations Planning Domain Lexicon, which will eventually contain all of the content for the ontology. Furthermore, each of these new elements must also be defined and decomposed in the same way.

  15. 1.5 Domain Description

      • The Domain Description activity follows SME Interaction, Survey of Authoritative References (Doctrine), Creation of Domain Definitions, Decomposition of Domain Definitions, and the Creation of a Domain Lexicon. At this point the developer should be able to compose a Domain Description Document, which describes the domain as an ontology or representation. The Domain Description should capture, at a minimum, the high-level entities and events for the domain, as well as their relations. It should also contain any SME descriptions as well as doctrinal definitions.

      1.6 Devise Metrics

      • Metrics for the Domain Ontology are devised with SME input. It is a best practice to devise a list of questions that the ontology must be able to answer. The ontology’s ability to answer these questions is called Coverage of the domain. An inability to answer one of these questions identifies a gap in the ontology, which must be filled with additional content. The next page lists the questions used to determine coverage of the C2 Ontology:

  16. Metrics: 20 Questions for C2-Related Domains

      • What is the baseline definition/description for this domain?
      • What are the primary activities involved in this domain?
      • What are the subordinate activities in this domain?
      • Who participates in these activities?
      • What environment do these activities take place in?
      • What are the intended outcomes of these activities?
      • What are the intended products of these activities?
      • What information is consumed in these activities?
      • Who consumes this information?
      • What information is produced by these activities?
      • Where is this information found?
      • Where is this information stored?
      • What organizations are involved in this domain?
      • How are these organizations related?
      • What do these outputs contribute to?
      • What is the relation between agents and organizations in this domain?
      • What are the ultimate goals for the domain?
      • What are the subordinate goals for the domain?
      • What larger enterprise/objective does this domain contribute to?
      • What happens if these activities fail to produce their intended outcomes?
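One way to make the Coverage metric concrete is to record, for each question, whether the ontology can currently answer it, and report the answered fraction together with the gaps. The sketch below assumes that a reviewer supplies the yes/no judgments; the `coverage` function and the review results are hypothetical, and only four of the twenty questions are shown.

```python
from typing import Dict, List, Tuple


def coverage(answers: Dict[str, bool]) -> Tuple[float, List[str]]:
    """Return (fraction of questions answered, list of unanswered questions)."""
    gaps = [question for question, answered in answers.items() if not answered]
    total = len(answers)
    fraction = (total - len(gaps)) / total if total else 0.0
    return fraction, gaps


# Hypothetical review results: True means the ontology can answer the question.
review = {
    "What are the primary activities involved in this domain?": True,
    "Who participates in these activities?": True,
    "What information is consumed in these activities?": False,
    "What organizations are involved in this domain?": True,
}

fraction, gaps = coverage(review)
print(f"Coverage: {fraction:.0%}")
for question in gaps:
    print("Gap:", question)
```

Each reported gap points to content that still has to be added to the lexicon and the ontology before the next review.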

  17. 2. Create Iterative Lexicon

  18. 2.1 Decompose Terms and Definitions

      This is a continuation of the activity described earlier, where each of the domain terms is decomposed, defined, and added to the Domain Lexicon. A handful of baseline terms can quickly grow into a rather large lexicon consisting of hundreds of terms.

      2.2 Create Ontological Definitions

      Ontological definitions consist of two parts. The first part of the definition refers to the parent class of the thing being defined (e.g. Dog is an Animal, Tsunami is a Natural Event, Car is a Vehicle). The second part of the definition describes the differentia for the thing being defined, i.e. that which makes this thing different from every other thing in its class. So a Dog is defined as:

      Dog: An Animal [parent class] which is a member of the genus Canis, probably descended from the common wolf, that has been domesticated by man since prehistoric times; occurs in many breeds [differentia from all other animals] (Merriam-Webster’s Collegiate Dictionary)

      Definitions should always be written in this format: Parent Class…Differentia
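The "Parent Class…Differentia" format lends itself to a simple record structure. The following is a minimal sketch, not a prescribed lexicon schema: `LexiconEntry` and `undefined_parents` are hypothetical names, and the differentia given for Car is made up for illustration (the guide only states that Car is a Vehicle). It also shows how parent classes that have not yet been defined can be listed as the next terms to decompose.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class LexiconEntry:
    """One lexicon term defined as parent class plus differentia."""
    term: str
    parent_class: str
    differentia: str

    def definition(self) -> str:
        article = "An" if self.parent_class[0].lower() in "aeiou" else "A"
        return f"{self.term}: {article} {self.parent_class} {self.differentia}"


def undefined_parents(lexicon: Iterable[LexiconEntry], roots: Set[str]) -> List[str]:
    """Parent classes that are neither defined in the lexicon nor upper-level roots."""
    entries = list(lexicon)
    known = {entry.term for entry in entries} | set(roots)
    return sorted({entry.parent_class for entry in entries} - known)


dog = LexiconEntry(
    term="Dog",
    parent_class="Animal",
    differentia=(
        "which is a member of the genus Canis, probably descended from the "
        "common wolf, that has been domesticated by man since prehistoric "
        "times; occurs in many breeds"
    ),
)
car = LexiconEntry(
    term="Car",
    parent_class="Vehicle",
    differentia="that is designed to carry passengers on roads",  # illustrative only
)

print(dog.definition())
# "Animal" is treated here as an already-defined root, so only "Vehicle" is reported.
print(undefined_parents([dog, car], roots={"Animal"}))
```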

  19. 2.3 Create List of Relations

      This activity results in the compilation of relations to be used in the domain ontology. The relations should be listed in the Lexicon in their own section.

      2.4 Create Evolving Lexicon

      This is a continuation of the activity described earlier, but the focus turns towards consistent (ontological) definitions, organization of the terms, and graphic depictions of the relations between entities and events in the domain (see next page for an example of a graphic depiction). The Domain Lexicon can be organized alphabetically or by some other criterion that makes more sense of the content (e.g. by categories or subjects within the domain). The Lexicon may include a section for graphic depictions of the relations between entities and events. These graphic depictions are intended to answer the questions identified in the Scoping/Metrics activity (see next page for example).

  20. Graphic Depiction of Relations

      [Figure: diagram relating Geospatial Location, Latitude, Longitude, Elevation, Altitude, Latitude Measurement, Longitude Measurement, Event, Hazardous Explosion Event, IED Detonation, Military Symbol and Symbol Code through the relations has_property, denotes, occurs_at, is_a and instance_of.]

  21. 3. Create Initial Ontology

  22. 3. Create Initial Ontology

      This activity uses content from the Domain Lexicon to create a hierarchical taxonomy, as well as a more robust domain representation that includes relations between Entities and Events.

      3.1 Extend from a Common Upper Ontology (CUO)

      The Basic Formal Ontology (BFO) and UCore-Semantic Layer (UCore-SL) are examples of Common Upper Ontologies, which consist of the most general categories of reality. These categories enable the developer to quickly organize the terms in the Domain Lexicon. Developers should become familiar with both Common Upper Ontologies in order to choose the one that is more appropriate for their work. They are available for download at:

      http://www.ifomis.org/bfo/home
      https://www.milsuite.mil/wiki/UCore-SL_Implementation_Guidance_August_2010

  23. 3.1.1 Extend to Domain Continuants (these are “Entities” in UCore-SL)

      A Continuant is defined as: an entity that exists in full at any time in which it exists at all, persists through time while maintaining its identity and has no temporal parts. These are referred to as “Entities” in the UCore-SL Ontology. Examples include: a heart, a person, the color of a tomato, the mass of a cloud, a symphony orchestra, the disposition of blood to coagulate, the lawn and atmosphere in front of our building, the capability of some military organization.

      3.1.2 Extend to Domain Occurrents (these are “Events” in UCore-SL)

      In BFO an Occurrent is defined as: an entity that has temporal parts and that happens, unfolds or develops through time. Occurrents are sometimes also called perdurants. These are referred to as “Events” in the UCore-SL Ontology. Examples of occurrents include: the life of an organism, a surgical operation, the maneuvering of a Brigade Combat Team, the most interesting part of Van Gogh's life, the flight of an artillery round.

      *The next two figures depict samples of BFO and UCore-SL.

  24. BFO Continuants & Occurrents

      Domain Ontologies extend from BFO top-level categories. If BFO is chosen as the CUO, developers need to become familiar with its content.

  25. UCore-SL Entities & Events • Entities • Information Content Entity • Analysis • Objective • Opinion • Plan • Physical Entity • Agent • Artifact • Environment • Geographic Feature • Geospatial Boundary • Geospatial Region • Information Bearing Entity • Organization • Physical Object • Property • Capability • Physical Property • Role • Event • Act • Act of Communication • Act of Observation • Criminal Act • Terrorist Act • Cyberspace Event • Economic Event • Hazardous Event • Military Event • Natural Event • Planned Event • Political Event • Social Event

  26. 3.2 Relate Continuants and Occurrents

      Relations are how we make sense of the world around us. Ontological relations allow us to think and say things such as “A house is_a type of building” or “An Infantry Company is part_of an Infantry Battalion”. Without relations, data is meaningless. Consider the elements of this simple message: 3rd Platoon is located_at grid coordinates AV 3479 8477. It is the relation “located_at” that gives meaning to the elements “3rd Platoon” and “AV 3479 8477”.

      3.2.1 Relate Continuants to Continuants (Example): Infantry Company is part_of a Battalion
      3.2.2 Relate Continuants to Occurrents (Example): Civil Affairs Team participates_in a Civil Reconnaissance
      3.2.3 Relate Occurrents to Occurrents (Example): Military Engagement is part_of a Battle

  27. 3.2.4 Relate Universals to Universals (Example): House is_a Building
      3.2.5 Relate Instances to Universals (Example): 3rd Platoon, Alpha Company participates_in Combat Operations
      3.2.6 Relate Instances to Instances (Example): 3rd Platoon, Alpha Company is located_at Forward Operating Base Warhorse
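The example statements in sections 3.2.1 through 3.2.6 all have the shape subject, relation, object, so they can be held as simple triples and queried by any of the three positions. This is a plain-Python sketch rather than OWL; the `Triple` type and `query` helper are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# The example statements from sections 3.2.1 to 3.2.6, written as triples.
triples = [
    Triple("Infantry Company", "part_of", "Battalion"),
    Triple("Civil Affairs Team", "participates_in", "Civil Reconnaissance"),
    Triple("Military Engagement", "part_of", "Battle"),
    Triple("House", "is_a", "Building"),
    Triple("3rd Platoon, Alpha Company", "participates_in", "Combat Operations"),
    Triple("3rd Platoon, Alpha Company", "located_at", "Forward Operating Base Warhorse"),
]


def query(
    data: List[Triple],
    subject: Optional[str] = None,
    relation: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples matching every position that is not None."""
    return [
        t for t in data
        if (subject is None or t.subject == subject)
        and (relation is None or t.relation == relation)
        and (obj is None or t.obj == obj)
    ]


# Where is 3rd Platoon, Alpha Company located?
for t in query(triples, subject="3rd Platoon, Alpha Company", relation="located_at"):
    print(t.obj)

# Which statements use part_of?
for t in query(triples, relation="part_of"):
    print(t.subject, "part_of", t.obj)
```

The relation is what carries the meaning: removing the middle element from any triple leaves two unrelated names, which is the guide's point about data being meaningless without relations.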

  28. 4. Revise Ontology

  29. 4. Revisions Process

      The revisions process results in a complete and accurate ontology, which contains all of the entities, events, and relations needed to represent a given domain. The intent of the revisions process is to improve, as much as possible, the ontology’s ability to answer questions about the domain at hand.

      4.1 SME Feedback

      Domain Subject Matter Experts will verify that the ontology is accurate and that it covers the domain sufficiently. SMEs work closely with the Ontologist, making revisions to the content: a change to the Lexicon will result in a change to the OWL file and vice versa. This activity focuses upon element names, their definitions, and the relations between entities and events.

  30. Although SME input is an essential part of the ontology development process, it is still important to review the ontology terms and definitions with SMEs to ensure that the content is accurate. Some SMEs may find it difficult to review ontologies, as these artifacts can be hard for the uninitiated to understand. There are a number of ways to alleviate this issue, one of which is to generate spreadsheets with the relevant information for SMEs to review. If SMEs are interested in reviewing the ontologies directly, there are a number of free ontology editors they can use, including Protégé OWL (http://protege.stanford.edu/), TopBraid Composer (http://www.topquadrant.com/) and Knoodl (http://knoodl.com/).

      4.2 Domain Coverage

      Domain coverage is determined by the ontology’s ability to answer questions about the domain at hand.

  31. 4.3 Semantic Conformance Testing

      The repeatable ontology development process also involves a number of quality control measures, referred to here as semantic conformance tests. These tests are intended to ensure that best practices are being adhered to. Examples include:

      • Run an OWL reasoner to identify inconsistencies
      • Identify cases of multiple inheritance
      • Identify classes that do not extend from the common upper-level ontology
      • Verify that every class has a preferred name and definition
      • Verify that every relation has a domain and range

      Some of these tests flag hard violations (i.e. the violation must be corrected prior to publication) and some flag soft violations (i.e. it is left to the discretion of the ontologist to determine whether the violation should be corrected).
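Several of the conformance tests listed above are structural and can be sketched without an OWL toolkit. The example below is a minimal illustration over plain dictionaries, not an OWL implementation: the data layout, the class and relation names, the `UPPER_ROOTS` set, and the assignment of "hard" versus "soft" severity to each test are all assumptions made for the example (the guide does not say which tests are hard). Running an OWL reasoner is outside the scope of this sketch.

```python
from typing import Dict, List, Optional, Set, Tuple

UPPER_ROOTS = {"Entity", "Event"}  # assumed top-level classes of the upper ontology

# Hypothetical ontology content, deliberately seeded with violations.
classes: Dict[str, dict] = {
    "Vehicle": {"parents": ["Entity"], "label": "Vehicle",
                "definition": "An Entity that transports people or goods."},
    "Watercraft": {"parents": ["Entity"], "label": "Watercraft",
                   "definition": "An Entity that travels on water."},
    "Amphibious Vehicle": {"parents": ["Vehicle", "Watercraft"],
                           "label": "Amphibious Vehicle",
                           "definition": "A Vehicle that operates on land and water."},
    "Convoy": {"parents": [], "label": "Convoy", "definition": ""},
}
relations: Dict[str, dict] = {
    "part_of": {"domain": "Entity", "range": "Entity"},
    "located_at": {"domain": "Entity", "range": None},
}


def extends_from_root(name: str, cls: Dict[str, dict], roots: Set[str],
                      seen: Optional[Set[str]] = None) -> bool:
    """True if following parent links from name reaches an upper-level root."""
    if name in roots:
        return True
    seen = set() if seen is None else seen
    if name in seen or name not in cls:
        return False  # cycle, or a parent that is not declared anywhere
    seen.add(name)
    return any(extends_from_root(p, cls, roots, seen) for p in cls[name]["parents"])


def conformance_report(cls: Dict[str, dict], rels: Dict[str, dict],
                       roots: Set[str]) -> List[Tuple[str, str]]:
    """Return (severity, message) findings; the severities are assumptions."""
    findings = []
    for name, info in cls.items():
        if len(info["parents"]) > 1:
            findings.append(("soft", f"{name}: multiple inheritance"))
        if not extends_from_root(name, cls, roots):
            findings.append(("hard", f"{name}: does not extend from the upper ontology"))
        if not info.get("label") or not info.get("definition"):
            findings.append(("hard", f"{name}: missing preferred name or definition"))
    for name, info in rels.items():
        if not info.get("domain") or not info.get("range"):
            findings.append(("hard", f"{name}: missing domain or range"))
    return findings


findings = conformance_report(classes, relations, UPPER_ROOTS)
for severity, message in findings:
    print(f"[{severity}] {message}")
```

A summary of this kind could feed the Semantic Conformance Testing Summary, with hard findings blocking publication and soft findings left to the ontologist's judgment.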

  32. 5. Publish and Share Ontology

  33. Publish and Post to Repository
