
The Suggested Upper Merged Ontology (SUMO) at Age 7: Progress and Promise

Adam Pease Articulate Software apease@articulatesoftware.com http://www.articulatesoftware.com http://www.ontologyportal.org/ http://home.earthlink.net/~adampease/professional/. The Suggested Upper Merged Ontology (SUMO) at Age 7: Progress and Promise. Presented at Ontolog


Presentation Transcript


  1. The Suggested Upper Merged Ontology (SUMO) at Age 7: Progress and Promise • Adam Pease, Articulate Software, apease@articulatesoftware.com • http://www.articulatesoftware.com • http://www.ontologyportal.org/ • http://home.earthlink.net/~adampease/professional/ • Presented at Ontolog, 6 September 2007

  2. Overview • SUMO is a large, open source, formal ontology stated in first-order logic • Mapped to a large multi-lingual lexicon • With open source tools for ontology development and application

  3. What's New • More content about social relationships, justice and law, military events-people-processes • Wikipedia (DBpedia) links • Updated mappings to WordNet 3.0 • New tests of inference and many new inference engines • SQL and XML generation tools • Many new academic and commercial uses

  4. SUMO Prize - 2007 • US$3000.00 • Due December 1, 2007 • Entries must be open source SUO-KIF files that extend SUMO • Judged on several criteria: • Degree of formalization • Scope and coverage • Coherent new topic or domain • Actual utility in an application

  5. Pursuit of Rigor in Data Standards (thanks to Steve Ray, NIST) • Old-style (most common) standards specifications (ISO 14258, Requirements for enterprise-reference architectures and methodologies): “3.6.1.1 Time representation. If an individual element of the enterprise system has to be traced then properties of time need to be modeled to describe short-term changes. If the property time is introduced in terms of duration, it provides the base to do further analyses (e.g., process time). There are two kinds of behavior description relative to time: static and dynamic.” • Data-model standards (ISO 10303-41, Product Description and Support): ENTITY product_context SUBTYPE OF (application_context_element); discipline_type : label; END_ENTITY; • Semantic-model standards (IEEE P1600.1 - SUMO; ISO 18629-11, PSL Core): (forall (?t1 ?t2 ?t3) (=> (and (before ?t1 ?t2) (before ?t2 ?t3)) (before ?t1 ?t3)))
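The `before` axiom on the semantic-model line of this slide states that temporal precedence is transitive. As a rough illustration of what such an axiom lets a machine conclude, here is a minimal Python sketch. It is not taken from the talk and is not how a theorem prover works internally; the function name `transitive_closure` and the time points `t1`, `t2`, `t3` are hypothetical.

```python
def transitive_closure(pairs):
    """Return every (a, c) pair derivable from `pairs` by repeatedly applying
    the rule: before(a, b) and before(b, c) implies before(a, c)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical asserted facts: t1 is before t2, and t2 is before t3.
asserted = {("t1", "t2"), ("t2", "t3")}
derived = transitive_closure(asserted)

# The pair (t1, t3) was never asserted; it follows from the axiom alone.
print(("t1", "t3") in derived)
```

A prose specification or a data-model entity gives a program nothing comparable to compute with, which seems to be the contrast the slide draws.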

  6. Terms and Concepts • [Diagram: the Ogden and Richards triangle of meaning, with corners Term, Concept, and Referent (example: “Orange”) and sides labeled Symbolizes, Refers To, and Stands For] • Ontology work should be here, since logic is needed to substitute for human thought. • Lots of “ontology” work has really been here. • C.K. Ogden/I.A. Richards, The Meaning of Meaning: A Study in the Influence of Language upon Thought and The Science of Symbolism, London 1923, 10th edition 1969 • Slide adapted from (c) Key-Sun Choi for Pan Localization 2005 from the slide of [Bargmeyer, Bruce, Open Metadata Forum, Berlin, 2005]

  7. Imagine...your view of the web • CV • name: Joe Smith • education: BS Case Western Reserve, 1982; MS UC Davis, 1984 • work: 1985-1990 ACME Software, programmer • private: Married, 2 children

  8. ...and the Computer's View • CV • name • education • work • private • Thanks to Frank van Harmelen for the original idea of this slide and Peter Yim for the Chinese language content

  9. But wait, we've got XML - • <job name="Joe Smith" title="Programmer">

  10. But wait, we've got XML - • <job name="Joe Smith" title="Programmer"> • <x83 m92="|||||||||" title="..............">

  11. But wait, we've got Taxonomies - • Mammal • Person • JoeSmith

  12. But wait, we've got Taxonomies - • x931 • o4839 • i3729

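  13. Wait, we've got semantics - • Person is a subclass of Mammal and JoeSmith is an instance of Person • implies • JoeSmith is an instance of Mammal

  14. Wait, we've got semantics - • Person is a subclass of Mammal and JoeSmith is an instance of Person • implies • JoeSmith is an instance of Mammal • The computer's view of the same inference: u8475 r22 x9834 and p3489 r53 u8475 • implies • p3489 r53 x9834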
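The two semantics slides make the point that the subclass/instance inference goes through whether the symbols are readable (Mammal, Person, JoeSmith) or opaque (x9834, u8475, p3489). A small Python sketch of that idea follows. The function name `infer_instances` is hypothetical, and for brevity the sketch hard-wires the single rule "instance of a subclass implies instance of the superclass" rather than reading it from axioms.

```python
def infer_instances(subclass_facts, instance_facts):
    """Apply: instance(x, C) and subclass(C, D) implies instance(x, D),
    repeating until no new instance facts appear."""
    facts = set(instance_facts)
    changed = True
    while changed:
        changed = False
        for (x, cls) in list(facts):
            for (sub, sup) in subclass_facts:
                if cls == sub and (x, sup) not in facts:
                    facts.add((x, sup))
                    changed = True
    return facts


# Human-readable symbols, as on the first semantics slide.
readable = infer_instances({("Person", "Mammal")}, {("JoeSmith", "Person")})
print(("JoeSmith", "Mammal") in readable)

# Opaque symbols, as on the second slide: the procedure is unchanged,
# because it never depends on what the names mean to a human.
opaque = infer_instances({("u8475", "x9834")}, {("p3489", "u8475")})
print(("p3489", "x9834") in opaque)
```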

  15. Semantics Helps a Machine Appear Smart • A “smart” machine should be able to make the same inferences we do • (let's not debate the AI philosophy about whether it would actually be smart)

  16. Definitions • An ontology is a shared conceptualization of a domain • An ontology is a set of definitions in a formal language for terms describing the world

  17. Frames • Object- or term-centered • Frames, slots, values, (and attributes) • Example frame: Adam: Person • height 5'8” • cardinality: 1 • occupation consultant

  18. Frame Restrictions • b is between a and c • (between1 a betweenness1) • (between2 b betweenness1) • (between3 c betweenness1) • vs • (between a b c) • Adam is not an accountant • (notOccupation Adam Accountant) • vs • (not (occupation Adam Accountant)) • Existential vs. Universal quantification • Similar problems for many description logics • Very efficient computation however
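The "between" example on the Frame Restrictions slide shows what happens when a language only allows binary slots: one ternary statement has to be split into three statements about an extra, artificial object. The following Python sketch mirrors that encoding and its inverse. The function names `reify` and `unreify` are hypothetical; the slot naming (between1, between2, between3) and the node name betweenness1 follow the slide.

```python
def reify(relation, args, node):
    """Encode one n-ary statement as n binary-slot statements about `node`."""
    return [(f"{relation}{i}", arg, node) for i, arg in enumerate(args, start=1)]


def unreify(relation, statements, node):
    """Recover the n-ary statement from its binary-slot encoding."""
    slots = {}
    for slot, arg, target in statements:
        if target == node and slot.startswith(relation):
            position = int(slot[len(relation):])
            slots[position] = arg
    return (relation,) + tuple(slots[i] for i in sorted(slots))


# (between a b c) in a language with n-ary relations...
encoded = reify("between", ["a", "b", "c"], "betweenness1")
for statement in encoded:
    print(statement)

# ...and the round trip back to the single ternary statement.
print(unreify("between", encoded, "betweenness1"))
```

The round trip works, but every consumer of the frame data has to know the slot-numbering convention, which is the kind of lost clarity the slide contrasts with the direct form (between a b c).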

  19. Digression: Implementation is Different from Representation • Why lose meaning at design time just because of runtime issues? • We can’t reason with English definitions, but that doesn’t mean we shouldn’t document our terms • Many different implementations may be done from the same representation • This does not mean that run time issues should be ignored at design time • If you represent information you know can’t be reasoned with, it had better not be essential in most conceivable applications

  20. Many Ways to Use Ontology • As an information engineering tool • Create a database schema • Map the schema to an upper ontology • Use the ontology as a set of reminders for additional information that should be included • As more formal comments • Define an ontology that is used to create a DB or OO system • Use a theorem prover at design time to check for inconsistencies • For taxonomic reasoning • Do limited run-time inference in Prolog, a description logic, or even Java • For first order logical inference • Full-blown use of all the axioms at run time

  21. Upper Ontology • An attempt to capture the most general and reusable terms and definitions • Provokes thought on clarifying the meaning of more specific terms • Provides for large-scale reuse

  22. Ontology vs Language and Knowledge • Ontology: expandable, language independent, machine understandable • Language: understood by humans, ambiguous • Knowledge: changes rapidly, may be local to an entity

  23. Suggested Upper Merged Ontology • 1000 terms, 4000 axioms, 750 rules • Mapped by hand to all of WordNet 1.6 • then ported to 3.0 • Development begun in 2000 • US Government small business grant • Associated domain ontologies totalling 20,000 terms and 70,000 axioms • Free • SUMO is owned by IEEE but basically public domain • Domain ontologies are released under GNU • www.ontologyportal.org

  24. SUMO (continued) • Formally defined, not dependent on a particular implementation • Open source toolset for browsing and inference • http://sigmakee.sourceforge.net • Many uses of SUMO (independent of the SUMO authors and funders) • http://www.ontologyportal.org/Pubs.html

  25. SUMO Validation • Mapping to all of WordNet lexicon • A check on coverage and completeness (at a given level of generality) • Peer review • Open source since its inception • Formal validation with a theorem prover • Free of contradictions (within a generous time bound for search) • Application to dozens of domain ontologies

  26. WordNet • Lexical database • 100,000 word senses – synsets • Created by George Miller's group at Princeton • Free • De facto standard in the linguistics world

  27. WordNet to SUMO Mapping • WordNet synset “plant, flora, plant_life” is equivalent to the formal SUMO term 'Plant' • 00008864 03 n 03 plant 0 flora 0 plant_life 0 027@ . . . | a living organism lacking the power of locomotion &%Plant= • SUMO has axioms that explain formally what a plant is: (=> (and (instance ?SUBSTANCE PlantSubstance) (instance ?PLANT Organism) (part ?SUBSTANCE ?PLANT)) (instance ?PLANT Plant))
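The mapping line on this slide is a WordNet data-file entry with a SUMO annotation appended: `&%` introduces the SUMO term, and the trailing `=` marks the synset as equivalent to it. A minimal Python sketch of pulling those pieces out is shown here. The function name `parse_mapping` and the returned field names are hypothetical, the sample line is copied from the slide with its elided middle (`. . .`) left untouched, and only the `=` suffix is interpreted because it is the only one the slide explains.

```python
import re

# "&%" + SUMO term + one suffix character, at the end of the line.
MAPPING = re.compile(r"&%(?P<term>[A-Za-z0-9_]+)(?P<suffix>\S)\s*$")


def parse_mapping(line):
    """Extract synset offset, part of speech, gloss and SUMO mapping from one line."""
    match = MAPPING.search(line)
    if match is None:
        return None
    fields = line.split()
    gloss = ""
    if "|" in line:
        gloss = line[line.index("|") + 1 : match.start()].strip()
    suffix = match.group("suffix")
    return {
        "offset": fields[0],          # WordNet synset offset
        "pos": fields[2],             # part-of-speech tag, "n" for noun
        "gloss": gloss,
        "sumo_term": match.group("term"),
        "suffix": suffix,
        # Only "=" is explained on the slide; other suffixes stay uninterpreted.
        "relation": "equivalent" if suffix == "=" else "uninterpreted",
    }


line = ("00008864 03 n 03 plant 0 flora 0 plant_life 0 027@ . . . "
        "| a living organism lacking the power of locomotion &%Plant=")
print(parse_mapping(line))
```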

  28. WordNet to SUMO Mapping • Most nouns map to classes • Most verbs map to subclasses of &%Process • Most adjectives map to a &%SubjectiveAssessmentAttribute • Most adverbs map to relations of &%manner

  29. Internationalization • Translation of SUMO paraphrases into multiple, diverse languages • Some confidence there’s no cultural or linguistic bias • Chinese, Hindi, Tagalog, Czech, German, Italian, Korean, Romanian, Arabic • Estonian and Hungarian in development • SUMO is linked to multiple very large lexicons (EuroWordNet, BalkaNet, HowNet, etc.) • English, Chinese, Italian, Arabic

  30. SUMO Structure • Structural Ontology • Base Ontology • Set/Class Theory • Numeric • Temporal • Mereotopology • Graph • Measure • Processes • Objects • Qualities

  31. SUMO+Domain Ontology • SUMO: Structural Ontology, Base Ontology, Set/Class Theory, Numeric, Temporal, Mereotopology, Graph, Measure, Processes, Objects, Qualities • Mid-Level • Domain ontologies (diagram labels): WMD Transnational Issues Financial Ontology Geography ECommerce Services Communications Distributed Computing Government People Military Terrorist Attack Types Terrorist Transportation Economy Biological Viruses Terrorist Attacks UnitedStates Elements NAICS Afghanistan France World Airports … • Totals: 20399 terms, 67108 axioms, 2500 rules

  32. Are SUMO Terms Directly Usable? • Yes. • Study – 1/3 of upper ontology terms directly appear in answers on large test • Cohen, P., Chaudhri, V., Pease A., and Schrag, R. (1999), Does Prior Knowledge Facilitate the Development of Knowledge Based Systems, In Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI-1999). Menlo Park, Calif.: AAAI Press. http://home.earthlink.net/~adampease/professional/cohen-aaai99.ps • before (in time), agent (of a process), etc.

  33. High Level Distinctions • The first fundamental distinction is that between ‘Physical’ (things which have a position in space/time) and ‘Abstract’ (things which don’t) • Entity: Physical, Abstract

  34. High Level Distinctions • Partition of ‘Physical’ into ‘Objects’ and ‘Processes’ • Physical: Object, Process

  35. Objects Object SelfConnectedObject Substance CorpuscularObject Region Collection

  36. Processes (class hierarchy diagram) • IntentionalProcess IntentionalPsychologicalProcess RecreationOrExercise OrganizationalProcess Guiding Keeping Maintaining Repairing Poking ContentDevelopment Making Searching SocialInteraction Maneuver • Motion BodyMotion DirectionChange Transfer Transportation Radiating • DualObjectProcess Substituting Transaction Comparing Attaching Detaching Combining Separating • InternalChange BiologicalProcess QuantityChange Damaging ChemicalProcess SurfaceChange Creation StateChange ShapeChange

  37. Abstract SetOrClass Relation Proposition Quantity Number PhysicalQuantity Attribute Graph GraphElement

  38. Case Roles • Roles that entities play in a Process • agent, patient, instrument etc.

  39. Case Roles • “Brutus stabbed Caesar with a knife on Tuesday.” • [Diagram: A Stabbing, with agent Brutus, patient Caesar, instrument A Knife, and time A Tuesday]

  40. Case Roles • “Brutus stabbed Caesar with a knife on Tuesday.” • (exists (?S ?K ?T) (and (instance ?S Stabbing) (instance ?K Knife) (instance ?T Tuesday) (agent ?S Brutus) (patient ?S Caesar) (time ?S ?T) (instrument ?S ?K)))
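The case-role formula for the Brutus sentence describes an event as an object in its own right, with participants attached through role relations. A minimal Python sketch of storing and querying such an event is given below. The names Stabbing1, Knife1 and Tuesday1 are hypothetical constants standing in for the existentially quantified variables ?S, ?K and ?T, and the helper functions `find_processes` and `case_roles` are illustrative only.

```python
# Ground version of the slide's formula, one tuple per conjunct.
FACTS = {
    ("instance", "Stabbing1", "Stabbing"),
    ("instance", "Knife1", "Knife"),
    ("instance", "Tuesday1", "Tuesday"),
    ("agent", "Stabbing1", "Brutus"),
    ("patient", "Stabbing1", "Caesar"),
    ("time", "Stabbing1", "Tuesday1"),
    ("instrument", "Stabbing1", "Knife1"),
}


def find_processes(facts, process_class, **roles):
    """Return processes of `process_class` whose case roles match `roles`."""
    candidates = sorted(
        s for (rel, s, o) in facts if rel == "instance" and o == process_class
    )
    return [
        p for p in candidates
        if all((role, p, filler) in facts for role, filler in roles.items())
    ]


def case_roles(facts, process):
    """Return a role -> filler mapping for one process."""
    return {rel: o for (rel, s, o) in facts if s == process and rel != "instance"}


# "Which stabbings had Caesar as patient, and who was the agent?"
for process in find_processes(FACTS, "Stabbing", patient="Caesar"):
    print(process, case_roles(FACTS, process)["agent"])
```

Because the event is an object, further roles such as a location can be added later without changing the arity of any relation.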

  41. Attributes and Roles • (attribute JohnDoe Unemployed) • (attribute GIJane Soldier) • (attribute MyCar Blue)

  42. Example Rules • (=> (instance ?DRIVE Driving) (exists (?VEHICLE) (and (instance ?VEHICLE Vehicle) (patient ?DRIVE ?VEHICLE)))) • “If there's an instance of Driving, there's a Vehicle that participates in that action.” • Not just an English definition for humans to read, but a logical definition that can be used in proofs.
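One modest way to use the Driving rule, in the spirit of the "more formal comments" usage mentioned earlier in the talk, is to check a set of recorded facts against it. The Python sketch below does that. It is a closed-world data check, which is an assumption of this example: a first-order theorem prover would instead conclude that some unnamed vehicle exists. The function name `drives_without_vehicle` and the sample facts are hypothetical.

```python
def drives_without_vehicle(facts):
    """Return Driving instances that have no recorded patient that is a Vehicle."""
    drives = {s for (rel, s, o) in facts if rel == "instance" and o == "Driving"}
    vehicles = {s for (rel, s, o) in facts if rel == "instance" and o == "Vehicle"}
    missing = []
    for drive in sorted(drives):
        patients = {o for (rel, s, o) in facts if rel == "patient" and s == drive}
        if not (patients & vehicles):
            missing.append(drive)
    return missing


# Hypothetical data: Drive1 names its vehicle, Drive2 does not.
facts = {
    ("instance", "Drive1", "Driving"),
    ("instance", "Car1", "Vehicle"),
    ("patient", "Drive1", "Car1"),
    ("instance", "Drive2", "Driving"),
}
print(drives_without_vehicle(facts))
```

Here the rule acts as a reminder of information that should be present, rather than as a premise in a proof.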

  43. Commercial Application • One year project for Articulate Software • Working with a company that creates financial transaction systems for royalty payments • Re-engineer current ontology management business process, tools and ontology

  44. Commercial Application • Extensive current ontology • Captured in spreadsheets • Local term names and definitions for every customer • An essential part of their process • Ontology management system that exports XML & RDF • One end-user database is nearly 3GB • Ontology functions can be run as batch processes

  45. Project Goals • To add formality to existing model • To support full logical inference, consistency checks • Give customers user-friendly ontology editor • so that they can maintain the ontology • Create broader set of definitions • Enable greater DB integration • Enable expansion into new markets • Leverage work • Exercise SUMO and Sigma in business

  46. Initial Tasks • Implement UI improvements to Sigma • Simplified tree-based editor • Simplified frame-style browser • XML/SQL ontology export • Uses meta-predicates for physical DB structure • Merge existing ontology with SUMO

  47. DBpedia • “People” content uses FOAF • Lightweight, redundant, ad-hoc • Only a tiny portion is used • birthdate, deathdate, birthplace, deathplace, names, firstname, lastname • http://xmlns.com/foaf/spec/ • 16MB KIF content http://www.ontologyportal.org/content/DBPediaPeople.zip • Recent announcement that DBpedia is now mapped to WordNet • Which gets us links to SUMO

  48. TPTP • Research effort in automated theorem proving • 40+ different first order logic provers • Annual competition • Thousands of test problems • We will issue SUMO-based tests in TPTP format next month • Sigma connected to TPTP prover suite
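To give a feel for what issuing tests "in TPTP format" involves, the Python sketch below rewrites a small KIF formula into TPTP first-order form (FOF) syntax, where variables start with an upper-case letter, `!` and `?` are the quantifiers, and `&`, `|`, `=>`, `~` are the connectives. This is an illustration only and does not reproduce the translation Sigma performs. The function names, the `V_` variable prefix, and the lower-casing of constants are assumptions of this sketch, and it handles only the connectives listed.

```python
def tokenize(text):
    """Split a KIF string into parentheses and symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Build a nested list from a token list (consumes the list)."""
    token = tokens.pop(0)
    if token != "(":
        return token
    items = []
    while tokens[0] != ")":
        items.append(parse(tokens))
    tokens.pop(0)  # drop the closing parenthesis
    return items


def symbol(name):
    """KIF ?var becomes an upper-case TPTP variable; other names start lower-case."""
    if name.startswith("?"):
        return "V_" + name[1:].upper()
    return name[0].lower() + name[1:]


def to_fof(expr):
    """Translate a parsed KIF expression into a TPTP FOF formula string."""
    if isinstance(expr, str):
        return symbol(expr)
    head, *args = expr
    if head in ("forall", "exists"):
        quantifier = "!" if head == "forall" else "?"
        variables = ",".join(symbol(v) for v in args[0])
        return f"{quantifier}[{variables}]: {to_fof(args[1])}"
    if head == "=>":
        return f"({to_fof(args[0])} => {to_fof(args[1])})"
    if head in ("and", "or"):
        operator = " & " if head == "and" else " | "
        return "(" + operator.join(to_fof(a) for a in args) + ")"
    if head == "not":
        return f"~({to_fof(args[0])})"
    return f"{symbol(head)}({','.join(to_fof(a) for a in args)})"


kif = "(forall (?t1 ?t2 ?t3) (=> (and (before ?t1 ?t2) (before ?t2 ?t3)) (before ?t1 ?t3)))"
formula = to_fof(parse(tokenize(kif)))
print(f"fof(before_transitive, axiom, {formula}).")
```

Lower-casing the first letter of every constant can make distinct KIF names collide, so a real translation would need a safer renaming scheme.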

  49. Controlled English to Logic Translation • Automated translation from English to Logic • Uses WordNet-SUMO mappings for 100,000 word sense vocabulary • Domain-independent • Development process • Start with a highly restricted language and gradually add linguistic features
