Ontology Management in CALO, a Cognitive Assistant that Learns and Organizes

Adam Cheyer, Program Director, Cognitive Computing Group, SRI International

Abstract
• CALO is one of DARPA's most ambitious efforts to develop a persistent assistant that lives with, learns from, and supports users in managing the complexities of their daily work lives. CALO is a multi-year project that unites more than 200 researchers from 25 academic and commercial organizations. Its goal is to produce a single system where learning happens "in vivo", inside an ever-evolving agent that can observe, comprehend, reason, anticipate, act, and communicate.
• In this talk, we will first provide an overview of CALO: the what, the how, the why. Next, we will discuss the engineering methods we use to develop and maintain the ontology of CALO. CALO has some unusual requirements, such as "Concept Learning", where the ontology is extended and modified "in the wild" by machine learning algorithms. Finally, we will demonstrate IRIS, a semantic desktop that serves as the office environment that integrates best with CALO. IRIS leverages many of CALO's techniques for ontology management and, being open source, provides a distributable, transparent example of the approach.

Outline
• CALO Overview (separate presentation)
• Ontology Management in CALO
  • Ontology Usage in CALO’s Architecture
  • CALO’s Unique Issues (and Solutions Attempted) for Ontology Management and Maintenance
  • In Practice
• Overview of IRIS Semantic Desktop
• Demonstration of CALO/IRIS

CALO Functions
• Schedule & Organize in Time
• Monitor & Manage Tasks
• Organize & Manage Information
• Prepare Information Products
• Observe & Mediate Interactions
• Acquire, Allocate Resources

High-level CALO Architecture
[Diagram; surviving labels: Towel, Task Registry, IRIS Office Environment]

Ontology in CALO’s Architecture
• Query Manager
  • Provides a single entry point for querying knowledge in CALO; unifies many data sources and reasoning components
• Publish Subscribe Event Framework
  • Across all cyber/physical events in CALO
• Episodic Memory (Timeline Server)
  • Records instances of events for learning
• Task Interface Registry
  • Engineered and Learned Actions in CALO
• Dialog Management
  • Used for understanding user intent and generating interactions with the user
• IRIS Office Environment
  • Rich model of user’s electronic life
• MOKB Meeting Ontology KB
  • Rich model of meeting events
• CALO Test Infrastructure (CATS)
  • Evaluates CALO’s abilities and how much learning in the wild has contributed
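The Query Manager bullet describes a single entry point over many data sources. The pattern can be sketched as below, in Python for brevity. This is an illustration only, not CALO's implementation: the class `QueryManager`, the helper `make_source`, the source names, and the triple layout are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]                      # (subject, property, value)
Source = Callable[[str, str], Iterable[Triple]]    # hypothetical source interface


class QueryManager:
    """Hypothetical single entry point that fans a query out to every registered source."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, name: str, source: Source) -> None:
        self._sources[name] = source

    def query(self, subject: str, prop: str) -> List[Tuple[str, Triple]]:
        """Return (source_name, triple) pairs, in registration order."""
        results: List[Tuple[str, Triple]] = []
        for name, source in self._sources.items():
            for triple in source(subject, prop):
                results.append((name, triple))
        return results


def make_source(triples: List[Triple]) -> Source:
    """Wrap an in-memory list of triples as a queryable source."""
    def lookup(subject: str, prop: str) -> List[Triple]:
        return [t for t in triples if t[0] == subject and t[1] == prop]
    return lookup


# Two illustrative sources that hold different facts about the same message.
qm = QueryManager()
qm.register("iris_kb", make_source([("msg1", "author", "Alice")]))
qm.register("meeting_kb", make_source([
    ("msg1", "author", "Alice Smith"),
    ("mtg1", "attendee", "Bob"),
]))
answers = qm.query("msg1", "author")
```

Returning the source name with each answer keeps disagreements between sources visible to the caller instead of silently picking one.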

CALO Ontology: Core+Office
• Core Ontology (aka CLIB)
  • Created by UT Austin: http://www.cs.utexas.edu/users/mfkb/RKF/clib.html
  • Library of generic, composable, and reusable knowledge components
  • Created before CALO and used in a variety of projects, including RKF, HALO, and AURA
  • 857 core components (as of 2005-11-14)
  • Ex: Time-Interval, Person, Organization, Message
• Office Ontology
  • Extension of Core suitable for the CALO Office domain
  • 108 office components (as of 2005-11-14)
  • Ex: Author, Vendor, ProjectLeader, ElectronicPresentationDocument
• Implemented in KM (“The Knowledge Machine”)
  • KM is a frame-based language with clear first-order logic semantics
  • It contains sophisticated machinery for reasoning, including selection by description, unification, classification, and reasoning about actions using a situations mechanism
  • http://www.cs.utexas.edu/users/mfkb/RKF/km.html

CALO’s Unique Ontology Management Issues
• Very large project with many different representation and inference needs
• 5-year project: the ontology will change. How to maintain consistency of code, data, and docs?
• Enduring Personal Cognitive Assistant: can’t forget data
• Concept & Task Learning: the ontology can be changed “in the wild” by the user and by CALO
• Uncertainty is a reality, arising from many different reasoners and predictors

Consistent Ontology Evolution: Issues and Solutions Attempted
• Issue: KM vs. OWL (tools)
  • Solution attempted: KM (master) exports to OWL
• Issue: Keeping Code, Data, and Doc in Synch
  • Solutions attempted: Documentation: “POJOs” and Human Readable Doc; Transactional POJOs
• Issue: Migrating acquired data instances forward through ontology changes from Engineering Releases
  • Solution attempted: SOUP (“Simple Ontology Update Program”) applies a system of patches to data to migrate it forward to the latest version
• Issue: Concept learning allows user ontologies to diverge. How to rationalize with engineering releases? How to validate CALO-learned changes?
  • Solutions attempted: Concept learning changes kept separate from the main Engineering “trunk”; restrict the changes allowed (add or rename properties and classes, not move or delete); Shadow ontology and validation processes
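The SOUP and "restrict changes allowed" items above can be illustrated with a small patch-based migration. The sketch below is not SOUP itself, whose patch format the slides do not show. It assumes instances are plain dictionaries carrying a `schema_version` field, that each patch advances the version by exactly one, and that only add and rename operations are accepted. `Patch`, `apply_patch`, `migrate`, and the example property names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Only additive or renaming changes are accepted; move and delete are rejected.
ALLOWED_OPS = {"add_property", "rename_property", "rename_class"}


@dataclass(frozen=True)
class Patch:
    to_version: int      # schema version the instance has after this patch
    op: str
    args: Tuple

    def __post_init__(self) -> None:
        if self.op not in ALLOWED_OPS:
            raise ValueError(f"operation not permitted: {self.op}")


def apply_patch(instance: Dict[str, object], patch: Patch) -> Dict[str, object]:
    """Return a patched copy of the instance; the input is left unchanged."""
    updated = dict(instance)
    if patch.op == "add_property":
        name, default = patch.args
        updated.setdefault(name, default)
    elif patch.op == "rename_property":
        old, new = patch.args
        if old in updated:
            updated[new] = updated.pop(old)
    elif patch.op == "rename_class":
        old, new = patch.args
        if updated.get("class") == old:
            updated["class"] = new
    updated["schema_version"] = patch.to_version
    return updated


def migrate(instance: Dict[str, object], patches: List[Patch]) -> Dict[str, object]:
    """Apply, in version order, every patch newer than the instance's version."""
    current = dict(instance)
    for patch in sorted(patches, key=lambda p: p.to_version):
        if patch.to_version > current.get("schema_version", 0):
            current = apply_patch(current, patch)
    return current


patches = [
    Patch(1, "rename_property", ("writer", "author")),
    Patch(2, "add_property", ("reviewed", False)),
    Patch(3, "rename_class", ("Doc", "Document")),
]
old = {"class": "Doc", "writer": "Alice", "schema_version": 0}
migrated = migrate(old, patches)
```

Because nothing is ever moved or deleted, every patch can be applied to old data without losing information, which fits the "can't forget data" requirement of an enduring assistant.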

Keeping Code, Data, and Docs in Synch
[Architecture diagram running from front end to back end; surviving labels, in extraction order: Front End; Action; CATS Tester; TaskMgr; MOKB; Timeline; Apps; CALO UI; Query; Data; UI; UI Separation; IRIS; Event frmwk; Action frmwk; QueryManager; To TaskMgr; QMDomainFile; PluginSvcs; Classifier; Other plugins; ClusterFramewk; HTML Doc; Java (RN); SPARQL; OntologyUsageSpec; OWLDoc; Query APIs; POJOs; Semantic Object; JENA KB; RadarNetworks KB; Lucene Full Text Index; CLIB in OWL; OWL Translator; CLIB Ontology (KM); ILR; KB1; KB2; KB3; Specialized OWL Ontologies; Back End]

CALO Concept Learning
• Concept Learning (CL) works in two steps (workflows)
• 1. Building a ‘Shadow Ontology/Knowledgebase’
  • Information harvesting
  • Validation of harvested facts
  • Validated facts are integrated into a Shadow Ontology and Knowledgebase
  • This is a longer-term process and will be done first
• 2. Real-time updating of CALO
  • Uses the Shadow Ontology and KB
  • CALO queries CL about a concept
  • CL returns one or more concepts
  • The CALO user verifies which was actually meant
  • The CALO Ontology and IRIS KB are updated
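The real-time workflow in step 2 can be sketched as a lookup, a user confirmation, and an update. The data structures here are assumptions made for illustration: the shadow ontology is a dictionary from a term to candidate (concept, parent class) pairs, the live ontology maps a parent class to its set of subclasses, and the user's choice is modelled as a callback. `suggest_concepts`, `learn_concept`, and the example concepts are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Candidate = Tuple[str, str]                      # (concept, parent class)
ShadowOntology = Dict[str, List[Candidate]]      # term -> validated candidates
Ontology = Dict[str, Set[str]]                   # parent class -> subclasses


def suggest_concepts(shadow: ShadowOntology, term: str) -> List[Candidate]:
    """CL returns zero or more candidate concepts for a term."""
    return list(shadow.get(term.lower(), []))


def learn_concept(
    ontology: Ontology,
    shadow: ShadowOntology,
    term: str,
    choose: Callable[[List[Candidate]], Optional[Candidate]],
) -> Optional[str]:
    """Query CL, let the user pick, then add the confirmed concept to the ontology."""
    candidates = suggest_concepts(shadow, term)
    if not candidates:
        return None
    picked = choose(candidates)          # user verification step
    if picked is None:
        return None                      # user rejected every candidate
    concept, parent = picked
    ontology.setdefault(parent, set()).add(concept)
    return concept


shadow = {
    "sprint": [("SprintMeeting", "Meeting"), ("SprintCompany", "Organization")],
}
ontology = {"Meeting": {"StaffMeeting"}}

# The callback stands in for the user confirming the first suggestion.
learned = learn_concept(ontology, shadow, "Sprint", lambda options: options[0])
```

Nothing reaches the live ontology unless the callback returns a candidate, which mirrors the verification step in the list above.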

Uncertainty across multiple sources
• Issues
  • When to “write” a hypothesis as “truth” into the KB?
  • Maintaining consistency
  • How to rationalize/combine hypotheses from different algorithms
  • Credit assignment problem
• Solutions Attempted
  • Year 1: Global KB; some algorithms wrote to it, some hypotheses were only accessible through APIs
  • Year 2: Provenance in the global KB: record multiple solutions and where they came from
  • Year 3:
    • Separate KBs by learning component, with “smart” queries across sources
    • Probabilistic Consistency Engine maintains a global “what CALO believes” repository
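The Year 2 and Year 3 ideas, recording provenance and deriving a single belief from competing hypotheses, can be sketched as follows. The slides do not say how the Probabilistic Consistency Engine combines evidence. The noisy-OR rule used here is a stand-in chosen for simplicity, and it assumes the sources are independent. `Hypothesis`, `BeliefStore`, and the example source names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Hypothesis:
    subject: str
    prop: str
    value: str
    source: str          # which learning component proposed it (provenance)
    confidence: float    # in [0, 1]


class BeliefStore:
    """Keeps every hypothesis with its provenance; beliefs are computed on demand."""

    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], List[Hypothesis]] = defaultdict(list)

    def record(self, h: Hypothesis) -> None:
        self._by_key[(h.subject, h.prop)].append(h)

    def provenance(self, subject: str, prop: str) -> List[Hypothesis]:
        return list(self._by_key.get((subject, prop), []))

    def believe(self, subject: str, prop: str) -> Optional[Tuple[str, float]]:
        """Combine per-value confidences with noisy-OR (an assumption) and return the best."""
        disbelief: Dict[str, float] = {}
        for h in self._by_key.get((subject, prop), []):
            disbelief[h.value] = disbelief.get(h.value, 1.0) * (1.0 - h.confidence)
        if not disbelief:
            return None
        value = min(disbelief, key=disbelief.get)
        return value, 1.0 - disbelief[value]


store = BeliefStore()
store.record(Hypothesis("email42", "project", "CALO", "classifier", 0.6))
store.record(Hypothesis("email42", "project", "CALO", "clusterer", 0.5))
store.record(Hypothesis("email42", "project", "IRIS", "extractor", 0.7))
best = store.believe("email42", "project")
```

Because hypotheses are never overwritten, the question of when to "write" a hypothesis as truth becomes a query-time decision, and the stored sources remain available for credit assignment.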

IRIS Semantic Desktop
IRIS: “Integrate. Relate. Infer. Share.”
• “Real” office applications (Mozilla, GLOW, Jabber, …)
• Plug-in Architecture (180+ plugins: UI, KB, NL, learning, apps, …)
• Semantic Object layer: Java objects on top of OWL
• Full-text & relational query (SPARQL)
• Ontology-based event and action framework
• Machine learning framework: classification, extraction, clustering, ranking, …
• LGPL Open Source: http://www.openiris.org
• Only a small subset of CALO, but should be useful for many applications and uses many of the techniques in this presentation

Questions? Adam Cheyer
