Chimaera is a powerful tool for diagnosing, refining, and merging knowledge bases. It supports reasoning analysis, bug detection, and content structuring for efficient KB evolution. Integrated into SHAKEN, it aids in diagnosing and repairing KB issues.
Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge) Mala Mehrotra (Pragati Synergetic Research Inc.) Yolanda Gil (USC ISI) Deborah McGuinness (Stanford KSL) 10/18/01
Knowledge Evolution Tools • KB development requires knowledge evolution Debugging, refining, structuring, modularizing, … • Power tools are needed to support KB evolution • KB diagnosis • Bugs, omissions, heuristic warnings, architectural advice • KB partitioning • To enable effective reasoning • To produce reusable KB building blocks • KB merging • To enable interoperation of KBs with overlapping content • KSL is developing knowledge evolution tools
Chimaera • A Knowledge Evolution Tool Environment • Tools for KB diagnosis and merging • Available as a Web service or an OKBC client • www.ksl.stanford.edu/software/chimaera • Usable from a Web browser • Online user manual, tutorial, and demonstration movie • Performs KB diagnostics in batch mode • Uploads and analyzes user’s KB • Accepts KBs in OKBC, KIF, MELD, RDF, DAML, … • Provides results as HTML pages linked to frames and axioms • Provides user selectable set of diagnostic tests • Analyzes both the structure and content of a KB • Uses reasoners to analyze content
Classification of Diagnostic Results • Errors • Logical inconsistencies E.g., contradictory type constraints • Content structure errors E.g., terms used but not defined • Anomalies • Missing information E.g., type constraints • Redundancies E.g., redundant superclass and type links • Extraneous structure or content E.g., terms defined but not used • Summaries E.g., counts of term references • Suggestions E.g., use consistent naming conventions
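Two of the structural checks named on this slide (terms used but not defined, terms defined but not used) can be sketched over a toy KB. This is a minimal illustration, not Chimaera's implementation: the dictionary representation, the function name `structure_diagnostics`, and the sample terms are all assumptions made for the example.

```python
from typing import Dict, List, Set


def structure_diagnostics(kb: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Classify simple structural findings for a toy KB.

    `kb` maps each defined term to the set of terms its definition mentions
    (a hypothetical representation chosen for this sketch).
    """
    defined = set(kb)
    used: Set[str] = set()
    for mentioned in kb.values():
        used |= mentioned
    return {
        # Error: content structure error, term used but not defined
        "errors": sorted(used - defined),
        # Anomaly: extraneous content, term defined but not used
        "anomalies": sorted(defined - used),
        # Summary: simple counts
        "summaries": [f"{len(defined)} terms defined, {len(used)} terms referenced"],
    }


toy_kb = {
    "Pump": {"Device"},
    "Device": {"Thing"},
    "Valve": {"Device", "Fluid"},  # "Fluid" is never defined
    "Thing": set(),
}
report = structure_diagnostics(toy_kb)
for category, findings in report.items():
    print(category, findings)
```

In this toy KB, `Fluid` is reported as an error, while `Pump` and `Valve` are reported as anomalies because nothing refers to them.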
“Background” Reasoning Analysis • Reasoning diagnostics that may take substantial time • Performed in background • Results incrementally posted on Web page • Completion notification sent to user via e-mail • Example reasoning diagnostics • Redundant axioms that are inferred by the KB (anomaly) • Inconsistent axioms whose negations are inferred by the KB (error) • Determine which relations in KB are primitive and non-primitive (summary) • Show relations on which each non-primitive relation depends • Determine classes that are disjoint (suggest adding results to KB) • Derive subclass and instance links (suggest adding links to KB) I.e., classification and recognition • Suggest reordering of an implication’s antecedents based on number of inferable instances of each antecedent (suggestion)
Integration Into SHAKEN • Chimaera is a KB diagnostics tool in the SHAKEN system • Used to diagnose both pump priming and SME KBs • OKBC was used to do the integration • Chimaera is an OKBC client • Interacts with any OKBC server using the OKBC API • The Chimaera Web service uses Ontolingua as its OKBC server • SRI added an OKBC wrapper to the KM system • Enabled KM to be an OKBC server usable by OKBC clients • Enabled Chimaera’s diagnostics to run directly on KM KBs
Chimaera Useful To SRI Team “Overall, we found that Chimaera was quite useful. It found 2 concepts (Indole and Imidazole) that were corrupted, several occurrences of redundant superclasses, and several incorrect domain and range constraints (due to our poor representation of "Information"). … We're currently fixing the bugs it revealed. It would be helpful if we could run Chimera on the component library frequently.” – Bruce Porter
Next Steps: SME-Oriented Support • Provide interactive repair oriented follow-up to diagnostics • Identify KB content on which diagnosis result is based • Suggest repairs or repair strategies • Guide user through repair procedure • Examples • Class is a direct subclass of “THING” • Provide direct subclasses of THING as candidate superclasses • Step down through the class hierarchy • Class has redundant superclass links • Suggest removal of link(s) to most general classes • Type, cardinality, or bounds conflict • Suggest changing local conflicting constraint(s) • Missing information • Initiate acquisition dialogues for missing information
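The redundant-superclass repair suggestion can be illustrated with a small sketch. It assumes a class hierarchy given as a dictionary from class name to its direct superclasses; the function names and the sample hierarchy are hypothetical and do not come from Chimaera. A direct superclass link is treated as redundant when the same class is already reachable through another direct superclass, which corresponds to suggesting removal of the links to the most general classes.

```python
from typing import Dict, Set


def ancestors(cls: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return all strict ancestors of `cls` by walking superclass links."""
    seen: Set[str] = set()
    stack = list(parents.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(parents.get(parent, ()))
    return seen


def redundant_superclass_links(parents: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each class to the direct superclasses that are implied by others."""
    result: Dict[str, Set[str]] = {}
    for cls, direct in parents.items():
        redundant = {
            sup
            for sup in direct
            if any(sup in ancestors(other, parents) for other in direct if other != sup)
        }
        if redundant:
            result[cls] = redundant
    return result


hierarchy = {
    "Thing": set(),
    "Device": {"Thing"},
    "Pump": {"Device"},
    "CentrifugalPump": {"Pump", "Device", "Thing"},  # two links are redundant
}
suggestions = redundant_superclass_links(hierarchy)
for cls, links in suggestions.items():
    print(f"{cls}: consider removing links to {sorted(links)}")
```

For the sample hierarchy, the sketch suggests removing the links from `CentrifugalPump` to `Device` and `Thing`, since both are already implied by the link to `Pump`.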
Next Steps: Architectural Analysis • Summarize architectural features of a KB • Percentage of • Relations that are functions • Axioms that are propositional, first order, higher order • Axioms that are not horn clauses • Distribution of • Axioms by type (using the HPKB, RKF types) • Axiom lengths by number of literals • Functions by number of arguments • Relations by number of arguments • Direct subclasses per class • Direct subproperties per property • Restrictions per object • Property values per object
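Two of the architectural measures listed here (the percentage of axioms that are not Horn clauses, and the distribution of axiom lengths by number of literals) can be sketched as follows. The clause encoding, as a list of `(is_positive, predicate)` pairs, is an assumption made for this example, as are the function names; a real KB would first need its axioms converted to clausal form.

```python
from collections import Counter
from typing import Dict, List, Tuple

Literal = Tuple[bool, str]  # (is_positive, predicate name); hypothetical encoding
Clause = List[Literal]


def is_horn(clause: Clause) -> bool:
    """A Horn clause has at most one positive literal."""
    return sum(1 for positive, _ in clause if positive) <= 1


def architecture_summary(clauses: List[Clause]) -> Dict[str, object]:
    """Compute the share of non-Horn clauses and the clause-length distribution."""
    total = len(clauses)
    non_horn = sum(1 for clause in clauses if not is_horn(clause))
    return {
        "percent_non_horn": 100.0 * non_horn / total if total else 0.0,
        "length_distribution": dict(Counter(len(clause) for clause in clauses)),
    }


clauses = [
    [(False, "Pump"), (True, "Device")],                    # Pump(x) => Device(x)
    [(False, "Device"), (True, "Thing")],                   # Device(x) => Thing(x)
    [(False, "Valve"), (True, "Open"), (True, "Closed")],   # disjunctive head: not Horn
    [(True, "Thing")],                                      # a fact
]
summary = architecture_summary(clauses)
print(summary)
```

Here one clause in four is non-Horn, and the length distribution records two clauses of length 2 and one each of lengths 1 and 3.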
Next Steps: Partitioning and Beyond • Integration of KB partitioning tools into Chimaera • Provide automatic KB partitioning to enhance usability • Automatic running of test cases E.g., queries and expected answers • Support regression testing of evolving KB • Provide result summaries from failed tests • Help with typographical errors • Spelling correction for undefined names E.g., classes, slots, relations, functions, constants • Spelling correction for anomalously occurring variables • Suggest that the variable is the same as another variable in the sentence
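Spelling correction for undefined names could be approximated with fuzzy string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library as a stand-in technique; the slides do not say how Chimaera would implement this, and the function name, cutoff, and sample names are assumptions for illustration.

```python
import difflib
from typing import Dict, Iterable, List


def suggest_corrections(
    undefined: Iterable[str], defined: Iterable[str], cutoff: float = 0.8
) -> Dict[str, List[str]]:
    """For each undefined name, list defined names that look like the intended spelling."""
    candidates = list(defined)
    return {
        name: difflib.get_close_matches(name, candidates, n=3, cutoff=cutoff)
        for name in undefined
    }


defined_names = ["Device", "Pump", "Thing"]
undefined_names = ["Devise", "Fluid"]  # names used in the KB but never defined
corrections = suggest_corrections(undefined_names, defined_names)
for name, matches in corrections.items():
    print(name, "->", matches or "no close match")
```

With these sample names, `Devise` is matched to `Device`, while `Fluid` has no close match and would remain an ordinary "used but not defined" finding. The same approach could be applied to a variable that occurs only once in a sentence, using the sentence's other variables as candidates.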
Summary • KSL is developing Chimaera to support KB evolution • Chimaera was integrated into the SHAKEN Y1 system Using OKBC(!) • Incrementally adding diagnostics E.g., “background” diagnostics that use sophisticated reasoning • Next steps • KB partitioning tools • Repair dialogues for SMEs • KB architectural analysis • Regression testing
Role of Diagnostics in Systems • KE support • SME support • Increase productivity (“lightly trained”) • Step in managing KB development • Focus attention (e.g., redundant links) • Evaluation support • Diagnose KBs produced during evaluation • Batch mode • Foreground • Background • Changes in “patterns” in the KB between versions
Sharing Diagnostics Information • Diagnostic specifications • Logical specifications • English specifications • Test cases • Diagnostic classifications • Learnings • Tricks of the trade • Sharing facilitators: • Working group • Mailing list • Findings data • Author, group, or team specific • Repair strategies • Alignments during collaborative development
Developer Needs and Desires • Reasoner-specific diagnostics • Highly informative diagnostic results • Reporting architectural bias in a KB • Binary versus higher order relations • First order versus higher order axioms • Weakly versus strongly higher order • Disjunctions or conjunctions • Existential versus universal quantifiers • Frames to axioms ratios • Horn clauses • Axiom lengths • Functions • Confusion of existential and universal quantifiers • Type restrictions too general • Misspelling of variables
Developer Needs and Desires • Domain-specific tests • Semantic tests • Maintainability measures • Recognizing typographical errors • Spell check undefined or unused terms • Redefining (e.g., breaking up) a predicate • Large scale modification techniques • Prioritizing diagnostics
Integration Issues • Architecture • Use hosted services (like KSL) • Integrate special code • Take specifications from library • API • Interaction Mode - Batch versus Interactive/Repair • Translation issues • One major use of diagnostics is also in testing translators • Certain translations need to be done to do better analysis • Output integration
Evaluation • Record types and numbers of errors • Comparing KBs produced by SMEs versus KEs • Record use of repair strategies • Evaluate during testing • Feedback from SMEs about diagnostics
Classification of Diagnostic Results • Errors • Logical inconsistencies • Content structure errors • (See Randy Davis thesis) • Anomalies • Missing information • Missing portions of descriptions • Redundancies • Extraneous structure or content • Summaries • Architectural biases • Suggestions • Stylistic suggestions • Static versus operational tests • Use of expertise about KR paradigms
Diagnostic Issues/Goals • Role of Diagnostics in Systems • KE support, SME support • Evaluators of KBs • How to Share Diagnostics • Working Group? • Logical specification, English descriptions, tests, … • Know the Main Contributors • Possible Diagnostics • What do users want? • What can tool builders provide? • Integration Issues • Developer Needs/Desires • Evaluation
The Role of KB Diagnostics • KE support • SME support • Increase productivity (“lightly trained”) • Management of KB • Inference dependent quality improvement • Focus attention (e.g., redundant links) • Evaluation support • Abstract patterns – average fanout of specialization, statistics of number of uses of a predicate – big picture view • Version comparison • Regression testing
Diagnostic Sharing • Diagnostic specifications • Logical specifications • English specifications • Test cases • Diagnostic classifications • Taxonomy of errors – bottlenecks, • Quantification • Alignments across systems – inconsistencies among SMEs • Repair strategies • How informative a system is (core dump vs. useful explanation) • Learnings • Tricks of the trade • Sharing facilitators: • Working group • Mailing list
Sharing facilities • Working group • Mailing list • Posting of papers • Utilize Teknowledge
Biases • Binary vs. higher arity • First order vs. higher order • Weakly vs. strongly higher order • Universal over existential • Disjunction vs. conjunction • Frame-ism • Horn clauses • Lisp style • Relations -> functions • Depth vs. breadth in hierarchy • … • Maybe report in summarizations • At least document biases
Organizations/People • Cycorp – many special purpose - Kahlert • ISI – Why Not? – Chalupsky – KANAL – Gil - expect - Gil • Pragati – Clustering - Mehrotra • Stanford FRG/KSL – Partitioning – McCarthy, Amir, McIlraith • Stanford KSL – Chimaera - Fikes, McGuinness
Diagnostics • Errors – provable logical inconsistencies • Anomalies – redundancies, cycles, … • Summaries – word counts, … • Suggestions – naming conventions • Incompletenesses – explicit salient assertions or statistics • Stylistics – length of rule, … bad factoring • Randy Davis – errors – incompleteness, inconsistency • Get this: top ten list of things people do wrong in Cyc – Goolsbey • Perspectives/units: frame-like content vs. axioms vs. problem solving technology vs. learning to correct components
Style • Static • Reasoner • Simulation / execution • Using examples • Summarization/improvements/critiquer
Integration Issues • Architecture • Use hosted services (like KSL) • Integrate special code • Take specifications from library • API • Interaction Mode – Batch vs. Interactive/Repair • Translation issues • One major use of diagnostics is also in testing translators • Certain translations need to be done to do better analysis • Background ontologies – MELD starter ontology • Output integration
Developer Needs/Desires • Missing existentials • Too high a type specification • Variable name mismatch • Semantic requests: wrong semantic paradigm? • Typos • Spell check • Large scale modification tools and their integration • Example: removal / fixing top level • Prioritizing diagnostics to minimize cost, ease maintenance
Evaluation • Record types of errors • Fine granularity • KB differences across SME vs. KE developed ontologies across team • Record use of repair strategies… • Evaluate during testing… • Feedback from SMEs on features, usefulness, etc. • Attempt to keep extremely complete audit trails for future analysis • Important to be careful with diagnostic reporting
Action Items • Working Group • Diagnostics repository • Web site • Follow up briefing • Mailing list
Chimaera • A Knowledge Evolution Environment • Tools for KB diagnosis and merging • Available as a Web service • www-ksl-svc.stanford.edu www.ksl.stanford.edu/software/chimaera • Usable from a Web browser • Online user manual, tutorial, and demonstration movie • Provides user selectable set of diagnostic tests • Performs KB diagnostics in batch mode • Uploads and analyzes user’s KB • Accepts KBs in MELD, KIF, OKBC, DAML, RDF, XML, … • Provides results as HTML pages linked to frames and axioms • Analyzes both the structure and content of a KB • Uses hybrid reasoners to analyze content • Currently runs 28 diagnostic tests
Collection/Specification • Logical specification of diagnostic • English specification • Example KB that triggers diagnostic output
Classification of Diagnostic Results II • Axiom Analysis • Axiom Syntax Problems E.g., no consequent in an implication • Axiom Redundancy E.g., given 1. A => B, 2. A => C, 3. C => B, axiom 1 is redundant • Axiom Variable Usage E.g., variable used in antecedent but not in consequent • Axiom Consistency E.g., A => not A • Axiom Tautology E.g., consequent repeats (portion of) antecedent
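The axiom redundancy example on this slide can be reproduced with a small reachability check. The sketch assumes simple propositional implications encoded as `(antecedent, consequent)` pairs, which is far narrower than the axioms Chimaera handles; the function names are hypothetical. An axiom is reported as redundant when its consequent can still be derived from its antecedent using only the other axioms.

```python
from typing import List, Tuple

Implication = Tuple[str, str]  # (antecedent, consequent); hypothetical encoding


def derivable(start: str, goal: str, axioms: List[Implication]) -> bool:
    """Check whether `goal` follows from `start` by chaining the given implications."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for antecedent, consequent in axioms:
            if antecedent == node and consequent not in seen:
                if consequent == goal:
                    return True
                seen.add(consequent)
                stack.append(consequent)
    return False


def redundant_axioms(axioms: List[Implication]) -> List[int]:
    """Return the 1-based numbers of axioms implied by the remaining axioms."""
    redundant = []
    for index, (antecedent, consequent) in enumerate(axioms):
        others = axioms[:index] + axioms[index + 1:]
        if derivable(antecedent, consequent, others):
            redundant.append(index + 1)
    return redundant


axioms = [("A", "B"), ("A", "C"), ("C", "B")]  # the slide's example
redundant = redundant_axioms(axioms)
print("Redundant axioms:", redundant)
```

For the slide's three axioms this reports axiom 1, because B follows from A via axioms 2 and 3. Each axiom is checked independently against the rest, so two identical axioms would both be flagged.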