
IKRAFT: Interactive Knowledge Representation and Acquisition from Text



Presentation Transcript


  1. IKRAFT: Interactive Knowledge Representation and Acquisition from Text. Yolanda Gil, Varun Ratnakar. USC/Information Sciences Institute. gil@isi.edu. www.isi.edu/expect/projects/trellis, trellis.semanticweb.org

  2. Motivation: How KBs Are Built Today. [Diagram labels: Domain Expert; read/ask/study/listen...; Knowledge Engineer; ...analyze/group/index...; Knowledge Acquisition Tools; ...structure/relate/fit...; KB; ...reason/deduce/solve]

  3. Motivation: The Aftermath of Knowledge Base Development. [Diagram labels: Domain Expert; Knowledge Engineer; Knowledge Acquisition Tools; KB; read/ask/study/listen...; ...analyze/group/index...; ...structure/relate/fit...; ...reason/deduce/solve; TRASH]

  4. Motivation: Capturing the Design of Knowledge Bases. [Diagram: layers spanning the WWW and the Knowledge Base, ordered from "richer representations, more ambiguous, more versatile" to "more formal, more concrete, more introspectible"]
     • Introductory texts, expert hints, explanations, dialogues, comments, examples, exceptions, ...
     • Info. extraction templates, dialogue segments and pegs, filled-out forms, high-level connections, ...
     • Descriptions augmented with prototypical examples & exceptions, problem-solving steps and substeps, ...
     • Alternative formalizations (KIF, MELD, RDF, ...), alternative views of the same notion (e.g., what is a threat)
     • Formal definitions, e.g., (defconcept bridge ...)

  5. Claims
     • Knowledge can be reused at any level of (in)formality
     • Knowledge can be extended more easily
     • Additional documents and semi-formal structures readily available
     • Knowledge can be translated and integrated at any level to facilitate interoperability
     • KR languages can be a straitjacket for some kinds of knowledge
     • Intelligent systems will provide better justifications
     • Many users want to know where axioms came from before they trust the system’s reasoning
     • Content providers will not need to be sophisticated programmers/knowledge engineers
     • May be easier for end users to organize knowledge rather than formalize it
     • Good symbiosis of sophisticated and unsophisticated users

  6. An Example: Building a Knowledge Base from a Textbook (DARPA Rapid Knowledge Formation -- RKF) “…The first step a cell takes in reading out part of its genetic instructions is to copy the required portion of the nucleotide sequence of DNA – the gene – into a nucleotide sequence of RNA. The process is called transcription because the information, though copied into another chemical form, is still written in essentially the same language – the language of nucleotides. Like DNA, RNA is a linear polymer made of four different types of nucleotide subunits linked together by phosphodiester bonds. It differs from DNA chemically in two respects: (1) the nucleotides in RNA are ribonucleotides – that is, they contain the sugar ribose (hence the name ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the bases adenine (A), guanine (G), and cytosine (C), it contains uracil (U) instead of the thymine (T) in DNA. Since U, like T, can base-pair by hydrogen-bonding with A, the base-pairing properties described for DNA also apply to RNA…” -- Essential Cell Biology, Alberts et al. 1992

  7. Protein Synthesis in RKF’s SHAKEN Authored by a Biologist [Chaudri et al 2001]

  8. Step 1: Selecting Relevant Knowledge Fragments “…The first step a cell takes in reading out part of its genetic instructions is to copy the required portion of the nucleotide sequence of DNA – the gene – into a nucleotide sequence of RNA. The process is called transcription because the information, though copied into another chemical form, is still written in essentially the same language – the language of nucleotides. Like DNA, RNA is a linear polymer made of four different types of nucleotide subunits linked together by phosphodiester bonds. It differs from DNA chemically in two respects: (1) the nucleotides in RNA are ribonucleotides – that is, they contain the sugar ribose (hence the name ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the bases adenine (A), guanine (G), and cytosine (C), it contains uracil (U) instead of the thymine (T) in DNA. Since U, like T, can base-pair by hydrogen-bonding with A, the base-pairing properties described for DNA also apply to RNA…” -- Essential Cell Biology, Alberts et al. 1992

  9. Step 2: Composing Stylized Knowledge Fragments
     - ribose
       - it is a kind of sugar, like deoxyribose
       - it is contained in the nucleotides of RNA
     - uracil
       - it is a kind of nucleotide, like adenine and guanine
       - it can base-pair with adenine
     - RNA
       - it is a kind of nucleic acid, like DNA
       - it contains uracil instead of thymine
       - it is single-stranded
       - it folds in complex 3-D shapes
       - nucleotides are linked with phosphodiester bonds, like DNA
       - there are many types of RNA
       - RNA is the template for synthesizing protein
       - its nucleotides contain the sugar ribose (DNA has deoxyribose)
     - gene
       - subsequence of DNA that can be used as a template to create protein
     - protein synthesis
       - non-destructive creation process: RNA and protein created from DNA
       - its speed is regulated by the cell
       - substeps: (ordered in sequence)
         1) RNA transcription
            - a DNA fragment (a gene) is copied, just like DNA is copied during DNA synthesis
            - the result is an RNA chain
         2) protein translation
            - RNA is used as a template
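Several of the stylized fragments on this slide follow recurring phrasings ("it is a kind of X, like Y", "it can base-pair with X"). The Python sketch below shows how such phrasings could be matched and turned into simple triples. It is only an illustration under assumptions: the two regular expressions, the relation names `is-a` and `base-pair`, and the function `parse_fragment` are hypothetical and are not taken from IKRAFT.

```python
import re

# Hypothetical patterns for two statement forms that appear on the slide.
KIND_OF = re.compile(
    r"^it is a kind of (?P<parent>[\w\- ]+?)(?:, like (?P<siblings>.+))?$"
)
CAN_PAIR = re.compile(r"^it can base-pair with (?P<partner>[\w\-]+)$")


def parse_fragment(term, statement):
    """Return a (subject, relation, value) triple, or None if no pattern matches."""
    m = KIND_OF.match(statement)
    if m:
        return (term, "is-a", m.group("parent"))
    m = CAN_PAIR.match(statement)
    if m:
        return (term, "base-pair", m.group("partner"))
    return None  # statement stays as plain text, still attached to its term


triples = [
    parse_fragment("ribose", "it is a kind of sugar, like deoxyribose"),
    parse_fragment("uracil", "it can base-pair with adenine"),
    parse_fragment("RNA", "it is single-stranded"),
]
```

Statements that match no pattern return `None`, so they can remain as informal text rather than being forced into a formal shape.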

  10. Step 3: Creating Knowledge Base Items
      …
      (defconcept uracil
        :is-primitive nucleotide
        :constraints (:the base-pair adenine))
      (defconcept RNA
        :is (:and nucleic-acid (:some contains uracil)))
      …
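To make the step from stylized fragments to knowledge base items concrete, here is a small Python sketch that assembles strings in the same shape as the two definitions on this slide. The function `render_defconcept` and its parameters are hypothetical; the sketch only reproduces the textual layout shown above and makes no claim about the full syntax of the underlying KR language.

```python
def render_defconcept(name, parent, primitive=False, restrictions=()):
    """Build a defconcept string in the layout used on the slide.

    restrictions: iterable of (quantifier, role, filler) tuples,
    e.g. ("some", "contains", "uracil").
    """
    terms = [f"(:{q} {role} {filler})" for q, role, filler in restrictions]
    if primitive:
        body = f":is-primitive {parent}"
        if terms:
            joined = terms[0] if len(terms) == 1 else "(:and " + " ".join(terms) + ")"
            body += " :constraints " + joined
    else:
        body = ":is " + (f"(:and {parent} " + " ".join(terms) + ")" if terms else parent)
    return f"(defconcept {name} {body})"


uracil = render_defconcept(
    "uracil", "nucleotide", primitive=True,
    restrictions=[("the", "base-pair", "adenine")],
)
rna = render_defconcept(
    "RNA", "nucleic-acid",
    restrictions=[("some", "contains", "uracil")],
)
```

With these inputs, `uracil` and `rna` hold the two definitions from the slide, each written on a single line.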

  11. IKRAFT: Interactive Knowledge Representation and Acquisition from Text
      • User starts with documents, extracts a small amount of information from them
      • Text contains significant portions for context/reference/recall
      • IKRAFT allows users to annotate text with statements, expressed in natural language
      • Highlight portions of original text, annotate statement
      • Statements tend to be stylized
      • Statements are parsed, system generates summary of:
        - Objects
        - Events/actions
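The annotation workflow described on this slide, where a highlighted span of source text is linked to a concise natural-language statement, could be represented roughly as follows. This is a sketch under assumptions: the class names `Highlight` and `Annotation`, the document identifier `"alberts"`, and the character-offset representation are all hypothetical and not drawn from IKRAFT itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Highlight:
    source: str  # hypothetical document identifier
    start: int   # character offset where the highlight begins
    end: int     # character offset just past the highlight


@dataclass
class Annotation:
    statement: str                                  # concise NL statement
    highlights: list = field(default_factory=list)  # supporting text spans


def highlighted_text(documents, h):
    """Recover the original wording that a highlight points to."""
    return documents[h.source][h.start:h.end]


documents = {
    "alberts": "although, like DNA, RNA contains the bases adenine (A), "
               "guanine (G), and cytosine (C), it contains uracil (U) "
               "instead of the thymine (T) in DNA",
}
phrase = "it contains uracil (U) instead of the thymine (T)"
start = documents["alberts"].find(phrase)
note = Annotation(
    statement="RNA contains uracil instead of thymine",
    highlights=[Highlight("alberts", start, start + len(phrase))],
)
```

Keeping offsets rather than copied text means the surrounding passage stays available for context, reference, and recall.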

  12. IKRAFT: Annotating Manual Information Extraction

  13. IKRAFT: Extracting Statements from Complementary/Contradictory Text Sources

  14. IKRAFT: Documenting Seismic Hazard in Southern California

  15. Seismic Hazard Analysis (SHA) for Southern California Earthquake Center (SCEC)

  16. DOCKER: Scientist Publishes SHA Models
      User specifies:
      • Types of model parameters
      • Format of input messages
      • Documentation
      • Constraints
      [Diagram labels: Web Browser; DOCKER User Interface; Model Specification; Wrapper Generation (WSDL, PWL); Constraint Acquisition; AS97 (docs, types, msg, constrs); AS97 ontology; SCEC ontologies]
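As a rough picture of what a published model specification might hold (parameter types, documentation, and constraints, only some of which are formalized), consider the Python sketch below. Everything in it is an assumption made for illustration: the classes `Constraint` and `ModelSpec`, the parameter names, and the numeric bounds are invented and do not describe the actual AS97 model or DOCKER's internal format.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Constraint:
    statement: str  # concise natural-language statement of the constraint
    check: Optional[Callable[[dict], bool]] = None  # None means "not formalized"


@dataclass
class ModelSpec:
    name: str
    parameter_types: dict
    constraints: list = field(default_factory=list)

    def formalized(self):
        """Constraints the system can check automatically."""
        return [c for c in self.constraints if c.check is not None]

    def informal(self):
        """Constraints available only as documentation."""
        return [c for c in self.constraints if c.check is None]


# Hypothetical specification; parameter names and bounds are made up.
spec = ModelSpec(
    name="AS97",
    parameter_types={"magnitude": float, "distance_km": float},
    constraints=[
        Constraint("magnitude must lie within the range the model was fit to",
                   check=lambda p: 4.0 <= p["magnitude"] <= 8.0),
        Constraint("model is intended for shallow crustal earthquakes"),
    ],
)
```

Leaving `check` empty lets a scientist record a constraint as documentation first and formalize it later, or never.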

  17. Documenting the Model with IKRAFT

  18. Documenting Each Constraint

  19. Formalizing Simple Constraints

  20. Documentation of Constraints (Some Are Formalized, Some Are Not)

  21. DOCKER: Engineer Uses SHA Model
      User can:
      • Browse through SHA models
      • Invoke SHA models
      • Get help in selecting appropriate model
      [Diagram labels: Web Browser; DOCKER User Interface; Model Reasoning; Pathway Elicitation; Constraint Reasoning; KR&R (Powerloom); AS97 (docs, types, msg, constrs); AS97 ontology; Shared ontologies]

  22. DOCKER Detects Constraint Violations

  23. Should Engineer Override Constraint Specified by Model Developer?

  24. Engineer Brings Up IKRAFT to Find Reasons for the Constraint

  25. Engineer Can Check Additional Model Constraints (Not Formalized)

  26. Constraints Grounded on Model Documentation

  27. Engineer Makes an Informed Decision on Whether to Override the Constraint
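The sequence in slides 22 to 27 (a violation is detected, the engineer looks up the reason, then decides whether to override) can be sketched in Python as below. This is an illustrative assumption, not DOCKER's or IKRAFT's code: the function `find_violations`, the dictionary keys, the source label, and the numeric bound are all hypothetical.

```python
def find_violations(inputs, constraints):
    """Return the statement and source of each formalized constraint that fails.

    constraints: list of dicts with 'statement', 'source', and an optional
    'check' callable; entries without 'check' are documentation only.
    """
    violations = []
    for c in constraints:
        check = c.get("check")
        if check is not None and not check(inputs):
            violations.append({"statement": c["statement"], "source": c["source"]})
    return violations


# Hypothetical constraints; the bound and the source labels are made up.
constraints = [
    {
        "statement": "magnitude must lie within the range the model was fit to",
        "source": "model documentation, highlighted passage 1",
        "check": lambda p: p["magnitude"] <= 8.0,
    },
    {
        "statement": "model is intended for shallow crustal earthquakes",
        "source": "model documentation, highlighted passage 2",
    },
]

violations = find_violations({"magnitude": 8.5}, constraints)
```

Returning the statement and its source alongside each violation gives the engineer what is needed to judge an override, instead of a bare error.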

  28. Discussion
      • Overhead in capturing the rationale?
      • Related to motivation and payoff
      • Rationale here is captured in a very simple process
      • Related Work:
        - Documenting design rationale [Shum 96]
        - Methodologies for knowledge base development [Schreiber et al 00]
        - Higher-level languages, e.g., KARL [Fensel et al 98]

  29. Conclusions and Future Work
      • IKRAFT helps users document formal expressions
      • Each formal expression is backed up by a concise NL statement that is linked back to one or more sources
      • Users can understand the justification for the system’s reasoning (e.g., SHA)
      • Future work:
        - NLP techniques to extract terms from user’s concise statements
        - Controlled grammar for formulation of statements
        - Other documentation: e.g., tables, forms, exceptions
      High payoff in capturing the rationale of knowledge bases

  30. Speculation: Will the (Semantic) Web End Up Looking Like This? [Diagram: layers ordered from "richer representations, more ambiguous, more versatile" to "more formal, more concrete, more introspectible"]
      • Introductory texts, expert hints, explanations, dialogues, comments, examples, exceptions, ...
      • Info. extraction templates, dialogue segments and pegs, filled-out forms, high-level connections, ...
      • Descriptions augmented with prototypical examples & exceptions, problem-solving steps and substeps, ...
      • Alternative formalizations (KIF, MELD, RDF, ...), alternative views of the same notion (e.g., what is a threat)
      • Formal definitions, e.g., (defconcept bridge ...)
