Reading to Learn Q1 review (7/6/05)

Reading to Learn Q1 review (7/6/05). Peter Clark Michael Glass Phil Harrison Tom Jenkins John Thompson Rick Wojcik Boeing Phantom Works. Agenda. Introduction 1. Textual knowledge to CPL 2. CPL to logic: How CPL is interpreted Processing some CPL (demo) 3. Knowledge Integration


Presentation Transcript


  1. Reading to Learn Q1 review (7/6/05) Peter Clark, Michael Glass, Phil Harrison, Tom Jenkins, John Thompson, Rick Wojcik; Boeing Phantom Works

  2. Agenda • Introduction • 1. Textual knowledge to CPL • 2. CPL to logic: How CPL is interpreted • Processing some CPL (demo) • 3. Knowledge Integration • Extracting knowledge from text

  3. LbR Framework [Diagram labels: Corpus; Knowledge Acquisition Loop; Knowledge Integration; Knowledge Repository (“Worldview”); Solidification Loop; Introspection; Robust Reasoning Tasks; “Using what you know to get more”]

  4. LbR Framework [Diagram labels: Corpus; Knowledge Acquisition Loop; Knowledge Integration; Knowledge Repository (“Worldview”); Solidification Loop; Introspection; Robust Reasoning Tasks; “Using what you know to get more”; added labels: “or Logic”, “CPL”]

  5. Overview What does a person do? • Start already knowing something about a domain + general knowledge • Read; existing knowledge helps him/her understand the new material • Integrate the new knowledge into pre-existing knowledge • Can now perform new tasks • In its full and unrestricted form, this process is too difficult to be implemented • BUT: significant, partial approaches are feasible

  6. Reduced Version • Select a domain where a KB exists (chemistry) • Manually reformulate a section of text into controlled English (CPL) • Automatically process that controlled English to generate new knowledge • Integrate the new knowledge into the KB • Rationale: • Two key problems in reading to learn: • (1) full natural language processing • (2) knowledge integration • This approach separates them and can focus on (2)

  7. Step 1: Unrestricted NL to CPL [Diagram labels: Corpus; Knowledge Acquisition Loop; Knowledge Integration; Knowledge Repository (“Worldview”); Solidification Loop; Introspection; Robust Reasoning Tasks; “Using what you know to get more”; “or Logic”; “CPL”]

  8. Step 1: Possible methods [Diagram labels: Texts (unrestricted English); Text in restricted English; Manual reformulation; Machine “translation”; Locate subset in simple English; Knowledge extraction from corpora]

  9. Step 1: Possible methods [Diagram labels: Texts (unrestricted English); Text in restricted English; Manual reformulation; Machine “translation”; Locate subset in simple English; Knowledge extraction from corpora]

  10. Reformulating Chemistry Text into CPL John A. Thompson

  11. Selecting the important text • Textbook: • “Acids have a sour taste (for example, citric acid in lemon juice) and cause certain dyes to change color (for example, litmus turns red on contact with acids). Indeed, the word acid comes from the Latin word acidus, meaning sour or tart.” • Selected and reworded as CPL: • “Acids have a sour taste.” • “Acids cause some dyes to change color.” • Judgment calls regarding “what might be on the test” • Simplified rewordings – no pronouns or complex sentences, see CPL User Guide

  12. Example 2: Text to CPL • Textbook: • “Sodium hydroxide is an Arrhenius base. Because NaOH is an ionic compound, it dissociates into Na+ and OH- ions when it dissolves in water, thereby releasing OH- ions into the solution.” • CPL: • “Sodium hydroxide is an Arrhenius base.” • “NaOH is sodium hydroxide.” [inserted background knowledge] • “NaOH is an ionic compound.” • “NaOH dissolves in water.” • “NaOH dissociates in water.” • “The dissociating produces Na-plus ions and OH-minus ions.”

  13. Example 3: Text to CPL • Textbook: • “Some substances can act as an acid in one reaction and as a base in another. For example, H2O is a Bronsted-Lowry base in its reaction with HCl and a Bronsted-Lowry acid in its reaction with NH3. A substance that is capable of acting as either an acid or a base is called amphoteric.” • CPL: • “Some substances sometimes act as a Bronsted-Lowry acid and sometimes act as a Bronsted-Lowry base.” [required “and”] • “These substances are called amphoteric substances.” [vocab] • “H2O acts as a Bronsted-Lowry base in a reaction with HCl.” • “H2O acts as a Bronsted-Lowry acid in a reaction with NH3.” • “Therefore, H2O is an amphoteric substance.” [deduced]

  14. Rewording generics • “Acids have a sour taste” is a generic sentence about a class of things – the textbook is full of generics! • One interpretation: • “Every instance of an acid has a sour taste” • But is it “every” instance, or just “typically”? • Another interpretation: • “If a person is tasting an acid, then the person is experiencing a sour taste” • We are planning to do some automatic interpretation of generic sentences in the future • Our short-term strategy is to reword every generic sentence to another form, especially to an if-then rule

  15. Rewording generics - 2 • Another example: “Acids cause some dyes to change color” • One interpretation: • “There are some dyes that change color when in contact with any acid” • Another interpretation: • “For each acid there is some dye that changes color when in contact with the acid” • Possible CPL rewording as an if-then rule: • “If an acid-sensitive dye is in contact with an acid, then the acid is causing the dye to change color” • Bottom line: Generics are a major issue with textbook knowledge
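
The two readings of “Acids cause some dyes to change color” differ only in quantifier scope. A sketch in first-order notation follows; the predicate names Acid, Dye, and ChangesColor are illustrative assumptions, not terms taken from the KB:

```latex
% Reading 1: there are dyes that change color on contact with any acid
\exists d\, \bigl[\mathrm{Dye}(d) \land \forall a\, \bigl(\mathrm{Acid}(a) \rightarrow \mathrm{ChangesColor}(d, a)\bigr)\bigr]

% Reading 2: for each acid there is some dye that changes color on contact with it
\forall a\, \bigl[\mathrm{Acid}(a) \rightarrow \exists d\, \bigl(\mathrm{Dye}(d) \land \mathrm{ChangesColor}(d, a)\bigr)\bigr]
```

Reading 1 implies Reading 2 but not the reverse, which is why the choice of rewording matters for later inference.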

  16. CPL now includes if-then rules • Rewriting generic sentences required if-then rules to be added to CPL • Examples of if-then rules in CPL: • “If HCl is immersed in water, then the HCl is dissolving in the water.” • “If HCl is dissolving in water, then each molecule of the HCl is reacting with a molecule of the water.” • “If an HCl molecule is reacting with a water molecule, then an H-plus ion is transferring from the HCl molecule to the water molecule.”

  17. Connecting verb and noun forms • Example: reacts vs. reaction • “Hydrogen chloride gas in water reacts with the water” • “The reaction produces H-plus ions and Cl-minus ions” • The CPL writer assumes that the interpreter will make the connection from the verb to the related noun in the next sentence • We are building a large set of verb-noun relations to handle these cases
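
One way to picture the verb-noun connection is a lookup from an event noun to its verb, followed by a search back through earlier events. The sketch below assumes a tiny hand-written table and event identifiers such as "e1"; both are hypothetical and stand in for the much larger set of relations mentioned on the slide.

```python
# Hypothetical sample of verb-noun relations.
NOUN_TO_VERB = {
    "reaction": "react",
    "dissociating": "dissociate",
}


def resolve_event_noun(noun, previous_events):
    """Return the id of the most recent earlier event whose verb matches the noun.

    previous_events is a list of (event_id, verb) pairs in textual order.
    Returns None when the noun is unknown or no matching event exists.
    """
    verb = NOUN_TO_VERB.get(noun)
    if verb is None:
        return None
    for event_id, event_verb in reversed(previous_events):
        if event_verb == verb:
            return event_id
    return None


events = [("e1", "dissolve"), ("e2", "react")]
referent = resolve_event_noun("reaction", events)
```

Here “the reaction” in the second sentence resolves to the event introduced by “reacts” in the first.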

  18. Gross vs. molecular events • Chemistry text switches between: • gross-level events involving substances • reactions between two molecules • Example: • “When HCl dissolves in water, we find that the HCl molecule transfers an H+ ion (a proton) to a water molecule” • In CPL, we interpret “HCl” as the gross-level substance, and “HCl molecule” as the molecular-level individual • If-then rules carry the logic from the gross level to the molecular level: • “If HCl is dissolving in water, then each molecule of the HCl is reacting with a molecule of the water.”

  19. Issues with fuzzy information • Examples: • “An H-plus ion sometimes reacts with an H2O molecule” • “A molecule of a Bronsted-Lowry acid can donate a proton to another substance” • “The NH4 is mostly solid particles” • “Some substances containing hydrogen are not acids” • “An ion of a Bronsted-Lowry acid must have a hydrogen atom.” • These are easy to state in CPL, but what logical representation should be produced? • This is a universal problem in any kind of language interpretation • We are developing solutions as we progress

  20. The textbook teaches by example • Example: • “A molecule of a Bronsted-Lowry acid can donate a proton to another substance” • “An HCl molecule in water donates a proton to an H2O molecule” • “Therefore, the HCl molecule acts as a Bronsted-Lowry acid” • Text is using an example to teach the student how to make similar deductions • Unclear how to capture this in CPL – how to specify where the deductive chain begins? • Perhaps we should skip the examples and only enter the general principles being taught?

  21. Teaching by hypothetical situations • Example: • “[Assume that] H2O is a stronger base than X-minus in Equation 16.9” • “[Assume that] X-minus is the conjugate base of HX in Equation 16.9” • “H2O extracts the proton from HX in the reaction in Equation 16.9” • “The reaction produces H3O-plus and X-minus” • “Therefore, the equilibrium is on the right side of Equation 16.9” • How can CPL make it clear that this is a hypothetical? • Note that X stands for a variety of atoms • So the hypothetical teaches the student a useful pattern • How should we represent and reason about this?

  22. Real world vs. written equations • The textbook switches between the real world and the syntax of written chemical equations: • “If H2O (the base in the forward reaction) is a stronger base than X- (the conjugate base of HX), then H2O will abstract the proton from HX to produce H3O+ and X-. As a result, the equilibrium will lie to the right.” • The CPL author must be careful to mention the equation number when referring to its syntax: • “In Equation 16.9 H2O is the base in the forward reaction.” • “The equilibrium is on the right side of Equation 16.9”

  23. Textbook figures and tables • A textbook is not just text! • Some figures and diagrams can be re-expressed in text, but others cannot be • Tables (such as 16.4, a listing of the relative strengths of some conjugate acid-base pairs) can be converted to many lines of CPL text • We are omitting sample & practice exercises from the CPL

  24. Textbook to CPL: Conclusions • Authors can learn to restate the essential parts of a textbook in simple CPL sentences • Most textbook sentences contain ambiguous “generics” about classes of things • For now, we can disambiguate generics by rewriting them as if-then rules in CPL • For most CPL sentences, we can process them to get an adequate logical representation (with some user assistance) • Special problems involving fuzzy statements and hypothetical examples require further work

  25. Step 2: CPL Interpretation [Diagram labels: Corpus; Knowledge Acquisition Loop; Knowledge Integration; Knowledge Repository (“Worldview”); Solidification Loop; Introspection; Robust Reasoning Tasks; “Using what you know to get more”; “or Logic”; “CPL”]

  26. Step 2: CPL Interpretation (overview) • Input: “HCl is immersed in water” • Diagram components: Parser & LF Generator; Linguistic Knowledge; Word sense disambiguator; Relational disambiguator; Coreference identifier; World Knowledge; Structural reorganizer • Output graph: Immerse, with object HCl and is-inside-of Water • Output assertions: (_HCl13320 instance_of HCl-Substance) (_Water13321 instance_of Water) (_Immerse13319 instance_of Move-Into) (_Immerse13319 object _HCl13320) (_Immerse13319 is-inside-of _Water13321)
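
The output assertions on this slide can be read as subject-slot-value triples. The sketch below stores them as Python tuples and answers two simple questions about the event; the helper names `slot_values` and `class_of` are illustrative assumptions and are not part of CPL or KM.

```python
# Triples copied from the slide; the helpers below are illustrative only.
TRIPLES = [
    ("_HCl13320", "instance_of", "HCl-Substance"),
    ("_Water13321", "instance_of", "Water"),
    ("_Immerse13319", "instance_of", "Move-Into"),
    ("_Immerse13319", "object", "_HCl13320"),
    ("_Immerse13319", "is-inside-of", "_Water13321"),
]


def slot_values(triples, subject, slot):
    """Return every value asserted for the given subject and slot."""
    return [value for s, p, value in triples if s == subject and p == slot]


def class_of(triples, instance):
    """Return the class of an instance, or None if it has no instance_of triple."""
    classes = slot_values(triples, instance, "instance_of")
    return classes[0] if classes else None


# What kind of thing is immersed, and what kind of thing is it inside of?
immersed = class_of(TRIPLES, slot_values(TRIPLES, "_Immerse13319", "object")[0])
container = class_of(TRIPLES, slot_values(TRIPLES, "_Immerse13319", "is-inside-of")[0])
```

Note that the word “immersed” has already been mapped to the KB class Move-Into in these triples, which is the word sense disambiguation step at work.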

  27. CPL Processing • Spots coreferences across sentences • Handles nominalizations (“reacts”, “the reaction”) • Use of WordNet to coerce text to KB’s ontology • Set of heuristics for identifying semantic relations • Interprets rules, as well as ground facts • Rules are in KM (inference-capable) • Can be used in interactive or automatic mode

  28. Integrating Knowledge from Reading into a Knowledge Base Michael Glass Boeing Phantom Works

  29. Step 3: Knowledge Integration [Diagram labels: Corpus; Knowledge Acquisition Loop; Knowledge Integration; Knowledge Repository (“Worldview”); Solidification Loop; Introspection; Robust Reasoning Tasks; “Using what you know to get more”; “or Logic”; “CPL”]

  30. The Task • After knowledge is input through CPL it must be integrated with existing knowledge in the KB. • This task carries with it three main problems. • Missing Concepts: The new knowledge may contain concepts the KB does not have. • Conflicting Concepts: The new knowledge may use concepts in the KB in an incompatible way. • Elaboration Requires Updates: The knowledge base may be resistant to new knowledge.

  31. 1. Missing Concepts • Consider: If HCl is immersed in water, then the HCl is dissolving in the water. • The knowledge base does not know the concept “immerse” or “dissolve”. • Even if they were added in a superficial way, the knowledge base would have no axioms allowing it to reason about immersion or dissolving. • This is the simplest type of mismatch. • It is relatively easy to detect. • It simply requires adding the concepts to the KB or the user may restrict himself to concepts the KB already has.

  32. 1. Missing Concepts: Solutions • Add the concepts to the KB. • The concepts could be added selectively as needed and (partially) axiomatized. • The concepts could be added in bulk in a vacuous way. • Defining new concepts through CPL. • Restrict the user to the set of concepts in the KB. • In many cases a similar concept in the knowledge base may serve just as well.

  33. 2. Conflicting Concepts • Two significant classes of conflicting concepts: • Genuine Conflicting Concepts: A concept in the KB may simply not match what is intended. • Naïve Encodings: The KB may expect knowledge structured in a specific way.

  34. 2. Conflicting Concepts • Consider: If HCl is dissolving in water, then each molecule of the HCl is reacting with a molecule of the water. • The knowledge base has a concept of a Reaction, but the inputs are required to be Chemicals (aggregates of molecules), NOT Molecules. • The reaction concept is well axiomatized, but some of these axioms are not applicable to molecules.

  35. 2. Conflicting Concepts • This might be considered another case of missing concepts. • The KB has concepts for reasoning about reactions at the macroscopic level, but not at the molecular level.

  36. 2. Conflicting Concepts • Consider: If an HCl molecule is reacting with a water molecule, then an H-plus ion is transferring from the HCl molecule to the water molecule. • Transferring is in the KB, but it is axiomatized in a way consistent with a person transferring an object to another person. • The object transferred must be in the “possession” of the donor.

  37. 2. Naïve Encodings, A Special Case of Conflicting Concepts • There can be a mismatch between the form in which a new piece of knowledge is stated and the form in which it is expected. • A chemistry textbook might refer to the equilibrium constant of a reaction rather than the equilibrium constant of an equilibrium reaction. • The Chemistry knowledge base expects only equilibrium reactions to have equilibrium constants.

  38. 2. Naïve Encodings • Also the CPL parser may produce KM that is not quite right: • Produced by CPL: (a Chemical with (color (*red))) • Expected by KM: (a Chemical with (color ((a Color-Value with (value (*red)))))) • The difference is purely structural.
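
Because the difference is purely structural, a repair can be mechanical. The sketch below wraps a bare constant in the value class a slot expects, using nested dictionaries as a stand-in for KM frames. The `EXPECTED_WRAPPER` table and the dictionary layout are assumptions made for illustration, and this is not a description of how Loosespeak works.

```python
# Hypothetical table: slots whose fillers are assumed to need a wrapping value class.
EXPECTED_WRAPPER = {"color": "Color-Value"}


def coerce_slot(slot, filler):
    """Wrap a bare constant in the expected value class.

    Fillers that are already structured (dicts), and slots with no
    expected wrapper, are returned unchanged.
    """
    wrapper = EXPECTED_WRAPPER.get(slot)
    if wrapper is None or isinstance(filler, dict):
        return filler
    return {"class": wrapper, "value": filler}


produced = "*red"                       # what CPL produced for the color slot
repaired = coerce_slot("color", produced)
```

Applying the function a second time leaves the result unchanged, so it is safe to run on knowledge that is already in the expected form.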

  39. 2. Conflicting Concepts: Solutions • First the conflict must be detected • Constraint violations: Certainly if a reaction requires Chemical raw-materials and it is given Molecules instead, there must be a conflict. • Resemblance check: If the knowledge looks totally novel, such as a Transfer from one Molecule to another, it may be a conceptual conflict. • To fix a naïve encoding an automatic solution may be used to coerce the concepts into an acceptable form. (James Fan’s Loosespeak)
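
Detection by constraint violation amounts to a type check of each slot filler against the class the slot requires. A minimal sketch follows, assuming a small hand-written class hierarchy and one slot constraint; the class names and the `raw-material` slot are modeled on the slide's wording but are hypothetical as KB identifiers.

```python
# Hypothetical fragment of a class hierarchy (child -> parent).
SUPERCLASS = {
    "HCl-Substance": "Chemical",
    "Water": "Chemical",
    "HCl-Molecule": "Molecule",
}

# Hypothetical constraint: a Reaction's raw-material must be a Chemical.
SLOT_CONSTRAINTS = {("Reaction", "raw-material"): "Chemical"}


def is_a(cls, ancestor):
    """True if cls equals ancestor or inherits from it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUPERCLASS.get(cls)
    return False


def violations(event_class, slots):
    """Return the (slot, filler_class) pairs that break a slot constraint."""
    found = []
    for slot, filler_class in slots:
        required = SLOT_CONSTRAINTS.get((event_class, slot))
        if required is not None and not is_a(filler_class, required):
            found.append((slot, filler_class))
    return found


conflicts = violations(
    "Reaction",
    [("raw-material", "HCl-Molecule"), ("raw-material", "Water")],
)
```

The molecule is flagged while the substance passes, which mirrors the gross-level versus molecular-level mismatch described earlier.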

  40. 3. Elaboration Requiring Updates • Two significant ways a KB may require updates to add knowledge: • Closed World Assumption: The KB may implicitly or explicitly assume it already knows everything about a given topic. • Unstated Assumptions: The KB designers may have built the KB with some assumptions about the circumstances it will reason about.

  41. 3. Updates and the Closed World Assumption • Some rules in the KB close off the KB to further elaboration. • The else part of an if…then…else may give a default value that prevents other components from concluding anything else.

  42. 3. Unstated Assumptions • Often a knowledge base will be built with a set of unstated assumptions about how it will be used. • These assumptions may lead to assertions in the knowledge base stated as always true, but really true only under certain circumstances. • This can result in knowledge that is difficult to extend.

  43. 3. Unstated Assumptions: Example • Consider a knowledge base meant to deal with chemical reactions at room temperature and standard pressure. • This knowledge base might include the assertion that water is a liquid. • (every Water has (state (*liquid))) • However, if the scope of the knowledge base is extended to consider a range of temperatures, the knowledge base is not simply incomplete, but wrong.
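
The water example can be sketched as two functions: one that encodes the assertion unconditionally, and one that makes the hidden assumption an explicit parameter. The function names are hypothetical, the thresholds are the familiar freezing and boiling points of water at standard pressure, and the treatment of the exact boundary temperatures is an arbitrary simplification.

```python
def water_state_unconditional():
    """Mirrors (every Water has (state (*liquid))): true only near room temperature."""
    return "*liquid"


def water_state(temperature_celsius):
    """State of water at standard pressure, with temperature made explicit.

    Exactly 0 and 100 degrees are treated as liquid here for simplicity.
    """
    if temperature_celsius < 0:
        return "*solid"
    if temperature_celsius > 100:
        return "*gas"
    return "*liquid"


room = water_state(25)
freezer = water_state(-10)
```

At 25 degrees both versions agree; once the range of temperatures widens, the unconditional version gives a wrong answer and not merely an incomplete one.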

  44. 3. Elaboration Requiring Updates: Solutions • First the conflict must be detected. • The knowledge base could forward chain for a while on concepts related to the new knowledge. • Resolving the conflict is difficult in general. • It may entail a case of the attribution problem.

  45. Conclusion and Future Directions • Missing Concepts • Allow additions to ontology through CPL • Naïve Encodings • Loosespeak interpreter • Genuine Conflicting Concepts and Elaboration Requiring Updates • Detection through constraint checks, resemblance check and forward chaining. • Correction assisted by feedback.

  46. Acquiring Knowledge from Reading Phil Harrison Boeing Phantom Works

  47. Step 1: Unrestricted NL to CPL - an additional approach [Diagram labels: Corpus; Knowledge Acquisition Loop; Knowledge Integration; Knowledge Repository (“Worldview”); Solidification Loop; Introspection; Robust Reasoning Tasks; “Using what you know to get more”; “or Logic”; “CPL”]

  48. The Problem • Uncontrolled (non CPL) text is difficult for computers to analyze. • Advances are needed in all areas of NLP: grammar, WSD, semantic representation, and discourse processing. • A method is needed for extracting as much knowledge as possible from text.

  49. Tuple extraction • Parse trees are a source of “head-complement” or “head-modifier” relations: • From “The heavy man bought an expensive book”: (S “man” “buy” “book”) (AN “heavy” “man”) (AN “expensive” “book”) • These tuples yield: “Books can be bought” “Men can be heavy” “Books can be expensive” • Even incorrect parses can generate some valid tuples.
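
The step from tuples to generic statements can be sketched as a small template function. The tuple tags S and AN come from the slide; the `PLURAL` and `PARTICIPLE` lookup tables are hypothetical stand-ins for real morphology, covering only the words in this example.

```python
# Hypothetical morphology tables covering only this example.
PLURAL = {"man": "men", "book": "books"}
PARTICIPLE = {"buy": "bought"}


def tuple_to_statement(parse_tuple):
    """Turn one extracted tuple into a generic 'X can be Y' statement."""
    kind = parse_tuple[0]
    if kind == "AN":
        # (AN adjective noun): the noun can have the adjective's property.
        _, adjective, noun = parse_tuple
        return f"{PLURAL[noun].capitalize()} can be {adjective}"
    if kind == "S":
        # (S subject verb object): the object can undergo the verb.
        _, _subject, verb, obj = parse_tuple
        return f"{PLURAL[obj].capitalize()} can be {PARTICIPLE[verb]}"
    raise ValueError(f"unknown tuple type: {kind}")


tuples = [
    ("S", "man", "buy", "book"),
    ("AN", "heavy", "man"),
    ("AN", "expensive", "book"),
]
statements = [tuple_to_statement(t) for t in tuples]
```

Each tuple is handled independently, which is consistent with the slide's point that a partly incorrect parse can still contribute some valid tuples.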

  50. Examples from chemistry • Acids can be strong or weak. • Acids can be vitamins. • Acids can release ions. • Bases can turn litmus. • Carbonates can form CO2. • Chlorides can become ions. • Concentrations can be measured.
