Machine Reading from Multiple Texts

Peter Clark and John Thompson, Boeing Research and Technology

Presentation Transcript


  1. Machine Reading from Multiple Texts
     Peter Clark and John Thompson, Boeing Research and Technology

  2. What is Machine Reading?
     Text: “A soldier was killed in a gun battle”
     Implied: The soldier died. The soldier was shot. There was a fight. …
     • Not (just) parsing, fact extraction
     • Construction of a coherent representation of the scene the text describes
     • Challenge: much of that representation is not in the text

  3. What we are trying to do:
     • Multiple text approach:
       • Reduce need for precision/coverage on individual texts
       • Assess confidence using redundancy
       • Exploit the vast amount of text available
     • Domains: 2 stroke engines, Pearl Harbor

  4. What we’re trying to do: 2 Stroke engines
     Multiple Input Texts:
       ...the mixture of fuel and air in the cylinder has been compressed. This mixture ignites when the spark plug generates a spark. Igniting the mixture causes an explosion. The explosion forces the piston down....
       ...The piston compresses the air-fuel mixture in the combustion chamber. The vacuum in the crankcase sucks a fresh mixture of air-fuel-oil into the cylinder. The spark from the spark plug begins the combustion stroke…
     Output: Single, Coherent Representation:
       Compress mixture
       Suck in fresh mixture
       Generate spark
       Ignite mixture
       Mixture explodes

  5. What we’re trying to do: Pearl Harbor
     Multiple Input Texts:
       …at 6am, the first attack wave of 183 Japanese planes takes off from the carriers and heads for Pearl Harbor. At 7:53am the first Japanese assault wave commences the attack, targeting airfields and battleships. Eight battleships are damaged, with five sunk…
       As the sun was just beginning to rise, a fleet of Japanese forces were taking off from carriers in various locations in the Pacific. At 7:55am, just as many islanders were waking up for breakfast, the first Japanese bomb was dropped on Wheeler Field, eight miles from Pearl Harbor…. Most planes returned to their carriers intact…
     Output: Single, Coherent Representation:
       Japanese planes take off
       Planes fly to Pearl Harbor
       Planes bomb airfields & ships
       Eight battleships damaged
       Planes return

  6. Incredibly Challenging
     • Basic language processing is hard
       → Need high-quality language engine
     • Multiple alignments and implications of text
       → Treat reading as model building, not fact extraction
     • Multiple viewpoints/perspectives
       → Knowledge-guided model extraction process

  7. Basic language processing is hard
     • Usual suspects: syntax, word sense disambiguation (WSD), semantic role labeling (SRL), logical form (LF), named entities (NE), …
     • Discourse structure contains much implicit knowledge (e.g., parts, event ordering)
     Example text:
       A two-stroke engine's combustion stroke occurs when the spark plug fires. At the beginning of the combustion stroke, the mixture of fuel and air in the cylinder has been compressed. This mixture ignites when the spark plug generates a spark. Igniting the mixture causes an explosion. The explosion forces the piston down. The piston compresses the mixture in the crankcase as it moves down. As the piston approaches the bottom of its stroke, the exhaust port is uncovered. The pressure in the cylinder forces exhaust gases out of the cylinder. As the piston reaches the bottom of the cylinder, the intake port is uncovered. The piston's movement pressurizes the mixture in the crankcase. The mixture displaces the burned gases in the cylinder.

  8. Incredibly Challenging
     • Basic language processing is hard
       → Need high-quality language engine
     • Multiple alignments and implications of text
       → Treat reading as model building, not fact extraction
     • Multiple viewpoints/perspectives
       → Knowledge-guided model extraction process

  9. Want:

  10. Finding Equivalences, Entailments, and Matches
      • Basic operation: relating (then integrating) texts
      ...The piston compresses the air-fuel mixture in the combustion chamber. The vacuum in the crankcase sucks a fresh mixture of air-fuel-oil into the cylinder. The spark from the spark plug begins the combustion stroke…
      ...the mixture of fuel and air in the cylinder is compressed. This mixture ignites when the spark plug generates a spark. Igniting the mixture causes an explosion….

  11. Finding Equivalences, Entailments, and Matches
      T: ...The piston compresses the air-fuel mixture in the combustion chamber. The vacuum in the crankcase sucks a fresh mixture of air-fuel-oil into the cylinder. The spark from the spark plug begins the combustion stroke…
      ...the mixture of fuel and air in the cylinder is compressed. This mixture ignites when the spark plug generates a spark. Igniting the mixture causes an explosion….

  12. Finding Equivalences, Entailments, and Matches
      T: ...The piston compresses the air-fuel mixture in the combustion chamber. The vacuum in the crankcase sucks a fresh mixture of air-fuel-oil into the cylinder. The spark from the spark plug begins the combustion stroke…
      Relationship between T and H: = ? → ? ← ? ?
      H: ...the mixture of fuel and air in the cylinder is compressed. This mixture ignites when the spark plug generates a spark. Igniting the mixture causes an explosion….
      Textual “Entailment” = The “Modus Ponens” of NLU

  13. Recognizing Textual Entailment (RTE)
      • Task: does H “reasonably” follow from T?
        • (or: what is the relationship between T and H?)
      • Annual RTE competition for 4 years
      • Is very difficult, and largely unsolved still
        • typical scores ~50%-70% (baseline is 50%)
        • RTE4 (2008): Mean score was 57.5%
      T: The piston's movement pressurizes the mixture.
      H: The piston compresses the mixture.

  14. Examples
      A few are easy(ish)….
        T: The piston's movement pressurizes the mixture.
        H: The piston compresses the mixture.
      but most are difficult…
        T: A 1,760 pound armor-piercing shell slammed through the deck and hit the ship’s forward ammunition magazine.
        H: A 1,760 pound bomb penetrated into the front of the ship.

  15. Boeing’s RTE System
      • Interpret texts using BLUE (Boeing Language Understanding Engine)
      • See if:
        • H subsumes (is implied by) T
          H: “An animal eats a mouse” ← T: “A black cat eats a mouse”
        • H subsumes an elaboration of T
          H: “An animal digests a mouse” ← T: “A black cat eats a mouse”
          via IF X eats Y THEN X digests Y
      Two sources of World Knowledge
      • WordNet subsumption and part of speech relations
      • DIRT paraphrases

  16. BLUE’s Pipeline
      Input: “Igniting the mixture causes an explosion.”
      Parse + Logical form:
        (DECL ((VAR _X1 "the" "mixture") (VAR _X2 NIL (S (ING) NIL "ignite" _X1)) (VAR _X3 "an" "explosion")) (S (PRESENT) _X2 "cause" _X3))
      Initial Logic:
        "mixture"(mixture01), "ignite"(ignite01), sobject(ignite01,mixture01), "explosion"(explosion01), "cause"(cause01), subject(cause01,ignite01), sobject(cause01,explosion01).
      Final Logic:
        isa(mixture01,mixture_n1), isa(ignite01,light_v4), isa(explosion01,explosion_n1), causes(ignite01,explosion01), object(ignite01,mixture01).

  17. “Lexico-semantic inference”
      • Subsumption
      T: A black cat ate a mouse
         subject(eat01,cat01), object(eat01,mouse01), mod(cat01,black01)
      H: A mouse was eaten by an animal
         “by”(eat01,animal01), object(eat01,mouse01)
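
The subsumption test on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not BLUE's implementation: the `HYPERNYMS` table is a hypothetical stand-in for WordNet, facts are simplified to (relation, head, argument) triples over words rather than instances, and the passive “by” in H is assumed to have already been normalized to `subject`.

```python
# Hypothetical is-a links standing in for WordNet hypernyms.
HYPERNYMS = {
    "cat": "feline",
    "feline": "animal",
    "mouse": "rodent",
    "rodent": "animal",
}


def is_a(specific, general):
    """True if `specific` equals `general` or reaches it via hypernym links."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = HYPERNYMS.get(node)
    return False


def subsumes(h_facts, t_facts):
    """True if every H fact is matched by a T fact that is at least as specific."""
    return all(
        any(
            t_rel == h_rel and is_a(t_head, h_head) and is_a(t_arg, h_arg)
            for t_rel, t_head, t_arg in t_facts
        )
        for h_rel, h_head, h_arg in h_facts
    )


# T: A black cat ate a mouse
T = [("subject", "eat", "cat"), ("object", "eat", "mouse"), ("mod", "cat", "black")]
# H: A mouse was eaten by an animal (passive "by" assumed normalized to subject)
H = [("subject", "eat", "animal"), ("object", "eat", "mouse")]

print(subsumes(H, T))
```

Note that the extra `mod` fact in T does not block the match: H only has to be covered by T, not the other way round.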

  18. With Inference…
      T: A black cat ate a mouse
      Rules:
        IF X isa cat THEN X has a tail
        IF X eats Y THEN X digests Y
      T’: A black cat ate a mouse. The cat has a tail. The cat digests the mouse. The cat chewed the mouse. The cat is furry. ….

  19. With Inference…
      T: A black cat ate a mouse
      Rules:
        IF X isa cat THEN X has a tail
        IF X eats Y THEN X digests Y
      T’: A black cat ate a mouse. The cat has a tail. The cat digests the mouse. The cat chewed the mouse. The cat is furry. ….
      Subsumes
      H: An animal digested the mouse.
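
The elaboration step (T to T’) can be sketched as simple forward chaining over IF/THEN rules. This is an illustrative sketch only: the triple encoding, the `?x` variable syntax, and the function names are assumptions made for the example, not the system's actual rule format.

```python
def match(pattern, fact):
    """Return variable bindings if `fact` fits `pattern`, else None."""
    bindings = {}
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # A variable must bind consistently within one pattern.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def substitute(pattern, bindings):
    """Replace variables in `pattern` with their bound values."""
    return tuple(bindings.get(term, term) for term in pattern)


def elaborate(facts, rules):
    """Apply single-premise rules until no new facts appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for fact in list(known):
                bindings = match(premise, fact)
                if bindings is None:
                    continue
                new_fact = substitute(conclusion, bindings)
                if new_fact not in known:
                    known.add(new_fact)
                    changed = True
    return known


# T: A black cat ate a mouse (hypothetical triple encoding)
T_FACTS = [
    ("isa", "cat01", "cat"),
    ("isa", "mouse01", "mouse"),
    ("eat", "cat01", "mouse01"),
]
RULES = [
    (("isa", "?x", "cat"), ("has", "?x", "tail")),    # IF X isa cat THEN X has a tail
    (("eat", "?x", "?y"), ("digest", "?x", "?y")),    # IF X eats Y THEN X digests Y
]

T_PRIME = elaborate(T_FACTS, RULES)
for fact in sorted(T_PRIME):
    print(fact)
```

The elaborated set T’ now contains the `digest` fact, so a subsumption check against H (“An animal digested the mouse”) has something to match.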

  20. Acquiring paraphrase/inference rules
      • Where do the rules come from?
        • paraphrasing technology can learn these, e.g., DIRT
      IF X loves Y THEN X likes Y
      [Figure: word-frequency histograms of the X and Y slot fillers (table, chair, bed, cat, dog, Fred, Sue, person) for “X loves Y” and “X falls to Y”, marked “?”]

  21. Acquiring paraphrase/inference rules
      • Where do the rules come from?
        • paraphrasing technology can learn these, e.g., DIRT
      IF X loves Y THEN X likes Y
      [Figure: word-frequency histograms of the X and Y slot fillers (table, chair, bed, cat, dog, Fred, Sue, person) for “X loves Y” and “X falls to Y”]

  22. Acquiring paraphrase/inference rules
      • Where do the rules come from?
        • paraphrasing technology can learn these, e.g., DIRT
      IF X loves Y THEN X likes Y
      [Figure: word-frequency histograms of the X and Y slot fillers (table, chair, bed, cat, dog, Fred, Sue, person) for “X loves Y” and “X likes Y”, marked “?”]

  23. Acquiring paraphrase/inference rules
      • Where do the rules come from?
        • paraphrasing technology can learn these, e.g., DIRT
      IF X loves Y THEN X likes Y
      [Figure: word-frequency histograms of the X and Y slot fillers (table, chair, bed, cat, dog, Fred, Sue, person) for “X loves Y” and “X likes Y”]
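
The idea behind these histograms, that two patterns are likely paraphrases when similar words fill their X and Y slots, can be sketched as follows. The counts are invented for illustration, and plain cosine similarity over raw counts is a simplifying stand-in: DIRT as published by Lin and Pantel weights fillers by mutual information rather than raw frequency.

```python
from collections import Counter
from math import sqrt

# Hypothetical slot-filler counts; not taken from any real corpus.
SLOT_FILLERS = {
    "X loves Y": {
        "X": Counter({"Fred": 5, "Sue": 4, "person": 3, "dog": 1}),
        "Y": Counter({"Sue": 4, "Fred": 3, "dog": 3, "cat": 2}),
    },
    "X likes Y": {
        "X": Counter({"Fred": 4, "Sue": 5, "person": 2, "cat": 1}),
        "Y": Counter({"Sue": 2, "Fred": 3, "dog": 4, "cat": 3, "chair": 1}),
    },
    "X falls to Y": {
        "X": Counter({"chair": 3, "table": 2, "cat": 1}),
        "Y": Counter({"bed": 2, "table": 1, "person": 1}),
    },
}


def cosine(a, b):
    """Cosine similarity between two filler-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pattern_similarity(p1, p2):
    """Geometric mean of the X-slot and Y-slot similarities."""
    sim_x = cosine(SLOT_FILLERS[p1]["X"], SLOT_FILLERS[p2]["X"])
    sim_y = cosine(SLOT_FILLERS[p1]["Y"], SLOT_FILLERS[p2]["Y"])
    return sqrt(sim_x * sim_y)


print(pattern_similarity("X loves Y", "X likes Y"))
print(pattern_similarity("X loves Y", "X falls to Y"))
```

With these toy counts “X loves Y” and “X likes Y” come out as similar, while “X loves Y” and “X falls to Y” share no X-slot fillers and score zero. A high score would license a candidate rule such as IF X loves Y THEN X likes Y, which is also why purely distributional rules can be noisy.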

  24. Some selected paraphrases from DIRT
      IF Sergei organizes a symposium THEN:
        Sergei promotes a symposium.
        Sergei participates in a symposium.
        Sergei makes preparations for a symposium.
        Sergei intensifies a symposium.
        Sergei denounces a symposium.
        Sergei urges a boycott of a symposium.

  25. Good Entailments and Alignments
      ...The pressure in the cylinder displaces the burned gases from cylinder….
      (DIRT) IF Y is displaced from X THEN Y pours out of X
      the burned gases pour out of the cylinder
      (WordNet)
      …Burned gases flow out of the cylinder through the exhaust port….

  26. Good Entailments and Alignments
      …The piston’s movement pressurizes the mixture in the crankcase….
      (DIRT) IF X’s movement changes Y THEN X changes Y
      the piston pressurizes the mixture
      (WordNet)
      ...The piston compresses the mixture in the crankcase….

  27. Bad Entailments
      ...The burned air-fuel mixture exits the cylinder through the exhaust port…
      (DIRT) IF X exits Y THEN X squeezes into Y
      the mixture squeezes into the cylinder
      (WordNet)
      The air-fuel mixture goes into the cylinder as the piston moves….

  28. Other entailments
      ... Following the explosion, the exploding gases push the piston, forcing it down the cylinder…
        the piston is moved down by the gases
        the gases drive the piston
        the gases pull the piston
        the piston militates against the gases
        …….

  29. The Bottom Line
      • Simply finding local alignments, and computing local implications, is not enough
        • Machine-learned world knowledge is too noisy
        • Local decisions are unacceptably error-prone
      • Reading is not (just) a set of local processes
      • Rather: Also need a “global” aspect:
        Machine Reading = a process of model formation
        • a search for a “most coherent” set of facts

  30. Text:
        The exploding gases push the piston down the cylinder…
        The explosion of the gases drive the piston…
      Interpretation:
        The gases push the piston down.
        The gases drive the piston.
      Entailments:
        The gases pull the piston.
        The gases propel the piston.
        The gases are moved by the piston.
        The gases push the piston.
        The gases race the piston.

  31. Text:
        The exploding gases push the piston down the cylinder…
        The explosion of the gases drive the piston…
      Interpretation:
        The gases push the piston down.
        The gases drive the piston.
      Entailments:
        The gases pull the piston.
        The gases propel the piston.
        The gases are moved by the piston.
        The gases push the piston.
        The gases race the piston.

  32. Texts:
        The exploding gases push the piston down the cylinder…
        The explosion of the gases drive the piston…
      Candidate facts:
        The gases push the piston down.
        The gases drive the piston.
        The gases pull the piston.
        The gases propel the piston.
        The gases are moved by the piston.
        The gases push the piston.
        The gases race the piston.
      Best, consistent subset of elaborations = Overall, integrated theory

  33. Is a Markov-based search process:
      • Can transform this to a satisfiability problem…
      • Maximize (weighted) number of happy (satisfied) formulae!
      Propositions (“Things we’d like to be true”):
        P1: gases push piston down
        P2: gases drive piston
        P3: gases pull piston
        P4: gases propel piston
      Formulae and weights:
        P1              weight ∞    given fact
        P2              weight ∞    given fact
        P1 → P3         weight 10   DIRT rule
        P1 → P4         weight 8    DIRT rule
        P2 → P4         weight 10   DIRT rule
        not (P1 & P3)   weight ∞    inconsistent: facts can’t both hold

  34. Is a Markov-based search process:
      • Can transform this to a satisfiability problem…
      • Maximize (weighted) number of happy (satisfied) formulae!
      Propositions (“Things we’d like to be true”):
        P1: gases push piston down
        P2: gases drive piston
        P3: gases pull piston
        P4: gases propel piston
      Best assignment: P1 = t, P2 = t, P3 = f, P4 = t
      Formulae, weights, and results under the best assignment:
        P1              weight ∞    given fact                            t
        P2              weight ∞    given fact                            t
        P1 → P3         weight 10   DIRT rule                             f
        P1 → P4         weight 8    DIRT rule                             t
        P2 → P4         weight 10   DIRT rule                             t
        not (P1 & P3)   weight ∞    inconsistent: facts can’t both hold   t
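
For a problem this small, the best assignment can be checked by brute force. The sketch below enumerates all 16 truth assignments, treats the ∞-weighted formulae as hard constraints, and maximizes the total weight of the satisfied soft formulae. It is exhaustive search for illustration only, not the Markov-based search the slide refers to, and all names in it are made up for the example.

```python
from itertools import product

INF = float("inf")
NAMES = ["P1", "P2", "P3", "P4"]

# (weight, formula) pairs from the slide; infinite weight means a hard constraint.
FORMULAE = [
    (INF, lambda p: p["P1"]),                       # given fact
    (INF, lambda p: p["P2"]),                       # given fact
    (10, lambda p: (not p["P1"]) or p["P3"]),       # P1 -> P3 (DIRT rule)
    (8, lambda p: (not p["P1"]) or p["P4"]),        # P1 -> P4 (DIRT rule)
    (10, lambda p: (not p["P2"]) or p["P4"]),       # P2 -> P4 (DIRT rule)
    (INF, lambda p: not (p["P1"] and p["P3"])),     # P1 and P3 can't both hold
]


def score(assignment):
    """Total weight of satisfied soft formulae, or None if a hard one fails."""
    total = 0
    for weight, formula in FORMULAE:
        if formula(assignment):
            if weight != INF:
                total += weight
        elif weight == INF:
            return None
    return total


def best_assignment():
    """Exhaustively search all truth assignments for the highest score."""
    best, best_score = None, -1
    for values in product([True, False], repeat=len(NAMES)):
        assignment = dict(zip(NAMES, values))
        s = score(assignment)
        if s is not None and s > best_score:
            best, best_score = assignment, s
    return best, best_score


BEST, BEST_SCORE = best_assignment()
print(BEST, BEST_SCORE)
```

The search recovers the slide's answer (P1 = t, P2 = t, P3 = f, P4 = t): the noisy rule P1 → P3 is the one formula sacrificed, because satisfying it would violate a hard constraint. Exhaustive enumeration grows as 2^n, so realistic problems need a proper weighted satisfiability or probabilistic solver.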

  35. Incredibly Challenging
      • Basic language processing is hard
        → Need high-quality language engine
      • Many possible equivalences and implications
        → Treat reading as model building, not fact extraction
      • Multiple viewpoints/perspectives
        → Knowledge-guided model extraction process

  36. Want:

  37. Got:
        Fuchida shouted “Tora! Tora!”
        The ships reached position.
        It was a sunny day.
        The attack was audacious.
      • Do have coherent, supported facts
      • BUT:
        • There’s a lot going on in any scene!
        • Multiple viewpoints and levels of detail

  38. Expectations/Scripts
      • Better: Use world knowledge to guide what to look for
        • e.g., scripts of generalized event sequences

  39. Expectations/Scripts
      (Entailment-like reasoning again!)

  40. System can still make mistakes…
      WordNet sandwich#n2: “submarine”, “hoagie”, “torpedo”, “sandwich”, “poor boy”, “bomber”: a large sandwich made with meat and cheese
      ….torpedoes attacked Pearl Harbor…
      ….bombers attacked Pearl Harbor…
      ….Japanese submarines attacked Pearl Harbor…
      Pearl Harbor is being attacked by sandwiches (!)

  41. Summary
      • Machine Reading from multiple texts
        • tolerate gaps, ambiguity, errors through redundancy
      • Three critical requirements
        • High-quality language engine
        • Reading as model building, not fact extraction
          • entailment technology as “modus ponens” of NLU
          • search for coherence to overcome (many) local errors
        • Knowledge-guided model extraction process
          • expectations to guide what to look for
      • Implications of success are huge!
      Thank you!
