
START: Natural Language Access to Information

START: Natural Language Access to Information. Boris Katz, Gary Borchardt, Sue Felshin, Jimmy Lin, Jerome McFarland, Ali Ibrahim, Luciano Castagnola, Baris Temelkuran, Aaron Fernandes, Alp Simsek, Jonathan Wolfe, Matthew Bilotti MIT Artificial Intelligence Lab


Presentation Transcript


  1. START: Natural Language Access to Information Boris Katz, Gary Borchardt, Sue Felshin, Jimmy Lin, Jerome McFarland, Ali Ibrahim, Luciano Castagnola, Baris Temelkuran, Aaron Fernandes, Alp Simsek, Jonathan Wolfe, Matthew Bilotti MIT Artificial Intelligence Lab http://www.ai.mit.edu/projects/infolab/

  2. I had a dream... ? Library of Congress

  3. Reality
     • What we can do:
       • Understand ordinary sentences and questions
     • What we can’t do (yet):
       1. Full-text NL understanding is still beyond reach
          • Common sense implication
          • Intersentential reference
          • Summarization
       2. Not all information is language; most Web resources are not textual
          • Maps and images
          • Sound and video
          • Multimedia
     • Web resources are distributed across numerous non-traditional databases

  4. Bridging the Gap
     • Example text: “An object at rest tends to remain at rest.” “In 1492, Columbus sailed the ocean blue.” “Four score and seven years ago our forefathers brought forth”
     • [Figure: the example text combined with the Library of Congress]

  5. The Solution: Natural Language Annotations
     • Annotations bridge the gap between our ability to analyze natural language sentences and our desire to access the huge amount of data available in our libraries and on the Web.
     • Annotations are collections of natural language sentences and phrases that describe the content of various information segments.
     • START:
       • analyzes these annotations
       • creates the necessary representational structures
       • produces special pointers to the information segments summarized by the annotations

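  6. Natural Language Annotations
     • Information segment: “... one Mars year lasts 687 Earth days.”
     • Annotation (supplied by the annotator): “Mars’s year is long.”
     • START knowledge base: [Figure: graph over “year”, “Mars”, and “long” with links “is” and “related-to”], plus a pointer to the information segment
     • Questions (from the user):
       • “How long is the Martian year?”
       • “How long is a year on Mars?”
       • “How many days are in a Martian year?”
       • …
     • Returned to the user: “... one Mars year lasts 687 Earth days.”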
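The annotation mechanism on this slide can be sketched in a few lines of Python. This is a minimal illustration, not START's implementation: the `Annotation` class, the `retrieve` function, the subset test, and the hand-written triples (standing in for parser output) are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    texps: frozenset   # triples derived from the annotation sentence (hand-written here)
    segment: str       # the information segment the annotation points to

# Hypothetical knowledge base holding the single annotation from the slide.
KB = [
    Annotation(
        texps=frozenset({("year-1", "related-to", "Mars"), ("year-1", "is", "long")}),
        segment="... one Mars year lasts 687 Earth days.",
    )
]

def retrieve(query_texps, kb):
    """Return every segment whose annotation contains all of the query triples."""
    wanted = set(query_texps)
    return [ann.segment for ann in kb if wanted <= ann.texps]

# A question about the Martian year, already reduced to a triple by hand.
answers = retrieve([("year-1", "related-to", "Mars")], KB)
```

The point of the design is that matching happens against the short annotation, while the answer shown to the user is the (arbitrarily complex) segment it points to.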

  7. Parsing
     • Sentence: “A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.”
     • [Figure: parse tree of the sentence, rooted at S, with NP, VP, and PP constituents; labels include det, noun, prep, and quantity, with part-of-speech tags N and V]

  8. Ternary expressions (T-expressions)
     • Sentence: “A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.”
     • T-expressions:
       <chain-1 related-to reactions-1>
       <molecules-5 related-to pyruvate-1>
       <molecules-5 quantity 2>
       <molecules-5 is smaller>
       <molecule-1 related-to glucose-1>
       <molecule-1 quantifier each>
       <<chain-1 convert molecule-1> into molecules-5>
     • [Figure: the same T-expressions drawn as a graph]

  9. T-expression Representation
     • List of node-link-node triples
     • Nouns, adjectives are nodes
     • Links cover:
       • relationships between verbs and their arguments
       • fundamental semantic relationships: “is-a” (for equality, membership, and subclass relationships), “related-to” (for possessives, etc.)
       • modification of nouns: “quantifier”, “quantity”, “is” (for adjectives)
       • prepositions
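The node-link-node representation can be sketched as a small Python data structure. The `TExp` class and its field names are assumptions made for illustration; only the triples themselves come from the slides. A triple may itself appear as a node, which is how the preposition “into” attaches to the whole “convert” relation.

```python
from typing import NamedTuple, Union

class TExp(NamedTuple):
    """One node-link-node triple; a node may be a word or a nested triple."""
    subject: Union[str, "TExp"]
    relation: str
    obj: Union[str, "TExp"]

    def __str__(self):
        return f"<{self.subject} {self.relation} {self.obj}>"

# "A chain of reactions converts each molecule of glucose
#  into two smaller molecules of pyruvate."
convert = TExp("chain-1", "convert", "molecule-1")
assertion = [
    TExp("chain-1", "related-to", "reactions-1"),
    TExp("molecules-5", "related-to", "pyruvate-1"),
    TExp("molecules-5", "quantity", "2"),
    TExp("molecules-5", "is", "smaller"),
    TExp("molecule-1", "related-to", "glucose-1"),
    TExp("molecule-1", "quantifier", "each"),
    TExp(convert, "into", "molecules-5"),   # nested triple as subject
]

for t in assertion:
    print(t)
```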

  10. S-rules for Structural Variation
     • “The president impressed the country with his determination.”
     • “The president’s determination impressed the country.”
     • S-rule for the Property Factoring alternation:
       someone1 emotional-reaction-verb someone2 with something
       someone1’s something emotional-reaction-verb someone2
     • Emotional reaction verbs: surprise, stun, amaze, startle, impress, please, embarrass, annoy, etc.
     • [Figure: the two forms drawn as T-expression graphs over someone1, someone2, and something, with links “related-to”, “with”, and the emotional-reaction verb]
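One way to picture an S-rule is as a rewrite over triples. The sketch below is an assumption-laden illustration: the tuple encoding, the function name `property_factoring`, and the exact triples chosen for the two sentences are not taken from the slides, which show the rule only as a diagram. The verb list is the one given on the slide.

```python
EMOTIONAL_REACTION_VERBS = {
    "surprise", "stun", "amaze", "startle",
    "impress", "please", "embarrass", "annoy",
}

def property_factoring(texps):
    """Add the alternate form licensed by the Property Factoring rule.

    Looks for  <<someone1 verb someone2> with something>
    together with  <something related-to someone1>
    and derives  <something verb someone2>.
    """
    facts = set(texps)
    derived = set()
    for subj, rel, obj in texps:
        if rel == "with" and isinstance(subj, tuple):
            someone1, verb, someone2 = subj
            owned = (obj, "related-to", someone1) in facts
            if verb in EMOTIONAL_REACTION_VERBS and owned:
                derived.add((obj, verb, someone2))
    return facts | derived

# "The president impressed the country with his determination."
sentence = [
    (("president", "impress", "country"), "with", "determination"),
    ("determination", "related-to", "president"),
]
expanded = property_factoring(sentence)
# expanded now also contains the triple for
# "The president's determination impressed the country."
```

Restricting the rule to a verb class keeps it from firing on sentences such as the glucose example, where “convert ... into” does not take part in this alternation.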

  11. Sample Assertion
     • Sentence: “A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.”
     • T-expressions:
       <chain-1 related-to reactions-1>
       <molecules-5 related-to pyruvate-1>
       <molecules-5 quantity 2>
       <molecules-5 is smaller>
       <molecule-1 related-to glucose-1>
       <molecule-1 quantifier each>
       <<chain-1 convert molecule-1> into molecules-5>
     • [Figure: the same T-expressions drawn as a graph]

  12. Sample Query
     • Question: “How are the glucose molecules converted into pyruvate molecules?”
     • T-expressions:
       <molecules-5 related-to pyruvate-1>
       <molecules-1 related-to glucose-1>
       <<something convert molecules-1> into molecules-5>
     • [Figure: the same T-expressions drawn as a graph, with “something” in place of the unknown subject]

  13. Matching
     • The Matcher compares T-expressions from the query with T-expressions from the assertion.
     • [Figure: the assertion graph with the query’s “something” node aligned against it. Key: Input Processing, Query Processing]
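The matching step can be sketched as a small pattern matcher in which “something” acts as a variable. This is a simplified illustration, not START's matcher: the function names are invented, the search is greedy with no backtracking, and the sketch assumes the query's `molecules-1` has already been resolved to the assertion's `molecule-1` (the slides do not show how that is done).

```python
VARIABLE = "something"

def match_term(q, a, bindings):
    """Match one query term against one assertion term; return bindings or None."""
    if q == VARIABLE:
        if VARIABLE in bindings and bindings[VARIABLE] != a:
            return None
        return {**bindings, VARIABLE: a}
    if isinstance(q, tuple) and isinstance(a, tuple):
        return match_triple(q, a, bindings)
    return bindings if q == a else None

def match_triple(q, a, bindings):
    """Match a (possibly nested) query triple against an assertion triple."""
    for q_term, a_term in zip(q, a):
        bindings = match_term(q_term, a_term, bindings)
        if bindings is None:
            return None
    return bindings

def match_query(query, assertion):
    """Every query triple must match some assertion triple (greedy search)."""
    bindings = {}
    for q in query:
        for a in assertion:
            result = match_triple(q, a, bindings)
            if result is not None:
                bindings = result
                break
        else:
            return None   # this query triple matched nothing
    return bindings

assertion = [
    ("chain-1", "related-to", "reactions-1"),
    ("molecules-5", "related-to", "pyruvate-1"),
    ("molecules-5", "quantity", "2"),
    ("molecules-5", "is", "smaller"),
    ("molecule-1", "related-to", "glucose-1"),
    ("molecule-1", "quantifier", "each"),
    (("chain-1", "convert", "molecule-1"), "into", "molecules-5"),
]
query = [
    ("molecules-5", "related-to", "pyruvate-1"),
    ("molecule-1", "related-to", "glucose-1"),
    ((VARIABLE, "convert", "molecule-1"), "into", "molecules-5"),
]
bindings = match_query(query, assertion)
```

A successful match binds “something” to the node that answers the question, and the matched assertion can then be handed to the generator.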

  14. A. Reply by Generating
     • Flow: Ternary expressions → Generator → Displayed Answer
     • Query: How are the glucose molecules converted into pyruvate molecules?
     • Answer: A chain of reactions converts each molecule of glucose into two smaller molecules of pyruvate.
     • [Figure: the matched T-expression graph from the sample assertion]

  15. Reply by Generating: Example

  16. B. Reply from annotation
     • Flow: Ternary expressions → Find resource → Annotated resource → Displayed Answer
     • Query: Show me a picture of Cog.
     • [Figure: T-expression graph linking “picture” and “Cog” with “related-to”]

  17. Reply from annotation: Example

  18. C. Reply from annotation with script
     • Flow: T-exps → Find resource → Run script → Displayed Answer
     • Query: Who directed Gone with the Wind?
     • [Figure: T-expression graph linking “any-person” and “any-IMDb-movie” with “directs”]
     • Script:
       • get http://us.imdb.com/Details?0031381
       • match regexp...
     • Answer: Gone with the Wind (1939) was directed by George Cukor, Victor Fleming, and Sam Wood. Source: The Internet Movie Database
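The “match regexp” step can be illustrated without touching the network. Everything in this sketch is hypothetical: the slide elides the real regular expression, so the patterns, the `SAMPLE_PAGE` string (a made-up stand-in for the fetched page), and the function names are assumptions. Only the final answer sentence comes from the slide.

```python
import re

# Made-up stand-in for the page a script would fetch; not real IMDb markup.
SAMPLE_PAGE = (
    "<title>Gone with the Wind (1939)</title> "
    "<b>Directed by</b> George Cukor, Victor Fleming, Sam Wood<br>"
)

def join_names(names):
    """Join names as English prose: 'A', 'A and B', or 'A, B, and C'."""
    if len(names) <= 2:
        return " and ".join(names)
    return ", ".join(names[:-1]) + ", and " + names[-1]

def directed_by(page):
    """Extract title and directors with regexps, then phrase the answer."""
    title = re.search(r"<title>(.*?)</title>", page).group(1)
    raw = re.search(r"Directed by</b>\s*(.*?)<br>", page).group(1)
    names = [name.strip() for name in raw.split(",")]
    return f"{title} was directed by {join_names(names)}."

answer = directed_by(SAMPLE_PAGE)
```

In a deployed script the page text would come from the “get” step, and the patterns would have to track the source site's markup.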

  19. Reply from annotation with script: Example

  20. Uniform Access
     • [Figure: NL questions go to START; START sends queries to Omnibase; Omnibase draws data from IMDb, U.S. Census, Webster, POTUS, and NASA; START returns multimedia responses]
     • Local knowledge base of ternary expressions
     • Core vocabulary
     • Uniform interface to multiple database formats (Web, text, etc.)
     • Integration time independent of size of database
     • Extended lexicon
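A uniform interface over heterogeneous sources can be sketched as a registry of adapters. This is only an illustration of the idea: the `UniformAccess` class, the object/attribute lookup signature, the `"planets"` source name, and the in-memory tables are assumptions, and the two facts are copied from earlier slides rather than fetched from any real service.

```python
from typing import Callable, Dict

# An adapter answers (object, attribute) lookups for one source.
Adapter = Callable[[str, str], str]

def table_adapter(table) -> Adapter:
    """Wrap an in-memory table as an adapter (stand-in for a real wrapper script)."""
    def lookup(obj: str, attribute: str) -> str:
        return table[(obj, attribute)]
    return lookup

class UniformAccess:
    """One query interface, however each source stores its data."""

    def __init__(self) -> None:
        self._sources: Dict[str, Adapter] = {}

    def register(self, name: str, adapter: Adapter) -> None:
        self._sources[name] = adapter

    def get(self, source: str, obj: str, attribute: str) -> str:
        return self._sources[source](obj, attribute)

hub = UniformAccess()
hub.register("IMDb", table_adapter({
    ("Gone with the Wind (1939)", "director"):
        "George Cukor, Victor Fleming, and Sam Wood",
}))
hub.register("planets", table_adapter({   # hypothetical source name
    ("Mars", "year-length"): "687 Earth days",
}))

print(hub.get("IMDb", "Gone with the Wind (1939)", "director"))
```

Adding a source means writing one adapter, regardless of how many records the source holds, which is one reading of “integration time independent of size of database”.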

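  21. How START works
     • [Figure: system diagram]
     • Web browser: sends English to START and receives HTML
     • START components: Parser, Matcher, Generator, Annotations, Database of T-exps (native knowledge)
     • Data passed between components: English, Input T-exps, T-exps from KB
     • Omnibase (external knowledge): Scripts reaching Potus, IMDb, U.S. Census, and World Factbook on the WWW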

  22. Multi-Modal Interaction
     • Q. "I'd like to speak to Trevor."
     • Q. "Is Trevor in his office?"
     • A. "Trevor is in his office but he is on the phone."
     • A. "Trevor is in his office but he is talking to Boris now."
     • A. "Trevor is in his office; however, he doesn't want to be disturbed until 2pm."
