Utilize START, a cutting-edge system that enables multimedia information access through natural language annotations. Access text, images, movies, and more by asking questions in English. START provides precise answers without overwhelming you with extraneous information, facilitating virtual collaboration and efficient data retrieval. Benefit from natural language annotations that bridge the gap between our linguistic capabilities and the vast data available online. Discover how START seamlessly combines external knowledge sources and databases to provide uniform access to diverse information formats. Learn how this system processes annotations to streamline information retrieval, transforming how you interact with the digital world.
The START Information Access System
Boris Katz
http://www.ai.mit.edu/projects/infolab/
The Problem:
• Finding information online
• Two approaches:
  1. Keyword search (search engines, e.g., AltaVista)
  2. Natural language processing
What’s Wrong with Natural Language Processing (today)?
1. Too hard
  • Full-text NL understanding is still beyond reach
    • Intersentential reference
    • Paraphrasing
    • Summarization
    • Common sense implication
2. Too slow
3. Not all information is language
  • Most Web resources are not textual
    • Maps and images
    • Sound and video
    • Multimedia
  • Web resources are distributed across numerous non-traditional databases
What is START?
• START (SynTactic Analysis using Reversible Transformations) provides multimedia information access using natural language.
• Natural language
  • Natural language is human language. You don’t have to learn a special language to use START. Ask your questions in English; enter information using English.
• Multimedia access using natural language annotations
  • START lets you use English to access any kind of information: text, pictures, movies, and more.
• “Just the right information”
  • START gives you the answer you want without including a thousand other answers.
• Virtual collaboration
  • START retrieves information from its own knowledge base and from databases all over the Web.
Natural Language Annotations
• Natural language annotations bridge the gap between our ability to analyze natural language sentences and other information, and our desire to access the huge amount of data now available on the Web.
• Annotations are collections of natural language sentences and phrases that describe the content of various information segments.
• START
  • analyzes these annotations,
  • creates the necessary representational structures, and
  • produces special pointers to the information segments summarized by the annotations.
Natural Language Annotations
[Diagram: an Information Provider attaches the annotation “Neptune was discovered using mathematics.” to a document, which is submitted to a START server. An Information Seeker asks the question “How was Neptune discovered?” and the annotated document is retrieved. Several START servers are shown; arrows are labeled (submitted), (negotiation), and (retrieved).]
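The annotation flow above can be illustrated with a very small sketch. It is a toy under stated assumptions, not START's implementation: the `Triple` representation, the two regular expressions, the `-using` relation name, the `AnnotationStore` class, and the `example.org` URL are all hypothetical, and the patterns cover only the two sentence shapes from the Neptune example. The sketch shows the three steps listed for annotations (analyze, build a structure, keep a pointer to the segment) and then matches a question against the stored structure.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical (subject, relation, object) triple. None marks the unknown
# slot that a question asks about.
Triple = Tuple[str, str, Optional[str]]

# Toy patterns: they handle only the two sentence shapes in this example.
ANNOTATION = re.compile(r"^(\w+) was (\w+) using (\w+)\.$")
QUESTION = re.compile(r"^How was (\w+) (\w+)\?$")


def analyze_annotation(sentence: str) -> Triple:
    """Turn 'X was VERBed using Y.' into (x, 'VERBed-using', y)."""
    m = ANNOTATION.match(sentence)
    if m is None:
        raise ValueError(f"unsupported annotation: {sentence!r}")
    subject, verb, means = (g.lower() for g in m.groups())
    return (subject, f"{verb}-using", means)


def analyze_question(question: str) -> Triple:
    """Turn 'How was X VERBed?' into (x, 'VERBed-using', None)."""
    m = QUESTION.match(question)
    if m is None:
        raise ValueError(f"unsupported question: {question!r}")
    subject, verb = (g.lower() for g in m.groups())
    return (subject, f"{verb}-using", None)


def matches(query: Triple, stored: Triple) -> bool:
    """A None slot in the query matches anything in the stored triple."""
    return all(q is None or q == s for q, s in zip(query, stored))


class AnnotationStore:
    """Maps analyzed annotations to pointers at the segments they describe."""

    def __init__(self) -> None:
        self._pointers: Dict[Triple, List[str]] = {}

    def annotate(self, sentence: str, segment: str) -> None:
        triple = analyze_annotation(sentence)
        self._pointers.setdefault(triple, []).append(segment)

    def ask(self, question: str) -> List[str]:
        query = analyze_question(question)
        return [segment
                for triple, segments in self._pointers.items()
                if matches(query, triple)
                for segment in segments]


store = AnnotationStore()
store.annotate("Neptune was discovered using mathematics.",
               "http://example.org/neptune.html")  # placeholder pointer
print(store.ask("How was Neptune discovered?"))
```

The point of the design is that only the short annotation is analyzed. The document it points to can be text, an image, or a video, and is returned without ever being parsed.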
Uniform Access
[Diagram: NL questions come in to START and multimedia responses go out; START is connected to Omnibase by arrows labeled Queries and Data; the sources shown are IMDb, U.S. Census, Fortune500, POTUS, and HPKB.]
• Local knowledge base of ternary expressions
• Core vocabulary
• Uniform interface to multiple database formats (Web, text, etc.)
• Extended lexicon
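One way to picture a uniform interface over several database formats is a registry of per-source adapters that all answer the same kind of request. The sketch below is an illustration only and does not describe Omnibase's actual API: the `UniformAccess` class, the `(entity, attribute)` request shape, the source names, and all data values are made up for the example.

```python
from typing import Callable, Dict, List, Tuple

# Every adapter answers the same request shape: (entity, attribute) -> value.
Lookup = Callable[[str, str], str]


class UniformAccess:
    """Hypothetical single entry point over differently shaped sources."""

    def __init__(self) -> None:
        self._sources: Dict[str, Lookup] = {}

    def register(self, name: str, lookup: Lookup) -> None:
        self._sources[name] = lookup

    def get(self, source: str, entity: str, attribute: str) -> str:
        if source not in self._sources:
            raise KeyError(f"unknown source: {source}")
        return self._sources[source](entity, attribute)


# Stand-in data with invented values. Real sources would be remote databases.
MOVIES: Dict[str, Dict[str, str]] = {            # nested-dictionary format
    "Example Film": {"director": "A. Director"},
}
CENSUS_ROWS: List[Tuple[str, str, str]] = [      # flat-row format
    ("Exampletown", "population", "12345"),
]


def movie_lookup(entity: str, attribute: str) -> str:
    return MOVIES[entity][attribute]


def census_lookup(entity: str, attribute: str) -> str:
    for name, attr, value in CENSUS_ROWS:
        if (name, attr) == (entity, attribute):
            return value
    raise KeyError((entity, attribute))


access = UniformAccess()
access.register("movies", movie_lookup)
access.register("census", census_lookup)

# The caller uses one request shape regardless of how each source stores data.
print(access.get("movies", "Example Film", "director"))
print(access.get("census", "Exampletown", "population"))
```

Adding a new source in this sketch means writing one adapter function. The code that asks questions does not change.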
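How START Works
[Diagram: a Web browser sends English to START and receives English and HTML. Components shown inside START: Parser, Generator, Matcher, input T-exps, T-exps from KB, a database of T-exps, annotations, and native knowledge. Scripts connect START to Omnibase (external knowledge), whose sources include POTUS, IMDb, U.S. Census, World Factbook, and the WWW.]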