Provably reliable qa interfaces
This presentation is the property of its rightful owner.
Sponsored Links
1 / 28

Provably Reliable QA Interfaces PowerPoint PPT Presentation


  • 87 Views
  • Uploaded on
  • Presentation posted in: General

Provably Reliable QA Interfaces. http://cs.washington.edu/research/nli. Outline. Motivation Reliable NLIs Semantic tractability theory Implementation and experiments Joint work with Ana-Maria Popescu, Alex Yates, Henry Kautz, and Dan Weld. I. The Dominant UI Paradigm.

Download Presentation

Provably Reliable QA Interfaces

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Provably reliable qa interfaces

Provably Reliable QA Interfaces

http://cs.washington.edu/research/nli


Outline

Outline

  • Motivation

  • Reliable NLIs

  • Semantic tractability theory

  • Implementation and experiments

    Joint work with Ana-Maria Popescu, Alex Yates,

    Henry Kautz, and Dan Weld.

Etzioni, 473


I the dominant ui paradigm

I. The Dominant UI Paradigm

  • Menu buttons

  • Hyperlinks

  • Remote controls

    Just Click it!

Etzioni, 473


What to click

What to click?

Etzioni, 473


Clicking breaks when

Clicking breaks when..

  • Click-challenged

    • Hands busy (driving).

    • Disabled.

  • Limited Screen/keyboard Real estate

    • Cell phones, Microwave.

    • Ubiquitous computing..

  • Complexity: SQL, shell scripts, menu hell.

Etzioni, 473


Alternative interface paradigm

Alternative Interface Paradigm

Speech + NL Understanding + Agents.

  • “Where is Lord of the Rings showing?”

  • “Defrost my corn.”

  • “Delete all my old messages except the ones from Mom.”

Substantial Research Challenges!

Etzioni, 473


State of the art

State of the Art

  • Commercial Speech Systems allow single word input.

  • NLIDBs are unreliable (try Microsoft’s English Query).

  • Nontrivial Autonomous Agents that respond to complex requests?!

Etzioni, 473


Ii our focus

II. Our Focus

  • Not speech.

  • 1991 – 1997: built softbots.

  • Today:

    Reliable Natural Language Interfaces

Etzioni, 473


Why reliable

Why Reliable?

Imagine “intelligent” interfaces that..

  • Sometimes delete the wrong file.

  • Can report incorrect flight times.

    AI cannot be an excuse for incompetence

    (Norman, Schneiderman).

Etzioni, 473


Some common objections

Some Common Objections

  • Can’t the interface just confirm?

    • Yes, but will users attend?

  • Speech understanding isn’t reliable.

    • That will gradually change.

    • Still need reliable language module.

  • NL understanding is AI-complete!

Etzioni, 473


Iii semantics is hard

III. Semantics is hard

  • Syntactic, scopal ambiguity

  • Word-sense ambiguity

  • Time, events, liquids, holes,…

  • Shakespeare, Faulkner,…

  • Discourse, Pragmatics (speech acts)

    We have to think about tractable classes!

Etzioni, 473


Our basic hypothesis

Our Basic Hypothesis

There are common situations where Semantic Interpretation is tractable.

Sentence  Target expression

Etzioni, 473


Our research strategy

Our Research Strategy

  • Identify easy-to-understand NL sentences.

  • Develop Taxonomic Theory of Semantic Tractability.

  • Build Reliable NLIs.

  • Test NLIs experimentally.

Etzioni, 473


Semantic tractability

Semantic Tractability

Easy to understand questions:

What are the Chinese restaurants in Seattle?

What Microsoft jobs require 2 years of experience?

What rivers run through Texas?

Semantically tractable questions

are quite common: 77.5% - 97%

Etzioni, 473


Semantic intractability

Semantic Intractability

Q: What is the second highest mountain in the US?

A: The word ‘second’ is unknown; please rephrase your query.

Q: What are the states bordering the states bordering the states bordering Montana?

A: huh?

Etzioni, 473


Semantically tractable qs

Semantically Tractable Qs

Given a lexicon and a parse, IF

  • Q contains at least one wh-word.

  • There exists a valid mapping from Q to a set of database elements.

  • Q maps to a nonrecursive datalog clause.

    THENQ is Semantically Tractable.

Etzioni, 473


Valid mapping

Valid Mapping

Words (phrases) DB elements.

  • Lexical constraints: Lord of the Rings.

  • Attachment constraints:

    • What is the population of Atlanta?

  • Semantic constraints:

    • Attributes need values: Cuisine  Chinese.

    • Values can have implicit attributes.

Etzioni, 473


Guarantees on st qs

Guarantees on ST Qs.

Precise = NLIDB implementation.

  • Precise can detect ST Qs.

  • Precise is sound for ST Qs.

  • Precise is complete for ST Qs.

Will Precise capture “user intent?”

Etzioni, 473


Iv precise implementation

IV. Precise Implementation

  • Lexicon extracted from DB + Wordnet.

  • Parser plug in (Charniak 2000).

  • Semantic constraints via graph match.

    • Word1DB_EL1

    • Phrase2 DB_EL2

      DB_EL3

      Match is computed via maxflow.

Etzioni, 473


Provably reliable qa interfaces

PRECISE graph representation

DB Values

DB Attributes

Attribute

Tokens

Value

Tokens

Movies = what

Movies

film

What

Actor = Woody Allen

Actor

E

S

Woody Allen

Director = Woody Allen

Director

I

K

2

Paul Mazursky

Director = Paul Mazursky

“What are the Paul Mazursky films with Woody Allen?”

Etzioni, 473


Provably reliable qa interfaces

PRECISE Architecture

Database

Lexicon

Parser

Tokenizer

Matcher

Query Generator

Equivalence Checker

English Question

SQL Query + Answer Set

Etzioni, 473


Ambiguity meets reliability

Ambiguity meets Reliability

  • Precise computes all valid mappings.

  • Many possibilities  clarifying Q.

  • What is the population of New York?

    • The population of New York City is…

    • The population of New York State is..

Etzioni, 473


Experiments

Experiments

Systems

Mooney, PRECISE, EnglishQuery

Datasets

Sets of NL questions labeled with SQL queries

Geography (846) Jobs (577) Restaurants (224)

Measure

PRECISE’s performance

Prevalence of semantically tractable questions

Etzioni, 473


Provably reliable qa interfaces

Fraction Answered

Recall = Qanswered/Q

Recall > 75 %

Etzioni, 473


Provably reliable qa interfaces

Error Rate

Precision = Qcorrect/Qanswered

PRECISE made no mistakes on semantically tractable questions

Etzioni, 473


Exact case study

Exact – Case Study

  • Exact is an NLI to the Panasonic KX-TC1040W telephone/answering machine

  • It uses Precise to formulate goals for the Blackbox planner (Kautz & Selman)

  • The database model for the phone has 5 relations (between 2 and 11 attributes each) and the action set includes 37 actions.

Etzioni, 473


Exact diagram

Exact – Diagram

Etzioni, 473


Conclusion

Conclusion

  • Click-it UI paradigm is fraying.

  • Need reliable NLIs.

  • Semantically Tractable sentences.

    • Taxonomic Theory.

    • Precise and Exact implementations.

    • Promising, preliminary experiments.

Etzioni, 473


  • Login