AQUAINT: Answer Spotting Application

Herbert Gish, Rukmini Iyer and Chia-Lin Kao

Overview

  • Project Goals
  • Answer-Spotting Problem Description
    • Query Formation
    • Technical Approach
    • Choice of Corpora
  • System Architecture & Components
  • Long-Term Application Features
    • Portability across languages
    • Using Limited Data and Linguistic Resources
  • Summary
Project Goals
  • Primary Objectives
    • Develop answer-spotting technology to provide analysts with best answers available from a spontaneous speech database
    • Develop application for multiple languages and with potentially limited resources
  • Application Features
    • Explain to the user the basis for decisions
    • Export semantic components of answer to a multi-media system
    • Account for variability in resources in extracting information
    • Enable rapid deployment in new languages
Query Formation
  • Unstructured Query
    • Words or phrases framed as a query
    • Example: Is there any instance of a yellow Chevy car in this data?
    • Requires user to know exactly what (s)he is looking for
  • Structured Query
    • Topic: defines at a macro-level whether a conversation is relevant to the query
    • Semantic categories/classes: define all the words/phrases of interest
    • Keywords: define the specific word or phrase of interest to the user if the user knows what (s)he is looking for
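The three parts of a structured query can be pictured as a small record. The sketch below is only an illustration: the class name `StructuredQuery`, its field names, and the helper `matches_keywords` are hypothetical, and the sample values are taken from the Vacation Spots example query in this deck.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StructuredQuery:
    """Hypothetical container for the three query parts named on the slide."""
    topic: str                                                  # macro-level relevance filter
    semantic_classes: List[str] = field(default_factory=list)   # classes of interest
    keywords: List[str] = field(default_factory=list)           # optional, user-supplied


def matches_keywords(query: StructuredQuery, utterance: str) -> bool:
    """Return True if any query keyword occurs as a token in the utterance.

    An empty keyword list matches everything, since keywords are optional.
    """
    if not query.keywords:
        return True
    tokens = {t.strip(".,?!").lower() for t in utterance.split()}
    return any(k.lower() in tokens for k in query.keywords)


query = StructuredQuery(
    topic="Vacation Spots",
    semantic_classes=["Location spot"],
    keywords=["Disneyland", "summer"],
)
```

Keyword matching here is plain token lookup on text; in the application described, the search is integrated into recognition rather than run on finished transcripts.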
Answer Spotting Approach
  • Recognize topic specific language activity
    • Generalization of word and phrase spotting
  • Integrate search for best answers into the speech recognition process
    • Use topic relevant language model(s) to select relevant data
    • Incorporate semantic classification of words or phrases into language model used in recognition
    • Requires minimal resources and provides the best performance
  • Post-processing of speech recognition output to put together semantic components of answer
Choice of Corpora
  • Desired Corpora Features
    • Spontaneous (telephone) speech
    • Conversations between people
    • Consistent query formation and answer representation from data
  • Selected Corpora
    • Switchboard
      • Spontaneous telephone conversations between strangers
      • Topic-driven conversations
      • Abundant amounts of transcribed data
    • Callhome
      • Spontaneous telephone conversations between family members
      • Corpora available in multiple languages: Spanish, Mandarin and Arabic
Query Formation in Switchboard
  • Topics
    • Selected 5 diverse topics
    • Topic descriptions: Buying a car, Credit cards, News media, Vacation spots, Music
    • Amount of data for each topic varies from 30 to 60 conversations
  • Semantic Categories
    • For each topic, defined a set of semantic classes or categories
    • At least 5 categories per topic were picked
    • Manual annotation of semantic categories underway – no syntactic information used in annotation
  • User-Defined Keywords/Phrases
Topic: Buying a Car

What kind of car do you think you might buy next? What sorts of things will enter your decision? See if your requirements and the other caller’s requirements are similar.

Topic: News Media

Discuss how you and the caller keep up with current events.

Topic: Vacation Spots

Please discuss types of vacations and trips you enjoy.

Example Switchboard Queries
  • Topic: Buying a car, Semantic Classes: Class Make, Keywords: SUV, Nissan
  • Topic: Vacation Spots, Semantic Classes: Location spot, Keywords: Disneyland, summer
  • Topic: Credit cards, Semantic Classes: Purchase location, Keywords:
System Components
  • Recognizer
    • State-of-the-art Byblos system
    • Real-time or near real-time performance
  • Topic Identifier
    • Parallel language model structure in recognizer that separates query topic from non-topic data
    • Topic & text integrator that uses language model information and word confidences to filter relevant text
  • Category Identifier
    • Categories integrated into the language model or
    • Use separate component, for example, Identifinder
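As a rough illustration of the topic & text integrator, the sketch below keeps only words that were decoded through the topic language model and whose confidence clears a threshold. The `RecognizedWord` record, its fields, and the 0.5 threshold are assumptions made for this example; the slides do not specify how the two sources of evidence are combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecognizedWord:
    """Hypothetical per-word recognizer output."""
    text: str
    confidence: float   # word confidence in [0, 1]
    on_topic: bool      # True if decoded through the query-topic language model


def filter_relevant(words: List[RecognizedWord], min_conf: float = 0.5) -> List[str]:
    """Keep words that are both on-topic and confidently recognized."""
    return [w.text for w in words if w.on_topic and w.confidence >= min_conf]


words = [
    RecognizedWord("nissan", 0.9, True),
    RecognizedWord("pathfinder", 0.3, True),   # on-topic but low confidence
    RecognizedWord("weather", 0.95, False),    # confident but off-topic
]
relevant = filter_relevant(words)
```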
Recognition System

Multi-pass Byblos system:

  • Forward pass uses bigram language models and PTM non-crossword acoustic models
  • Backward pass uses approximate trigram language models and SCTM non-crossword models to generate N-best
  • N-best rescoring pass uses more complex trigram language models and SCTM cross-word models
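The final pass can be pictured as re-ranking an N-best list with a stronger language model. The sketch below is a generic illustration of N-best rescoring, not the Byblos implementation: the function `rescore_nbest`, the `lm_weight` parameter, and the toy language model are all hypothetical.

```python
from typing import Callable, List, Tuple

Hypothesis = Tuple[str, float]  # (word string, acoustic log-score)


def rescore_nbest(nbest: List[Hypothesis],
                  lm_logprob: Callable[[str], float],
                  lm_weight: float = 1.0) -> List[Tuple[str, float]]:
    """Re-rank hypotheses by acoustic log-score plus weighted LM log-probability."""
    rescored = [(text, ac + lm_weight * lm_logprob(text)) for text, ac in nbest]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_lm(text: str) -> float:
    """Toy stand-in for the more complex rescoring language model."""
    return 0.0 if "nissan" in text.split() else -5.0


# The first hypothesis wins on acoustic score alone, but loses after rescoring.
nbest = [("i bought a knee son", -10.0), ("i bought a nissan", -11.0)]
best_text, best_score = rescore_nbest(nbest, toy_lm)[0]
```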
Topic Identification
  • Parallel language model implementation
    • Query topic language model(s) run in parallel with a general English language model
    • Models are parallel at the utterance-level, i.e., no cross-over allowed within an utterance
    • Topic model trained on topic conversations and smoothed with the non-topic data
  • Topic identification
    • Use best recognition path during decoding to indicate whether utterance is topic or non-topic
    • Develop a separate topic classifier that uses decoding information and other features to determine if the utterance belongs to the query topic
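The utterance-level decision between parallel models can be sketched on text alone. The example below is a simplification under stated assumptions: it uses unigram models over transcripts rather than language models inside the recognizer, smooths the topic model by linear interpolation with weight `lam` (an assumed value), and labels a whole utterance by whichever model scores it higher, so no cross-over occurs within an utterance. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def unigram_probs(sentences: List[str]) -> Dict[str, float]:
    """Maximum-likelihood unigram probabilities from whitespace-tokenized text."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def interpolate(topic: Dict[str, float], general: Dict[str, float],
                lam: float = 0.7) -> Dict[str, float]:
    """Smooth the topic model with non-topic data by linear interpolation."""
    vocab = set(topic) | set(general)
    return {w: lam * topic.get(w, 0.0) + (1 - lam) * general.get(w, 0.0)
            for w in vocab}


def log_score(utterance: str, model: Dict[str, float], floor: float = 1e-6) -> float:
    """Log-probability of an utterance; unseen words get a small floor value."""
    return sum(math.log(model.get(w, floor)) for w in utterance.split())


def is_topic(utterance: str, topic_model: Dict[str, float],
             general_model: Dict[str, float]) -> bool:
    """Assign the whole utterance to the model that scores it higher."""
    return log_score(utterance, topic_model) > log_score(utterance, general_model)


topic_text = ["i want to buy a new car", "the car dealer had a good price"]
general_text = ["we watched the news last night", "i like to travel in the summer"]

general_model = unigram_probs(general_text)
topic_model = interpolate(unigram_probs(topic_text), general_model)
```

With these toy corpora, `is_topic("buy a car", topic_model, general_model)` is true and `is_topic("watched the news", topic_model, general_model)` is false.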
Category Identification - Identifinder

[Figure: HMM state diagram with states Start of sentence, Semantic Class 1 ... Semantic Class N, End of sentence]

  • Identifinder is an HMM with internal states defined by the semantic classes and a single “not-a-semantic-class” state.
  • The state generates words conditioned on the previous state as well as the previous word.
  • Word features can also be used in addition to the word identity.
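A toy Viterbi decoder shows how an HMM of this shape can tag words with semantic classes. This is a simplified sketch, not the Identifinder model itself: transitions depend only on the previous state, emissions depend on the current state and the previous word with a back-off to the state alone, and every probability, the `MAKE` class, and the `NONE` label for the not-a-semantic-class state are made up for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

FLOOR = math.log(1e-6)  # log-probability assigned to unseen events

Trans = Dict[Tuple[str, str], float]                 # (prev_state, state) -> log-prob
Emit = Dict[Tuple[str, Optional[str], str], float]   # (state, prev_word, word) -> log-prob


def viterbi(words: List[str], states: List[str], trans: Trans, emit: Emit) -> List[str]:
    """Return the most likely state sequence for the given words."""
    if not words:
        return []

    def e(state: str, prev_word: Optional[str], word: str) -> float:
        # Back off from (state, prev_word, word) to (state, None, word).
        return emit.get((state, prev_word, word),
                        emit.get((state, None, word), FLOOR))

    # best[s] = (log-score, path) of the best path ending in state s
    best = {s: (trans.get(("START", s), FLOOR) + e(s, None, words[0]), [s])
            for s in states}
    for i in range(1, len(words)):
        new_best = {}
        for s in states:
            score, path = max(
                ((best[p][0] + trans.get((p, s), FLOOR), best[p][1]) for p in states),
                key=lambda item: item[0])
            new_best[s] = (score + e(s, words[i - 1], words[i]), path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


states = ["MAKE", "NONE"]
trans = {("START", "NONE"): math.log(0.9), ("START", "MAKE"): math.log(0.1),
         ("NONE", "NONE"): math.log(0.8), ("NONE", "MAKE"): math.log(0.2),
         ("MAKE", "NONE"): math.log(0.8), ("MAKE", "MAKE"): math.log(0.2)}
emit = {("NONE", None, "i"): math.log(0.3),
        ("NONE", None, "bought"): math.log(0.3),
        ("NONE", None, "a"): math.log(0.3),
        ("MAKE", None, "nissan"): math.log(0.5),
        ("MAKE", "a", "nissan"): math.log(0.9)}  # conditioned on the previous word

tags = viterbi("i bought a nissan".split(), states, trans, emit)
```

Here `tags` comes out as `["NONE", "NONE", "NONE", "MAKE"]`, marking "nissan" as the only word in a semantic class.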
Portability Across Languages
  • Use Callhome corpora for testing system capabilities
    • Callhome English has conversations between family members
    • Topics range from family events to immigration issues
  • Callhome is available in multiple languages
    • Languages that can be tested include Spanish, Mandarin and Arabic
    • Limited data and linguistic resources are available in these languages posing additional technical challenges
Using Limited Resources
  • Investigate effect of variations in data on various system components
    • Impact of reduced number of manually annotated conversations on category identification
      • Use word clustering on other available text resources to find words that fit into the semantic classes of interest
      • Use relevance feedback techniques, where the user provides feedback that can be used to adapt system response
    • Impact of reduced transcriptions for acoustic/language modeling on recognition performance
      • Use auto-transcription techniques if additional audio data is available
      • Use newspaper & broadcast news text available to augment language modeling performance
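One simple way to find words that fit a semantic class from unannotated text is to compare the contexts in which words occur. The sketch below is a stand-in for the word clustering mentioned above, not the method used in the project: it builds neighbor-count vectors, averages them over a few seed words, and proposes words whose cosine similarity to that centroid exceeds an assumed threshold. All names and values are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set


def context_vectors(sentences: Iterable[str], window: int = 1) -> Dict[str, Counter]:
    """Count the neighbors of each word within +/- window positions."""
    vecs: Dict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        toks = sentence.split()
        for i, w in enumerate(toks):
            for j in range(max(0, i - window), min(len(toks), i + window + 1)):
                if j != i:
                    vecs[w][toks[j]] += 1
    return dict(vecs)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def expand_class(seeds: Set[str], vecs: Dict[str, Counter],
                 threshold: float = 0.8) -> List[str]:
    """Propose non-seed words whose contexts resemble those of the seed words."""
    centroid: Counter = Counter()
    for s in seeds:
        centroid.update(vecs.get(s, Counter()))
    return sorted(w for w in vecs
                  if w not in seeds and cosine(vecs[w], centroid) >= threshold)


text = ["i drive a nissan today", "i drive a toyota today",
        "i drive a honda today", "we like the summer"]
candidates = expand_class({"nissan", "toyota"}, context_vectors(text))
```

On this toy text, `candidates` contains only "honda", which shares both of its neighbors with the seed words.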
Using Limited Resources
  • Building system with limited data resources and/or linguistic expertise
    • Enabling rapid deployment in new languages where linguistic resources (for example, word pronunciation dictionary or word transcriptions) are limited
    • Annotating topics and semantic categories on a new language where transcriptions are limited
Summary
  • Develop answer-spotting technology that can respond to analyst queries by providing the best answer in a speech database
  • Both structured and unstructured queries can be handled by the application
  • Application will be tested for limited data and resource conditions
  • Enable rapid deployment of application in new languages and domains