NXT meets the ICSI Corpus

Jean Carletta and Jonathan Kilgour
University of Edinburgh, HCRC Language Technology Group

ICSI Meeting Corpus
• 75 natural meetings from research groups
• close-talking and far-field microphones
• orthographic transcription
• "speech quality" tags (e.g., emphasis)
• dialogue acts using MRDA
• hot spots

The NITE XML Toolkit
• library support for data handling and search using a data model that can express both timing and complex structure
• multiple file stand-off XML data storage
• some standard GUIs, data utilities
• library support for writing tailored GUIs

Stand-off XML
extract from Bdb001.A.speech-quality.xml
    <speechquality nite:id="Bdb001.emphasis.16" type="emphasis">
      <nite:child href="Bdb001.A.words.xml#id(Bdb001.w.1,342)..id(Bdb001.w.1,344)" />
    </speechquality>
extract from Bdb001.A.words.xml
    <w nite:id="Bdb001.w.1,342" starttime="356.39" endtime="" c="W">time</w>
    <w nite:id="Bdb001.w.1,343" starttime="" endtime="" c="HYPH">-</w>
    <w nite:id="Bdb001.w.1,344" starttime="" endtime="356.59" c="W">line</w>
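To make the stand-off pointer concrete, here is a minimal Python sketch of how a `nite:child` range such as `id(X)..id(Y)` could be resolved against the words file. It is not NXT's own loader. The namespace URI, the wrapping `nite:root` element, and the helper names `parse_href` and `resolve_range` are assumptions made for illustration; only the three `<w>` elements and the `href` value come from the extract above.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the extract does not show the namespace declaration, so this
# URI and the wrapping <nite:root> element are placeholders for the sketch.
NITE_NS = "http://nite.sourceforge.net/"
NITE_ID = f"{{{NITE_NS}}}id"

WORDS_XML = f"""<nite:root xmlns:nite="{NITE_NS}">
  <w nite:id="Bdb001.w.1,342" starttime="356.39" endtime="" c="W">time</w>
  <w nite:id="Bdb001.w.1,343" starttime="" endtime="" c="HYPH">-</w>
  <w nite:id="Bdb001.w.1,344" starttime="" endtime="356.59" c="W">line</w>
</nite:root>"""

# Matches "file#id(first)" with an optional "..id(last)" range end.
HREF_RE = re.compile(
    r"^(?P<file>[^#]+)#id\((?P<first>[^)]+)\)(?:\.\.id\((?P<last>[^)]+)\))?$"
)


def parse_href(href):
    """Split a stand-off href into (file, first id, last id)."""
    match = HREF_RE.match(href)
    if match is None:
        raise ValueError(f"unrecognised href: {href!r}")
    first = match.group("first")
    return match.group("file"), first, match.group("last") or first


def resolve_range(words_root, first_id, last_id):
    """Return the sibling elements from first_id to last_id, inclusive."""
    children = list(words_root)
    ids = [child.get(NITE_ID) for child in children]
    return children[ids.index(first_id): ids.index(last_id) + 1]


href = "Bdb001.A.words.xml#id(Bdb001.w.1,342)..id(Bdb001.w.1,344)"
target_file, first_id, last_id = parse_href(href)
words = resolve_range(ET.fromstring(WORDS_XML), first_id, last_id)

# The tag inherits its text and timing from the words it points at.
text = "".join(w.text for w in words)
span = (float(words[0].get("starttime")), float(words[-1].get("endtime")))
print(target_file, text, span)
```

In this sketch the emphasis tag carries no text or timing of its own; both are recovered from the first and last word in the range, which is the point of keeping the annotation layers in separate files.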

Tasks
• pre-NXT: up-translation and tokenization
• hand annotation (topic segmentation, dialogue acts, extractive summaries, ...)
• automatic annotation/indexing by query match

Queries in NXT
    ($a w):(TEXT($a) ~ /th.*/):: ($s speechquality):($s ^ $a) && ($s@type="emphasis")
• Find instances of words starting with “th”
• For each, find instances of speech quality tags of type "emphasis" that dominate the word
• Discard words that are not dominated by at least one such tag
Use queries to understand data, verify quality, and index.
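The three steps of the query can be paraphrased in ordinary Python. This is a sketch of the query's logic only, not of how NXT evaluates it. The word and tag records are invented toy data, the tag type "other" is a placeholder, and domination is reduced here to direct parent-child links, whereas in NXT it holds across the whole annotation hierarchy.

```python
import re

# Invented toy data: word id -> text, and tags listing the words they cover.
words = {"w1": "the", "w2": "time", "w3": "that", "w4": "thing"}
tags = [
    {"id": "sq1", "type": "emphasis", "children": ["w1", "w2"]},
    {"id": "sq2", "type": "other", "children": ["w3"]},
    {"id": "sq3", "type": "emphasis", "children": ["w4"]},
]


def run_query(words, tags):
    """Return (word id, dominating emphasis tag ids) for each matching word."""
    results = []
    for word_id, text in words.items():
        # Step 1: keep only words whose whole text matches th.*
        if not re.fullmatch(r"th.*", text):
            continue
        # Step 2: collect the emphasis tags that dominate this word.
        dominating = [
            tag["id"]
            for tag in tags
            if tag["type"] == "emphasis" and word_id in tag["children"]
        ]
        # Step 3: discard words with no such tag.
        if dominating:
            results.append((word_id, dominating))
    return results


results = run_query(words, tags)
print(results)
```

Here "time" fails the text test, and "that" is dropped at step 3 because its only tag is not of type "emphasis".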

NXT as Meeting Browser
• Browser = display + signal indexing + search
• NXT data displays:
    • synchronize with signal
    • highlight search results

Issues
• On some machines, the full ICSI data set already cannot be loaded at once
• NXT supports display of one meeting at a time, but browsing may span several meetings
• Very complicated queries are often too slow for browser response times
Key: pre-indexing of query results, tailored data builds
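One way to picture pre-indexing is as an offline pass that runs each slow query once per meeting and stores the matches, so the browser only performs lookups. The sketch below illustrates that idea under stated assumptions and does not describe NXT's actual indexing mechanism: `build_index`, `lookup`, `toy_run_query`, the meeting names, and the JSON storage format are all hypothetical.

```python
import json


def build_index(meetings, queries, run_query):
    """Offline step: run every named query over every meeting once."""
    return {
        name: {mid: run_query(query, data) for mid, data in meetings.items()}
        for name, query in queries.items()
    }


def lookup(index, query_name, meeting_ids):
    """Browse-time step: fetch stored matches without re-running the query."""
    per_meeting = index[query_name]
    return {mid: per_meeting.get(mid, []) for mid in meeting_ids}


calls = []  # records each query execution so the saving is visible


def toy_run_query(prefix, words):
    """Stand-in for a slow query: ids of words starting with a prefix."""
    calls.append(prefix)
    return [wid for wid, text in words.items() if text.startswith(prefix)]


# Invented toy meetings, each mapping word id -> text.
meetings = {
    "meetingA": {"w1": "the", "w2": "time"},
    "meetingB": {"w1": "line", "w2": "that"},
}
queries = {"th_words": "th"}

index = build_index(meetings, queries, toy_run_query)
stored = json.loads(json.dumps(index))  # round trip, as if saved to disk
calls_after_build = len(calls)

hits = lookup(stored, "th_words", ["meetingA", "meetingB"])
print(hits, len(calls) == calls_after_build)
```

Because the stored index is keyed by meeting, a lookup can cover several meetings without loading the data for any of them, which speaks to the first two issues as well as the third.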

Conclusions
• NXT available, free, open source, useful in a surprising number of ways
http://www.ltg.ed.ac.uk/NITE
