CALO Decoder Progress Report for March

Arthur (Decoder and ICSI Training)

Jahanzeb (Decoder)

Ziad (ICSI Training)

Moss (ICSI Training)

Carnegie Mellon University

Apr 13, 2004

This Presentation
  • Progress report for March
  • In February
    • Batch mode recognizer completed
    • Live-mode recognizer didn’t work
  • In March
    • More decoder work
      • Speed, Accuracy, Interface.
    • ICSI transcription conversion task
      • Resources, Conversion Scripts
    • Miscellaneous efforts in improving the decoder
      • Contact with other groups, web page(s), manual.
Decoder work (Speed)
  • By Arthur and Jahanzeb
  • Sphinx 3.4 now works reasonably well on the Communicator task
    • 1G: 1.1xRT, 2G: 0.48xRT
  • Phoneme look-ahead research completed
    • 15-20% gain when CIGMMS applied
    • Will incorporate as a functionality
  • Outlook for April
    • Machine Optimization (Still there!)
    • WSJ evaluation
    • Publish the results as a technical report.
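
The xRT figures on this slide are real-time factors: decoding time divided by audio duration, so a value below 1 means faster than real time. A minimal sketch of that arithmetic follows; the function name and the timings are hypothetical and chosen only for illustration, not measurements from this report.

```python
def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Return decoding time divided by audio duration (the xRT figure)."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return decode_seconds / audio_seconds


# Hypothetical timings, for illustration only.
rtf = real_time_factor(decode_seconds=48.0, audio_seconds=100.0)
print(f"{rtf:.2f}xRT")
```

Reporting speed as a ratio rather than raw seconds keeps numbers comparable across test sets of different lengths.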
Decoder work (Accuracy)
  • First comparison between S2 and S3.4
    • S3.0 ~ S2 > S3.3 > S3.4
    • Not the fairest comparison
      • S3 model is trained on female speakers only
      • S3 model is less tuned
  • Outlook for April
    • Learn how to do training. Do a fairer comparison.
    • Change search structure.
Decoder work (Interface)
  • Live-mode decoder works
    • Live-mode recognizer interface is still poorer than S2's
    • No config file yet.
    • Many users complained (Well, actually 2-3 of them)
  • Outlook for April
    • Focus on building a better API and command-line interface.
    • Jahanzeb will be there while Arthur is working on training.
ICSI Training
  • Transcription Conversion Task
  • By Moss, Ziad and Arthur
  • Completion of resources
    • <VocalSound> mapping (100%)
    • <NonVocalSound> mapping (100%)
    • OOV (~20%)
    • Conversion script (90%)
ICSI Transcription: What does it look like?
  • <Segment StartTime="41.311" EndTime="43.773" Participant="me013" DigitTask="true">
  • three six two four three zero seven <Comment Description="Digits"/>
  • </Segment>
  • <Segment StartTime="0.931" EndTime="3.611" Participant="me034">
  • <VocalSound Description="whistling"/>
  • </Segment>
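
The two segments above can be read with a standard XML parser. The following is a minimal sketch using Python's `xml.etree.ElementTree`; the `<Transcript>` wrapper element and the `read_segments` helper are assumptions added so the sample parses as one document, and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# The two segments from the slide, wrapped in a hypothetical root element.
SAMPLE = """<Transcript>
<Segment StartTime="41.311" EndTime="43.773" Participant="me013" DigitTask="true">
three six two four three zero seven <Comment Description="Digits"/>
</Segment>
<Segment StartTime="0.931" EndTime="3.611" Participant="me034">
<VocalSound Description="whistling"/>
</Segment>
</Transcript>"""


def read_segments(xml_text):
    """Return one dict per <Segment> with its timing, speaker, text and tags."""
    root = ET.fromstring(xml_text)
    segments = []
    for seg in root.iter("Segment"):
        segments.append({
            "start": float(seg.get("StartTime")),
            "end": float(seg.get("EndTime")),
            "participant": seg.get("Participant"),
            # seg.text only covers the words before the first child tag.
            "text": " ".join((seg.text or "").split()),
            "tags": [child.tag for child in seg],
        })
    return segments


for segment in read_segments(SAMPLE):
    print(segment)
```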
XML tags conversion
  • Transcription is more detailed than necessary.
  • Current Treatment:
    • <Comment> : Ignore whole sentence. Too many occurrences, too many varieties.
    • <Emphasis> : Ignore.
    • <Pronounce> : Replace by ++GARBAGE++
    • <Foreign> : Ignore whole sentence. Too few occurrences to be worth handling.
    • <Uncertain> : Replace by ++GARBAGE++
    • <VocalSound> & <NonVocalSound> : Use mapping.
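
The treatment listed above can be sketched as a small tag dispatcher. This is an illustration under stated assumptions, not the project's conversion script: the `convert_segment` helper and the `SOUND_MAP` entries are hypothetical, "Ignore" for `<Emphasis>` is read as "keep the words, drop the markup", and sounds without a mapping entry are simply skipped.

```python
import xml.etree.ElementTree as ET

GARBAGE = "++GARBAGE++"
DROP_SENTENCE = {"Comment", "Foreign"}       # ignore the whole sentence
TO_GARBAGE = {"Pronounce", "Uncertain"}      # replace by ++GARBAGE++
SOUND_TAGS = {"VocalSound", "NonVocalSound"}  # use the mapping

# Hypothetical mapping entries; the real mapping resource is not shown in the slides.
SOUND_MAP = {"laugh": "++LAUGH++", "door slam": "++NOISE++"}


def convert_segment(segment_xml, sound_map=SOUND_MAP):
    """Return the converted word string, or None if the sentence is ignored."""
    seg = ET.fromstring(segment_xml)
    if any(el.tag in DROP_SENTENCE for el in seg.iter()):
        return None
    tokens = []

    def walk(el):
        tokens.extend((el.text or "").split())
        for child in el:
            if child.tag in TO_GARBAGE:
                tokens.append(GARBAGE)
            elif child.tag in SOUND_TAGS:
                filler = sound_map.get(child.get("Description", ""))
                if filler:  # assumption: unmapped sounds are skipped
                    tokens.append(filler)
            else:
                walk(child)  # e.g. <Emphasis>: keep the words, drop the tag
            tokens.extend((child.tail or "").split())

    walk(seg)
    return " ".join(tokens)


print(convert_segment(
    '<Segment>so <Emphasis>really</Emphasis> <Uncertain>foo</Uncertain> yes '
    '<VocalSound Description="laugh"/></Segment>'
))
```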
Plain-text Normalization
  • After XML Conversion
    • “I – I am no- , I mean C-zero”
  • ‘-’ can mean
    • “-” : Interruption/Interjection marks
    • “-XXX” or “XXX-” : Broken words
    • “XXX-XXX” : hyphenated words
  • AM transcription
    • Get rid of all pronunciations and leave broken words alone
  • LM transcription
    • Interruption marks and broken words will be removed
    • (Optional) Leave interruption marks there.
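
The three readings of the hyphen and the LM rule above can be sketched as a token classifier plus a filter. The helper names are hypothetical, the en dash from the slide's example is treated as an interruption mark by assumption, and only the LM variant is shown; other punctuation such as commas is left untouched here.

```python
# Assumption: both the plain hyphen and the en dash mark an interruption.
INTERRUPTION_MARKS = {"-", "–"}


def classify(token):
    """Label a whitespace-separated token by how it uses the hyphen."""
    if token in INTERRUPTION_MARKS:
        return "interruption"
    if token.startswith("-") or token.endswith("-"):
        return "broken"
    if "-" in token:
        return "hyphenated"
    return "word"


def normalize_for_lm(text, keep_interruptions=False):
    """Drop broken words and, unless asked otherwise, interruption marks."""
    kept = []
    for token in text.split():
        kind = classify(token)
        if kind == "broken":
            continue
        if kind == "interruption" and not keep_interruptions:
            continue
        kept.append(token)
    return " ".join(kept)


print(normalize_for_lm("I – I am no- , I mean C-zero"))
print(normalize_for_lm("I – I am no- , I mean C-zero", keep_interruptions=True))
```

Hyphenated words such as "C-zero" pass through unchanged because the hyphen sits between two word parts rather than at an edge.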
XML conversion script
  • Functionalities
    • Optional conversion
    • Resource (dict/mapping/rules) read-in
    • XML parser
    • Generate both transcription and control file for close-talking microphones
    • Generate both LM and AM transcription
  • TODO:
    • Incorporate Ziad’s script
      • Correct timing information
      • Generation of far-field channels
    • Fix small bugs.
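
Generating a control file from segment timings can be sketched roughly as follows. Everything specific here is an assumption rather than something stated in the slides: the line layout `audio_id start_frame end_frame utt_id`, the rate of 100 frames per second, the utterance-id naming, and the `control_line` helper itself.

```python
FRAMES_PER_SECOND = 100  # assumed frame rate, not taken from the slides


def control_line(audio_id, participant, start_s, end_s):
    """Build one control-file line from a segment's start and end times."""
    start_frame = round(start_s * FRAMES_PER_SECOND)
    end_frame = round(end_s * FRAMES_PER_SECOND)
    # Hypothetical utterance id: speaker plus zero-padded frame range.
    utt_id = f"{participant}_{start_frame:07d}_{end_frame:07d}"
    return f"{audio_id} {start_frame} {end_frame} {utt_id}"


# Timings from the digit segment shown earlier; "meeting01" is a placeholder.
print(control_line("meeting01", "me013", 41.311, 43.773))
```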
Outlook for the ICSI training task in April
  • Complete OOV transcription (Arthur, Moss and Ziad)
  • Fix bugs in conversion script (Arthur)
  • Learn AM training (Ziad and Arthur)
  • LM training (Moss)
  • Fix potential problems in SphinxTrain.
Miscellaneous (Contact with other groups)
  • Want to seek a better interface for Sphinx
  • Try to contact other groups to see what’s up
  • XVoice-sphinx,
    • “command-and-control” application that tried to use Sphinx.
    • Actually it does dictation.
    • Not very happy with Sphinx after trying Sphinx’s default AM and LM in command-and-control
    • No clear goal yet
    • Starting to gather funding.
    • Don’t really like Sphinx because “Sphinx is poorer than ViaVoice in C&C”
We need to help them more...
  • We need better...
    • Release (to replace s3.3)
      • After the WSJ evaluation, S3.4 will be officially released to replace the current S3.3
    • Sphinx web page (also CMU web page)
      • Sphinx’s web page needs a more unified theme.
      • Task force will be gathered after ICSLP 2004.
    • Manual
      • Need to provide basic education to developers and “hard-core” hackers.
      • Wrote the first outline of the manual.
      • First draft expected within a quarter.
  • Still need to build good model for ICSI first. (Arthur/Ziad/Moss)
    • Training is also critical to understanding why S2 > S3.3.
  • Better everything for the decoder
    • Arthur/Jahanzeb -> 50/50
  • Others : always on my “priority queue”, will pop up at the right time.