
Experiments in Adaptive Language Modeling


Presentation Transcript


  1. Experiments in Adaptive Language Modeling
     Lidia Mangu & Geoffrey Zweig

  2. Motivation
     • Multi-domain recognition
       • IBM Superhuman Recognition Program
       • Switchboard / Fisher
       • Voicemail
       • Call Center
       • ICSI Meetings
     • One-size LM may not fit all
       • Even a gigantic LM

  3. Lots of Past Work
     • Kneser & Steinbiss ’93, “On the Dynamic Adaptation of Stochastic Language Modeling”
       • Tune mixing weights to suit particular text
     • Chen, Gauvain, Lamel, Adda & Adda ’01, “Language Model Adaptation for Broadcast News Transcription”
       • Build and add new LMs from relevant training data
     • Florian & Yarowsky ’99 – Hierarchical LMs
     • Gao, Li & Lee ’00 – Upweight training counts whose frequency is similar to that in the test data
     • Seymore & Rosenfeld ’97 – Interpolate topic LMs
     • Bacchiani & Roark ’03 – MAP adaptation for voicemail
     • Many others

  4. Plan of Attack
     • No adaptation: the Superhuman LM
       • 8-way interpolated LM built from multiple domains (sketched below)
     • Baseline adaptation: adjust interpolation weights per conversation
     • Extended adaptation: build a new LM from relevant training data
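As a rough sketch of what the unadapted baseline computes, the Python below implements an N-way linearly interpolated LM, P(w | h) = Σᵢ λᵢ Pᵢ(w | h). Everything here is illustrative: the class name and the assumption that each atomic model exposes a prob(word, history) method are mine, not details from the slides.

```python
# Minimal sketch of an N-way linearly interpolated LM.
# Assumption (not from the slides): each atomic model exposes
# prob(word, history) -> P_i(w | h).

class InterpolatedLM:
    def __init__(self, models, weights):
        assert len(models) == len(weights)
        assert abs(sum(weights) - 1.0) < 1e-6, "weights must sum to 1"
        self.models = models
        self.weights = weights

    def prob(self, word, history):
        # P(w | h) = sum_i lambda_i * P_i(w | h)
        return sum(lam * m.prob(word, history)
                   for lam, m in zip(self.weights, self.models))
```

In the unadapted Superhuman LM the eight weights stay fixed; the adaptation experiments re-estimate them for each conversation.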

  5. Description of Atomic LMs
     • SWB + CallHome: 3.4M words, 1.4M 3-grams
     • Broadcast News: 148M words, 38M 3-grams
     • Financial Call Centers: 655K words, 303K 3-grams
     • UW Web data (conversational-like): 192M words, 48M 3-grams
     • SWB Cellular: 244K words, 134K 3-grams
     • UW Web data (meeting-like): 28M words, 12M 3-grams
     • UW Newsgroup data: 102M words, 34M 3-grams
     • Voicemail: 1.1M words, 551K 3-grams

  6. Description of Lattice-Building Models & Process
     • Generate lattices with a bigram LM
       • Word-internal acoustic context
       • 3.6K acoustic units; 142K Gaussians
       • PLP + VTLN + FMLLR + MMI
     • LM rescoring with the 8-way interpolated LM
     • Acoustic rescoring with a cross-word AM
       • 10K acoustic units; 589K Gaussians
       • PLP + VTLN + FMLLR + ML
     • Adapt on the decoded transcripts from the last step
       • Adjust interpolation weights to minimize perplexity on the decoded transcripts (EM sketch below)
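The per-conversation weight adjustment in the last step can be done with the standard EM update for mixture weights, which never increases perplexity from one iteration to the next. The sketch below is an illustration of that generic procedure, not the IBM code: it assumes the per-word probabilities Pᵢ(wₜ | hₜ) from each atomic LM have already been collected on the decoded transcript, and all function names are hypothetical.

```python
import math

def adapt_weights(comp_probs, lam, iters=20):
    """EM re-estimation of interpolation weights (illustrative sketch).

    comp_probs: one entry per word t of the decoded transcript, holding
                [P_1(w_t | h_t), ..., P_N(w_t | h_t)] from the N atomic LMs.
    lam:        initial interpolation weights (e.g. the unadapted ones).
    """
    for _ in range(iters):
        counts = [0.0] * len(lam)
        for probs in comp_probs:
            mix = sum(l * p for l, p in zip(lam, probs))
            # E-step: posterior responsibility of each component for this word
            for i, (l, p) in enumerate(zip(lam, probs)):
                counts[i] += l * p / mix
        # M-step: normalized responsibilities become the new weights
        total = sum(counts)
        lam = [c / total for c in counts]
    return lam

def perplexity(comp_probs, lam):
    """Perplexity of the interpolated LM under weights lam."""
    log_prob = sum(math.log(sum(l * p for l, p in zip(lam, probs)))
                   for probs in comp_probs)
    return math.exp(-log_prob / len(comp_probs))
```

Since the log-likelihood is concave in the weights over the simplex, EM converges to the global optimum here; a fixed iteration count is a simple stopping rule, and one could instead stop once the perplexity improvement falls below a threshold.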

  7. Baseline Adaptation Results

  8. Results on RT03

  9. Conclusions
     • Simple adaptation is effective for a multi-domain system
     • This contrasts with some previous results on Broadcast News
     • Not very sensitive to initial decoding errors
     • Dynamic LM construction remains to be explored
