Topic-independent Speaking-Style Transformation of Language Model for Spontaneous Speech Recognition

Yuya Akita, Tatsuya Kawahara

Introduction
  • Spoken style vs. written style
    • Combining document corpora with spontaneous-speech corpora
      • Introduces irrelevant linguistic expressions
    • Model transformation
      • Simulating spoken-style text by randomly inserting fillers
      • Weighted finite-state transducer (WFST) framework
      • Statistical machine translation (SMT) framework
  • Problem with model transformation methods
    • Small parallel corpora cause data sparseness
    • One solution: POS tags
Statistical Transformation of Language Model
  • Posterior formulation
    • X: source language model (document style)
    • Y: target language model (spoken style)
    • P(X|Y) and P(Y|X) are the transformation models
  • Transformation models can be estimated from a parallel corpus
    • Spoken-style n-gram counts are obtained by applying the transformation probabilities to document-style counts (see the reconstruction below)
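The equations on this slide did not survive extraction. Assuming the standard formulation implied by the definitions above, with x a document-style n-gram and y its spoken-style counterpart, the transformation presumably takes the form:

```latex
% Hedged reconstruction: spoken-style (y) statistics derived from
% document-style (x) statistics via the transformation model
\[
  P(y) \;=\; \sum_{x} P(y \mid x)\, P(x)
\]
% and, at the level of n-gram counts,
\[
  C(y) \;=\; \sum_{x} P(y \mid x)\, C(x)
\]
```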
Statistical Transformation of Language Model (cont.)
  • Data sparseness problem: the parallel corpus is small
    • Remedy: generalize with POS information, combined via
      • Linear interpolation
      • Maximum entropy
Training
  • Use an aligned parallel corpus
    • Word-based transformation probability
    • POS-based transformation probability
    • Pword(x|y) and PPOS(x|y) are estimated as relative frequencies over aligned pairs (see the sketch below)
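A minimal sketch of relative-frequency estimation from a word-aligned corpus. The pair representation, the function name, and the toy tokens are illustrative assumptions, not the authors' exact procedure:

```python
from collections import defaultdict

def estimate_transformation_probs(aligned_pairs):
    """Estimate P(x | y) by relative frequency over aligned pairs.

    aligned_pairs: iterable of (x, y) tuples, where x is a
    document-style token (or POS tag) and y is the aligned
    spoken-style token (or POS tag).
    """
    pair_counts = defaultdict(float)
    y_counts = defaultdict(float)
    for x, y in aligned_pairs:
        pair_counts[(x, y)] += 1.0
        y_counts[y] += 1.0
    # P(x | y) = C(x, y) / C(y)
    return {(x, y): c / y_counts[y] for (x, y), c in pair_counts.items()}

# Toy usage with hypothetical aligned tokens ("" = filler insertion):
pairs = [("de aru", "desu"), ("de aru", "da"), ("", "eeto"), ("de aru", "desu")]
probs = estimate_transformation_probs(pairs)
print(probs[("de aru", "desu")])  # 1.0: "desu" aligns only to "de aru" here
```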
Training (cont.)
  • Back-off scheme
  • Linear interpolation scheme
    • Both combine the word-based and POS-based estimates (see the sketch below)
  • Maximum entropy (ME) scheme
    • The ME model is applied to every n-gram entry of the document-style model
    • A spoken-style n-gram is generated if its transformation probability exceeds a threshold
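A hedged sketch of the first two schemes; the interpolation weight λ and the back-off factor α(y) are assumed, generic smoothing parameters rather than the paper's exact notation:

```latex
% Linear interpolation of word-based and POS-based estimates
\[
  P(x \mid y) = \lambda\, P_{\mathrm{word}}(x \mid y)
              + (1 - \lambda)\, P_{\mathrm{POS}}(x \mid y)
\]
% Back-off: trust the word-based estimate where observed,
% otherwise fall back to the POS-based estimate
\[
  P(x \mid y) =
  \begin{cases}
    P_{\mathrm{word}}(x \mid y) & \text{if observed in the aligned corpus} \\
    \alpha(y)\, P_{\mathrm{POS}}(x \mid y) & \text{otherwise}
  \end{cases}
\]
```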
Experiments
  • Training corpora:
    • Baseline corpus: National Congress of Japan, 71M words
    • Parallel corpus: Budget Committee meetings in 2003, 666K words
    • Corpus of Spontaneous Japanese (CSJ), 2.9M words
  • Test corpus:
    • Another Budget Committee meeting in 2003, 63K words
Experiments (cont.)
  • Evaluation of the generality of the transformation model
  • Language model comparison
Conclusions
  • Proposed a novel statistical transformation-model approach for topic-independent speaking-style transformation of language models
Concept
  • Probability of a sentence
    • Approximated with an n-gram LM (see the formulas below)
  • The Markov assumption discards long-distance dependencies and word-position information
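The equations were lost in extraction; the chain-rule decomposition and n-gram approximation being contrasted are the standard ones:

```latex
% Exact chain-rule decomposition of sentence probability
\[
  P(w_1^L) \;=\; \prod_{i=1}^{L} P(w_i \mid w_1^{i-1})
\]
% n-gram (Markov) approximation: only the last n-1 words condition w_i
\[
  P(w_1^L) \;\approx\; \prod_{i=1}^{L} P(w_i \mid w_{i-n+1}^{i-1})
\]
```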
Training (cont.)
  • ML estimation (relative-frequency estimate; see the formula below)
  • Smoothing
    • Use lower order
    • Use small bins
    • Transform with a smoothed normal n-gram
  • Combination
    • Linear interpolation
    • Back-off
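For reference, the maximum-likelihood estimate that these smoothing schemes modify is the relative frequency:

```latex
\[
  P_{\mathrm{ML}}(w_i \mid h) = \frac{C(h, w_i)}{C(h)},
  \qquad h = w_{i-n+1}^{i-1}
\]
```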
Smoothing with lower order (cont.)
  • Additive smoothing
  • Back-off smoothing
  • Linear interpolation
    • Standard forms of all three are sketched below
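Hedged, textbook forms of the three techniques (δ, α(h), and λ are generic smoothing parameters; h′ is the history shortened by one word; |V| is the vocabulary size):

```latex
% Additive smoothing: add \delta to every count
\[
  P_{\mathrm{add}}(w \mid h) = \frac{C(h, w) + \delta}{C(h) + \delta\,|V|}
\]
% Back-off: use the discounted estimate P^* when observed,
% otherwise back off to the lower-order distribution
\[
  P_{\mathrm{bo}}(w \mid h) =
  \begin{cases}
    P^{*}(w \mid h) & \text{if } C(h, w) > 0 \\
    \alpha(h)\, P_{\mathrm{bo}}(w \mid h') & \text{otherwise}
  \end{cases}
\]
% Linear interpolation with the lower-order estimate
\[
  P_{\mathrm{int}}(w \mid h) = \lambda\, P_{\mathrm{ML}}(w \mid h)
    + (1 - \lambda)\, P_{\mathrm{int}}(w \mid h')
\]
```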
Smoothing with small bins (k=1) (cont.)
  • Back-off smoothing
  • Linear interpolation
  • Hybrid smoothing
Transformation with smoothed n-gram
  • Novel method (position statistics sketched below)
    • The smaller t-mean(w), the more important the word
    • Var(w) is used to balance t-mean(w) for active words
    • Active word: a word that can appear at any position in a sentence
  • Combined with back-off smoothing and linear interpolation
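A minimal sketch of per-word position statistics, assuming t-mean(w) is the mean relative position of w within a sentence and Var(w) its variance; the exact definitions are not preserved in the transcript:

```python
from collections import defaultdict

def position_statistics(sentences):
    """Mean and variance of each word's relative position in sentences.

    sentences: list of token lists. Relative position is i / (len - 1),
    so 0.0 is sentence-initial and 1.0 is sentence-final.
    """
    positions = defaultdict(list)
    for sent in sentences:
        if len(sent) < 2:          # a one-word sentence has no spread
            continue
        for i, w in enumerate(sent):
            positions[w].append(i / (len(sent) - 1))
    stats = {}
    for w, ps in positions.items():
        mean = sum(ps) / len(ps)
        var = sum((p - mean) ** 2 for p in ps) / len(ps)
        stats[w] = (mean, var)     # interpreted as (t-mean(w), Var(w))
    return stats

# A word with large variance appears at many positions (an "active" word);
# a small t-mean flags words tied to the beginning of sentences.
```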
Experiments

  • Observation: marginal (sentence-boundary) positions behave differently from middle positions

Experiments (cont.)
  • Comparison of the three smoothing techniques
Experiments (cont.)
  • Error rate with different numbers of position bins
Conclusions
  • The traditional n-gram model is enhanced by relaxing its stationarity assumption and exploiting word-position information in language modeling
Essential
  • Poisson distribution
  • Poisson mixture model (see the reconstruction below)
    • [Figure: a document x = (x1, …, xp) of class k is modeled as a weighted sum, with weights πk1 … πkRk, of Rk Poisson components]

  • Multivariate Poisson, dim = p (lexicon size)
  • Word clustering reduces the Poisson dimension => two-way mixtures
=> Two-way mixtures