


Less is More?

Yi Wu

Advisor: Alex Rudnicky


There is no data like more data!

Goal: Use less to Perform more

  • Identifying an informative subset from a large corpus for Acoustic Model (AM) training.

  • Expectation of the Selected Set

    • Good in Performance

    • Fast in Selection


  • The improvement in system performance becomes increasingly smaller as we keep adding data.

  • Training an acoustic model is time-consuming.

  • We need guidance on which data is most needed.

Approach Overview

  • Applied to well-transcribed data

  • Selection based on transcription

  • Choose a subset that has a “uniform” distribution over speech units (word, phoneme, character)

How to sample data wisely? A simple example

k Gaussian distributions with known priors ωi and unknown density functions fi(μi, σi)

How to sample wisely? A simplified example

  • We are given access to at most N examples.

  • We may choose how many examples to draw from each class.

  • We train the model using the maximum likelihood estimator (MLE).

  • When a new sample is generated, we use our model to determine its class.


    How should we sample to achieve minimum error?

The optimal Bayes Classifier

If we know the exact form of each fi(x), the Bayes classifier is optimal.
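
For reference, the standard Bayes decision rule for this setup, written with the symbols used earlier (priors ωi, class densities fi), is given below. It is the textbook form, not a transcription of the slide's own formula.

```latex
\hat{c}(x) = \arg\max_{1 \le i \le k} \; \omega_i \, f_i(x)
```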

To approximate the optimal

  • We use our MLE estimate of each fi(x) in place of the true density

  • The true error is bounded by the optimal Bayes error plus an error bound for our worst-estimated class
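
The simplified example can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names (`gaussian_pdf`, `fit_mle`, `classify`, `error_rate`) and every number in the usage note are hypothetical, and the classes are assumed to be one-dimensional Gaussians.

```python
import math
import random
import statistics

def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_mle(samples):
    """MLE for a 1-D Gaussian: sample mean and biased standard deviation."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples, mu)
    return mu, max(sigma, 1e-6)  # floor avoids division by zero (assumption)

def classify(x, priors, params):
    """Plug-in Bayes rule: pick the class i maximizing prior_i * f_i(x)."""
    scores = [w * gaussian_pdf(x, mu, s) for w, (mu, s) in zip(priors, params)]
    return max(range(len(scores)), key=scores.__getitem__)

def error_rate(priors, true_params, counts, n_test=2000, seed=0):
    """Train on counts[i] samples per class, then estimate the test error."""
    rng = random.Random(seed)
    # Estimate each class density from its own training budget.
    est = [fit_mle([rng.gauss(mu, s) for _ in range(n)])
           for (mu, s), n in zip(true_params, counts)]
    errors = 0
    for _ in range(n_test):
        # Draw a test sample: class from the known priors, value from the true density.
        label = rng.choices(range(len(priors)), weights=priors)[0]
        x = rng.gauss(*true_params[label])
        errors += classify(x, priors, est) != label
    return errors / n_test
```

For instance, with a budget of N = 20 examples, `error_rate([0.9, 0.1], [(0.0, 1.0), (3.0, 1.0)], [10, 10])` trains on an equal split, while `counts=[18, 2]` follows the priors. Comparing the two over several seeds gives a feel for how the allocation affects the error.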

Sample Uniformly

  • We want to sample each class equally.

    • The selected data will cover each class well.

    • This gives a robust estimate for each class.

Data Selection for ASR System

  • The prior is estimated independently by the language model.

  • To make the acoustic model accurate, we want to sample W uniformly.

  • The unit can be the phoneme, character, or word; we want its distribution to be uniform.

Entropy: Measure for “uniformness”

  • Use the entropy of the word (phoneme) distribution as the evaluation measure

    • Suppose the words (phonemes) have sample distribution p1, p2, …, pn

    • Choose the subset that maximizes H = -p1*log(p1) - p2*log(p2) - ... - pn*log(pn)

  • Maximizing entropy is equivalent to minimizing the KL distance from the uniform distribution, since KL(p || uniform) = log(n) - H
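
The entropy measure and its relation to the KL distance from the uniform distribution can be checked with a short Python sketch. The helper names `entropy` and `kl_from_uniform` are illustrative, and the unit counts are assumed to come from tokenized transcripts.

```python
import math
from collections import Counter

def entropy(counts):
    """Entropy (in nats) of the empirical distribution given by unit counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

def kl_from_uniform(counts, n_units):
    """KL(p || uniform over n_units) = log(n_units) - H(p)."""
    return math.log(n_units) - entropy(counts)

# Hypothetical unit sequences: one balanced, one dominated by a single unit.
balanced = Counter(["a", "b", "a", "b"])
skewed = Counter(["a", "a", "a", "b"])
```

Here `entropy(balanced)` equals log(2), so `kl_from_uniform(balanced, 2)` is zero, while the skewed counts give a lower entropy and a positive KL distance.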

Computational Issue

  • It is computationally intractable to find the transcription set that maximizes the entropy

  • Forward Greedy Search
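
A forward greedy search over transcriptions might look like the sketch below. The slides do not give the details, so several points are assumptions: the budget is counted in utterances (it could equally be hours of audio), each utterance is a list of units, ties go to the lowest index, and the name `greedy_select` is hypothetical. The naive loop rescans every remaining utterance at each step.

```python
import math
from collections import Counter

def entropy(counts):
    """Entropy (in nats) of the empirical distribution given by unit counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

def greedy_select(utterances, budget):
    """Pick up to `budget` utterances, each time adding the one that
    maximizes the entropy of the pooled unit counts."""
    selected = []
    pooled = Counter()
    remaining = set(range(len(utterances)))
    while remaining and len(selected) < budget:
        best_idx, best_h = None, -1.0
        for i in sorted(remaining):  # sorted for deterministic tie-breaking
            h = entropy(pooled + Counter(utterances[i]))
            if h > best_h:
                best_idx, best_h = i, h
        selected.append(best_idx)
        pooled.update(utterances[best_idx])
        remaining.remove(best_idx)
    return selected

# Hypothetical transcriptions, already split into units.
utterances = [["a", "a", "a"], ["a", "b"], ["c", "d"], ["a", "a"]]
chosen = greedy_select(utterances, budget=2)
```

With these toy utterances, `chosen` is `[1, 2]`: the search first takes the most balanced utterance, then the one that adds unseen units.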


  • There are multiple entropies we want to maximize.

  • Combination Method

    • Weighted Sum

    • Add sequentially
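
The weighted-sum option can be sketched as a single score over several unit levels. The level names, the weights, and the function `combined_score` are hypothetical, since the slides do not state how the weights were chosen. Only the weighted sum is sketched here.

```python
import math
from collections import Counter

def entropy(counts):
    """Entropy (in nats) of the empirical distribution given by unit counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)

def combined_score(level_counts, weights):
    """Weighted sum of the entropies measured at each unit level."""
    return sum(w * entropy(level_counts[level]) for level, w in weights.items())

# Hypothetical pooled counts for a candidate subset at two unit levels.
level_counts = {
    "word": Counter({"x": 1, "y": 1}),
    "phoneme": Counter({"p": 2}),
}
score = combined_score(level_counts, {"word": 0.5, "phoneme": 0.5})
```

A greedy selection could use `combined_score` in place of a single entropy when ranking candidate utterances.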

Experiment Setup

  • System: Sphinx III

  • Feature: 39-dimensional MFCC

  • Training Corpus: Chinese BN 97 (30 hr) + GaleY1 (810 hr)

  • Test Set: RT04 (60 min)

Experiment 3 (with VTLN)

Table 3

Summary

  • Choose data uniformly according to speech unit

  • Maximize entropy using greedy algorithm

  • Add data sequentially

Future Work

  • Combine Multiple Sources

  • Select Un-transcribed Data