This study addresses three problems in speech retrieval: erroneous transcription files, vocabulary mismatch between queries and transcriptions, and noisy sound files. Document expansion is proposed to improve retrieval accuracy by adding new terms to a transcription and reweighting its existing ones. Experiments show significant improvements in retrieval results.
Document Expansion for Speech Retrieval (Singhal, Pereira). Teoman Toraman, Çağrı Toraman. Bilkent University, 2010
Problem Statement: Speech File (news_today.wav) -> Automatic (or Manual) Speech Recognition -> Reasonable Transcription File (news_today.rtf)
Problem Statement: Aboutness: Fatal train crash in Italy. Query -> Indexing -> Results: D1, D2
Problem Statement: Noisy / Dirty Sound File -> Corrupted / Erroneous Automatic (or Manual) Speech Recognition -> Erroneous Transcription File
Problem Statement: Same Query -> Indexing over the Corrupted / Erroneous Transcription -> Results: D2 (Vocabulary Mismatch)
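The vocabulary mismatch on this slide can be illustrated with a small sketch. The document texts, the misrecognitions ("rain", "it a lee"), and the boolean AND matching are all assumptions made for illustration; the slides do not specify the indexing scheme or the transcript contents.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    term_sets = [index.get(term, set()) for term in query.lower().split()]
    return sorted(set.intersection(*term_sets)) if term_sets else []


# Hypothetical transcriptions: both documents are about the same event.
clean = {
    "D1": "fatal train crash in italy kills several passengers",
    "D2": "italy train crash investigation begins after fatal accident",
}
# Hypothetical recognition errors in D1: "train" -> "rain", "italy" -> "it a lee".
noisy = dict(clean, D1="fatal rain crash in it a lee kills several passengers")

query = "train crash italy"
clean_results = search(build_index(clean), query)
noisy_results = search(build_index(noisy), query)
print(clean_results, noisy_results)
```

With the clean transcription both documents match; once D1 loses its key terms, only D2 is returned even though D1 is still about the train crash.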
Problem Statement: Noisy / Dirty Sound File -> Automatic (or Manual) Speech Recognition -> Corrupted / Erroneous Transcription File
Recognition Mistakes:
• Deletions
• Wrong term weighting
• Insertions
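The three kinds of recognition mistakes can be made concrete by comparing term counts of a reference transcription with those of a recognized one. This is a rough bag-of-words sketch, not an alignment-based error measure; the function name `recognition_mistakes` and the sample sentences are hypothetical.

```python
from collections import Counter


def recognition_mistakes(reference, recognized):
    """Compare bag-of-words term counts of two transcriptions."""
    ref = Counter(reference.lower().split())
    hyp = Counter(recognized.lower().split())
    # Deletions: terms spoken but missing from the recognized text.
    deletions = sorted(t for t in ref if t not in hyp)
    # Insertions: terms in the recognized text that were never spoken.
    insertions = sorted(t for t in hyp if t not in ref)
    # Wrong weighting: shared terms whose frequency changed (reference, recognized).
    reweighted = {t: (ref[t], hyp[t]) for t in ref if t in hyp and ref[t] != hyp[t]}
    return deletions, insertions, reweighted


# Hypothetical example: one "train" is heard as "rain", the other as "crash".
reference = "train crash in italy train derailed"
recognized = "rain crash in italy crash derailed"
deletions, insertions, reweighted = recognition_mistakes(reference, recognized)
print(deletions, insertions, reweighted)
```

Here "train" is deleted, "rain" is inserted, and "crash" ends up with twice its true frequency, which distorts its weight in the index.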
Solution: Corrupted / Erroneous Transcription -> Document Expansion -> Expanded Transcription
Solution: What is Document Expansion? Step 1) Run the Corrupted / Erroneous transcription as a query against a RELATED CORPUS. Step 2) Retrieve the 10 most similar files. Step 3) Reweight existing terms & add new terms from those files.
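The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenters' implementation: raw term frequencies with cosine similarity stand in for the retrieval model, the update is a Rocchio-style weighted sum, and the names `expand_document`, `alpha`, `beta`, and `max_new_terms` as well as the sample texts are hypothetical. Only the use of a related corpus and of 10 similar files comes from the slide.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector (an assumed, simplified weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def expand_document(doc_text, corpus, k=10, alpha=1.0, beta=0.5, max_new_terms=5):
    doc = tf_vector(doc_text)
    # Step 1: use the (possibly erroneous) transcription as a query.
    scored = sorted(
        ((cosine(doc, tf_vector(text)), text) for text in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Step 2: keep the k most similar corpus documents (10 on the slide).
    neighbours = [tf_vector(text) for score, text in scored[:k] if score > 0]
    expanded = {t: alpha * w for t, w in doc.items()}
    if not neighbours:
        return expanded
    centroid = Counter()
    for vec in neighbours:
        for t, w in vec.items():
            centroid[t] += w / len(neighbours)
    # Step 3a: reweight terms already present in the transcription.
    for t in doc:
        expanded[t] += beta * centroid[t]
    # Step 3b: add the strongest terms the transcription is missing.
    new_terms = sorted((t for t in centroid if t not in doc),
                       key=lambda t: (-centroid[t], t))
    for t in new_terms[:max_new_terms]:
        expanded[t] = beta * centroid[t]
    return expanded


# Hypothetical noisy transcription and related corpus.
noisy_doc = "fatal rain crash in it a lee"
related_corpus = [
    "train crash in italy leaves many injured",
    "italy train crash investigation fatal",
    "stock markets rise in europe",
]
expanded = expand_document(noisy_doc, related_corpus, k=2, max_new_terms=2)
print(sorted(expanded.items(), key=lambda item: -item[1]))
```

In this toy run the expanded vector gains "italy" and "train", which the recognizer had lost, and "crash" is boosted above the misrecognized "rain", so the earlier query would match the document again.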
Experiments & Results: 10-15% loss; 20-25% loss