
Automatic Transcript Generation

Automatic Transcript Generation. Helmer Strik, A2RT, Dept. of Language & Speech, University of Nijmegen. Problem: we have audio from radio & TV and we need transcripts. Solution: ASR (Automatic Speech Recognition). History of ASR: it all started more than 100 years ago.





Presentation Transcript


  1. Automatic Transcript Generation
     Helmer Strik, A2RT
     Dept. of Language & Speech, University of Nijmegen

  2. Problem & Solution
     Problem:
     • We have audio from radio & TV
     • We need transcripts
     Solution: ASR (Automatic Speech Recognition)

  3. History of ASR
     It all started more than 100 years ago

  4. History of ASR
     • 1870 - Alexander Graham Bell: make speech visible, for the hearing impaired
     • 1952 - AT&T Bell Laboratories: first ASR system, for the ten English digits
     • 2001 - ASR is ‘everywhere’:
       • PC: dictation + ‘Command & Control’
       • mobile phones (hands-free)
       • call centers
       • tapping phone calls

  5. First: A/D-conversion
     Before ASR: A/D-conversion
     Speech (analogue & continuous) → mic. + sound card → WAV file (digital & discrete)

  6. What is ASR?
     Answer: conversion from speech to text
     X: unknown speech signal → ASR → W: a string of words

  7. How: probabilistic approach
     Find the W that maximizes P(W|X)
     P(W|X) = P(X|W) * P(W) / P(X)
     • P(W) - language model
     • P(X|W) - acoustic model
       • Whole word models
       • Phoneme models + Lexicon

  8. ASR
     ASR =
     • Phoneme models (HMMs) + Lexicon - P(X|W)
     • Language model - P(W)

  9. Training
     HMMs & LMs are trained:
     Speech + manual transcripts (lexicon) → training procedure → ASR:
     • HMMs (Hidden Markov Models)
     • Language Models

  10. Decoding
      Automatic Transcript Generation:
      X: unknown speech signal → ASR → W: the automatic transcripts

  11. C-3PO - 6 million languages

  12. MUMIS
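
The A/D-conversion step on slide 5 can be illustrated with a minimal sketch. It is not taken from the presentation: the 16 kHz sample rate, the 16-bit depth, the 440 Hz test tone, and the function names `sample_and_quantize`, `write_wav` and `tone` are all assumptions chosen for illustration. It samples a continuous signal at discrete time steps and rounds each sample to a discrete amplitude level, which is the kind of data a WAV file stores.

```python
import math
import struct
import wave


def sample_and_quantize(signal, duration_s, sample_rate=16000, bits=16):
    """Sample a continuous signal (a function of time in seconds, with
    values in [-1, 1]) and quantize every sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1              # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                      # discrete time step
        x = max(-1.0, min(1.0, signal(t)))       # clip to the valid range
        samples.append(round(x * max_level))     # discrete amplitude level
    return samples


def write_wav(path, samples, sample_rate=16000):
    """Store 16-bit mono samples in a WAV file (little-endian PCM)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)                      # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


def tone(t):
    """Hypothetical stand-in for the analogue speech signal: a 440 Hz sine."""
    return 0.5 * math.sin(2 * math.pi * 440.0 * t)


# 10 ms of "audio": continuous in, digital and discrete out.
pcm = sample_and_quantize(tone, duration_s=0.01)
```

Calling `write_wav("tone.wav", pcm)` would then store the samples on disk; the file name is only an example.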

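The decision rule on slide 7 can be sketched in a few lines. Since P(X) is the same for every candidate W, maximizing P(W|X) amounts to maximizing P(X|W) * P(W), or equivalently the sum of their logarithms. Everything concrete below is an assumption for illustration only: the tiny bigram table, the two candidate word strings, and the acoustic scores are made up, and a real recognizer would obtain P(X|W) from HMM phoneme models and a lexicon rather than from a lookup table.

```python
import math

# Hypothetical bigram language model P(w | previous word).
# "<s>" marks the start of the sentence. All values are made up.
BIGRAM = {
    ("<s>", "recognize"): 0.6,
    ("<s>", "wreck"): 0.4,
    ("recognize", "speech"): 0.9,
    ("wreck", "a"): 0.8,
    ("a", "nice"): 0.5,
    ("nice", "beach"): 0.3,
}


def log_p_w(words, floor=1e-6):
    """log P(W) under the bigram model; unseen word pairs get a small floor."""
    total = 0.0
    prev = "<s>"
    for w in words:
        total += math.log(BIGRAM.get((prev, w), floor))
        prev = w
    return total


def decode(acoustic_log_likelihoods):
    """Return the W that maximizes log P(X|W) + log P(W).

    P(X) is identical for all candidates, so it is left out of the comparison.
    """
    return max(
        acoustic_log_likelihoods,
        key=lambda w: acoustic_log_likelihoods[w] + log_p_w(w),
    )


# Made-up acoustic scores log P(X|W) for two candidate word strings.
candidates = {
    ("recognize", "speech"): -12.0,
    ("wreck", "a", "nice", "beach"): -11.5,
}

best = decode(candidates)
```

In this toy setup the second candidate has the slightly better acoustic score, but the language model favours the first one, so the combined score selects `("recognize", "speech")`.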