
Letter to Phoneme Alignment

Letter to Phoneme Alignment Using Graphical Models. N. Bolandzadeh, R. Rabbany, Dept. of Computing Science, University of Alberta.


Presentation Transcript


  1. Letter to Phoneme Alignment Using Graphical Models. N. Bolandzadeh, R. Rabbany, Dept. of Computing Science, University of Alberta.

  2. Text to Speech Problem
     • Conversion of text to speech (TTS)
     • Automated telecom services
     • E-mail by phone
     • Banking systems
     • Handicapped people

  3. Pronunciation
     • Pronunciation of words: dictionary words and non-dictionary words
     • Phonetic analysis
     • Dictionary lookup? Language is alive: new words are added; proper nouns
     • Machine learning → higher accuracy → L2P alignment is needed

  4. L2P Problem
     • Automatic speech recognition and spelling correction
     • Letter to phoneme alignment
       • Letter: c a k e
       • Phoneme: k ei k

  5. It's Not Trivial! Why?
     • No consistency
       • City /s/
       • Cake /k/
       • Kid /k/
     • No transparency (letter count vs. phoneme count)
       • K i d (3) → /k i d/ (3)
       • S i x (3) → /s i k s/ (4)
       • Q u e u e (5) → /k j u:/ (3)
       • A x e (3) → /a k s/ (3)
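The length mismatches on this slide mean that a word usually admits several candidate alignments. The sketch below enumerates them under an assumption that is not stated in the slides: each chunk pairs one letter with one phoneme, two letters with one phoneme, or one letter with two phonemes. The function name `enumerate_alignments` and the `shapes` parameter are hypothetical.

```python
def enumerate_alignments(letters, phonemes, shapes=((1, 1), (2, 1), (1, 2))):
    """Return every left-to-right split of (letters, phonemes) into chunk pairs.

    `shapes` lists the allowed (letter count, phoneme count) per chunk.
    These shapes are an illustrative assumption, not taken from the slides.
    """
    if not letters and not phonemes:
        return [[]]  # exactly one way to align two empty sequences
    results = []
    for n_letters, n_phonemes in shapes:
        if n_letters <= len(letters) and n_phonemes <= len(phonemes):
            head = (tuple(letters[:n_letters]), tuple(phonemes[:n_phonemes]))
            for tail in enumerate_alignments(
                letters[n_letters:], phonemes[n_phonemes:], shapes
            ):
                results.append([head] + tail)
    return results


if __name__ == "__main__":
    # "six" has 3 letters and 4 phonemes, so one letter must cover two phonemes.
    for alignment in enumerate_alignments(list("six"), ["s", "i", "k", "s"]):
        print(alignment)
```

Even for a three-letter word there is more than one candidate, which is why the aligner has to learn which chunk pairs are plausible rather than read them off the spelling.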

  6. Framework: Dictionary → L2P aligner → Aligned Dictionary
     Dictionary (word, phonemes):
       Brick         brIk
       Brightening   br2tHIN
       British       brItIS
       Bronx         brQNks
       Bugle         bjugP
       Buoy          b4
     Aligned dictionary (letters, phonemes):
       b|r|i|ck|            b|r|I|k|
       b|r|ig|ht|en|i|ng|   b|r|2|t|H|I|N|
       b|r|i|t|i|sh|        b|r|I|t|I|S|
       b|r|o|n|x|           b|r|Q|N|ks|
       b|u|g|le|            b|ju|g|P|
       bu|oy|               b|4|
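The aligned dictionary entries on this slide use `|` to close each chunk, and the letter side and phoneme side have the same number of chunks. A minimal sketch of reading one entry into chunk pairs, assuming exactly that format (the function name `parse_aligned` is hypothetical):

```python
def parse_aligned(letter_side, phoneme_side):
    """Split two '|'-delimited strings into a list of (letters, phonemes) pairs."""
    letter_chunks = [chunk for chunk in letter_side.split("|") if chunk]
    phoneme_chunks = [chunk for chunk in phoneme_side.split("|") if chunk]
    if len(letter_chunks) != len(phoneme_chunks):
        raise ValueError(
            f"chunk count mismatch: {len(letter_chunks)} letter chunks "
            f"vs {len(phoneme_chunks)} phoneme chunks"
        )
    return list(zip(letter_chunks, phoneme_chunks))


if __name__ == "__main__":
    print(parse_aligned("b|r|i|ck|", "b|r|I|k|"))
```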

  7. Evaluation
     • No aligned dictionary
     • Unsupervised learning
     • Previously, the aligner was tied to a generator
     • Evaluation on the percentage of correctly predicted phonemes and words
     • Diagram labels: Tee’s L2P Generator, Aligned Dictionary, Accuracy
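The two measures named on this slide can be sketched as follows. The slides do not say how predicted and reference phonemes are matched, so this sketch assumes a simple position-by-position comparison against the reference length; the function names and the sample predictions are hypothetical.

```python
def word_accuracy(predicted, reference):
    """Percentage of words whose whole phoneme sequence is predicted correctly."""
    correct = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * correct / len(reference)


def phoneme_accuracy(predicted, reference):
    """Percentage of reference phonemes matched at the same position (assumed rule)."""
    correct = 0
    total = 0
    for pred_word, ref_word in zip(predicted, reference):
        total += len(ref_word)
        correct += sum(p == r for p, r in zip(pred_word, ref_word))
    return 100.0 * correct / total


if __name__ == "__main__":
    # Hypothetical generator output for "kid" and "six"; the second word has one error.
    reference = [["k", "i", "d"], ["s", "i", "k", "s"]]
    predicted = [["k", "i", "d"], ["s", "i", "k", "z"]]
    print(word_accuracy(predicted, reference))
    print(phoneme_accuracy(predicted, reference))
```

A position-wise comparison penalizes insertions and deletions heavily; an edit-distance based score is a common alternative when predicted and reference lengths differ.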

  8. Model of Our Problem
     Letters:  B | r | i | t | i | sh |
     Phonemes: B | r | I | t | I | S |

  9. Static Model, Structure
     • Independent sub-alignments
     • Letter nodes: l1, l2, l3, l4, ..., ln-1, ln
     • Alignment nodes: a1, a2, ..., ak
     • Phoneme nodes: p1, p2, p3, p4, ..., pm-1, pm
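One way to read "independent sub-alignments" is that the probability of a full alignment factorizes over its chunks. The formulas below are a sketch under that reading, not the authors' stated model. The symbols are assumptions: each sub-alignment a_j pairs a letter chunk \lambda_j with a phoneme chunk \pi_j, \theta(\lambda, \pi) is the probability of that chunk pair, and A(l, p) is the set of all alignments of the word.

```latex
% Probability of one alignment under independent sub-alignments (assumed form)
P(a_1, \dots, a_k \mid \theta) = \prod_{j=1}^{k} \theta(\lambda_j, \pi_j)

% Likelihood of a word: sum over all of its candidate alignments
P(l_1 \dots l_n,\; p_1 \dots p_m \mid \theta)
  = \sum_{a \in A(l, p)} \; \prod_{j=1}^{k_a} \theta(\lambda_j, \pi_j)
```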

  10. Static Model, Learning
      • EM
      • Initialize parameters
      • Expectation step: parameters → alignments
      • Maximization step: alignments → parameters
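The EM loop on this slide can be sketched as below. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions: phonemes are written one character each (as in the dictionary notation of the framework slide), chunks are limited to the shapes in `SHAPES`, the parameters form a single distribution over chunk pairs, and the expectation step enumerates all candidate alignments instead of using dynamic programming. All names are hypothetical.

```python
from collections import defaultdict
from math import prod

# Assumed (letter count, phoneme count) shapes allowed for one chunk.
SHAPES = ((1, 1), (2, 1), (1, 2))


def candidate_alignments(letters, phonemes):
    """Enumerate all left-to-right chunkings of two strings."""
    if not letters and not phonemes:
        return [[]]
    found = []
    for n_l, n_p in SHAPES:
        if n_l <= len(letters) and n_p <= len(phonemes):
            head = (letters[:n_l], phonemes[:n_p])
            for tail in candidate_alignments(letters[n_l:], phonemes[n_p:]):
                found.append([head] + tail)
    return found


def train_em(pairs, iterations=20):
    """Estimate chunk-pair probabilities from unaligned (word, phonemes) pairs."""
    candidates = [candidate_alignments(l, p) for l, p in pairs]
    chunk_pairs = {c for cands in candidates for a in cands for c in a}
    # Initialize parameters: uniform over every chunk pair that can occur.
    theta = {c: 1.0 / len(chunk_pairs) for c in chunk_pairs}
    for _ in range(iterations):
        # Expectation step: parameters -> posterior weight of each alignment.
        counts = defaultdict(float)
        for cands in candidates:
            scores = [prod(theta[c] for c in a) for a in cands]
            total = sum(scores)
            for alignment, score in zip(cands, scores):
                for c in alignment:
                    counts[c] += score / total
        # Maximization step: expected counts -> renormalized parameters.
        norm = sum(counts.values())
        theta = {c: counts[c] / norm for c in chunk_pairs}
    return theta


def best_alignment(letters, phonemes, theta):
    """Pick the candidate alignment with the highest product of chunk probabilities."""
    return max(
        candidate_alignments(letters, phonemes),
        key=lambda a: prod(theta.get(c, 0.0) for c in a),
    )


if __name__ == "__main__":
    data = [("brick", "brIk"), ("british", "brItIS"), ("bronx", "brQNks"),
            ("bugle", "bjugP"), ("buoy", "b4")]
    theta = train_em(data)
    for word, phonemes in data:
        print(best_alignment(word, phonemes, theta))
```

Enumeration keeps the sketch short but grows quickly with word length; a forward-backward pass over the alignment lattice is the usual replacement on a full dictionary.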

  11. Result of Static Model

  12. Dynamic Model
      • Sequence of data
      • Unrolled model for T = 3 slices
      • Letter nodes: l1, l2, l3, l4, l5, l6
      • Alignment nodes: a1, a2, ..., ak
      • Phoneme nodes: p1, p2, p3, p4, p5, p6

  13. Questions
