
Letter to Phoneme Alignment






  1. Letter to Phoneme Alignment Reihaneh Rabbany, Shahin Jabbari

  2. Outline • Motivation • Problem and its Challenges • Relevant Works • Our Work • Formal Model • EM • Dynamic Bayesian Network • Evaluation • Letter to Phoneme Generator • AER • Result

  3. Text to Speech Problem • Conversion of Text to Speech (TTS) • Automated Telecom Services • E-mail by Phone • Banking Systems • Handicapped People

  4. Phonetic Analysis (Word → Pronunciation) • Pronunciation of the words • Dictionary Words: Dictionary Look-up • Non-Dictionary Words: Phonetic Analysis • Language is alive, new words are added • Proper Nouns

  5. Outline • Motivation • Problem and its Challenges • Relevant Works • Our Work • Formal Model • EM • Dynamic Bayesian Network • Evaluation • Letter to Phoneme Generator • AER • Result

  6. L2P Problem • Letter to Phoneme Alignment • Letter: c a k e • Phoneme: k ei k

  7. Challenges • No Consistency • City / s / • Cake / k / • Kid / k / • No Transparency • K i d (3) / k i d / (3) • S i x (3) / s i k s / (4) • Q u e u e (5) / k j u: / (3) • A x e (3) / a k s / (3)

  8. Outline • Motivation • Problem and its Challenges • Relevant Works • Our Work • Formal Model • EM • Dynamic Bayesian Network • Evaluation • Letter to Phoneme Generator • AER • Result

  9. One-to-One EM (Daelemans et al., 1996) • Length of word = length of pronunciation • Produce all possible alignments • Inserting null letters/phonemes • Compute alignment probabilities
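The "produce all possible alignments by inserting nulls" step can be sketched as below. This is a minimal illustration, not the paper's implementation; the `EPS` null marker and the recursive enumeration are our assumptions:

```python
EPS = "-"  # null letter/phoneme marker (our notation, not the paper's)

def all_alignments(letters, phonemes):
    """Enumerate every one-to-one alignment of two sequences.

    Each alignment is a list of (letter, phoneme) pairs where either
    side may be the null symbol EPS (but never both at once).
    """
    if not letters and not phonemes:
        return [[]]
    results = []
    if letters and phonemes:  # pair a letter with a phoneme
        for rest in all_alignments(letters[1:], phonemes[1:]):
            results.append([(letters[0], phonemes[0])] + rest)
    if letters:               # letter aligns to a null phoneme
        for rest in all_alignments(letters[1:], phonemes):
            results.append([(letters[0], EPS)] + rest)
    if phonemes:              # null letter aligns to a phoneme
        for rest in all_alignments(letters, phonemes[1:]):
            results.append([(EPS, phonemes[0])] + rest)
    return results

# "axe" -> /a k s/ from the challenges slide
aligns = all_alignments(list("axe"), ["a", "k", "s"])
```

Each candidate alignment can then be scored, and per-pair probabilities re-estimated, as the EM procedure on the later slides does.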

  10. Decision Tree (Black et al., 1996) • Train a CART Using an Aligned Dictionary • Why CART? • A Single Tree for Each Letter

  11. Kondrak • Alignments are not always one-to-one • A x e  / a k s / • B oo k  / b ú k / • Only Null Phonemes (no null letters) • Similar to One-to-One EM • Produce All Possible Alignments • Compute the Probabilities

  12. Outline • Motivation • Problem and its Challenges • Relevant Works • Our Work • Formal Model • EM • Dynamic Bayesian Network • Evaluation • Letter to Phoneme Generator • AER • Result

  13. Formal Model • Word: sequence of letters • Pronunciation: sequence of phonemes • Alignment: sequence of subalignments • Problem: Finding the most probable alignment

  14. Many-to-Many EM
  1. Initialize prob(SubAlignments)
  // Expectation Step
  2. For each word in training_set
     2.1. Produce all possible alignments
     2.2. Choose the most probable alignment
  // Maximization Step
  3. For all subalignments
     3.1. Compute new_p(SubAlignments)
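The pseudocode above can be fleshed out as a runnable hard-EM (Viterbi-EM) sketch. The chunk-size limit of 2, the uniform initialization, and the "letter chunk may map to zero phonemes" convention are our illustrative assumptions, not details from the slides:

```python
from collections import defaultdict
from math import prod

MAX_CHUNK = 2  # max letters/phonemes per subalignment (an assumption)

def alignments(word, pron):
    """Yield segmentations of word/pron into (letter-chunk, phoneme-chunk)
    subalignments; a letter chunk may map to zero phonemes (null phoneme)."""
    if not word and not pron:
        yield ()
        return
    if not word:  # leftover phonemes cannot be aligned
        return
    for i in range(1, min(MAX_CHUNK, len(word)) + 1):
        for j in range(min(MAX_CHUNK, len(pron)) + 1):
            for rest in alignments(word[i:], pron[j:]):
                yield ((word[:i], pron[:j]),) + rest

def viterbi_em(training, iterations=5):
    """training: list of (word, phoneme-tuple) pairs."""
    prob = defaultdict(lambda: 1.0)            # 1. uniform initialization
    for _ in range(iterations):
        counts = defaultdict(float)
        for word, pron in training:            # 2. E-step: best alignment
            best = max(alignments(word, pron),
                       key=lambda a: prod(prob[s] for s in a))
            for sub in best:
                counts[sub] += 1.0
        total = sum(counts.values())           # 3. M-step: re-estimate
        prob = defaultdict(float,
                           {s: c / total for s, c in counts.items()})
    return prob

p = viterbi_em([("cake", ("k", "ei", "k")), ("kid", ("k", "i", "d"))])
```

Choosing only the single most probable alignment per word (hard EM) keeps the sketch short; soft EM would instead accumulate fractional counts over all alignments.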

  15. Dynamic Bayesian Network • Model • Subalignments are considered as hidden variables • Learn the DBN by EM • [Model figure: letter node lᵢ, phoneme node pᵢ, hidden subalignment node aᵢ]

  16. Context-Dependent DBN • Context-independence assumption • Makes the model simpler • It is not always a correct assumption • Example: Chat and Hat • [Model figure: lᵢ, pᵢ with subalignment aᵢ depending on aᵢ₋₁]

  17. Outline • Motivation • Problem and its Challenges • Relevant Works • Our Work • Formal Model • EM • Dynamic Bayesian Network • Evaluation • Letter to Phoneme Generator • AER • Result

  18. Evaluation Difficulties • Unsupervised Evaluation • No Aligned Dictionary • Solutions • How much it boosts a supervised module • Letter to Phoneme Generator • Comparing the result with a gold alignment • AER

  19. Letter to Phoneme Generator • Percentage of correctly generated phonemes and words • How does it work? • Finding Chunks • Binary Classification Using Instance-Based Learning • Phoneme Prediction • Phoneme is predicted independently for each letter • Phoneme is predicted for each chunk • Hidden Markov Model
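The "percentage of correctly generated phonemes and words" metric can be sketched as below. The position-by-position phoneme comparison is a simplification of ours (edit-distance-based scoring is also common), not necessarily the generator's exact scoring:

```python
def l2p_accuracy(predicted, gold):
    """Phoneme- and word-level accuracy over parallel pronunciation lists.

    Phoneme accuracy naively compares position by position (a
    simplification); word accuracy counts exact pronunciation matches.
    """
    ph_correct = ph_total = word_correct = 0
    for pred, ref in zip(predicted, gold):
        ph_correct += sum(a == b for a, b in zip(pred, ref))
        ph_total += max(len(pred), len(ref))
        word_correct += (pred == ref)
    return ph_correct / ph_total, word_correct / len(gold)

# one wrong phoneme in the second word
scores = l2p_accuracy([("k", "ei", "k"), ("k", "i", "t")],
                      [("k", "ei", "k"), ("k", "i", "d")])
```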

  20. Alignment Error Rate • AER • Evaluating by Alignment Error Rate • Counting common pairs between • Our aligned output • The gold alignment • Calculating AER
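The count-common-pairs calculation above can be sketched with the standard AER formula from the word-alignment literature, assuming the gold standard contains only sure links (the slides do not spell out the exact variant used):

```python
def aer(predicted, gold):
    """Alignment Error Rate between two sets of (letter, phoneme) index pairs.

    With only sure links in the gold standard, AER reduces to
    1 - 2|P ∩ G| / (|P| + |G|); identical sets give 0.0.
    """
    common = len(predicted & gold)
    return 1.0 - 2.0 * common / (len(predicted) + len(gold))

gold = {(0, 0), (1, 1), (2, 2)}  # e.g. k->k, i->i, d->d for "kid"
perfect = aer(gold, gold)        # 0.0
```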

  21. Outline • Motivation • Problem and its Challenges • Relevant Works • Our Work • Formal Model • EM • Dynamic Bayesian Network • Evaluation • Letter to Phoneme Generator • AER • Result

  22. Results • 10-fold cross-validation
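The 10-fold cross-validation protocol can be sketched as a simple round-robin split; this is a generic illustration, not the authors' exact fold assignment:

```python
def kfold(data, k=10):
    """Yield (train, test) splits for k-fold cross-validation.

    Items are assigned to folds round-robin; each fold serves as the
    held-out test set exactly once.
    """
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j in range(k) if j != i for x in folds[j]]
        yield train, test

splits = list(kfold(list(range(20)), k=10))  # 10 splits of 18 train / 2 test
```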
