
An extension to the SSML for diacritics auto-completion

R&D Centre, Vocal Services Section. An extension to the SSML for diacritics auto-completion. W3C Workshop, Beijing, 2nd of November 2005. Plan of the presentation: the nature of the problem; similarities among other languages; possible solutions; discussion.


Presentation Transcript


  1. R&D Centre, Vocal Services Section. An extension to the SSML for diacritics auto-completion. W3C Workshop, Beijing, 2nd of November 2005

  2. Plan of the presentation
     • The nature of the problem
     • Similarities among other languages
     • Possible solutions
     • Discussion
     Regarding: Diacritics auto-completion

  3. Diacritics
     A diacritical mark (or diacritic), sometimes called an accent mark, is a mark added to a letter to alter a word's pronunciation or to distinguish between similar words.
     Example: Polish letters with diacritics: Ą ą Ć ć Ę ę Ł ł Ń ń Ó ó Ś ś Ź ź Ż ż
     • The Polish alphabet contains 35 letters = 26 basic + 9 with diacritics
     • Letters with diacritics are pronounced differently from the letters without them
     • Included in ISO-8859-2, Unicode, CP-1250, DOS 852…
     • Not included in the 7-bit US-ASCII codepage
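The codepage claims on this slide can be checked with a few lines of code. The following is a minimal sketch, assuming that Python's standard codec names `iso8859_2`, `cp1250` and `cp852` correspond to the codepages named on the slide; the helper `encodable` and the variable names are illustrative only.

```python
# The 18 Polish letters with diacritics listed on the slide.
POLISH_DIACRITICS = "ĄąĆćĘęŁłŃńÓóŚśŹźŻż"

# Python codec names assumed to match the codepages named on the slide.
CODEPAGES = ["iso8859_2", "cp1250", "cp852", "utf-8"]


def encodable(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Which codecs can represent the full set of Polish letters?
support = {codec: encodable(POLISH_DIACRITICS, codec)
           for codec in CODEPAGES + ["ascii"]}

# The same letter is stored as different bytes in each codepage.
bytes_for_a_ogonek = {codec: "ą".encode(codec) for codec in CODEPAGES}
```

Printing `support` and `bytes_for_a_ogonek` shows which codecs accept the letters and how the byte values for "ą" differ between them.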

  4. Why do Polish diacritics sometimes disappear?
     • They cannot be entered while typing
       • The application or hardware does not support non-US-ASCII characters
       • Improper regional settings in the OS or firmware
     • Codepage hell
       • All the codepages differ from each other
       • Unicode (UTF-8) is still not very popular
       • Diacritics are pruned on WWW-SMS gateways
     • They are a little hard to type
       • As an AltGr+<letter> combination on a PC keyboard (the "Polish programmer" variant of the US keyboard)
       • As the 5th or later letter on a key of a mobile phone keypad (key 2 sequence = "ABC2ĄĆ")
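Two of the failure modes listed on this slide can be reproduced in a short sketch: a codepage mismatch, and pruning to US-ASCII. The pruning function is only an assumption about what a gateway might do, not a description of any real gateway; the names `STRIP_TABLE` and `prune_diacritics` are hypothetical.

```python
# Failure mode 1: codepage mismatch.
# Text written as CP-1250 is read back as if it were ISO-8859-2.
original = "mąki"
misread = original.encode("cp1250").decode("iso8859_2")

# Failure mode 2: pruning to US-ASCII (assumed gateway behaviour).
# An explicit table is used because "ł" has no Unicode decomposition,
# so normalization-based accent stripping would leave it untouched.
STRIP_TABLE = str.maketrans("ĄąĆćĘęŁłŃńÓóŚśŹźŻż", "AaCcEeLlNnOoSsZzZz")


def prune_diacritics(text: str) -> str:
    """Replace each Polish letter with a diacritic by its base ASCII letter."""
    return text.translate(STRIP_TABLE)


pruned = prune_diacritics("zęby żeby słońce")
```

After pruning, "zęby" and "żeby" collapse into the same string, which is why the loss cannot be undone by a simple reverse table.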

  5. Quasi-Polish text (without diacritics)
     • Is not orthographically correct
     • Does not comply with netiquette
     • Is, in fact, not Polish
     • Cannot be transformed into Polish with simple substitution rules
     • Speech synthesised from this text may be incomprehensible
     …but:
     • Sometimes it is the only way to represent the text
     • Is easier to write, so it can be written faster
     • Can be read quite easily by a human, as if it were written correctly (because of the nature of the "human reading device")
     thus: it is widespread in Polish e-mails, SMS messages, news posts and chats

  6. Examples
     slonce -> słońce (Eng. the Sun): unambiguous mapping
     maki -> maki (Eng. poppies; nominative, plural) or mąki (Eng. flour; genitive, singular)
     Question: add a diacritic or not?
     zeby -> zęby (Eng. teeth; nominative, plural) or żeby (Eng. in order that)
     Question: where to add a diacritic?
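The three examples on this slide can be reproduced by generating every possible diacritic variant of a word and keeping those found in a word list. This is a minimal sketch, not the authors' method: `TOY_LEXICON` is a five-word stand-in for a real Polish dictionary, the function name `candidates` is hypothetical, and only lowercase input is handled.

```python
from itertools import product

# For each ASCII letter, the Polish letters it may stand for in quasi-Polish text.
VARIANTS = {
    "a": "aą", "c": "cć", "e": "eę", "l": "lł", "n": "nń",
    "o": "oó", "s": "sś", "z": "zźż",
}

# Toy word list standing in for a real Polish dictionary (assumption).
TOY_LEXICON = {"słońce", "maki", "mąki", "zęby", "żeby"}


def candidates(word: str, lexicon: set = TOY_LEXICON) -> list:
    """Return every diacritic variant of word that the lexicon accepts."""
    # Each position offers the letter itself plus its possible diacritic forms.
    options = [VARIANTS.get(ch, ch) for ch in word]
    variants = ("".join(combo) for combo in product(*options))
    return sorted(v for v in variants if v in lexicon)


print(candidates("slonce"))  # one candidate: unambiguous
print(candidates("maki"))    # two candidates: add a diacritic or not?
print(candidates("zeby"))    # two candidates: where to add a diacritic?
```

A word list alone narrows the choice but cannot pick between the remaining candidates; that decision needs the surrounding context.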

  7. Other languages
     • Czech, Slovak
       • The problem with diacritics is very similar to Polish
     • German
       • Umlauts "ä", "ü", "ö" and the sharp s "ß"
     • Russian
       • Volapuk encoding: an informal romanization used in SMS messages
       • e.g. "Ж" = "}" + "|" + "{"
     • French
       • Accents that strongly affect pronunciation, e.g. "è", "é", "ê"
       • Other diacritics: "ë", "ï", "ô", "û"
     • … and many others

  8. How to classify the problem?
     • A new dialect?
     • An alternative spelling (context-dependent orthography)?
     • An erroneous text that requires correction (jargon)?

  9. Example: Multi-channel access to Instant Messaging
     [Diagram labels: Home IM user; Mobile user; SMS gateway; Correct text; Text without diacritics; Message; user text; IM Server; Text Processing; Speech Synthesis; Visually impaired user]
     Sample message: From: chris, Date: 2nd Nov 05, Time: 10h15, Msg: <msg content>

  10. Variant 1: correction by IM server
      • Do everything on the server side
      • The SSML content developer takes care of correct spelling in the text sent to the TTS
      • Text processing (correction software) is tied to the IM server vendor, which may lead to proprietary solutions
      • The TTS is given correct text, so it has no problem rendering it
      No need for data exchange format standardization
      [Diagram labels: IM Server; Proprietary Text Processing Rules; Message in SSML 1.0; TTS engine; Built-in Text Processing; Speech Synthesis]

  11. Variant 2: correction by TTS engine
      • The IM server does nothing and lets the TTS engine render the text
      • No additional work is required from the SSML content developer
      • The TTS must recognize the scope of the quasi-correct part of the text (current SSML has no tags for this)
      • The TTS must complete the diacritics to pronounce the text correctly
      [Diagram labels: IM Server; Message in SSML 1.0; TTS engine; Built-in Text Processing; Proprietary Text Processing Rules; Speech Synthesis]

  12. Variant 3: use external lexicons
      • Use a special lexicon file to render the text properly:
        • Quite simple and easy for the SSML developer
        • The lexicon affects the whole file: both the correct and the quasi-correct parts
        • No context-dependent rules in PLS (req. 7.3)
        • No prefix/suffix morphological rules in PLS (req. 7.2)
        • The lack of diacritics is not a pronunciation exception but a spelling error
      [Diagram labels: IM Server; Message in SSML 1.0; TTS engine; Lexicon-based built-in Text Processing; Speech Synthesis; Lexicons in PLS 1.0; Text Processing Lexicons]
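The limitation of a context-free lexicon can be shown with a small sketch. The dictionary below only imitates the one-spelling-in, one-spelling-out behaviour; it is not PLS, which is an XML format, and the entries, sentences and function name `apply_lexicon` are illustrative assumptions.

```python
# A context-free substitution lexicon: one spelling in, one spelling out.
# Entries are illustrative; a real PLS lexicon is an XML document.
LEXICON = {"slonce": "słońce", "zeby": "żeby"}


def apply_lexicon(text: str, lexicon: dict = LEXICON) -> str:
    """Replace every whitespace-separated token that the lexicon lists."""
    return " ".join(lexicon.get(token, token) for token in text.split())


# "zeby" means "in order that" (żeby) in the first sentence and
# "teeth" (zęby) in the second, but the lexicon cannot tell them apart.
first = apply_lexicon("zeby bylo slonce")
second = apply_lexicon("bola mnie zeby")
```

In the second sentence the lexicon still produces "żeby" where "zęby" is needed. Words missing from the lexicon, such as "bylo", pass through uncorrected.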

  13. Recommendation
      • Use a separate (external) correction unit for jargon
      • Enclose quasi-correct text in tags
        • Still easy for SSML developers
        • The text correction software knows which part of the text needs specific pre-processing
      • For diacritics completion, an external program can be used
      • For simpler cases, a dedicated lexicon alone can be used
      • SSML needs to be extended
      [Diagram labels: IM Server; Message in enhanced SSML 1.0; TTS engine; Jargon Text Correction; Lexicon-based built-in Text Processing; Speech Synthesis; Lexicons in PLS 1.0; Text Processing Lexicons]
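The recommended flow can be sketched as a pre-processor that corrects only the tagged part of a document and leaves everything else untouched. This is a sketch under stated assumptions: the `jargon` value of `interpret-as` is the extension proposed in this talk, not part of SSML 1.0; `correct_jargon` is a hypothetical stand-in for the external correction unit; and the sample document omits the SSML namespace for brevity.

```python
import xml.etree.ElementTree as ET

# Stand-in for the external correction unit (assumption): a tiny lookup table.
TOY_CORRECTIONS = {"slonce": "słońce"}


def correct_jargon(text: str) -> str:
    """Hypothetical corrector: fix the tokens found in the toy table."""
    return " ".join(TOY_CORRECTIONS.get(tok, tok) for tok in text.split())


def preprocess(ssml: str) -> str:
    """Correct only text enclosed in <say-as interpret-as="jargon">."""
    root = ET.fromstring(ssml)
    for node in root.iter("say-as"):
        if node.get("interpret-as") == "jargon" and node.text:
            node.text = correct_jargon(node.text)
    return ET.tostring(root, encoding="unicode")


doc = ('<speak>slonce <say-as interpret-as="jargon" format="im">'
       'slonce</say-as></speak>')
result = preprocess(doc)
```

Only the second "slonce" is corrected, because the tag tells the pre-processor where the quasi-correct text begins and ends.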

  14. Example of SSML document (jsp)
      <speak>
        ...
        User <%= sSender %> writes:
        <!-- Proposed extension: a new "jargon" value for interpret-as -->
        <say-as interpret-as="jargon" format="im">
          <%= sMessageContent %>
        </say-as>
        The message has been sent:
        <say-as interpret-as="date">
          <%= sDate %>
        </say-as>
        at
        <say-as interpret-as="time">
          <%= sTime %>
        </say-as>
      </speak>

  15. Another example
      <speak>
        ...
        User <%= sSender %> writes:
        <!-- Proposed extension: a new <jargon> element instead of a new interpret-as value -->
        <jargon format="im">
          <%= sMessageContent %>
        </jargon>
        The message has been sent:
        <say-as interpret-as="date">
          <%= sDate %>
        </say-as>
        at
        <say-as interpret-as="time">
          <%= sTime %>
        </say-as>
      </speak>

  16. Conclusions
      • In modern communication services people use a specific language that frequently does not conform to orthographic rules (e.g. text without diacritics)
      • Applying standard phonetization rules to erroneous text may result in incomprehensible speech
      • For the best rendering results, the TTS should have complete information about the text
      • One SSML document can contain both correct and erroneous text, so the erroneous parts need to be marked
      • Correcting erroneous text can be context- and application-dependent

  17. Questions and doubts
      • How many types of erroneous input should we consider?
      • How to handle jargon evolution?
      • How does the input device affect the text?
      • A new interpret-as value or a new tag?
      • Scope and structure of the new tag (if applicable)?
      • Will future TTS be software composed of a complex text processor and an acoustic synthesis engine, or will we be able to freely choose these modules from different vendors?

  18. Dziękujemy / Thank you

  19. Prepared by:
      Name: Przemyslaw Zdroik
      Division: Vocal Services Section
      Department: TP S.A. Research and Development Centre
      Phone#: (+48) 22 699 56 06
      E-mail: Przemyslaw.Zdroik@telekomunikacja.pl

      Name: Krzysztof Majewski
      Division: Vocal Services Section
      Department: TP S.A. Research and Development Centre
      Phone#: (+48) 22 699 55 64
      E-mail: Krzysztof.Majewski@telekomunikacja.pl
