Overview of CSSML

  1. Overview of CSSML • Yan Jun, Department Manager • Anhui USTC iFLYTEK Co., Ltd. • University of Science and Technology of China

  2. Presentation Outline • Motivation and solutions • Standardization • Application

  3. CSSML • Chinese Speech Synthesis Markup Language • CSSML is an extension of SSML for Chinese • Objective • To meet the requirements of Chinese speech synthesis • To provide more flexible and convenient ways to adjust parameters and optimize the quality of synthesized speech

  4. Motivation • Special problems of Chinese speech synthesis • Pronunciation of Chinese characters • Handling of words composed of English letters • Segmentation of Chinese words • Requirements of the Chinese speech market • Using background music

  5. Pronunciation of Chinese characters • Syllables: each Chinese character is one syllable • A Chinese character carries one of four tones, or no tone (the neutral tone) for unstressed syllables • Chinese Romanization (PinYin) is widely used in China as the formal notation for the pronunciation of Chinese characters • Examples: 广 ɡuǎnɡ (guang3), 光 ɡuānɡ (guang1)
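
The numbered notation on this slide (guang3) maps mechanically to tone-marked PinYin. The following is a minimal Python sketch of that mapping and is not part of CSSML itself. The function name numbered_to_marked is hypothetical, and the sketch assumes the usual placement rule for the tone mark (on a or e if present, on the o of ou, otherwise on the last vowel), with v accepted as a stand-in for ü.

```python
import re

# Tone-marked forms of each vowel, indexed by tone number 1-4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def numbered_to_marked(syllable: str) -> str:
    """Convert one numbered PinYin syllable (e.g. 'guang3') to tone marks.

    A trailing 5, or no digit at all, is treated as the neutral tone.
    """
    match = re.fullmatch(r"([a-zü]+)([1-5]?)", syllable.lower())
    if match is None:
        raise ValueError(f"not a numbered PinYin syllable: {syllable!r}")
    letters = match.group(1).replace("v", "ü")
    tone = int(match.group(2) or 5)
    if tone == 5:
        return letters  # the neutral tone carries no mark

    # Pick the vowel that carries the mark.
    if "a" in letters:
        target = letters.index("a")
    elif "e" in letters:
        target = letters.index("e")
    elif "ou" in letters:
        target = letters.index("ou")
    else:
        vowels = [i for i, ch in enumerate(letters) if ch in TONE_MARKS]
        if not vowels:
            raise ValueError(f"no vowel in syllable: {syllable!r}")
        target = vowels[-1]

    marked = TONE_MARKS[letters[target]][tone - 1]
    return letters[:target] + marked + letters[target + 1:]


print(numbered_to_marked("guang3"))
print(numbered_to_marked("guang1"))
```

Only the tone digit distinguishes 广 from 光 here, which is why a compact numbered form is convenient inside markup attributes.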

  6. Words composed of English letters • Two kinds of words are written with English letters • English words: James, New York • PinYin words: Anhui, Hefei, Jiang Zemin • Problem: PinYin words are spoken as if they were English words • This does not follow customary pronunciation • The result is difficult to understand

  7. phoneme • The attributes supported by the phoneme element are extended • The alphabet attribute can take the value "py", and the ph attribute can then hold PinYin notation • A new lang attribute is added to indicate the language or dialect of the content • Examples: 他姓<phoneme alphabet="py" ph="zeng1">曾</phoneme> (His surname is Zeng) ; 国家主席<phoneme lang="cn">Jiang Zemin</phoneme> (President Jiang Zemin)
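
As an illustration of the extended phoneme element, here is a small Python sketch that generates this markup with the standard xml.etree.ElementTree module. The helper name phoneme is hypothetical, and wrapping the text in a speak root element is an assumption carried over from SSML; the slides do not show the CSSML root element.

```python
import xml.etree.ElementTree as ET


def phoneme(text, **attrs):
    """Build a <phoneme> element with the given attributes and text."""
    element = ET.Element("phoneme", attrs)
    element.text = text
    return element


# Assumption: the root element is <speak>, as in SSML.
speak = ET.Element("speak")
speak.text = "他姓"
# 曾 has more than one reading; ph pins the surname reading zeng1.
speak.append(phoneme("曾", alphabet="py", ph="zeng1"))
markup = ET.tostring(speak, encoding="unicode")
print(markup)

# The lang attribute marks a PinYin name that should be read as Chinese.
name = phoneme("Jiang Zemin", lang="cn")
print(ET.tostring(name, encoding="unicode"))
```

Building the element tree instead of concatenating strings keeps attribute quoting and escaping correct when the text comes from user input.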

  8. Segmentation of Chinese words • The basic grammatical unit of Chinese is the character • No spaces or punctuation marks separate words • Thus one sentence may have several plausible word segmentations • Example: 南京市长江大桥 • 南京市ˇ长江大桥: the Bridge of the Yangtse River in Nanking city • 南京市长ˇ江大桥: Jiang Daqiao, the mayor of Nanking city

  9. Segmentation of Chinese words • Different segmentations • Greatly affect the meaning of the sentence • May change the pronunciation of Chinese characters (polyphonic characters) • Thus degrade or even ruin the synthesized speech • 南京市ˇ长江大桥: nan2 jing1 shi4 chang2 jiang1 da4 qiao2 • 南京市长ˇ江大桥: nan2 jing1 shi4 zhang3 jiang1 da4 qiao2
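
The ambiguity on these two slides can be reproduced with a few lines of Python. This is a toy sketch, not how a production synthesizer segments text: the four-entry lexicon and the function names segmentations and reading are hypothetical, and the readings are the ones given on the slide.

```python
# Toy lexicon: each word maps to its numbered-PinYin reading (from the slide).
LEXICON = {
    "南京市": "nan2 jing1 shi4",
    "长江大桥": "chang2 jiang1 da4 qiao2",
    "南京市长": "nan2 jing1 shi4 zhang3",
    "江大桥": "jiang1 da4 qiao2",
}


def segmentations(text, lexicon):
    """Return every way to split text into words found in lexicon."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            for tail in segmentations(text[end:], lexicon):
                results.append([head] + tail)
    return results


def reading(words, lexicon):
    """Join the per-word readings into one PinYin string."""
    return " ".join(lexicon[word] for word in words)


for words in segmentations("南京市长江大桥", LEXICON):
    print("ˇ".join(words), "->", reading(words, LEXICON))
```

Both splits are valid against the lexicon, and the reading of 长 (chang2 or zhang3) depends on which one is chosen, so the synthesizer needs either context or explicit markup to decide.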

  10. word and phrase • The word element defines the boundaries between Chinese words • The phrase element defines the boundaries between phrases at different levels • Examples: <word>南京市</word><word>长江大桥</word> ; <phrase><word>我们的</word><word>最高目标</word></phrase> <phrase>是</phrase> <phrase>得到高自然的语音</phrase> (Our highest goal is to obtain highly natural speech)
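
Once a segmentation has been chosen, emitting the word and phrase markup is straightforward. The Python sketch below assumes the element names shown on this slide; the helper names mark_words and mark_phrase are hypothetical.

```python
from xml.sax.saxutils import escape


def mark_words(words):
    """Wrap each word in a <word> element, escaping XML special characters."""
    return "".join(f"<word>{escape(word)}</word>" for word in words)


def mark_phrase(words):
    """Wrap a run of words in a single <phrase> element."""
    return f"<phrase>{mark_words(words)}</phrase>"


print(mark_words(["南京市", "长江大桥"]))
print(mark_phrase(["我们的", "最高目标"]))
```

Marking the boundary explicitly removes the ambiguity before synthesis, so the engine does not have to guess between 南京市ˇ长江大桥 and 南京市长ˇ江大桥.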

  11. Using background music • Synthesized speech can be played together with background music to improve the user experience • Background music may be added at a given position • The background sound may be switched during synthesis

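  12. environment • The environment element is introduced to describe the sound-field environment of the synthesized speech • src attribute • repeat attribute • Example: <environment repeat="yes" src="1.wav">有三千余年建城史的北京,经过改革开放的洗礼,将以崭新的、多姿多彩的面貌进入新世纪,她将以饱满的热情欢迎全世界的体育健儿和各界朋友,共同参与奥运盛会。</environment> (Beijing, a city with more than three thousand years of history, having gone through the baptism of reform and opening up, will enter the new century with a brand-new and colorful face; she will warmly welcome athletes and friends from all walks of life around the world to take part together in the Olympic Games.)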
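
To show how a consumer of this markup might read the two attributes, here is a short Python sketch using the standard xml.etree.ElementTree parser. The function name read_environment is hypothetical. The slide lists src and repeat without defining them, so the sketch assumes that src names the background audio file and that repeat="yes" asks for the clip to loop under the speech.

```python
import xml.etree.ElementTree as ET


def read_environment(fragment):
    """Extract background-sound settings from one <environment> element."""
    element = ET.fromstring(fragment)
    if element.tag != "environment":
        raise ValueError(f"expected <environment>, got <{element.tag}>")
    return {
        # Assumption: src is the path or URL of the background audio.
        "source": element.get("src"),
        # Assumption: repeat="yes" loops the clip; anything else plays it once.
        "loop": element.get("repeat", "no") == "yes",
        "text": (element.text or "").strip(),
    }


settings = read_environment(
    '<environment repeat="yes" src="1.wav">有三千余年建城史的北京</environment>'
)
print(settings)
```

Note that the parser requires well-formed tags, so the closing tag must be written </environment> with no spaces inside the angle brackets.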

  13. CSSML: enterprise standard • In 2002, iFLYTEK established CSSML as an enterprise standard defining the markup language used in speech synthesis products • Since 2003, the standard has been supported by iFLYTEK's InterPhonic product series

  14. CSSML: candidate national standard • Human-machine speech interaction standard workgroup of China's Ministry of Information Industry • CSSML was proposed to the workgroup in 2003 and was widely debated • The workgroup approved CSSML by vote on October 24, 2005, and it will be submitted to the Ministry of Information Industry as a candidate national standard

  15. Application • Speech synthesis products that support CSSML are widely used in telecom, banking, insurance, securities, education, and other sectors • Telecom: 168 and 114 information inquiry services • Securities: stock commentary, company introductions • Enterprise: customer telephone service • Education: teaching the pronunciation of Chinese characters and words

  16. Questions? Thank you and goodbye!