
2.1 The SEAME Corpus [D.C. Lyu et al., 2011]: SEAME = South East Asia Mandarin-English



Presentation Transcript


RECURRENT NEURAL NETWORK LANGUAGE MODELING FOR CODE-SWITCHING CONVERSATIONAL SPEECH
Heike Adel, Ngoc Thang Vu, Franziska Kraus, Tim Schlippe, Haizhou Li, Tanja Schultz
heike.adel@student.kit.edu, thang.vu@kit.edu

1. Overview
• Goal
  • Prediction of code switches based on textual features (words and POS tags)
  • Extended structure of recurrent neural networks for Code-Switching
  • => 10.8 % (2 %) relative improvement in terms of perplexity (WER) on the SEAME development set and 16.9 % (2.7 %) relative on the evaluation set
• What is Code-Switching (CS)?
  • Code-Switching speech is defined as speech that contains more than one language. It is a common phenomenon in multilingual communities.

2.1 The SEAME Corpus [D.C. Lyu et al., 2011]
• SEAME = South East Asia Mandarin-English
• Conversational Mandarin-English Code-Switch speech corpus
• Temporarily provided as part of a joint research project by NTU and KIT
• About 63 hours of audio data and their transcriptions
• Four language categories: English, Mandarin, particles (Singaporean and Malaysian discourse particles) and others (other languages)
• Average number of CS per utterance: 2.6; very short monolingual segments
• => challenging bilingual task

2.2 Code-Switching Analyses of the Corpus
• Prediction of Code-Switches
  • Trigger words: Mandarin trigger words, English trigger words (tables not preserved in the transcript)
  • Trigger POS: Mandarin trigger POS, English trigger POS (tables not preserved in the transcript)
• POS Tagging for CS-Speech
  • "Matrix language" = Mandarin, "Embedded language" = English
  • CS-text -> language islands (> 2 embedded words) -> POS tagger for English -> output
  • CS-text -> remaining text -> POS tagger for Mandarin -> post-processing of English segments in remaining text -> output
  • Effort (# rules) and quality using cross-lingual rules

3. Recurrent Neural Network Language Model (RNNLM) for Code-Switching
• Input:
  • Word vector w(t)
  • Feature vector f(t) containing POS tags
• Hidden layer: vector s(t) containing the state of the network
• Output:
  • Vector c(t) with the probabilities for each language
  • Vector y(t) with probabilities for each word given its language
• U1, U2, V, W: weights for the connections between the layers
• Training with back-propagation through time (BPTT)
• Computation of the probabilities: (formulas not preserved in the transcript)
• Reference to CS task: use words and features to determine not only the next word but also the next language

4. Experiments and Results
• Perplexity Evaluation and Rescoring Experiments (OF: output factorization, FI: feature integration)
• Rescoring of 100-best lists of our CS-ASR system [Vu, 2012] with different settings for language model weights (lz) and word insertion penalties (lp)
• RNNLM and the 3-gram LM of the ASR system are weighted equally (λ = 0.5)
• Performance measure: Mixed Error Rate (MER): word error rates for English segments and character error rates for Mandarin segments

ICASSP 2013 – The 38th International Conference on Acoustics, Speech, and Signal Processing
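The RNNLM structure in section 3 (word and POS-feature input, hidden state, language output c(t), word output y(t)) and the equal-weight interpolation with the 3-gram LM in section 4 can be illustrated with a small NumPy sketch. This is only a sketch under stated assumptions, not the authors' implementation: the slides do not say which connection each of U1, U2, V, W belongs to, so the assignment below is a guess, the recurrent matrix `R` and the names `FactorizedRNNLM` and `interpolate` are hypothetical, and the weights are random rather than trained with BPTT.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

class FactorizedRNNLM:
    """Toy forward pass of an RNNLM with a language-factorized output layer."""

    def __init__(self, word_lang, n_features, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # word_lang[i] is the language id of word i; every id must be used at least once.
        self.word_lang = np.asarray(word_lang)
        n_words = len(self.word_lang)
        n_langs = int(self.word_lang.max()) + 1
        init = lambda *shape: rng.normal(0.0, 0.1, shape)
        self.U1 = init(n_hidden, n_words)     # assumed: w(t) -> s(t)
        self.U2 = init(n_hidden, n_features)  # assumed: f(t) -> s(t)
        self.R = init(n_hidden, n_hidden)     # hypothetical name: s(t-1) -> s(t)
        self.W = init(n_langs, n_hidden)      # assumed: s(t) -> c(t)
        self.V = init(n_words, n_hidden)      # assumed: s(t) -> y(t)

    def step(self, word_id, feat, s_prev):
        """Return P(next word) over the full vocabulary and the new hidden state."""
        # One-hot w(t) selects a column of U1; f(t) carries the POS features.
        pre = self.U1[:, word_id] + self.U2 @ feat + self.R @ s_prev
        s = 1.0 / (1.0 + np.exp(-pre))
        c = softmax(self.W @ s)  # probability of each language
        p = np.zeros(len(self.word_lang))
        for lang in range(len(c)):
            members = np.flatnonzero(self.word_lang == lang)
            # y(t): word probabilities normalized within their language
            p[members] = c[lang] * softmax(self.V[members] @ s)
        return p, s

def interpolate(p_rnn, p_ngram, lam=0.5):
    """Linear interpolation of two LM distributions; lam = 0.5 weights them equally."""
    return lam * p_rnn + (1.0 - lam) * p_ngram

# Toy vocabulary: words 0-2 are Mandarin (id 0), words 3-4 are English (id 1).
model = FactorizedRNNLM(word_lang=[0, 0, 0, 1, 1], n_features=4, n_hidden=8)
s = np.zeros(8)
pos = np.eye(4)[2]               # one-hot POS tag of the current word
p_next, s = model.step(word_id=1, feat=pos, s_prev=s)
p_ngram = np.full(5, 0.2)        # stand-in for the 3-gram LM distribution
p_mix = interpolate(p_next, p_ngram)
```

Because each word probability is the product of its language probability and a within-language softmax, `p_next` sums to one over the whole vocabulary, and the probability mass on English words directly gives the model's estimate of a switch to English at the next position.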
