Developing SynctoLearn, an Automatic Video and Script Synchronization Tool, for Language Learners Howard Chen and Berlin Chen National Taiwan Normal University hjchen@ntnu.edu.tw
Authentic Video and ESL Learners • Videos have been widely used in foreign language teaching for a long time. • Authentic video clips are now receiving increasing attention because many interesting and exciting clips are available on the Internet. • These authentic videos should be very useful for foreign language learners. However, learners might still need some support while watching these authentic learning materials.
Input Modification in SLA • Some argue that the design of pedagogical materials should be informed by theory, such as the interactionist SLA theory, which suggests that input modification can help comprehension. • Based on our observations, without captions and other supporting devices, intermediate-level ESL learners often have great difficulty comprehending these videos because of the fast speech rate and the many unknown vocabulary items.
Captions as a Key Support for Learners • Although there are plenty of video clips available, these video clips often do not have captions because most of these videos are targeted at native speakers. • Captions for videos are commonly believed to be able to facilitate listening comprehension (Chapelle, 1997). • If captions can be added to some of these videos, students are more likely to better understand the content and pick up more new vocabulary items of the target language. However, adding captions manually would be a time-consuming task.
A Recent Study Comparing Scripts and Captions • Grgurovic and Hegelheimer (2007) conducted a study comparing learners’ use of both scripts and captions. • They investigated whether subtitles or transcripts are more effective in providing modified input to learners. • A multimedia listening activity containing a video of an academic lecture was designed to offer help in the form of target language subtitles (captions) and lecture transcripts in cases of comprehension breakdowns. • The results indicate that participants interacted with the subtitles more frequently and for longer periods of time than with the transcript.
Examples of Online Video and Scripts • VOA news • CNN news • Students need to view the video and read the scripts
An Automatic Video and Script Synchronization System Called SynctoLearn • To help language teachers and students make better use of a wide variety of authentic videos, we developed an automatic video and script synchronization system called SynctoLearn. • We used videos and scripts taken from the VOA (Voice of America) website. This automatic synchronization system was developed mainly with the help of speech recognition technologies. The system was first trained with VOA videos and scripts. A tri-phone acoustic model of the VOA news was then built. The HTK (Hidden Markov Model Toolkit) from Cambridge University was used to run the forced alignment procedure. Through the alignment procedure, we obtained time-stamped VOA videos.
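The time stamps produced by forced alignment can be illustrated with a minimal Python sketch. It assumes the aligner writes HTK-style label lines of the form `start end word`, with times in units of 100 ns (the HTK label convention). The function name `parse_htk_labels`, the `TimedWord` type, and the sample lines are hypothetical and are not taken from SynctoLearn.

```python
from typing import List, NamedTuple

HTK_TIME_UNIT = 1e-7  # HTK label times are expressed in units of 100 ns


class TimedWord(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    word: str


def parse_htk_labels(lines: List[str]) -> List[TimedWord]:
    """Convert 'start end word' label lines into (seconds, seconds, word)."""
    words = []
    for line in lines:
        fields = line.split()
        # Skip blank lines, MLF header lines, and the '.' terminator line.
        if len(fields) < 3 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        start, end, word = int(fields[0]), int(fields[1]), fields[2]
        words.append(TimedWord(start * HTK_TIME_UNIT, end * HTK_TIME_UNIT, word))
    return words


# Hypothetical aligner output for three words (not real SynctoLearn data).
sample = [
    "#!MLF!#",
    '"*/news001.rec"',
    "0 3500000 THIS",
    "3500000 5200000 IS",
    "5200000 9800000 NEWS",
    ".",
]
timed = parse_htk_labels(sample)
```

Each resulting entry pairs a word with its start and end time in seconds, which is the information a player would need to display synchronized captions.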
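Alignment flow: VOA Corpus → Feature Extraction → Speech feature vector sequence → Forced Alignment → Transcription with time boundary; VOA transcription → OOV Removal → Forced Alignment; Lexicon and Acoustic Model → Forced Alignment
• VOA Corpus: the wave audio extracted by software from the rm files.
• Feature Extraction: HTK's HCopy function is used to extract the MFCC speech feature vectors.
• OOV Removal: before recognition, each news script must be preprocessed to filter out punctuation and OOV (Out Of Vocabulary) words.
• Lexicon: each news script is first looked up in the lexicon to find the acoustic models that make up each word; all of these acoustic models are then concatenated to form the final search network.
• Acoustic Model: the tri-phone acoustic model trained by 庭瑋 on the VOA corpus.
• Forced Alignment: HTK's HVite function is used to perform the forced alignment.
• Transcription with time boundary: the final result is a script with time-stamp information for each news item.
• PS. MFCC (Mel-Frequency Cepstral Coefficient) is a speech feature parameter commonly used in speech recognition.
• PS. HTK Toolkit (http://htk.eng.cam.ac.uk/) is speech recognition software developed at Cambridge University.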
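The script preprocessing step (removing punctuation and out-of-vocabulary words) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `clean_script`, the uppercase normalization, and the tiny lexicon are hypothetical, and the actual SynctoLearn preprocessing may differ.

```python
import string
from typing import List, Set


def clean_script(text: str, lexicon: Set[str]) -> List[str]:
    """Strip punctuation, uppercase, and drop words missing from the lexicon."""
    # ASCII punctuation plus curly quotes, removed from both ends of a token.
    strip_chars = string.punctuation + "“”‘’"
    kept = []
    for token in text.split():
        word = token.strip(strip_chars).upper()
        # Keep only non-empty words that the lexicon can map to acoustic models.
        if word and word in lexicon:
            kept.append(word)
    return kept


# Hypothetical lexicon; a real one would map each word to its phone sequence.
lexicon = {"THE", "PRESIDENT", "SPOKE", "TODAY"}
words = clean_script('The president spoke, "today" in Xyzzyville.', lexicon)
```

In this sketch OOV words are simply dropped, so the aligner only sees words for which it can build a search network.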
A SynctoLearn Server • With the help of this automatic synchronization engine, anyone can upload videos and scripts to a SynctoLearn system and obtain automatically captioned videos. In addition to VOA videos, we also uploaded many videos and scripts from CNN Student News to the server and found that the SynctoLearn system can synchronize CNN Student News accurately.
Video-viewing System with Automatic Captions • In addition to the core automatic synchronization engine, some other useful options for viewing videos are provided. When students watch a video, the script is synchronized with the audio/video by default. However, students can also choose to turn off the captions (synchronized text) and watch the video without them. • This option can encourage students not to rely on the scripts. As students’ listening abilities reach a higher level, they can turn off the captions from time to time. In addition, because the videos and scripts are time-stamped, students can click on any word in the script and the video will be (re)played from that specific word. This convenient playback function can help students quickly catch what they missed while viewing the video. These options might be useful for vocabulary learning and listening comprehension.
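The caption highlighting and click-to-replay behavior can be sketched in Python, assuming each word carries a start and end time in seconds. The names `word_at` and `seek_time`, the `Caption` tuple layout, and the sample data are hypothetical; the slides do not describe how the SynctoLearn player implements this.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (start_seconds, end_seconds, word): hypothetical alignment output
Caption = Tuple[float, float, str]


def word_at(captions: List[Caption], t: float) -> Optional[int]:
    """Return the index of the word being spoken at time t, or None."""
    starts = [c[0] for c in captions]
    # Index of the last word whose start time is <= t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < captions[i][1]:
        return i
    return None  # t falls before the first word, in a pause, or after the end


def seek_time(captions: List[Caption], clicked_index: int) -> float:
    """Return the playback position for a clicked word."""
    return captions[clicked_index][0]


captions = [(0.0, 0.35, "This"), (0.35, 0.52, "is"), (0.60, 0.98, "news")]
```

A binary search keeps the lookup fast even for long scripts, and replaying from a clicked word only requires seeking to that word's start time.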
User Feedback on This System • Based on the survey results from two groups of ESL students who used this system for several months, we found that most students enjoyed watching the synchronized video clips generated by SynctoLearn. • Most students (85%) were very satisfied or satisfied with this new tool. They felt more comfortable and confident with the support of this synchronization tool. In addition, students indicated that they particularly liked two options: the option to hide the captions and the option to replay any video segment by clicking on a word in the script. With automated captions, students had more opportunities to learn new words and their pronunciations. They could also better understand the video content.
Suggestions for Improvement • However, there were some problems in this prototype system. Students suggested that the screen size and the quality of the VOA video could be improved. They expected to see a larger video with higher resolution. They also recommended that the display of captions be improved.
The Future Development • Based on these encouraging results of using SynctoLearn on VOA/CNN videos, we can extend the learning content to other types of English videos and scripts and fix the problems identified by students. More and more video clips and scripts are available on the Internet, and these materials can be synchronized automatically with the same technologies. Similar synchronization technologies can also be adapted to process videos and texts in other languages. It is expected that the automatic synchronization system can help more language learners improve their listening abilities and learn more vocabulary items.
Procedures for Preparing the Alignment • Procedure 1. Download video clips and use Flash Video Encoder to convert them into flv format
Procedure 2. Use audio converting software to extract audio (wav files) from video clips • Sample Rate: 16 kHz • Bits: 16 bits • Channel: Mono • Bitrate: 256.0 kb/sec
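The listed bitrate follows from the other three settings, since uncompressed audio needs sample rate × bits per sample × channels bits per second. A short Python check (the variable names are illustrative only):

```python
sample_rate_hz = 16_000   # 16 kHz
bits_per_sample = 16
channels = 1              # mono

# Uncompressed audio bitrate in bits per second.
bitrate_bps = sample_rate_hz * bits_per_sample * channels
bitrate_kbps = bitrate_bps / 1000

# Storage needed for one minute of audio, in bytes.
bytes_per_minute = bitrate_bps // 8 * 60
```

This gives 256.0 kb/sec, matching the slide, and about 1.92 MB of audio data per minute.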
Procedure 3. Use Adobe Audition to convert wav files into pcm files • Run Batch Processing • Add wav files • Sample Rate: 16000 Hz, Channels: Mono, Resolution: 16 bit • Change destination format • Run Batch
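As an alternative illustration of the same conversion (not the Adobe Audition procedure used by the authors), the following Python sketch uses the standard `wave` module. It assumes that "pcm file" here means headerless raw sample data; the function name `wav_to_pcm` is hypothetical.

```python
import wave


def wav_to_pcm(wav_path: str, pcm_path: str) -> int:
    """Write the raw sample data of a WAV file to pcm_path; return bytes written."""
    with wave.open(wav_path, "rb") as wav:
        # Check the settings listed on the slide; this sketch does not resample.
        actual = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth())
        if actual != (16000, 1, 2):
            raise ValueError("expected 16000 Hz, mono, 16-bit audio")
        frames = wav.readframes(wav.getnframes())
    # Raw PCM is just the sample bytes without the WAV header.
    with open(pcm_path, "wb") as pcm:
        pcm.write(frames)
    return len(frames)
```

For example, `wav_to_pcm("news001.wav", "news001.pcm")` would write the samples of a hypothetical `news001.wav` to `news001.pcm`, and raise an error if the file does not use the expected settings.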
Procedure 4. Upload the flv file, the subtitle text (txt file), and the pcm file onto the website (140.122.83.227/flash_sync/admin/upload.php)
Thanks for your attention • Questions and Discussions • hjchen@ntnu.edu.tw • National Taiwan Normal University