Processing and Annotation of Speech Data

Presentation Transcript


  1. Processing and Annotation of Speech Data
     Shu-Chuan Tseng (曾淑娟), Institute of Linguistics (Preparatory Office), Academia Sinica

  2. Types of speech data
     Broadcast news data
     Microphone speech data
     Telephone speech data
     Lab speech data

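  3. Corpus types I
     Read speech: reading aloud; public figures delivering a speech exactly as scripted
     Prepared speech: public figures reciting a scripted speech from memory; reporters' interviews; talk-show hosts
     Spontaneous speech: everyday conversation; unprepared talk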

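  4. Corpus types II
     Monologues: reading a script (reading); recounting facts or stories (narratives)
     Dialogues: reporters' interviews (interview); talk-show exchanges; two-person conversations
     Conversations (multi-party): more than two people talking; talk-show discussions with more than two participants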

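  5. Digital recording: hardware and software
     Recording devices:
       Digital Audio Tape (DAT): must be transferred to sound files --> Master Tape
       Mini Disc (MD): must be transferred to sound files
       Recording software (Speech Analyzer, PCQuirer, Cool Edit Pro): records directly to sound files
     Microphone: uni-directional (Audio Technica, AKG, Sennheiser, Shure)
     Recording location: ordinary room, recording booth, outdoors
     Recording scenario: conversation, interview, monologue/narration, carrying out pre-designed tasks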

  6. Digital recording: format
     Sampling rate: 8 kHz, 16 kHz, 44.1 kHz, 48 kHz
     Sample size: 8 bits, 16 bits
     Channels: mono, stereo
     File formats: pcm, wav, ptk, sd
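The format parameters on this slide determine how much storage an uncompressed recording needs. The following is a minimal sketch, assuming plain linear PCM in a WAV container; the function names are illustrative and not part of the original slides. It uses only Python's standard `wave` module.

```python
import io
import wave


def pcm_bytes_per_second(sampling_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Data rate of uncompressed PCM: rate x bytes per sample x channels."""
    return sampling_rate_hz * (bits_per_sample // 8) * channels


def zero_filled_wav(seconds: int, sampling_rate_hz: int, bits_per_sample: int, channels: int) -> bytes:
    """Build an in-memory WAV file filled with zero bytes.

    Zero bytes are true silence only for 16-bit (signed) samples;
    8-bit WAV samples are unsigned, so this is a size check, not audio.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(bits_per_sample // 8)
        w.setframerate(sampling_rate_hz)
        n_bytes = pcm_bytes_per_second(sampling_rate_hz, bits_per_sample, channels) * seconds
        w.writeframes(bytes(n_bytes))
    return buf.getvalue()


def wav_params(data: bytes) -> tuple:
    """Read back (channels, bytes per sample, sampling rate, frame count)."""
    with wave.open(io.BytesIO(data), "rb") as r:
        return (r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())


if __name__ == "__main__":
    # 48 kHz, 16-bit, stereo
    print(pcm_bytes_per_second(48000, 16, 2), "bytes per second")
    print(wav_params(zero_filled_wav(1, 16000, 16, 1)))
```

At 48 kHz, 16 bits, stereo, one minute of speech takes roughly 11.5 MB, whereas 8 kHz, 8 bits, mono (typical of telephone speech) takes under 0.5 MB.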

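  7. Speech data: metadata content
     Header: recording location, recording date, speech type, language, sampling rate, recording format
     Speech content (transcripts): associated sound file, speaker information (ID, age, gender), Chinese-character transcription, Pinyin transcription
     Comments: notes on individual entries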

  8. Example: structure
     Header
       -- record place
       -- record date
       -- speech type1
       -- speech type2
       -- language
       -- sampling rate
       -- record type
     Body
       -- voice segment (repeated for each segment)
          -- wave filename [.wav]
          -- speaker info [MISC-n-age-gender]
          -- start time [msec]
          -- end time [msec]
          -- transcriber info [name]
          -- character transcription: Big5, foreign words, markers/particles, @, tags: <name></b name>
          -- Pinyin transcription: Pinyin, foreign words, markers/particles, pronunciation: [ ], @, tags: <name></b name>
          -- comment
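The header/body structure above maps naturally onto a pair of record types. The sketch below is one possible in-memory model, not the author's implementation: all class and field names are assumptions chosen to mirror the slide's labels. Note that the structure slide gives the speaker field as MISC-n-age-gender, while the sample record in the deck uses MISC-08-male-25; the hypothetical `parse_speaker` helper accepts either order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Speaker:
    speaker_id: str
    age: int
    gender: str


@dataclass
class VoiceSegment:
    wave_filename: str
    speaker: Speaker
    start_ms: int
    end_ms: int
    transcriber: str
    characters: str  # character transcription
    pinyin: str      # Pinyin transcription
    comment: str = ""

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


@dataclass
class Recording:
    record_place: str
    record_date: str
    speech_type1: str
    speech_type2: str
    language: str
    sampling_rate: str
    record_type: str
    segments: List[VoiceSegment] = field(default_factory=list)


def parse_speaker(info: str) -> Speaker:
    """Split 'MISC-n-age-gender' or 'MISC-n-gender-age' into its parts."""
    prefix, number, a, b = info.split("-")
    # Whichever of the last two parts is numeric is taken to be the age.
    age, gender = (a, b) if a.isdigit() else (b, a)
    return Speaker(f"{prefix}-{number}", int(age), gender)


if __name__ == "__main__":
    rec = Recording("Taipei, Taiwan", "June 3, 2001", "spontaneous",
                    "dialogue", "Mandarin", "48 kHz", "stereo")
    rec.segments.append(VoiceSegment(
        "mcdc-01-01.wav", parse_speaker("MISC-08-male-25"),
        0, 9514, "Fen", "你好我姓賴", "ni3 hao3 wo3 xing4 lai4"))
    print(rec.segments[0].speaker, rec.segments[0].duration_ms)
```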

  9. Example: actual format
     <recordplace>Taipei, Taiwan
     <recorddate>June 3, 2001
     <speechtypei>spontaneous
     <speechtypeii>dialogue
     <language>Mandarin
     <samplingrate>48 kHz
     <recordtype>stereo
     <segment>
     <voicefile>d:\分割完成的檔\stereo_01\mcdc-01-01.wav
     <speaker>MISC-08-male-25
     <start>000000
     <end>009514
     <translator>Fen
     <chinese> <b particle>EI </b particle><b clear throat>@</b clear throat>你好我姓賴請問一下貴姓<b hiccup>@</b hiccup> <b breathe>@</b breathe> </chinese>
     <english> EI @ ni3 hao3 wo3 xing4 lai4 qing3 wen4 yi2 xia4 gui4 xing4 @ @ </english>
     <comment> </comment>
     </segment>
     (Gloss of the <chinese> line: "Hello, my surname is Lai; may I ask your surname?")
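A record in this format can be read with a few regular expressions. The following is a rough sketch under stated assumptions: header and simple segment fields are taken to be `<name>value` with no closing tag, while `<chinese>`, `<english>` and `<comment>` are taken to be properly closed, as in the sample above. The function name `parse_record` and the field lists are illustrative; the slides do not describe the parser actually used.

```python
import re

HEADER_FIELDS = ("recordplace", "recorddate", "speechtypei", "speechtypeii",
                 "language", "samplingrate", "recordtype")
SEGMENT_FIELDS = ("voicefile", "speaker", "start", "end", "translator")
BLOCK_FIELDS = ("chinese", "english", "comment")


def _simple_field(text, name):
    """Value of an unclosed <name>value field: everything up to the next '<'."""
    m = re.search(rf"<{name}>([^<]*)", text)
    return m.group(1).strip() if m else None


def parse_record(text):
    """Return (header dict, list of segment dicts) for one record."""
    header_text = text.split("<segment>", 1)[0]
    header = {name: _simple_field(header_text, name) for name in HEADER_FIELDS}
    segments = []
    for body in re.findall(r"<segment>(.*?)</segment>", text, flags=re.DOTALL):
        seg = {name: _simple_field(body, name) for name in SEGMENT_FIELDS}
        for name in BLOCK_FIELDS:
            m = re.search(rf"<{name}>(.*?)</{name}>", body, flags=re.DOTALL)
            seg[name] = m.group(1).strip() if m else None
        segments.append(seg)
    return header, segments


# Sample record copied from the slide (raw string keeps the backslashes intact).
SAMPLE = r"""<recordplace>Taipei, Taiwan <recorddate>June 3, 2001
<speechtypei>spontaneous <speechtypeii>dialogue <language>Mandarin
<samplingrate>48 kHz <recordtype>stereo
<segment> <voicefile>d:\分割完成的檔\stereo_01\mcdc-01-01.wav
<speaker>MISC-08-male-25 <start>000000 <end>009514 <translator>Fen
<chinese> <b particle>EI </b particle><b clear throat>@</b clear throat>你好我姓賴請問一下貴姓<b hiccup>@</b hiccup> <b breathe>@</b breathe> </chinese>
<english> EI @ ni3 hao3 wo3 xing4 lai4 qing3 wen4 yi2 xia4 gui4 xing4 @ @ </english>
<comment> </comment> </segment>"""

if __name__ == "__main__":
    header, segments = parse_record(SAMPLE)
    print(header["samplingrate"], segments[0]["speaker"], int(segments[0]["end"]))
```

If start and end times are in milliseconds, as the structure slide indicates, this sample segment lasts about 9.5 seconds.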

  10. Example: transcription and annotation of speech content
      Character Transcription:
      蓋章認可<b inappropriate pronunciation>的</b inappropriate pronunciation><b short break>@</b short break>只有<b assimilation>三分</b assimilation>之一<b inhale>@</b inhale><b marker>NA </b marker>其它的<b clear throat>@</b clear throat><b exhale>@</b exhale><b assimilation>三分</b assimilation>之二是<b inhale>@</b inhale>警察局自己<b pause>@</b pause><b inappropriate pronunciation>就</b inappropriate pronunciation><b inappropriate pronunciation>是</b inappropriate pronunciation>
      Pinyin Transcription:
      gai4 zhang1 ren4 ke3<b inappropriate pronunciation>de5</b inappropriate pronunciation><b short break>@</b short break>zhi3 you3<b assimilation>san1 fen1</b assimilation>zhi1 yi1<b inhale>@</b inhale><b marker>NA </b marker>qi2 ta1 de5<b clear throat>@</b clear throat><b exhale>@</b exhale><b assimilation>san1 fen1</b assimilation>zhi1 er4 shi4<b inhale>@</b inhale>jing3 cha2 ju2 zi4 ji3<b pause>@</b pause><b inappropriate pronunciation>jiu4</b inappropriate pronunciation><b inappropriate pronunciation>shi4</b inappropriate pronunciation>
      (Rough gloss: "Only one third is approved with an official stamp; the other two thirds are the police station itself, that is ...")
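The `<b label>...</b label>` annotations in the transcription above can be pulled out, or stripped to recover plain text, with a single back-referencing regular expression. This is a minimal sketch; the function names are hypothetical, and treating `@` as a placeholder to drop from the plain text is an assumption based on how it appears in the example.

```python
import re
from collections import Counter

# <b label>content</b label>; the back-reference \1 requires matching labels.
TAG = re.compile(r"<b ([^>]+)>(.*?)</b \1>", re.DOTALL)


def extract_annotations(text):
    """List of (label, content) pairs in order of appearance."""
    return [(label, content.strip()) for label, content in TAG.findall(text)]


def strip_annotations(text):
    """Plain transcription: keep annotated words, drop '@' placeholders."""
    def repl(match):
        content = match.group(2).strip()
        return " " if content == "@" else f" {content} "
    return " ".join(TAG.sub(repl, text).split())


# Excerpt of the Pinyin transcription from the slide.
EXCERPT = ("gai4 zhang1 ren4 ke3<b inappropriate pronunciation>de5"
           "</b inappropriate pronunciation><b short break>@</b short break>"
           "zhi3 you3<b assimilation>san1 fen1</b assimilation>zhi1 yi1")

if __name__ == "__main__":
    print(extract_annotations(EXCERPT))
    print(strip_annotations(EXCERPT))
    print(Counter(label for label, _ in extract_annotations(EXCERPT)))
```

Counting labels this way gives a quick per-segment tally of phenomena such as assimilation or short breaks.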

  11. Dialogue Act Annotation (MTCC)
      general opening: opening
      negotiating a topic: suggest_topic, accept_topic, reject_topic, comment_topic
      introducing a topic: introduce_topic
      talking about the topic: begin_statement, agree, agree_part, oppose, oppose_part, feedback_understanding, feedback_non_understanding, backchannel, question, question_request_answer, answer, exclamation, rephrase, clarify, correct, repeat, completion_by_self, completion_by_other, comment_by_self, comment_by_other
      ending the topic: end_topic
      general closing: closing
      uninterpretable fragments: not_classified
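A tag set like this is easy to encode as a lookup table so that annotations can be validated automatically. The sketch below copies the tag names from the slide; the dictionary layout and the helper names (`phase_of`, `invalid_tags`) are assumptions for illustration, not part of the MTCC scheme itself.

```python
DIALOGUE_ACTS = {
    "general opening": ["opening"],
    "negotiating a topic": ["suggest_topic", "accept_topic", "reject_topic",
                            "comment_topic"],
    "introducing a topic": ["introduce_topic"],
    "talking about the topic": [
        "begin_statement", "agree", "agree_part", "oppose", "oppose_part",
        "feedback_understanding", "feedback_non_understanding", "backchannel",
        "question", "question_request_answer", "answer", "exclamation",
        "rephrase", "clarify", "correct", "repeat", "completion_by_self",
        "completion_by_other", "comment_by_self", "comment_by_other",
    ],
    "ending the topic": ["end_topic"],
    "general closing": ["closing"],
    "uninterpretable fragments": ["not_classified"],
}

# Reverse index: tag -> dialogue phase.
PHASE_BY_TAG = {tag: phase for phase, tags in DIALOGUE_ACTS.items() for tag in tags}


def phase_of(tag):
    """Dialogue phase a tag belongs to; raises KeyError for unknown tags."""
    return PHASE_BY_TAG[tag]


def invalid_tags(sequence):
    """Tags in an annotated sequence that are not in the tag set."""
    return [tag for tag in sequence if tag not in PHASE_BY_TAG]


if __name__ == "__main__":
    annotated = ["opening", "suggest_topic", "accept_topic", "agree_part", "closeing"]
    print(phase_of("agree_part"))
    print(invalid_tags(annotated))  # the misspelled tag is reported
```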

  12. Taxonomy of Spontaneous Speech Phenomena (MCDC)
      • Disfluency: prosodic disfluency (silence, pause, short break, stutter); repair (restart, repetition, overt repair, editing term, error, word fragment); lexico-syntactic disfluency (inappropriate, interrupted, abridged utterances); discourse particles and discourse markers (both transcribed in capital letters)
      • Socio-linguistic Phenomena: code switching, dialect-influenced pronunciation, new words
      • Particular Vocalisation: lengthening, assimilation, syllable contraction, inappropriate pronunciation
      • Unintelligible and Non-speech Sounds

  13. TransList – Interface

  14. TransList - Interface