Introducing The Buckeye Speech Corpus http://buckeyecorpus.osu.edu

Kyuchul Yoon, English Division, Kyungnam University
March 21, 2008, School of English, Kyung Hee University

Outline: What is it? / Project Personnel / Collection & Recording / Transcription & Analysis / Why create the corpus?

What is it?
• The Buckeye Corpus of conversational speech
• 40 speakers in Columbus, OH, conversing freely with an interviewer
• Orthographically transcribed and phonetically labeled
• Audio/text files and time-aligned phonetic labels (Xwaves, Wavesurfer)
• Available to researchers in academia and industry

Project Personnel
• Principal Investigators
  - Mark Pitt (Department of Psychology)
  - Eric Fosler-Lussier (Department of Computer Science and Engineering)
  - Elizabeth Hume (Department of Linguistics)
  - Keith Johnson (Department of Linguistics)
• Post-doctoral researchers (4)
• Graduate students (7)
• Undergraduate students (15)

Collection & Recording
• Collection of speech was completed by spring 2000
• 40 speakers, all natives of Central Ohio (i.e. born in or near Columbus, or moved there no later than age 10)
• Sample design is stratified for age and sex
• Class was not strictly controlled
• Most speakers are middle class to upper working class

Collection & Recording
• From 40 speakers, about 300,000 words of speech were collected (about 40 hours)
• This large sample should ensure that the estimates of the forms and frequency of phonological variation are representative of the population under study
• There should be a large number of tokens of many variant forms appearing in different phonetic environments
• Useful for studying variation
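As a quick sanity check on the corpus size, the approximate figures above can be turned into per-speaker averages. This is only a back-of-the-envelope sketch: the inputs are the rounded numbers from the slide, and the words-per-minute figure is a rough overall rate, since the recording time presumably also includes pauses.

```python
# Approximate figures quoted on the slide (rounded, not exact counts).
total_words = 300_000
total_hours = 40
num_speakers = 40

# Average amount of material per speaker.
words_per_speaker = total_words / num_speakers
hours_per_speaker = total_hours / num_speakers

# Rough overall rate across all recorded time.
words_per_minute = total_words / (total_hours * 60)

print(f"{words_per_speaker:.0f} words and {hours_per_speaker:.1f} h per speaker")
print(f"about {words_per_minute:.0f} words per minute overall")
```

So each speaker contributes on the order of an hour of conversation and several thousand word tokens.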

Collection & Recording
• Qualified speakers had a conversation about everyday topics such as politics, sports, traffic, schools, etc.
• A modified sociolinguistic interview format was chosen
• Interviews were conducted in a small seminar room by the (male) postdoc and the (female) graduate assistant

Transcription & Analysis
• A detailed description of the procedures and conventions used in creating the corpus can be found in the manual
• Sound files and text transcriptions
  - Digital recordings were transferred onto a PC using a digital I/O card
  - Recorded conversations were transcribed into written English text by undergraduate transcribers using Soundscriber software (http://www-personal.umich.edu/~ebreck/sscriber.html)
  - Transcripts are stored as ASCII text files

Transcription & Analysis
• Automatic word and phone alignment
  - Sound files and written transcriptions were input to an automatic phonetic transcription program, Entropics Aligner
  - Aligner uses acoustic phone models trained on the TIMIT corpus of spoken English. It comes with a dictionary that lists several alternative pronunciations for many words
  - RAs used Aligner to select the best-fitting pronunciation of each word from among the alternatives listed in the dictionary, and to align the selected words and their phones to a portion of the sound wave

Transcription & Analysis
• Hand realignment
  - Errors produced by the Aligner were corrected by phonetically trained RAs
  - Corrections were made when the Aligner's labels were placed at the wrong locations, or when a label that is not part of Aligner's segmental repertoire was needed
  - During hand alignment, the appropriate transcription of a given sequence was decided using combined waveform and spectrographic displays of the signal in Entropics waves+ or Wavesurfer

Transcription & Analysis
• The .words / .phones / .log label files
  - The alignment procedure creates three (ASCII text) 'label' files corresponding to each sound file
  - The first contains the word labels and offset times
  - The second contains the phone labels and offset times
  - The third is a log of notes supplied by the labelers, marking instances of unusual voice quality, manner of speaking, nasality, etc.
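Because the label files store offset (end) times, the start of each segment is simply the previous entry's offset. The sketch below shows one way to recover start times and durations. The file layout is an assumption, not taken from the slides: an optional header ending in a line containing only '#', followed by lines of the form '<offset time> <display code> <label>'. The function name `parse_label_text` and the sample text are hypothetical; check the corpus manual for the actual layout.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    label: str
    start: float  # seconds; taken from the previous entry's offset time
    end: float    # seconds; this entry's own offset time


def parse_label_text(text: str) -> List[Segment]:
    """Parse label-file text into segments with start and end times.

    Assumed layout (verify against the corpus manual): an optional header
    terminated by a line containing only '#', then one entry per line as
    '<offset time> <display code> <label>'.
    """
    lines = text.splitlines()
    # Skip the header if a '#' terminator line is present.
    for i, line in enumerate(lines):
        if line.strip() == "#":
            lines = lines[i + 1:]
            break

    segments = []
    previous_offset = 0.0  # assumes the recording starts at time 0
    for line in lines:
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # blank line or entry without a label
        try:
            offset = float(fields[0])
        except ValueError:
            continue  # not a data line
        segments.append(Segment(fields[2].strip(), previous_offset, offset))
        previous_offset = offset
    return segments


# Invented sample text for illustration only; not taken from the corpus.
sample = """header line (assumed)
#
0.25 122 SIL
0.40 122 k
0.55 122 ae
0.70 122 t
"""

segments = parse_label_text(sample)
durations = {seg.label: round(seg.end - seg.start, 2) for seg in segments}
print(durations)
```

The same function could be applied to word-level and phone-level files if both follow this layout, which makes it easy to line phones up with the words that contain them by comparing time spans.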

Why create the corpus?
• Can be used for pure research as well as for applied research and product development
• As a resource for pure research
  - The corpus provides one of the richest sources of data on pronunciation variation in conversational speech
  - Auditory word recognition in psycholinguistics
  - Rules of pronunciation variation in phonology
  - Age- and gender-related conditioning of pronunciation variation in sociolinguistics
  - Effects of pronunciation variation on automatic speech recognition
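A typical first step in studying pronunciation variation is to tally how often each variant of a word occurs. The following is a minimal sketch under stated assumptions: the (word, pronunciation) pairs and the phone symbols are invented for illustration and are not corpus data, and `variant_proportions` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Invented (word, transcribed pronunciation) pairs for illustration only.
tokens = [
    ("probably", "p r aa b ah b l iy"),
    ("probably", "p r aa b l iy"),
    ("probably", "p r aa l iy"),
    ("probably", "p r aa b l iy"),
    ("because", "k ah z"),
    ("because", "b ih k ah z"),
]

# Count how many times each variant of each word was observed.
variants = defaultdict(Counter)
for word, pron in tokens:
    variants[word][pron] += 1


def variant_proportions(word):
    """Return each variant's share of the word's tokens, most frequent first."""
    counts = variants[word]
    total = sum(counts.values())
    return {pron: n / total for pron, n in counts.most_common()}


for pron, share in variant_proportions("probably").items():
    print(f"{share:.2f}  {pron}")
```

With real data, the same tallies could be broken down further by speaker age or sex, or by the phonetic environment of the token.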

Why create the corpus?
• On the applied side
  - Training acoustic models for speech recognition systems
  - Lexicon training for handling pronunciation variation
  - Testbed for grammar training

Corpus Citation
• Pitt, M.A., Dilley, L., Johnson, K., Kiesling, S., Raymond, W., Hume, E. and Fosler-Lussier, E. (2007) Buckeye Corpus of Conversational Speech (2nd release) [www.buckeyecorpus.osu.edu] Columbus, OH: Department of Psychology, Ohio State University (Distributor).

Related Publications
• Raymond, William D., Robin Dautricourt, and Elizabeth Hume. (2006). Word-medial /t,d/ deletion in spontaneous speech: Modeling the effects of extra-linguistic, lexical, and phonological factors. Language Variation and Change, 18(1), 55-97.
• Pitt, Mark, Keith Johnson, Elizabeth Hume, Scott Kiesling, and William Raymond. (2005). The Buckeye Corpus of Conversational Speech: Labeling Conventions and a Test of Transcriber Reliability. Speech Communication, 45, 90-95.
• Pitt, Mark and Keith Johnson. (2003). Using pronunciation data as a starting point in modeling word recognition. Paper presented at the 15th International Congress of Phonetic Sciences.
• Johnson, Keith. (2003). Aligning phonetic transcriptions with their citation forms. Acoustic Research Letters Online.
• Johnson, Keith. (2003). Massive reduction in conversational American English. Proceedings of the Workshop on Spontaneous Speech: Data and Analysis. August, 2002. Tokyo, JP.
• Raymond, William D., Robin Dautricourt, and Elizabeth Hume. (Submitted, 2003). Medial /t,d/ deletion in spontaneous speech. Manuscript submitted to Language Variation and Change.
• Raymond, William D. (2003). An analysis of coding consistency in the transcription of spontaneous speech from the Buckeye corpus. Proceedings of the Workshop on Spontaneous Speech: Data and Analysis. August, 2002. Tokyo, JP.
• Raymond, William D., Mark Pitt, Keith Johnson, Elizabeth Hume, Matthew Makashay, Robin Dautricourt, and Craig Hilts. (2002). An analysis of transcription consistency in spontaneous speech from the Buckeye corpus. Proceedings of ICSLP-02. September, 2002. Denver.

What it looks like

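Coming up next
• Introduction to Buckeye Corpus search scripts
• How to save Internet broadcasts, and an introduction to commercial programs
• Introduction to formant modification/synthesis scripts
• Introduction to scripts for adjusting voice bar/prevoicing/VOT duration
• Introduction to automatic TextGrid generation scripts
• …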
