
Automatic Detection of “g-dropping” in American English Using Forced Alignment

Jiahong Yuan and Mark Liberman, University of Pennsylvania. “g-dropping”: -ing is pronounced with an alveolar (instead of a velar) nasal coda, e.g., workin’, tryin’.





Presentation Transcript


  1. Automatic Detection of “g-dropping” in American English Using Forced Alignment
     Jiahong Yuan and Mark Liberman, University of Pennsylvania

     “g-dropping”: -ing is pronounced with an alveolar (instead of a velar) nasal coda, e.g., workin’, tryin’.

     Automatic detection of “g-dropping”:
     Step 1. Two acoustic models were trained, one for -in’ (/IHN/) and the other for -ing (/IHNG/). The models were GMM-based, five-state HMMs on 39 PLP coefficients.
     Step 2. The parameters of the models were initially estimated using the Buckeye corpus and then re-estimated using the SCOTUS corpus.
     Step 3. The models were added to the Penn Phonetics Lab Forced Aligner; during forced alignment, the aligner chooses the more probable of the two alternative pronunciations.

     Test material: 200 words randomly selected from Buckeye; 100 were transcribed as -in’ and 100 as -ing.

     Identification test: -in’/-ing forced choice, with 8 native English speakers and 10 native Mandarin speakers as listeners.

     Pairwise percentage agreements:
     Among the native English speakers: 79%-96%, mean = 86.3%
     Between the aligner and the English speakers: 79%-90%, mean = 84.9%

     Vowel quality difference: Using the majority vote of the English listeners as the gold standard, the Mandarin listeners performed poorly. Mandarin has an N/NG distinction but no tense/lax vowel distinction, which suggests that the -in’/-ing contrast is cued by vowel quality as well as by the nasal. We computed an approximate KL-divergence between the GMM acoustic models at each of the five HMM states. The distance between the two models peaks at the middle state, and it is larger on the left side (the vowel side) than on the right (the nasal coda side).
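The choice in Step 3 between the two pronunciations can be illustrated with a small sketch. It assumes per-frame, per-state emission log-likelihoods are already available for each variant and scores each variant with a Viterbi pass through a left-to-right HMM. The function names, the 0.5 transition probabilities, and the toy scores are hypothetical; this is not the Penn Phonetics Lab Forced Aligner's actual code.

```python
import math

NEG_INF = float("-inf")

def viterbi_left_to_right(emission_logp, log_stay=math.log(0.5), log_next=math.log(0.5)):
    """Best-path log score through a left-to-right HMM.

    emission_logp[t][s] is the log-likelihood of frame t under state s.
    The path must start in state 0 and end in the last state.
    Transition log-probabilities are illustrative assumptions.
    """
    n_states = len(emission_logp[0])
    score = [NEG_INF] * n_states
    score[0] = emission_logp[0][0]
    for frame in emission_logp[1:]:
        new_score = [NEG_INF] * n_states
        for s in range(n_states):
            stay = score[s] + log_stay
            move = score[s - 1] + log_next if s > 0 else NEG_INF
            new_score[s] = max(stay, move) + frame[s]
        score = new_score
    return score[-1]

def choose_variant(variant_scores):
    """Return the label of the variant with the highest log score."""
    return max(variant_scores, key=variant_scores.get)

# Toy emission scores (made up): 3 frames x 2 states per variant.
toy = {
    "IHN":  [[-1.0, -5.0], [-2.0, -1.0], [-9.0, -1.0]],
    "IHNG": [[-3.0, -6.0], [-4.0, -3.0], [-9.0, -4.0]],
}
scores = {label: viterbi_left_to_right(m) for label, m in toy.items()}
print(choose_variant(scores))
```

In a real aligner the emission scores would come from the trained GMMs and the comparison would happen inside the alignment of the whole utterance; the sketch isolates only the "pick the more probable variant" step.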
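The pairwise percentage agreements and the majority-vote gold standard described above can be computed along these lines. This is a minimal sketch: the rater names and labels are toy values, not data from the study, and the tie handling (returning `None`) is an assumption, since the source does not say how ties among the 8 English listeners were resolved.

```python
from collections import Counter
from itertools import combinations

def percent_agreement(a, b):
    """Fraction of items on which two label sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def pairwise_agreements(raters):
    """Agreement for every unordered pair of raters, keyed by (name1, name2)."""
    return {
        (r1, r2): percent_agreement(raters[r1], raters[r2])
        for r1, r2 in combinations(sorted(raters), 2)
    }

def majority_vote(label_lists):
    """Per-item majority label; None on a tie (assumed tie rule)."""
    gold = []
    for item_labels in zip(*label_lists):
        ranked = Counter(item_labels).most_common()
        tied = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
        gold.append(None if tied else ranked[0][0])
    return gold

# Toy labels for four words (illustrative only).
raters = {
    "L1": ["in", "ing", "ing", "in"],
    "L2": ["in", "ing", "in", "in"],
    "aligner": ["in", "in", "in", "in"],
}
for pair, value in pairwise_agreements(raters).items():
    print(pair, f"{value:.0%}")
print(majority_vote(raters.values()))
```

Reporting the range and mean over all pairs, as the transcript does, is then a matter of taking `min`, `max`, and the average of the dictionary's values.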
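The KL divergence between two GMMs has no closed form, which is why an approximation is needed. The source does not say which approximation was used; the sketch below uses Monte Carlo sampling, one common choice, for a single pair of diagonal-covariance GMMs (that is, for one HMM state). The GMM representation as a `(weights, means, variances)` tuple and all parameter values are assumptions for illustration.

```python
import math
import random

def gmm_logpdf(x, weights, means, variances):
    """Log density of a diagonal-covariance GMM at vector x."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        component_logs.append(ll)
    top = max(component_logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in component_logs))

def gmm_sample(rng, weights, means, variances):
    """Draw one vector: pick a component by weight, then sample each dimension."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(means[k], variances[k])]

def kl_monte_carlo(p, q, n_samples=20000, seed=0):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x) over x drawn from p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(rng, *p)
        total += gmm_logpdf(x, *p) - gmm_logpdf(x, *q)
    return total / n_samples

# Toy 2-dimensional, 2-component GMMs (made-up parameters).
p = ([0.6, 0.4], [[0.0, 0.0], [2.0, 1.0]], [[1.0, 1.0], [0.5, 0.5]])
q = ([0.5, 0.5], [[0.5, 0.0], [2.5, 1.5]], [[1.0, 1.0], [0.5, 0.5]])
print(kl_monte_carlo(p, q))
```

Repeating the estimate for each of the five states of the /IHN/ and /IHNG/ models would give a per-state profile of the kind the transcript describes. KL divergence is asymmetric, so the direction (or a symmetrized variant) would need to be chosen explicitly.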
