1. Acoustic properties of consonants
Reading spectrograms
2. Acoustic cues for manner of articulation in consonants: Speech can be roughly segmented into manner of articulation categories by context-free acoustic cues.
Stops: Closure (silent) period followed by release burst and abrupt vowel onset.
Fricatives: Turbulent noise burst, strong for sibilants, weak for non-sibilants.
Nasals: Abrupt onset and offset of a segment with very weak formant structure. Low frequency, periodic energy (voice bar on spectrogram).
Approximants: Non-abrupt onset and offset; dynamic (changing) formant structure (diphthong-like); weaker F2 and F3 than for (more open) vowels.
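The cues above can be read as a small set of decision rules. The sketch below is only an illustration of that reading: the `SegmentCues` fields and the `manner_of` function are hypothetical names, the cues are assumed to have been judged by eye from a spectrogram, and the fall-through to "vowel" is an assumption rather than something the slide states.

```python
from dataclasses import dataclass


@dataclass
class SegmentCues:
    """Hypothetical hand-labelled cues for one stretch of a spectrogram."""
    silent_closure: bool = False   # silent period before the segment
    release_burst: bool = False    # brief burst at the release
    turbulent_noise: bool = False  # sustained aperiodic noise
    voice_bar: bool = False        # low-frequency periodic energy
    abrupt_edges: bool = False     # abrupt onset and offset
    weak_formants: bool = False    # very weak formant structure
    moving_formants: bool = False  # dynamic (changing) formant structure


def manner_of(cues: SegmentCues) -> str:
    """Map the cues to a manner category, checking the most distinctive first."""
    if cues.silent_closure and cues.release_burst:
        return "stop"
    if cues.turbulent_noise:
        return "fricative"
    if cues.voice_bar and cues.abrupt_edges and cues.weak_formants:
        return "nasal"
    if cues.moving_formants and not cues.abrupt_edges:
        return "approximant"
    return "vowel"  # assumption: anything left over is treated as a vowel


print(manner_of(SegmentCues(silent_closure=True, release_burst=True)))
print(manner_of(SegmentCues(voice_bar=True, abrupt_edges=True, weak_formants=True)))
```

The order of the checks matters: a stop release also contains a short stretch of noise, so the stop rule is tested before the fricative rule.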
3. Segment spectrogram into manner of articulation categories
V S V S V NV F V: N S V F A V F S V: N V S V F
A bird in the hand is worth two in the bush
4. Stop consonants
Cues to place of articulation
Different voicing and airstream mechanisms
5. Phases of a stop release
Transient release: Burst associated with the oral release gesture.
Frication: Turbulent airflow through the narrow constriction at the place of release.
Aspiration: Glottal turbulence as air flows through the open glottis prior to glottal closure and onset of voicing.
6. Spectral patterns of release burst for labial, alveolar, velar stops
7. Formant transitions
8. Formant transitions in three synthetic stop consonant continua
9. The "locus" of a formant transition
The figure shows steady-state F2 for different vowels and their formant transitions for [d] (alveolar stop).
The formant transitions point back to a common "locus" at 1.8 kHz.
10. The "locus" of a formant transition
The "locus" of F2:
3 kHz for velars
1.8 kHz for alveolars
0.6-0.8 kHz for labials.
However, the locus is a somewhat idealized notion.
Analysis of natural speech does not provide strong support for the locus concept.
Velar stops vary in place of articulation with different vowels.
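Taken at face value, the idealized locus values above suggest a nearest-value lookup. The sketch below is only an illustration under that idealization: `F2_LOCUS_KHZ` and `nearest_place` are hypothetical names, the labial entry uses the midpoint of the 0.6-0.8 kHz range as an assumption, and the input is assumed to be a locus already estimated by extrapolating the F2 transition back from the vowel. As the slide notes, natural speech does not fit this picture cleanly.

```python
# Idealized F2 locus values in kHz, as listed on the slide.
# Assumption: labials are represented by the midpoint of 0.6-0.8 kHz.
F2_LOCUS_KHZ = {"labial": 0.7, "alveolar": 1.8, "velar": 3.0}


def nearest_place(estimated_locus_khz: float) -> str:
    """Return the place whose idealized F2 locus is closest to the estimate."""
    return min(
        F2_LOCUS_KHZ,
        key=lambda place: abs(F2_LOCUS_KHZ[place] - estimated_locus_khz),
    )


for locus in (0.9, 1.7, 2.6):
    print(locus, "kHz ->", nearest_place(locus))
```

A lookup like this is context-free, which is exactly why it breaks down for velars, whose place of articulation shifts with the neighbouring vowel.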
11. Summary: cues to place of articulation in stop consonants
Spectral energy distribution in the noise burst and formant transitions are the main cues.
Formant transitions are context-sensitive cues.
Context-sensitive cues require more complex signal processing.
A need for specialized phonetic feature detectors?
12. The voicing feature in stops
A matter of timing the glottal gesture in relation to the oral constriction gesture.
Voice onset time (VOT)
A continuum from fully voiced (1) to strongly aspirated (voiceless) stops (5)
13. Differences in voice onset time across languages
14. A voice onset time continuum
Voiceless aspirated: VOT = 80 msec
Ejective: VOT = 150 msec
Voiceless unaspirated: VOT = 10 msec
Fully voiced: VOT = -150 msec
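VOT is the interval from the release of the oral closure to the onset of voicing, so it is negative when voicing starts before the release. A minimal sketch of that arithmetic follows. The function names are hypothetical, the two time points are assumed to have been measured by hand from a waveform or spectrogram, and the 30 msec boundary between short and long lag is an illustrative assumption, not a value from the slides.

```python
def voice_onset_time_ms(release_ms: float, voicing_onset_ms: float) -> float:
    """VOT = time of voicing onset minus time of oral release (both in msec)."""
    return voicing_onset_ms - release_ms


def describe_vot(vot_ms: float, long_lag_ms: float = 30.0) -> str:
    """Rough three-way label; long_lag_ms is an assumed, adjustable boundary."""
    if vot_ms < 0:
        return "voiced (voicing lead)"
    if vot_ms < long_lag_ms:
        return "voiceless unaspirated (short lag)"
    return "voiceless aspirated (long lag)"


# Voicing starts 80 msec after a release measured at 100 msec.
vot = voice_onset_time_ms(release_ms=100.0, voicing_onset_ms=180.0)
print(vot, describe_vot(vot))
```

VOT alone does not identify an ejective: its long lag would fall into the same bin as an aspirated stop, and the difference lies in the airstream mechanism.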
15. Sequence of gestures for an ejective stop
16. How to sustain voicing during oral closure: implosives
17. The principal airstream mechanisms used in stop consonants
18. Principal phonation types used in stop consonants
Modal voice: vocal folds lightly approximated, spontaneous vibration on a small subglottal-supraglottal pressure differential.
Voiceless: vocal cords open and somewhat stiff, preventing spontaneous voicing; some aspiration noise.
Murmur (breathy voice): lax, partially open glottis. (Hindi)
Creaky voice (laryngealized) or glottalized. (Gitksan)
19. The feature tense (lax) in Korean stops
Korean has a three-way contrast among tense, plain (lax), and aspirated stops and affricates (fricatives contrast only plain and tense).
Tense stops are made with increased laryngeal and supralaryngeal muscular tension.
Voiceless stops are typically produced with a somewhat more tense vocal and articulatory setting than voiced stops.
However, in Korean obstruents voicing and tensity appear to be independently controlled.
20. Korean labial stops
Aspirated: [pʰal] "arm"
Unaspirated: [pal] "foot"
Tense: [p*al] "sucking"
21. Summarizing: Voicing features in stop consonants
The timing dimension (VOT) is the most important acoustic cue.
Airstream mechanisms (pulmonic, glottalic, velaric) used for some types of stops: plain stops, ejectives, implosives, clicks.
Different phonation types may also be employed (modal, breathy, creaky, or tense voice).
22. Fricatives
Characterised as a class by a turbulent noise source.
Subclassified by their spectral energy distributions.
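One simple way to summarize a spectral energy distribution in a single number is the spectral centroid, the amplitude-weighted mean frequency. The sketch below is only an illustration: `spectral_centroid_hz` is a hypothetical helper, and the two spectra are made-up values chosen to mimic energy concentrated higher (as in [s]) versus lower (as in the "sh" sound), not measurements.

```python
def spectral_centroid_hz(spectrum):
    """Amplitude-weighted mean frequency of (frequency_hz, amplitude) pairs."""
    total = sum(amplitude for _, amplitude in spectrum)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(freq * amplitude for freq, amplitude in spectrum) / total


# Made-up illustrative spectra, not measured data.
s_like = [(2000, 1), (4000, 3), (6000, 6)]   # energy weighted toward high frequencies
sh_like = [(2000, 3), (3000, 5), (6000, 2)]  # energy weighted toward lower frequencies

print(spectral_centroid_hz(s_like))   # 5000.0
print(spectral_centroid_hz(sh_like))  # 3300.0
```

A single centroid is a coarse summary. Separating weak non-sibilants from one another would likely need more of the spectral shape than this one value.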
23. Sibilant fricative spectra
24. Non-sibilant fricatives
25. Segment this spectrogram
The ship sails close to the shore.
26. The glottal fricative [h]
The [h] in hard and hid.
Because the turbulence is generated at the glottis, the spectrum of an [h] has the formant structure of the following vowel.
27. Nasals and nasalization
Nasal consonants are like stops in that the oral airstream is completely blocked, but they are also resonant sounds (like approximants and vowels).
They have both stop-like and resonant acoustic properties.
A nasalized segment contains a mixture of oral and nasal resonances.
Nasalized vowels typically lack clear formant structure.
28. Place of articulation in nasal consonants
Recognized from formant transitions on preceding or following vowels, particularly the F2 transition.
29. Nasalization
Caused mainly by anticipatory lowering of the velum prior to oral closure for the nasal consonant. Hence the familiar phonological rule: vowels nasalize before a nasal consonant.
Introduces nasal resonances and anti-resonances into the spectrogram, resulting in some "smearing" of the vowel formant structure.
Nasal formants may be visible around 250, 2500, 3250 Hz.
Because nasal resonances are fixed and tend to be different for different speakers, nasal murmur has been suggested as a useful acoustic signature for speaker identification.
30. Approximants
The most vowel-like of consonants.
Composed almost entirely of formant transitions, which also serve to identify their respective places of articulation.
31. Approximants
The /r-l/ contrast:
Not many languages have it.
/r/ is characterized by a dramatic lowering of F3. There are two varieties of rhotic ("r" sound): one made by retracting the tongue tip (retroflexion), the other made by tongue bunching (tongue tip lowered with the front of the tongue bunched up to form a narrow central passage in the post-alveolar region). These two types of /r/ are acoustically indistinguishable on the spectrogram, and possibly on auditory grounds as well.
The lateral approximant /l/ has a relatively abrupt onset and offset. Weak formant structure. No movement of F3.
The /w-y/ contrast:
These semi-vowels or glides have formant structures that resemble those of their respective vowels, /u/ and /i/.
32. Nasal segments have:
Low-frequency voicing energy: a voice bar.
Very weak formant structure, made up of nasal resonances (nasal formants or "poles") and anti-resonances (nasal anti-formants or "zeros"). The anti-resonances are regions of the spectrum robbed of acoustic energy, caused by the introduction of another resonator: the nasal cavity.
Abrupt onset and offset, corresponding to the closure of the oral cavity (the stop gesture) and the redirection of the airstream through the nasal cavity. Release of the oral closure results in an equally abrupt offset registered on the spectrogram.