Speech Perception: Theoretical approaches
Scope of the problem • Speech perception involves the mapping of speech acoustic signals onto linguistic messages (e.g., phonemes, distinctive features, syllables, words, phrases…)
Why is the problem theoretically hard to solve? • Acoustic variability due to differences in context, talker, dialect, speaking rate, prosody, and other factors. • Segmentation problems: the continuous signal has no clear boundaries between phonemes or words.
Three theoretical approaches: • Motor theory • Direct realism • General approach
Motor theory of speech perception (Liberman & Mattingly, 1985) • Listeners perceive gestures (more specifically, intended gestures, or neuromotor commands). • Speech is perceived in humans by means of a specialized speech module.
How the speech module works: …“the candidate signal descriptions are computed by an analogue of the production process—an internal, innately specified vocal-tract synthesizer…—that incorporates complete information about the anatomical and physiological characteristics of the vocal tract and also about the articulatory and acoustic consequences of linguistically significant gestures” (Liberman & Mattingly, 1985, p. 26).
Direct realist theory of speech perception (C. Fowler) • Derived from James J. Gibson’s perceptual theory. • Objects of speech perception are actual gestures. • No special mechanisms are required.
How direct realism works: • “Perceptual systems have a universal function. They constitute the sole means by which animals can know their niches. Moreover, they appear to serve this function in one way: They use structure in the media that has been lawfully caused by events in the environment as information for the events. Even though it is the structure in media (light for vision, skin for touch, air for hearing) that sense organs transduce, it is not the structure in those media that animals perceive. Rather, essentially for their survival, they perceive the components of their niche that caused the structure.” (Fowler, 1996, p. 1732)
General approach to speech perception (Diehl, Lotto, & Holt, 2004) • Objects of speech perception are (primarily) acoustic/auditory events. • Speech perception relies on general mechanisms of audition and perceptual learning.
VOT frequency histograms of voicing categories across six languages (Lisker & Abramson, 1964)
English identification functions for VOT stimuli superimposed on VOT frequency histograms
Thai identification functions for VOT stimuli superimposed on Thai VOT frequency histograms
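Sketch: an identification function like those in the two slides above can be approximated by a logistic curve that maps VOT onto the probability of a "voiceless" response. This is only an illustration and is not taken from the cited studies. The function name, the 25 ms boundary, and the slope are hypothetical values chosen for the example.

```python
import math


def identification_probability(vot_ms, boundary_ms, slope_per_ms):
    """Probability of a 'voiceless' response for a stimulus with the given VOT.

    A logistic curve: 0.5 at the category boundary, approaching 0 for
    much shorter VOTs and 1 for much longer VOTs.
    """
    return 1.0 / (1.0 + math.exp(-slope_per_ms * (vot_ms - boundary_ms)))


# Hypothetical parameters, for illustration only (not fitted to any data).
BOUNDARY_MS = 25.0   # assumed location of the category boundary
SLOPE_PER_MS = 0.4   # assumed steepness of the identification function

for vot in range(-10, 61, 10):
    p = identification_probability(vot, BOUNDARY_MS, SLOPE_PER_MS)
    print(f"VOT {vot:+4d} ms -> P(voiceless) = {p:.2f}")
```

A language with a three-way voicing contrast, such as Thai, would need two boundaries (one curve per adjacent pair of categories) rather than the single boundary sketched here.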
[Figure: schematic spectrograms of two-tone stimuli at -50, 0, and +50 ms TOT; axes: Frequency (Hz) vs. Time (ms)] Tone onset time (TOT): a nonspeech analog of voice onset time (VOT) (Pisoni, 1977; Holt, Lotto, & Diehl, 2004)
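Sketch: a TOT stimulus can be built from two sine tones whose onsets are offset by the TOT value while their offsets coincide. This is a minimal illustration under stated assumptions, not the procedure of the cited studies: the tone frequencies (500 and 1500 Hz), the 230 ms base duration, the sample rate, and the function names are assumed for the example. The sign convention assumed here is that negative TOT means the lower tone leads and positive TOT means it lags.

```python
import math

SAMPLE_RATE_HZ = 16000  # assumed sample rate


def tone(freq_hz, start_ms, end_ms, total_ms, sample_rate=SAMPLE_RATE_HZ):
    """Sine tone samples, silent outside the interval [start_ms, end_ms)."""
    n_total = int(total_ms * sample_rate / 1000)
    n_start = int(start_ms * sample_rate / 1000)
    n_end = int(end_ms * sample_rate / 1000)
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate)
        if n_start <= n < n_end else 0.0
        for n in range(n_total)
    ]


def tot_stimulus(tot_ms, low_hz=500.0, high_hz=1500.0, base_ms=230.0):
    """Two-tone stimulus with the given tone onset time.

    tot_ms < 0: the lower tone starts first (analogous to voicing lead).
    tot_ms > 0: the lower tone starts later (analogous to voicing lag).
    Both tones end at the same time.
    """
    total_ms = base_ms + abs(tot_ms)
    low_start = max(0.0, tot_ms)     # low tone is delayed when TOT is positive
    high_start = max(0.0, -tot_ms)   # high tone is delayed when TOT is negative
    low = tone(low_hz, low_start, total_ms, total_ms)
    high = tone(high_hz, high_start, total_ms, total_ms)
    # Average the two tones so the mixture stays within [-1, 1].
    return [0.5 * (a + b) for a, b in zip(low, high)]


# A continuum from -50 to +50 ms TOT in 10 ms steps.
continuum = {tot: tot_stimulus(tot) for tot in range(-50, 51, 10)}
```

Each entry in `continuum` is a list of samples that could be written to an audio file for listening or used as input to an identification task.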
/ga/-/ka/ identification by typically developing children and dyslexic children (with and without ADHD)
Back to the three approaches to speech perception • Motor theory • Direct realism • General approach