
Motor Theory + Signal Detection Theory


Presentation Transcript


  1. Motor Theory + Signal Detection Theory March 23, 2010

  2. Oh Yeahs. • Nasometer labs due! • Dental vs. alveolar vs. bilabial release bursts. • Examples from Yanyuwa and from Hindi. • Creating synthetic formant transitions: KlattTalk.

  3. KlattTalk • KlattTalk has since become the standard for formant synthesis. (DECTalk) • http://www.asel.udel.edu/speech/tutorials/synthesis/vowels.html

  4. Categorical Perception • Categorical perception = continuous physical distinctions are perceived as discrete categories. • In the in-class perception experiment: • There were 11 different syllable stimuli • They differed only in the locus of their F2 transition • F2 locus range = 726 - 2217 Hz • Source: http://www.ling.gu.se/~anders/KatPer/Applet/index.eng.html
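A minimal Python sketch of the stimulus continuum above, assuming the 11 F2-locus steps are spaced evenly in Hz between the stated endpoints (the actual stimuli may have used a different spacing):

def f2_locus_continuum(lo_hz=726.0, hi_hz=2217.0, n=11):
    """Return an evenly spaced F2 locus (Hz) for each of n stimuli."""
    step = (hi_hz - lo_hz) / (n - 1)
    return [lo_hz + i * step for i in range(n)]

# Stimulus #1 = 726 Hz, Stimulus #11 = 2217 Hz, steps of 149.1 Hz.
for i, f2 in enumerate(f2_locus_continuum(), start=1):
    print(f"Stimulus #{i:2d}: F2 locus = {f2:6.1f} Hz")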

  5. Example stimuli from the in-class experiment: Stimulus #1, Stimulus #6, Stimulus #11.

  6. Identification • In categorical perception: • All stimuli within a category should be labeled the same.

  7. Discrimination • Original task: ABX discrimination • Stimuli across a category boundary should be 100% discriminable. • Stimuli within a category should not be discriminable at all. • In practice, categorical perception means the discrimination function can be predicted from the identification function.

  8. Identification → Discrimination • Let’s consider a case where the two sounds in a discrimination pair are the same. • Example: the pair is stimulus 3 followed by stimulus 3 • Identification data: stimulus 3 is identified as: • [b] 95% of the time • [d] 5% of the time • The discrimination pair will be perceived as: • [b]-[b]: .95 * .95 = .9025 • [d]-[d]: .05 * .05 = .0025 • Probability of a “same” response is predicted to be: • (.9025 + .0025) = .905 = 90.5%

  9. Identification → Discrimination • Let’s consider a case where the two sounds in a discrimination pair are different. • Example: the pair is stimulus 9 followed by stimulus 11 • Identification data: • Stimulus 9: [d] 80% of the time, [g] 20% of the time • Stimulus 11: [d] 5% of the time, [g] 95% of the time • The discrimination pair will be perceived as: • [d]-[d]: .80 * .05 = .04 • [g]-[g]: .20 * .95 = .19 • Probability of a “same” response is predicted to be: • (.04 + .19) = .23 = 23%
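The predictions in slides 8 and 9 follow one formula: the probability of a “same” response is the sum, over the response categories, of the product of the two stimuli’s identification probabilities. A minimal Python sketch (the function name and data layout are mine, not from the slides):

def predicted_same(p1, p2):
    """p1, p2: dicts mapping category -> identification probability."""
    # P(same) = sum over categories c of p1[c] * p2[c]
    return sum(p1[c] * p2.get(c, 0.0) for c in p1)

# Slide 8: stimulus 3 paired with itself.
stim3 = {"b": 0.95, "d": 0.05}
print(predicted_same(stim3, stim3))     # 0.905 (90.5%)

# Slide 9: stimulus 9 paired with stimulus 11.
stim9 = {"d": 0.80, "g": 0.20}
stim11 = {"d": 0.05, "g": 0.95}
print(predicted_same(stim9, stim11))    # 0.23 (23%)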

  10. Discrimination • In the discrimination graph: • Solid line: the observed data • Dashed line: the data predicted from the identification scores • Note: the actual listeners did slightly better than the predictions.

  11. Categorical, Continued • Categorical perception was also found for stop/glide/vowel distinctions: • 10 ms transitions → [b] percept • 60 ms transitions → [w] percept • 200 ms transitions → [u] percept
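The three percepts above can be read as categories along a single duration continuum. A toy Python sketch: the 10/60/200 ms anchor points are from the slide, but the category boundaries (35 ms and 130 ms) are illustrative guesses, not measured values:

def percept(transition_ms):
    # Short transitions sound like a stop, medium like a glide,
    # long like a vowel. Boundary values here are hypothetical.
    if transition_ms < 35:
        return "b"
    elif transition_ms < 130:
        return "w"
    else:
        return "u"

for dur in (10, 60, 200):
    print(f"{dur:3d} ms -> [{percept(dur)}]")    # [b], [w], [u]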

  12. Interpretation • Main idea: in categorical perception, the mind translates an acoustic stimulus into a phonemic label (category). • The acoustic details of the stimulus are discarded in favor of an abstract representation. • A continuous acoustic signal is thus transformed into a series of linguistic units.

  13. The Next Level • Interestingly, categorical perception is not found for non-speech stimuli. • Miyawaki et al. tested perception of an F3 continuum between /r/ and /l/.

  14. The Next Level • They also tested perception of the F3 transitions in isolation. • Listeners did not perceive these transitions categorically.

  15. The Implications • Interpretation: we do not perceive speech in the same way we perceive other sounds. • “Speech is special”… • and the perception of speech is modular. • A module is a special processor in our minds/brains devoted to interpreting a particular kind of environmental stimulus.

  16. Module Characteristics • You can think of a module as a “mental reflex”. • A module of the mind is defined as having the following characteristics: • Domain-specific • Automatic • Fast • Hard-wired in brain • Limited top-down access (you can’t “unperceive”) • Example: the sense of vision operates modularly.

  17. A Modular Mind Model • Central processes: judgment, imagination, memory, attention • Modules: vision, hearing, touch, speech • Transducers: eyes, ears, skin, etc. • External, physical reality

  18. Remember this stuff? • Speech is a “special” kind of sound because it exhibits spectral change over time. • → It’s processed by the speech module, not by the auditory module.

  19. SWS Findings • The uninitiated either hear sinewave speech as speech or as “whistles”, “chirps”, etc. • Claim: once you hear it as speech, you can’t go back. • The speech module takes precedence • (Limited top-down access) • Analogy: it’s impossible to not perceive real speech as speech. • We can’t hear the individual formants as whistles, chirps, etc. • Motor theory says: we don’t perceive the “sounds”, we perceive the gestures which shape the spectrum.

  20. McGurk Videos

  21. McGurk Effect explained • Audio + Visual → Perceived: • ba + ga → da • ga + ba → ba (or bga) • Some interesting facts: • The McGurk Effect is exceedingly robust. • Adults show the McGurk Effect more than children. • Americans show the McGurk Effect more than Japanese.

  22. Original McGurk Data • Stimulus: auditory ba-ba, visual ga-ga • Response types: Auditory (ba-ba), Visual (ga-ga), Fused (da-da), Combination (gabga, bagba)

  Age     Auditory   Visual   Fused   Combo
  3-5       19%        0%      81%      0%
  7-8       36%        0%      64%      0%
  18-40      2%        0%      98%      0%

  23. Original McGurk Data • Stimulus: auditory ga-ga, visual ba-ba • Response types: Auditory (ga-ga), Visual (ba-ba), Fused (da-da), Combination (gabga, bagba)

  Age     Auditory   Visual   Fused   Combo
  3-5       57%       10%       0%     19%
  7-8       36%       21%      11%     32%
  18-40     11%       31%       0%     54%

  24. Audio-Visual Sidebar • Visual cues affect the perception of speech in non-mismatched conditions, as well. • Scientific studies of lipreading date back to the early twentieth century. • The original goal: improve the speech perception skills of the hearing-impaired. • Note: visual speech cues often complement audio speech cues • In particular: place of articulation • However, training people to become better lipreaders has proven difficult… • Some people get it; some people don’t.

  25. Sumby & Pollack (1954) • First investigated the influence of visual information on the perception of speech by normal-hearing listeners. • Method: • Presented individual word tokens to listeners in noise, with simultaneous visual cues. • Task: identify spoken word • Clear: • +10 dB SNR: • + 5 dB SNR: • 0 dB SNR:

  26. Sumby & Pollack data • Auditory-Only Audio-Visual • Visual cues provide an intelligibility boost equivalent to a 12 dB increase in signal-to-noise ratio.

  27. Tadoma Method • Some deaf-blind people learn to perceive speech through the tactile modality, by using the Tadoma method.

  28. Audio-Tactile Perception • Fowler & Dekle tested the ability of (naive) college students to perceive speech through the Tadoma method. • Presented synthetic stops auditorily • Combined with mismatched tactile information: • Ex: audio /ga/ + tactile /ba/ • Also combined with mismatched orthographic information: • Ex: audio /ga/ + orthographic /ba/ • Task: listeners reported what they “heard” • Result: the tactile condition biased listeners more towards “ba” responses

  29. Fowler & Dekle data orthographic mismatch condition tactile mismatch condition read “ba” felt “ba”

  30. fMRI data • Benson et al. (2001) • Non-Speech stimuli = notes, chords, and chord progressions on a piano

  31. fMRI data • Benson et al. (2001) • Difference in activation for natural speech stimuli versus activation for sinewave speech stimuli

  32. Mirror Neurons • In the 1990s, researchers in Italy discovered what they called “mirror neurons” in the brains of macaques. • Macaques had been trained to make grasping motions with their hands. • Researchers recorded the activity of single neurons while the monkeys were making these motions. • Serendipity: the same neurons fired when the monkeys saw the researchers making grasping motions. • → A neurological link between perception and action. • Motor theory claim: the same links exist in the human brain, for the perception of speech gestures.

  33. Motor Theory, in a nutshell • The big idea: • We perceive speech as abstract “gestures”, not sounds. • Evidence: • The perceptual interpretation of speech differs radically from the acoustic organization of speech sounds • Speech perception is multi-modal • Direct (visual, tactile) information about gestures can influence/override indirect (acoustic) speech cues • Limited top-down access to the primary, acoustic elements of speech
