
Perception & Cognition, One at last in Spoken Word Recognition

Perception & Cognition, One at last in Spoken Word Recognition Temporal Integration at Two Time Scales. 5/9/05 Cochlear Implant Team UIHC. Bob McMurray University of Iowa Dept. of Psychology. Collaborators. Richard Aslin Michael Tanenhaus David Gow. Joe Toscano Dana Subik


Presentation Transcript


  1. Perception & Cognition, One at last in Spoken Word Recognition: Temporal Integration at Two Time Scales. 5/9/05, Cochlear Implant Team, UIHC. Bob McMurray, University of Iowa, Dept. of Psychology.

  2. Collaborators: Richard Aslin, Michael Tanenhaus, David Gow, Joe Toscano, Dana Subik, Julie Markant

  3. Perceptual processes (continuous acoustic detail) are critical for high-level language processes: • Word Recognition • Syntax • Reference. Specifically: sensitivity to fine-grained perceptual detail can help integrate information over time.

  5. High-level language processes (word recognition) provide support for perceptual processes (continuous acoustic detail).

  6. Ganong (1980): Lexical information biases perception of ambiguous phonemes. [Figure: % /t/ responses along a d-to-t continuum for doot/toot and duke/tuke.] Phoneme Restoration (Warren, 1970; Samuel, 1997). Lexical Feedback: McClelland & Elman (1988); Magnuson, McMurray, Tanenhaus & Aslin (2003)

  7. [Diagram: words and phonemes.] Ganong (1980): Lexical information biases perception of ambiguous phonemes. Lexical Feedback: McClelland & Elman (1988); Magnuson, McMurray, Tanenhaus & Aslin (2003)

  9. High-level language processes (word recognition) provide support for perceptual processes (continuous acoustic detail): • Invariance, Covariance & Temporal Integration • Short-term storage. • Covariance. • Limit sensitivity to necessary detail.

  10. In language, information arrives sequentially. • Partial syntactic and semantic representations are formed as words arrive. Example: The Eastside is prettier than the Westside. • Words are identified over sequential phonemes. [Phonetic transcription of an example word, partly lost in extraction: l _ ŋ g ə d _]

  11. Spoken Word Recognition is an ideal arena in which to study these issues because: • Research divides word recognition into perceptual and cognitive mechanisms. • Perceptual information available for temporal information integration. • Cognitive architectures may support perception.

  12. Scales of temporal integration in word recognition • A Word: ordered series of articulations. • - Build abstract representations. • - Form expectations about future events. • - Fast (online) processing. • A phonology: • - Abstract across utterances. • - Expectations about possible future events. • - Slow (developmental) processing

  13. Mechanisms of Temporal Integration • Stimuli do not change arbitrarily. • Perceptual cues reveal something about the change itself. • Active integration: • Anticipate future events. • Retain partial present representations. • Resolve prior ambiguity.

  14. Representational Medium: Lexical Activation • Lexical activation shows: • Online processing dynamics. • Sensitivity to fine-grained detail. • Integration of asynchronous material.

  15. Overview 1) Speech perception and Spoken Word Recognition. 2) Lexical activation is sensitive to fine-grained detail in speech. 3) Fast temporal integration: taking advantage of regularity in the signal for temporal integration. 4) Slow temporal integration: developmental consequences.

  16. Online Word Recognition • Information arrives sequentially. • At early points in time, signal is temporarily ambiguous. • Later arriving information disambiguates the word. [Figure: as "ba… kery" unfolds, the candidates basic, barrier, bait, barricade and baby are crossed out (X), leaving bakery.]

  17. Current models of spoken word recognition • Immediacy: Hypotheses formed from the earliest moments of input. • Activation Based: Lexical candidates (words) receive activation to the degree they match the input. • Parallel Processing: Multiple items are active in parallel. • Competition: Items compete with each other for recognition.

  18. Input: b… u… tt… e… r [Figure: the lexical candidates beach, butter, bump, putter and dog plotted over time.]
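
The four properties listed for current models (immediacy, activation, parallel processing, competition) can be made concrete with a toy simulation of the "butter" example. This is a minimal sketch and not any published model: letters stand in for phonemes, and the position-by-position match score and the normalization used for competition are assumptions made purely for illustration.

```python
# Toy illustration only: letters stand in for phonemes, and the scoring
# rule is a made-up assumption, not a published model.
LEXICON = ["beach", "butter", "bump", "putter", "dog"]


def match_score(word, heard):
    """Count positions where the word agrees with the input heard so far."""
    return sum(1 for w, h in zip(word, heard) if w == h)


def activations(heard, lexicon=LEXICON):
    """Activate all words in parallel in proportion to their match, then
    normalize so candidates compete for a fixed amount of activation."""
    raw = {word: match_score(word, heard) for word in lexicon}
    total = sum(raw.values())
    if total == 0:
        return {word: 0.0 for word in lexicon}
    return {word: score / total for word, score in raw.items()}


def recognize(spoken, lexicon=LEXICON):
    """Recompute activations after every incoming segment (immediacy)."""
    return [activations(spoken[: i + 1], lexicon) for i in range(len(spoken))]


if __name__ == "__main__":
    for step, acts in enumerate(recognize("butter"), start=1):
        print(step, {w: round(a, 2) for w, a in acts.items()})
```

After the first segment, beach, butter and bump are equally active; as more input arrives, butter pulls ahead while putter stays partially active because it overlaps with the input from the second segment on.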

  19. These processes have been well defined for a phonemic representation of the input. [Phonemic transcription of an example word, partly lost in extraction: k A g n I S _ n] • Considerably less ambiguity if we consider subphonemic information. • Bonus: processing dynamics may solve problems in speech perception. Example: subphonemic effects of motor processes.

  20. Coarticulation. Any action reflects future actions as it unfolds. Example: Coarticulation. Articulation (lips, tongue…) reflects current, future and past events. Subtle subphonemic variation in speech reflects temporal organization. Sensitivity to these perceptual details might yield earlier disambiguation. Lexical activation could store these perceptual details. [Figure: articulation diagram with segment labels n, ee, t, c, k.]

  21. These processes have largely been ignored because of a history of evidence that perceptual variability gets discarded. Example: Categorical Perception

  22. Categorical Perception • Sharp identification of tokens on a continuum. • Discrimination poor within a phonetic category. Subphonemic variation in VOT is discarded in favor of a discrete symbol (phoneme). [Figure: identification (% /pa/, 0 to 100) and discrimination (0 to 100) plotted against VOT, from B to P.]
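
The two claims on this slide (sharp identification, poor within-category discrimination) can be sketched numerically. The code below is an illustration under stated assumptions, not the analysis behind the slide: the logistic identification function, the 20 ms boundary and the slope are hypothetical values, and the discrimination rule is one simple labeling-based prediction in which listeners beat chance only to the extent that two tokens receive different labels.

```python
import math


def p_identify_p(vot_ms, boundary_ms=20.0, slope=1.0):
    """Probability of labeling a token /p/, modeled as a logistic in VOT.
    boundary_ms and slope are hypothetical values chosen for illustration."""
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary_ms)))


def predicted_discrimination(vot_a, vot_b, boundary_ms=20.0, slope=1.0):
    """Label-based prediction: accuracy exceeds chance (0.5) only to the
    extent that the two tokens are labeled differently."""
    diff = (p_identify_p(vot_a, boundary_ms, slope)
            - p_identify_p(vot_b, boundary_ms, slope))
    return 0.5 + 0.5 * diff ** 2


if __name__ == "__main__":
    for a, b in [(0, 10), (15, 25), (30, 40)]:
        print(a, b, round(predicted_discrimination(a, b), 3))
```

Under these assumptions, a 10 ms difference within a category is predicted to be discriminated at about chance, while the same 10 ms difference straddling the boundary is discriminated almost perfectly. The evidence on the following slide argues against this strong form.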

  23. Evidence against the strong form of Categorical Perception from psychophysical-type tasks: • Discrimination Tasks • Pisoni and Tash (1974) • Pisoni & Lazarus (1974) • Carney, Widin & Viemeister (1977) • Training • Samuel (1977) • Pisoni, Aslin, Perey & Hennessy (1982) • Goodness Ratings • Miller (1997) • Massaro & Cohen (1983)

  24. Speech Perception • Acoustics -> phonemes • Perceptual processes (e.g. templates). Word Recognition • Phonemes -> words • Cognitive processes (e.g. competition, activation). [Diagram: Acoustic -> Sublexical Units (/la/, /ip/, /a/, /b/, /l/, /p/) -> Lexicon.] Fundamental independence of fields. Enabled by CP. Evidence against CP seen to support paradigm.

  25. Experiment 1: Does within-category acoustic detail systematically affect higher level language? Is there a gradient effect of subphonemic detail on lexical activation?

  26. McMurray, Aslin & Tanenhaus (2002). A gradient relationship would yield systematic effects of subphonemic information on lexical activation. If this gradiency is useful for temporal integration, it must be preserved over time. Need a design sensitive to both acoustic detail and detailed temporal dynamics of lexical activation.

  27. Acoustic Detail. Use a speech continuum: more steps yield a better picture of the acoustic mapping. KlattWorks: generate synthetic continua from natural speech. • 9-step VOT continua (0-40 ms) • 6 pairs of words: beach/peach, bale/pale, bear/pear, bump/pump, bomb/palm, butter/putter • 6 fillers: lamp, leg, lock, ladder, lip, leaf; shark, shell, shoe, ship, sheep, shirt
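
The stimulus set described on this slide can be enumerated in a few lines. This is a sketch of the design only, assuming the nine steps are evenly spaced across 0-40 ms (5 ms apart); the dictionary layout and the function name `build_stimuli` are hypothetical. The slide lists twelve filler words in two groups of six, and the sketch keeps all twelve.

```python
# Nine evenly spaced VOT steps from 0 to 40 ms (even spacing is an assumption).
VOT_STEPS_MS = [step * 5 for step in range(9)]

WORD_PAIRS = [
    ("beach", "peach"), ("bale", "pale"), ("bear", "pear"),
    ("bump", "pump"), ("bomb", "palm"), ("butter", "putter"),
]

FILLERS = [
    "lamp", "leg", "lock", "ladder", "lip", "leaf",
    "shark", "shell", "shoe", "ship", "sheep", "shirt",
]


def build_stimuli():
    """Cross every word pair with every VOT step; fillers are unmanipulated."""
    experimental = [
        {"pair": pair, "vot_ms": vot}
        for pair in WORD_PAIRS
        for vot in VOT_STEPS_MS
    ]
    fillers = [{"word": word} for word in FILLERS]
    return experimental, fillers


if __name__ == "__main__":
    experimental, fillers = build_stimuli()
    print(len(experimental), "continuum tokens;", len(fillers), "filler words")
```

Crossing 6 pairs with 9 steps gives 54 distinct continuum tokens.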

  28. Temporal Dynamics. How do we tap on-line recognition? With an on-line task: Eye-movements. Subjects hear spoken language and manipulate objects in a visual world. Visual world includes set of objects with interesting linguistic properties: a beach, a peach and some unrelated items. Eye-movements to each object are monitored throughout the task. Tanenhaus, Spivey-Knowlton, Eberhard & Sedivy, 1995

  29. Why use eye-movements and visual world paradigm? • Relatively natural task. • Eye-movements generated very fast (within 200 ms of first bit of information). • Eye movements time-locked to speech. • Subjects aren't aware of eye-movements. • Fixation probability maps onto lexical activation.

  30. Task A moment to view the items

  31. Task Bear Repeat 1080 times

  32. Identification Results. High agreement across subjects and items for category boundary. By subject: 17.25 +/- 1.33 ms. By item: 17.24 +/- 1.24 ms. [Figure: proportion /p/ responses (0 to 1) as a function of VOT (0-40 ms), from B to P.]

  33. Task. Target = Bear, Competitor = Pear, Unrelated = Lamp, Ship. [Figure: fixations on trials 1 to 5 over time (200 ms scale), summarized as % fixations over time.]
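
The step from individual trials to "% fixations" over time can be sketched as follows. This is a minimal illustration, not the lab's analysis code: the trial format (a list of `(start_ms, end_ms, role)` fixations), the 200 ms sampling step and the function name are assumptions.

```python
def fixation_proportions(trials, roles, bin_ms=200, n_bins=5):
    """Proportion of trials fixating each role at each sampled time point.

    trials: list of trials; each trial is a list of (start_ms, end_ms, role)
            fixations, where every role must appear in `roles`.
    Time points are sampled at 0, bin_ms, 2 * bin_ms, ...
    """
    if not trials:
        raise ValueError("need at least one trial")
    counts = {role: [0] * n_bins for role in roles}
    for trial in trials:
        for b in range(n_bins):
            t = b * bin_ms
            for start_ms, end_ms, role in trial:
                if start_ms <= t < end_ms:
                    counts[role][b] += 1
                    break  # at most one fixation per time point
    return {
        role: [c / len(trials) for c in per_bin]
        for role, per_bin in counts.items()
    }


if __name__ == "__main__":
    # Two hypothetical trials with target "bear" and competitor "pear".
    trials = [
        [(0, 400, "unrelated"), (400, 1000, "target")],
        [(0, 200, "competitor"), (200, 1000, "target")],
    ]
    roles = ["target", "competitor", "unrelated"]
    print(fixation_proportions(trials, roles))
```

Averaging in this way turns discrete looks on single trials into a continuous curve per object, which is what the fixation-proportion plots on the following slides show.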

  34. Task. More looks to competitor than unrelated items. [Figure: two panels (VOT=0 and VOT=40) showing fixation proportion (0 to 0.9) over time (0-2000 ms).]

  35. Task. Given that the subject heard bear and clicked on “bear”… How often was the subject looking at the “pear”? [Figure: predicted fixation proportion over time for target and competitor under two outcomes: Categorical Results vs. Gradient Effect.]

  36. Results. Long-lasting gradient effect: seen throughout the timecourse of processing. [Figure: competitor fixations (0 to 0.16) by time since word onset (0-2000 ms), one curve per VOT step (0 to 40 ms in 5 ms steps), in two panels by response.]

  37. Area under the curve: Clear effects of VOT. B: p=.017*, P: p<.001***. Linear Trend: B: p=.023*, P: p=.002***. [Figure: looks to competitor (0.02 to 0.08) as a function of VOT (0-40 ms), by response, with the category boundary marked.]
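
The two summary measures named on this slide, area under the competitor-fixation curve and a linear trend across VOT, can be sketched as below. This shows the arithmetic only: the trapezoidal rule, the least-squares slope, and any numbers fed to these functions are assumptions for illustration, and the significance tests reported on the slide are not reproduced.

```python
def area_under_curve(times_ms, proportions):
    """Trapezoidal area under a fixation-proportion curve."""
    area = 0.0
    for i in range(1, len(times_ms)):
        width = times_ms[i] - times_ms[i - 1]
        area += width * (proportions[i] + proportions[i - 1]) / 2
    return area


def linear_slope(xs, ys):
    """Least-squares slope of ys on xs (e.g. a per-VOT summary against VOT)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator
```

A typical use would compute one area per VOT step from that step's competitor curve and then pass the VOT values and the areas to `linear_slope`; a nonzero slope within a response category is what a gradient effect predicts.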

  38. Unambiguous Stimuli Only: Clear effects of VOT. B: p=.014*, P: p=.001***. Linear Trend: B: p=.009**, P: p=.007**. [Figure: looks to competitor (0.02 to 0.08) as a function of VOT (0-40 ms), by response, with the category boundary marked.]

  39. Summary: Subphonemic acoustic differences in VOT have a gradient effect on lexical activation. • Gradient effect of VOT on looks to the competitor. • Effect holds even for unambiguous stimuli. • Seems to be long-lasting. Consistent with growing body of work using priming (Andruski, Blumstein & Burton, 1994; Utman, Blumstein & Burton, 2000; Gow, 2001, 2002).

  40. The Proposed Framework: Sensitivity & Use 1) Word recognition is systematically sensitive to subphonemic acoustic detail. 2) Acoustic detail is represented as gradations in activation across the lexicon. 3) This sensitivity enables the system to take advantage of subphonemic regularities for temporal integration. 4) This has fundamental consequences for development: learning phonological organization.

  41. Lexical Sensitivity 1) Word recognition is systematically sensitive to subphonemic acoustic detail. • Voicing • Laterality, Manner, Place • Natural Speech X Metalinguistic Tasks [Figure labels: P, L, Bear, B, Sh.]

  42. Lexical Sensitivity 1) Word recognition is systematically sensitive to subphonemic acoustic detail. • Voicing • Laterality, Manner, Place • Natural Speech X Metalinguistic Tasks [Figure: competitor fixations (0 to 0.1) as a function of VOT (0-40 ms), with curves for Response=B and Response=P and the category boundary marked.]

  44. Lexical Sensitivity Word recognition is systematically sensitive to subphonemic acoustic detail. • Voicing • Laterality, Manner, Place • Natural Speech X Metalinguistic Tasks • ? Non minimal pairs • ? Duration of effect • (experiment 1)

  45. 2) Acoustic detail is represented as gradations in activation across the lexicon. Input: b… u… m… p… [Figure: the lexical candidates bump, pump, dump, bun, bumper and bomb plotted over time.]

  46. Temporal Integration 3) This sensitivity enables the system to take advantage of subphonemic regularities for temporal integration. • Regressive ambiguity resolution (exp 1): • Ambiguity retained until more information arrives. • Progressive expectation building (exp 2): • Phonetic distinctions are spread over time. • Anticipate upcoming material.

  47. Development 4) Consequences for development: learning phonological organization. • Learning a language: • Integrating input across many utterances to build long-term representation. • Sensitivity to subphonemic detail (exp 4 & 5). • Allows statistical learning of categories (model).

  48. Experiment 2: How long are gradient effects of within-category detail maintained? Can subphonemic variation play a role in ambiguity resolution? How is information at multiple levels integrated?
