Adult Speech Perception - PowerPoint PPT Presentation




Presentation Transcript



Adult Speech Perception

Chris Darwin



Place of articulation

                     Labial    Alveolar    Velar
Stop, voiced         b         d           g
Stop, voiceless      p         t           k
Nasal, voiced        m         n           ng



Wide and narrowband spectrograms

[Figure: waveform, wide-band spectrogram and narrow-band spectrogram]



Voice-Onset Time (VOT)

[Figure: "bit" with a VOT of 5 ms; "pit" with a VOT of 40 ms]



Formants in a wide-band spectrogram

[Figure: wide-band spectrogram of “we go”, labelled with the formants F1, F2 and F3, the formant transitions and the stop burst]



Categorical Perception - 1

1. Set up a continuum of sounds between two categories

[Figure: /ba/ - /da/ continuum, stimuli numbered 1 ... 3 ... 5 ... 7]



Categorical Perception - 2

2. Run an identification experiment

[Figure: % /ba/ responses (0 to 100) against stimulus number (1 ... 3 ... 5 ... 7), showing a sharp phoneme boundary]



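Categorical Perception - 3

3. Run a discrimination experiment

[Figure: % "different" responses (0 to 100) for stimulus pairs such as 1 versus 3, plotted along the continuum 1 ... 3 ... 5 ... 7, showing a discrimination peak]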



Categorical Perception - 4

Defined as:

1. Sharp phoneme boundary

2. Discrimination peak at phoneme boundary

3. Discrimination predicted from identification

(only “different” if different phoneme)
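
The third criterion can be made concrete with a small sketch. It assumes a logistic identification function on the 7-step /ba/ - /da/ continuum and an ABX discrimination task, and it uses the commonly cited label-only prediction P(correct) = 0.5 + 0.5 (p1 - p2)^2. The function names, the boundary at step 4 and the slope are illustrative assumptions, not values from the lecture.

```python
import math

def p_ba(step, boundary=4.0, slope=2.5):
    """Probability of labelling a continuum step as /ba/.

    Logistic identification function; boundary and slope are assumed values.
    """
    return 1.0 / (1.0 + math.exp(slope * (step - boundary)))

def predicted_abx(step_a, step_b, boundary=4.0, slope=2.5):
    """Predicted proportion correct in ABX if listeners compare labels only."""
    diff = p_ba(step_a, boundary, slope) - p_ba(step_b, boundary, slope)
    return 0.5 + 0.5 * diff ** 2

# Identification function over the 7 continuum steps.
identification = {step: p_ba(step) for step in range(1, 8)}

# Predicted discrimination for all two-step pairs (1 vs 3, 2 vs 4, ...).
two_step_pairs = [(s, s + 2) for s in range(1, 6)]
discrimination = {pair: predicted_abx(*pair) for pair in two_step_pairs}

for pair, score in discrimination.items():
    print(pair, round(score, 2))
```

Under these assumptions the predicted score stays near chance (0.5) for pairs that fall inside one category and peaks for the pair that straddles the boundary, which is the pattern the definition describes.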



Categorical Perception - 5

• Occurs for many consonant continua

Even with “proper” discrimination paradigms

• Less evident for vowel continua



Categorical Perception - 6

  • For most “ordinary” continua, such as frequency, loudness and brightness, our ability to discriminate far exceeds our ability to label.

  • Continua that show Categorical Perception are different from this norm.

  • Liberman claims that CP is an indicator of a special Speech Mode of perception that is distinctively human.



Categorical Perception - 7

  • Is CP restricted to speech?

  • No, also shown by comparisons of musical intervals.

    • Burns, E. M. and Campbell, S. L. (1994). "Frequency and frequency ratio resolution by possessors of relative and absolute pitch: Examples of categorical perception?," J. Acoust. Soc. Am. 96, 2704-2719.

  • Is CP shown for speech unique to humans?

  • No. Chinchillas and quails show the same VOT boundary as humans.

    • Kuhl, P. K. and Miller, J. D. (1978). "Speech perception by the chinchilla: identification functions for synthetic VOT stimuli," J. Acoust. Soc. Am. 63, 905-917.

  • Macaques show discrimination peaks at human VOT and place-of-articulation boundaries.

    • Kuhl, P. K. and Padden, D. M. (1982). "Enhanced discriminability at the phonetic boundaries for the voicing feature in macaques," Percept. Psychophys. 32, 542-550.

    • Kuhl, P. K. and Padden, D. M. (1983). "Enhanced discriminability at the phonetic boundaries for the place feature in macaques," J. Acoust. Soc. Am. 73, 1003-1010.

  • So - human speech exploits discontinuities in the way that vertebrate auditory systems represent sound.



Natural auditory categories

[Figure: two panels plotting acoustic values and auditory values against a stimulus dimension, with regions labelled Category 1 and Category 2]

Voicing and place-of-articulation dimensions show these natural categories.

Sinex, D. G. and McDonald, L. P. (1989). "Synchronized discharge rate representation of voice-onset time in the chinchilla auditory nerve," J. Acoust. Soc. Am. 85, 1995-2004.

Sinex, D. G., McDonald, L. P. and Mott, J. B. (1991). "Neural correlates of nonmonotonic temporal acuity for voice onset time," J. Acoust. Soc. Am. 90, 2441-9.
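
One way to picture a natural auditory category is as a nonlinear mapping from acoustic values to auditory values. The sketch below is only an illustration: the sigmoid shape, the 25 ms boundary and the steepness are hypothetical choices, not measurements from the studies cited above.

```python
import math

def auditory_value(acoustic, boundary=25.0, steepness=0.3):
    """Hypothetical nonlinear mapping from an acoustic value (e.g. VOT in ms)
    to an internal auditory value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-steepness * (acoustic - boundary)))

# Equal 10 ms steps along the acoustic dimension.
acoustic_steps = [0, 10, 20, 30, 40, 50]
auditory_steps = [auditory_value(x) for x in acoustic_steps]

# Size of each step after the mapping: equal acoustic steps become unequal.
auditory_differences = [
    later - earlier
    for earlier, later in zip(auditory_steps, auditory_steps[1:])
]

for start, diff in zip(acoustic_steps, auditory_differences):
    print(f"{start}-{start + 10} ms: auditory change {diff:.3f}")
```

With this mapping the step that crosses the assumed boundary produces by far the largest auditory change, so stimuli on either side of it would be easy to tell apart while stimuli within a category would not.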



Is CP innate or acquired?

Yes!

Infants are born with the ability to make many speech discriminations that they subsequently cannot make (see next lecture).

Adults (and 1-year-old infants) have lost the ability to make distinctions that their language does not use.



Different languages make different regions of acoustic space distinctive



/r/ - /l/...

  • Phone - a particular sound used by any language, e.g. the sound [r]

  • Phoneme - a sound used in contrast to another in a particular language, e.g. the category /r/ as distinct from /l/

  • Different languages make different phonemic contrasts.



/r/ - /l/ - 2

  • Phonemes in a particular language are defined by minimal pairs

  • i.e. since “lice” and “rice” have different meanings in English, they contain different phonemes: /l/ and /r/

  • But there is no such minimal pair in Japanese, so Japanese has a single phoneme /r/
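
The minimal-pair test can be sketched as a small search over a lexicon. The toy lexicon and its rough transcriptions below are hypothetical and exist only to illustrate the idea; the function names are likewise made up for this example.

```python
from itertools import combinations

# Hypothetical toy lexicon: word -> rough phoneme sequence (illustration only).
LEXICON = {
    "lice": ("l", "ai", "s"),
    "rice": ("r", "ai", "s"),
    "lime": ("l", "ai", "m"),
    "rhyme": ("r", "ai", "m"),
    "mice": ("m", "ai", "s"),
}

def minimal_pairs(lexicon):
    """Return (word1, word2, phoneme1, phoneme2) for every pair of words
    whose transcriptions differ in exactly one segment."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2):
        if len(p1) != len(p2):
            continue
        diffs = [(a, b) for a, b in zip(p1, p2) if a != b]
        if len(diffs) == 1:
            pairs.append((w1, w2, diffs[0][0], diffs[0][1]))
    return pairs

def contrasts(lexicon, a, b):
    """True if some minimal pair in the lexicon separates sounds a and b."""
    return any({x, y} == {a, b} for _, _, x, y in minimal_pairs(lexicon))

print(contrasts(LEXICON, "l", "r"))
```

A lexicon with no pair of words separated only by [l] versus [r] would make `contrasts` return False for those two sounds, which is the situation the last bullet describes for Japanese.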



/r/ - /l/ - 3

  • Each language has its own distinctive set of phonemic categories

  • English distinguishes /r/ from /l/ but Japanese doesn’t

  • Tamil distinguishes a dental /t1/, an alveolar /t2/ and a retroflex /t3/. English doesn’t.



Synthetic Stimuli: /ra/-/la/

[Figure: synthetic /ra/ and /la/ stimuli]



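/r/ - /l/ - 4

[Figure: English identification (% /ra/) and English and Japanese discrimination (% correct, 3-oddball task) plotted against stimulus number (1 ... 5 ... 10 ... 15) along an F3 continuum from /ra/ to /la/; vertical axis marked 0, 50 and 100]

Miyawaki et al. (1975) Perception & Psychophysics 18, 331-340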



Synthetic Stimuli: /ra/-/la/

[Figure: stimuli plotted on Second Formant and Third Formant axes]

Iverson, P., et al. (2003). "A perceptual interference account of acquisition difficulties for non-native phonemes," Cognition 87, B47-57.



American MDS Solution

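[Figure: MDS solution for American listeners, with /l/ and /r/ regions marked on Second Formant and Third Formant axes, shown with the physical spacing of the stimuli]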

…and rated the similarity of stimulus pairs



Japanese MDS Solution

[Figure: MDS solution for Japanese listeners on Second Formant and Third Formant axes]



/r/ - /l/ - 4

  • Can Japanese really not hear any difference?

  • Use implicit technique (Mann 1986 Cognition):

    • Co-articulation: /d/ and /g/ are pronounced differently after /l/ and /r/ as in /arda/-/arga/ compared with /alda/-/alga/

  • So, for English speakers /d/-/g/ boundary has different formant values after /l/ than after /r/.

  • Also true for Japanese listeners who can hear /r/ vs /l/

  • But ALSO true for those who CAN’T.



/r/ - /l/ - 5

Is this still something specific to speech?

NO!! QUAILS DO IT TOO!!!

Lotto, Kluender & Holt (1997) J. Acoust. Soc. Am. 102, 1134-1140

So may reflect a general auditory contrast effect, i.e. the auditory representation of the [d] sound is different after an [r] than after an [l].

Lotto & Kluender (1998) Perc. & Psychophys. 60, 602-619



/r/ - /l/ - 6

Fowler, C. A., Brown, J. M. and Mann, V. A. (2000). "Contrast effects do not underlie effects of preceding liquids on stop-consonant identification by humans," J. exp. Psychol.: Hum. Perc. & Perf. 26, 877-888.

McGurk effect experiment shows compensation for coarticulation by listeners when neither frequency contrast nor masking can be the source of the compensations.



McGurk effect

watch it on YouTube



/r/ - /l/: Fowler’s McGurk expt

Audio: ambiguous ar/l (constant), followed by either da or ga

+

Vision: either ar or al

Still get shift in d/g boundary, so not auditory contrast.



Resolution

  • Holt, L. L., Stephens, J. D. and Lotto, A. J. (2005). "A critical evaluation of visually moderated phonetic context effects," Percept Psychophys 67, 1102-12.

Fowler’s effect is due to a McGurk effect caused by visual input concurrent with the test syllable, NOT the precursor.



Trading relations

  • Most phonetic distinctions have more than one acoustic cue as a result of the particular articulatory gesture that gives the distinction.

  • Perception must establish some "trade-off" between the different cues. Can this trade-off be explained by low-level auditory processes such as short-term adaptation, or does it require processes specific to speech?

  • Repp (1982) Psych Bull. 92, 81-110
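
A trade-off between two cues can be sketched as a weighted combination feeding a single decision. This is a minimal illustration, not a model from the cited review: the logistic form, the cue names, the weights and the bias are all hypothetical.

```python
import math

# Hypothetical weights: how strongly each cue pushes towards category B.
W_PRIMARY = 0.25
W_SECONDARY = 1.0
BIAS = -6.25

def p_category_b(primary_cue, secondary_cue):
    """Probability of a category-B response from a weighted sum of two cues."""
    evidence = W_PRIMARY * primary_cue + W_SECONDARY * secondary_cue + BIAS
    return 1.0 / (1.0 + math.exp(-evidence))

def primary_boundary(secondary_cue):
    """Primary-cue value at which responses are split 50/50,
    for a given value of the secondary cue."""
    return -(BIAS + W_SECONDARY * secondary_cue) / W_PRIMARY

# More of the secondary cue means less of the primary cue is needed.
print(primary_boundary(0.0))
print(primary_boundary(1.0))
```

In this sketch, raising the secondary cue moves the 50% boundary on the primary cue, which is what a trading relation looks like in identification data. Whether such a shift arises from low-level auditory processes or from speech-specific ones is the question the slide poses, and the sketch takes no position on it.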



Summary

  • Many consonantal speech sounds are perceived categorically.

  • For some, this is due to speech exploiting discontinuities in the way that auditory systems represent sound.

  • Some of it is due to cultural differences, acquired in the first year of life.

  • Some aspects of the decoding of co-articulation may be due to general contrast effects (e.g. through adaptation in the auditory nerve).

  • Others are non-auditory in nature and may be specific to human listeners.

