Spectrogram & its reading

by Tae-Yeoub Jang

What is a spectrogram?

  • In use since the 1940s

  • Another representation of frequency-domain analysis

  • The most popular way of representing spectral information

  • Three-dimensional representation

    • X-axis: Time

    • Y-axis: Frequency

    • Darkness (or color): Energy
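The three dimensions listed above can be illustrated with a minimal sketch, assuming a plain short-time Fourier analysis with a Hann window (the slides do not say how their spectrograms were computed). The function name, the test tone, and all parameter values are hypothetical. Each row of the result is one time frame (X-axis), each column one frequency bin (Y-axis), and each value a magnitude that a display would render as darkness or color.

```python
import cmath
import math


def spectrogram(samples, window_size, hop_size):
    """Return a list of frames; each frame lists one magnitude per frequency bin."""
    # Hann window to reduce spectral leakage at the frame edges
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window_size - 1))
              for n in range(window_size)]
    frames = []
    for start in range(0, len(samples) - window_size + 1, hop_size):
        frame = [samples[start + n] * window[n] for n in range(window_size)]
        # Naive DFT; bins 0..window_size//2 are enough for a real-valued signal
        magnitudes = []
        for k in range(window_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                      for n in range(window_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical test signal: a 1000 Hz sine sampled at 8000 Hz for 0.1 s
sample_rate = 8000
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]
spec = spectrogram(signal, window_size=64, hop_size=32)

# Frequency (Hz) of the strongest bin in the first frame
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

With a 64-sample window at 8000 Hz the bins are 125 Hz apart, so the test tone falls exactly on one bin. A practical implementation would use an FFT rather than this direct sum.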

Reviving Sonus

Spectrogram example (color rendering of the word “compute”)

Spectrogram example (grayscale rendering of the word “compute”)

Wideband vs. narrowband spectrograms of the question "Is Pat sad, or mad?" The 5th, 10th and 15th harmonics are marked by white squares in two of the vowels.

Types of spectrogram

  • Wideband spectrogram

    • better time resolution

    • e.g., 15 ms window, 1 ms shift, 125 Hz bandwidth

  • Narrowband spectrogram

    • better frequency resolution

    • e.g., 50 ms window, 1 ms shift, 40 Hz bandwidth
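The trade-off between the two types can be checked with a little arithmetic. The sketch below takes the two settings quoted above, converts the window lengths to sample counts, and multiplies window duration by bandwidth. The 16 kHz sampling rate and the function names are assumptions for illustration; they are not given in the slides.

```python
def window_samples(window_ms, sample_rate_hz):
    """Number of samples covered by an analysis window of the given duration."""
    return round(window_ms * sample_rate_hz / 1000)


def time_bandwidth_product(window_ms, bandwidth_hz):
    """Window duration (s) times analysis bandwidth (Hz); dimensionless."""
    return window_ms / 1000 * bandwidth_hz


# Settings quoted on the slide
wideband = {"window_ms": 15, "bandwidth_hz": 125}
narrowband = {"window_ms": 50, "bandwidth_hz": 40}

sample_rate_hz = 16000  # assumed sampling rate, not given in the slides

wide_n = window_samples(wideband["window_ms"], sample_rate_hz)      # 240 samples
narrow_n = window_samples(narrowband["window_ms"], sample_rate_hz)  # 800 samples

wide_tb = time_bandwidth_product(**wideband)      # about 1.875
narrow_tb = time_bandwidth_product(**narrowband)  # about 2.0
```

Both products come out close to 2, which is the point: lengthening the window narrows the bandwidth (finer frequency resolution) at the cost of time resolution, and vice versa.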

Advantages & Disadvantages

  • Advantages

    • Time alignment

  • Disadvantages

    • Less reliable than waveform

Vowel Spectrogram

  • Formant frequencies are critical cues for vowel distinction

  • F1: Height

    • high vowels: low F1

  • F2: Backness

    • back vowels: low F2
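The F1/height and F2/backness relations can be tried out on a small table of reference values. The sketch below uses the commonly cited Peterson and Barney (1952) averages for adult male speakers of American English; these numbers are an assumption brought in for illustration and are not taken from the slides. The nearest-neighbor lookup in raw Hz is a deliberately crude, hypothetical classifier.

```python
# (F1, F2) averages in Hz for adult male speakers, as commonly cited from
# Peterson & Barney (1952); assumed for illustration, not from the slides.
FORMANTS = {
    "heed": (270, 2290),
    "hid": (390, 1990),
    "head": (530, 1840),
    "had": (660, 1720),
    "hod": (730, 1090),
    "hawed": (570, 840),
    "hood": (440, 1020),
    "who'd": (300, 870),
}


def nearest_vowel(f1, f2):
    """Return the reference word whose (F1, F2) pair is closest to the measurement."""
    def squared_distance(word):
        ref_f1, ref_f2 = FORMANTS[word]
        return (ref_f1 - f1) ** 2 + (ref_f2 - f2) ** 2
    return min(FORMANTS, key=squared_distance)


# Low F1 first, i.e. the highest vowels first
by_height = sorted(FORMANTS, key=lambda word: FORMANTS[word][0])
# Low F2 first, i.e. the most back vowels first
by_backness = sorted(FORMANTS, key=lambda word: FORMANTS[word][1])

guess = nearest_vowel(280, 2250)  # a hypothetical measurement
```

Sorting by F1 puts "heed" and "who'd" (high vowels) at the front and "hod" (low vowel) at the end; sorting by F2 puts the back vowels first and "heed" last.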

Example formant frequencies of English monophthongs

"heed, hid, head, had, hod, hawed, hood, who'd" (a male speaker, American English)

Consonant Spectrogram

  • General

    • Acoustic structure more complicated than vowels

    • Adjacent sounds (especially vowels) convey important information → locus

    • High-frequency characteristics, especially for fricatives and affricates

What is LOCUS?

  • Information carried by formant transitions from vowels into obstruents or from obstruents into vowels

  • The target frequency that each formant transition heads toward as an obstruction is made, or the frequency the transition comes from as the obstruction is released

  • Characteristic of the consonantal place and manner → roughly the same in different vowel contexts

Stops

  • General

    • Fairly distinct locus for each place

    • Burst

    • Silence during the closure (only at syllable onset position)

    • Virtually no difference during the closure

Stops (cntd.)

  • Voicing distinction

    • voiced: vertical striations (voicing), less abrupt burst; frequently weakened to resemble fricatives or approximants

    • voiceless: generally abrupt burst at higher frequency area

Stops (cntd.)

  • Place distinction

    • bilabial

      • relatively low F2, F3 locus → formants rise into and fall out of the vowel

      • weak and spread vertical lines

    • alveolar

      • F2 locus about 1800 Hz

      • Strong vertical lines

    • velar

      • Velar pinch: the vowel's F2 and F3 merge

      • often double burst

      • long formant transitions
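The idea of a place-specific locus can be sketched as a simple comparison: the F2 transition out of the consonant moves from the locus toward the vowel's own F2. The example below uses the alveolar F2 locus of about 1800 Hz given above. The function name, the vowel F2 targets, and the 100 Hz "flat" tolerance are hypothetical, and real transitions start near the locus rather than exactly on it.

```python
ALVEOLAR_F2_LOCUS_HZ = 1800  # from the slide: "F2 locus about 1800 Hz"


def f2_transition_out_of_consonant(locus_hz, vowel_f2_hz, tolerance_hz=100):
    """Describe how F2 moves from the consonant release toward the vowel target.

    tolerance_hz is a hypothetical threshold below which the movement is
    treated as flat.
    """
    delta = vowel_f2_hz - locus_hz
    if abs(delta) <= tolerance_hz:
        return "flat"
    return "rising" if delta > 0 else "falling"


# Assumed vowel F2 targets in Hz, for illustration only
examples = {"front vowel": 2300, "back vowel": 900, "central vowel": 1750}

directions = {
    label: f2_transition_out_of_consonant(ALVEOLAR_F2_LOCUS_HZ, f2)
    for label, f2 in examples.items()
}
```

Under these assumptions the same alveolar stop shows a rising F2 before a front vowel and a falling F2 before a back vowel, while the frequency the transition points back to stays about the same.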

Stops (cntd.)

  • Manner distinction

    • Silence duration, VOT, vowel F0
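Of these cues, VOT (voice onset time) is the easiest to state as a calculation: the interval from the burst to the first vertical striation of voicing. A minimal sketch, assuming the two time stamps have already been read off the spectrogram by hand; the function name and the example values are hypothetical.

```python
def voice_onset_time_ms(burst_time_s, voicing_onset_s):
    """VOT in milliseconds.

    Positive when voicing starts after the burst, negative when voicing
    begins during the closure (prevoicing).
    """
    return (voicing_onset_s - burst_time_s) * 1000


# Hypothetical time stamps (seconds) read off a spectrogram
long_lag = voice_onset_time_ms(burst_time_s=0.200, voicing_onset_s=0.265)   # about 65 ms
prevoiced = voice_onset_time_ms(burst_time_s=0.200, voicing_onset_s=0.120)  # about -80 ms
```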

Examples -- “a bab, a dad, a gag”

Place dependent loci

Fricatives

  • General

    • Random noise pattern especially in high frequency regions

    • Place distinction

      • Labiodental [f, v]: rising locus into the following vowel

      • Dental [θ, ð]: major energy above 6000Hz

      • Alveolar [s, z]: major energy above 4000Hz

      • Alveopalatal [š, ž]: major energy above about 2000–2500Hz, lower than for [s, z]

      • Glottal [h]: the trace of formant frequencies of neighbouring vowels
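"Major energy above N Hz" can be turned into a number by measuring what share of a frame's spectral energy lies at or above a cutoff. The sketch below is one hedged way to do that; the function name and the two toy spectra are hypothetical, and the 4000 Hz cutoff is the alveolar figure from the list above.

```python
def energy_fraction_above(magnitudes, sample_rate_hz, cutoff_hz):
    """Fraction of spectral energy at or above cutoff_hz.

    magnitudes holds one frame of a magnitude spectrum with bins evenly
    spaced from 0 Hz to the Nyquist frequency (sample_rate_hz / 2).
    """
    bin_hz = (sample_rate_hz / 2) / (len(magnitudes) - 1)
    energies = [m * m for m in magnitudes]
    total = sum(energies)
    if total == 0:
        return 0.0
    high = sum(e for k, e in enumerate(energies) if k * bin_hz >= cutoff_hz)
    return high / total


# Hypothetical 9-bin spectra at a 16 kHz sampling rate (one bin every 1000 Hz)
s_like = [0, 0, 0, 0, 1, 2, 3, 3, 2]  # energy concentrated at 4-8 kHz
f_like = [1, 1, 1, 1, 1, 1, 1, 1, 1]  # weak energy spread evenly

s_ratio = energy_fraction_above(s_like, 16000, 4000)
f_ratio = energy_fraction_above(f_like, 16000, 4000)
```

A frame with a ratio near 1 at the 4000 Hz cutoff looks like the [s, z] pattern, while a flat spectrum gives a much lower ratio. Real decisions would need more than this single measure.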

Fricatives (cntd.)

  • Weak vs. strong

    • Strong [s, z, š, ž ]: darker bands

    • Weak [f, v, θ, ð]: spread and fainter

      • Voiced [v, ð]: often so weak that they are confused with nasals or approximants

      • Cue to tell [θ] from [f]: higher formants of [θ] fall into adjacent vowels

Example –“fie, thigh, sigh, shy”

Example –“ever, weather, fizzer, pleasure”

Nasals

  • General

    • Formants similar to vowels but fainter

    • Very low F1 (about 250Hz), with F2 at about 2500Hz and F3 at about 3250Hz

  • Place distinction

    • bilabial [m]: downward F2, F3 locus

    • alveolar [n]: smaller F2 transition

    • velar [ŋ ]: velar pinch

Examples -- “a Pam, a tan, a kang”

Liquids & Approximants

  • General

    • Formants similar to vowels but fainter (especially at high frequency regions)

    • Approximately F1 250Hz, F2 1200Hz, F3 2400Hz

    • Change in formant structure

Liquids & Approximants(cntd.)

  • Phone specific properties

    • Labial glide [w]:

      • very low F1 and F2 (F2 about 600-1000Hz), which come very close to each other

      • relatively low F3

      • rapid falloff of spectral amplitude

    • Palatal glide [y]:

      • extremely low F1

      • extremely high F2, F3

Liquids & Approximants(cntd.)

  • Phone specific properties (cntd.)

    • Flap [ɾ]: soft burst, short duration

    • Retroflex [r]:

      • F3 dipping down close to F2

      • General lowering of F3, F4

    • Lateral [l]:

      • Low F1, F2 (approx. F1 250Hz, F2 1200Hz)

      • usually substantial energy in the high F region

Example –“led, red, wed, yell”

Final remarks

  • The spectrogram is not the only cue for acoustic distinction of speech sounds

  • Very often, the waveform is more reliable
