Wolfgang Hess 60 years young


Wolfgang Hess 60 years young. Speech is beautiful. Louis C.W. Pols, Institute of Phonetic Sciences, University of Amsterdam. Bonn, Sept. 29, 2000. IKP, Bonn. IFA, Amsterdam. Speech is beautiful: it is the most natural form of communication, it is efficient, and it is highly complex and challenging.


Wolfgang Hess 60 years young

Speech is beautiful

Louis C.W. Pols

Institute of Phonetic Sciences

University of Amsterdam

Bonn, Sept. 29, 2000


IKP, Bonn

IFA, Amsterdam

Speech is beautiful
  • most natural form of communication
  • it is efficient
  • highly complex and challenging
  • towards multi- and interdisciplinary communities
  • natural speech synthesis → full knowledge
  • ASR lasting challenge
  • speech is extremely robust to distortions
  • speech is eloquent; singing; speeches are awful
  • speech community is nice
robustness to degraded speech
  • partly reversed speech (Saberi & Perrott, Nature, 4/99)
    • fixed-duration segments time-reversed or shifted in time
    • perfect sentence intelligibility for segment durations up to 50 ms
    • (demo: every 50 ms reversed; original)
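The local time reversal described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used for the demo or in the cited study: the function name `reverse_segments`, its parameters, and the toy signal are hypothetical, and the signal is assumed to be a plain list of samples.

```python
def reverse_segments(samples, sample_rate, segment_ms=50.0):
    """Time-reverse every consecutive fixed-duration segment of a signal.

    samples     -- sequence of audio samples (assumed: a plain Python list)
    sample_rate -- samples per second
    segment_ms  -- segment duration in milliseconds (50 ms as on the slide)
    """
    # Number of samples per segment, at least one.
    seg_len = max(1, int(round(sample_rate * segment_ms / 1000.0)))
    out = []
    for start in range(0, len(samples), seg_len):
        segment = samples[start:start + seg_len]
        # The order of the segments is kept; only the samples inside
        # each segment are played backwards.
        out.extend(reversed(segment))
    return out


if __name__ == "__main__":
    # Toy signal: 10 samples at 1000 Hz, reversed in 4 ms (4-sample) chunks.
    toy = list(range(10))
    print(reverse_segments(toy, sample_rate=1000, segment_ms=4))
```

Applying the function twice with the same segment length restores the original signal, which is a convenient sanity check.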

  • engineer by training
  • emphasis on signal processing (Munich)
  • pitch-synchronous spectral analysis
  • applied for phoneme and word recognition
  • and for voice detection and pitch extraction
  • speech synthesis (Bonn)
History, almost 30 yrs ago
  • 7th International Congress on Acoustics 1971, Budapest, Hungary
  • first international (speech) conference
  • Satellite Speech Symposium, Szeged
  • Hess, “Grundfrequenzsynchrone digitale Spektralanalyse von Sprachsignalen mit beliebig feiner Auflösung im Frequenzbereich”
    • also papers in German, and even in Russian
    • engineering interest in speech analysis
    • forthcoming specialization in sp. recogn. & pitch extr.
Budapest ICA
  • many influential people from international speech science community, already present there
  • topics at that time far away from our present interests in almost every respect:
    • topics and ambitions
    • approaches taken
    • type and size of data sets
  • see some names and topics (nostalgia!)
speech processing
  • Velichko (Russia): dynamic programming
  • Bishnu Atal: towards predictive coding
  • Sakoe (Japan): dynamic processing for time normalization
  • Osamu Fujimura:
    • dynamic palatography
    • electromyography (hooked-wire electrodes)
    • computer-controlled dynamic radiography (Tokyo x-ray microbeam generator)

  • Jim Flanagan: focal points in sp. comm. research
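The dynamic-programming idea for time normalization listed above is what later became known as dynamic time warping. A minimal textbook sketch follows; it is not the formulation of any of the 1971 papers. The function name `dtw_distance`, the scalar per-frame features, and the absolute-difference local cost are assumptions, and slope constraints are left out.

```python
def dtw_distance(a, b):
    """Accumulated cost of the best time alignment between sequences a and b.

    a, b -- sequences of scalar features (assumed; real systems would use
            spectral vectors per frame and a vector distance)
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    # A slower rendition of the same contour aligns at zero cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In a word-template recognizer of the kind discussed here, an unknown word would be compared against every stored template and assigned to the one with the smallest accumulated cost.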
speech synthesis
  • Cecil Coker: articulatory synthesis
  • Paul Mermelstein & Bishnu Atal: vocal transfer functions for speech synthesis
  • Johan Liljencrants: formant synthesis OVE III
  • Helmut Mangold: synthesis with a limited set of dynamic transitions
  • Werner Endress: synthesis via intermediate sounds
  • Peter Denes: word concatenation
  • Fujimura, Coker & Umeda: prosody in synthesis
  • Larry Rabiner: 2-pole digital filters for synthesis

“we were away a year ago”

speech recognition
  • Hans Tillmann (abs.): Bonner DAWID-II-system
  • Kasya, Kido, Krause & Tarnóczy: vowel recogn.
  • Velichko: 60 words
  • Rao: 225 VCV utterances, diad matching
  • Sakoe: 2300 isolated Japanese 10 digits
  • Dreyfus-Graf: artificial language
  • Erman: 54 isolated words over telephone
  • Neeley: 54 words recognition in noise
  • Pols: 50 Dutch words, stationary phoneme parts
  • Renato de Mori: zero crossings
speech perception, musical acoustics, psycho-acoustics
  • Rao: plosive-vowel interaction
  • Kozhevnikov: AM vowel-like stimuli
  • Ludmilla Chistovich: vowel discrimination
  • Johan Sundberg: pitch extraction of folk music
  • Max Mathews: music synthesis
  • Tammo Houtgast: lateral inhibition in psychoac.
  • Evans & Wilson: neurophysiological evidence
  • Bela Julesz: critical bands in vision and audition
  • Egbert de Boer: reverse-correlation method
Wolfgang’s further career
  • Dissertation in 1972: “Digitale grundfrequenzsynchrone Analyse von Sprachsignalen als Teil eines automatischen Spracherkennungssystems”
  • Masterpiece in 1983, a 698-page book: “Pitch Determination of Speech Signals: Algorithms and Devices”, published by Springer-Verlag

  • Chair in Phonetics in Bonn in 1986
  • publications, keynotes, conference organizer
ESCA/ISCA and Eurospeech
  • ESCA founded in 1988
  • Joseph Mariani first president (1988-1993)
  • Louis Pols 2nd president (1993-1997)
  • Wolfgang gave the final keynote at Eurospeech’97 in Rhodes
  • since Sept. 1997: Roger Moore president
  • since the death of Christian Benoit (April 25, 1998): Wolfgang secretary of ESCA
  • since Eurospeech’99 in Budapest → ISCA
ICA 1971
  • all speech analysis based on filters or formants
  • LPC was about to be introduced
  • all synthesis based on formant synthesis
  • diphone concept did not yet exist
  • virtually no attention for TTS synthesis-by-rule
  • all speech recognition based on word-template matching
  • probabilistic approach as yet unknown
  • vocabulary size of the order of 50 words only
present-day synthesis
  • mainly corpus-based concatenative synthesis with non-uniform units (e.g., CHATR, Festival, Next-Gen, Laureate, Bonner system)
  • large storage, optimal search
  • high naturalness and intelligibility
  • but... one speaker, one style, one application
  • room for further improvement
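The "optimal search" in corpus-based concatenative synthesis is commonly framed as a shortest-path problem over candidate units, scored by a target cost and a join cost. The sketch below shows that general idea only; it is not the search used in any of the systems named above. The function `select_units`, the representation of units as bare pitch values, and both cost functions are hypothetical.

```python
def select_units(targets, candidates, target_cost, join_cost):
    """Pick one candidate unit per target so the summed cost is minimal.

    targets     -- non-empty list of target specifications
    candidates  -- candidates[i] is the list of corpus units usable for targets[i]
    target_cost -- function(target, unit): mismatch between unit and target
    join_cost   -- function(prev_unit, unit): audibility of the concatenation
    Returns (total_cost, list_of_chosen_units).
    """
    # best[i][k] = (cheapest cost of a path ending in candidates[i][k],
    #               index of the predecessor on that path)
    best = [[(target_cost(targets[0], u), None) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            prev_cost, prev_k = min(
                (best[i - 1][k][0] + join_cost(p, u), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((prev_cost + target_cost(targets[i], u), prev_k))
        best.append(row)
    # Backtrack from the cheapest final unit.
    k = min(range(len(best[-1])), key=lambda j: best[-1][j][0])
    total = best[-1][k][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][k])
        k = best[i][k][1]
    path.reverse()
    return total, path


if __name__ == "__main__":
    # Toy example: units are just pitch values in Hz (an assumption).
    targets = [100, 110, 120]
    candidates = [[90, 105], [100, 130], [118, 150]]
    pitch_gap = lambda x, y: abs(x - y)
    print(select_units(targets, candidates, pitch_gap, pitch_gap))
```

With real corpora the candidate lists are large, which is why the choice of cost functions and the pruning of the search matter as much as the search itself.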
possible improvements
  • general or application-specific corpus
  • how to reduce storage requirements
  • annotation details at various levels
  • optimize search algorithms and cost functions
  • fewer prototypes, generate certain variants
  • preferable units, fall-back mechanism
  • new voice, speaking style, emotion, rate
  • can voice be personalized (cont.)
possible improvements (cont.)
  • how much manipulation in concatenation
  • combining stored speech and synthetic speech
  • better prosody (copy, concept, rules)
  • intonation modelling (discrete or continuous; detailed or sparse; signal oriented or linguistically meaningful)
  • concept for duration modelling
  • sentence accent and prominence
presently not very popular
  • formant synthesis (but see MITalk)
  • diphone and demisyllable synthesis (but see many operational systems: Dutch Fluency, German Hadifix, Multi-lingual Lucent TTS)
  • use of forms of parameterized speech (as soon as more manipulation is required again)
  • many voices, speaking styles, emotions, rates
  • importance of system evaluation (Jenolan Caves)
future for Wolfgang
  • being in the midst of new and challenging developments
  • to produce
  • (in the most efficient way)
  • the highest achievable quality of synthetic speech
  • (given specific dialogue applications)
  • is a large responsibility
  • but also a lot of fun to do (cont.)
future for Wolfgang (cont.)
  • Wolfgang and the IKP group enjoy doing this
  • for German and other languages
  • and like to report about it at international forums
  • it attracts many good students
  • these are excellent conditions for continuing this work

I wish Wolfgang and all his colleagues a lot of success in the years to come!