Centrum för talteknologi (Centre for Speech Technology)
Multi-modal expression of Swedish prominence
Björn Granström, Centre for Speech Technology, Department of Speech, Music and Hearing, KTH, Stockholm, Sweden
Historical background: prosody for speech synthesis at KTH, together with Rolf Carlson
Profs – Prosodic phrasing in Swedish ~1989-1992
Gösta Bruce, Björn Granström and more
First reference: G. Bruce and B. Granström. Modelling Swedish intonation in a text-to-speech system. STL-QPSR, 30(1):17-21, 1989. (on the KTH web)
Prosodiag - Prosodic Segmentation and Structuring of Dialogue (HSFR + NUTEK) 1993–1996
Gösta Bruce, Björn Granström, Kjell Gustafson, David House, Paul Touati
The object of study is the prosody of dialogue in a language technology framework. The primary goal of the project is to increase our understanding of how prosodic aspects of speech are exploited interactively in dialogue and, on the basis of this knowledge, to create a more powerful prosody model.
Late reference: Gösta Bruce, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, and David House. Prosodic segmentation and structuring of dialogue. TMH-QPSR, 37(3):1-6, 1996.
More than 20 joint publications – and then?
1 No eyebrow motion
2 Eyebrow motion controlled by the ... of the voice
3 Eyebrow motion at focal accents + ...
4 Eyebrow motion at the first focal accent + ...
“Jag heter Axel, inte Axell” (translation: “My name is Axel, not Axell”). In Sweden Axel is a first name as opposed to Axell, which is a family name.
Emotions
Visual prosodic functions
Formal experiment next: prominence due to eyebrow rise
5 content words: ”När pappa fiskar stör piper Putte” (translation: “When dad is fishing sturgeon, Putte is whimpering”)
No eyebrow movement (neutral)
Granström, House & Swerts (2002)
Automatic tracking of reflective spots in 3D (Qualisys)
left mouth corner
(Svanfeldt et al. 2003)
annotated head nods (adapted from Cerrato & Svanfeldt 2006)
Focal accent on: “Båten seglade förbi” (translation: “The boat sailed past”)
MPEG-4 Facial Animation Parameters (FAPs): a subset of 31 of the 68 FAPs defined in the MPEG-4 standard, including only those that we were able to calculate directly from our measured point data
Focal Motion Quotient, FMQ: the standard deviation of a FAP taken over a word in focal position, divided by the average standard deviation of the same FAP over the same word in non-focal position.
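The FMQ definition can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function name `focal_motion_quotient`, the representation of a FAP track as a plain list of per-frame values, and the use of the population standard deviation (the slides do not say whether population or sample deviation was used).

```python
from statistics import mean, pstdev


def focal_motion_quotient(focal_track, nonfocal_tracks):
    """Return the FMQ for one FAP and one word.

    focal_track: per-frame FAP values over the word in focal position.
    nonfocal_tracks: list of per-frame FAP value lists for the same word
        in non-focal position (one list per recording).
    Hypothetical helper; population standard deviation is an assumption.
    """
    if not nonfocal_tracks:
        raise ValueError("need at least one non-focal recording")
    # Standard deviation of the FAP over the focal word.
    focal_sd = pstdev(focal_track)
    # Average standard deviation of the same FAP over the non-focal versions.
    baseline_sd = mean([pstdev(track) for track in nonfocal_tracks])
    if baseline_sd == 0:
        raise ValueError("no movement in non-focal recordings; FMQ undefined")
    return focal_sd / baseline_sd


# Made-up example values: the focal track moves twice as much as the baseline.
focal = [0, 2, 4, 2, 0]
nonfocal = [[0, 1, 2, 1, 0], [1, 2, 3, 2, 1]]
print(focal_motion_quotient(focal, nonfocal))
```

Under this reading, an FMQ above 1 means the FAP varies more when the word is in focus than when it is not, and a value near 1 means focus makes little difference for that parameter.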
The focal motion quotient, FMQ, averaged across all sentences, for all measured MPEG-4 FAPs for several expressive modes
Figure axes: FMQ (Focal Motion Quotient); FAP groups: articulation, smile, brows, head
SIMULEKT - Simulering av svenskans prosodiska dialekttyper (Simulating intonational varieties of Swedish)