
Beyond F0: sentence modality and speech rate

Beyond F0: sentence modality and speech rate. Francesco Cangemi, Laboratoire Parole et Langage & Université de Provence, Aix-en-Provence. Outline. 1. Introduction 2. Material 3. Discrete analysis (phone durations) 3a. Hypotheses 3b. Method 3c. Results 3d. Discussion



  1. Beyond F0: sentence modality and speech rate. Francesco Cangemi, Laboratoire Parole et Langage & Université de Provence, Aix-en-Provence

  2. Outline 1. Introduction 2. Material 3. Discrete analysis (phone durations) 3a. Hypotheses 3b. Method 3c. Results 3d. Discussion 4. Continuous analysis (local phone rate) 4a. Hypotheses 4b. Method 4c. Results 4d. Discussion 5. Conclusions

  3. 1. Introduction • Until recent years, researchers mainly regarded speech rate as a phonetic feature falling outside the scope of the core form-function relations in language. • In many cases, e.g. in the study of intrinsic phone durations, speech rate has been treated as a source of noise to be controlled for or, in the worst cases, simply normalized away. • Other scholars see speech rate as an idiosyncratic feature (thus potentially useful in speaker verification applications) or as related to paralinguistic dimensions (as in the case of emotional speech). • Speech rate has also been considered an acoustic cue to stylistic variation or, from the perspective of conversation analysis, a resource for turn management.

  4. 1. Introduction • Efforts to relate speech rate directly to core modules of language structure remain quite rare and, moreover, usually lack explicitness. • For example, the hypothesis of a link between speech rate and pragmatic meaning has only been foreshadowed unsystematically: in isolated studies [1], on a few languages [2,3], or as a byproduct of analyses focused on other acoustic cues [1,3,4]. • This paper presents a production experiment on the effects of sentence modality (i.e. declarative vs. yes/no question) on speech rate in Neapolitan Italian, which is also the variety examined in [1,4]. [1] Maturi, P. (1988), L’intonazione delle frasi dichiarative ed interrogative nella varietà napoletana dell’Italiano, Rivista Italiana di Acustica, 12: 13–30. [2] van Heuven, V. & van Zanten, E. (2005), Speech rate as a secondary prosodic characteristic of polarity questions in three languages, Speech Communication, 47: 87–99. [3] Smith, C. L. (2002), Prosodic Finality and Sentence Type in French, Language and Speech, 45 (2): 141–178. [4] Petrone, C. (2008), Le rôle de la variabilité phonétique dans la représentation des contours intonatifs et de leur sens, PhD thesis, Université Aix-Marseille I.

  5. 2. Material • Since focalization is relevant both to pragmatic interpretation and to speech rate (through accenting and the consequent lengthening phenomena), different focus patterns were also included in the experimental design: contrastive narrow focus on {S, V, O}. • Neapolitan Italian read speech, recorded in a sound-treated booth. • 30 speakers x 54 utterances, each preceded by a contextualization paragraph, built from 3 highly controlled sentences: [σ.σ̀.σ]S [σ̀.σ]V [σ.σ̀.σ]O; σ = CV; entirely voiced; no diphthongs; no “rare phones”; S and O are fantasy names (lexical frequency control). • E.g. Ralego doma Boveda. • Segmentation performed using ASSI (Cangemi et al., 2011; see the 14h30 talk).

  6. Outline 1. Introduction 2. Material 3. Discrete analysis (phone durations) 3a. Hypotheses 3b. Method 3c. Results 3d. Discussion 4. Continuous analysis (local phone rate) 4a. Hypotheses 4b. Method 4c. Results 4d. Discussion 5. Conclusions

  7. 3a. Hypotheses • Previous studies (on various languages, with various methods) yield a very fragmented picture of duration across modality: no clear effects, no unified theories. • H1: duration difference between statements (S) and questions (Q) at the utterance level • H2: duration differences at the phrase/syllable/phone level

  8. 3b. Method • X-axis: phone position (see table above) • Y-axis: normalized duration (phone duration divided by the duration of the entire utterance) • Example across Focus conditions (figure not reproduced in the transcript).
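
A minimal sketch of the Y-axis normalization on slide 8, assuming that "normalized duration" means each phone's duration divided by the duration of the whole utterance. The function name and the millisecond values are hypothetical and are not taken from the study's data.

```python
def normalize_durations(phone_durations):
    """Return each phone duration as a fraction of the utterance duration.

    Assumption: the utterance duration is the sum of its phone durations
    (no pauses between phones), which suits fully segmented read speech.
    """
    total = sum(phone_durations)
    if total <= 0:
        raise ValueError("utterance duration must be positive")
    return [d / total for d in phone_durations]


# Hypothetical phone durations (ms) for one utterance, in phone order.
durations_ms = [60, 90, 55, 120, 70, 105]
normalized = normalize_durations(durations_ms)

# Pair each phone position (X-axis) with its normalized duration (Y-axis).
for position, value in enumerate(normalized, start=1):
    print(position, round(value, 3))
```

Because every value is a proportion of the same utterance, the normalized durations sum to 1, which makes utterances spoken at different overall tempos comparable phone by phone.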

  9. 3c. Results

  10. 3c. Results

  11. 3d. Discussion • Focus patterns and sentence modality both seem to cause lengthening. Full control of these factors is needed if comparisons with results from the literature are to be made. • Utterance duration is the same in statements and questions, but final vowel duration is longer in questions. Initial segments are longer in statements, but this does not apply to the S-Focus condition. • These results clearly point to the need for global rather than local metrics: DURATION → SPEECH RATE

  12. Outline 1. Introduction 2. Material 3. Discrete analysis (phone durations) 3a. Hypotheses 3b. Method 3c. Results 3d. Discussion 4. Continuous analysis (local phone rate) 4a. Hypotheses 4b. Method 4c. Results 4d. Discussion 5. Conclusions

  13. 4a. Hypotheses • For these reasons, the second part of the study employed a different metric for the assessment of speech rate, in order to capture global patterns of variation rather than pointwise differences localized in specific parts of the utterance. • This is in line with current developments in the analysis of other acoustic cues, as shown by recent quantitative studies which used Functional Data Analysis on F0 data [5,6]. • A global representation of speech rate variation should be more useful in disentangling Focus-induced and Modality-induced lengthening. Separate analysis of focus conditions is crucial. • H3: Q and S show globally different speech rate patterns (qualitative assessment) [5] Gubian, M., Cangemi, F. & Boves, L. (2010), Automatic and Data Driven Pitch Contour Manipulation with Functional Data Analysis, Proceedings of 5th Speech Prosody Conference (Chicago, May 11-14). [6] Gubian, M., Cangemi, F. & Boves, L. (2011), Joint analysis of F0 and speech rate with FDA, Proceedings of 36th ICASSP Conference (Prague, May 22-27).

  14. 4b. Method • LOCAL PHONE RATE: a continuous representation of variations in phone durations was calculated by revising some of the algorithms proposed in [7]. • X-axis: Normalized Utterance Duration • Y-axis: Local Phone Rate • Example across Focus conditions (figure not reproduced in the transcript). [7] Pfitzinger, H. (2001), Phonetische Analyse der Sprechgeschwindigkeit, Forschungsberichte des Instituts für Phonetik und Sprachliche Kommunikation der Universität München, pp. 117-264.
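
A simplified sketch of a local phone rate curve, written under stated assumptions: it slides a fixed-width window along the utterance, counts the (fractional) phones the window covers, and divides by the window length. It is not the revised algorithm used in the study, nor the exact procedure of [7]; the function name, the 0.3 s window and the phone boundaries are hypothetical.

```python
def local_phone_rate(boundaries, window=0.3, n_points=50):
    """Return (normalized_time, phones_per_second) pairs for one utterance.

    boundaries: strictly increasing phone edge times in seconds, so that
    n phones are described by n + 1 values.
    window: analysis window width in seconds (an assumed value).
    """
    if len(boundaries) < 2 or window <= 0 or n_points < 2:
        raise ValueError("need at least one phone, window > 0, n_points >= 2")
    if any(b <= a for a, b in zip(boundaries, boundaries[1:])):
        raise ValueError("boundaries must be strictly increasing")

    start, end = boundaries[0], boundaries[-1]
    curve = []
    for i in range(n_points):
        t_norm = i / (n_points - 1)            # X-axis: 0 = onset, 1 = offset
        centre = start + t_norm * (end - start)
        lo = max(start, centre - window / 2)   # clip the window at the edges
        hi = min(end, centre + window / 2)
        phones = 0.0
        for left, right in zip(boundaries, boundaries[1:]):
            overlap = min(hi, right) - max(lo, left)
            if overlap > 0:
                # A partly covered phone counts for its covered fraction.
                phones += overlap / (right - left)
        curve.append((t_norm, phones / (hi - lo)))  # Y-axis: phones per second
    return curve


# Hypothetical boundaries (s) for eight phones; the last phone is long,
# as a lengthened final vowel would be.
boundaries = [0.00, 0.06, 0.15, 0.21, 0.30, 0.37, 0.46, 0.52, 0.70]
curve = local_phone_rate(boundaries)
```

Sampling every utterance at the same normalized time points gives curves of equal length, so curves from different modality and focus conditions can be overlaid or averaged point by point.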

  15. 4c. Results • H3 confirmed: Q and S do show different Local Phone Rate curves. • Moreover, a comparison between S-Focus (left) and O-Focus (right) shows an INTERACTION between Focus- and Modality-induced lengthening.

  16. 4d. Discussion The results of the duration (§3) and speech rate (§4) analyses allow us to draw conclusions at different levels: • First of all, and most importantly, the existence of a link between speech rate and pragmatic contrasts is confirmed. • Speech rate in an utterance seems to be affected by both Focus and Modality. Focus-induced lengthening seems to be stronger in Declarative Modality (“interaction”). • In conclusion, it seems that sentence modality affects speech rate in a global (rather than local) way. Local Phone Rate extraction could be better suited than discrete (utterance/phrase/syllable/phone) duration measurements for research in this field.

  17. 5. Conclusions • We still need to master reliable statistical techniques for the analysis of functional data in linguistics; new studies are directly addressing this issue [5,6]. • Production studies such as the one presented here ought to be complemented by perception studies in order to achieve a better understanding of speech processing (phonetic detail, reinforcing cue, offline/online tasks). • The use of more spontaneous speech material could also represent an important phase of the theory validation process (Corpus DanSer as a first step). • Could the exploration of this link between pragmatics (sentence modality) and phonetics (speech rate) benefit from a more abstract phonological representation? [5] Gubian, M., Cangemi, F. & Boves, L. (2010), Automatic and Data Driven Pitch Contour Manipulation with Functional Data Analysis, Proceedings of 5th Speech Prosody Conference (Chicago, May 11-14). [6] Gubian, M., Cangemi, F. & Boves, L. (2011), Joint analysis of F0 and speech rate with FDA, Proceedings of 36th ICASSP Conference (Prague, May 22-27).

  18. Beyond F0: sentence modality and speech rate. Francesco Cangemi, Laboratoire Parole et Langage & Université de Provence, Aix-en-Provence
