Assessment of Vocal Noise via Bi-directional Long-term Linear Prediction of Running Speech

F. Bettens*, F. Grenez*, J. Schoentgen*,**
*Université Libre de Bruxelles
**National Fund for Scientific Research, Belgium

Existing Cues of Vocal Noise
• Detection of individual vocal cycles (or harmonics)
• Steady vowel fragments
• (Pseudo)-Periodicity
• Period Perturbation Quotient
• Amplitude Perturbation Quotient
• Harmonics-to-Noise Ratio

Objectives: Analyses of Dysperiodicities
• Drop the requirement that speech fragments be:
  • (Pseudo)-Periodic
  • Steady
• Analyze any speech fragment:
  • Modal Voices & (Very) Hoarse Voices
  • Sustained Vowels & Running Speech

Motivation: Analysis of Running Speech
• Voicing in running speech
• Variable acoustic impedance
• Voicing onsets & offsets
• Variable pressure drops
• Variable laryngeal positions
• Voice Loading

Double Linear Predictive Analysis
• Conventional short-term linear prediction: forward short-term prediction error eS[n]
• Long-term linear prediction: forward double prediction error eL[n]
• Remove existing correlations → unpredictable noise component (Qi, 1999)
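The two prediction stages named on this slide can be written out in standard linear-prediction notation. This is a sketch only: the predictor order M, the coefficients a_i and b_j, and the half-width K of the long-term predictor are assumptions introduced here for illustration, while x[n], eS[n], eL[n] and P follow the symbols used on the slides.

```latex
% Short-term stage: predict x[n] from its M most recent samples (M, a_i assumed).
e_S[n] = x[n] - \sum_{i=1}^{M} a_i \, x[n-i]

% Long-term stage: predict the short-term error from samples about P steps back
% (K, b_j assumed; K = 0 gives a one-tap predictor).
e_L[n] = e_S[n] - \sum_{j=-K}^{K} b_j \, e_S[n-P-j]
```

Under these assumptions the short-term stage removes correlations between neighbouring samples and the long-term stage removes correlations between neighbouring cycles, leaving eL[n] as the unpredictable component.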

Double Linear Predictive Analysis
Drawbacks:
• eS[n] is an artificial signal
• the dysperiodicities in the weighted sum of x[n] are omitted
• eL[n] is inflated to the right of unvoiced/voiced boundaries
Solutions:
• remove the short-term linear predictive analysis stage
• proceed to bi-directional analysis

Bi-directional Long-term Prediction
• Forward long-term linear prediction: forward long-term prediction error
• Backward long-term linear prediction: backward long-term prediction error
• Bi-directional long-term linear prediction: keep the “best” of the two (frame by frame) as the bi-directional long-term prediction error
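The frame-by-frame selection can be sketched in a few lines of Python. Several choices here are assumptions and do not come from the slides: the predictor has a single tap whose gain is fitted by least squares per frame, "best" is read as "lower error energy", samples outside the signal count as zero, and the names `one_tap_error` and `bidirectional_error` are hypothetical.

```python
import math

def one_tap_error(x, start, stop, shift):
    """Error of predicting x[n] from x[n + shift] for n in [start, stop).

    shift = -P predicts from the past (forward prediction),
    shift = +P predicts from the future (backward prediction).
    Assumption: samples outside the signal are treated as zero.
    """
    def ref(n):
        m = n + shift
        return x[m] if 0 <= m < len(x) else 0.0

    num = sum(x[n] * ref(n) for n in range(start, stop))
    den = sum(ref(n) ** 2 for n in range(start, stop))
    gain = num / den if den > 0.0 else 0.0  # least-squares one-tap gain
    return [x[n] - gain * ref(n) for n in range(start, stop)]

def energy(values):
    return sum(v * v for v in values)

def bidirectional_error(x, lag, frame_len):
    """Per frame, keep whichever of the forward/backward errors has less energy."""
    error = []
    for start in range(0, len(x), frame_len):
        stop = min(start + frame_len, len(x))
        forward = one_tap_error(x, start, stop, -lag)
        backward = one_tap_error(x, start, stop, +lag)
        error.extend(forward if energy(forward) <= energy(backward) else backward)
    return error

# Toy signal: 100 silent samples followed by a sine with a period of 50 samples.
toy = [0.0] * 100 + [math.sin(2 * math.pi * n / 50) for n in range(300)]
onset_forward = one_tap_error(toy, 100, 150, -50)   # first voiced frame, forward only
bidirectional = bidirectional_error(toy, lag=50, frame_len=50)
print(energy(onset_forward), energy(bidirectional))
```

On the toy signal, the forward error in the first voiced frame carries the full energy of the frame because there is nothing periodic to predict from on its left, which is the inflation at unvoiced/voiced boundaries listed among the drawbacks. The backward predictor covers that frame, so the bi-directional error stays near zero throughout.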

Long-term Prediction Distance: P
• Maximum of the auto-correlation function
• Example: steady vowel [a] (dysphonic speaker) → P = 184 (2 cycles)
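As a minimal sketch of this lag selection, the following Python picks the lag at which an unnormalised autocorrelation peaks. The search bounds `min_lag` and `max_lag` and the function names are assumptions added for illustration; the slide only states that P is taken at the maximum of the auto-correlation function.

```python
import math

def autocorrelation(x, lag):
    """Unnormalised autocorrelation of the sequence x at a non-negative lag."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

def estimate_lag(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] where the autocorrelation is largest.

    Assumption: the caller supplies a plausible search range; the slides do not
    say how the range was chosen.
    """
    return max(range(min_lag, max_lag + 1), key=lambda lag: autocorrelation(x, lag))

# Toy usage: a sine with a period of 50 samples.
toy = [math.sin(2 * math.pi * n / 50) for n in range(400)]
print(estimate_lag(toy, min_lag=20, max_lag=120))
```

For real dysphonic speech the maximum need not sit at a single cycle length, as the slide's example of P spanning 2 cycles shows, so the search range should cover more than one expected cycle.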

Vocal Noise Cue
• Signal-to-Dysperiodicity Ratio (SDR)
• Example: steady vowel [a]; speech signal x[n] and bi-directional long-term prediction error eL[n]
  • healthy speaker: SDR = 31.2 dB
  • dysphonic speaker: SDR = 10.1 dB
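The slide's formula for the ratio is not preserved in this transcript. A plausible reading, assumed here rather than confirmed by the source, is the energy of the speech signal x[n] divided by the energy of the prediction error eL[n], expressed in decibels. The function name below is hypothetical.

```python
import math

def signal_to_dysperiodicity_ratio(x, error):
    """Energy ratio of signal to prediction error, in dB (assumed definition)."""
    signal_energy = sum(s * s for s in x)
    error_energy = sum(e * e for e in error)
    if error_energy == 0.0:
        return math.inf      # perfectly predictable signal
    if signal_energy == 0.0:
        return -math.inf     # silent signal with a non-zero error
    return 10.0 * math.log10(signal_energy / error_energy)

# Toy usage: an error one tenth the amplitude of the signal.
toy_signal = [1.0] * 100
toy_error = [0.1] * 100
print(signal_to_dysperiodicity_ratio(toy_signal, toy_error))
```

Under this definition a larger SDR means less dysperiodicity, which matches the direction of the slide's example: the healthy speaker scores higher than the dysphonic one.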

Results 1: Sentence (1 female speaker; modal phonation type) (http://www.limsi.fr/VOQUAL/ : “Il est sorti avant le jour”)
[Figure: segments [il]; panels: speech signal, bi-directional long-term prediction error, forward long-term prediction error]

Results 2: Sentence (1 female speaker; 5 phonation types) (http://www.limsi.fr/VOQUAL/ : “Il est sorti avant le jour”)

Conclusion
• Forward and backward long-term prediction of speech enables the analysis of any speech signal with a view to assessing vocal noise (i.e. vocal dysperiodicities).
• The analysis does not rely on any assumption regarding the periodicity or stationarity of the speech signal.
