
Automatic Fluency Assessment



Presentation Transcript


  1. Automatic Fluency Assessment Suma Bhat

  2. The Problem
    • Language fluency
      • is a component of oral proficiency
      • is indicative of the effort of speech production
      • indicates the effectiveness of speech
    • Language proficiency testing
      • Automated methods of language assessment are of fundamental importance
    • Goal: automatic assessment of language fluency

  3. Why is it hard?
    • Fluency is a subjective quantity
    • Measuring fluency requires
      • choosing the right quantifiers
      • a means of measuring those quantifiers
    • Automatic scores should
      • correlate well with human assessment
      • be interpretable

  4. Automatic Speech Scoring
    • Automatic scoring of predictable speech
      • factual information in short answers (Leacock & Chodorow, 2003)
      • read speech: PhonePass (Bernstein, 1999)
    • Automatic scoring of unpredictable speech
      • spontaneous speech: SpeechRater (Zechner, 2009)

  5. State of the art
    • SpeechRater from Educational Testing Service (2008, 2009)
    • Uses ASR for automatic assessment of English speaking proficiency
    • In use as an online practice test for TOEFL internet-based test (iBT) takers since 2006

  6. Proficiency assessment in SpeechRater
    • Tests aspects of language competence
      • Delivery (fluency, pronunciation)
      • Language use (vocabulary and grammar)
      • Topical development (content, coherence, and organization)
    • Current system
      • Scores fluency and language use
      • Overall proficiency score is a combination of measures of fluency and language use
      • Multiple regression and CART scoring modules

  7. SpeechRater Architecture

  8. System
    • Speech recognizer
      • Trained on 40 hours of non-native speech
      • Evaluation set: 10 hours of non-native speech
      • Word accuracy: 50%
    • Feature set
      • Fluency: mean silence duration, articulation rate
      • Vocabulary: word types per second
      • Pronunciation: global acoustic model score
      • Grammar: global language model score

  9. Performance
    • Measured as human-computer score correlation
      • Multiple-regression-based scoring: 0.57
      • CART-based scoring: 0.57
    • Compare with inter-human agreement: 0.74
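Human-computer agreement here is a correlation between machine-assigned and human-assigned scores; a minimal illustration of how such a figure is computed (the score vectors below are made up for the example, not the study's data):

```python
import numpy as np

# Hypothetical human and machine scores for five test takers.
human = np.array([2.0, 3.0, 4.0, 3.5, 2.5])
machine = np.array([2.2, 2.9, 3.8, 3.6, 2.4])

# Pearson correlation, as reported on the slide (0.57 vs. 0.74).
r = np.corrcoef(human, machine)[0, 1]
```

A correlation of 0.57 against an inter-human ceiling of 0.74 means the machine recovers much, but not all, of the agreement two trained raters reach with each other.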

  10. Requirements
    • Superior-quality audio recordings for ASR training
    • Tens of hours of language-specific speech
    • Tens of hours of transcription
    • In short: language-specific resources

  11. Is this the end?
    • What if language-specific resources are scarce?
      • superior-quality audio recordings for ASR training
      • hours of language-specific speech
      • hours of transcription
    • If the tested language is a minority language, ASR performance is affected
    • Alternative methods must be sought

  12. Alternative method
    • Our approach (autorater)
      • makes signal-level measurements to obtain quantifiers of fluency
      • constructs a classifier based on 20-second segments of speech
      • requires no transcription

  13. Autorater
    Speech signal → Preprocessor → Feature Extractor → Classifier → Scorer → Fluency score
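The autorater pipeline on this slide can be sketched end to end; the stage bodies below are placeholders of my own (normalisation, a frame-energy feature, a logistic scorer), not the author's implementation:

```python
import numpy as np

def preprocess(signal, rate):
    """Placeholder preprocessor: normalise amplitude to [-1, 1]."""
    peak = max(float(np.max(np.abs(signal))), 1e-12)
    return signal / peak, rate

def extract_features(signal, rate):
    """Placeholder feature extractor: the fraction of 10 ms frames
    whose RMS energy exceeds a threshold, standing in for the real
    fluency quantifiers."""
    frame = rate // 100                      # 10 ms frames
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    energy = np.sqrt((frames ** 2).mean(axis=1))
    return np.array([(energy > 0.1).mean()])

def classify(features, weights, bias):
    """Logistic scorer mapping features to a fluency score in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-(features @ weights + bias)))

# Toy run on a 20-second synthetic segment at 16 kHz (the segment
# length matches the slide; the weights here are arbitrary).
rate = 16000
rng = np.random.default_rng(0)
signal = rng.standard_normal(20 * rate)
x, rate = preprocess(signal, rate)
score = classify(extract_features(x, rate), np.array([1.0]), 0.0)
```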

  14. Measurements
    • Convert stereo to mono and downsample to 16 kHz (using sox)
    • Extract pitch and intensity information (using Praat)
    • Segment the signal into speech and silence
    • Feature extraction
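The slide performs these steps with sox (channel mixdown, resampling) and Praat (intensity, segmentation). A rough numpy equivalent, simplified for illustration (the anti-alias filter and the intensity threshold below are crude stand-ins, not what those tools do internally):

```python
import numpy as np

def to_mono(stereo):
    """Average the two channels (sox handles this in the original)."""
    return stereo.mean(axis=1)

def downsample(signal, src_rate, dst_rate=16000):
    """Crude integer-factor decimation behind a moving-average
    anti-alias filter; sox resamples far more carefully."""
    factor = src_rate // dst_rate
    kernel = np.ones(factor) / factor
    smoothed = np.convolve(signal, kernel, mode="same")
    return smoothed[::factor]

def speech_silence_mask(signal, rate, frame_ms=10, thresh_db=-30.0):
    """Label each frame speech (True) or silence (False) by comparing
    its intensity to a threshold below the loudest frame."""
    frame = int(rate * frame_ms / 1000)
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1)) + 1e-12
    db = 20 * np.log10(rms / rms.max())
    return db > thresh_db

# One second of synthetic stereo audio at 48 kHz: silence, then a tone.
rate = 48000
t = np.arange(rate // 2) / rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
channel = np.concatenate([np.zeros(rate // 2), tone])
mono = to_mono(np.stack([channel, channel], axis=1))
x = downsample(mono, rate)               # now at 16 kHz
mask = speech_silence_mask(x, 16000)     # False = silence, True = speech
```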

  15. Feature Extractor
    • dur1 = duration of speech without silent pauses
    • dur2 = total duration of speech
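Given a per-frame speech/silence mask, dur1 and dur2 follow directly. The sketch below also derives the quantifiers named on the next slide, reading PTR as phonation-time ratio (dur1/dur2), MLS as mean length of silences, and ROS as rate of speech; these expansions are my assumption from the fluency literature, the slides do not spell them out:

```python
import numpy as np

def fluency_quantifiers(mask, frame_ms=10.0, n_syllables=None):
    """Duration-based fluency quantifiers from a boolean
    speech/silence mask (True = speech frame)."""
    frame_s = frame_ms / 1000.0
    dur1 = float(mask.sum()) * frame_s     # speech without silent pauses
    dur2 = float(len(mask)) * frame_s      # total duration of speech
    out = {"dur1": dur1, "dur2": dur2, "PTR": dur1 / dur2}

    # Mean length of silences (assumed reading of "MLS"):
    runs, cur = [], 0
    for m in mask:
        if not m:
            cur += 1
        elif cur:
            runs.append(cur)
            cur = 0
    if cur:
        runs.append(cur)
    out["MLS"] = float(np.mean(runs)) * frame_s if runs else 0.0

    # Rate of speech (assumed reading of "ROS"), given a syllable count:
    if n_syllables is not None:
        out["ROS"] = n_syllables / dur2
    return out
```

For example, a mask of 80 speech frames followed by 20 silent frames (10 ms each) gives dur2 = 1 s, PTR = 0.8, and MLS = 0.2 s.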

  16. Classifier
    • Logistic regression model
    • Target scores: human-rated scores
    • Variables: measurements of the quantifiers (PTR, ROS, MLS)
    • Observed scores: real values between 0 and 1 (inclusive)
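Because the observed scores are real values in [0, 1] rather than hard class labels, the model amounts to a fractional-response logistic regression. A minimal gradient-descent fit, my sketch rather than the author's implementation (the synthetic quantifier data and true weights below are invented for the demo):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=1.0, epochs=5000):
    """Fit weights w and bias b by minimising the cross-entropy
    between predicted and observed fractional scores y in [0, 1]."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        grad = p - y                       # cross-entropy gradient
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Toy data: quantifier vectors (e.g. PTR, ROS, MLS) vs. scores
# generated from a known model, to check the fit recovers it.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = sigmoid(X @ np.array([2.0, -1.0, 0.5]) - 0.5)
w, b = fit_logistic(X, y)
```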

  17. Experiments
    • Three configurations of the classifier
    • Rater-independent model
      • Most general form
      • Rated utterances are considered independent of the raters
      • Does not account for individual rater bias
    • Rater-biased model
      • Adds binary features, one per rater, indicating individual rater bias
    • Rater-tuned model
      • One model per rater
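One way to realise the rater-biased configuration, under my reading of the slide, is to append a one-hot rater indicator to each utterance's quantifier vector so that a single linear model learns one bias term per rater:

```python
import numpy as np

def rater_biased_features(X, rater_ids, n_raters):
    """Append binary rater-indicator columns (one per rater) to the
    quantifier matrix X; a linear model over the result learns a
    per-rater bias on top of shared quantifier weights."""
    onehot = np.zeros((len(X), n_raters))
    onehot[np.arange(len(X)), rater_ids] = 1.0
    return np.hstack([X, onehot])

# Three utterances with two quantifiers each (e.g. PTR, ROS),
# scored by raters 0, 2, and 1 respectively (invented values).
X = np.array([[0.8, 3.1],
              [0.6, 2.4],
              [0.9, 3.5]])
ids = np.array([0, 2, 1])
Z = rater_biased_features(X, ids, n_raters=3)
```

The rater-tuned configuration would instead fit one such model per rater on that rater's utterances alone.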

  18. Quantifier-Score correlation (sig.)

  19. Results: Pilot rating (Trained)
    • Rater-tuned model
      • Best model: 0.197; worst model: 0.5
    • Inter-rater agreement: 48.2%

  20. ASR for our data

  21. Summary
    • Quantifiers obtained from low-level acoustic measurements are good indicators of fluency
    • Logistic regression models are appropriate for automated scoring of spontaneous speech
    • Main contribution
      • An alternative method of automatic fluency assessment
      • Useful in resource-scarce testing
    • Main result
      • A rater-biased logistic regression model for scoring fluency
