
Automatic Fluency Assessment



Presentation Transcript


  1. Automatic Fluency Assessment Suma Bhat

  2. The Problem
    • Language fluency
      • is a component of oral proficiency
      • is indicative of the effort of speech production
      • indicates the effectiveness of speech
    • Language proficiency testing
      • Automated methods of language assessment are of fundamental importance
    • Goal: automatic assessment of language fluency

  3. Why is it hard?
    • Fluency is a subjective quantity
    • Measuring fluency requires
      • choosing the right quantifiers
      • a means of measuring those quantifiers
    • Automatic scores should
      • correlate well with human assessment
      • be interpretable

  4. Automatic Speech Scoring
    • Automatic scoring of predictable speech
      • factual information in short answers (Leacock & Chodorow, 2003)
      • read speech: PhonePass (Bernstein, 1999)
    • Automatic scoring of unpredictable speech
      • spontaneous speech: SpeechRater (Zechner, 2009)

  5. State of the art
    • SpeechRater from Educational Testing Service (2008, 2009)
    • Uses ASR for automatic assessment of English speaking proficiency
    • In use as an online practice test for TOEFL internet-based test (iBT) takers since 2006

  6. Proficiency assessment in SpeechRater
    • Tests aspects of language competence
      • Delivery (fluency, pronunciation)
      • Language use (vocabulary and grammar)
      • Topical development (content, coherence, and organization)
    • Current system
      • Scores fluency and language use
      • Overall proficiency score is a combination of measures of fluency and language use
      • Multiple regression and CART scoring modules

  7. SpeechRater Architecture

  8. System
    • Speech recognizer
      • Trained on 40 hours of non-native speech
      • Evaluation set: 10 hours of non-native speech
      • Word accuracy: 50%
    • Feature set
      • Fluency: mean silence duration, articulation rate
      • Vocabulary: word types per second
      • Pronunciation: global acoustic model score
      • Grammar: global language model score

  9. Performance
    • Measured as human-computer score correlation
      • Multiple-regression-based scoring: 0.57
      • CART-based scoring: 0.57
    • Compare with inter-human agreement: 0.74
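Human-computer agreement here is a correlation between machine-assigned and human-assigned scores; a minimal illustration of how such a figure is computed (the score vectors below are made up for the example, not the study's data):

```python
import numpy as np

# Hypothetical human and machine scores for five test takers.
human = np.array([2.0, 3.0, 4.0, 3.5, 2.5])
machine = np.array([2.2, 2.9, 3.8, 3.6, 2.4])

# Pearson correlation, as reported on the slide (0.57 vs. 0.74).
r = np.corrcoef(human, machine)[0, 1]
```

A correlation of 0.57 against an inter-human ceiling of 0.74 means the machine recovers much, but not all, of the agreement two trained raters reach with each other.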

  10. Requirements
    • Superior-quality audio recordings for ASR training
    • Tens of hours of language-specific speech
    • Tens of hours of transcription
    • In short: language-specific resources

  11. Is this the end?
    • What if language-specific resources are scarce?
      • superior-quality audio recordings for ASR training
      • hours of language-specific speech
      • hours of transcription
    • If the tested language is a minority language, ASR performance is affected
    • Alternative methods must be sought

  12. Alternative method
    • Our approach (autorater)
      • makes signal-level measurements to obtain quantifiers of fluency
      • constructs a classifier based on 20-second segments of speech
      • requires no transcription

  13. Autorater
    Speech signal → Preprocessor → Feature Extractor → Classifier → Scorer → Fluency score
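The autorater pipeline on this slide can be sketched end to end; the stage bodies below are placeholders of my own (normalisation, a frame-energy feature, a logistic scorer), not the author's implementation:

```python
import numpy as np

def preprocess(signal, rate):
    """Placeholder preprocessor: normalise amplitude to [-1, 1]."""
    peak = max(float(np.max(np.abs(signal))), 1e-12)
    return signal / peak, rate

def extract_features(signal, rate):
    """Placeholder feature extractor: the fraction of 10 ms frames
    whose RMS energy exceeds a threshold, standing in for the real
    fluency quantifiers."""
    frame = rate // 100                      # 10 ms frames
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    energy = np.sqrt((frames ** 2).mean(axis=1))
    return np.array([(energy > 0.1).mean()])

def classify(features, weights, bias):
    """Logistic scorer mapping features to a fluency score in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-(features @ weights + bias)))

# Toy run on a 20-second synthetic segment at 16 kHz (the segment
# length matches the slide; the weights here are arbitrary).
rate = 16000
rng = np.random.default_rng(0)
signal = rng.standard_normal(20 * rate)
x, rate = preprocess(signal, rate)
score = classify(extract_features(x, rate), np.array([1.0]), 0.0)
```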

  14. Measurements
    • Convert stereo to mono and downsample to 16 kHz (using sox)
    • Extract pitch and intensity information (using Praat)
    • Segment the signal into speech and silence
    • Feature extraction
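The slide performs these steps with sox (channel mixdown, resampling) and Praat (intensity, segmentation). A rough numpy equivalent, simplified for illustration (the anti-alias filter and the intensity threshold below are crude stand-ins, not what those tools do internally):

```python
import numpy as np

def to_mono(stereo):
    """Average the two channels (sox handles this in the original)."""
    return stereo.mean(axis=1)

def downsample(signal, src_rate, dst_rate=16000):
    """Crude integer-factor decimation behind a moving-average
    anti-alias filter; sox resamples far more carefully."""
    factor = src_rate // dst_rate
    kernel = np.ones(factor) / factor
    smoothed = np.convolve(signal, kernel, mode="same")
    return smoothed[::factor]

def speech_silence_mask(signal, rate, frame_ms=10, thresh_db=-30.0):
    """Label each frame speech (True) or silence (False) by comparing
    its intensity to a threshold below the loudest frame."""
    frame = int(rate * frame_ms / 1000)
    n = len(signal) // frame
    frames = signal[: n * frame].reshape(n, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1)) + 1e-12
    db = 20 * np.log10(rms / rms.max())
    return db > thresh_db

# One second of synthetic stereo audio at 48 kHz: silence, then a tone.
rate = 48000
t = np.arange(rate // 2) / rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
channel = np.concatenate([np.zeros(rate // 2), tone])
mono = to_mono(np.stack([channel, channel], axis=1))
x = downsample(mono, rate)               # now at 16 kHz
mask = speech_silence_mask(x, 16000)     # False = silence, True = speech
```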

  15. Feature Extractor
    • dur1 = duration of speech without silent pauses
    • dur2 = total duration of speech
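Given a per-frame speech/silence mask, dur1 and dur2 follow directly. The sketch below also derives the quantifiers named on the next slide, reading PTR as phonation-time ratio (dur1/dur2), MLS as mean length of silences, and ROS as rate of speech; these expansions are my assumption from the fluency literature, the slides do not spell them out:

```python
import numpy as np

def fluency_quantifiers(mask, frame_ms=10.0, n_syllables=None):
    """Duration-based fluency quantifiers from a boolean
    speech/silence mask (True = speech frame)."""
    frame_s = frame_ms / 1000.0
    dur1 = float(mask.sum()) * frame_s     # speech without silent pauses
    dur2 = float(len(mask)) * frame_s      # total duration of speech
    out = {"dur1": dur1, "dur2": dur2, "PTR": dur1 / dur2}

    # Mean length of silences (assumed reading of "MLS"):
    runs, cur = [], 0
    for m in mask:
        if not m:
            cur += 1
        elif cur:
            runs.append(cur)
            cur = 0
    if cur:
        runs.append(cur)
    out["MLS"] = float(np.mean(runs)) * frame_s if runs else 0.0

    # Rate of speech (assumed reading of "ROS"), given a syllable count:
    if n_syllables is not None:
        out["ROS"] = n_syllables / dur2
    return out
```

For example, a mask of 80 speech frames followed by 20 silent frames (10 ms each) gives dur2 = 1 s, PTR = 0.8, and MLS = 0.2 s.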

  16. Classifier
    • Logistic regression model
    • Target scores: human-rated scores
    • Variables: measurements of the quantifiers (PTR, ROS, MLS)
    • Observed scores: real values between 0 and 1 (inclusive)
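Because the observed scores are real values in [0, 1] rather than hard class labels, the model amounts to a fractional-response logistic regression. A minimal gradient-descent fit, my sketch rather than the author's implementation (the synthetic quantifier data and true weights below are invented for the demo):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=1.0, epochs=5000):
    """Fit weights w and bias b by minimising the cross-entropy
    between predicted and observed fractional scores y in [0, 1]."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        grad = p - y                       # cross-entropy gradient
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Toy data: quantifier vectors (e.g. PTR, ROS, MLS) vs. scores
# generated from a known model, to check the fit recovers it.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = sigmoid(X @ np.array([2.0, -1.0, 0.5]) - 0.5)
w, b = fit_logistic(X, y)
```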

  17. Experiments
    • Three configurations of the classifier
    • Rater-independent model
      • Most general form
      • Rated utterances are considered independent of the raters
      • Does not account for individual rater bias
    • Rater-biased model
      • Adds binary features, one per rater, indicating individual rater bias
    • Rater-tuned model
      • One model per rater
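One way to realise the rater-biased configuration, under my reading of the slide, is to append a one-hot rater indicator to each utterance's quantifier vector so that a single linear model learns one bias term per rater:

```python
import numpy as np

def rater_biased_features(X, rater_ids, n_raters):
    """Append binary rater-indicator columns (one per rater) to the
    quantifier matrix X; a linear model over the result learns a
    per-rater bias on top of shared quantifier weights."""
    onehot = np.zeros((len(X), n_raters))
    onehot[np.arange(len(X)), rater_ids] = 1.0
    return np.hstack([X, onehot])

# Three utterances with two quantifiers each (e.g. PTR, ROS),
# scored by raters 0, 2, and 1 respectively (invented values).
X = np.array([[0.8, 3.1],
              [0.6, 2.4],
              [0.9, 3.5]])
ids = np.array([0, 2, 1])
Z = rater_biased_features(X, ids, n_raters=3)
```

The rater-tuned configuration would instead fit one such model per rater on that rater's utterances alone.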

  18. Quantifier-Score correlation (sig.)

  19. Results: Pilot rating (Trained)
    • Rater-tuned model
      • Best model: 0.197; worst model: 0.5
    • Inter-rater agreement: 48.2%

  20. ASR for our data

  21. Summary
    • Quantifiers obtained from low-level acoustic measurements are good indicators of fluency
    • Logistic regression models are appropriate for automated scoring of spontaneous speech
    • Main contribution
      • An alternative method of automatic fluency assessment
      • Useful in resource-scarce testing
    • Main result
      • A rater-biased logistic regression model for scoring fluency
