
CIS Division, NATO C3 Agency

The NATO Post-2000 Narrow Band Voice Coder: Test and Selection of STANAG 4591. Technical Presentation-001. CIS Division, NATO C3 Agency. Voice@nc3a.info.


Presentation Transcript


  1. The NATO Post-2000 Narrow Band Voice Coder: Test and Selection of STANAG 4591 • Technical Presentation-001 • CIS Division, NATO C3 Agency • Voice@nc3a.info

  2. Abstract and Conditions of Release • Abstract: The work described in this presentation was carried out under customer-funded projects 25.12.00 and N.25.12.00, conducted by NC3A on behalf of AC322(SC/6-AHWG/3). This presentation gives a general introduction to the work, which is documented in NC3A Technical Note-881 and NC3A Technical Memorandum-946. • Conditions of release: This presentation is a working paper that may not be cited as representing formally approved NC3A opinions, conclusions or recommendations.

  3. NBVC and NC3A • Customers: NATO Infrastructure Committee, voice coder developers, NATO Narrow Band Voice Coder Ad-Hoc Working Group • Host nation, customer funded • NC3A-NL, The Hague (scientific staff): set up voice coding testbed; process input data; blind and deblind data; support to AHWG NBVC, test labs and coder developers • NC3A-BE, Brussels (acquisition staff): equipment acquisition; contractual issues

  4. Introduction to STANAG 4591

  5. Background • Voice coding technology is constantly improving, driven by mobile telephony (narrow band, wireless channels) • New coders outperform existing NATO voice coders • STANAG 4198 - LPC10e: + low rate (2.4 kbps); - low speech quality; - low resilience to noise • STANAG 4209 - CVSD: + good resilience to noise; - poor speech quality in no noise; - high rate (16 kbps) • AHWG NBVC tasked by NATO to select a future Narrow Band Voice Coder for NATO use at 1.2 kbps and 2.4 kbps

  6. Voice Coders Tested • NATO requested candidates to be submitted by member nations • Three candidates submitted (each candidate operates at both 1.2k & 2.4k) • France: HSX (Harmonic Stochastic eXcitation) • Turkey: SB-LPC (Split-Band Linear Predictive Coding) • USA: MELP (Mixed Excitation Linear Prediction) • Plus LPC-10e (2.4k), CELP (4.8k) and CVSD (16k) as known reference coders

  7. Test Resources and Responsibilities • Project was ‘customer funded’ by NATO Infrastructure Committee and nations submitting coders • NC3A host nation, but worked with specialist speech processing labs • NC3A ran raw audio data through coders and ‘blinded’ all output • National test labs analysed raw audio from NC3A. Test labs were: TNO, NL; CELAR, FR; Arcon, US • NC3A impartially collated results (Photos: the TNO test laboratory at Soesterberg, NL; NATO data being analysed at TNO)

  8. NATO NBVC tests Phase 1 • Floating point vocoder implementations • Performance: intelligibility, quality • Noise conditions: quiet; modern office; acoustic noise (6 dB, 12 dB) • 5488 Mb of processed audio in 5848 files (Photo: a typical test booth where subjects listen to speech for analysis)

  9. Processing by NC3A • Input: one raw audio file, 8 kHz sample rate, 16 bit samples • Each coder encodes the input to a bitstream and decodes it again: LPC10e, CVSD, CELP, FR1200, FR2400, TU1200, TU2400, US1200, US2400 • Nine raw audio output files (numbered 1 to 9) • Sent to test labs for analysis
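The processing step on this slide can be sketched as a loop over the nine coders. This is only an illustrative sketch, not NC3A's tooling: the `passthrough` codec is a hypothetical stand-in for the real coder implementations, and little-endian byte order is an assumption, since the slide gives only the 8 kHz sample rate and 16-bit sample size.

```python
import struct
from typing import Callable, Dict, List, Tuple

SAMPLE_RATE_HZ = 8000    # from the slide
BYTES_PER_SAMPLE = 2     # 16-bit samples, from the slide

CODER_NAMES = ["LPC10e", "CVSD", "CELP", "FR1200", "FR2400",
               "TU1200", "TU2400", "US1200", "US2400"]

Samples = List[int]
# A codec is an (encode, decode) pair: samples -> bitstream -> samples.
Codec = Tuple[Callable[[Samples], bytes], Callable[[bytes], Samples]]


def unpack_pcm(raw: bytes) -> Samples:
    """Decode raw 16-bit PCM (little-endian is an assumption)."""
    count = len(raw) // BYTES_PER_SAMPLE
    return list(struct.unpack("<%dh" % count, raw[:count * BYTES_PER_SAMPLE]))


def pack_pcm(samples: Samples) -> bytes:
    """Encode samples back to raw 16-bit PCM."""
    return struct.pack("<%dh" % len(samples), *samples)


def duration_seconds(raw: bytes) -> float:
    """Playing time of a raw file at the slide's sample format."""
    return len(raw) / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


# Hypothetical stand-in codec: its "bitstream" is just the PCM bytes.
passthrough: Codec = (pack_pcm, unpack_pcm)


def process(raw_audio: bytes, codecs: Dict[str, Codec]) -> Dict[str, bytes]:
    """Run one raw input file through every codec; one raw output per coder."""
    samples = unpack_pcm(raw_audio)
    outputs = {}
    for name, (encode, decode) in codecs.items():
        bitstream = encode(samples)
        outputs[name] = pack_pcm(decode(bitstream))
    return outputs


raw_input = pack_pcm([0, 1000, -1000, 32767, -32768])
outputs = process(raw_input, {name: passthrough for name in CODER_NAMES})
```

With real coders in place of `passthrough`, `outputs` would hold the nine decoded files that go on to the blinding step.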

  10. Double blinding process • Decoded output files: LPC10e, CVSD, CELP, FR1200, FR2400, TU1200, TU2400, US1200, US2400 • Single blinded files, blinded by NC3A: Coder1 to Coder9 • Double blinded files, blinded by DSTL: Vocoder1 to Vocoder9 • Nine audio output files to test lab
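A double blinding scheme of this kind can be sketched as two independent relabelling keys held by two different parties. The sketch below is an assumption about the mechanics, not a description of the actual NC3A or DSTL procedure: the random assignment, the function names `blind` and `deblind`, and the fixed seeds (used only to make the example repeatable) are all illustrative.

```python
import random
from typing import Dict, List

CODERS = ["LPC10e", "CVSD", "CELP", "FR1200", "FR2400",
          "TU1200", "TU2400", "US1200", "US2400"]


def blind(names: List[str], prefix: str, rng: random.Random) -> Dict[str, str]:
    """Map each name to a randomly assigned label prefix1..prefixN."""
    labels = [f"{prefix}{i}" for i in range(1, len(names) + 1)]
    rng.shuffle(labels)
    return dict(zip(names, labels))


def deblind(label: str, first_key: Dict[str, str],
            second_key: Dict[str, str]) -> str:
    """Recover the real coder name; this needs BOTH keys."""
    inverse_second = {v: k for k, v in second_key.items()}
    inverse_first = {v: k for k, v in first_key.items()}
    return inverse_first[inverse_second[label]]


# First party relabels the decoded files as Coder1..Coder9.
first_key = blind(CODERS, "Coder", random.Random(1))
# Second party relabels those as Vocoder1..Vocoder9 without seeing real names.
second_key = blind(sorted(first_key.values()), "Vocoder", random.Random(2))

# Label under which each coder's output reaches the test lab.
lab_labels = {name: second_key[first_key[name]] for name in CODERS}
```

Because neither party holds both keys, neither can map a test lab result back to a coder alone.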

  11. Modulated Noise Reference Unit • MNRU is a standard method to apply known levels of noise. It provides known references against which listeners can compare vocoder outputs • Nine raw audio output files from the coders (1 to 9): LPC10e, CVSD, CELP, FR1200, FR2400, TU1200, TU2400, US1200, US2400 • Eight MNRU files made from one raw audio file (10 to 17): MNRU 5 dB, 10 dB, 15 dB, 20 dB, 25 dB, 30 dB, 35 dB, 40 dB • 17 raw audio output files in total • MNRU files go to test labs as references for analysing speech quality
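The idea behind an MNRU reference can be sketched as noise whose amplitude follows the speech signal, so that the signal-to-noise ratio Q is fixed in dB. The sketch below follows the general multiplicative form y = x(1 + 10^(-Q/20) N) associated with ITU-T P.810 but omits the filtering that the recommendation specifies; the test tone, the function names and the seeds are assumptions for illustration, not the references used in the tests.

```python
import math
import random
from typing import List

MNRU_LEVELS_DB = [5, 10, 15, 20, 25, 30, 35, 40]  # levels listed on the slide


def mnru(samples: List[float], q_db: float, rng: random.Random) -> List[float]:
    """Add signal-modulated Gaussian noise at a target ratio of q_db."""
    gain = 10 ** (-q_db / 20)
    return [x * (1 + gain * rng.gauss(0.0, 1.0)) for x in samples]


def measured_q_db(clean: List[float], degraded: List[float]) -> float:
    """Signal-to-modulated-noise ratio actually achieved, in dB."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, degraded))
    return 10 * math.log10(signal_power / noise_power)


# Two seconds of a 440 Hz tone at 8 kHz stands in for a speech file.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(16000)]
references = {q: mnru(tone, q, random.Random(q)) for q in MNRU_LEVELS_DB}
```

Low Q values give heavily degraded references and high Q values give nearly clean ones, which is what lets listeners place a vocoder output on a known scale.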

  12. NATO NBVC tests Phase II • Fixed point implementation: C plus ETSI libraries • Performance measurements: intelligibility, quality, speaker recognition • Language dependency: English, French, German, Dutch, Polish, Turkish • 10 acoustic noise environments • Transmission channel: 1% BER • Tandem: 16 kbps CVSD - vocoder • Whispered speech

  13. Phase 2 additional test conditions • Test configuration, 1% bit error rate: audio input file → Coder n → bitstream with 1% random bit errors → Decoder n → audio output file • Test configuration, voice coder tandem: audio input file → CVSD coder → bits → CVSD decoder → audio → Coder n → bits → Decoder n → audio output file
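Both test configurations can be sketched in a few lines. This is an illustrative sketch under stated assumptions: errors are taken to be independent per bit, and `inject_bit_errors`, `count_bit_errors` and `tandem` are hypothetical helper names, not part of the NC3A test harness.

```python
import random
from typing import Callable, List, Sequence, Tuple


def inject_bit_errors(bitstream: bytes, ber: float, rng: random.Random) -> bytes:
    """Flip each bit independently with probability ber (e.g. 0.01 for 1%)."""
    out = bytearray(bitstream)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < ber:
                out[i] ^= 1 << bit
    return bytes(out)


def count_bit_errors(sent: bytes, received: bytes) -> int:
    """Number of bit positions in which the two streams differ."""
    return sum(bin(a ^ b).count("1") for a, b in zip(sent, received))


Samples = List[int]
Stage = Tuple[Callable[[Samples], bytes], Callable[[bytes], Samples]]


def tandem(samples: Samples, stages: Sequence[Stage]) -> Samples:
    """Pass audio through several (encode, decode) pairs in sequence."""
    for encode, decode in stages:
        samples = decode(encode(samples))
    return samples


clean = bytes(30000)  # 240,000 zero bits as a stand-in bitstream
corrupted = inject_bit_errors(clean, 0.01, random.Random(0))
measured_ber = count_bit_errors(clean, corrupted) / (8 * len(clean))
```

For the tandem case, `stages` would hold the CVSD pair followed by the pair for coder n.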

  14. NATO NBVC tests - Phase 2 Noise Conditions • Phase 1 plus: HMMWV • Bradley Fighting Vehicle • Leclerc tank • Blackhawk helicopter • Mirage 2000 • F-15 • MCE field shelter • Volvo (staff car)

  15. NATO NBVC Phase 2 • 3 test labs × 9 coders (+ 8 MNRU levels) × ≤ 5 tests × ≤ 12 noise conditions × ≤ 88 files per test • Over 36,000 files • Over 30 GB of processed speech data • Approximately 500 hours of speech • Some voice coders ran approx 10 times real time
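The data volume on this slide can be cross-checked against the audio format given earlier in the presentation (8 kHz sample rate, 16 bit samples). The arithmetic below assumes single-channel audio and that the same format was used throughout Phase 2; neither is stated on the slide.

```python
SAMPLE_RATE_HZ = 8000      # from the processing slide
BYTES_PER_SAMPLE = 2       # 16-bit samples
HOURS_OF_SPEECH = 500      # approximate figure from this slide
FILE_COUNT = 36_000        # "over 36,000 files"

bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE        # 16,000 B/s
total_bytes = HOURS_OF_SPEECH * 3600 * bytes_per_second      # 28.8e9 B
total_gb = total_bytes / 1e9                                 # 28.8 GB
mean_file_seconds = HOURS_OF_SPEECH * 3600 / FILE_COUNT      # 50 s per file

# Uncompressed PCM bit rate compared with the two candidate coder rates.
pcm_bit_rate = bytes_per_second * 8                          # 128,000 bit/s
compression_at_2400 = pcm_bit_rate / 2400
compression_at_1200 = pcm_bit_rate / 1200

print(f"{total_gb:.1f} GB, {mean_file_seconds:.0f} s per file")
print(f"compression: {compression_at_2400:.1f}x at 2400 bps, "
      f"{compression_at_1200:.1f}x at 1200 bps")
```

About 29 GB for 500 hours is close to the slide's "over 30 GB", given that both inputs are rounded figures.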

  16. Need for Precision Weighted Ranking • Graphs show variation between intelligibility tests performed by the 3 test labs • General trends are the same • Absolute scores vary • Need to combine all results accurately and fairly • Simple scaling is not sufficient (Graphs: one point per coder: US24, CELP, FR24, CVSD, TU24, US12, LPC, TU12, FR12)

  17. Precision Weighted Ranking • Range of test results divided into segments or bins • The resolution (or 95% confidence interval) of the test determines bin size • Coder scores are determined by which bin their test result falls into • Worst coder always scores 1. In this test Vocoder 7 came last • Coders in subsequent intervals score bin number • Scores for vocoders 6, 8 and 9 were 4 - 5 confidence intervals above that of V7. They all score 5 (Figure: bins 1 to 7, each one confidence interval wide, running from score = 1 to score = 7)
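The binning rule on this slide can be sketched as follows. The test results and the confidence interval below are hypothetical values chosen to reproduce the slide's example (Vocoder 7 last; vocoders 6, 8 and 9 between 4 and 5 intervals above it). The sketch assumes that a higher test result is better and that bin 1 starts at the worst result; how scores are then combined across tests and labs is not shown.

```python
import math
from typing import Dict


def precision_weighted_scores(results: Dict[str, float],
                              confidence_interval: float) -> Dict[str, int]:
    """Score each coder by the bin its result falls into.

    Bins are one confidence interval wide and start at the worst
    result, so the worst coder always scores 1.
    """
    if confidence_interval <= 0:
        raise ValueError("confidence interval must be positive")
    worst = min(results.values())
    return {
        name: 1 + math.floor((value - worst) / confidence_interval)
        for name, value in results.items()
    }


# Hypothetical intelligibility results (%) and a 2-point confidence interval.
results = {"V1": 72.5, "V6": 68.5, "V7": 60.0, "V8": 69.0, "V9": 69.9}
scores = precision_weighted_scores(results, confidence_interval=2.0)
print(scores)
```

Coders whose results differ by less than the test can resolve end up in the same bin, so a lab's noisier test cannot separate them artificially.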

  18. Combined Performance Index

  19. Phase 2 Combined Performance Index • Selection made on combined scores at 2400 and 1200 bps • 60% - 2400 bps score • 40% - 1200 bps score (Chart: combined index for the US, FR and TU candidates)
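The 60/40 weighting amounts to a weighted sum of each candidate's two scores. In the sketch below only the weights come from the slide; the candidate names and score values are hypothetical placeholders, not the Phase 2 results.

```python
from typing import Dict, Tuple

WEIGHT_2400 = 0.6   # from the slide
WEIGHT_1200 = 0.4   # from the slide


def combined_index(score_2400: float, score_1200: float) -> float:
    """Weighted combination of a candidate's 2400 and 1200 bps scores."""
    return WEIGHT_2400 * score_2400 + WEIGHT_1200 * score_1200


# Hypothetical (2400 bps, 1200 bps) scores for three unnamed candidates.
candidates: Dict[str, Tuple[float, float]] = {
    "A": (10.0, 5.0),
    "B": (8.0, 9.0),
    "C": (6.0, 6.0),
}
indices = {name: combined_index(*pair) for name, pair in candidates.items()}
best = max(indices, key=indices.get)
print(indices, best)
```

In this made-up example candidate B wins on the combined index even though A scores higher at 2400 bps, which shows why both rates matter to the selection.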

  20. Phase 2 Combined Performance Index (Chart: combined index for all coders: US2400, CELP, FR2400, CVSD, TU2400, US1200, FR1200, TU1200, LPC10)

  21. Specific Results - Intelligibility • Results of all coders in all noise conditions (CVC test) (Graphs: intelligibility score (%) per coder: US24, CELP, FR24, CVSD, TU24, US12, LPC, TU12, FR12)

  22. Specific Results - Speech Quality • Range of Mean Opinion Score test: 1 (Bad), 2 (Poor), 3 (Fair), 4 (Good), 5 (Excellent) • Results of all coders in all noise conditions (MOS test) (Graphs: Mean Opinion Score per coder: US24, CELP, FR24, CVSD, TU24, US12, LPC, TU12, FR12)
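A Mean Opinion Score is the average of listener ratings on the five-point scale above. The sketch below also reports a 95% confidence half-width, since the ranking method described earlier depends on the resolution of each test; the normal approximation (factor 1.96) and the ratings themselves are assumptions for illustration, not data or methods taken from the test labs.

```python
import math
import statistics
from typing import List, Tuple

SCALE = {1: "Bad", 2: "Poor", 3: "Fair", 4: "Good", 5: "Excellent"}


def mean_opinion_score(ratings: List[int]) -> Tuple[float, float]:
    """Return (MOS, 95% confidence half-width) for a list of ratings."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r not in SCALE for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    mos = statistics.mean(ratings)
    # Normal approximation; a small panel would call for a t-distribution.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from eight listeners for one coder and condition.
mos, half_width = mean_opinion_score([4, 4, 5, 3, 4, 3, 4, 5])
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```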

  23. Specific Results - Language Dependency • Language dependency of all tested coders • The closer a point lies to the x=y diagonal, the less language dependent the voice coder
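Closeness to the x=y diagonal can be put as a number: the perpendicular distance of a point (x, y) from that line is |x - y| / √2. The transcript does not say what the two axes are, so the sketch assumes each axis is the same coder's score in a different language; the sample values are hypothetical.

```python
import math


def distance_from_diagonal(x: float, y: float) -> float:
    """Perpendicular distance of the point (x, y) from the line x = y."""
    return abs(x - y) / math.sqrt(2)


# Hypothetical scores for one coder in two languages.
same_in_both = distance_from_diagonal(70.0, 70.0)   # on the diagonal
language_gap = distance_from_diagonal(80.0, 60.0)   # off the diagonal
print(same_in_both, round(language_gap, 2))
```

A distance of zero means the coder scores identically in both languages; larger values mean stronger language dependency.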

  24. Current position • Phase 1: completed; results available in NC3A Technical Note-881 • Phase 2: all material processed and analysed; results collated; results analysed and blind removed • Coder selected on 24 October 2001 • STANAG 4591 known: MELPe

  25. NC3A - Current activity • Test Process Phase 3 • Real-time Implementation of Phase 2 winner • Communicability tests • real-life communication problem • end-to-end delay effects • Assist in drafting STANAG 4591 • Advise on the use and implementation of STANAG 4591

  26. STANAG 4591 vs COTS voice coders • COTS X = 6 kbps • COTS Y = 4.56 kbps • COTS X = 4.56 kbps • MELPe = 2.4 kbps • Male speaker • Female speaker

  27. Conclusion • STANAG 4591 provides • substantially improved performance • speech quality • intelligibility • noise immunity • reduced throughput requirements • interoperability

  28. Further information • STANAG 4591 test and selection process: • Street MD, “Future NATO narrow band voice coder selection: Stanag 4591”, NC3A Technical Note 881, The Hague, December 2001. http://nc3a.info/Voice • Street MD and Collura JS, “Interoperable Voice Communications: test and selection of STANAG 4591”, RTA IST Symposium - NATO Research and Technology Agency (Information Systems and Technology panel) Tactical Military Communications symposium, Warsaw, October 2001. http://www.rta.nato.int/IST.htm • Street MD and Collura JS, “The test and selection of the future NATO narrow band voice coder”, RCMCIS - NATO Regional Conference on Military CIS, Warsaw, Zegrze, October 2001. http://www.wil.waw.pl/ses3.htm • MELPe, the selected voice coder: • Collura JS and Rahikka DJ, “Interoperable secure voice communications in tactical systems”, IEE coll. on Speech coding algorithms for radio channels, London, February 2000. An overview of the MELP voice coder and its use in military environments. http://www.iee.org/OnComms/pn/communications • Collura JS, Rahikka DJ, Fuja TE, Sridhara D and Fazel T, “Error coding strategies for MELP vocoder in wireless and ATM environments”, IEE coll. on Speech coding algorithms for radio channels, London, February 2000. Performance of MELP with a variety of different error correction mechanisms. http://www.iee.org/OnComms/pn/communications

  29. Information and Source Code available from: http://elayne.nc3a.nato.int/S4591/ • Applied Communication Technologies Branch, CIS Division, NATO C3 Agency, PO Box 174, 2501 CD The Hague, The Netherlands • Tel: +31 70 374 3043 • Fax: +31 70 374 3049 • Email: voice@nc3a.info
