Distant speech recognition in smart homes initiated by hand clapping within noisy environments

Distant Speech Recognition in Smart Homes Initiated by Hand Clapping within Noisy Environments.

Florian Bacher & Christophe Sourisse

[623.400] Seminar in Interactive Systems


Agenda

  • Introduction

  • Methodology

  • Experiment Description

  • Implementation

  • Results

  • Conclusion


I. Introduction


Introduction

  • Smart homes have become a major field of research in information and communication technologies.

  • One possible mode of interaction: voice commands.

  • Goal of our experiment: evaluate the possibility of recognizing voice commands initiated by hand claps in a noisy environment.

  • Gather a set of voice commands uttered by various speakers.


II. Methodology


Methodology

  • Main method: Lecouteux et al. [1]

    • Deals with speech recognition within distress situations.

    • Problem: no background noise was considered.

  • Chosen methodology: adapt the protocol of Lecouteux et al. [1] to account for:

    • Noisy settings.

    • Initiating recognition using hand claps.


Methodological issues

  • Choice of the room setting

    • Lecouteux et al. [1]: a whole flat.

    • Vovos et al. [7]: one-room microphone array.

    • Choice: one room with 2 microphones.

  • Choice of background noises

    • Hirsch and Pearce [8]: NOISEX-92 database.

    • Moncrieff et al. [5]: “Background noise is defined as consisting of typical regularly occurring sounds.”

    • Choice: everyday household background noises.


III. Experiment Description


Experiment Settings

  • Performed in a 3 m × 3 m room.

  • Sounds were captured by two microphones which were hidden in the room.


Experimental Protocol

  • 20 participants (10 men, 10 women, aged 25.5 ± 11 years) took part in a two-phase experiment.

  • 1st phase: recognize a word (“Jeeves”) as a command

    • The system’s attention is caught by a double clap.

    • 4 scenarios.

    • Background noises tested: step noises, opening doors, moving chairs, radio show.

  • 2nd phase: gather a set of voice commands

    • List of 15 command words.

    • Reference recording for pronunciation issues.

    • Each word is uttered 10 times.


IV. Implementation


Implementation

  • Used technologies:

    • C# library System.Speech.Recognition: interface to the speech recognition engine used by Windows.

    • Microphones: Two dynamic microphones with cardioid polar pattern (Sennheiser BF812/e8155)

    • Line6 UX1 Audio Interface

    • Line6 Pod Farm 2.5


Implementation

  • Signal is captured in real time.

  • If there are exactly two signal peaks within a certain timeframe, the software classifies them as a double clap.

  • After a double clap has been detected, the actual speech recognition engine is activated (i.e. the software is waiting for commands).
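
The double-clap rule above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the authors' C# implementation: the amplitude threshold, the 1-second timeframe, the minimum gap between peaks, and the names find_peaks, is_double_clap and ClapGate are all hypothetical choices made for the example.

```python
from typing import List, Sequence


def find_peaks(samples: Sequence[float], sample_rate: int,
               threshold: float = 0.6, min_gap_s: float = 0.05) -> List[int]:
    """Return the sample indices at which a loud burst (peak) starts.

    A sample is "loud" if its absolute amplitude reaches the threshold.
    Loud samples closer than min_gap_s to the previous loud sample are
    treated as part of the same peak, so one clap is counted once.
    All default values are assumptions, not taken from the slides.
    """
    min_gap = round(min_gap_s * sample_rate)
    peaks: List[int] = []
    last_loud = None
    for i, x in enumerate(samples):
        if abs(x) < threshold:
            continue
        if last_loud is None or i - last_loud >= min_gap:
            peaks.append(i)
        last_loud = i
    return peaks


def is_double_clap(samples: Sequence[float], sample_rate: int,
                   window_s: float = 1.0) -> bool:
    """True if exactly two peaks fall within the first window_s seconds."""
    frame = samples[: round(window_s * sample_rate)]
    return len(find_peaks(frame, sample_rate)) == 2


class ClapGate:
    """Stays idle until a double clap is seen, then reports 'listening'."""

    def __init__(self, sample_rate: int) -> None:
        self.sample_rate = sample_rate
        self.listening = False

    def feed(self, frame: Sequence[float]) -> bool:
        # Once activated, the gate stays open; a real system would
        # presumably close it again after a command or a timeout.
        if not self.listening and is_double_clap(frame, self.sample_rate):
            self.listening = True
        return self.listening
```

In this sketch each captured frame is passed to ClapGate.feed, and speech recognition would only be started once it returns True. Requiring exactly two peaks means that a single bang or a burst of three or more loud events does not open the gate.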


V. Results


Classification of Results


General Results


Detailed Results


VI. Conclusion


Conclusion

  • A new idea of how to initiate speech recognition in human-computer interaction.

  • An evaluation of the potential influence of a noisy environment.

  • Results: encouraging, but not yet satisfying.

  • Next step: perform this experiment in a real smart-home context.


References

  • [1] B. Lecouteux, M. Vacher and F. Portet. Distant speech recognition in a smart home: comparison of several multisource ASRs in realistic conditions. Interspeech, 2011.

  • [2] A. Fleury, N. Noury, M. Vacher, H. Glasson and J.-F. Serignat. Sound and speech detection and classification in a health smart home. 30th Annual International IEEE EMBS Conference, Vancouver, British Columbia, Canada, August 2008.

  • [3] M. Vacher, N. Guirand, J.-F. Serignat and A. Fleury. Speech recognition in a smart home: Some experiments for telemonitoring. Proceedings of the 5th Conference on Speech Technology and Human-Computer Dialogue, pages 1 – 10, June 2009.

  • [4] J. Rouillard and J.-C. Tarby. How to communicate smartly with your house? Int. J. Ad Hoc and Ubiquitous Computing, 7(3), 2011.

  • [5] S. Moncrieff, S. Venkatesh, G. West, and S. Greenhill. Incorporating contextual audio for an actively anxious smart home. Proceedings of the 2005 International Conference on Intelligent Sensors, Sensor Networks and Information Processing, pages 373 – 378, Dec. 2005.

  • [6] M. Vacher, D. Istrate, F. Portet, T. Joubert, T. Chevalier, S. Smidtas, B. Meillon, B. Lecouteux, M. Sehili, P. Chahuara and S. Méniard. The sweet-home project: Audio technology in smart homes to improve well-being and reliance. 33rd Annual International IEEE EMBS Conference, Boston, Massachusetts, USA, 2011.

  • [7] A. Vovos, B. Kladis and N. Fakotakis. Speech operated smart-home control system for users with special needs. Proc. Interspeech 2005, pages 193 – 196, 2005.

  • [8] H.-G. Hirsch and D. Pearce. The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions. In ASR-2000, pages 181 – 188.


Thank you for your attention!

Questions

