Scream and Gunshot Detection and Localization for Audio-Surveillance Systems

G. Valenzise*, L. Gerosa, M. Tagliasacchi*, F. Antonacci*, A. Sarti*. *Dipartimento di Elettronica e Informazione, Politecnico di Milano. IEEE Int. Conf. on Advanced Video and Signal-based Surveillance, 2007.



Presentation Outline

  • Description of the problem

  • System overview

  • Classification

    • GMM

    • Feature extraction

    • Feature selection

    • Experimental results

  • Localization

    • Time delay estimation

    • Source localization

    • Experimental results


Description of the Problem

  • Increasing need for safety in public places (e.g. squares):

    • High degree of criminality

    • Large number of video cameras installed

  → Aid the human control of video-surveillance systems by using the audio signal to detect and localize anomalous events (e.g. gunshots, screams) and to steer a video camera


General Classification of Events


Feature Extraction


Correlation Features: Example

Autocorrelation filtered in the frequency range 1000-2500 Hz
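One way to obtain an autocorrelation restricted to the 1000-2500 Hz band is to mask the power spectrum of a frame and transform it back. The sketch below assumes an ideal (brick-wall) band-pass mask and a normalized output; the slide does not say which filter the authors used, and the function name `bandlimited_autocorrelation` is hypothetical.

```python
import numpy as np

def bandlimited_autocorrelation(frame, fs, f_lo=1000.0, f_hi=2500.0):
    """Autocorrelation of `frame`, keeping only energy between f_lo and f_hi (Hz)."""
    n = 2 * len(frame)                       # zero-pad to avoid circular wrap-around
    spectrum = np.fft.rfft(frame, n=n)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    power = np.abs(spectrum) ** 2            # power spectrum (Wiener-Khinchin theorem)
    # Ideal band-pass mask: an assumption, not the filter from the presentation
    power[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    acf = np.fft.irfft(power, n=n)[:len(frame)]
    # Normalize so that the zero-lag value is 1 (when the band holds any energy)
    return acf / acf[0] if acf[0] > 0 else acf
```

For a frame that mixes a tone inside the band with one outside it, the result oscillates at the period of the in-band tone only, which is what makes features computed on it band-specific.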


Feature Selection

  • From the full set of features, we want a vector of l features that is:

    • Similar in discrimination power

    • Less computationally intensive

    • Resistant to overfitting

  • Filter-based feature vector construction

  • Wrapper-based feature vector selection
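A filter step followed by a wrapper step could be combined roughly as sketched below. This is only an illustration under stated assumptions: the filter criterion (a Fisher-style score), the wrapper classifier (nearest class mean) and the greedy forward search are all choices made here for brevity, not the criteria used in the presentation, and every function name is hypothetical.

```python
import numpy as np

def fisher_scores(X, y):
    """Filter criterion (assumed): between-class over within-class variance, per feature."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)

def nearest_mean_accuracy(X_train, y_train, X_val, y_val):
    """Wrapper criterion (assumed): validation accuracy of a nearest-class-mean classifier."""
    classes = np.unique(y_train)
    means = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_val[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(classes[dists.argmin(axis=1)] == y_val))

def hybrid_select(X_train, y_train, X_val, y_val, n_candidates, l):
    """Keep the n_candidates best-scoring features, then greedily pick l of them."""
    ranking = np.argsort(fisher_scores(X_train, y_train))[::-1]
    candidates = [int(j) for j in ranking[:n_candidates]]
    selected = []
    while len(selected) < l and candidates:
        # Add the candidate that gives the best validation accuracy with the current subset
        best = max(candidates, key=lambda j: nearest_mean_accuracy(
            X_train[:, selected + [j]], y_train, X_val[:, selected + [j]], y_val))
        selected.append(best)
        candidates.remove(best)
    return selected
```

The filter step is cheap because it scores each feature once; the wrapper step is more expensive because it retrains the classifier for every candidate subset, so it only runs on the reduced candidate list.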


Feature Selection: Example


Experimental Results: Classification at Different SNRs

[Figure panels: test SNR of 0 dB, 5 dB, 10 dB, 15 dB, and 20 dB]


Localization: Setup

  • Consider a T-shaped mic array

  • The center mic is taken as the reference

  • The localization problem can be split into two tasks:

    • Estimate the Time Differences of Arrival (TDOAs) between each mic and the reference mic

    • Estimate the source location from the TDOAs


Step 1: Time Delay Estimation

  • Acoustic model of the audio signal received at a pair of microphones: [equation not included in the transcript]

  • The TDE problem consists of estimating the time delay τ

  • Estimation method: Generalized Cross Correlation (GCC)

[Figure: signal waveform]
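A generalized cross-correlation estimate of the delay between the reference mic and another mic can be sketched as follows. The function name `gcc_tdoa` and its parameters are hypothetical, and the PHAT weighting is an assumed choice: the slide names GCC but not the weighting function. With `phat=False` the code reduces to plain cross-correlation.

```python
import numpy as np

def gcc_tdoa(x_ref, x_mic, fs, phat=True, max_tau=None):
    """Delay (in seconds) of x_mic relative to x_ref; positive if x_mic lags."""
    n = len(x_ref) + len(x_mic)           # zero-pad so the correlation is linear
    X_ref = np.fft.rfft(x_ref, n=n)
    X_mic = np.fft.rfft(x_mic, n=n)
    cross = X_mic * np.conj(X_ref)        # cross-power spectrum
    if phat:
        # PHAT weighting: keep only the phase (assumed weighting, see lead-in)
        cross = cross / (np.abs(cross) + 1e-12)
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that index 0 corresponds to lag -max_shift
    cc = np.concatenate((cc[n - max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs
```

Passing `max_tau` (for instance the mic spacing divided by the speed of sound) restricts the search to physically possible delays, which reduces the chance of picking a spurious peak at low SNR.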


Step 2: Source Localization

Linear-Correction Least Squares Localization (Huang & Benesty, 2004)
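The linear-correction least-squares method itself is not reproduced here. As a much simpler stand-in, the sketch below estimates only the direction of a far-field source from the TDOAs by ordinary least squares. It assumes a planar array, a plane-wave (far-field) model, a speed of sound of 343 m/s, and mic positions given relative to the reference mic; the function name and the example geometry are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value

def far_field_azimuth(mic_positions, tdoas, c=SPEED_OF_SOUND):
    """Azimuth (degrees) of a far-field source from TDOAs measured against the reference mic.

    mic_positions: (N, 2) coordinates of the non-reference mics, reference mic at the origin.
    tdoas: (N,) arrival time at each mic minus arrival time at the reference mic, in seconds.
    """
    P = np.asarray(mic_positions, dtype=float)
    tau = np.asarray(tdoas, dtype=float)
    # Plane-wave model: tau_i = -(p_i . u) / c, with u the unit vector toward the source,
    # so u is the least-squares solution of P u = -c * tau.
    u, *_ = np.linalg.lstsq(P, -c * tau, rcond=None)
    return float(np.degrees(np.arctan2(u[1], u[0])))

# Hypothetical T-shaped layout: two mics on the bar, one on the stem, reference at the origin
mics = np.array([[-0.3, 0.0], [0.3, 0.0], [0.0, -0.3]])
true_direction = np.array([np.cos(np.radians(40.0)), np.sin(np.radians(40.0))])
simulated_tdoas = -(mics @ true_direction) / SPEED_OF_SOUND
print(far_field_azimuth(mics, simulated_tdoas))
```

A far-field model recovers a bearing but not a range, which is enough to steer a camera toward the source; estimating the full source position requires a near-field method such as the one cited on the slide.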


Experimental Results: Localization – Threshold Effect

  • SNR > threshold → small TDOA estimation errors around the true time delay

  • SNR < threshold → large errors on TDOA estimation


Experimental Results: Localization – Angular Error


Conclusions & Future Work

  • Combined system yields a precision of 93% and a false rejection rate of 5% at 10 dB SNR

  • Hybrid feature selection makes it possible to select the most representative features effectively, with a reasonable computational effort
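For reference, the two figures of merit quoted in the first bullet can be computed from detection counts as sketched below. This assumes the usual definitions (precision as the share of raised alarms that are correct, false rejection rate as the share of true events that are missed); the slides do not spell the definitions out, and the counts in the example are hypothetical, not the authors' data.

```python
def precision(tp, fp):
    """Fraction of detected events that are real events."""
    return tp / (tp + fp)

def false_rejection_rate(tp, fn):
    """Fraction of real events that the detector misses."""
    return fn / (tp + fn)

# Hypothetical counts, chosen only to illustrate the arithmetic
tp, fp, fn = 90, 10, 5
print(f"precision = {precision(tp, fp):.2f}")
print(f"false rejection rate = {false_rejection_rate(tp, fn):.3f}")
```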

  Future extensions:

  • Fusion of multiple mic arrays into a sensor network → increased range and precision


References

  • M. Figueiredo and A. Jain, “Unsupervised learning of finite mixture models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 381–396, 2002.

  • C. Knapp and G. Carter, “The generalized correlation method for estimation of time delay,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 24, no. 4, pp. 320–327, 1976.

  • J. Chen, Y. Huang, and J. Benesty, Audio Signal Processing for Next-Generation Multimedia Communication Systems. Kluwer, 2004, ch. 4–5.

  • J. Ianniello, “Time delay estimation via cross-correlation in the presence of large estimation errors,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 30, no. 6, pp. 998–1003, 1982.


Thank you

