
Visualizing Audio for Anomaly Detection Research

This research aims to improve anomaly detection by making long audio recordings browsable: probabilistic features from acoustic event detection color-code audio segments, guiding analysts to the stretches worth closer study. Datasets include 24 hours of meeting room audio and 24 hours of airport audio, with data representations built on a multiscale FFT and Bayesian feature selection. Two testbeds are explored: a portable multi-day audio timeliner for emergency first responders, and analysis of a 1000-microphone dataset in a virtual reality theater. At the time of this snapshot, meeting room transcription was nearly complete, with the data representations and testbeds existing as separate prototypes awaiting integration.


Presentation Transcript


  1. Visualizing Audio for Anomaly Detection • Mark Hasegawa-Johnson, Camille Goudeseune, Hank Kaczmarski, Thomas Huang, David Cohen, Xiaodan Zhuang, Xi Zhou, and Kyung-Tae Kim

  2. Research Goals • Problem: microphones are cheap, yet they are rarely used in security installations. • Interactive browsing is difficult: audio is hard to browse much faster than real time. • Automatic acoustic event detection (AED) on its own is nearly useless: too many false alarms. • Proposal, the best of both worlds: use probabilistic AED features to color-code audio segments, guiding analysts to the pieces worth closer study (a sketch follows below).
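
As a toy illustration of the proposal (not the authors' tool), the sketch below colors half-second segments of a waveform by a per-segment anomaly probability. The signal and the probabilities in `anomaly_prob` are random stand-ins; in the real system the probabilities would come from the AED front end.

    import numpy as np
    import matplotlib.pyplot as plt

    # Stand-in signal: 10 s of noise at 16 kHz with one louder "anomalous" second.
    sr = 16000
    audio = 0.05 * np.random.randn(10 * sr)
    audio[6 * sr : 7 * sr] += 0.5 * np.random.randn(sr)
    t = np.arange(audio.size) / sr

    # Stand-in AED posteriors, one per half-second segment;
    # a real detector would supply these.
    seg = sr // 2
    anomaly_prob = np.random.rand(audio.size // seg)

    fig, ax = plt.subplots(figsize=(10, 2))
    ax.plot(t, audio, color="black", linewidth=0.3)
    for i, p in enumerate(anomaly_prob):
        # Green = probably normal, red = probably anomalous.
        ax.axvspan(i * seg / sr, (i + 1) * seg / sr,
                   color=plt.cm.RdYlGn_r(p), alpha=0.4)
    ax.set(xlabel="time (s)", ylabel="amplitude")
    plt.show()

The analyst then listens only to the red stretches, rather than auditioning the whole recording in real time.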

  3. Dataset #1: Meeting Room Audio • 30 annotators, 24 hours of data, 14 classes of acoustic events

  4. Dataset #2: Willard Airport • 24 hours of audio, "labeled" only by commercial airplane takeoff & landing records (inadequate!)

  5. Data Representations #1: Multiscale FFT • Problem: computing short-time Fourier transforms at every window size N1, N2, N3, … costs O(T(log2 N1 + log2 N2 + log2 N3 + …)) for an audio file of T samples. • Solution: X_N(2k) = X_{N/2,1}(k) + X_{N/2,2}(k), i.e., the even-indexed bins of a length-N transform are the sum of the transforms of the window's two halves, so each scale is assembled from the scale below instead of being recomputed from scratch.
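
The identity can be checked numerically; a minimal numpy sketch (not the authors' implementation):

    import numpy as np

    # Check the multiscale identity X_N(2k) = X_{N/2,1}(k) + X_{N/2,2}(k):
    # the even-indexed bins of a length-N DFT equal the sum of the DFTs of
    # the window's two halves, so longer-window spectra can be assembled
    # from shorter-window spectra that were already computed.
    N = 1024
    x = np.random.randn(N)

    X_full = np.fft.fft(x)              # length-N transform
    X_first = np.fft.fft(x[: N // 2])   # first half, length N/2
    X_second = np.fft.fft(x[N // 2 :])  # second half, length N/2

    assert np.allclose(X_full[0::2], X_first + X_second)
    print("Even bins of the N-point FFT = sum of the two N/2-point FFTs.")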

  6. Data Representations #2: Bayesian Feature Selection • Problem: the best features for nonspeech acoustic event detection are unknown (they differ from the best features for speech, and differ across classes of acoustic events). • Solution: select the best features from a large candidate pool according to a minimum-Bayes-risk selection criterion (a sketch follows below).
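
One simple reading of such a criterion (a sketch under assumptions, not the authors' method): greedily grow a feature subset, scoring each candidate feature by cross-validated error rate, an empirical proxy for Bayes risk. The synthetic data and the Gaussian naive-Bayes classifier here are stand-ins.

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import cross_val_score

    # Hypothetical data: rows = audio frames, columns = candidate features.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 20))
    y = (X[:, 3] + 0.5 * X[:, 7] > 0).astype(int)  # labels depend on features 3 and 7

    def greedy_min_risk_selection(X, y, n_keep=5):
        """Greedy forward selection: at each step, keep the feature whose
        addition yields the lowest cross-validated error rate, an empirical
        stand-in for Bayes risk under the classifier's model assumptions."""
        chosen = []
        remaining = list(range(X.shape[1]))
        while len(chosen) < n_keep:
            risks = {}
            for f in remaining:
                cols = chosen + [f]
                acc = cross_val_score(GaussianNB(), X[:, cols], y, cv=5).mean()
                risks[f] = 1.0 - acc  # estimated risk = 1 - accuracy
            best = min(risks, key=risks.get)
            chosen.append(best)
            remaining.remove(best)
        return chosen

    print(greedy_min_risk_selection(X, y))  # should recover features 3 and 7 early

Per-class selection, as the slide suggests, would simply repeat this for each acoustic event class against the rest.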

  7. Testbed #1: Portable Multi-Day Audio Timeliner • Dramatis Personae: Emergency first responders (EFRs) • Analysis Object: One microphone, one month • Act 1, Scene 1: EFRs arrive on scene, download surveillance audio to a handheld • Objective: Event diagnosis, prognosis, & management

  8. Testbed #2: 1000 Microphones = One Milliphone • Dramatis Personae: Command center data analysts • Analysis Object: 1000 microphones, 24 hours • Act 2, Scene 1: Analyst in a Virtual Reality Theater (the Beckman CUBE) seeks anomalies in a large dataset • Objective: Find the anomalies

  9. Conclusions: Current Status of this Research, August 18, 2009 • Results so far: meeting room audio transcription is nearly complete; the airport audio has no transcriptions yet; the data representations and testbeds exist as separate prototypes. • Ongoing research: integrate the data representations into the testbeds; create new data representations to improve testbed visualization; run formal human-subject tests.
