
Duraid Y. Mohammed, Philip J. Duncan, Francis F. Li

Audio Content Analysis in the Presence of Overlapped Classes - A Non-Exclusive Segmentation Approach to Mitigate Information Losses. Duraid Y. Mohammed, Philip J. Duncan, Francis F. Li. School of Computing Science and Engineering, University of Salford, UK. Global Summit and Expo on

Presentation Transcript


  1. Audio Content Analysis in the Presence of Overlapped Classes - A Non-Exclusive Segmentation Approach to Mitigate Information Losses. Duraid Y. Mohammed, Philip J. Duncan, Francis F. Li. School of Computing Science and Engineering, University of Salford, UK. Global Summit and Expo on Multimedia & Applications, August 10-11, 2015, Birmingham, UK.

  2. Introduction • The increasing volume of digital media archives is leading to increased demand for these goals.

  3. Classification: Challenge and Solution • Classical classification problems are logically exclusive, i.e. an element is assumed to be a member of one class and of that class only. • This hinders some practical uses in audio information mining, since a segment of the soundtrack can contain speech, music, event sounds or a combination of them (a fuzzy element). • Non-exclusive classification can mitigate information losses.
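The difference between the two decision rules on slide 3 can be sketched in a few lines. This is only an illustration, not the authors' classifier: the class names, the per-class scores and the 0.5 threshold are all assumed values chosen for the example.

```python
# Hypothetical per-class confidence scores for one audio segment.
# The scores and the threshold below are illustrative assumptions.

def exclusive_label(scores):
    """Classical exclusive decision: keep only the best-scoring class."""
    return max(scores, key=scores.get)


def non_exclusive_labels(scores, threshold=0.5):
    """Non-exclusive decision: keep every class whose score passes the threshold."""
    return {name for name, score in scores.items() if score >= threshold}


# A segment where speech is spoken over background music.
segment = {"speech": 0.81, "music": 0.77, "event": 0.05}

print(exclusive_label(segment))               # speech
print(sorted(non_exclusive_labels(segment)))  # ['music', 'speech']
```

With the exclusive rule the music label is discarded even though its score is almost as high as the speech score, which is the information loss the slide refers to.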

  4. The Concept • A system integration approach to audio information mining can hypothetically be built upon the success in the following diverse areas. • To re-deploy these tools, it is essential that a pre-processor effectively identify where speech, music and audio events of interest occur. • These audio segments can be further processed by dedicated algorithms to obtain further information. (Figure labels: "Door Knock", "Hello".)

  5. Universal Open Architecture

  6. Spectral Subtraction Algorithm • A noise reduction technique. • VAD is employed to detect musical speech and musical segments. • Calculate the spectral magnitude of the musical and musical speech segments. • Estimate the clean speech through the following formula.
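The formula on slide 6 did not survive in the transcript. As a rough sketch, the textbook power spectral subtraction rule is |S(k)|^2 = max(|Y(k)|^2 - alpha * |N(k)|^2, beta * |Y(k)|^2), where Y is the mixed (musical speech) spectrum and N is the interference estimate averaged over music-only frames. This may differ from the exact form the authors used; the over-subtraction factor `alpha`, the spectral floor `floor` and the function names are assumptions made for this example.

```python
import math


def mean_magnitude(frames):
    """Average magnitude spectrum over frames (e.g. music-only frames found by VAD)."""
    n_bins = len(frames[0])
    return [sum(frame[k] for frame in frames) / len(frames) for k in range(n_bins)]


def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Per-bin clean magnitude: sqrt(max(|Y|^2 - alpha*|N|^2, floor*|Y|^2))."""
    clean = []
    for y, n in zip(noisy_mag, noise_mag):
        power = y * y - alpha * n * n
        # The floor keeps the estimate non-negative when the music estimate
        # exceeds the observed power in a bin.
        clean.append(math.sqrt(max(power, floor * y * y)))
    return clean


# Toy two-bin magnitude spectra (illustrative numbers only).
music_frames = [[3.0, 1.0], [3.0, 3.0]]
music_estimate = mean_magnitude(music_frames)  # [3.0, 2.0]
clean = spectral_subtract([5.0, 1.0], music_estimate)
print(clean)
```

In the first bin the music power is subtracted directly; in the second the subtraction would go negative, so the floored value is used instead.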

  7. Feature Spaces • Data reduction. • Extract characteristic features. • Mel Frequency Cepstrum Coefficients (MFCCs). • STFT: temporal pattern analysis. • ZCR, RMS ("loudness"), entropy, short-term energy. • Optimized feature space for speech and music detection.
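Several of the time-domain features listed on slide 7 are simple to compute per frame. The sketch below uses common textbook definitions, which may not match the authors' exact implementation; in particular, "entropy" is interpreted here as the entropy of sub-block energies, and the block count of 4 is an assumption.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_term_energy(frame):
    """Mean of squared samples over the frame."""
    return sum(x * x for x in frame) / len(frame)


def rms(frame):
    """Root-mean-square level, a simple loudness proxy."""
    return math.sqrt(short_term_energy(frame))


def energy_entropy(frame, n_blocks=4):
    """Entropy (in bits) of the normalized energies of equal-length sub-blocks."""
    size = len(frame) // n_blocks
    energies = [sum(x * x for x in frame[i * size:(i + 1) * size]) for i in range(n_blocks)]
    total = sum(energies)
    if total == 0:
        return 0.0
    return -sum((e / total) * math.log2(e / total) for e in energies if e > 0)


# Toy frame: a square wave alternating every sample.
frame = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(zero_crossing_rate(frame))  # 1.0
print(rms(frame))                 # 1.0
print(energy_entropy(frame))      # 2.0
```

The energy is spread evenly over the four sub-blocks, so the entropy reaches its maximum of log2(4) = 2 bits; a frame with a single burst would score close to 0.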

  8. MARSYAS: Music Analysis, Retrieval and SYnthesis for Audio Signals • Open source framework for audio processing by George Tzanetakis, University of Victoria, Canada. • Development of real-time audio analysis and synthesis tools. • Audio processing system with specific emphasis on MIR. • Implemented for exclusive classification (speech or music). • Music genre organisation.

  9. Training Database Building • Speech and music classes are used as the starting point. • For better generalization, different styles of samples were included in the training set. • Speech samples (children, male, female, speakers of different languages, loud speech, speech with laughter). • Music samples from all genres (jazz, pop, classical, rock, …). • All speech and music samples were mixed together after normalization to produce speech-over-music samples.
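The last step on slide 9, mixing normalized speech and music into speech-over-music samples, might look roughly like the following. The slide does not say which normalization or mixing level was used, so peak normalization, the `music_gain` value of 0.5 and the function names are assumptions for illustration.

```python
def peak_normalize(signal, peak=1.0):
    """Scale a signal so that its largest absolute sample equals `peak`."""
    largest = max(abs(x) for x in signal)
    if largest == 0:
        return list(signal)
    return [x * peak / largest for x in signal]


def mix_speech_over_music(speech, music, music_gain=0.5):
    """Normalize both inputs, add music under speech, then renormalize the mix."""
    s = peak_normalize(speech)
    m = peak_normalize(music)
    length = min(len(s), len(m))  # truncate to the shorter clip
    mixed = [s[i] + music_gain * m[i] for i in range(length)]
    return peak_normalize(mixed)  # avoid clipping after summation


# Toy sample sequences (illustrative numbers only).
speech = [0.5, -0.25, 0.0]
music = [2.0, 2.0, -2.0, 1.0]
mixed = mix_speech_over_music(speech, music)
print(mixed)
```

Normalizing before mixing keeps the speech-to-music ratio controlled by `music_gain` alone, regardless of how loud the source recordings were.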

  10. Toolbox Demonstration

  11. Results: Comparison Before and After Speech Enhancement

  12. Summary and Conclusions • Open structure and common interfaces toward a general classifier. • Redeployment of currently available techniques. • Encourages third-party contributions. • Rapid prototyping of a UOA audio information mining system.

  13. Thank you for Listening

  14. Audio Routing

  15. Machine Learning

  16. Sound Event Detection

  17. ASR

  18. Role of MIR in UOA
