Segmentation and Event Detection in Soccer Audio

Lexing Xie, Prof. Dan Ellis. EE6820, Spring 2001. April 24th, 2001.


Presentation Transcript


  1. Segmentation and Event Detection in Soccer Audio • Lexing Xie, Prof. Dan Ellis • EE6820, Spring 2001 • April 24th, 2001

  2. The problem • Event detection in sports video • In this project: the audio part • Our approach • Segmentation + Event Detection • Incorporate domain knowledge

  3. Outline • Related work • Observations on soccer audio • Segmentation • Features • Decision scheme • Result • Event detection • Scope • Feature metric • Result • Generalization • Next step

  4. Related Work • Audio segmentation • Speech-silence discrimination [Rabiner78] • Speech / music / mixture segmentation [Saunders96] [Scheirer97] [Williams99] • Sports audio analysis • Classify excited speech [Rui2000] • Keyword/event template matching [Chang96] [Rui2000]

  5. Observations #1 • Sound Types • Foreground speech • Noisy vocal sound with visible phoneme structure • Background noise • Ambient crowd, whistles, cheers, etc. • Acoustics [Fahy2001] • Sound intensity in open space: • Sound attenuation in air • Production conditions • Frequency response of microphone • Automatic Gain Control

  6. Observations #2 • Large variety across games • Commentator “verbosity” • Audience “excitability” → not labeling and training • In different languages → not ASR • Not template-matching & training • Assumptions on temporal characteristics • Short-term dynamics • Long-term variety

  7. Segmentation Algorithm • Commentary vs. Crowd segmentation • Pipeline: sound → Feature extraction (1st formant energy, fricative energy) → Decision rules (Energy > global avg. & adaptive threshold) → Post-processing (morphological operations) → Seg. boundary
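
The decision and post-processing stages of slide 7 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a per-frame band-energy contour has already been computed, and the form of the adaptive threshold (a fraction `k` of a moving average), the window sizes, and the function names are all hypothetical choices made for the example.

```python
import numpy as np

def dilate(mask, width):
    """1-D binary dilation: True if any frame within `width` frames is True."""
    kernel = np.ones(2 * width + 1)
    return np.convolve(mask.astype(float), kernel, mode="same") > 0.5

def erode(mask, width):
    """1-D binary erosion, written as the dual of dilation."""
    return ~dilate(~mask, width)

def segment_commentary(energy, local_win=51, k=0.5, width=2):
    """Return a boolean mask: True = commentary frame, False = crowd frame.

    `energy` must be longer than `local_win` frames.
    """
    energy = np.asarray(energy, dtype=float)
    # Rule 1: frame energy above the global average.
    above_global = energy > energy.mean()
    # Rule 2 (assumed form): frame energy above k times a moving average.
    local_avg = np.convolve(energy, np.ones(local_win) / local_win, mode="same")
    above_local = energy > k * local_avg
    mask = above_global & above_local
    # Morphological closing fills short dropouts inside commentary.
    mask = erode(dilate(mask, width), width)
    # Morphological opening removes short isolated bursts in the crowd.
    mask = dilate(erode(mask, width), width)
    return mask

def mask_to_segments(mask):
    """Turn a frame mask into (start, end, label) tuples, end exclusive."""
    edges = np.flatnonzero(np.diff(mask.astype(int))) + 1
    bounds = [0, *edges.tolist(), len(mask)]
    return [(s, e, "commentary" if mask[s] else "crowd")
            for s, e in zip(bounds[:-1], bounds[1:])]
```

The closing and opening steps are what keep a one-frame dropout or a one-frame burst from creating spurious segment boundaries; `width` controls the shortest gap or burst that survives.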

  8. Segmentation Result • [Figure: audio timeline with segments labeled commentary and crowd]

  9. Detection #1 • Detecting audio events in crowd noise • Examples: crowd cheering, whistle, … • Subjective definition • Pipeline: Seg. boundaries → Pick up crowd, chop into units → Feature calculation (Spectral: centroid, roll-off; Energies: E, Er1, Er2; feature contours and moments of the contours) → Distance metric → Most distinctive segment
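
As a rough illustration of the feature calculation on slide 9, the sketch below computes spectral centroid, spectral roll-off, and energy per frame, then summarizes each contour by its mean and standard deviation. The frame length, hop size, 85% roll-off fraction, and function names are assumptions made for the example. The slides do not define Er1 and Er2, so they are left out.

```python
import numpy as np

def frame_features(frame, sample_rate, rolloff_fraction=0.85):
    """Return (spectral centroid in Hz, roll-off frequency in Hz, mean energy)."""
    frame = np.asarray(frame, dtype=float)
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = power.sum()
    energy = float(np.mean(frame ** 2))
    if total == 0.0:
        return 0.0, 0.0, energy
    # Centroid: power-weighted mean frequency.
    centroid = float((freqs * power).sum() / total)
    # Roll-off: lowest frequency below which the chosen fraction of power lies.
    idx = np.searchsorted(np.cumsum(power), rolloff_fraction * total)
    rolloff = float(freqs[min(idx, len(freqs) - 1)])
    return centroid, rolloff, energy

def unit_features(unit, sample_rate, frame_len=256, hop=128):
    """Summarize one crowd unit by the mean and std of each feature contour."""
    unit = np.asarray(unit, dtype=float)
    starts = range(0, len(unit) - frame_len + 1, hop)
    contours = np.array([frame_features(unit[s:s + frame_len], sample_rate)
                         for s in starts])
    # Layout: [mean centroid, mean roll-off, mean energy, std of each].
    return np.concatenate([contours.mean(axis=0), contours.std(axis=0)])
```

Each crowd unit then becomes one fixed-length vector, which is the form a distance metric needs.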

  10. Detection #2 • Compute Mahalanobis distances [Duda 73] • Feature element normalization and decorrelation • Pick up distinctive segments • Largest distance to all other segments (typically top 5~10%) • Clustering: detecting outliers • Merge adjacent segments
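
The selection step on slide 10 might look something like the sketch below. It assumes one feature vector per segment, and it scores each segment by its mean Mahalanobis distance to all other segments; that aggregation, the function names, and the default fraction are assumptions for illustration, since the slides only say "largest distance to all other segments". Using the inverse covariance is what provides the feature normalization and decorrelation.

```python
import numpy as np

def distinctive_segments(features, top_fraction=0.1):
    """Return (sorted indices of the most distinctive segments, all scores)."""
    X = np.asarray(features, dtype=float)
    n = len(X)
    # Inverse covariance normalizes and decorrelates the feature elements.
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    inv_cov = np.linalg.pinv(cov)
    # Pairwise squared Mahalanobis distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    sq = np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)
    dist = np.sqrt(np.maximum(sq, 0.0))
    # Score (assumed aggregation): mean distance to every other segment.
    score = dist.sum(axis=1) / (n - 1)
    n_pick = max(1, int(round(top_fraction * n)))
    picked = np.sort(np.argsort(score)[-n_pick:])
    return picked, score

def merge_adjacent(indices):
    """Merge runs of consecutive segment indices into (first, last) pairs."""
    runs = []
    for i in indices:
        i = int(i)
        if runs and i == runs[-1][1] + 1:
            runs[-1][1] = i
        else:
            runs.append([i, i])
    return [tuple(r) for r in runs]
```

Note that outliers also inflate the covariance estimate, so with many outlying segments the distances become less discriminative; keeping the selected fraction small limits that effect.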

  11. Detection Results • The game: River Plate vs. Los Andes • Assumptions: • The majority are Unimportant • We do have Important parts! • Cluster analysis helps • [Figure: timeline, Time (sec) from 0 to 128, with tick marks at 49.1, 55.0, 95.2, 100.2 and event labels Start, Attacking.., Foul!, Penalty kick, GOAL!]

  12. Generalization • Segmentation tasks • Other Sports (baseball, tennis, etc.) • Film sound track (Sense and Sensibility) • Detection of sparse audio events • Surveillance video • [Figure: sound track segmented into Speech, Music, and Silence]

  13. Next step • More experiments • Improve decision scheme • Improve GMM in segmentation • Use cluster analysis in detection • New features • Wish list • Classification of speech segments • Other interesting noise patterns • Investigate sound mixtures

  14. Summary • Segmentation • Use energy features • Best result: precision 95%, recall 92% • Event detection • Use feature distance • Interesting segments retrieved • More work to follow

  15. Thanks!
