Locating Cover Songs and Alternate Performances in Databases of Raw Audio

Locating Cover Songs and Alternate Performances in Databases of Raw Audio Robert Turetsky rjt72@columbia.edu Advent Workshop May 17, 2002 Technology enables “liquid music” Production Distribution Consumption Content-Based Analysis: Motivation


Locating Cover Songs and Alternate Performances in Databases of Raw Audio

Robert Turetsky

rjt72@columbia.edu

Advent Workshop

May 17, 2002

Technology enables “liquid music”

Production

Distribution

Consumption

Content-Based Analysis: Motivation
  • Search on file-sharing systems (e.g. KaZaA) relies on metadata
    • Metadata is prone to errors, omissions, and distortion
    • Only works if user already knows what to look for
  • Musical Content Analysis means:
    • Query by humming
    • Query by segment/prototype
    • Recommendation engines and artist discovery
    • Machine feedback/collaboration in composition
  • Locating cover songs is a first step
Locating Cover Songs: Prior Work
  • Query By Humming
    • Mature field (kiosks, applets) but limited to monophonic music or manually transcribed polyphonic music
  • Jonathan Foote (FX Palo Alto)
    • ARTHUR (2000): aligns RMS energy. Works only on orchestral music; pop music has less dynamic range.
    • Content-Based Retrieval of Music and Audio (1997). Measures acoustic similarity, not equivalence.
  • Cheng Yang (Stanford)
    • Music Database Retrieval Based on Spectral Similarity (2001). Aligns MFCCs at points of high energy using dynamic time warping (DTW).
    • MACS (2001). Aligns estimates of pitch likelihood. Indexing. “Bad” alignments discarded after linearity filter.
Why is locating cover songs so difficult?
  • Alternate performances can vary:
    • Studio vs. Live
      • Tempo (non-linear time shifting)
      • Pitch transposition
      • Production technique, acoustic character
      • Additions (e.g. audience interaction)
      • Alternate lyrics (e.g. Don’t Cry versions I and II)
    • Cover versions, artist re-interpretations
      • Vocalist, instrumentation, ornamentation
      • Entire character changes (e.g. Layla, dance remixes)
  • Yet we still know these songs are the same!
System Overview

  • Preprocessing
    • Locate Section Breaks
    • Identify Summary Sections
    • Pitch Extraction
  • Query
    • Tonic Estimation
    • Alignment

Phase 1: Locate Section Breaks
  • Employ Foote’s Similarity Matrix
  • Theory: Windows from the same section will have similar features. Windows from different sections will have dissimilar features.
  • Similarity Matrix: Cosine distance between every pair of fixed-width windows of the song
  • Novelty Score - measure of ‘newness’: correlation of a checkerboard matrix along the diagonal of the Similarity Matrix.
  • Section breaks are peaks in the Novelty Score.
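
The bullets above can be sketched in a few lines of NumPy. This is a minimal illustration, not the presenter's code: the feature vectors, the kernel half-width, the peak threshold, and all function names are assumptions chosen for the example. It uses cosine similarity (the complement of cosine distance), so same-section windows score high.

```python
import numpy as np

def similarity_matrix(features):
    """Cosine similarity between every pair of feature windows (one window per row)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    return unit @ unit.T

def checkerboard_kernel(half_width):
    """+1 in the two within-section quadrants, -1 in the two cross-section quadrants."""
    signs = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(signs, signs)

def novelty_score(sim, half_width):
    """Correlate the checkerboard kernel along the main diagonal of `sim`."""
    kernel = checkerboard_kernel(half_width)
    n = sim.shape[0]
    padded = np.pad(sim, half_width, mode="constant")  # zero-pad the edges
    score = np.zeros(n)
    for i in range(n):
        # Block covers original windows i - half_width .. i + half_width - 1
        block = padded[i:i + 2 * half_width, i:i + 2 * half_width]
        score[i] = np.sum(block * kernel)
    return score

def section_breaks(score, threshold):
    """Indices of local maxima of the novelty score above a (hypothetical) threshold."""
    return [i for i in range(1, len(score) - 1)
            if score[i] > threshold
            and score[i] > score[i - 1]
            and score[i] >= score[i + 1]]

# Toy input: 20 windows of one "section" followed by 20 windows of another.
features = np.vstack([np.tile([1.0, 0.0], (20, 1)), np.tile([0.0, 1.0], (20, 1))])
breaks = section_breaks(novelty_score(similarity_matrix(features), half_width=4), threshold=20.0)
```

In practice the rows of `features` would be spectral features of fixed-width audio windows; the toy vectors here only make the section change easy to see.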
Phase 2: Summary Segments


  • Motivation: Only transcribe and align salient segments
  • Measure of salience: Repetition
  • Method: Search for largest off-diagonal line in Similarity Matrix for each segment to measure extent of repetition (“score”)
  • Summary segment is most repeated section. Prune rows/columns of similar sections in score matrix. Repeat until 45-75 sec of audio is kept
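
The selection loop described above might look roughly like the following. It is a sketch under several assumptions: the pairwise repetition scores are taken as already computed (the off-diagonal line search itself is not shown) and normalized to 0..1, the `similar` pruning threshold is hypothetical, and a section is skipped if adding it would exceed the upper duration bound.

```python
import numpy as np

def pick_summary_sections(score, durations, min_total=45.0, max_total=75.0, similar=0.5):
    """Greedily keep the most-repeated sections until min_total seconds are kept.

    score[i][j]  -- assumed repetition score between sections i and j (0..1)
    durations[i] -- length of section i in seconds
    """
    score = np.array(score, dtype=float)
    np.fill_diagonal(score, 0.0)            # a section does not repeat itself
    active = np.ones(len(durations), dtype=bool)
    chosen, total = [], 0.0
    while active.any() and total < min_total:
        # Repetition of each active section, counted against active sections only
        totals = np.where(active, (score * active).sum(axis=1), -np.inf)
        best = int(np.argmax(totals))
        active[best] = False
        if total + durations[best] > max_total:
            continue                         # too long for the budget, try the next one
        chosen.append(best)
        total += durations[best]
        # Prune rows/columns of sections similar to the one just kept
        active &= score[best] < similar
    return chosen, total

# Toy song: sections 0 and 2 are the same verse, 1 and 3 the same chorus.
toy_score = [[0.0, 0.1, 0.9, 0.1],
             [0.1, 0.0, 0.1, 0.8],
             [0.9, 0.1, 0.0, 0.1],
             [0.1, 0.8, 0.1, 0.0]]
kept, seconds = pick_summary_sections(toy_score, [30, 25, 30, 25])
```

With this toy input the loop keeps one verse and one chorus, which is the intended effect of pruning similar sections before picking again.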

[Figure: matrix with rows and columns labeled Sec 1 to Sec 4; Section 1 and Section 4 are marked]

Phase 3: Pitch Extraction

  • Multi-pitch extraction algorithm based on Klapuri et al., 2001.
  • Works well, except in presence of drums.

[Block diagram stages: Noise Suppression; Predominant Pitch Estimation; Estimate Pitched Sound Characteristics; Estimate # Voices and Iterate; Remove Found Sound from Mixture]

[Plot: pitch versus time]

Phase 3: MPE Details

Noise Reduction: RASTA style filter

Predominant pitch estimation: “Fuzzy search” for harmonic peaks

Spectral Smoothing to estimate sound parameters

Resynthesis

Repeat on mixture after removal

Resynthesis
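
The estimate-and-remove loop can be illustrated with a deliberately simplified stand-in. This is not Klapuri's algorithm: it replaces the "fuzzy search" with a plain harmonic-sum salience, replaces spectral smoothing and resynthesis with a hard notch around each harmonic, and omits noise reduction entirely. All names, the candidate grid, and the thresholds are assumptions for the sketch.

```python
import numpy as np

def harmonic_salience(mag, bin_hz, f0, n_harmonics=5, tol_bins=1):
    """Sum the strongest magnitude found near each harmonic of candidate f0."""
    total = 0.0
    for h in range(1, n_harmonics + 1):
        k = int(round(h * f0 / bin_hz))
        if k + tol_bins >= len(mag):
            break
        total += mag[k - tol_bins:k + tol_bins + 1].max()
    return total

def iterative_pitches(mag, bin_hz, candidates, max_voices=4,
                      rel_threshold=0.3, n_harmonics=5, notch_bins=3):
    """Pick the predominant pitch, remove its harmonics, and repeat on the residual."""
    residual = mag.copy()
    found, first = [], None
    for _ in range(max_voices):
        sal = np.array([harmonic_salience(residual, bin_hz, f, n_harmonics)
                        for f in candidates])
        best = int(np.argmax(sal))
        if first is None:
            first = sal[best]
        if sal[best] < rel_threshold * first:
            break                              # crude "number of voices" stopping rule
        f0 = float(candidates[best])
        found.append(f0)
        for h in range(1, n_harmonics + 1):    # remove the found sound from the mixture
            k = int(round(h * f0 / bin_hz))
            residual[max(k - notch_bins, 0):k + notch_bins + 1] = 0.0
    return found

# Toy mixture: two harmonic tones at 220 Hz and 311 Hz, five partials each.
sr, n = 8000, 8192
t = np.arange(n) / sr
mix = sum(np.sin(2 * np.pi * h * 220 * t) / h for h in range(1, 6))
mix = mix + 0.6 * sum(np.sin(2 * np.pi * h * 311 * t) / h for h in range(1, 6))
mag = np.abs(np.fft.rfft(mix * np.hanning(n)))
pitches = iterative_pitches(mag, sr / n, np.arange(150.0, 500.0, 1.0))
```

A harmonic-sum salience like this is easily fooled by octave and subharmonic errors on real recordings, which is one reason the actual method smooths the estimated spectrum before subtracting it.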

Phase 4-5: Query-time alignment
  • Exhaustively align summary segments
  • Two alignments needed: Pitch and Time
  • Pitch Alignment: Tonic Estimation
    • Align two piano rolls at point of maximum cross-correlation between note histograms
  • Temporal Alignment: Dynamic Programming (Dynamic Time Warp)
    • Currently investigating different weights for rewarding note matches, penalizing mismatches
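
Both alignments can be sketched on toy note sequences. The sketch assumes each piano roll has been reduced to a sequence of MIDI note numbers (0 to 127), and the match reward, mismatch penalty, and skip penalty are placeholder values, since the slides say the weights were still under investigation. Function names are hypothetical.

```python
import numpy as np

def estimate_transposition(notes_a, notes_b, n_pitches=128):
    """Semitone shift to add to notes_b so its note histogram best matches notes_a."""
    hist_a = np.bincount(notes_a, minlength=n_pitches).astype(float)
    hist_b = np.bincount(notes_b, minlength=n_pitches).astype(float)
    xcorr = np.correlate(hist_a, hist_b, mode="full")
    # In "full" mode, index len(hist_b) - 1 corresponds to zero lag
    return int(np.argmax(xcorr)) - (n_pitches - 1)

def dtw_score(seq_a, seq_b, match=1.0, mismatch=-1.0, skip=-0.5):
    """Best alignment score by dynamic programming (higher means more alike)."""
    n, m = len(seq_a), len(seq_b)
    best = np.full((n + 1, m + 1), -np.inf)
    best[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            best[i, j] = step + max(best[i - 1, j - 1],        # advance both
                                    best[i - 1, j] + skip,     # stretch seq_b
                                    best[i, j - 1] + skip)     # stretch seq_a
    return float(best[n, m])

# Toy query: the same melody, transposed down 3 semitones and played slower.
original = np.array([60, 62, 64, 65])
cover = np.array([57, 57, 59, 61, 61, 62])
shift = estimate_transposition(original, cover)
score = dtw_score(original, cover + shift)
```

Estimating the shift first keeps the dynamic program simple: it only has to absorb tempo differences, not search over transpositions as well.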
Locating Cover Songs: Future Work
  • Indexing scheme, other alignment techniques to improve speed of query
  • Thematic extraction to find only melody or harmony lines
  • Include Beat Tracking as part of score
  • Investigate harmonic analysis (identifying chord structure) as a better feature
  • Speech recognition on lyrics???