A Music Search Engine Built upon Audio-based and Web-based Similarity Measures

P. Knees, T. Pohle, M. Schedl, G. Widmer

SIGIR 2007

INTRODUCTION
  • Almost all existing music search systems rely on manually assigned, subjective meta-information such as genre or style to index the underlying music collection.
    • Explicit manual annotations
    • A small set of meta-data
  • Recent approaches
    • Content-based analysis of the audio files
    • Collaborative recommendations
    • Incorporate information from different sources
RELATED WORK
  • Query-by-example
    • Query-by-Humming/Singing (QBHS)
    • Operate on MIDI
    • Music piece → Meta-data
  • Cross-media
    • Semantic ontology
  • Semantic relations
    • Crawler on “audio blogs”
    • Word sense disambiguation
    • Text surrounding the links to audio files
    • Last.fm – listening habits & tags
PREPROCESSING THE COLLECTION
  • ID3 tags
    • Artist
    • Album
    • Title
  • Ignored tracks
    • Speech-only pieces (e.g., skits on rap albums)
    • Intros / outros
    • Tracks shorter than 1 minute
WEB-BASED FEATURES
  • Queries to Google
    • “artist” music
    • “artist” “album” music review
    • “artist” “title” music review -lyrics
  • For each query, retrieve the 100 top-ranked pages
    • Remove HTML tags and stop words (in 6 languages)
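
The three query schemes above can be sketched as simple string templates. The function name `build_queries` and the dictionary keys are hypothetical; only the query patterns themselves come from the slide.

```python
def build_queries(artist: str, album: str, title: str) -> dict[str, str]:
    """Build the three Google query strings listed on the slide."""
    return {
        "artist": f'"{artist}" music',
        "album": f'"{artist}" "{album}" music review',
        # "-lyrics" asks the search engine to exclude pages
        # that contain the word "lyrics".
        "title": f'"{artist}" "{title}" music review -lyrics',
    }


# Placeholder metadata, as it would be read from a track's ID3 tags.
queries = build_queries("Some Artist", "Some Album", "Some Title")
```

Quoting the ID3 fields keeps multi-word artist, album, and title names together as phrases.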
WEB-BASED FEATURES (CONT.)
  • Term list of each music piece
    • Remove all terms with df_tm ≤ 2 (df_tm: document frequency of term t for piece m)
  • Global term list
    • Remove all terms that co-occur with fewer than 0.1% of the music pieces
  • Result: 78,000 terms (dimensions)
  • weight(t, m) = tf × idf
    • N: number of music pieces
    • mpf_t: music piece frequency of term t
  • Cosine normalization
    • Removes the influence of page length
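
A minimal sketch of the weighting and normalization steps. The slide only states tf × idf with N and mpf_t, so the log scaling of both factors is an assumption (a common tf-idf variant), and the function names are hypothetical.

```python
import math


def tfidf_weight(tf: int, mpf: int, n_pieces: int) -> float:
    """Weight of a term for one music piece: log-scaled tf times idf.

    The log scaling is an assumption; the slide only states tf * idf.
    """
    if tf <= 0 or mpf <= 0:
        return 0.0
    return (1.0 + math.log2(tf)) * math.log2(n_pieces / mpf)


def cosine_normalize(vector: list[float]) -> list[float]:
    """Scale a term vector to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vector))
    return [w / norm for w in vector] if norm > 0 else list(vector)


# Toy example: 3 terms for one piece, N = 8 pieces in the collection.
term_stats = [(4, 2), (1, 8), (2, 1)]  # (tf, mpf) per term
weights = [tfidf_weight(tf, mpf, 8) for tf, mpf in term_stats]
unit = cosine_normalize(weights)
```

A term that occurs in every piece (mpf = N) gets weight 0, and after normalization every piece's vector has length 1 regardless of how much text was retrieved for it.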
AUDIO-BASED SIMILARITY
  • MFCCs, Gaussian Mixture Models, KL divergence
  • Problems
    • Hubs: pieces that are frequently similar to others
    • Outliers: pieces that are never similar to others
    • Triangle inequality: not fulfilled
  • The authors' previous work solves these problems
AUDIO-BASED SIMILARITY (CONT.)
  • Always similar (hubs)
    • ndist(A) = distance from A to its n-th nearest neighbour
    • g(A, P_i) = D_basic(A, P_i) / ndist(P_i), for all i
    • Sort g(A, P_i) in ascending order and pick the n-th value as f(A)
    • D_n-NN-norm(A, B) = D_basic(A, B) / ( f(A) * f(B) )
  • Never similar (outliers)
    • Handled like the hubs above
  • Triangle inequality
    • Sort D_basic(A, P_i), for all i
    • Insert D_basic(A, B) into the sorted D_basic(A, P_i)
    • D_P(A, B) is the rank of D_basic(A, B) among the D_basic(A, P_i)
    • D_PV(A, B) = D_P(A, B) + D_P(B, A)
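
The two corrections above can be sketched on a small distance matrix. This is an illustration under assumptions, not the authors' implementation: a piece is excluded from its own neighbour list, all off-diagonal distances are positive, and the rank is counted as one plus the number of strictly closer pieces. All function names are hypothetical.

```python
def nth_smallest(values: list[float], n: int) -> float:
    """Return the n-th smallest value (n is 1-based)."""
    return sorted(values)[n - 1]


def others(a: int, size: int) -> list[int]:
    """Indices of all pieces except piece a (assumption: self is excluded)."""
    return [i for i in range(size) if i != a]


def nn_normalize(d: list[list[float]], n: int) -> list[list[float]]:
    """n-NN normalization; assumes all off-diagonal distances are positive."""
    size = len(d)
    # ndist(A): distance from A to its n-th nearest neighbour
    ndist = [nth_smallest([d[a][i] for i in others(a, size)], n)
             for a in range(size)]
    # f(A): n-th smallest of g(A, P_i) = D_basic(A, P_i) / ndist(P_i)
    f = [nth_smallest([d[a][i] / ndist[i] for i in others(a, size)], n)
         for a in range(size)]
    return [[d[a][b] / (f[a] * f[b]) for b in range(size)]
            for a in range(size)]


def proximity_verification(d: list[list[float]]) -> list[list[int]]:
    """D_PV(A, B) = rank of B as seen from A plus rank of A as seen from B."""
    size = len(d)

    def rank(a: int, b: int) -> int:
        # One plus the number of third pieces strictly closer to a than b is.
        closer = sum(1 for i in range(size)
                     if i not in (a, b) and d[a][i] < d[a][b])
        return 1 + closer

    return [[0 if a == b else rank(a, b) + rank(b, a) for b in range(size)]
            for a in range(size)]


# Toy symmetric distance matrix for four pieces.
d_basic = [
    [0, 1, 4, 9],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [9, 6, 3, 0],
]
d_norm = nn_normalize(d_basic, n=1)
d_pv = proximity_verification(d_basic)
```

Because D_PV is built from ranks rather than raw distances, it is symmetric by construction even when the two ranks differ.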
DIMENSIONALITY REDUCTION
  • χ² test
    • s: the 100 most similar tracks
    • d: the 100 most dissimilar tracks
    • Calculate χ²(t, s)
    • The N terms with the highest values are then joined into a global list
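
A minimal sketch of χ²-based term selection for one track, assuming the standard 2×2 contingency form of the statistic (term present or absent versus similar or dissimilar set). The slide does not spell out the formula, so this form, the function names, and the toy data are assumptions.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Standard 2x2 chi-square statistic.

    a: similar tracks containing the term     b: dissimilar tracks containing it
    c: similar tracks without the term        d: dissimilar tracks without it
    """
    n = a + b + c + d
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def top_terms(similar: list[set[str]], dissimilar: list[set[str]],
              n_terms: int) -> list[str]:
    """Return the n_terms terms that best separate the two track sets."""
    vocab = set().union(*similar, *dissimilar)
    scores = {}
    for term in vocab:
        a = sum(term in doc for doc in similar)
        b = sum(term in doc for doc in dissimilar)
        scores[term] = chi_square(a, b, len(similar) - a, len(dissimilar) - b)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:n_terms]


# Toy data: each set holds the terms associated with one track.
similar = [{"guitar", "loud"}, {"guitar", "riff"}]
dissimilar = [{"piano", "calm"}, {"piano", "loud"}]
best = top_terms(similar, dissimilar, 2)
```

Note that the statistic rewards terms that are typical of either set, so a term found only among the dissimilar tracks scores as high as one found only among the similar tracks. The global list would then be the union of the per-track results.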
VECTOR ADAPTATION
  • Particularly necessary for tracks where no related information could be retrieved from the web
  • Perform a simple smoothing
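
The slide does not specify the smoothing, so the following is only one plausible reading: a track's term vector is replaced by a weighted average of its own vector and the vectors of its nearest audio neighbours. The neighbour lists, the weights, and the name `smooth_vector` are all hypothetical.

```python
def smooth_vector(index: int, vectors: list[list[float]],
                  neighbours: dict[int, list[int]],
                  weights: list[float]) -> list[float]:
    """Weighted average of a track's vector and its audio neighbours' vectors.

    weights[0] applies to the track itself, weights[1:] to its neighbours
    in order of audio similarity (assumed weighting scheme).
    """
    members = [index] + neighbours[index]
    total = sum(weights[:len(members)])
    dim = len(vectors[index])
    return [
        sum(w * vectors[m][k] for w, m in zip(weights, members)) / total
        for k in range(dim)
    ]


# Track 0 has no web information (all-zero vector); tracks 1 and 2 are
# assumed to be its nearest neighbours according to audio similarity.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
neighbours = {0: [1, 2]}
smoothed = smooth_vector(0, vectors, neighbours, weights=[1.0, 0.5, 0.5])
```

In this toy case the empty vector of track 0 picks up term weights from its audio neighbours, which is the situation the slide singles out.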
QUERYING THE MUSIC SEARCH ENGINE
  • Original query + “music”
    • -site:last.fm
  • Google search
  • Retrieve the 10 top-ranked web pages
  • Map them to the vector space
  • Calculate Euclidean distances to the track vectors
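
The final ranking step can be sketched as follows, assuming the query pages have already been mapped to a vector in the same term space as the tracks. The track names, vectors, and function names are made up for illustration.

```python
import math


def euclidean(u: list[float], v: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def rank_tracks(query_vec: list[float],
                track_vecs: dict[str, list[float]]) -> list[str]:
    """Return track names ordered by increasing distance to the query."""
    return sorted(track_vecs,
                  key=lambda name: euclidean(query_vec, track_vecs[name]))


# Toy two-dimensional term space with three tracks.
track_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.6, 0.8]}
ranking = rank_tracks([0.0, 1.0], track_vecs)
```

With cosine-normalized vectors, ordering by Euclidean distance gives the same ranking as ordering by cosine similarity.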
AUDIOSCROBBLER GROUND TRUTH
  • Common approach
    • genre information
    • several drawbacks
  • http://www.audioscrobbler.net
    • Web services to access Last.fm data
    • Tag information provided by Last.fm
    • drawbacks
  • Use the top tags for tracks (227 tags in total)
PERFORMANCE EVALUATION
  • Dimensionality reduction
    • χ²/50 performs best
    • Baseline: random permutation
    • Passes the significance test

PERFORMANCE EVALUATION
  • Vector adaptation (re-weighting)
    • No significant difference

PERFORMANCE EVALUATION
  • Overall
  • Precision after 10 documents
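
Precision after 10 documents is the fraction of the first 10 results that are relevant. A minimal sketch, assuming a track counts as relevant when it carries the query's Last.fm tag; the function name and the toy data are hypothetical.

```python
def precision_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the top-k ranked items that are relevant."""
    top = ranked[:k]
    return sum(1 for item in top if item in relevant) / k


# Toy ranking of 12 tracks; "t12" is relevant but falls outside the top 10.
ranked = [f"t{i}" for i in range(1, 13)]
relevant = {"t1", "t3", "t8", "t12"}
p10 = precision_at_k(ranked, relevant)
```

Here three of the first ten tracks are relevant, so `p10` is 0.3.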
EXAMPLES

  • Example queries
    • Rock with great riffs
    • Punk
    • Relaxing music

FUTURE WORK
  • Pipeline diagram (repeated on each slide)
    • 12601 tracks
    • ID3 tag
    • Google search
    • Audio similarity
    • Web-based feature
    • Vector adaptation
    • Dimensionality reduction
    • Vector space
    • Query
    • Google search
    • Results
  • Open issues annotated on the diagram
    • Compilation albums, remixes (at the ID3 tag stage)
    • Lyrics
    • Indexing documents
    • PLSA (between web-based features and vector adaptation)
    • Computational inefficiency
    • Ground truth?