Factors Affecting Music Retrieval in Query-by-Melody
Christian Godi
FACTORS
• Accuracy of the Query Provider
• Query Transcription Accuracy of the Acoustic Front End
• Query Length
What is
• Query-by-Melody?
• Query Transcription?
• Length of a Query?
Architecture of QBM System
[Diagram: Query (signal) → Front-End → Query (transcription) → Back End → Ordered List of Song IDs; the Back End draws on the Melody Database]
SYSTEMS & DATABASES
• Cuby Hum Back End: performs an approximate match between a relatively short query transcription and a much longer monophonic melody. This match is based on a Dynamic Programming procedure that computes the Melody Distance.
• Acoustic Front Ends: 1. Solo Explorer 2. Ear Analyzer 3. MAMI
• The Databases: 1. Query Database 2. Melody Database
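The approximate match of a short query against a longer melody can be illustrated with a minimal dynamic-programming sketch. This is not the Cuby Hum procedure: the function name `melody_distance`, the representation of notes as integer pitch intervals, and the unit costs for every error type are assumptions made only for illustration.

```python
from typing import Sequence


def melody_distance(query: Sequence[int], melody: Sequence[int]) -> int:
    """Smallest edit distance between `query` and any contiguous part of `melody`.

    Assumption: both sequences are lists of pitch intervals in semitones and
    every substitution, insertion, or deletion costs 1.
    """
    # prev[j]: best cost of aligning the query items seen so far with a
    # melody segment that ends at position j. An empty query matches
    # anywhere at zero cost, so the match may start at any melody position.
    prev = [0] * (len(melody) + 1)
    for i in range(1, len(query) + 1):
        curr = [i] + [0] * len(melody)  # i query items vs. an empty segment
        for j in range(1, len(melody) + 1):
            substitution = prev[j - 1] + (query[i - 1] != melody[j - 1])
            deletion = prev[j] + 1       # query item without melody counterpart
            insertion = curr[j - 1] + 1  # melody item without query counterpart
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    # The match may also end at any melody position.
    return min(prev)
```

A back end built this way would compute the distance between the query and every melody in the database and return the melodies sorted by increasing distance.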
Errors
• The Cuby Hum engine distinguishes between the following errors:
1. Interval Change
2. Interval Transposition
3. Interval Insertion or Deletion
4. Note Insertion
5. Note Deletion
6. Duration Error
Evaluation Methodology
• Indicator of Query Transcription Accuracy (QTA): the indicator of QTA is based on a comparison of the automatically generated and the manually verified transcriptions of all the queries in a query database.
• Total Transcription Error (TTE):
$\mathrm{TTE} = \dfrac{\text{no. of deletions} + \text{insertions} + \text{substitutions}}{\text{no. of notes in manual transcriptions}}$
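The TTE ratio can be sketched in a few lines. The slides do not say how deletions, insertions, and substitutions are counted, so this example assumes they are taken from a minimum edit distance alignment between the two note sequences; the function name `total_transcription_error` and the use of MIDI pitch numbers for notes are likewise illustrative assumptions.

```python
from typing import Sequence


def total_transcription_error(automatic: Sequence[int],
                              manual: Sequence[int]) -> float:
    """(deletions + insertions + substitutions) / notes in the manual transcription.

    Assumption: the error count is the minimum edit distance between the
    automatic transcription and the manual reference.
    """
    if not manual:
        raise ValueError("manual transcription must contain at least one note")
    # prev[j]: edit distance between the automatic notes seen so far
    # and the first j manual notes.
    prev = list(range(len(manual) + 1))
    for i, auto_note in enumerate(automatic, start=1):
        curr = [i]
        for j, manual_note in enumerate(manual, start=1):
            curr.append(min(
                prev[j - 1] + (auto_note != manual_note),  # substitution or match
                prev[j] + 1,      # extra note in the automatic transcription
                curr[j - 1] + 1,  # manual note missing from the automatic one
            ))
        prev = curr
    return prev[-1] / len(manual)
```

For a manual transcription of four notes and an automatic transcription that drops one of them, this gives a TTE of 0.25.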
Indicator of Music Retrieval Accuracy
The QBM system is supposed to produce an output list of melodies, and a target is said to be retrieved correctly if at least one of its characterizing melodies appears in that list. An indicator of music retrieval accuracy that is independent of the output list size is the MEAN RECIPROCAL RANK (MRR):
$\mathrm{MRR} = \dfrac{1}{N_q} \sum_{i=1}^{N_q} \dfrac{1}{\mathrm{rank}_i}$
$N_q$ = the number of tested queries
$\mathrm{rank}_i$ = the position of the melody of the target of query $i$ in an output list of size $S_l = S_d$
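As a small worked example of the MRR formula, the sketch below averages the reciprocal ranks of the targets. The function name `mean_reciprocal_rank` is illustrative, and ranks are assumed to be 1-based positions in a list that covers the whole database, so every target has a rank.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all tested queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("at least one query rank is required")
    if any(rank < 1 for rank in ranks):
        raise ValueError("ranks must be 1-based positions")
    return sum(1.0 / rank for rank in ranks) / len(ranks)
```

Three queries whose targets appear at positions 1, 2, and 4 give an MRR of (1 + 0.5 + 0.25) / 3, roughly 0.58.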
Under the assumption that each target is characterized by one melody, the mean uncertainty about finding target $T_i$ ($i = 1, \ldots, S_d$) in the output list $L[q(T_i)]$ generated for some query $q(T_i)$ of that target can be computed as
$H(T_i \in L[q(T_i)]) = -E[\log P(T_i \in L[q(T_i)])]$
The Remaining Information Factor (RIF) is
$\mathrm{RIF} = \dfrac{\log P}{\log P_0} = \dfrac{\log P}{\log (S_l / S_d)}$
A QBM system that is capable of always putting the target melody on top of the output list yields RIF = 0, whereas a QBM system that behaves like a random system yields RIF = 1.
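The two boundary cases of the RIF can be checked numerically with a short sketch. It assumes that $P$ is supplied as a single probability of finding the target in the output list (how it is estimated from query results is not specified here) and that the list is shorter than the database; the function and parameter names are illustrative.

```python
import math


def remaining_information_factor(p_in_list: float,
                                 list_size: int,
                                 database_size: int) -> float:
    """RIF = log(P) / log(S_l / S_d).

    Assumptions: `p_in_list` is the probability P that the target is in the
    output list, and 0 < list_size < database_size so that log(S_l / S_d)
    is defined and non-zero.
    """
    if not 0.0 < p_in_list <= 1.0:
        raise ValueError("p_in_list must lie in (0, 1]")
    if not 0 < list_size < database_size:
        raise ValueError("list_size must satisfy 0 < list_size < database_size")
    p_random = list_size / database_size  # P_0, the random system
    return math.log(p_in_list) / math.log(p_random)
```

With a list of 10 melodies out of a database of 1000, P = 1 gives RIF = 0, P = 0.01 (the random system) gives RIF = 1, and P = 0.1 lies halfway at roughly 0.5.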
IMPACTS
• IMPACT OF USER PERFORMANCE: When supplied with perfect query transcriptions, the back end behaves like a perfect system with RIF = 0. When supplied with real-life queries, its performance degrades significantly.
• IMPACT OF THE FRONT END TRANSCRIPTION ACCURACY: Front ends with the highest transcription accuracies yield the highest music retrieval accuracies.
• IMPACT OF THE QUERY LENGTH
[Figure: RIF (vertical axis, 0.05 to 0.3) as a function of minimal query length L min in number of notes (horizontal axis, 10 to 45)]
Plotting the retrieval performance (RIF) of the different front-end/back-end combinations as a function of minimal query length shows that the performance differences caused by changing the front end remain equally important irrespective of this length. RIF starts to rise as soon as a query counts fewer than 20 notes.
Conclusion
• The first conclusion of this study is that the RIF is a robust and attractive indicator of the music retrieval accuracy of a QBM system.
• The second conclusion is that, due to the limited accuracy of the query provider, the music retrieval accuracy of a QBM system does not yet approach the perfect accuracy (RIF = 0) one could have hoped for.