  1. Music Information Retrieval With Condor Scott McCaulay, Joe Rinkovsky, Pervasive Technology Institute, Indiana University

  2. Overview • PFASC is a suite of applications developed at IU to perform automated similarity analysis of audio files • Potential applications include organization of digital libraries, recommender systems, playlist generators, and audio processing • PFASC is a project in the MIR field, an extension and adaptation of traditional text information retrieval techniques to sound files • Elements of PFASC, specifically the file-by-file similarity calculation, have proven to be a very good fit with Condor

  3. What We’ll Cover • Condor at Indiana University • Background on Information Retrieval and Music Information Retrieval • The PFASC project • PFASC and Condor, experience to date and results • Summary

  4. Condor at IU • Initiated in 2003 • Utilizes 2,350 Windows Vista machines from IU’s Student Technology Clusters • Minimum 2 GB memory, 100 Mbps network • Available to students at 42 locations on the Bloomington campus, 24 x 7 • Student use is the top priority; Condor jobs are suspended immediately when a student uses the machine

  5. Costs to Support Condor at IU • Marginal annual cost to support the Condor pool at IU is < $15K • Includes system administration, head nodes, and file servers • Purchase and support of STC machines are funded from Student Technology Fees

  6. Challenges to Making Good Use of Condor Resources at IU • Windows environment • The research computing environment at IU is geared to Linux or to exotic architectures • Ephemeral resources • Machines are moderately to heavily used at all hours, so longer jobs are likely to be preempted • Availability of other computing resources • Local users are far from starved for cycles, so there is limited motivation to port

  7. Examples of Applications Supported on Condor at IU • Hydra Portal (2003) • Job submission portal • Suite of Bio apps, Blast, Meme, FastDNAml • Condor Render Portal (2006) • Maya, Blender video rendering • PFASC (2008) • Similarity analysis of audio files

  8. Information Retrieval - Background • The science of organizing documents for search and retrieval • Dates back to the 1880s (Hollerith) • Vannevar Bush, first US presidential science advisor, presages hypertext in “As We May Think” (1945) • The concept of automated text document analysis, organization, and retrieval was met with a good deal of skepticism until the 1990s. Some critics now grudgingly concede that it might work

  9. Calculating Similarity: The Vector Space Model • Each feature found in a file is assigned a weight based on the frequency of its occurrence in the file and how common that feature is in the collection • Similarity between files is calculated based on common features and their weights. If two files share features not common to the entire collection, their similarity value will be very high • This vector space model (Salton) is the basis of many text search engines, and also works well with audio files • For text files, features are words or character strings. For audio files, features are prominent frequencies within frames of audio or sequences of frequencies across frames
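
A minimal sketch of this weighting and similarity computation in Python; the function names and the toy data layout (a dict mapping file names to feature lists) are illustrative assumptions, not PFASC code:

```python
import math
from collections import Counter

def tf_idf_vectors(files):
    """files: dict mapping file name -> list of features (words for text,
    frame-level frequency features for audio). Features common to the
    whole collection get a weight near zero."""
    n = len(files)
    df = Counter()                      # how many files contain each feature
    for feats in files.values():
        df.update(set(feats))
    vectors = {}
    for name, feats in files.items():
        tf = Counter(feats)             # occurrences within this file
        vectors[name] = {f: tf[f] * math.log(n / df[f]) for f in tf}
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse weight vectors, in [0.0, 1.0]."""
    dot = sum(u[f] * v[f] for f in set(u) & set(v))
    norm = math.sqrt(sum(w * w for w in u.values())) \
         * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0
```

With this weighting, a feature that appears in every file contributes nothing, so a high score requires shared features that are rare across the collection, as the slide describes.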

  10. Some Digital Audio History • A 10-second digital audio clip uploaded to CompuServe in October 1985 was one of the most popular downloads at the time! • Time to download (300 baud): 20 minutes • Time to load: 20 minutes (tape), 2 minutes (disk) • Storage space: 42K • From this to Napster in less than 15 years

  11. Explosion of Digital Audio • Digital audio today is similar to text 15 years ago • Poised for a second phase of the digital audio revolution? • Ubiquitous, easy to create, access, and share • Lacking tools to analyze, search, or organize

  12. How can we organize this enormous and growing volume of digital audio data for discovery and retrieval?

  13. What’s done today • Pandora - Music Genome Project • expert manual classification of ~400 attributes • Allmusic • manual artist similarity classification by critics • last.fm – Audioscrobbler • collaborative filtering from user playlists • iTunes Genius • collaborative filtering from user playlists

  14. What’s NOT done today • Any analysis (outside of research) of similarity or classification based on the actual audio content of song files

  15. Possible Hybrid Solution • A classification/retrieval system could use elements of all three methods (automated analysis, user behavior, manual metadata) to improve performance

  16. Music Information Retrieval • Applying traditional IR techniques for classification, clustering, similarity analysis, pattern matching, etc. to digital audio files • A recent field of study, which has accelerated since the inception of the ISMIR conference in 2000 and the MIREX evaluation in 2004

  17. Common Basis of an MIR System • Select a very small segment of audio data, 20-40 ms • Use a fast Fourier transform (FFT) to convert it to frequency data • This ‘frame’ of audio becomes the equivalent of a word in a text file for similarity analysis • The output of this ‘feature extraction’ process is input to various analysis or classification processes • PFASC additionally combines prominent frequencies from adjacent frames to create temporal sequences as features (sketched below)
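
A rough Python sketch of this frame-based extraction, assuming a mono PCM signal in a NumPy array; the frame length, peak count, and function names are illustrative choices rather than the PFASC implementation:

```python
import numpy as np

def extract_frame_features(signal, sample_rate, frame_ms=25, n_peaks=4):
    """FFT each short frame and keep its most prominent frequency bins;
    each frame's bin set plays the role of a word in a text file."""
    frame_len = int(sample_rate * frame_ms / 1000)
    window = np.hanning(frame_len)      # taper frames to reduce leakage
    features = []
    for i in range(len(signal) // frame_len):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        spectrum = np.abs(np.fft.rfft(frame * window))
        top_bins = np.argsort(spectrum)[-n_peaks:]   # strongest bins
        features.append(tuple(sorted(int(b) for b in top_bins)))
    return features

def temporal_sequences(frame_features, span=2):
    """Combine adjacent frames' features into temporal sequence features."""
    return [tuple(frame_features[i:i + span])
            for i in range(len(frame_features) - span + 1)]
```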

  18. PFASC as an MIR Project • Parallel Framework for Audio Similarity Clustering • Initiated at IU in 2008 • Team includes the School of Library and Information Science (SLIS), Cognitive Science, the School of Music, and the Pervasive Technology Institute (PTI) • Has developed an MPI-based feature extraction algorithm, SVM classification, vector space similarity analysis, and some preliminary visualization • Wish list includes a graphical workflow, a job submission portal, and use in MIR classes

  19. PFASC Philosophy and Methodology • Provide an end-to-end framework for MIR, from workflow to visualization • Recognize temporal context as a critical element of audio and a necessary part of feature extraction • Simple concept, simple implementation: one highly configurable algorithm for feature extraction • Dynamic combination and tuning of results from multiple runs, with user-controlled weighting • Make good use of available cyberinfrastructure • Support education in MIR

  20. PFASC Feature Extraction Example • Summary of 450 files classified by genre, showing the most prominent frequencies across the spectrum

  21. PFASC Similarity Matrix Example • Each audio file is summarized as a vector of feature values, and similarity is calculated between vectors • Values fall between 0.0 and 1.0: 0.0 = no commonality, 1.0 = files are identical • In this example, same-genre files had similarity scores 3.352 times higher than different-genre files
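
Reusing the tf_idf_vectors and cosine sketches from the vector space slide, the matrix itself is just the upper triangle of all pairwise scores (again illustrative, not PFASC code):

```python
def similarity_matrix(files):
    """Returns {(file_a, file_b): score} for every unordered pair of files."""
    vectors = tf_idf_vectors(files)
    names = sorted(vectors)
    return {(a, b): cosine(vectors[a], vectors[b])
            for i, a in enumerate(names)
            for b in names[i + 1:]}
```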

  22. Classification vs. Clustering • Most work in MIR involves classification, e.g. genre classification, an exercise that may be arbitrary and limited in value • Calculating similarity values among all songs in a library may be more practical for music discovery, playlist generation, and grouping by combinations of selected features • Calculating similarity is MUCH more computationally intensive than classification: comparing all songs in a library of 20,000 files requires roughly 200 million pairwise comparisons (see below)
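
The roughly 200 million figure is the number of unordered pairs:

```latex
\binom{n}{2} = \frac{n(n-1)}{2}, \qquad
\binom{20000}{2} = \frac{20000 \times 19999}{2} \approx 2.0 \times 10^{8}
```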

  23. Using Condor for Similarity Analysis • A good fit for IU Condor resources: a very large number of short-duration jobs • Jobs are independent, and can be restarted and run in any order (see the sample submit description below) • The large number of available machines provides a great wall-clock performance advantage over IU supercomputers
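
Because the jobs are independent, a single Condor submit description can queue an entire batch. The sketch below is hypothetical: the executable and file names are invented, and the exact OpSys value needed to match the Windows pool depends on the Condor version.

```
# One comparison job per input chunk; $(Process) runs from 0 to 449.
universe     = vanilla
executable   = pfasc_compare.exe
arguments    = pairs_$(Process).dat
output       = compare_$(Process).out
error        = compare_$(Process).err
log          = compare.log
requirements = (OpSys == "WINDOWS")
queue 450
```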

  24. PFASC Performance and Resources • A recent run of 450 jobs completed in 16 minutes; running serially on a desktop machine would have taken about 19 hours • The largest run to date contained 3,245 files, over 5 million song-to-song comparisons, and completed in less than eight hours; it would have taken over 11 days on a desktop • Queue wait time for 450 processors on IU’s Big Red is typically several days; for 3,000+ processors it would be up to a month

  25. Porting to Windows

  26. Visualizing Results

  27. Visualizing Results

  28. PFASC Contributors • Scott McCaulay (Project Lead) • Ray Sheppard (MPI Programming) • Eric Wernert (Visualization) • Joe Rinkovsky (Condor) • Steve Simms (Storage & Workflow) • Kiduk Yang (Information Retrieval) • John Walsh (Digital Libraries) • Eric Isaacson (Music Cognition)

  29. Thank you!
