1 / 37

xkcd/655/

http://www.xkcd.com/655/. Audio Retrieval. David Kauchak cs160 Fall 2009 Thanks to Doug Turnbull for some of the slides. Administrative. CS Colloquium vs. Wed. before Thanksgiving . 250M iPods. 8M artists. 9 B songs. 150M songs. 1M downloads/month. 93M a mericans. consumers.

Download Presentation

xkcd/655/

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. http://www.xkcd.com/655/

  2. Audio Retrieval David Kauchak cs160 Fall 2009 Thanks to Doug Turnbull for some of the slides

  3. Administrative • CS Colloquium vs. Wed. before Thanksgiving 

  4. 250M iPods 8M artists 9B songs 150M songs 1Mdownloads/month 93M americans consumers producers music technology

  5. Audio Index construction Audio files to be indexed wav mp3 midi audio preprocessing slow, jazzy, punk indexer may be keyed off of text may be keyed off of audio features Index

  6. Audio retrieval Query Index Systems differ by what the query input is and how they figure out the result

  7. Song identification • Examples: • Query by Humming • 1-866-411-SONG • Shazam • Bird song identification • … Given an audio signal, tell me what the “song” is Index

  8. Song identification • How might you do this? Query by humming “song” name

  9. Song identification “song” name

  10. Song similarity • Examples: • Genius • Pandora • Last.fm Find the songs that are most similar to the input song song Index

  11. Song similarity • How might you do this? IR approach f1 f2 f3 … fn f1 f2 f3 … fn f1 f2 f3 … fn f1 f2 f3 … fn f1 f2 f3 … fn f1 f2 f3 … fn rank by cosine sim

  12. Song similarity: collaborative filtering user4 user1 user2 user3 usern song1 song2 What might you conclude from this information? song3 … songm

  13. Songs using descriptive text search • Examples: • Very few commercial systems like this … jazzy, smooth, easy listening Index

  14. Music annotation • The key behind keyword based system is annotating the music with tags dance, instrumental, rock blues, saxaphone, cool vibe pop, ray charles, deep Ideas?

  15. Annotating music • The human approach “expert musicologists” from Pandora Pros/Cons?

  16. Annotating music • Another human approach: games

  17. Annotating music • the web: music reviews challenge?

  18. Automatically annotating music Learning a music tagger song signal review tagger tagger blues, saxaphone, cool vibe

  19. Automatically annotating music Learning a music tagger song signal review What are the tasks we need to accomplish? tagger

  20. Data Features Training Data T T Annotation Document Vectors (y) Audio-Feature Extraction (X) System Overview Model Vocabulary Parameter Estimation

  21. Automatically annotating music First step, extract “tags” from the reviews Frank Sinatra - Fly me to the moon This is a jazzy, singer / songwriter song that is calming and sad. It features acoustic guitar, piano, saxophone, a nice male vocal solo, and emotional, high-pitched vocals. It is a song with a light beat and a slow tempo. Dr. Dre (feat. Snoop Dogg) - Nuthin' but a 'G' thang This is a dance poppy, hip-hop song that is arousing and exciting. It features drum machine, backing vocals, male vocal, a nice acoustic guitar solo, and rapping, strong vocals. It is a song that is very danceable and with a heavy beat.

  22. Automatically annotating music First step, extract “tags” from the reviews Frank Sinatra - Fly me to the moon This is a jazzy,singer / songwritersong that is calming and sad. It features acoustic guitar,piano, saxophone, a nice male vocal solo, and emotional, high-pitchedvocals. It is a song with a light beatand a slow tempo. Dr. Dre (feat. Snoop Dogg) - Nuthin' but a 'G' thang This is adance poppy, hip-hopsong that is arousing and exciting. It features drum machine, backing vocals, male vocal, a nice acoustic guitar solo, and rapping, strong vocals. It is a song that is very danceableand with a heavy beat.

  23. Content-Based Autotagging Learn a probabilistic model that captures a relationship between audio content and tags. Frank Sinatra ‘Fly Me to the Moon’ ‘Jazz’ ‘Male Vocals’ ‘Sad’ ‘Slow Tempo’ Autotagging p(tag | song)

  24. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Modeling a Song Bag of MFCC vectors cluster feature vectors + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

  25. romantic Mixture of song clusters romantic Tag Model p(x|t) + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Modeling a Tag • Take all songs associated with tag t • Estimate ‘features clusters’ for each song • Combine these clusters into a single representative model for that tag clusters

  26. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Determining Tags • Calculate the likelihood of the features for a tag model Romantic? Inference with Romantic Tag Model S1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + romantic? + + + + + + + + + + + + + + + + + S2 + + + + + + + + + + + +

  27. Annotation Semantic Multinomial for “Give it Away” by the Red Hot Chili Peppers

  28. The CAL500 data set The Computer Audition Lab 500-song (CAL500) data set • 500 ‘Western Popular’ songs • 174-word vocabulary • genre, emotion, usage, instrumentation, rhythm, pitch, vocal characteristics • 3 or more annotations per song • 55 paid undergrads annotate music for 120 hours Other Techniques • Text-mining of web documents • ‘Human Computation’ Games - (e.g., Listen Game)

  29. Retrieval The top 3 results for - “pop, female vocals, tender” 0.33 1. Shakira - The One 0.02 2. Alicia Keys - Fallin’ 0.02 3. Evanescence - My Immortal 0.02

  30. Retrieval

  31. Annotation results Annotation of the CAL500 songs with 10 words from a vocabulary of 174 words.

  32. Retrieval results

More Related