Topics for Today


PowerPoint Slideshow about 'Topics for Today' - tahir

Presentation Transcript
Topics for Today
  • General Audio
  • Speech
  • Music
  • Presentation of MusicWiz project
General Audio
  • Mapping audio cues to events
    • Recognizing sounds related to particular events (e.g. gunshot, falling, scream)
  • Mapping events to audio cues
    • Audio debugger to speed up stepping through code
  • Spatialized audio
    • Provides additional geographic/navigational channel
    • Example: Michael Joyce’s Interactive Central Park
Spatialized Audio
  • Spatialized audio is easier to achieve when headphones can be assumed, because the signal reaching each ear can be controlled
  • Head-related transfer function (HRTF)
    • Differences in timing and signal strength between the ears determine how we identify the position of a sound
  • Beamforming
    • Timing for constructive interference to create stronger signal at desired location
  • Crosstalk Cancellation
    • Destructive interference to remove parts of signal at desired location
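The beamforming bullet above can be illustrated with a minimal delay-and-sum sketch. It assumes a hypothetical row of loudspeakers, free-field propagation, and a speed of sound of 343 m/s; the function name and the geometry are illustrative and do not come from the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_delays(speaker_positions, target):
    """Return one playback delay (in seconds) per speaker so that every
    wavefront reaches `target` at the same instant, which is the timing
    condition for constructive interference at that location."""
    distances = [math.dist(p, target) for p in speaker_positions]
    farthest = max(distances)
    # The farthest speaker plays immediately; closer speakers wait.
    return [(farthest - d) / SPEED_OF_SOUND for d in distances]


# Hypothetical layout: three speakers on a line, listener 2 m in front.
speakers = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
listener = (0.0, 2.0)
delays = steering_delays(speakers, listener)

# Delay plus travel time should be identical for every speaker.
arrival_times = [
    delay + math.dist(pos, listener) / SPEED_OF_SOUND
    for delay, pos in zip(delays, speakers)
]
```

Crosstalk cancellation relies on the same timing arithmetic, but aims for destructive rather than constructive interference at the chosen location.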
Audio Signal Analysis
  • Fast Fourier Transform (FFT) and Discrete Wavelet Transform (DWT)
    • Transforms commonly used on audio signals
    • Allow for analysis of frequency features across time (e.g. power contained in a frequency interval)
    • FFT-based analysis uses equal-sized windows, whereas wavelet window sizes can vary with frequency
  • Mel-frequency cepstral coefficients (MFCCs)
    • Based on FFTs
    • Maps results into bands approximating human auditory system
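As a rough illustration of "power contained in a frequency interval" and of the mel mapping behind MFCCs, the sketch below computes band power from a discrete Fourier transform and converts Hz to mels. A naive O(n²) DFT stands in for the FFT so the example needs no external library (an FFT yields the same spectrum faster). The mel formula is one commonly used variant, and all function names are assumptions for illustration.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform (same result as an FFT, but O(n^2))."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def band_power(frame, sample_rate, low_hz, high_hz):
    """Sum of squared DFT magnitudes for bins between low_hz and high_hz."""
    n = len(frame)
    spectrum = dft(frame)
    total = 0.0
    for k in range(n // 2 + 1):  # non-negative frequencies only
        freq = k * sample_rate / n
        if low_hz <= freq <= high_hz:
            total += abs(spectrum[k]) ** 2
    return total


def hz_to_mel(hz):
    """One common Hz-to-mel formula; roughly linear at low frequencies
    and logarithmic at high frequencies, like human pitch perception."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


# A 100 Hz sine sampled at 800 Hz: its power should sit in the 50-150 Hz band.
sample_rate = 800
frame = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(64)]
in_band = band_power(frame, sample_rate, 50, 150)
out_of_band = band_power(frame, sample_rate, 200, 400)
```

A full MFCC pipeline would go on to sum the power in overlapping mel-spaced bands, take logarithms, and apply a cosine transform; those steps are omitted here.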
  • An interactive soundscape combining human collaboration with aquarium activity
  • Engage visitors to spend more time with (and learn more about) Beluga whales
  • Spatialized sound based on whale activity and human interaction
Speech
  • Speaker segmentation
    • Identify when a change in speaker occurs
    • Useful for basic indexing or summarization of speech content
  • Speaker identification
    • Identify who is speaking during a segment
    • Enables search (and other features) based on speaker
  • Speech recognition
    • Identify the content of speech
Speech Recognition
  • Start by segmenting utterances and characterizing phonemes
    • Use gaps to segment
    • Group segments into words
  • Limited vocabulary of commands
    • Classifiers for limited vocabulary (HMMs)
  • Continuous speech
    • Language models for disambiguation
    • Speaker dependent or not
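The "use gaps to segment" step can be sketched with a simple short-time energy threshold. This is only one possible approach, not necessarily the one the slides had in mind; the frame size, threshold, and minimum gap length are hypothetical parameters.

```python
def segment_by_gaps(samples, frame_size, energy_threshold, min_gap_frames):
    """Split a signal into utterance segments separated by quiet gaps.

    Returns (start_frame, end_frame) pairs with end_frame exclusive.
    A gap closes a segment only if it lasts at least `min_gap_frames`.
    """
    # Mean energy of each non-overlapping frame.
    energies = []
    for i in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[i:i + frame_size]
        energies.append(sum(x * x for x in frame) / frame_size)

    segments = []
    start = None      # first frame of the segment being built
    quiet_run = 0     # consecutive quiet frames seen inside that segment
    for idx, energy in enumerate(energies):
        if energy >= energy_threshold:
            if start is None:
                start = idx
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_gap_frames:
                segments.append((start, idx - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # Close the last segment, dropping any trailing quiet frames.
        segments.append((start, len(energies) - quiet_run))
    return segments


# Toy signal: two loud frames, two silent frames, one loud frame.
toy = [1.0] * 4 + [0.0] * 4 + [1.0] * 2
utterances = segment_by_gaps(toy, frame_size=2, energy_threshold=0.5, min_gap_frames=2)
```

Grouping the resulting segments into words, and classifying them with models such as HMMs, would build on top of this step.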
Music
  • Music processing can support a variety of activities
  • Composition
    • From traditional to interactive
  • Selection
    • Examples: iTunes, Pandora
    • Use for shared spaces
  • Playback
    • Example: MobiLenin
  • Management & Summarization
    • Example: MusicWiz
  • Games
    • Guitar Hero, Rockband, etc.
  • Enable interaction with music in a public space
    • Not karaoke
  • Voting, as in many pub/bar games
  • Audience can affect which version of music and video is shown
  • Gave a focal point for interaction between members of a group
  • Content variety is necessary for continued engagement
  • Lottery for free beer motivated participation
Music Summarization
  • Most summaries in commercial sites are either the first phrase or a single selected musical phrase
  • Study of whether 22-second multi-phrase music summaries would be better previews
    • Three algorithms vary how the component phrases are selected, balancing phrases that are
      • sonically distinct, and phrases that are
      • repeated more often
  • A comparative evaluation study showed that:
    • Multi-phrase previews were selected in 87% of the cases over the preview representing the first 22 seconds of the song
    • In 90% of the choices, the selected summary was rated at least a good representation of the song
Managing Personal Music Collections
  • Music management is mainly based on:
    • explicit attributes (e.g. metadata values like the artist, the composer and the genre).
    • explicit feedback (e.g. ratings of preference and relevance)
  • Benefits
    • Easy to understand
    • Formal: consistent updating and access
    • Context-free
  • Question
    • How can music be accessed based on the feelings or memories it triggers?
Current Practices
  • Common metadata tags are usually not sufficient to describe mood, feelings, memories and complex concepts
    • Effort/benefit trade-off issues
    • Personal reactions to music change
  • Explicit feedback and usage statistics are helpful in retrieving preferred music
  • Question
    • How would people organize music if there were a low-effort way of expressing their personalized interpretation of music?
Preliminary Study
  • 12 participants asked to organize songs & create playlists using spatial hypertext
  • In spatial hypertext, information has visual attributes & spatial layout that can be changed to express associations
  • The majority found spatial hypertext helpful in organizing
  • Participants appreciated:
    • expressive power and freedom of the workspace
    • directly accessible metadata information of music
    • music previews for remembering music
  • Participants missed:
    • interactive hierarchical / tree views
    • music previews for understanding music
Music Access & Implicit Attributes
  • Considerable research into extracting and using implicit cues for associating music to overcome:
    • limitations of metadata & statistics to describe music concepts
    • unwillingness of users to provide explicit feedback
    • cost of employing human experts to find music similarity
  • Music Management extended by:
    • signal features (e.g. intensity, timbre and rhythm)
    • collaborative filtering
    • interaction
  • e.g., Genius, Music Gathering Application, Flytrap, Musicovery, MusicSim, Musicream

MusicWiz Architecture

[Figure: MusicWiz architecture diagram. Labels: MusicWiz Interface, Related Song Titles, Workspace Status, Music Collection, Songs & Metadata, Artist Module, Audio Signal Module, Lyrics Module, Workspace Expression Module, Relatedness Assessment, Inference Engine, Similarity Values, Statistics of Artist Similarity]

  • Music management environment that combines:
    • explicit information
    • implicit information
    • non-verbal expression of personal interpretation
  • Two basic components:
    • interface for interacting with the music collection
    • inference engine for assessing music relatedness
MusicWiz Interface

[Figure: The MusicWiz interface. Labeled panels: Hierarchical Folder Tree View, Playlist Pane, Related Songs & Search Results View, Playback Controls]

MusicWiz Inference Engine
  • 5 modules for extracting, processing and comparing artists, metadata, audio content, lyrics, and workspace expression

Overall Similarity(S1, S2) =
    W1 * Overall Metadata Similarity(S1, S2)
  + W2 * Overall Audio Signal Similarity(S1, S2)
  + W3 * Overall Lyrics Similarity(S1, S2)
  + W4 * Overall Workspace Expression Similarity(S1, S2)


    • S1 and S2 are the songs under comparison, and Wn (n = 1..4) are the user-adjusted weights of the specialized similarity assessments
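The weighted combination above can be written as a short function. This is a minimal sketch: the slides do not say whether the weights are normalized or what range the per-module scores take, so the example assumes scores in [0, 1] and weights that sum to 1. The function and parameter names are illustrative.

```python
def overall_similarity(metadata_sim, audio_sim, lyrics_sim, workspace_sim, weights):
    """Weighted sum of the four specialized similarity assessments.

    `weights` is a sequence (W1, W2, W3, W4) of user-adjusted weights.
    """
    w1, w2, w3, w4 = weights
    return (
        w1 * metadata_sim
        + w2 * audio_sim
        + w3 * lyrics_sim
        + w4 * workspace_sim
    )


# Hypothetical scores for one pair of songs, with weights favoring metadata.
score = overall_similarity(
    metadata_sim=0.5,
    audio_sim=0.8,
    lyrics_sim=0.2,
    workspace_sim=1.0,
    weights=(0.4, 0.3, 0.2, 0.1),
)
```

Raising one weight lets a user emphasize one kind of evidence, for example lyrics over audio content, without changing how the individual modules compute their scores.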
MusicWiz Inference Engine – Artist Module
  • Assesses relatedness in music using online resources:
    • human evaluations of artist similarity from:
      • Similar Artists lists of the All Music Guide website
    • co-occurrence of artists in playlists from:
      • OpenNap file-sharing network
      • Art of the Mix website
MusicWiz Inference Engine – Metadata Module
  • Evaluates the pairwise similarity of the metadata values of all songs
  • String comparison is applied to the title, genre, album-name, and year of the songs as well as the file-system path where they are stored
    • uses a distance metric that combines the Soundex and the Monge-Elkan algorithms
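The slides do not say how MusicWiz combines Soundex with Monge-Elkan. One plausible combination, sketched below as an assumption, uses Soundex code equality as the inner token similarity inside the Monge-Elkan average. The function names are illustrative.

```python
# Standard American Soundex consonant codes.
_CODES = {}
for _letters, _digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Four-character Soundex code, or '' if the word has no letters."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate equal codes
            prev = digit
    return (code + "000")[:4]


def soundex_match(x, y):
    """Inner similarity (assumed): 1.0 if both tokens share a Soundex code."""
    cx, cy = soundex(x), soundex(y)
    return 1.0 if cx and cx == cy else 0.0


def monge_elkan(a, b, inner_sim=soundex_match):
    """Average, over tokens of `a`, of the best inner similarity in `b`."""
    tokens_a, tokens_b = a.split(), b.split()
    if not tokens_a or not tokens_b:
        return 0.0
    best = [max(inner_sim(x, y) for y in tokens_b) for x in tokens_a]
    return sum(best) / len(tokens_a)


forward = monge_elkan("Jon Smith", "John Smyth Jr")
backward = monge_elkan("John Smyth Jr", "Jon Smith")
```

Monge-Elkan is asymmetric, so `forward` and `backward` can differ; a symmetric score could average the two directions. A distance can then be taken as one minus the similarity.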
MusicWiz Inference Engine – Audio Signal Module
  • Uses signal processing techniques to analyze music content
  • Extracts and compares information about the harmonic structure and acoustic attributes of music
    • beat, brightness, pitch, starting note and potential key (music scale) of the song
MusicWiz Inference Engine – Lyrics Module
  • Textually analyzes the lyrics
  • Lyrics are scraped from a pool of popular websites for:
    • display in music objects
    • comparison
  • Lyrical comparison uses term vector cosine similarity:

Overall Lyrics Similarity(S1, S2) = cos(θ), where θ is the angle between the term vectors of the lyrics of S1 and S2

  • The more words lyrics have in common, the greater the possibility that the songs are motivated by or describe related themes
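Term-vector cosine similarity can be sketched in a few lines. The example assumes raw term counts and a simple lowercase tokenizer; the slides do not say whether MusicWiz applies stop-word removal, stemming, or TF-IDF weighting, so none is applied here. The function names are illustrative.

```python
import math
import re
from collections import Counter


def term_vector(lyrics):
    """Count each lowercase word in the lyrics (assumed tokenization)."""
    return Counter(re.findall(r"[a-z']+", lyrics.lower()))


def lyrics_similarity(lyrics_1, lyrics_2):
    """Cosine of the angle between the two term-count vectors."""
    v1, v2 = term_vector(lyrics_1), term_vector(lyrics_2)
    dot = sum(v1[term] * v2[term] for term in v1.keys() & v2.keys())
    norm_1 = math.sqrt(sum(count * count for count in v1.values()))
    norm_2 = math.sqrt(sum(count * count for count in v2.values()))
    if norm_1 == 0.0 or norm_2 == 0.0:
        return 0.0  # empty lyrics share nothing
    return dot / (norm_1 * norm_2)


# Made-up lyric fragments, for illustration only.
partial_overlap = lyrics_similarity("love love me", "love you")
no_overlap = lyrics_similarity("love me", "hate you")
```

With count vectors the score ranges from 0 (no shared words) to 1 (identical word distributions), which matches the intuition that more shared words suggest related themes.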
MusicWiz Inference Engine – Workspace Expression Module
  • Music objects can be related visually and spatially
  • Spatial parser identifies relations between the music objects
  • Recognizes three types of spatial structures: lists, stacks and composites
MusicWiz Functionality
  • Music collection can be explored by filtering:
    • attribute values (i.e. ID3 tags, audio signal attributes and lyrics)
    • similarity values (i.e. overall similarity)
  • Playlists can be created:
    • manually: songs can be added from the left-side views & the workspace
    • automatically:
      • filter-based mode: selection based on the ID3 tags
      • similarity-based mode: selection based on the relatedness of songs on the current playlist
MusicWiz Evaluation
  • 20 participants were asked to:
    • Task 1: organize 50 rock songs into sub-collections according to their preference
    • Task 2: form three twenty-minute-long playlists based on three different moods or occasions of their choice
    • Task 3: form three six-song playlists, each of which had to be related to a provided “seed” song (not from the original collection of fifty)
Topics From Today
  • General Audio
    • Audio cues, spatialized audio
  • Speech
    • Segmentation, speaker id, recognition
  • Music
    • Interactive music, summarization, organization