SEMANTIC HIFI Browsing, listening, interacting, sharing on future HIFI systems

Music Technology Group, Universitat Pompeu Fabra (UPF), Barcelona. WP5: Performance Workpackage. Interaction & Performance.

Presentation Transcript
SEMANTIC HIFI
Browsing, listening, interacting, sharing on future HIFI systems

Music Technology Group

Universitat Pompeu Fabra (UPF)

Barcelona

Interaction & Performance

“…it becomes possible for more people to make more satisfying music, more enjoyably and easily, regardless of physical coordination or theoretical study, of keyboard skills or fluency with notation. This doesn’t imply a dilution of musical quality. On the contrary, it frees us to go further and raises the base-level at which music making begins.” (Laurie Spiegel)

“Let’s develop virtual instruments that do not just play back music for people, but become increasingly adept at making new and engaging music with people, at all levels of technical proficiency.” (Robert Rowe)

Interaction

Interaction has to be:

  • natural & intuitive
  • easy

and yet it must also:

  • allow expression
  • be enjoyable
  • be rewarding
Input devices
  • Feel natural
  • Maximize bandwidth
  • Profit from users’ knowledge

We propose the use of

  • Mouth: microphone + small video camera
  • Hands & arm: remote command used as a baton

Mouth control information will be reinforced by the two simultaneous input modes (sound + image)
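The reinforcement of audio by video can be sketched as a late fusion of per-modality confidence scores. This is only an illustrative sketch, not the project's actual detector; the function names, the noisy-OR rule, and the threshold are all assumptions.

```python
# Illustrative sketch of bimodal reinforcement: either the microphone or the
# camera alone can confirm mouth activity. Names and threshold are assumed.

def fuse_confidences(audio_conf: float, video_conf: float) -> float:
    """Noisy-OR fusion of the two simultaneous input modes (sound + image)."""
    return 1.0 - (1.0 - audio_conf) * (1.0 - video_conf)

def mouth_is_active(audio_conf: float, video_conf: float,
                    threshold: float = 0.5) -> bool:
    # A silent mouth movement yields low audio confidence but high video
    # confidence; noisy-OR fusion still detects the activity.
    return fuse_confidences(audio_conf, video_conf) > threshold
```

With noisy-OR, a strong cue in either modality is enough, which matches the idea that the two input modes reinforce rather than gate each other.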

Mouth
  • Mouth interaction will not only allow karaoke
  • The system will be able to detect at least 4 different mouth input modes:
    • Singing (karaoke)
    • Scat (instrumental solos)
    • Beat boxing (drums)
    • Silent mouth movements (filters & timbre changes)
  • Voice transformations include
    • Voice Excitation based Transformations (pitch change, hoarseness, whisper…)
    • Vocal Tract based Transformations (timbre…)
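A dispatcher over the four mouth input modes could look like the rule-based sketch below. The feature names (`is_voiced`, `has_stable_pitch`, etc.) are invented for illustration; they stand in for whatever the actual voice and video analysis would provide.

```python
# Illustrative sketch only: a rule-based dispatcher for the four mouth input
# modes listed on the slide. All feature names are assumptions.

from dataclasses import dataclass

@dataclass
class MouthFrame:
    is_voiced: bool          # voiced sound in the microphone signal
    has_stable_pitch: bool   # sustained pitch track (singing or scat)
    has_lyrics: bool         # phoneme match against the song's lyrics
    percussive_onset: bool   # short noisy burst (beat boxing)
    mouth_moving: bool       # camera reports lip movement

def classify_mode(f: MouthFrame) -> str:
    if f.is_voiced and f.has_stable_pitch:
        return "singing" if f.has_lyrics else "scat"
    if f.percussive_onset:
        return "beatboxing"
    if f.mouth_moving:
        return "silent"      # drives filters & timbre changes
    return "none"
```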
Music Context
  • The result of each of these interaction modes will depend on the music being played
  • The use of metadata will provide increasingly rich information
  • Example: Scatting on different musical styles
Music Context
  • This corresponds to a simplified context
  • More information can be obtained:
    • From the type of voiced sound (voice analysis that exploits the full timbre information, not mere pitch-to-MIDI)
    • From additional metadata
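The style-dependent rendering of a scatted solo could be as simple as a lookup keyed by style metadata. The style names and instrument targets below are examples only, not part of the project's specification.

```python
# Hypothetical mapping from style metadata to the instrument a scatted solo
# controls; styles and targets are illustrative examples.

SCAT_TARGET_BY_STYLE = {
    "jazz": "trumpet",
    "rock": "electric guitar",
    "bossa nova": "flute",
}

def scat_instrument(style: str, default: str = "synth lead") -> str:
    # Metadata narrows the context: the same scatting gesture is rendered
    # with a style-appropriate instrument.
    return SCAT_TARGET_BY_STYLE.get(style.lower(), default)
```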
Additional Metadata*

Time-stamped information:

Music

  • Composition parts (A, B, chorus…)
  • Harmonic & rhythmic details
  • Score
  • Program changes
  • ….

Audio Analysis

  • ….

*Format and contents to be defined in WP1.2
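Since the format is still to be defined in WP1.2, the following is only a sketch of what time-stamped metadata entries could look like; every field name is invented.

```python
# Sketch of a time-stamped metadata record (format TBD in WP1.2; field names
# are assumptions made for illustration).

from dataclasses import dataclass

@dataclass(order=True)
class MetadataEvent:
    time: float    # seconds from the start of the track
    kind: str      # e.g. "part", "chord", "program_change"
    value: str     # e.g. "chorus", "Am7", "strings"

def events_at(events: list[MetadataEvent], t: float) -> list[MetadataEvent]:
    """Return the most recent event of each kind at playback time t."""
    latest: dict[str, MetadataEvent] = {}
    for e in sorted(events):
        if e.time <= t:
            latest[e.kind] = e
    return list(latest.values())
```

Querying by playback time is what the interaction modes above would need: at any instant, the system can ask which part, harmony, and program are currently active.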

Editable Metadata
  • Advanced users will be able to edit and enrich the metadata (in non-real time), adding value to their contribution
Hands Movements

Will provide complementary information

  • e.g. crash cymbal on beat boxing

Alternate functions

  • e.g. baton conducting
    • tempo changes
    • dynamic changes
    • groove & swing modification
    • ……
  • ……
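Tempo control by baton could reduce to mapping the intervals between successive beat gestures to BPM. This is an assumed approach sketched for illustration, not the project's implementation.

```python
# Sketch (assumed approach): derive tempo from successive baton-beat
# timestamps, as conductor-style tempo control would need.

def tempo_bpm(beat_times: list[float]) -> float:
    """Estimate tempo in BPM from baton beat timestamps (seconds)."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beats")
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval
```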
Hand & Body tracking
  • A camera fixed to the system could be used
  • For better tracking resolution (spatial & temporal), an additional device seems necessary
  • We propose to use the same remote command, possibly fitted with accelerometers (and wireless communication with the system)
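If the remote command carries accelerometers, a beat gesture could be picked out of the sample stream by thresholded peak detection. The threshold value and the three-axis layout below are assumptions for illustration.

```python
# Illustrative sketch: detect a baton "beat" from accelerometer samples by
# thresholded peak picking. Threshold and axis layout are assumptions.

import math

def beat_indices(samples: list[tuple[float, float, float]],
                 threshold: float = 2.0) -> list[int]:
    """Indices where acceleration magnitude is a local peak above threshold."""
    mags = [math.sqrt(x*x + y*y + z*z) for x, y, z in samples]
    peaks = []
    for i in range(1, len(mags) - 1):
        if mags[i] > threshold and mags[i] >= mags[i-1] and mags[i] > mags[i+1]:
            peaks.append(i)
    return peaks
```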
Score Following

IRCAM: Instrument Score follower

(for automatic performer accompaniment)

To be defined:

  • Options
    • MIDI (or synthetic) accompaniment
    • Time-stretched prerecorded audio
  • Data formats
    • data resulting from the audio analysis (UPF), sent to the score follower module (IRCAM) (voice2MIDI?)
    • position data from the score follower to the time-stretching module
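The core score-following idea can be sketched as a position pointer that advances when performed pitches match the expected score, with a small look-ahead to tolerate skipped notes. This is a toy stand-in, not IRCAM's actual follower.

```python
# Minimal sketch of score following (not IRCAM's implementation): advance a
# position pointer in the score when the performed MIDI pitch matches, with a
# small look-ahead window for skipped notes.

def follow(score: list[int], performed: list[int], lookahead: int = 2) -> int:
    """Return the score position reached after the performed MIDI pitches."""
    pos = 0
    for pitch in performed:
        # search a small window ahead of the current position
        for offset in range(lookahead + 1):
            if pos + offset < len(score) and score[pos + offset] == pitch:
                pos += offset + 1
                break
        # unmatched pitches (ornaments, errors) are simply ignored
    return pos
```

The returned position is exactly the data the time-stretching module would need to keep prerecorded audio aligned with the performer.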
Performing on a simple keyboard

In this part, Sony CSL will implement style and performance rules on a simple keyboard that can follow and continue the user's playing according to simple style constraints.
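The "follow and continue" idea can be sketched with a first-order transition table learned from the user's notes, then sampled to continue the phrase. This toy is only a stand-in for Sony CSL's actual system, not its implementation.

```python
# Toy sketch of "follow and continue": learn first-order note transitions from
# the user's playing, then continue in the same style. Not Sony CSL's system.

import random

def learn_transitions(notes: list[int]) -> dict[int, list[int]]:
    table: dict[int, list[int]] = {}
    for a, b in zip(notes, notes[1:]):
        table.setdefault(a, []).append(b)
    return table

def continue_phrase(notes: list[int], length: int, seed: int = 0) -> list[int]:
    rng = random.Random(seed)
    table = learn_transitions(notes)
    current, out = notes[-1], []
    for _ in range(length):
        choices = table.get(current) or notes   # fall back to any played note
        current = rng.choice(choices)
        out.append(current)
    return out
```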

MTG Participants
  • Xavier Serra, local manager
  • Sergi Jordà, technical manager
  • Alex Loscos, voice processing
  • Martin Kaltenbrunner, interfaces
  • 1 additional programmer