
Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm

Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm. Research by Elaine Chew and Ching-Hua Chuan, University of Southern California. Presentation by Sean Sweeney, DigiPen Institute of Technology. CS 582 / April 17, 2011 / Dr. Dimitri Volper.



  1. Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm • Research by Elaine Chew and Ching-Hua Chuan, University of Southern California • Presentation by Sean Sweeney, DigiPen Institute of Technology • CS 582 / April 17, 2011 / Dr. Dimitri Volper

  2. Presentation Flow • Musical Pitch and Key • Human Perception of Pitch • The Spiral Array Model • Pitches • Chords • Keys • The CEG Algorithm • Algorithm • Visualization

  3. Musical Pitch and Key • Pitch • The perceived value of a tone, “Low” to “High” • Psycho-acoustic (subjective) perception of frequency • Frequency (Hz) is a physical measurement: cycles per second, the reciprocal of the period • Key (Western music) • Labels the “center” tone in a section of music • Standard smallest interval: semitone or “half-step” • Standard pattern of semitones around the “center” • Ascending (major key): 2, 2, 1, 2, 2, 2, 1
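
As a small illustration of the ascending semitone pattern, the sketch below builds the pitch classes of a major scale from a tonic. The names `NOTE_NAMES`, `MAJOR_STEPS`, and `major_scale` are hypothetical, and the sharps-only spelling is a simplifying assumption (it ignores enharmonic names such as Db).

```python
# Pitch-class names using sharps only (assumption: enharmonic spelling is ignored here).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # ascending semitone pattern from the slide


def major_scale(tonic_pc):
    """Return the seven pitch classes of the major scale on tonic_pc (0 = C)."""
    scale = [tonic_pc % 12]
    # The last step only returns to the tonic an octave higher, so it is skipped.
    for step in MAJOR_STEPS[:-1]:
        scale.append((scale[-1] + step) % 12)
    return scale


c_major = [NOTE_NAMES[pc] for pc in major_scale(0)]  # C D E F G A B
g_major = [NOTE_NAMES[pc] for pc in major_scale(7)]  # G A B C D E F#
```

The seven steps sum to 12 semitones, which is why the pattern closes on the same pitch class one octave up.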

  4. Human Perception of Pitch • Limited range of perception • Typically 20Hz – 20,000Hz • Range tends to decrease with age • Noticeable difference is coarser at low Hz • Less distance (Hz) between lower sounds • Around 1400 perceivable intervals • Certain frequency distances sound relatively close • Thirds, Fifths, Octaves
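
To make "less distance (Hz) between lower sounds" concrete, here is a minimal sketch that assumes 12-tone equal temperament with A4 = 440 Hz and MIDI note numbering. None of these conventions appear on the slide; they are stated here only so the arithmetic can run.

```python
def midi_to_hz(midi_note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note (assumption: A4 = MIDI 69 = 440 Hz)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


def semitone_gap_hz(midi_note):
    """Distance in Hz from a note to the note one semitone above it."""
    return midi_to_hz(midi_note + 1) - midi_to_hz(midi_note)


low_gap = semitone_gap_hz(33)   # semitone above A1 (55 Hz)
high_gap = semitone_gap_hz(81)  # semitone above A5 (880 Hz)

# Intervals that tend to sound "close" correspond to simple frequency ratios.
octave_ratio = midi_to_hz(81) / midi_to_hz(69)  # exactly 2:1
fifth_ratio = midi_to_hz(76) / midi_to_hz(69)   # close to 3:2
third_ratio = midi_to_hz(73) / midi_to_hz(69)   # close to 5:4
```

The same musical step spans far fewer Hz in the low register than in the high one, because each semitone multiplies the frequency by a fixed ratio rather than adding a fixed amount.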

  5. The Spiral Array Model

  6. The Spiral Array Model • Helical structure • Toroidal across octaves • Distance in the 3D model approximates perceived closeness between pitches • Pitch, chord and key can all map to the same space
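
The helical structure can be sketched as follows, assuming the usual Spiral Array parameterization in which pitches are indexed along the line of fifths and each fifth is a quarter turn plus a fixed rise. The radius `R`, the rise `H`, and the function names are assumptions chosen for illustration, not values taken from the slides.

```python
import math

R = 1.0                    # helix radius (assumed)
H = math.sqrt(2.0 / 15.0)  # vertical rise per perfect fifth (assumed aspect ratio)


def pitch_position(k):
    """3D position of the pitch k fifths above C (0 = C, 1 = G, 4 = E, 7 = C#)."""
    angle = k * math.pi / 2  # each fifth is a quarter turn around the helix
    return (R * math.sin(angle), R * math.cos(angle), k * H)


def distance(a, b):
    """Euclidean distance between two points in the model."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


c = pitch_position(0)
fifth = distance(c, pitch_position(1))     # C to G
third = distance(c, pitch_position(4))     # C to E, directly above C on the helix
semitone = distance(c, pitch_position(7))  # C to C#
```

With this assumed aspect ratio, the perfect fifth and the major third are equally close to C, while the semitone is roughly twice as far, which matches the idea that 3D distance tracks perceived closeness rather than distance in Hz.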

  7. Chords in the Spiral Array • Standard chords are based on three supporting tones • Create triangles in 3D relative to the model • Triangles are effectively continuous, as pitch is • Major and minor chords’ centers thus form helixes
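
One way to read "centers of triangles" is as a weighted average of the three supporting tones. The sketch below assumes that form, with the root, fifth, and third of a major triad at indices k, k+1, k+4 on the line of fifths, and k, k+1, k-3 for a minor triad. The weights in `W` and all function names are illustrative assumptions.

```python
import math

R, H = 1.0, math.sqrt(2.0 / 15.0)  # assumed helix radius and rise per fifth
W = (0.5, 0.3, 0.2)                # assumed weights for root, fifth, third


def pitch_position(k):
    """Position of the pitch k fifths above C on the helix."""
    angle = k * math.pi / 2
    return (R * math.sin(angle), R * math.cos(angle), k * H)


def weighted_center(points, weights):
    """Weighted average of 3D points; lies inside the shape they span."""
    total = sum(weights)
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) / total
                 for i in range(3))


def major_chord(k):
    """Center of the triangle root (k), fifth (k + 1), major third (k + 4)."""
    return weighted_center(
        [pitch_position(k), pitch_position(k + 1), pitch_position(k + 4)], W)


def minor_chord(k):
    """Center of the triangle root (k), fifth (k + 1), minor third (k - 3)."""
    return weighted_center(
        [pitch_position(k), pitch_position(k + 1), pitch_position(k - 3)], W)


c_major_chord = major_chord(0)  # built from C, G, E
a_minor_chord = minor_chord(3)  # built from A, E, C
```

Because every chord center is the same weighted mix shifted along the pitch helix, stepping k traces out a second, inner helix of chord centers.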

  8. Key in the Spiral Array • Simple keys are based on three supporting chords • Creates triangles in 3D, based on supporting chords’ triangular centers • Triangles are effectively continuous, as chords are • Major and minor keys’ centers thus form helixes
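
The same construction can be applied one level up. This sketch assumes a major key is the weighted center of its tonic, dominant, and subdominant major chords (chord indices k, k+1, k-1). Minor keys mix major and minor chords and are left out to keep the example short. Weights and names are assumptions.

```python
import math

R, H = 1.0, math.sqrt(2.0 / 15.0)  # assumed helix radius and rise per fifth
W = (0.5, 0.3, 0.2)                # assumed weights, reused for chords and keys


def pitch_position(k):
    angle = k * math.pi / 2
    return (R * math.sin(angle), R * math.cos(angle), k * H)


def weighted_center(points, weights):
    total = sum(weights)
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) / total
                 for i in range(3))


def major_chord(k):
    # Root, fifth, major third on the line of fifths.
    return weighted_center(
        [pitch_position(k), pitch_position(k + 1), pitch_position(k + 4)], W)


def major_key(k):
    """Center of the triangle tonic (k), dominant (k + 1), subdominant (k - 1)."""
    return weighted_center(
        [major_chord(k), major_chord(k + 1), major_chord(k - 1)], W)


c_major_key = major_key(0)  # built from the C, G and F major chords
g_major_key = major_key(1)  # a quarter turn and one rise further along
```

Pitches, chords, and keys all end up as points in the same 3D space, which is what lets a later step compare an arbitrary point against every key.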

  9. Center of Effect • Center of Effect (CE) • Relative location of a chord based on its supporting tones • Notes of different strength change the CE location • Complex chord CE’s will not line up exactly on the model
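
A minimal sketch of the Center of Effect as a strength-weighted average of pitch positions. The helix parameters and the function names are assumptions; the point is only to show how a stronger note pulls the CE toward itself.

```python
import math

R, H = 1.0, math.sqrt(2.0 / 15.0)  # assumed helix radius and rise per fifth


def pitch_position(k):
    angle = k * math.pi / 2
    return (R * math.sin(angle), R * math.cos(angle), k * H)


def center_of_effect(notes):
    """notes: list of (line-of-fifths index, strength) pairs with positive strengths."""
    total = sum(strength for _, strength in notes)
    return tuple(
        sum(strength * pitch_position(k)[i] for k, strength in notes) / total
        for i in range(3))


balanced = center_of_effect([(0, 1.0), (4, 1.0)])  # C and E, equally strong
c_heavy = center_of_effect([(0, 3.0), (4, 1.0)])   # same notes, C three times stronger
```

With equal strengths the CE sits halfway between C and E; tripling the strength of C moves it down the helix toward C.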

  10. Center of Effect Generator (CEG) Key-Finding • Center of Effect relates the positions of multiple pitches in the model • The key spatially closest to the CE is the most likely key • Correlates input music to standard key structure

  11. Helping Visualize the CEG Algorithm • Keys exist as a triangle in 3-space • Keys’ centers-of-effect make up two helixes in the 3D model • In standard intonation, keys are discrete (12 minor, 12 major)

  12. Helping Visualize the CEG Algorithm • From a complex audio signal, weighted values are calculated for bins on each discrete tone • The weighted values approximate the current key’s location on the model • The spatially-closest key is the most likely match
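
The "spatially-closest key" step can be sketched as a plain nearest-neighbor search. This example only considers the 12 major keys from Db to F# (indices -5 to 6 on the line of fifths) and omits minor keys. The weights, helix parameters, key range, and names are all assumptions for illustration.

```python
import math

R, H = 1.0, math.sqrt(2.0 / 15.0)  # assumed helix radius and rise per fifth
W = (0.5, 0.3, 0.2)                # assumed weights for chords and keys
KEY_NAMES = {-5: "Db", -4: "Ab", -3: "Eb", -2: "Bb", -1: "F", 0: "C",
             1: "G", 2: "D", 3: "A", 4: "E", 5: "B", 6: "F#"}


def pitch_position(k):
    angle = k * math.pi / 2
    return (R * math.sin(angle), R * math.cos(angle), k * H)


def weighted_center(points, weights):
    total = sum(weights)
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) / total
                 for i in range(3))


def major_chord(k):
    return weighted_center(
        [pitch_position(k), pitch_position(k + 1), pitch_position(k + 4)], W)


def major_key(k):
    return weighted_center(
        [major_chord(k), major_chord(k + 1), major_chord(k - 1)], W)


def distance(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def nearest_major_key(ce):
    """Return the name of the major key whose center is closest to the CE."""
    best = min(KEY_NAMES, key=lambda k: distance(major_key(k), ce))
    return KEY_NAMES[best]


# CE of the seven notes F, C, G, D, A, E, B (indices -1..5), equally weighted.
scale_notes = [pitch_position(k) for k in range(-1, 6)]
ce = weighted_center(scale_notes, [1.0] * len(scale_notes))
found = nearest_major_key(ce)
```

A linear scan over 12 (or 24) candidates is cheap enough here that no spatial index is needed.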

  13. CEG Key-Finding Algorithm • Pitch detection • Extract pitch class and strength from signal • Key finding • Nearest Neighbor Search in Spiral Array

  14. Fast Fourier Transform • Efficient algorithm to compute the Discrete Fourier Transform • O(n log n) vs O(n²) • Transforms a function into its frequency-domain representation • Widely used across many fields • Solving Partial Differential Equations • Data Compression • Polynomial Multiplication • Spectral Analysis • Frequency bands
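
To illustrate the O(n log n) versus O(n²) contrast, the sketch below places a direct DFT next to a recursive radix-2 Cooley-Tukey FFT. This is one common FFT variant chosen for brevity; the slides do not say which FFT implementation was used, and the function names are hypothetical.

```python
import cmath
import math


def dft_naive(x):
    """Direct DFT: every output bin sums over every input sample, O(n^2)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def fft_radix2(x):
    """Recursive radix-2 FFT, O(n log n); the length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for f in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * f / n) * odd[f]
        out[f] = even[f] + twiddle
        out[f + n // 2] = even[f] - twiddle
    return out


# A cosine completing 3 cycles over 16 samples should peak in frequency bin 3.
signal = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]
spectrum = fft_radix2(signal)
magnitudes = [abs(v) for v in spectrum[:8]]  # bins up to the Nyquist limit
peak_bin = magnitudes.index(max(magnitudes))
```

Both functions compute the same transform; the FFT just reuses the half-size results instead of recomputing them for every bin.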

  15. Algorithm for Pitch Class/Strength from FFT For each frequency spectrum in a 0.37 second period: • For each frequency band find peak value • For each pitch class k, its strength at time j, F_jk, is the sum of all peak values for that frequency band (and others related by octaves) • Normalize • Divide all pitch-strength values by the largest • Divide all pitch-strength values by their sum (k = 0, 1, …, 11)
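
The steps above can be sketched as follows, assuming the spectral peaks have already been extracted as (frequency, magnitude) pairs. Mapping a frequency to a pitch class via equal temperament with A4 = 440 Hz is an assumption, as are the function names; the slides do not specify how the frequency bands are defined.

```python
import math


def pitch_class_of(freq_hz, a4_hz=440.0):
    """Nearest equal-tempered pitch class (0 = C), assuming A4 = 440 Hz."""
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    return midi % 12  # folding by 12 merges bands related by octaves


def pitch_class_strengths(peaks):
    """peaks: list of (frequency in Hz, peak magnitude) for one analysis window."""
    strengths = [0.0] * 12
    for freq_hz, magnitude in peaks:
        strengths[pitch_class_of(freq_hz)] += magnitude
    largest = max(strengths)
    if largest == 0:
        return strengths  # silent window: nothing to normalize
    # Step 1 from the slide: divide by the largest value.
    strengths = [s / largest for s in strengths]
    # Step 2 from the slide: divide by the sum, so the 12 values add up to 1.
    total = sum(strengths)
    return [s / total for s in strengths]


# Hypothetical peaks: C4, C5, E4 and G4 with made-up magnitudes.
peaks = [(261.63, 1.0), (523.25, 0.5), (329.63, 0.8), (392.00, 0.6)]
strengths = pitch_class_strengths(peaks)
strongest = strengths.index(max(strengths))
```

Here C4 and C5 fall into the same pitch class, so their peak values add up before normalization.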

  16. CEG Key-Finding Algorithm • Pitch detection • Extract pitch class and strength from signal • Key finding • Nearest Neighbor Search in Spiral Array

  17. CEG Algorithm For pitch class and strength from each 0.37 seconds: • Assign pitch-names to pitch classes: • Generate CE for previous 5 seconds; and • Assign pitch-names to current pitch-classes by nearest neighbor search in Spiral Array Space • Determine Key based on pitch names: • Generate the cumulative CE from beginning to current • Perform nearest-neighbor search to find closest key
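
Putting the two stages together, the following sketch processes one 12-value strength vector per 0.37 second window: it names each pitch class by nearest-neighbor search against the CE of roughly the previous 5 seconds, then finds the key closest to the cumulative CE. Only major keys are searched, the first window is spelled relative to C, and the weights, helix parameters, and names are assumptions rather than the authors' implementation.

```python
import math

R, H = 1.0, math.sqrt(2.0 / 15.0)      # assumed helix radius and rise per fifth
W = (0.5, 0.3, 0.2)                    # assumed weights for chords and keys
WINDOW_S, SPELL_HISTORY_S = 0.37, 5.0  # window and history lengths from the slide


def pitch(k):
    angle = k * math.pi / 2
    return (R * math.sin(angle), R * math.cos(angle), k * H)


def mix(points, weights):
    total = sum(weights)
    return tuple(sum(w * p[i] for p, w in zip(points, weights)) / total
                 for i in range(3))


def chord(k):
    return mix([pitch(k), pitch(k + 1), pitch(k + 4)], W)


def major_key(k):
    return mix([chord(k), chord(k + 1), chord(k - 1)], W)


def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def spell(pc, reference):
    """Line-of-fifths index for pitch class pc whose position is nearest to reference."""
    candidates = [k for k in range(-15, 20) if (7 * k) % 12 == pc]
    return min(candidates, key=lambda k: dist(pitch(k), reference))


def track_key(frames, key_indices=range(-5, 7)):
    """frames: one list of 12 pitch-class strengths per window. Returns a key index per window."""
    history, keys = [], []
    n_recent = int(SPELL_HISTORY_S / WINDOW_S)
    for frame in frames:
        recent = [note for window in history[-n_recent:] for note in window]
        if recent:
            reference = mix([pitch(k) for k, _ in recent], [s for _, s in recent])
        else:
            reference = pitch(0)  # assumption: spell the first window relative to C
        history.append([(spell(pc, reference), s) for pc, s in enumerate(frame) if s > 0])
        played = [note for window in history for note in window]
        if not played:
            keys.append(None)  # nothing heard yet
            continue
        ce = mix([pitch(k) for k, _ in played], [s for _, s in played])
        keys.append(min(key_indices, key=lambda k: dist(major_key(k), ce)))
    return keys


c_triad = [0.0] * 12
c_triad[0], c_triad[4], c_triad[7] = 1.0, 0.5, 0.5  # C, E, G
c_keys = track_key([c_triad] * 3)
```

Key index 0 corresponds to C major and 1 to G major. Recomputing the cumulative CE from the full history each window keeps the sketch short; a running weighted sum would avoid the repeated work.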

  18. Questions? Bibliography: • Chuan, C. and Chew, E. “Polyphonic Audio Key Finding Using the Spiral Array CEG Algorithm.” IEEE International Conference on Multimedia & Expo, 2005. • Chew, E. “Towards a Mathematical Model of Tonality.” Doctoral dissertation, MIT, 2000.
