
A System for Hybridizing Vocal Performance

By Kim Hang Lau


Presentation Transcript


  1. A System for Hybridizing Vocal Performance By Kim Hang Lau

  2. Parameters of the singing voice • Parameters of the singing voice can be loosely classified as: • Timbre • Pitch contour • Time contour (rhythm) • Amplitude envelope (projections)

  3. Vocal Modification • Vocal modification refers to the signal processing of live or recorded singing to achieve a different inflection and/or timbre • Commercially available units include • Intonation corrector • Pitch/formant processor • Harmonizer • Vocoder

  4. Objectives • Prototype a system for vocal modification • Modify a source vocal sample to match the time evolution, pitch contour and amplitude envelope of a similarly sung target vocal sample • This simulates a transfer of singing technique from a target vocalist to a source vocalist, hence a hybridized vocal performance

  5. Order of Presentation • System Overview • Individual components • System evaluation • System limitations • Conclusions and recommendations

  6. System Overview • Three components • Pitch-marking • Time-alignment • Time/pitch/amplitude modification engine • Inspired by Verhelst’s prototype system for the post-synchronization of speech utterances

  7. Targeted System Specifications

  8. Component No. 1: Pitch-marking

  9. Pitch-marking and Glottal Closure Instants (GCIs) • [Figure: pitch-marks P and P’ on a waveform, 5 ms time scale] • Information generated from pitch-marking • Pitch period • Amplitude envelope • Voiced/unvoiced segment boundaries
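As a rough illustration of how the three kinds of information listed on this slide could be read off a list of pitch-marks, here is a minimal sketch. The function names, the peak-based envelope, and the `max_period` gap rule for voiced/unvoiced boundaries are assumptions made for illustration, not the method used in the presented system.

```python
def pitch_periods(marks):
    """Distance in samples between consecutive pitch-marks."""
    return [b - a for a, b in zip(marks, marks[1:])]


def amplitude_envelope(x, marks):
    """Peak absolute amplitude inside each pitch period (assumed envelope measure)."""
    return [max(abs(v) for v in x[a:b]) for a, b in zip(marks, marks[1:])]


def voiced_segments(marks, max_period):
    """Group pitch-marks into voiced runs.

    Assumption: a gap longer than max_period samples means the voice
    stopped, so it ends one voiced segment and starts the next.
    """
    if not marks:
        return []
    segments = []
    start = prev = marks[0]
    for m in marks[1:]:
        if m - prev > max_period:
            segments.append((start, prev))
            start = m
        prev = m
    segments.append((start, prev))
    return segments


fs = 44100  # hypothetical sampling rate in Hz
marks = [0, 100, 200, 300, 1000, 1110, 1220]
periods = pitch_periods(marks)
f0_first = fs / periods[0]  # fundamental frequency of the first period
segments = voiced_segments(marks, max_period=400)
```

The period list still contains the 700-sample gap; in practice it would be discarded using the segment boundaries.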

  10. Pitch-marking applying Dyadic Wavelet Transform (DyWT) • Kadambe adapted Mallat’s algorithm for edge detection in images to the detection of GCIs in speech • The adaptation assumes a correspondence between edges in an image and GCIs in a speech signal • Computing the DyWT for dyadic scales 2^3 to 2^5 was sufficient for pitch-marking • If a peak detected in the DyWT matches across two consecutive scales, starting from a lower scale, that time instant is taken as a GCI
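The cross-scale decision rule in the last bullet can be sketched as follows. This covers only the matching step: the per-scale peak positions are assumed to have been found already, and the tolerance `tol` (in samples) is a hypothetical parameter, since the slide does not say how closely peaks must coincide.

```python
def match_gcis(peaks_by_scale, tol=2):
    """Accept a peak as a GCI if it recurs in the next scale up.

    peaks_by_scale: list of sorted peak positions (sample indices),
    ordered from the lowest scale to the highest.
    tol: assumed maximum offset, in samples, for two peaks to "match".
    """
    gcis = []
    for lower, upper in zip(peaks_by_scale, peaks_by_scale[1:]):
        for p in lower:
            matched = any(abs(p - q) <= tol for q in upper)
            already_found = any(abs(p - g) <= tol for g in gcis)
            if matched and not already_found:
                gcis.append(p)
    return sorted(gcis)


# Hypothetical peak positions for three consecutive scales
peaks = [[10, 50, 90, 120], [11, 52, 91], [12, 91, 130]]
gcis = match_gcis(peaks)
```

Peaks at 120 and 130 appear in one scale only and are rejected.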

  11. [Figure: Mallat vs. Kadambe; original signal, dyadic scales 2^1 to 2^5, base-band]

  12. The proposed pitch-marking scheme • Detection principle • Detection of the scale that contains the fundamental period • Starting from a higher scale (of lower frequency), there is a considerable jump in frame power when this scale is encountered • Features • 4X decimation to support high sampling rates • Frame based processing and error correction for possible quasi-real-time detection
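A minimal sketch of the detection principle, assuming the wavelet outputs for one frame are already available per scale. The `jump_ratio` threshold and the dictionary layout are hypothetical; the slides do not state how large a "considerable jump" is.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame."""
    return sum(v * v for v in frame) / len(frame)


def find_fundamental_scale(frames_by_scale, jump_ratio=10.0):
    """Scan from the highest scale down and return the first scale whose
    frame power exceeds the previous scale's power by jump_ratio.

    frames_by_scale: {scale exponent: frame samples at that scale}.
    Returns None when no jump is found.
    """
    scales = sorted(frames_by_scale, reverse=True)
    prev = frame_power(frames_by_scale[scales[0]])
    for s in scales[1:]:
        cur = frame_power(frames_by_scale[s])
        if cur > jump_ratio * max(prev, 1e-12):  # guard against zero power
            return s
        prev = cur
    return None


# Hypothetical frame: little energy at scales 2^5 and 2^4, a jump at 2^3
frames = {
    5: [0.01] * 8,
    4: [0.02, -0.02] * 4,
    3: [1.0, -1.0] * 4,
    2: [1.2, -1.2] * 4,
}
scale = find_fundamental_scale(frames)
```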

  13. The proposed pitch-marking system

  14. Comparison of results with Auto-Tune • [Figure panels: Proposed system; Auto-Tune]

  15. Component No. 2: The Modification Engine

  16. (n) (n) (n) D(n) Time/pitch/amplitude modification engine (n): time-modification factor (n): pitch-modification factor (n): amplitude modification factor D(n): time-warping function

  17. TD-PSOLA (Time-Domain Pitch-Synchronous Overlap-Add) • Time-domain splicing overlap-add method • Used in prosodic modification of speech
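To make the overlap-add idea concrete, here is a simplified TD-PSOLA sketch with constant modification factors. The names `alpha` (time scale) and `beta` (pitch scale) are assumptions for illustration, and the sketch omits the amplitude factor, time-varying factors and output normalisation that a full engine would need.

```python
import math


def td_psola(x, marks, alpha=1.0, beta=1.0):
    """Simplified TD-PSOLA with constant factors.

    x: input samples; marks: sorted analysis pitch-marks (at least two).
    alpha: assumed time-scale factor (>1 lengthens the output).
    beta: assumed pitch-scale factor (>1 raises the pitch).
    """
    out = [0.0] * int(len(x) * alpha)
    t_syn = marks[0] * alpha       # position of the next synthesis mark
    end = marks[-1] * alpha
    while t_syn <= end:
        t_ana = t_syn / alpha      # corresponding time in the input
        # nearest analysis pitch-mark supplies the grain
        i = min(range(len(marks)), key=lambda k: abs(marks[k] - t_ana))
        j = min(i, len(marks) - 2)
        period = marks[j + 1] - marks[j]
        centre = int(round(t_syn))
        # two-period Hann-windowed grain, overlap-added at the synthesis mark
        for k in range(-period, period + 1):
            src, dst = marks[i] + k, centre + k
            if 0 <= src < len(x) and 0 <= dst < len(out):
                w = 0.5 + 0.5 * math.cos(math.pi * k / period)
                out[dst] += w * x[src]
        t_syn += period / beta     # closer marks give a higher pitch
    return out


# Hypothetical test tone with a 50-sample period and a mark every period
x = [math.sin(2 * math.pi * n / 50) for n in range(500)]
marks = list(range(50, 451, 50))
same = td_psola(x, marks)               # no modification
longer = td_psola(x, marks, alpha=2.0)  # twice as long
```

With both factors at 1 and a constant period, the Hann grains sum to one between the first and last marks, so the input is reproduced there.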

  18. Evaluation of the modification engine • [Figure panels: Original; TD-PSOLA; Auto-Tune]

  19. Component No. 3: Time-alignment

  20. Time-alignment • Based on Verhelst’s prototype system, which applies Dynamic Time Warping (DTW) • Verhelst claimed that the basic local constraint produces the most accurate time-warping path • Computation grows rapidly as the length of comparison increases (roughly with the square of the length for standard DTW) • Accuracy deteriorates as the length of comparison increases
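A minimal DTW sketch follows. It assumes the "basic local constraint" means the common three-predecessor rule (diagonal, vertical and horizontal steps) and uses scalar features with an absolute-difference distance; the actual features and distance used in the system are not given in the slides.

```python
def dtw_path(src, tgt):
    """Align two scalar feature sequences with DTW.

    Assumed local constraint: cell (i, j) may be reached from
    (i-1, j-1), (i-1, j) or (i, j-1).
    Returns (total cost, warping path as a list of (i, j) pairs).
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    cost[0][0] = abs(src[0] - tgt[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                cost[i - 1][j - 1] if i and j else inf,
                cost[i - 1][j] if i else inf,
                cost[i][j - 1] if j else inf,
            )
            cost[i][j] = abs(src[i] - tgt[j]) + best
    # Backtrack from the end of both sequences to the start
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        steps = []
        if i and j:
            steps.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i:
            steps.append((cost[i - 1][j], i - 1, j))
        if j:
            steps.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(steps)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path


# The target holds the second value one frame longer than the source
total, path = dtw_path([0, 1, 2, 3], [0, 1, 1, 2, 3])
```

The nested loops fill an n-by-m table, which is why the cost of the comparison climbs quickly with segment length.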

  21. Adaptations from Verhelst’s method • Time-alignment is performed on a voiced/unvoiced segmental basis • DTW for voiced segments • Linear Time Warping (LTW) for unvoiced segments • Global constraints are introduced to further reduce computation • Synchronization of voiced/unvoiced segments is required; it is edited manually in the current implementation
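The two adaptations can be sketched as follows. Linear time warping is a straight-line mapping between segment lengths. For the global constraint, the sketch assumes a fixed-width band around the diagonal, which is one common choice; the slides do not say which global constraint was used, and `width` is a hypothetical parameter.

```python
def ltw_path(n, m):
    """Linear time warping: map n source frames evenly onto m target frames."""
    if n == 1:
        return [(0, 0)]
    return [(i, round(i * (m - 1) / (n - 1))) for i in range(n)]


def in_band(i, j, n, m, width):
    """Assumed global constraint: keep cell (i, j) only if it lies within
    `width` frames of the straight diagonal of an n-by-m DTW table."""
    return abs(j - i * (m - 1) / max(n - 1, 1)) <= width


unvoiced = ltw_path(5, 9)  # stretch a 5-frame segment onto 9 frames
```

A DTW search restricted to cells where `in_band` is true evaluates far fewer cells than the full table.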

  22. Manipulation of modification parameters • The modification parameters are smoothed with linear-phase FIR low-pass filters before being fed to the modification engine
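A sketch of this kind of smoothing, assuming a short symmetric Hann-shaped filter. The tap count and window shape are illustrative choices, not the filters used in the prototype. Symmetric taps give linear phase, so the constant delay can be removed by centring the filter on each sample.

```python
import math


def smooth(values, taps=9):
    """Smooth a parameter track with a symmetric FIR low-pass.

    taps: assumed odd filter length. The filter is centred on each
    sample, which compensates the (taps - 1) / 2 sample delay.
    """
    # Hann-shaped symmetric taps, normalised to unit gain at DC
    h = [0.5 - 0.5 * math.cos(2 * math.pi * (k + 1) / (taps + 1))
         for k in range(taps)]
    total = sum(h)
    h = [c / total for c in h]
    half = taps // 2
    out = []
    for n in range(len(values)):
        acc = 0.0
        for k, c in enumerate(h):
            # hold the edge values beyond the ends of the track
            idx = min(max(n + k - half, 0), len(values) - 1)
            acc += c * values[idx]
        out.append(acc)
    return out


# Hypothetical pitch-modification track with a one-sample outlier
track = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0]
smoothed = smooth(track, taps=5)
```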

  23. The Prototype System

  24. System Evaluation: case 1

  25. System Evaluation: case 2

  26. System Limitations • Segmentation • Lack of a reliable technique for voiced/unvoiced segmentation • Segmentation and classification of different vocal sounds is the key to devising rules for modification • Modification engine • Lacks the capability to handle pitch transitions • Depends entirely on the pitch-marking stage

  27. System Limitations • Pitch-marking • Proposed system lacks robustness • Despite the desirable time response of the wavelet filter bank, its frequency response cannot isolate harmonics effectively and efficiently • Time-alignment • The DTW basic local constraint allows unbounded time expansion and compression • This often causes distortions in the synthesized vocal sample

  28. Conclusions and Recommendations • The current system works well for slow and continuous singing • Further improvements to the individual components are recommended to handle greater dynamic changes in the vocal signal, thereby extending the current good results to a wider range of singing styles

  29. Questions & Answers

  30. Wavelet filter bank

  31. Dyadic Spline Wavelet

  32. Wide-band analysis

  33. DTW local constraints

  34. Calculation of pitch-marks

  35. DyWT
