
A Spectral-Temporal Method for Pitch Tracking

Stephen A. Zahorian*, Princy Dikshit, Hongbing Hu*. Department of Electrical and Computer Engineering, Old Dominion University, Norfolk, VA 23529, USA. * Currently at Binghamton University. 09/17/2006.



Presentation Transcript


  1. A Spectral-Temporal Method for Pitch Tracking • Stephen A. Zahorian*, Princy Dikshit, Hongbing Hu* • Department of Electrical and Computer Engineering, Old Dominion University, Norfolk, VA 23529, USA • * Currently at Binghamton University • 09/17/2006

  2. Outline • Introduction • Algorithm • Algorithm overview • The use of nonlinear processing • Pitch tracking from the spectrum • Experimental evaluation • Conclusion

  3. Introduction • Pitch (the fundamental frequency) applications • Automatic speech recognition (ASR), speech synthesis, speech articulation training aids, etc. • Pitch detection algorithms • “Robust and accurate fundamental frequency estimation based on dominant harmonic components,” Nakatani et al. => High accuracy for noisy speech reported using the harmonic dominance spectrum • “Yet another algorithm for pitch tracking (YAAPT),” Zahorian et al. => Hybrid spectral-temporal processing for pitch tracking

  4. Algorithm Overview

  5. The Use of Nonlinear Processing • Restoration of the missing fundamental in telephone speech • A periodic sound is characterized by the spectrum of its harmonics • A signal whose fundamental is missing can be approximated as a sum of its higher harmonics • After squaring and applying trigonometric identities, the fundamental reappears • Figure labels: fundamental, 1st harmonic, 2nd harmonic
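The effect of squaring can be checked numerically. The sketch below is an illustration only, not the authors' code: it assumes a toy signal made of two equal-amplitude cosines at 2·F0 and 3·F0 (so the fundamental F0 is absent), and the names `tone_magnitude`, `SAMPLE_RATE`, `F0`, and `N` are hypothetical. By the product-to-sum identity, the cross term of the squared signal contains a cosine at the difference frequency 3·F0 − 2·F0 = F0.

```python
import math


def tone_magnitude(signal, freq_hz, sample_rate):
    """Amplitude of the component of `signal` at `freq_hz` (single-bin DFT projection)."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / sample_rate)
             for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / sample_rate)
             for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


SAMPLE_RATE = 8000   # Hz, assumed telephone-style sampling rate
F0 = 100.0           # Hz, assumed fundamental
N = 800              # 0.1 s, an integer number of F0 periods

# Toy "telephone" signal: components at 2*F0 and 3*F0 only, no energy at F0.
signal = [math.cos(2 * math.pi * 2 * F0 * n / SAMPLE_RATE)
          + math.cos(2 * math.pi * 3 * F0 * n / SAMPLE_RATE)
          for n in range(N)]

# Nonlinear processing: square each sample.
squared = [x * x for x in signal]

before = tone_magnitude(signal, F0, SAMPLE_RATE)   # essentially zero
after = tone_magnitude(squared, F0, SAMPLE_RATE)   # close to 1.0
print(f"F0 amplitude before squaring: {before:.6f}, after: {after:.6f}")
```

Squaring also creates a DC offset and components at higher frequencies, so a real front end would likely filter the squared signal; that step is outside this sketch.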

  6. Illustration of Nonlinear Processing • The telephone speech signal (top panel) and squared telephone signal (bottom panel) for one frame

  7. Illustration of Nonlinear Processing • The magnitude spectra of the telephone signal (top panel) and the nonlinearly processed signal (bottom panel)

  8. Spectral Effects from Nonlinear Processing • The missing fundamental in the telephone speech (top panel) is restored in the squared signal (bottom panel)

  9. Pitch Tracking From the Spectrum • The pitch track from the spectrum refines the pitch candidates estimated by the temporal method • To obtain a noise-robust pitch track from the spectrum, an autocorrelation-type function is proposed

  10. Autocorrelation Type of Function • The function takes into account multiple harmonics (figure labels: k, 2k, 3k, 4k, WL) • Equation terms: k, the frequency index; the spectrum; the number of harmonics (3); WL, the window length (20 Hz)
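The equation on this slide did not survive in the transcript, so the sketch below is only one plausible reading of the labels k, 2k, 3k, 4k, and WL: for each candidate frequency index k, multiply the spectrum values at k and its integer multiples, and sum that product over a small window of offsets. The function name `harmonic_correlation`, its parameters, and the window expressed in bins rather than Hz are all assumptions made for illustration.

```python
def harmonic_correlation(spectrum, k, num_harmonics=3, window_bins=5):
    """Sum over a window of offsets of the product of spectrum values
    at k, 2k, ..., (num_harmonics + 1) * k (assumed form, see lead-in)."""
    half = window_bins // 2
    total = 0.0
    for offset in range(-half, half + 1):
        product = 1.0
        for r in range(1, num_harmonics + 2):
            idx = r * k + offset
            if idx < 0 or idx >= len(spectrum):
                product = 0.0   # index outside the spectrum contributes nothing
                break
            product *= spectrum[idx]
        total += product
    return total


# Synthetic magnitude spectrum: a low floor with peaks at bins 20, 40, 60, 80,
# i.e. a fundamental at bin 20 plus three harmonics.
spectrum = [0.01] * 200
for harmonic_bin in (20, 40, 60, 80):
    spectrum[harmonic_bin] = 1.0

scores = {k: harmonic_correlation(spectrum, k) for k in range(10, 31)}
best_k = max(scores, key=scores.get)
print("best candidate bin:", best_k)
```

Because the peaks at all four multiples must line up for the product to be large, an isolated noise peak at a single bin contributes little, which is consistent with the noise robustness the slides claim for the function.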

  11. Peaks in Autocorrelation Type of Function • A very prominent peak is observed in the proposed function

  12. Candidate Insertion to Reduce Pitch Doubling/Halving • If all candidates are larger than a threshold (typically 150 Hz), an additional candidate is inserted at half the frequency of the highest-ranking candidate: P2(Hz) = P1(Hz)/2 • Similar logic is used to reduce pitch halving
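The doubling rule on this slide can be sketched in a few lines. This is a minimal illustration, assuming the candidate list is ordered best first; the function name `insert_half_frequency_candidate` and the choice to append the new candidate at the end of the list are hypothetical. The halving counterpart is not shown because the slide does not give its threshold.

```python
def insert_half_frequency_candidate(candidates_hz, threshold_hz=150.0):
    """Return a new candidate list; if every candidate exceeds the threshold,
    add one at half the frequency of the highest-ranking (first) candidate."""
    result = list(candidates_hz)
    if result and all(c > threshold_hz for c in result):
        result.append(result[0] / 2.0)   # P2 = P1 / 2
    return result


print(insert_half_frequency_candidate([220.0, 440.0]))  # all above 150 Hz
print(insert_half_frequency_candidate([120.0, 240.0]))  # one at or below 150 Hz
```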

  13. Experimental Evaluation • Database • Keele pitch extraction database • 5 male and 5 female speakers, about 35 seconds per speaker • High quality speech and telephone speech • Additive Gaussian noise • Controls (reference pitch) • Control C1: supplied in Keele database • Control C2: computed from the laryngograph signal with the proposed algorithm

  14. Definition of Error Measures • Gross error • The percentage of frames such that the pitch estimate of the tracker deviates significantly (typically 20%) from the reference pitch (control) • Only evaluated in the voiced sections of the reference
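The gross error measure defined above can be computed as follows. This is a sketch under stated assumptions: pitch tracks are per-frame lists in Hz, unvoiced reference frames are marked with 0, and the function name `gross_error_percent` is hypothetical.

```python
def gross_error_percent(estimate_hz, reference_hz, tolerance=0.20):
    """Percentage of voiced reference frames where the estimate deviates
    from the reference by more than `tolerance` (as a fraction of the reference)."""
    # Only frames that are voiced in the reference (assumed: value > 0) are scored.
    voiced = [(e, r) for e, r in zip(estimate_hz, reference_hz) if r > 0]
    if not voiced:
        return 0.0
    errors = sum(1 for e, r in voiced if abs(e - r) > tolerance * r)
    return 100.0 * errors / len(voiced)


reference = [0.0, 100.0, 100.0, 200.0, 0.0]   # first and last frames unvoiced
estimate = [50.0, 100.0, 130.0, 210.0, 0.0]   # 130 Hz is 30% off its reference
print(f"gross error: {gross_error_percent(estimate, reference):.2f}%")
```

In the example, the estimate in the first frame is ignored because the reference is unvoiced there, so one of three voiced frames counts as a gross error.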

  15. Experiment 1 Results • Individual performance of the proposed algorithm • YAAPT*: using control C1 for the spectral pitch track • NCCF: normalized cross-correlation function, used as the temporal method in YAAPT
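For readers unfamiliar with the NCCF, the sketch below uses its common textbook definition: the cross-correlation of a frame with a lagged copy of itself, normalized by the energies of both segments. It is not taken from the slides, and the function name `nccf` and the frame parameters are assumptions.

```python
import math


def nccf(signal, lag, frame_len, start=0):
    """Normalized cross-correlation between signal[start:start+frame_len]
    and the same-length segment shifted by `lag` samples."""
    a = signal[start:start + frame_len]
    b = signal[start + lag:start + lag + frame_len]
    if len(a) < frame_len or len(b) < frame_len:
        raise ValueError("signal too short for this lag and frame length")
    energy_a = sum(x * x for x in a)
    energy_b = sum(x * x for x in b)
    if energy_a == 0.0 or energy_b == 0.0:
        return 0.0   # silent segment: correlation is undefined, report 0
    cross = sum(x * y for x, y in zip(a, b))
    return cross / math.sqrt(energy_a * energy_b)


SAMPLE_RATE = 8000
PITCH_HZ = 100.0                         # period of 80 samples at 8 kHz
tone = [math.sin(2 * math.pi * PITCH_HZ * n / SAMPLE_RATE) for n in range(300)]

at_period = nccf(tone, lag=80, frame_len=160)       # close to +1
at_half_period = nccf(tone, lag=40, frame_len=160)  # close to -1
print(f"NCCF at one period: {at_period:.3f}, at half a period: {at_half_period:.3f}")
```

A temporal pitch tracker would typically evaluate this over a range of lags and treat the lags with the highest values as pitch period candidates.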

  16. Experiment 2 Results • The results of the new method with various error thresholds

  17. Comparisons • DASH, REPS, YIN: the results are reported in “Robust and accurate fundamental frequency estimation ... ,” Nakatani et al. • *: telephone speech simulated with the SRAEN filter

  18. Conclusion • A new pitch-tracking algorithm has been developed that combines multiple information sources to enable accurate, robust F0 tracking • An analysis of errors indicates better performance for both high quality and telephone speech than previously reported for pitch tracking • Acknowledgements • This work was partially supported by JWFC 900
