
Image Processing Algorithm for Speech Acoustics

This presentation describes an algorithm, written in MATLAB, for automating the tracing of articulators in cineradiographic (x-ray) images of the vocal tract, with a focus on the tongue. The algorithm combines techniques from the EdgeTrak system, the Chan-Vese algorithm, and snakes (active contour models). Two tongue results are reported, with snake energy changing by roughly 101% and 97% between the initial and final contours, and the method is also applied to other articulators. Future work includes applying the model to consecutive frames and using EdgeTrak's intensity-based external energy.


Presentation Transcript


  1. Image Processing Algorithm for Speech Acoustics • Erin Plasse • Advisors: Professor Hanson, Professor Rudko

  2. Introduction • Experiment conducted in the 1960s by Kenneth Stevens and Dr. Sven Öhman in Sweden • Used cineradiography (x-ray motion pictures) to take lateral images of the vocal tract • 31 utterances and 2 sentences were recorded • Analyzed how the articulators are displaced over time • 45 frames/second

  3. Movie Clip

  4. Image Processing • Perkell (1969) used manual methods to make tracings of the images

  5. Typical Tracing Perkell (1969) used manual

  6. Typical Analysis Perkell (1969) used manual measures

  7. Goals & Parameters • Design an algorithm in MATLAB to automate the tracings using edge detection methods • Trace specific articulators, such as the lips, velum, epiglottis, and hyoid bone • Results should be similar to the original tracings • Perkell (1969) analyzed only 13 utterances • Obtain tracings for the 20 utterances not analyzed by Perkell (1969) • Manual extraction is time-consuming • Curves should be smooth and continuous

  8. Design Alternatives • Snakes: Active Contour Models • MATLAB script written by Eric Debreuve • Chan-Vese Region-Based Segmentation Algorithm • MATLAB script written by Shawn Lankton • EdgeTrak System for ultrasound images • VIMS Lab, University of Delaware • Customize one of the above to create a design suited to the data

  9. Snakes: Active Contour Models • Michael Kass • Snake: energy-minimizing spline guided by external forces • Image forces pull it toward lines and edges • MATLAB code written by Eric Debreuve • Only worked with binary images

  10. Chan-Vese Algorithm • Region-based segmentation • Uses homogeneity of intensity within a region as the constraint • Only applicable to closed contours • Uses an initial mask region • MATLAB script written by Shawn Lankton
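
The homogeneity constraint on this slide can be illustrated with the fitting term of the Chan-Vese energy: the summed squared deviation of pixel intensities from the mean inside the mask and from the mean outside it. The following is a minimal sketch in plain Python, not Shawn Lankton's script; the function name and the toy image are assumptions made for illustration, and the length and area regularization terms of the full model are omitted.

```python
def chan_vese_fitting_energy(image, mask):
    """Return (c_in, c_out, energy) for a 2-D image and a same-shaped boolean mask.

    c_in and c_out are the mean intensities inside and outside the mask;
    energy is the summed squared deviation from those two means.
    """
    inside, outside = [], []
    for img_row, mask_row in zip(image, mask):
        for pixel, is_inside in zip(img_row, mask_row):
            (inside if is_inside else outside).append(pixel)
    if not inside or not outside:
        raise ValueError("mask must leave pixels both inside and outside")
    c_in = sum(inside) / len(inside)
    c_out = sum(outside) / len(outside)
    energy = (sum((p - c_in) ** 2 for p in inside)
              + sum((p - c_out) ** 2 for p in outside))
    return c_in, c_out, energy


# Toy image (hypothetical): dark left half, bright right half.
image = [[10, 10, 200, 200],
         [10, 10, 200, 200]]
good_mask = [[False, False, True, True]] * 2   # matches the bright region
bad_mask = [[False, True, True, True]] * 2     # includes one dark column

good = chan_vese_fitting_energy(image, good_mask)
bad = chan_vese_fitting_energy(image, bad_mask)
```

A mask that matches a homogeneous region gives zero fitting energy, while a mask that mixes dark and bright pixels gives a positive value, which is what drives the contour toward region boundaries.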

  11. Pharynx using Chan-Vese

  12. EdgeTrak System • Li, Kambhamettu, Stone • Uses gradient image forces and intensity information in local regions • Energy definition for snakes: • $E_{\text{total}} = \alpha E_{\text{int}} + \beta E_{\text{ext}}$ • Energy band gap • External energy is redefined for EdgeTrak as: • $E'_{\text{ext}}(v_i) = E_{\text{band}}(v_i) \cdot E_{\text{ext}}(v_i)$ • Not effective for closed contours • Good for tracking the tongue in noisy images with high-contrast unrelated edges
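
The two energy definitions on this slide can be sketched as follows. This is an illustration under stated assumptions, not the EdgeTrak implementation: the internal energy is taken to be a squared second difference (a common discrete bending term), the external energy and band weight are supplied by the caller, and all function names are hypothetical.

```python
def curvature_energy(prev_pt, pt, next_pt):
    """Squared second difference at pt: an assumed discrete internal energy."""
    ddx = prev_pt[0] - 2 * pt[0] + next_pt[0]
    ddy = prev_pt[1] - 2 * pt[1] + next_pt[1]
    return ddx ** 2 + ddy ** 2


def snake_energy(points, ext_energy, alpha, beta, band_weight=None):
    """Sum alpha * E_int + beta * E_ext over the interior snake elements.

    If band_weight is given, the external term becomes
    E_band(v_i) * E_ext(v_i), as in the redefinition on the slide.
    """
    total = 0.0
    for i in range(1, len(points) - 1):
        e_int = curvature_energy(points[i - 1], points[i], points[i + 1])
        e_ext = ext_energy(points[i])
        if band_weight is not None:
            e_ext = band_weight(points[i]) * e_ext
        total += alpha * e_int + beta * e_ext
    return total


def on_edge(point):
    # Hypothetical external energy: every point sits on an equally strong edge.
    return -1.0


straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
bent = [(0, 0), (1, 2), (2, 0), (3, 0)]

e_straight = snake_energy(straight, on_edge, alpha=0.2, beta=0.8)
e_bent = snake_energy(bent, on_edge, alpha=0.2, beta=0.8)
e_banded = snake_energy(straight, on_edge, alpha=0.2, beta=0.8,
                        band_weight=lambda p: 0.5)
```

With equal edge strength everywhere, the bent contour has higher total energy than the straight one, so the internal term alone favors smooth curves.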

  13. Energy Minimization Band • The main contribution of the EdgeTrak method; it uses the intensity of the regions around the contour • Energy band regions are found around each snake element • Find the mean intensity difference between the regions • Find the new external energy using the band energy • Minimize the total energy using dynamic programming
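
The last step on this slide, minimizing the total energy with dynamic programming, can be sketched as a shortest-path search over candidate positions for each snake element. This is a simplified illustration, not the EdgeTrak code: it assumes a first-order internal energy (squared distance between neighboring elements) so that each stage depends only on the previous one, and the names `minimize_snake_dp`, `candidates`, and `ext_energy` are hypothetical.

```python
def minimize_snake_dp(candidates, ext_energy, alpha, beta):
    """Choose one position per snake element so that the total energy is minimal.

    candidates[i] lists the allowed (x, y) positions for element i.
    Returns (best_energy, best_path).
    """
    def pair_cost(p, q):
        # Assumed first-order internal energy between neighboring elements.
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    # cost[k]: lowest energy of a partial contour ending at candidates[i][k].
    cost = [beta * ext_energy(p) for p in candidates[0]]
    back = []  # back[i - 1][k]: best predecessor index for candidates[i][k]
    for i in range(1, len(candidates)):
        prev = candidates[i - 1]
        new_cost, pointers = [], []
        for q in candidates[i]:
            options = [cost[k] + alpha * pair_cost(prev[k], q)
                       for k in range(len(prev))]
            best_k = min(range(len(prev)), key=options.__getitem__)
            new_cost.append(options[best_k] + beta * ext_energy(q))
            pointers.append(best_k)
        cost = new_cost
        back.append(pointers)

    # Trace the best path backwards from the cheapest final state.
    k = min(range(len(cost)), key=cost.__getitem__)
    best_energy = cost[k]
    path = [candidates[-1][k]]
    for i in range(len(back) - 1, -1, -1):
        k = back[i][k]
        path.append(candidates[i][k])
    path.reverse()
    return best_energy, path


# Toy setup (hypothetical): three elements, each allowed rows 0..2,
# with an edge (low external energy) along row 1.
candidates = [[(x, y) for y in range(3)] for x in range(3)]
edge_row = lambda p: -1.0 if p[1] == 1 else 0.0

best_energy, best_path = minimize_snake_dp(candidates, edge_row,
                                           alpha=0.2, beta=0.8)
```

Unlike a point-by-point search, dynamic programming evaluates all combinations of candidates implicitly and returns the globally lowest-energy contour within the candidate set.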

  14. EdgeTrak Program

  15. The Final Design • Combines methods from the EdgeTrak System, Chan-Vese, and snakes • Implemented in MATLAB • Uses only the image gradient to find edges • Focuses on the tongue

  16. MATLAB Code • User picks 5 points • 33 snake elements are found using spline interpolation • Computes the internal and external energy of the initial snake elements • Computes the internal and external energies of the points surrounding each initial point • Finds the surrounding point with the lowest energy; this becomes the new point • New contour is graphed
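
The neighborhood search in the steps above can be sketched as one greedy pass over the contour. This is an illustration in Python rather than the author's MATLAB code, under several assumptions: the endpoints stay fixed, the search window is a square of hypothetical size `radius`, and the energy function (curvature plus distance from a known edge row) is a stand-in for the gradient-based energy used in the presentation.

```python
def greedy_snake_step(points, energy_at, radius=1):
    """Move each interior point to the lowest-energy position in its window.

    energy_at(prev_pt, candidate, next_pt) returns the energy of placing the
    current element at candidate. Endpoints are left fixed (an assumption).
    """
    new_points = list(points)
    for i in range(1, len(points) - 1):
        x, y = new_points[i]
        window = [(x + dx, y + dy)
                  for dx in range(-radius, radius + 1)
                  for dy in range(-radius, radius + 1)]
        prev_pt, next_pt = new_points[i - 1], new_points[i + 1]
        new_points[i] = min(window,
                            key=lambda p: energy_at(prev_pt, p, next_pt))
    return new_points


ALPHA, BETA = 0.1, 0.8  # illustrative weights, not the values from the slides


def toy_energy(prev_pt, pt, next_pt):
    # Internal term: squared second difference (bending).
    ddx = prev_pt[0] - 2 * pt[0] + next_pt[0]
    ddy = prev_pt[1] - 2 * pt[1] + next_pt[1]
    e_int = ddx ** 2 + ddy ** 2
    # External term: distance from a hypothetical edge along row y = 2.
    e_ext = abs(pt[1] - 2)
    return ALPHA * e_int + BETA * e_ext


initial = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
after_one_step = greedy_snake_step(initial, toy_energy)
```

Each pass can move a point by at most `radius` pixels, so the step would normally be repeated until the contour stops changing.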

  17. MATLAB Code Demo • %Edge_trak_demo • %Coded by Erin Plasse

  18. Final Results • Alpha = 0.2, Beta = 0.8, Delta = 5 • Results: • Energy of original snake = -96.9553 • Energy of new snake = 1.2244 • Percent change in snake energy = 101.2629

  19. E_snake_orig = -21.8775 • E_snake_new = -0.5480 • Percent_change_Snake_energy = 97.495
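
The slides do not state how the percent change is computed. The formula (E_orig - E_new) / E_orig × 100 reproduces both reported values to the printed precision, so the sketch below assumes it; the function name is hypothetical.

```python
def percent_change(e_orig, e_new):
    """Percent change relative to the original energy (assumed formula)."""
    if e_orig == 0:
        raise ValueError("original energy must be non-zero")
    return (e_orig - e_new) / e_orig * 100.0


# Values reported on slides 18 and 19.
change_slide_18 = percent_change(-96.9553, 1.2244)
change_slide_19 = percent_change(-21.8775, -0.5480)
```

A value above 100 arises on slide 18 because the energy changes sign between the original and new snake.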

  20. Initial Points • Final Points

  21. Application to other articulators

  22. Cont.

  23. Future Work • Apply the contour model to a sequence of consecutive frames • Trace more articulators • Use the intensity-based method for external energy described in the EdgeTrak program

  24. References • Perkell, J. S. (1969). Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Cambridge, MA: The MIT Press. • Stevens, K., and Öhman, S. (1963). “Cineradiographic studies of speech: procedures and objectives.” J. Acoust. Soc. Am., 35, 1889. • Kass, M., Witkin, A., and Terzopoulos, D. (1988). “Snakes: Active contour models.” Int. J. Comput. Vis., vol. 1, pp. 321-331. • Chan, T. F., and Vese, L. A. (2001). “Active contours without edges.” IEEE Trans. Image Process., vol. 10, pp. 266-277. • Li, M., Kambhamettu, C., and Stone, M. (2004). “Automatic contour tracking in ultrasound images.”

  25. Questions?
