
MLP speech enabled systems applications

AI: Neural Networks, lecture 4. Tony Allen, School of Computing & Informatics, Nottingham Trent University. MLP speech enabled systems applications.



Presentation Transcript


  1. AI: Neural Networks, lecture 4. Tony Allen, School of Computing & Informatics, Nottingham Trent University

  2. MLP speech enabled systems applications
     • M. Nakamura, K. Maruyama, T. Kawabata & K. Shikano; “Neural Network Approach to Word Category Prediction for English Texts”; Procs. of the Int. Conference on Computational Linguistics, 1990, Helsinki, pp. 213–218.
     • H. Schmid; “Part-of-Speech Tagging with Neural Networks”; Procs. of the Int. Conference on Computational Linguistics, 1994, pp. 172–176.
     • Norman Poh & Jerzy Korczak; “Hybrid Biometric Person Authentication Using Face and Voice Features”; Procs. of the 3rd Int. Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA 2001), Sweden, June 2001, pp. 348–353.

  3. Part-of-Speech Tag Prediction • In the bigram neural network predictor, the input vector is the 1-of-89 (89-bit, one bit set) encoding of the POS tag of the previous word. • The output vector is the 1-of-89 encoding of the POS tag of the next word.
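The 1-of-89 coding and the shape of a bigram predictor can be sketched as follows. This is only an illustration: the hidden-layer size (16), the sigmoid activations and the random, untrained weights are assumptions and are not taken from the slides.

```python
import math
import random

NUM_TAGS = 89  # tagset size stated on the slide

def one_hot(tag_index, size=NUM_TAGS):
    """Return a 1-of-N vector with a single 1.0 at tag_index."""
    vec = [0.0] * size
    vec[tag_index] = 1.0
    return vec

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(weights, inputs):
    """One fully connected layer with sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def predict_next_tag(prev_tag, w_hidden, w_out):
    """Forward pass: previous tag (one-hot) in, one score per candidate next tag out."""
    return layer(w_out, layer(w_hidden, one_hot(prev_tag)))

# Untrained random weights, used here only to show the vector shapes.
random.seed(0)
HIDDEN = 16  # assumed hidden-layer size
w_hidden = [[random.gauss(0, 1) for _ in range(NUM_TAGS)] for _ in range(HIDDEN)]
w_out = [[random.gauss(0, 1) for _ in range(HIDDEN)] for _ in range(NUM_TAGS)]
scores = predict_next_tag(5, w_hidden, w_out)
```

After training, the index of the largest value in `scores` would be read as the predicted tag of the next word.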

  4. Part-of-Speech Tag Prediction: N-Gram

  5. Part-of-Speech Tag Prediction: Results • Prediction accuracy increases as more of the output neurons are included within the output classification, i.e. when a prediction counts as correct if the true tag is among the n most active outputs.
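One way to read this result is as a top-n accuracy, where a prediction counts as correct when the true tag is among the n highest-scoring output neurons. The sketch below assumes that reading; the function names and the three toy examples are hypothetical and are not data from the paper.

```python
def top_n_hit(scores, true_tag, n):
    """True if true_tag is among the n highest-scoring outputs."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_tag in ranked[:n]

def top_n_accuracy(examples, n):
    """Fraction of (scores, true_tag) pairs that are top-n hits."""
    hits = sum(top_n_hit(scores, tag, n) for scores, tag in examples)
    return hits / len(examples)

# Toy outputs over a 3-tag tagset (made up for illustration).
examples = [
    ([0.1, 0.7, 0.2], 1),  # true tag ranked first
    ([0.5, 0.3, 0.2], 1),  # true tag ranked second
    ([0.6, 0.3, 0.1], 2),  # true tag ranked third
]
accuracies = [top_n_accuracy(examples, n) for n in (1, 2, 3)]
```

Accuracy can only stay the same or rise as n grows, because every top-n hit is also a top-(n+1) hit.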

  6. Part-of-Speech Tag Prediction: Application • Neural Network predictor used to improve speech recognition results by 6%. • HMM recogniser produces 10 potential word/tag candidates for each recognition event. Netgram predictor output used to select best word/tag candidate.
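The candidate selection step might look like the sketch below. The slides do not say how the recogniser and predictor scores are combined, so multiplying them is an assumption, as are the function name, the tuple layout and the toy numbers.

```python
def select_candidate(candidates, tag_scores):
    """Pick the (word, tag_index, recogniser_score) candidate with the best combined score.

    Combined score = recogniser score * predictor score for the candidate's tag
    (an assumed combination rule, not taken from the source).
    """
    return max(candidates, key=lambda c: c[2] * tag_scores[c[1]])

# Toy example with 3 candidates instead of 10, and a 3-tag tagset.
candidates = [("write", 0, 0.40), ("right", 1, 0.45), ("rite", 2, 0.15)]
tag_scores = [0.8, 0.1, 0.1]  # predictor output for the next tag
best = select_candidate(candidates, tag_scores)
```

Here the recogniser alone would prefer "right", but the predictor's strong preference for tag 0 moves the choice to "write".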

  7. Part-of-Speech Tag Disambiguation • Each output node corresponds to one of the tags in the tagset. • The input vector comprises the POS probabilities of the current word and the 2 following words, plus the disambiguated tags of the 3 preceding words. • POS probabilities are obtained from a look-up table.
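Assembling such an input vector can be sketched as below. The zero padding at sentence boundaries, the uniform distribution for words missing from the look-up table, and all names and toy probabilities are assumptions made for illustration.

```python
def build_input(words, position, previous_tags, lexicon, num_tags):
    """Concatenate 3 preceding tag vectors with lexical tag probabilities
    for the current word and the 2 following words."""
    zeros = [0.0] * num_tags
    uniform = [1.0 / num_tags] * num_tags  # assumed fallback for unknown words
    vector = []
    # Tags already assigned to the 3 preceding words, oldest first, zero-padded.
    for tag_vec in ([zeros] * 3 + previous_tags)[-3:]:
        vector.extend(tag_vec)
    # Look-up-table probabilities for the current word and 2 future words.
    for offset in range(3):
        i = position + offset
        vector.extend(lexicon.get(words[i], uniform) if i < len(words) else zeros)
    return vector

# Toy 3-tag tagset: index 0 = determiner, 1 = noun, 2 = verb.
lexicon = {
    "the": [1.0, 0.0, 0.0],
    "can": [0.0, 0.6, 0.4],
    "rusts": [0.0, 0.3, 0.7],
}
words = ["the", "can", "rusts"]
vec = build_input(words, 1, [[1.0, 0.0, 0.0]], lexicon, 3)
```

With a tagset of T tags the vector has 6 × T elements, and the network's output for the current word becomes one of the preceding-tag inputs at the next position.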

  8. Part-of-Speech Tag Disambiguation: Results • Network trained on a 2-million-word subpart of the Penn Treebank corpus over 4 million epochs. • Network tested on a 100,000-word subpart that was not part of the training set.

  9. Biometric user authentication • Face: 10 moments are extracted from HSI colour images of the eyes, which are located automatically in the face images using histogram analysis, round-mask convolution and a peak-searching algorithm. • Voice: 64 Morlet wavelet coefficients are extracted from 3 seconds' worth of speech input.

  10. Biometric user authentication: Results • Two MLPs (one for face and one for voice) are trained for each of 30 people. • The outputs of each pair of MLPs are ANDed together to give the verification output for each person.
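The AND fusion of the two verifiers can be sketched as follows. The 0.5 decision thresholds and the function name are assumptions; the slides state only that the two outputs are ANDed.

```python
def verify(face_score, voice_score, face_threshold=0.5, voice_threshold=0.5):
    """Accept the identity claim only if both the face MLP and the voice MLP accept it."""
    face_ok = face_score >= face_threshold      # decision of the face MLP
    voice_ok = voice_score >= voice_threshold   # decision of the voice MLP
    return face_ok and voice_ok

accepted = verify(0.9, 0.8)   # both modalities agree with the claim
rejected = verify(0.9, 0.2)   # voice MLP rejects, so the claim is rejected
```

Because both networks must accept, AND fusion lowers the chance of a false acceptance, at the cost of rejecting a genuine user whenever either modality fails.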
