2.5.4.1 Basics of Neural Networks

melissaj

Presentation Transcript


  1. 2.5.4.1 Basics of Neural Networks

  2. 2.5.4.2 Neural Network Topologies

  3. 2.5.4.2 Neural Network Topologies

  4. 2.5.4.2 Neural Network Topologies

  5. TDNN

  6. 2.5.4.6 Neural Network Structures for Speech Recognition

  7. 2.5.4.6 Neural Network Structures for Speech Recognition

  8. 3.1.1 Spectral Analysis Models

  9. 3.1.1 Spectral Analysis Models

  10. 3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

  11. 3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

  12. 3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

  13. 3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

  14. 3.2 THE BANK-OF-FILTERS FRONT-END PROCESSOR

  15. 3.2.1 Types of Filter Bank Used for Speech Recognition

  16. Nonuniform Filter Banks

  17. Nonuniform Filter Banks

  18. 3.2.1 Types of Filter Bank Used for Speech Recognition

  19. 3.2.1 Types of Filter Bank Used for Speech Recognition

  20. 3.2.2 Implementations of Filter Banks • Instead of direct convolution, which is computationally expensive, each bandpass filter impulse response is assumed to be of the form $h_i(n) = w(n)\,e^{j\omega_i n}$, where w(n) is a fixed lowpass filter and $\omega_i$ is the center frequency of the i-th band.
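
The modulation idea on this slide can be sketched as follows. This is a minimal illustration, assuming the bandpass response is the lowpass prototype multiplied by a complex exponential, h_i(n) = w(n) e^{jω_i n}; the Hamming prototype, the length 32, and all function names are assumptions made for the example, not taken from the slides.

```python
import cmath
import math

def hamming(length):
    """Hamming window, used here as the fixed lowpass prototype w(n) (an assumption)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def bandpass_impulse_response(w, omega_i):
    """h_i(n) = w(n) * exp(j * omega_i * n): w(n) shifted to centre frequency omega_i."""
    return [w_n * cmath.exp(1j * omega_i * n) for n, w_n in enumerate(w)]

def magnitude_response(h, omega):
    """|H(e^{j omega})| evaluated directly from the impulse response."""
    return abs(sum(h_n * cmath.exp(-1j * omega * n) for n, h_n in enumerate(h)))

w = hamming(32)
h = bandpass_impulse_response(w, omega_i=math.pi / 4)
peak = magnitude_response(h, math.pi / 4)          # at the centre frequency
off_band = magnitude_response(h, 3 * math.pi / 4)  # far from the centre frequency
```

At the centre frequency the exponentials cancel, so the gain equals the sum of the prototype's samples; away from it the response falls to the prototype's sidelobe level.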

  21. 3.2.2 Implementations of Filter Banks

  22. 3.2.2.1 Frequency Domain Interpretation of the Short-Time Fourier Transform

  23. 3.2.2.1 Frequency Domain Interpretation of the Short-Time Fourier Transform

  24. 3.2.2.1 Frequency Domain Interpretation of the Short-Time Fourier Transform

  25. 3.2.2.1 Frequency Domain Interpretation of the Short-Time Fourier Transform

  26. Linear Filter Interpretation of the STFT

  27. 3.2.2.4 FFT Implementation of a Uniform Filter Bank

  28. Direct implementation of an arbitrary filter bank

  29. 3.2.2.5 Nonuniform FIR Filter Bank Implementations

  30. 3.2.2.7 Tree Structure Realizations of Nonuniform Filter Banks

  31. 3.2.4 Practical Examples of Speech-Recognition Filter Banks

  32. 3.2.4 Practical Examples of Speech-Recognition Filter Banks

  33. 3.2.4 Practical Examples of Speech-Recognition Filter Banks

  34. 3.2.4 Practical Examples of Speech-Recognition Filter Banks

  35. 3.2.5 Generalizations of Filter-Bank Analyzer

  36. 3.2.5 Generalizations of Filter-Bank Analyzer

  37. 3.2.5 Generalizations of Filter-Bank Analyzer

  38. 3.2.5 Generalizations of Filter-Bank Analyzer

  39. The MFCC Method • The MFCC method is based on how the human ear perceives sound. • MFCC performs better than other features in noisy environments. • MFCC was originally proposed for speech recognition, but it also performs well in speaker recognition. • The unit of human auditory perception is the mel, which is obtained from the following relation:
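
The relation itself did not survive in the transcript. As a hedged sketch, the snippet below uses one widely used form of the mel mapping, mel(f) = 2595 log10(1 + f/700); whether the slide used exactly this form is an assumption, and the function names are illustrative.

```python
import math

def hz_to_mel(f_hz):
    """Map frequency in Hz to mels (assumed form: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

mel_at_1k = hz_to_mel(1000.0)  # close to 1000 mel under this form of the scale
```

The mapping is roughly linear at low frequencies and logarithmic at high frequencies, which is why equal steps in mel correspond to progressively wider bands in Hz.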

  40. Steps of the MFCC Method • Step 1: Map the signal from the time domain to the frequency domain using the short-time FFT. • Z(n): speech signal • W(n): window function, such as the Hamming window • $W_F = e^{-j2\pi/F}$ • m = 0, …, F − 1 • F: length of the speech frame
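
Step 1 can be sketched with the symbols listed on this slide. The sum below, X(m) = Σ Z(n) W(n) W_F^{mn}, is the standard windowed DFT and is assumed to match the omitted formula; the output name X, the test signal, and the frame length 64 are illustrative choices. A direct O(F²) sum is used for clarity where a real system would use an FFT.

```python
import cmath
import math

def hamming(F):
    """Hamming window W(n) of length F."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (F - 1)) for n in range(F)]

def windowed_dft(z, w):
    """X(m) = sum_n Z(n) W(n) W_F^(m*n), with W_F = exp(-j*2*pi/F), m = 0..F-1."""
    F = len(z)
    W_F = cmath.exp(-2j * math.pi / F)
    # The exponent is reduced modulo F because W_F ** F == 1.
    return [sum(z[n] * w[n] * W_F ** ((m * n) % F) for n in range(F))
            for m in range(F)]

F = 64
# Hypothetical test frame: a cosine that falls exactly on bin 8.
z = [math.cos(2 * math.pi * 8 * n / F) for n in range(F)]
X = windowed_dft(z, hamming(F))
power = [abs(x) ** 2 for x in X]
peak_bin = max(range(F // 2), key=lambda m: power[m])
```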

  41. Steps of the MFCC Method • Step 2: Compute the energy of each filter-bank channel. • M is the number of mel-scale filters in the bank; the relation also involves the filter function of each filter in the bank.
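
A minimal sketch of Step 2, assuming triangular filters spaced uniformly on the mel scale and a channel energy defined as the filter-weighted sum of the power spectrum. The triangular shape, the mel formula, and every name and parameter value here are assumptions, since the slide's formula is not in the transcript.

```python
import math

def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(M, n_bins, sample_rate):
    """M triangular filters over n_bins spectrum bins spanning 0..sample_rate/2."""
    max_mel = hz_to_mel(sample_rate / 2)
    # M + 2 edge frequencies, equally spaced in mel.
    edges_hz = [mel_to_hz(max_mel * i / (M + 1)) for i in range(M + 2)]
    bin_hz = (sample_rate / 2) / (n_bins - 1)
    filters = []
    for i in range(1, M + 1):
        lo, centre, hi = edges_hz[i - 1], edges_hz[i], edges_hz[i + 1]
        row = []
        for k in range(n_bins):
            f = k * bin_hz
            if lo < f <= centre:
                row.append((f - lo) / (centre - lo))   # rising edge
            elif centre < f < hi:
                row.append((hi - f) / (hi - centre))   # falling edge
            else:
                row.append(0.0)
        filters.append(row)
    return filters

def channel_energies(power_spectrum, filters):
    """Energy of each channel: sum over bins of |X(m)|^2 times the filter weight."""
    return [sum(p * h for p, h in zip(power_spectrum, row)) for row in filters]

filters = mel_filterbank(M=10, n_bins=129, sample_rate=8000)
flat_spectrum = [1.0] * 129
energies = channel_energies(flat_spectrum, filters)
```

With a flat input spectrum the higher channels collect more energy simply because mel-spaced filters get wider in Hz as frequency increases.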

  42. Mel-Scale Filter Distribution

  43. Steps of the MFCC Method • Step 4: Compress the spectrum and apply the DCT to obtain the MFCC coefficients. • In the relation above, n = 0, …, L is the order of the MFCC coefficients.
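
The compression and DCT step can be sketched as below, assuming logarithmic compression of the channel energies followed by an unnormalised DCT-II, c(n) = Σ_i log(E_i) cos(πn(i + 1/2)/M) for n = 0, …, L. The exact DCT variant on the slide is not recoverable from the transcript, so this form and the names used are assumptions.

```python
import math

def mfcc_from_energies(energies, L):
    """Return c(0..L): DCT-II of the log filter-bank energies (assumed form)."""
    M = len(energies)
    log_e = [math.log(e) for e in energies]
    return [sum(log_e[i] * math.cos(math.pi * n * (i + 0.5) / M) for i in range(M))
            for n in range(L + 1)]

# Hypothetical input: eight equal channel energies.
coeffs = mfcc_from_energies([2.0] * 8, L=4)
```

For equal channel energies only c(0) is non-zero, which illustrates how the higher-order coefficients describe the shape of the log spectrum and not its overall level.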

  44. The Mel-Cepstrum Method (block diagram) • Blocks: time signal, framing, |FFT|², Mel-scaling, Logarithm, IDCT, Cepstra, Low-order coefficients, Differentiator, Delta & Delta-Delta Cepstra

  45. Time-Frequency analysis • Short-term Fourier Transform • Standard way of frequency analysis: decompose the incoming signal into the constituent frequency components. • W(n): windowing function • N: frame length • p: step size
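
The roles of W(n), N, and p can be sketched as a frame-by-frame windowed DFT. This is an illustrative sketch only: the Hamming window, the test signal, and the values N = 32 and p = 16 are assumptions, and a direct DFT sum stands in for the FFT.

```python
import cmath
import math

def hamming(N):
    """Hamming window W(n) of length N."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def stft(x, window, p):
    """Short-term Fourier transform: frames of len(window) samples, advanced by p."""
    N = len(window)
    frames = []
    for start in range(0, len(x) - N + 1, p):
        segment = [x[start + n] * window[n] for n in range(N)]
        spectrum = [sum(segment[n] * cmath.exp(-2j * math.pi * k * n / N)
                        for n in range(N))
                    for k in range(N)]
        frames.append(spectrum)
    return frames

N, p = 32, 16
# Hypothetical input: a sinusoid with 4 cycles per frame, 128 samples long.
x = [math.sin(2 * math.pi * 4 * n / N) for n in range(128)]
frames = stft(x, hamming(N), p)
```

A step size p smaller than N makes consecutive frames overlap, which trades extra computation for finer time resolution.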

  46. Critical band integration • Related to masking phenomenon: the threshold of a sinusoid is elevated when its frequency is close to the center frequency of a narrow-band noise • Frequency components within a critical band are not resolved. Auditory system interprets the signals within a critical band as a whole
