
Speech Coding

Speech Coding. Submitted To: Dr. Mohab Mangoud Submitted By: Nidal Ismail. Outline. Introduction Overview of Speech Coding Properties of a Speech Coder Modeling the Speech Production System Linear Prediction Different Coding Techniques Waveform Coders Parametric Coders Hybrid Coders



Presentation Transcript


  1. Speech Coding Submitted To: Dr. Mohab Mangoud Submitted By: Nidal Ismail

  2. Outline • Introduction • Overview of Speech Coding • Properties of a Speech Coder • Modeling the Speech Production System • Linear Prediction • Different Coding Techniques • Waveform Coders • Parametric Coders • Hybrid Coders • Coding Standards • PCM & DPCM • Linear Predictive Coding • Conclusion • References

  3. 1. Introduction Overview of Speech Coding • Figure: Block diagram of a speech coding system • Sampling frequency = 8 kHz • Number of bits per sample = 8 • Bit rate = 8 bits/sample × 8 kHz = 64 kbps
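
The bit-rate figure on this slide follows directly from the sampling parameters. A minimal sketch of that arithmetic in Python; the function name is illustrative and does not come from the slides:

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate, in bits per second, of uncompressed single-channel PCM."""
    return sample_rate_hz * bits_per_sample


# Values from the slide: 8 kHz sampling, 8 bits per sample.
rate_bps = pcm_bit_rate(8000, 8)
print(rate_bps / 1000, "kbps")  # 64.0 kbps
```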

  4. 1. Introduction Properties of a Speech Coder • Low Bit-Rate • High Speech Quality • Robustness Across Different Speakers / Languages • Robustness in the Presence of Channel Errors • Good Performance on Non-speech Signals • Low Memory Size and Low Computational Complexity • Low Coding Delay

  5. 1. Introduction Modeling the Speech Production System Speech = Voiced + Unvoiced sounds

  6. 1. Introduction Modeling the Speech Production System • Figure: Autocorrelation values for the signal frames. Left: Unvoiced. Right: Voiced.
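
The contrast in the figure can be reproduced with synthetic data. The sketch below is only an illustration under stated assumptions: a pure sinusoid stands in for a voiced frame, white Gaussian noise stands in for an unvoiced frame, and the frame length (240 samples) and pitch period (50 samples) are made-up values, not taken from the slides.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Normalized autocorrelation R[l] / R[0] for lags 0..max_lag."""
    r0 = sum(x * x for x in frame)
    return [
        sum(frame[n] * frame[n - lag] for n in range(lag, len(frame))) / r0
        for lag in range(max_lag + 1)
    ]


random.seed(0)
N = 240      # assumed frame length in samples
PITCH = 50   # assumed pitch period in samples (160 Hz at 8 kHz sampling)

# Stand-ins for real speech frames (assumptions, see lead-in).
voiced = [math.sin(2 * math.pi * n / PITCH) for n in range(N)]
unvoiced = [random.gauss(0.0, 1.0) for _ in range(N)]

r_voiced = autocorrelation(voiced, 100)
r_unvoiced = autocorrelation(unvoiced, 100)

# The periodic frame shows a strong peak near the pitch lag;
# the noise-like frame has no comparable peak away from lag 0.
print("voiced   R[50]/R[0]:", round(r_voiced[PITCH], 3))
print("unvoiced R[50]/R[0]:", round(r_unvoiced[PITCH], 3))
```

A strong autocorrelation peak at a non-zero lag is what makes pitch estimation possible for voiced frames.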

  7. 1. Introduction Modeling the Speech Production System • The signal from a source is filtered by a time-varying filter whose resonant properties are similar to those of the vocal tract. • The gain controls A_V and A_N determine the intensity of the voiced and unvoiced excitation. • The higher formants roll off at -12 dB/octave (due to the nature of our speech organs).

  8. 1. Introduction Linear Prediction • Linear prediction is a practical method of spectrum estimation, where the PSD can be captured using a few coefficients. • These coefficients, the linear prediction coefficients, can be used to construct the synthesis filter. • Figure: Linear prediction as system identification.

  9. 1. Introduction Linear Prediction • [Equation: predicted signal] • [Equation: prediction error] • Figure: Linear prediction as system identification.
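
The equations on this slide did not survive in the transcript. As a hedged illustration of the idea, the sketch below assumes the common convention s_hat[n] = -sum(a[i] * s[n-i]) for the predicted signal and e[n] = s[n] - s_hat[n] for the prediction error, and estimates the coefficients with the autocorrelation method and the Levinson-Durbin recursion. The sign convention, the function names, and the synthetic first-order test signal are all assumptions, not taken from the slides.

```python
import math
import random


def autocorr(s, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag."""
    return [sum(s[n] * s[n - l] for n in range(l, len(s)))
            for l in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err) with a[0] = 1, for the assumed convention
    s_hat[n] = -sum_{i=1..order} a[i] * s[n-i]; err is the minimum
    prediction-error energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


def prediction_error(s, a):
    """e[n] = s[n] - s_hat[n], treating samples before n = 0 as zero."""
    order = len(a) - 1
    e = []
    for n in range(len(s)):
        s_hat = -sum(a[i] * s[n - i] for i in range(1, order + 1) if n - i >= 0)
        e.append(s[n] - s_hat)
    return e


# Synthetic test signal (assumption): s[n] = 0.9 * s[n-1] + white noise.
random.seed(1)
s = [0.0]
for _ in range(499):
    s.append(0.9 * s[-1] + random.gauss(0.0, 1.0))

a, err = levinson_durbin(autocorr(s, 2), 2)
e = prediction_error(s, a)
gain_db = 10 * math.log10(sum(x * x for x in s) / sum(x * x for x in e))
print("coefficients:", [round(c, 3) for c in a])
print("prediction gain (dB):", round(gain_db, 2))
```

With this convention, a[1] should come out close to -0.9 for the test signal, and the prediction error carries much less energy than the signal itself.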


  11. 2. Different Coding Techniques Waveform Coders • Original shape of the signal waveform is preserved • Coders can be applied to any signal source • Coders are better suited for high bit-rate coding, since performance drops sharply with decreasing bit-rate. • In practice, these coders work best at a bit-rate of 32 kbps and higher. • Some examples of this class include various kinds of pulse code modulation (PCM) and adaptive differential PCM (ADPCM)

  12. 2. Different Coding Techniques Parametric Coders • The speech signal is generated from a model, which is controlled by some parameters. • Parameters are estimated from the input speech signal. • No attempt is made to preserve the original shape of the waveform. • The accuracy and sophistication of the model account for the quality. • The most successful model is based on linear prediction. In this approach, the human speech production mechanism is summarized using a time-varying filter, with the coefficients of the filter found using the linear prediction analysis procedure. • This class of coders works well for low bit-rate. • Bit-rate is in the range of 2 to 5 kbps. • Example coders of this class include linear prediction coding (LPC) and mixed excitation linear prediction (MELP).

  13. 2. Different Coding Techniques Hybrid Coders • Combines the strength of a waveform coder with that of a parametric coder • As in waveform coders, an attempt is made to match the original signal with the decoded signal in the time domain • This class dominates the medium bit-rate coders, with the code-excited linear prediction (CELP) algorithm and its variants the most outstanding representatives • A hybrid coder tends to behave like a waveform coder for high bit-rate, and like a parametric coder at low bit-rate, with fair to good quality for medium bit-rate.

  14. 2. Different Coding Techniques Coding Standards


  16. 3. PCM & DPCM Pulse Code Modulation • Invented 1926, deployed 1962. • Basic idea: assign a smaller quantization step size to small-amplitude regions and a larger quantization step size to large-amplitude regions (non-uniform quantization) • Two types of nonlinear compressing functions: • Mu-law, adopted by North American telecommunications systems • A-law, adopted by European telecommunications systems • Mu-law (A-law) companding compresses the signal to 8 bits/sample, or 64 kbps (without the compandor, we would need 12 bits/sample)

  17. 3. PCM & DPCM μ-law Pulse Code Modulation • [Equation: μ-law compression characteristic] where A is the peak input magnitude and μ is a constant that controls the degree of compression.
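
The compression equation itself is missing from the transcript. The sketch below uses the textbook μ-law characteristic y = A · sgn(x) · ln(1 + μ|x|/A) / ln(1 + μ), which matches the symbols named on the slide but is an assumption about what the slide actually showed. The default μ = 255 is the value used in North American telephony; the function names and the normalized peak of 1.0 are illustrative.

```python
import math


def mu_law_compress(x, mu=255.0, peak=1.0):
    """Compress x (|x| <= peak) with the assumed mu-law characteristic."""
    y = peak * math.log1p(mu * abs(x) / peak) / math.log1p(mu)
    return math.copysign(y, x)


def mu_law_expand(y, mu=255.0, peak=1.0):
    """Inverse of mu_law_compress."""
    x = peak * math.expm1(abs(y) / peak * math.log1p(mu)) / mu
    return math.copysign(x, y)


# Small amplitudes are boosted before uniform quantization,
# which is equivalent to using a finer step size near zero.
for x in (0.01, 0.1, 0.5, 1.0):
    y = mu_law_compress(x)
    print(f"x = {x:<5} compressed = {y:.4f} restored = {mu_law_expand(y):.4f}")
```

Quantizing the compressed value uniformly and expanding at the receiver gives the non-uniform quantizer described on the previous slide.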

  18. 3. PCM & DPCM μ-law Examples Pulse Code Modulation

  19. 3. PCM & DPCM A-law Pulse Code Modulation with Ao a constant that controls the degree of compression.
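
The A-law equation is likewise missing from the transcript. The sketch below uses the textbook two-segment A-law characteristic on an input normalized to a peak of 1 (an assumption), with the slide's constant Ao written as `a0`. The default value 87.6 is the one used in European telephony; the function names are illustrative.

```python
import math


def a_law_compress(x, a0=87.6):
    """Compress x (|x| <= 1) with the assumed A-law characteristic:
    linear segment below 1/a0, logarithmic segment above it."""
    ax = abs(x)
    denom = 1.0 + math.log(a0)
    if ax < 1.0 / a0:
        y = a0 * ax / denom
    else:
        y = (1.0 + math.log(a0 * ax)) / denom
    return math.copysign(y, x)


def a_law_expand(y, a0=87.6):
    """Inverse of a_law_compress."""
    ay = abs(y)
    denom = 1.0 + math.log(a0)
    if ay < 1.0 / denom:
        x = ay * denom / a0
    else:
        x = math.exp(ay * denom - 1.0) / a0
    return math.copysign(x, y)


for x in (0.005, 0.05, 0.5, 1.0):
    y = a_law_compress(x)
    print(f"x = {x:<6} compressed = {y:.4f} restored = {a_law_expand(y):.4f}")
```

Unlike μ-law, the characteristic is exactly linear near zero, and the two segments meet continuously at |x| = 1/a0.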

  20. 3. PCM & DPCM A-law Examples Pulse Code Modulation

  21. 3. PCM & DPCM Differential Pulse Code Modulation • Since speech signals are slowly varying, it is possible to eliminate the temporal redundancy by prediction • DPCM quantizes the prediction-error signal instead of the input signal itself • The indices i[n] are entered into the quantizer's decoder to obtain the quantized prediction error, which is combined with the prediction x_p[n] to form the quantized input. • Figure: DPCM encoder (top) and decoder (bottom)
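
The encoder and decoder loops described above can be sketched as follows. This is a minimal illustration, not the scheme from the slides: it assumes a fixed first-order predictor x_p[n] = a · (previous quantized input), a uniform quantizer, and made-up values for the step size, word length and predictor coefficient.

```python
import math


def dpcm_encode(x, step=0.02, bits=4, a=0.9):
    """Return quantizer indices i[n] for the input samples x."""
    max_idx = 2 ** (bits - 1) - 1
    indices = []
    x_rec_prev = 0.0                       # previous quantized input
    for sample in x:
        x_pred = a * x_rec_prev            # prediction x_p[n]
        err = sample - x_pred              # prediction error
        idx = max(-max_idx, min(max_idx, round(err / step)))
        indices.append(idx)
        # Track the decoder's state so both sides predict identically.
        x_rec_prev = x_pred + idx * step
    return indices


def dpcm_decode(indices, step=0.02, a=0.9):
    """Rebuild the quantized input from the indices."""
    out = []
    x_rec_prev = 0.0
    for idx in indices:
        x_pred = a * x_rec_prev
        x_rec_prev = x_pred + idx * step   # quantized error + prediction
        out.append(x_rec_prev)
    return out


# A slowly varying test signal (assumption): one sinusoid, 100 samples/period.
signal = [math.sin(2 * math.pi * n / 100) for n in range(200)]
decoded = dpcm_decode(dpcm_encode(signal))
max_err = max(abs(s - d) for s, d in zip(signal, decoded))
print("max reconstruction error:", round(max_err, 4))
```

Because the encoder predicts from the quantized input rather than the original one, quantization errors do not accumulate at the decoder.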

  22. 3. PCM & DPCM Differential Pulse Code Modulation • Comparison between PCM and DPCM • DPCM used half the bit rate of PCM and still achieved a higher SNR • Figure: PCM quantized signal (left) and quantization error (right) • Figure: DPCM quantized signal (left) and quantization error (right)


  24. 4. Linear Predictive Coding • Linear prediction coding relies on a highly simplified model for speech production • Parameters of the model are estimated from the speech samples • Figure: The LPC model of speech production

  25. 4. Linear Predictive Coding • Parameters of the model are estimated from the speech samples These include: • Voicing: whether the frame is voiced or unvoiced. • Gain: mainly related to the energy level of the frame. • Filter coefficients: specify the response of the synthesis filter. • Pitch period: in the case of voiced frames, time length between consecutive excitation impulses. The LPC model of speech production

  26. 4. Linear Predictive Coding • By carefully allocating bits for each parameter so as to minimize distortion, an impressive compression ratio can be achieved. • For instance, the bit-rate of 2.4kbps for the FS1015 coder is 53.3 times lower than the corresponding bit-rate for 16-bit PCM • Estimating the parameters is the responsibility of the encoder. • The decoder takes the estimated parameters and uses the speech production model to synthesize speech
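
The decoder side of this model can be sketched in a few lines. The code below is a simplified illustration under stated assumptions: each frame is synthesized independently (no filter memory is carried between frames), the voiced excitation is a unit impulse train, the unvoiced excitation is white Gaussian noise, and the filter coefficients follow the convention s[n] = gain · u[n] - sum(a[i] · s[n-i]). The function and parameter names are hypothetical and do not describe the FS1015 bitstream.

```python
import random


def lpc_synthesize_frame(voiced, gain, coeffs, pitch_period=None,
                         n_samples=180, seed=0):
    """Synthesize one frame from LPC parameters.

    voiced       -- True for voiced frames, False for unvoiced
    gain         -- scales the excitation
    coeffs       -- [a1, a2, ...] of the all-pole synthesis filter
    pitch_period -- spacing of excitation impulses (voiced frames only)
    """
    rng = random.Random(seed)
    if voiced:
        excitation = [1.0 if n % pitch_period == 0 else 0.0
                      for n in range(n_samples)]
    else:
        excitation = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

    out = []
    for n in range(n_samples):
        acc = gain * excitation[n]
        for i, a_i in enumerate(coeffs, start=1):
            if n - i >= 0:
                acc -= a_i * out[n - i]    # all-pole (recursive) filtering
        out.append(acc)
    return out


# Illustrative parameters only: a one-pole filter and a 60-sample pitch period.
voiced_frame = lpc_synthesize_frame(True, 1.0, [-0.9], pitch_period=60)
unvoiced_frame = lpc_synthesize_frame(False, 0.1, [-0.9])
print(len(voiced_frame), len(unvoiced_frame))  # 180 180
```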

  27. 4. Linear Predictive Coding Block diagram of the LPC encoder.

  28. 4. Linear Predictive Coding Block diagram of the LPC decoder.

  29. 4. Linear Predictive Coding • The voicing detector is a key element of successful coding. • The purpose of the voicing detector is to classify a given frame as voiced or unvoiced. • Measurements that a voicing detector relies on to accomplish its task: • Energy or magnitude sum function • Zero crossing rate • Prediction gain
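
Two of these measurements need nothing more than the frame samples. The sketch below computes energy, magnitude sum and zero crossing rate on synthetic stand-ins for a voiced frame (a sinusoid) and an unvoiced frame (white noise); the test signals and function names are assumptions for illustration, and no decision thresholds are implied.

```python
import math
import random


def energy(frame):
    """Sum of squared samples."""
    return sum(x * x for x in frame)


def magnitude_sum(frame):
    """Sum of absolute sample values, a cheaper alternative to energy."""
    return sum(abs(x) for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for prev, cur in zip(frame, frame[1:])
                    if (prev >= 0) != (cur >= 0))
    return crossings / (len(frame) - 1)


random.seed(0)
voiced_like = [math.sin(2 * math.pi * n / 50) for n in range(180)]
unvoiced_like = [0.1 * random.gauss(0.0, 1.0) for _ in range(180)]

for name, frame in (("voiced-like", voiced_like), ("unvoiced-like", unvoiced_like)):
    print(name,
          "energy:", round(energy(frame), 2),
          "magnitude sum:", round(magnitude_sum(frame), 2),
          "ZCR:", round(zero_crossing_rate(frame), 3))
```

Voiced frames tend to show high energy and a low zero crossing rate, while unvoiced frames show the opposite pattern.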

  30. 4. Linear Predictive Coding Top left: A speech waveform. Top right: Magnitude sum function. Bottom left: Zero crossing rate. Bottom right: Prediction gain.

  31. 4. Linear Predictive Coding • Samples per frame: 180 • Bit rate: 2.4 kbps • Frame size: 22.5 ms (44.44 frames/sec)
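
These figures are mutually consistent, which can be checked with a few lines of arithmetic. The sketch assumes the 8 kHz sampling rate stated earlier in the presentation; the variable names are illustrative.

```python
SAMPLE_RATE_HZ = 8000      # sampling frequency from the introduction
SAMPLES_PER_FRAME = 180
BIT_RATE_BPS = 2400        # 2.4 kbps

frame_duration_s = SAMPLES_PER_FRAME / SAMPLE_RATE_HZ      # 22.5 ms
frames_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME     # about 44.44
bits_per_frame = BIT_RATE_BPS * frame_duration_s           # bit budget per frame
ratio_vs_16bit_pcm = (16 * SAMPLE_RATE_HZ) / BIT_RATE_BPS  # about 53.3

print("frame duration (ms):", frame_duration_s * 1000)
print("frames per second:", round(frames_per_second, 2))
print("bits per frame:", round(bits_per_frame))
print("compression vs 16-bit PCM:", round(ratio_vs_16bit_pcm, 1))
```

All parameters of one frame (voicing, gain, pitch period and filter coefficients) therefore have to fit in a budget of 54 bits.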

  32. 4. Linear Predictive Coding

  33. 5. Conclusion • An overview of speech coding was given, with a brief explanation of the speech production model. Properties of different coding techniques were also compared. For wireline transmission coding, PCM and DPCM were covered. Linear prediction coding, which is a basis for modern wireless systems, was also introduced.

  34. 6. References • Wai C. Chu, "Speech Coding Algorithms" • Bernard Sklar, "Digital Communications"
