Audio Coding

    Presentation Transcript
    1. Audio Coding

    2. Digitization Processing [Block diagram; labels: signal encoder, digital data, signal decoder, analog signal, storage, sampling, quantization]

    3. Overview of Today Sampling Techniques • PCM • Linear • μ-law • DPCM • ADPCM • MPEG-1 • Vocoding Generic Coding Techniques Psychoacoustic Coding Speech Specific Techniques

    4. Encoder Design • Bandlimiting filter • Smooths analog signals • Analog-to-digital converter (ADC) • Samples and quantizes analog signals

    5. Bandlimiting filter Passes only frequency components below half the sampling rate (the Nyquist frequency).

    6. Analog to digital converter

    7. Sampling • Pulse Amplitude Modulation (PAM) • Each sample’s amplitude is represented by 1 analog value • Sampling theory (Nyquist) • If input signal has maximum frequency (bandwidth) f, sampling frequency must be at least 2f • With a low-pass filter to interpolate between samples, the input signal can be fully reconstructed

    8. PCM [Figure: waveform on a 4-bit quantization scale with code-words 0100, 0011, 0010, 0001, 0000, 1001, 1010, 1011, 1100, showing quantization error (“noise”)] $n = \frac{\mathrm{SNR} - 4.77}{6.02}$ (SNR in dB, n bits per sample) • Pulse Code Modulation (PCM) • Each sample’s amplitude represented by an integer code-word • Each bit of resolution adds 6 dB of dynamic range • Number of bits required depends on the amount of noise that is tolerated
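The "about 6 dB per bit" rule on this slide can be checked numerically. The sketch below is only an illustration under stated assumptions: the mid-point uniform quantizer, the test sine wave, and the function names are hypothetical choices, not taken from the slides. The additive constant in the SNR formula depends on how signal power is defined, so the sketch looks only at the slope per bit.

```python
import math

def quantize(x: float, bits: int) -> float:
    """Uniform quantizer for x in [-1, 1]; returns the reconstructed value."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(int((x + 1.0) / step), levels - 1)   # integer code-word 0 .. levels-1
    return -1.0 + (index + 0.5) * step               # mid-point of the chosen interval

def measured_snr_db(bits: int, n_samples: int = 10000) -> float:
    """Signal-to-quantization-noise ratio (dB) for a full-scale test sine."""
    signal = [math.sin(2 * math.pi * 0.01234 * k) for k in range(n_samples)]
    noise = [s - quantize(s, bits) for s in signal]
    p_signal = sum(s * s for s in signal) / n_samples
    p_noise = sum(e * e for e in noise) / n_samples
    return 10 * math.log10(p_signal / p_noise)

# Average SNR gain per extra bit between 8-bit and 12-bit quantization
db_per_bit = (measured_snr_db(12) - measured_snr_db(8)) / 4
```

Under these assumptions `db_per_bit` should come out close to 6 dB, matching the denominator 6.02 in the slide's formula.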

    9. Linear PCM • Quantization levels are uniformly spaced. • 16-bit samples provide plenty of dynamic range. • Compact discs do this.
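As a rough worked example of what linear PCM costs, the following sketch computes the approximate dynamic range and the raw bit rate. It assumes 16-bit samples, a 44.1 kHz sample rate, and two channels (the compact disc parameters); the function names are hypothetical.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of a uniform quantizer with 2**bits levels."""
    return 20 * math.log10(2 ** bits)

def pcm_bit_rate(sample_rate_hz: int, bits: int, channels: int) -> int:
    """Raw linear PCM bit rate in bits per second."""
    return sample_rate_hz * bits * channels

cd_range_db = dynamic_range_db(16)          # roughly 96 dB
cd_bit_rate = pcm_bit_rate(44100, 16, 2)    # 1,411,200 bit/s before any compression
```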

    10. Undersampling • Sample rate is below the Nyquist rate • Alias frequencies are added to the original signal and cause distortion • The bandlimiting filter is also called an antialiasing filter
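To make the distortion concrete, here is a small sketch of frequency folding for a pure tone. The function names and the 8 kHz sample rate are illustrative assumptions. A 7 kHz tone sampled at 8 kHz produces exactly the same samples as a 1 kHz tone, so the decoder cannot tell them apart.

```python
import math

def alias_frequency(f_signal: float, f_sample: float) -> float:
    """Apparent frequency (Hz) of a tone at f_signal after sampling at f_sample."""
    f = f_signal % f_sample          # fold into [0, f_sample)
    return min(f, f_sample - f)      # reflect into [0, f_sample / 2]

def sample_cosine(f_signal: float, f_sample: float, n: int) -> list[float]:
    """First n samples of a unit cosine at f_signal, sampled at f_sample."""
    return [math.cos(2 * math.pi * f_signal * k / f_sample) for k in range(n)]

undersampled = sample_cosine(7000, 8000, 16)   # above half the sample rate
alias = sample_cosine(1000, 8000, 16)          # the tone it is mistaken for
```

The two sample lists agree to within floating-point rounding, which is why the filter has to remove the 7 kHz component before sampling rather than after.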

    11. Quantization intervals

    12. Associated waveform set

    13. μ-Law companding (ITU Rec. G.711) • Non-linear quantization of the signal’s amplitude • Quantization step-size increases logarithmically with signal level • Low-amplitude samples represented with greater accuracy than high-amplitude samples • Humans are less sensitive to changes in “loud” sounds than “quiet” sounds

    14. ln(1 + |x|) f(x) = 127 x sign(x) x ln(1 + ) -Law companding • Provides __-bit quality (dynamic range) with an _-bit encoding • Used in North American & Japanese ISDN voice service • Simple to compute encoding (x normalized to [-1, 1])

    15. Difference Encoding [Figure: waveform on a 4-bit quantization scale with code-words 0100, 0011, 0010, 0001, 0000, 1001, 1010, 1011, 1100] • Differential-PCM (DPCM) • Exploit temporal redundancy in samples • Difference between 2 x-bit samples can be represented with significantly fewer than x-bits • Transmit the difference (rather than the sample)
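The idea of transmitting differences can be sketched as follows. This is a simplified, lossless illustration with hypothetical function names: it sends the first sample followed by exact differences, whereas a practical DPCM coder would also quantize each difference to a fixed, smaller number of bits.

```python
def dpcm_encode(samples: list[int]) -> list[int]:
    """Return the first sample followed by successive differences."""
    previous = 0
    diffs = []
    for s in samples:
        diffs.append(s - previous)   # small when neighbouring samples are similar
        previous = s
    return diffs

def dpcm_decode(diffs: list[int]) -> list[int]:
    """Rebuild the samples by accumulating the differences."""
    previous = 0
    samples = []
    for d in diffs:
        previous += d
        samples.append(previous)
    return samples

original = [100, 102, 105, 104, 101]
encoded = dpcm_encode(original)      # [100, 2, 3, -1, -3]
decoded = dpcm_decode(encoded)
```

After the first value, every difference in this example fits in 3 signed bits even though the samples themselves need at least 7.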

    16. DPCM Working Principle [Diagram; label: previous sampling value]

    17. Slope Overload Problem [Figure: waveform on a 4-bit quantization scale with code-words 0100, 0011, 0010, 0001, 0000, 1001, 1010, 1011, 1100, annotated “Slope Overload”] • Differences in high frequency signals near the Nyquist frequency cannot be represented with a smaller number of bits! • Error introduced leads to severe distortion in the higher frequencies
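Slope overload is easy to reproduce in a toy model. The sketch below assumes each difference is clamped to a signed `bits`-bit range and that the encoder tracks the decoder's reconstruction; the function name and the test signals are hypothetical.

```python
def clamped_dpcm(samples: list[int], bits: int) -> list[int]:
    """Reconstruction when each difference is limited to a signed `bits`-bit range."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    reconstructed = []
    previous = 0
    for s in samples:
        diff = max(lo, min(hi, s - previous))   # largest difference that still fits
        previous += diff                        # what the decoder will reconstruct
        reconstructed.append(previous)
    return reconstructed

slow = clamped_dpcm([0, 1, 2, 3, 4], bits=3)    # tracked exactly
fast = clamped_dpcm([0, 10, 20, 30], bits=3)    # [0, 3, 6, 9]: output lags the input
```

The slowly changing signal is reproduced exactly, while the steep one falls further behind with every sample because a 3-bit difference can add at most 3 per step.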

    18. Adaptive DPCM (ADPCM) • Use a larger step-size to encode differences between high-frequency samples & a smaller step-size for differences between low-frequency samples • Use previous sample values to estimate changes in the signal in the near future

    19. ADPCM • To ensure differences are always small... • Adaptively change the step-size (quanta) • (Adaptively) attempt to predict next sample value [Block diagram; labels: y-bit PCM sample, difference (+/–), quantizer, x-bit ADPCM “difference”, step-size adjuster, dequantizer, predictor, predicted PCM sample n+1]
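The adaptive step-size loop can be sketched in a few lines. This is a toy scheme under loudly stated assumptions: the doubling and halving rule in `_adapt`, the use of the previous reconstructed value as the predictor, and all names are hypothetical, and standardized ADPCM codecs use table-driven step sizes instead. The key property shown is that encoder and decoder update the step-size from the transmitted codes alone, so they stay in sync without side information.

```python
def _adapt(step: float, code: int, max_code: int) -> float:
    """Grow the step after a saturated code, shrink it after a small one."""
    if abs(code) == max_code:
        return step * 2.0
    if abs(code) <= max_code // 2:
        return max(1.0, step / 2.0)
    return step

def adpcm_encode(samples: list[float], bits: int = 3) -> list[int]:
    max_code = 2 ** (bits - 1) - 1
    predicted, step, codes = 0.0, 1.0, []
    for s in samples:
        code = max(-max_code, min(max_code, round((s - predicted) / step)))
        codes.append(code)
        predicted += code * step             # same reconstruction the decoder makes
        step = _adapt(step, code, max_code)
    return codes

def adpcm_decode(codes: list[int], bits: int = 3) -> list[float]:
    max_code = 2 ** (bits - 1) - 1
    predicted, step, out = 0.0, 1.0, []
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code, max_code)
    return out

ramp = [0, 10, 20, 30, 40, 50, 60]
codes = adpcm_encode(ramp)
decoded = adpcm_decode(codes)   # [0.0, 3.0, 9.0, 21.0, 37.0, 53.0, 61.0]
```

On this steep ramp the reconstruction lags at first, then catches up to within 1 of the input once the step-size has grown, using only 3-bit codes.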

    20. Psychoacoustic Fundamentals • Absolute threshold of hearing • Critical band frequency analysis • Frequency masking • Temporal masking

    21. Absolute Threshold of Hearing [Figure: sound level (dB, 0 to 100) versus frequency (kHz, 0.02 to 20); labels: Audible, Inaudible, Maximum allowable energy level for coding distortion] • Human perception of sound is a function of ________ and signal __________ • (MPEG exploits this relationship.) • Sampled segments of the source audio waveform are analyzed but only those features _____________ to the ear are transmitted. • Psychoacoustic model is used to identify _________ masking and ________ masking and eliminate them from the transmitted signal.

    22. Auditory Masking [Figure: sound level (dB, 0 to 100) versus frequency (kHz, 0.02 to 20); labels: Audible, Inaudible, Masking tone, Masked tone] • The presence of tones at certain frequencies makes us unable to perceive tones at other “_________” frequencies • Humans cannot distinguish between tones within _____ Hz at low frequencies and _____ kHz at high frequencies

    23. MPEG Encoder Block Diagram [Block diagram; labels: PCM Audio Samples (32, 44.1, 48 kHz), Mapping, Quantizer, Coding, Psychoacoustic Model, Frame Packing, Encoded Bitstream, Ancillary Data]

    24. Vocoding • Concept: Develop a __________ model of the vocal cords & throat • Derive/compute _____ parameters for a short interval and transmit to the decoder • Use the parameters to _______ speech at the decoder • So what is a good model? • A “buzzer” in a “tube”! • The buzzer is characterized by its _________ & _______ • The tube is characterized by its ___________s

    25. Vocoding - Basic Concepts [Figure: speech spectrum, amplitude (0 to 75) versus frequency (kHz)] • Formant: frequency maxima & minima in the spectrum of the speech signal • Vocoders code • _____ • Period • _________, and • signaling vocal tract _________ parameters • Voiced sounds: m, v, and l. • Unvoiced sounds: f and s.

    26. “Buzzer” and “Tube” Model • Vocoding principles: • voice = _________s + buzz ______ & intensity • voice – estimated ________s = “residue” “yadda yadda yadda” • Linear Predictive Coding (LPC) • A sample is represented as a linear combination of p previous samples $y(n) = \sum_{k=1}^{p} a_k\, y(n - k) + G \cdot x(n)$
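The LPC recurrence, y(n) = a_1 y(n-1) + ... + a_p y(n-p) + G · x(n), can be sketched directly as a decoder-side synthesis loop. The function name, the zero initial history, and the one-coefficient example are illustrative assumptions; a real decoder would receive a_k, G, and the excitation type from the encoder for each short interval.

```python
def lpc_synthesize(a: list[float], gain: float, excitation: list[float]) -> list[float]:
    """y(n) = sum_{k=1..p} a[k-1] * y(n-k) + gain * x(n), with y(n) = 0 for n < 0."""
    p = len(a)
    y = []
    for n, x_n in enumerate(excitation):
        acc = gain * x_n                      # contribution of the "buzzer"
        for k in range(1, p + 1):
            if n - k >= 0:
                acc += a[k - 1] * y[n - k]    # contribution of the "tube" (past outputs)
        y.append(acc)
    return y

# A single pulse through a one-coefficient filter decays geometrically
response = lpc_synthesize([0.5], gain=1.0, excitation=[1.0, 0.0, 0.0, 0.0])
# response == [1.0, 0.5, 0.25, 0.125]
```

Feeding a periodic pulse train as the excitation would model a voiced sound, and feeding noise would model an unvoiced one, in the spirit of the buzzer-and-tube picture.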

    27. LPC • Decoder artificially generates speech via _________ synthesis • A mathematical simulation of the _______ as a series of bandpass filters • Encoder codes & transmits filter _______, pitch period, gain factor, & nature of excitation

    28. LPC Schematic

    29. LPC Related Standards • Standards: • Regular Pulse Excited Linear Predictive Coder (RPE-LPC) • Digital cellular standard GSM 06.10 (13 kbps) • Code Excited Linear Predictive Coder (CELP) • US Federal Standard 1016 (4.8 kbps) • Waveform template based to improve sound quality. • Linear Predictive Coder (LPC) • US Federal Standard 1015 (2.4 kbps) • Sounds very synthetic; used primarily in military applications with very limited bandwidth.

    30. Networking Concerns • Audio bandwidth is actually quite small. • But human sensitivity to loss and noise is quite ________. • Networking concerns: • _______ concealment • ________ control • Especially for telephony applications.