
Audio

Audio. Hao Jiang, Computer Science Department, Boston College, Oct. 11, 2007. Digital Audio. Audio comes from different sources: speech; sounds of instruments and music; sounds of all other kinds (wind, trains, the ocean). Audio needs new methods for coding and processing.


Presentation Transcript


  1. Audio. Hao Jiang, Computer Science Department, Boston College, Oct. 11, 2007. CS335 Principles of Multimedia Systems.

  2. Digital Audio
     • Audio comes from different sources:
       • Speech.
       • Sounds of instruments, music.
       • Sounds of all other kinds (wind, trains, the ocean).
     • Audio needs new methods for coding and processing.
     • Audio processing is a key task in multimedia systems:
       • Audio coding (MPEG audio, MP3, AAC and others)
       • Authoring and representation (composition)
       • Analysis and searching (retrieval and databases)
       • 3D sound, etc.
     • We will focus on basic audio processing, MPEG audio and related topics.

  3. Audio Processing
     • Audio authoring.
     • Audio file formats: waveform files and MIDI.
     • MIDI: Musical Instrument Digital Interface. Instead of storing waveform samples, a MIDI file stores a sequence of commands that tell an audio device to generate a specified note with given properties.
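
To make the "sequence of commands" idea concrete, here is a minimal Python sketch (the slides themselves use Matlab) that builds the raw bytes of MIDI Note On and Note Off messages. The helper names `note_on`, `note_off` and `note_frequency` are illustrative only, and the frequency conversion assumes equal temperament with A4 = 440 Hz.

```python
def note_on(channel, note, velocity):
    """Return the 3-byte MIDI Note On message (status byte 0x90 | channel)."""
    if not (0 <= channel < 16 and 0 <= note < 128 and 0 <= velocity < 128):
        raise ValueError("channel must be 0-15, note and velocity 0-127")
    return bytes([0x90 | channel, note, velocity])


def note_off(channel, note, velocity=0):
    """Return the 3-byte MIDI Note Off message (status byte 0x80 | channel)."""
    if not (0 <= channel < 16 and 0 <= note < 128 and 0 <= velocity < 128):
        raise ValueError("channel must be 0-15, note and velocity 0-127")
    return bytes([0x80 | channel, note, velocity])


def note_frequency(note):
    """Equal-tempered pitch in Hz, assuming A4 (note 69) is tuned to 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Middle C (note 60) on channel 0 at velocity 100, then released.
commands = [note_on(0, 60, 100), note_off(0, 60)]
```

The two messages occupy six bytes in total, whereas a waveform file would need thousands of samples for the same note.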

  4. Audio Processing Using Matlab
     • To load a wave file in Windows: audat = wavread('filename.wav'); or open the file directly and load a stream of "words" (2 bytes) or bytes, depending on the wav format. In recent Matlab releases, audioread replaces wavread.
     • To play a sound, use sound(audat, samplingrate).
     • To display the spectrogram, use specgram (spectrogram in recent releases).
     • Audio analysis is done in frames 20 ms to 40 ms long.
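
The load-then-frame workflow can be sketched outside Matlab as well. The following Python example uses the standard `wave` module in place of `wavread`; it assumes 16-bit mono PCM, and the helper names `write_tone`, `read_wav` and `split_frames` are hypothetical.

```python
import io
import math
import struct
import wave


def write_tone(buf, freq_hz, rate, seconds):
    """Write a 16-bit mono sine tone to a file-like object as a WAV file."""
    n = round(rate * seconds)
    samples = [int(16000 * math.sin(2 * math.pi * freq_hz * i / rate))
               for i in range(n)]
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 2 bytes per sample: the "words" of the slide
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % n, *samples))


def read_wav(buf):
    """Return (samples, sampling_rate) for a 16-bit mono WAV file."""
    with wave.open(buf, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono files")
        rate = w.getframerate()
        n = w.getnframes()
        raw = w.readframes(n)
    return list(struct.unpack("<%dh" % n, raw)), rate


def split_frames(samples, rate, frame_ms=20):
    """Cut the signal into non-overlapping analysis frames of frame_ms each."""
    size = rate * frame_ms // 1000
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]
```

At an 8 kHz sampling rate, a 20 ms frame holds 160 samples; any incomplete frame at the end is dropped in this sketch.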

  5. Frequency Domain Analysis
     • The Fourier transform can be used to decompose a signal into a sum of sinusoidal waves.
     • In Matlab, we can use fft (Fast Fourier Transform) for frequency domain analysis.
     [Figure: a time-domain waveform with period T, and its frequency-domain components; base frequency = 1/T.]
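
As a rough illustration of frequency domain analysis, the sketch below computes a direct O(n²) discrete Fourier transform in plain Python and picks the strongest bin. It stands in for Matlab's `fft` only conceptually; `dft` and `dominant_frequency` are illustrative names, and a real application would use a fast FFT routine.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dominant_frequency(x, rate):
    """Frequency in Hz of the strongest bin below the Nyquist frequency."""
    spectrum = dft(x)
    half = len(x) // 2
    k = max(range(1, half), key=lambda i: abs(spectrum[i]))
    return k * rate / len(x)


# A 50 Hz sine sampled at 800 Hz for 0.1 s (80 samples, 10 Hz bin spacing).
rate = 800
tone = [math.sin(2 * math.pi * 50 * t / rate) for t in range(80)]
peak_hz = dominant_frequency(tone, rate)
```

Because the tone completes a whole number of periods in the analysis window, all of its energy falls into a single bin.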

  6. MP3 and Others
     • MPEG (Moving Picture Experts Group) and ISO (International Organization for Standardization) have published several standards for digital audio coding:
       • MPEG-1 Layers 1, 2 and 3 (MP3)
       • MPEG-2 AAC
       • MPEG-4 AAC and TwinVQ
     • Other standards:
       • Dolby AC-3
     • They are widely used in consumer electronics, digital audio broadcasting, DVD, movies, etc.

  7. Perceptual Coding in MPEG Audio
     [Block diagram with the blocks: audio input, FFT, masking threshold, dynamic bit allocation, encoder, MUX, bit stream.]

  8. Simultaneous Masking
     • A strong audio component can mask nearby frequency components.
     [Figure: sound pressure level (dB) against frequency (20 Hz to 20000 Hz, with 1000 Hz marked), showing the masker, the masking threshold and the threshold in quiet.]
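
The threshold-in-quiet curve in the figure can be approximated numerically. The sketch below uses a widely quoted closed-form approximation (usually attributed to Terhardt); it is not taken from the slides, it does not model the masker itself, and the function names are illustrative.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate absolute hearing threshold in dB SPL at freq_hz.

    Assumption: the common approximation
    3.64 f^-0.8 - 6.5 exp(-0.6 (f - 3.3)^2) + 0.001 f^4, with f in kHz.
    """
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def audible_in_quiet(level_db, freq_hz):
    """True if a tone at level_db exceeds the threshold in quiet."""
    return level_db > threshold_in_quiet_db(freq_hz)
```

The curve is lowest around 3 to 4 kHz, where hearing is most sensitive, and rises steeply toward both ends of the audible range.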

  9. Masking and Quantization
     • A critical band defines the "resolution" of hearing at some frequency location.
     [Figure: sound pressure level (dB) against frequency (20 Hz to 20000 Hz), showing the masker, the signal-to-mask ratio, the minimum masking threshold for critical band A, a neighboring critical band, and the SNR of an m-bit and an (m+1)-bit quantizer.]
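
The link between quantizer word length and masking can be sketched with the textbook rule that each extra bit of a uniform quantizer adds about 6.02 dB of SNR. The example below assumes the full-scale sinusoid formula SNR ≈ 6.02 m + 1.76 dB and picks the smallest word length whose quantization noise stays below the masking threshold; `bits_needed` is a hypothetical helper, not the bit allocation procedure of the MPEG standard.

```python
def quantizer_snr_db(bits):
    """Approximate SNR of a uniform quantizer for a full-scale sinusoid."""
    return 6.02 * bits + 1.76


def bits_needed(smr_db):
    """Smallest word length whose SNR covers the signal-to-mask ratio.

    If the signal is already below the masking threshold (SMR <= 0),
    the band is inaudible and needs no bits at all.
    """
    if smr_db <= 0:
        return 0
    bits = 1
    while quantizer_snr_db(bits) < smr_db:
        bits += 1
    return bits
```

For a band with a signal-to-mask ratio of 20 dB, three bits give about 19.8 dB of SNR, which is not quite enough, so this sketch allocates four.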

  10. Temporal Masking
     [Figure: amplitude against time, showing the pre-masking curve before the masker and the post-masking curve after it.]

  11. MPEG Perceptual Model
     • A Matlab demo.

  12. MPEG Audio Layer 1
     • MPEG (1 and 2) audio allows sampling rates of 44.1, 48, 32, 22.05, 24 and 16 kHz.
     • MPEG filters the input audio into 32 bands.
     [Figure: audio (384 samples) passes through filtering and downsampling into subbands of 12 samples each; each subband is normalized by a scale factor and passed to the perceptual coder.]

  13. MPEG Audio Layer 2
     • Layer 2 is very similar to Layer 1, but groups three 12-sample blocks together in coding.
     • It also improves the scale factor quantization and groups three audio samples together in bit assignment.
     [Figure: audio (3×384 samples) passes through filtering and downsampling into subbands of 36 samples each; each subband is normalized by a scale factor and passed to the perceptual coder.]
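
The frame arithmetic for Layers 1 and 2 can be checked with a few lines of Python. This sketch assumes the 32 subbands and 12-sample blocks stated on the slides; `normalize_block` uses the block's peak magnitude as the scale factor, which is a simplification (the standard selects the scale factor from a fixed quantized table).

```python
SUBBANDS = 32
SAMPLES_PER_BLOCK = 12


def frame_samples(blocks=1):
    """Input samples per frame: 1 block for Layer 1, 3 blocks for Layer 2."""
    return SUBBANDS * SAMPLES_PER_BLOCK * blocks


def frame_duration_ms(rate_hz, blocks=1):
    """Duration of one frame in milliseconds at the given sampling rate."""
    return 1000.0 * frame_samples(blocks) / rate_hz


def normalize_block(block):
    """Return (scale, normalized block) so that the peak magnitude is 1.

    Assumption: the peak of the block is used directly as the scale factor.
    """
    scale = max(abs(s) for s in block)
    if scale == 0:
        return 0.0, list(block)
    return scale, [s / scale for s in block]
```

At 48 kHz a Layer 1 frame of 384 samples lasts 8 ms, and a Layer 2 frame of 1152 samples lasts 24 ms.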

  14. Overlapped Transform and MDCT
     • In an overlapped transform, 2N samples are transformed to N elements.
     [Figure: windows 1 to 4, each 2N samples long and overlapping its neighbors. In the reverse transform, the outputs of windows 1 to 4 are overlapped and added to give the reconstructed result.]
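
The overlap-and-add reconstruction can be tried out with a small MDCT written directly from its definition. This is a slow O(N²) sketch, not the course's Matlab code: it assumes a sine window (which satisfies the Princen-Bradley condition), applies it at both analysis and synthesis, and uses a 2/N scaling in the inverse so that overlapping halves sum back to the input. All function names are illustrative.

```python
import math


def sine_window(length):
    """Sine window of the given (even) length 2N."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]


def mdct(block):
    """Windowed MDCT: 2N input samples to N coefficients."""
    n2 = len(block)
    n = n2 // 2
    w = sine_window(n2)
    return [sum(w[i] * block[i]
                * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(n2))
            for k in range(n)]


def imdct(coeffs):
    """Windowed inverse MDCT: N coefficients to 2N aliased samples."""
    n = len(coeffs)
    w = sine_window(2 * n)
    return [w[i] * (2.0 / n)
            * sum(coeffs[k]
                  * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                  for k in range(n))
            for i in range(2 * n)]


def analyze_synthesize(signal, n):
    """Transform overlapping 2N windows (hop N) and overlap-add the inverses."""
    padded = [0.0] * n + list(signal) + [0.0] * n
    while len(padded) % n:
        padded.append(0.0)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        rebuilt = imdct(mdct(padded[start:start + 2 * n]))
        for i, value in enumerate(rebuilt):
            out[start + i] += value   # aliasing cancels between neighbors
    return out[n:n + len(signal)]
```

Each individual inverse block contains time-domain aliasing; only the sum of two neighboring blocks reproduces the original samples, which is why the padding ensures every input sample is covered by two windows.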

  15. Some Matlab Code
     • The program compares DCT and MDCT in audio processing.
     • Code is available on the course website as a tarball, mdct_and_dct.tar.

  16. MP3
     • MP3 (MPEG audio Layer 3) is another layer built on top of MPEG audio Layer 2.
     • MP3 additionally applies the MDCT to each band and encodes the MDCT coefficients.
     • MP3 then uses Huffman coding to further compress the bit stream losslessly.
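
The lossless final stage can be illustrated with a generic Huffman coder. Note the assumption: this sketch builds a code table from the symbol counts of its input, whereas MP3 uses predefined tables from the standard. The input list is made-up data standing in for quantized coefficients, and the function names are illustrative.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie-breaker, partial code table for that subtree).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, table):
    """Concatenate the code words of all symbols."""
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    """Invert encode() by matching code words bit by bit."""
    inverse = {code: s for s, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Made-up quantized values: small magnitudes dominate, as is typical.
values = [0, 0, 0, 0, 0, 0, 1, 1, -1, 2]
table = huffman_code(values)
bits = encode(values, table)
```

Frequent values get short code words, so the encoded stream is shorter than a fixed 2-bit code for the four distinct values, and decoding recovers the input exactly.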

  17. File Format
     • MPEG audio puts a header in each frame, so that frames can be decoded separately.
     [Figure: Frame 1 and Frame 2, each laid out as Header | CRC | Bit Allocation | Scale factors | Subband Data.]
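
Because every frame starts with its own header, a decoder can locate a frame and read its parameters without any earlier data. The sketch below decodes a few fields of the 32-bit MPEG audio frame header; it covers only MPEG-1 and MPEG-2, leaves the bitrate as a raw index rather than looking it up, and `parse_header` is an illustrative name.

```python
VERSIONS = {0b11: "MPEG-1", 0b10: "MPEG-2"}
LAYERS = {0b11: 1, 0b10: 2, 0b01: 3}
SAMPLE_RATES = {
    "MPEG-1": (44100, 48000, 32000),
    "MPEG-2": (22050, 24000, 16000),
}


def parse_header(data):
    """Decode selected fields from the first 4 bytes of an MPEG audio frame."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes")
    word = int.from_bytes(data[:4], "big")
    if word >> 21 != 0x7FF:                 # frame sync: 11 set bits
        raise ValueError("no frame sync")
    version = VERSIONS.get((word >> 19) & 0b11)
    layer = LAYERS.get((word >> 17) & 0b11)
    rate_index = (word >> 10) & 0b11
    if version is None or layer is None or rate_index == 0b11:
        raise ValueError("reserved or unsupported header value")
    return {
        "version": version,
        "layer": layer,
        "has_crc": ((word >> 16) & 1) == 0,  # protection bit 0 means CRC follows
        "bitrate_index": (word >> 12) & 0xF,
        "sample_rate": SAMPLE_RATES[version][rate_index],
        "padding": bool((word >> 9) & 1),
    }
```

For example, the byte sequence FF FB 90 00 that commonly opens an MP3 frame decodes as MPEG-1, Layer 3, no CRC, 44100 Hz.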

  18. Other Audio Coding Standards
     • MPEG-2 and MPEG-4 AAC (Advanced Audio Coding)
       • Not backward compatible
       • Uses MDCT without bandpass filtering
     • Dolby AC-3
       • MDCT-based codec
       • Similar to MPEG AAC but uses a different quantization and coding scheme
       • A de facto standard for DVD and digital audio in movies.

  19. Realtime Audio Systems
     [Figure: an audio I/O process and an audio processing unit connected by an audio input circular queue and an audio output circular queue, each with a write pointer and a read pointer.]
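
The circular queue in the figure can be sketched as a fixed-size buffer with separate read and write positions. This is a single-threaded illustration with a hypothetical `RingBuffer` class; a real-time system would also need synchronization between the I/O process and the processing unit, and a policy for overruns and underruns.

```python
class RingBuffer:
    """Fixed-capacity circular queue with a read pointer and a write pointer."""

    def __init__(self, capacity):
        self._data = [0] * capacity
        self._capacity = capacity
        self._read = 0       # index of the next sample to hand out
        self._write = 0      # index of the next free slot
        self._count = 0      # samples currently stored

    def write(self, samples):
        """Store as many samples as fit; return how many were accepted."""
        n = min(len(samples), self._capacity - self._count)
        for s in samples[:n]:
            self._data[self._write] = s
            self._write = (self._write + 1) % self._capacity
        self._count += n
        return n

    def read(self, n):
        """Remove and return up to n samples in arrival order."""
        n = min(n, self._count)
        out = []
        for _ in range(n):
            out.append(self._data[self._read])
            self._read = (self._read + 1) % self._capacity
        self._count -= n
        return out
```

In the arrangement shown on the slide, the I/O process would write captured samples into the input queue and read from the output queue, while the processing unit does the reverse.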
