Multimedia Technology

Multimedia Technology Compression I

• Compression techniques were developed early in the life of computers, to cope with the problems of limited memory and storage capacity
• Hardware advances have limited the requirement for such techniques in desktop applications
• Network and communication capacity restrictions have resulted in continuing work on compression
• The advent of distributed multimedia has resulted in considerable developments in compression
• Problem: real-time, or timely, transmission of audio and video over communications networks

[Figure: Digital Video and Image Compression. Video input (colour components) passes through the video compression algorithm to produce a compressed bit-stream. Technique categories shown: Simple (Truncation, CLUT, Run-length); Interpolative (Subsample); Predictive (DPCM, ADPCM, Motion Comp.); Transform (DCT); Statistical (Huffman). Bit assignment: Fixed or Adaptive.]

As can be seen from the diagram, the majority of video compression algorithms use a combination of compression techniques to produce the bit-stream. We will consider each of the individual techniques identified in the diagram.
• We assume that all input to the system is a PCM digitised signal in colour component (RGB, YUV) form. (PCM is Pulse Code Modulation; we will discuss this later when considering sound sampling.)
• The choice of colour component form can be important where colour processing differs between compression and decompression.
• Techniques can be made adaptive to the image content.

Simple Compression (Encoding) Techniques
• Truncation
  • throw away the least significant bits of each pixel
  • too much truncation causes contouring, and the image becomes cartoon-like
  • for real images, truncation from 24bpp to 16bpp gives good results (RGB = 5:5:5 + keying bit; YUV = 6:5:5)
• CLUT (Colour Lookup Table)
  • pixel values in the bitmap are indexes into a table of colours
  • usually 8bpp, so the image is limited to 256 colours
  • a unique CLUT can be created for each image, but this requires non-trivial preprocessing
  • bpp can be increased for better quality, but once you reach 16bpp truncation is better and simpler
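The 24bpp to 16bpp truncation mentioned above can be sketched in a few lines. This is only an illustration: the function names and the bit layout (keying bit in the top bit, then R, G, B) are assumptions, not something the notes specify.

```python
def rgb888_to_rgb555(r, g, b, key=0):
    """Pack 8-bit R, G, B into 16 bits: 1 keying bit + 5:5:5 (assumed layout)."""
    # Dropping the 3 least significant bits of each component is the truncation.
    return ((key & 1) << 15) | ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)


def rgb555_to_rgb888(word):
    """Unpack a 16-bit word back to 8-bit components (low bits come back as 0)."""
    r5 = (word >> 10) & 0x1F
    g5 = (word >> 5) & 0x1F
    b5 = word & 0x1F
    return r5 << 3, g5 << 3, b5 << 3


packed = rgb888_to_rgb555(200, 100, 50)
restored = rgb555_to_rgb888(packed)
print(packed, restored)
```

The restored pixel differs slightly from the original because the discarded bits cannot be recovered; on smooth gradients that loss is what appears as contouring.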

• Run-length Encoding
  • blocks of repeated pixels are replaced with a single value plus a count
  • works well on images with large blocks of solid colour, where it can compress to below 1bpp
  • good for computer-generated images, cartoons, etc.
  • poor for real images, video, etc.

Interpolative Techniques
• Interpolative encoding works at the pixel level by transmitting a subset of the pixels and using interpolation to reconstruct the intervening pixels
• not really compression, as we are reducing the number of pixels rather than the size of their representation
• it is validly used in colour subsampling of luminance-chrominance (YUV) images, where it can reduce 24bpp to 9bpp (for example, keeping Y at 8 bits per pixel and sending one U and one V sample per 4x4 block of pixels gives 8 + 16/16 = 9bpp)
• also used in motion video compression (e.g. MPEG)
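Returning to run-length encoding, the "single value plus a count" idea described above can be sketched as follows. The function names and the (value, count) pair representation are illustrative assumptions; real formats pack the runs into bytes in their own ways.

```python
def rle_encode(pixels):
    """Collapse consecutive repeated pixel values into (value, count) pairs."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([p, 1])       # start a new run
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original pixel sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


row = [7, 7, 7, 7, 7, 7, 3, 3, 9]
encoded = rle_encode(row)
print(encoded)
```

Nine pixels become three pairs here. On a row with no repeats every pixel would need its own pair, which is why the technique does poorly on real images.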

Predictive Techniques
• Based on the fact that we can store the previous item (frame, line, pixel, etc.) and use it to help build the next item, allowing us to transmit only the part of the item that has changed.
• DPCM (Differential Pulse Code Modulation)
  • Compare adjacent pixels and transmit only the difference between them. Because adjacent pixels are likely to be similar, the difference values have a high probability of being small and can safely be transmitted with fewer bits. Hence we can use 4-bit difference values for 8-bit pixels.
  • In decompression, the difference value is used to modify the previous pixel to get the new one, which works well as long as the amplitude change is small. A full-amplitude change, say from black to white, overloads the DPCM system: it takes a number of pixel times to make the change, which smears the edges in high-contrast images (slope overload).
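A minimal sketch of DPCM with 4-bit differences shows both the normal case and slope overload. It assumes the first pixel is sent in full and that differences are clamped to the signed 4-bit range -8..7; both choices, and the function names, are illustrative.

```python
def dpcm_encode(pixels, lo=-8, hi=7):
    """Return the first pixel plus clamped differences (signed 4-bit range)."""
    first = pixels[0]
    recon = first                      # what the decoder will have reconstructed
    diffs = []
    for p in pixels[1:]:
        d = max(lo, min(hi, p - recon))
        diffs.append(d)
        recon += d                     # track the decoder so errors do not drift
    return first, diffs


def dpcm_decode(first, diffs):
    """Rebuild pixels by adding each difference to the previous pixel."""
    out = [first]
    for d in diffs:
        out.append(out[-1] + d)
    return out


smooth = [100, 102, 101, 105, 104]
edge = [0, 0, 255, 255, 255]
print(dpcm_decode(*dpcm_encode(smooth)))   # small changes are reproduced exactly
print(dpcm_decode(*dpcm_encode(edge)))     # the black-to-white edge is smeared
```

On the edge, each step can climb by at most 7, so the decoded values creep upwards over many pixels instead of jumping to white.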

• ADPCM (Adaptive DPCM)
  • Can adapt the step size of the difference values to cope with full-amplitude changes, at the cost of some extra overhead in data and processing.
  • Replaces slope overload with quantisation noise at high-contrast edges.
• Since predictive encoding depends on previous pixels to build future ones, any error propagates to the pixels that follow. To limit this, predictive schemes typically restart the differential coding at regular points, often at the beginning of each scanning line or each frame.

Transform Coding Techniques
• A transform is a process that converts a bundle of data into an alternative form which is more convenient for some purpose.
• Transforms are usually reversible, using an inverse transform.
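To illustrate the adaptive step size in ADPCM described above, here is a toy sketch. The adaptation rule (double the step when the code saturates, halve it when the code is near zero, cap at 32) is invented for this example and is not taken from any standard ADPCM scheme; the function names are also assumptions.

```python
def _next_step(step, code):
    """Toy adaptation rule, derived only from the transmitted code."""
    if abs(code) >= 7:
        return min(step * 2, 32)       # saturated: take bigger steps
    if abs(code) <= 1:
        return max(step // 2, 1)       # quiet: take finer steps
    return step


def _clamp(value):
    return max(0, min(255, value))


def adpcm_encode(pixels):
    """Return the first pixel plus 4-bit codes, each scaled by the current step."""
    first = pixels[0]
    recon, step = first, 1
    codes = []
    for p in pixels[1:]:
        code = max(-8, min(7, round((p - recon) / step)))
        codes.append(code)
        recon = _clamp(recon + code * step)
        step = _next_step(step, code)
    return first, codes


def adpcm_decode(first, codes):
    """Mirror the encoder: same reconstruction and same step adaptation."""
    out, step = [first], 1
    for code in codes:
        out.append(_clamp(out[-1] + code * step))
        step = _next_step(step, code)
    return out


edge = [0, 0, 255, 255, 255, 255, 255]
decoded = adpcm_decode(*adpcm_encode(edge))
print(decoded)
```

Because the decoder derives the step size from the codes it receives, no extra side data is needed in this particular sketch. The price is that once the step has grown large, small changes can only be represented coarsely, which is the quantisation noise mentioned above.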

• In image and video compression, the bundle of data is usually a two-dimensional array of pixels, e.g. 8x8.

2x2 Array of Pixels

    A  B
    C  D

    Transform        Inverse Transform
    X0 = A           An = X0
    X1 = B - A       Bn = X1 + X0
    X2 = C - A       Cn = X2 + X0
    X3 = D - A       Dn = X3 + X0
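The 2x2 transform and its inverse can be checked directly in code. The function names and the sample pixel values are illustrative assumptions; the arithmetic follows the X0..X3 equations given above.

```python
def forward(a, b, c, d):
    """X0 = A, X1 = B - A, X2 = C - A, X3 = D - A."""
    return a, b - a, c - a, d - a


def inverse(x0, x1, x2, x3):
    """An = X0, Bn = X1 + X0, Cn = X2 + X0, Dn = X3 + X0."""
    return x0, x1 + x0, x2 + x0, x3 + x0


block = (120, 123, 118, 121)       # four similar neighbouring pixels
coeffs = forward(*block)
print(coeffs)
print(inverse(*coeffs) == block)
```

For neighbouring pixels with similar values, the three difference terms stay small, which is what makes it possible to store them in fewer bits than the base pixel.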

In the simple example shown, if the pixels were 8 bits each then the block would use 32 bits:
• Using the transform, we could assign 4 bits to each of the difference values and 8 bits to the base pixel, A. This would reduce the data to 8 + (3x4) = 20 bits for the 2x2 block, compressing from 8bpp to 5bpp.
• This example is too small to be useful. Typically transforms are applied to 8x8 blocks, and the trick is to develop good transforms with calculations that are easy to implement in hardware or software.

The Discrete Cosine Transform
• especially important for video and image compression
• typically used on 8x8 pixel blocks: 64 pixel values go in and 64 new values come out, representing the amplitudes of the two-dimensional spatial frequency components of the 64-pixel block. These are referred to as DCT coefficients.
• the coefficient for zero spatial frequency is called the DC coefficient; the remaining 63 are the AC coefficients, which represent the amplitudes of progressively higher spatial frequencies in the block
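A direct, unoptimised sketch of the 8x8 DCT makes the DC and AC coefficients concrete. It assumes the orthonormal DCT-II scaling, which is one common convention; the notes do not say which scaling they use, and real codecs rely on fast algorithms instead of this four-level loop.

```python
import math

N = 8


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct form, for clarity only)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# A smooth test block: a gentle ramp along each row, identical rows.
block = [[100 + 2 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)
dc = coeffs[0][0]                     # zero-frequency (DC) coefficient
print(round(dc, 3), round(coeffs[0][1], 3), round(coeffs[0][7], 3))
```

For this smooth block the DC coefficient and the lowest AC coefficient carry almost all of the signal, and every coefficient outside the first row is zero to within rounding error.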

As adjacent pixel values tend to be similar or vary slowly from one to another, the DCT provides the opportunity for compression by concentrating most of the signal energy in the lower spatial frequency components. In most cases, many of the higher-frequency coefficients will have zero or near-zero values and can be ignored.

Statistical Coding
• Uses the statistical distribution of the pixel values in an image, or of the data created by one of the techniques already described.
• Also known as entropy encoding.
• Can be used in bit assignment as well as in the compression algorithm itself.
• Because pixel values are not uniformly distributed, we can set up a coding technique in which the more frequently occurring values are encoded using fewer bits.
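One well-known way to give frequent values shorter codes is Huffman coding. The sketch below builds a code table from value frequencies; the function name and the sample data are illustrative assumptions, and a practical coder would also need to pack the bits and store the table.

```python
import heapq
from collections import Counter


def huffman_codebook(values):
    """Build a {value: bit-string} table in which frequent values get short codes."""
    freq = Counter(values)
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                 # a single distinct value still needs one bit
        return {sym: "0" for sym in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)    # the two least frequent groups
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


pixels = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
book = huffman_codebook(pixels)
bits = sum(len(book[p]) for p in pixels)
print(book, bits)
```

With a fixed 2 bits per value these 16 pixels would need 32 bits; the variable-length table needs 28, and the saving grows as the distribution becomes more skewed.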

• A codebook is created which sets out the encodings for the pixel values. It is transmitted separately from the image data and can apply to part of an image, a single image or a sequence of images.
• Because the most frequently occurring values are transmitted using fewer bits, high compression ratios can be achieved.
• One of the most widely used forms of statistical coding is called Huffman encoding.

Motion Compensation
• If we are transmitting video frames by describing the difference between one frame and the next, how do we describe motion?
  • Compare frames for differences
  • Set threshold value for motion

  • Use DPCM approach to encode the data
  • Use block structure to determine motion in parts of the image (similar to the transform approach)
  • In sophisticated compression systems, motion vectors can be developed to ensure fidelity of reproduction

Classification of Compression Algorithms
• Lossless compression
  • decompressed image is mathematically identical to the original
  • only achieves a modest level of compression (5:1)
• Lossy compression
  • image shows degradation from the original
  • high compression ratios (up to 200:1)
• Objective: achieve the highest possible compression ratio while maintaining image quality that is “virtually lossless”
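Returning to the motion compensation steps listed earlier (compare frames, apply a threshold, work block by block, derive motion vectors), the sketch below shows one simple way they could fit together. The sum of absolute differences (SAD) measure, the exhaustive search, the threshold value and all names are assumptions made for illustration; the notes do not prescribe a particular method.

```python
def sad(prev, cur, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in cur and a shifted block in prev."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def best_vector(prev, cur, bx, by, size=4, search=2):
    """Try every shift within +/- search pixels; return (cost, dx, dy) of the best."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue               # shifted block would leave the frame
            cost = sad(prev, cur, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def frame_with_square(x0, y0, width=8, height=8):
    """Black frame with a white 2x2 square whose top-left corner is (x0, y0)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(y0, y0 + 2):
        for x in range(x0, x0 + 2):
            frame[y][x] = 255
    return frame


THRESHOLD = 64                          # assumed motion threshold
prev = frame_with_square(2, 2)
cur = frame_with_square(3, 2)           # the square has moved one pixel right
moved = sad(prev, cur, 2, 2, 0, 0, 4) > THRESHOLD
cost, dx, dy = best_vector(prev, cur, bx=2, by=2)
print(moved, (dx, dy), cost)
```

The vector (-1, 0) says that the block's content is found one pixel to the left in the previous frame. With that vector the remaining difference is zero, so only the vector needs to be sent for this block.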
