Cgmb 324: multimedia system design

Chapter 9: Video Compression

Objectives Upon completing this chapter, you should be able to: • understand the terms and concepts related to video compression • identify the commonly available video codecs • apply video compression effectively in a multimedia system

An Overview Of Compression • Compression is necessary in multimedia because multimedia data requires large amounts of storage space and bandwidth • Types of Compression • Lossless compression = no data is altered or lost in the process • Lossy compression = some data is lost, but the file can be reasonably reproduced using the remaining data.

Introduction • Uncompressed video data is huge. • In HDTV, the bit-rate could exceed 1 Gbps, which causes big problems for storage and network communications. • We will discuss both spatial and temporal redundancy removal, that is, intra-frame and inter-frame coding.
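To make the "exceeds 1 Gbps" figure concrete, here is a minimal sketch of the arithmetic. The frame size (1920 x 1080), colour depth (24 bits per pixel) and frame rate (30 fps) are illustrative assumptions, not values given in these slides.

```python
def uncompressed_bitrate_bps(width, height, bits_per_pixel, fps):
    """Raw video bit-rate in bits per second (no compression, no audio)."""
    return width * height * bits_per_pixel * fps


# Assumed HDTV parameters (illustrative only).
hdtv_bps = uncompressed_bitrate_bps(1920, 1080, 24, 30)
print(f"{hdtv_bps / 1e9:.2f} Gbps")  # 1.49 Gbps
```

Changing any of the assumed parameters scales the result linearly, which is why higher resolutions and frame rates quickly become impractical without compression.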

Video Compression • Video compression deals with the compression of visual video data. • Video compression is necessary for efficient coding of video data in video file formats and streaming video formats. • Video is basically a three-dimensional array of colour pixels. • Two dimensions serve as the spatial (horizontal and vertical) directions of the (moving) pictures, and one dimension represents the time domain (temporal).

Video Compression • A frame is a set of all pixels that (approximately) correspond to a single point in time. • Basically, a frame is the same as a still picture. • However, in interlaced video, the set of horizontal lines with even numbers and the set with odd numbers are grouped together in fields. • The term "picture" can refer to a frame or a field.

Video Compression • Video data contains spatial and temporal redundancy. • Similarities can thus be encoded by merely registering differences within a frame (spatial) and/or between frames (temporal). • Spatial encoding takes advantage of the fact that the human eye cannot distinguish small differences in colour as easily as it can changes in brightness, so very similar areas of colour can be "averaged out" in a similar way to JPEG images. • With temporal compression, only the changes from one frame to the next are encoded, since a large number of the pixels are often the same across a series of frames.
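The idea of encoding only the changes between frames can be sketched as follows. This is a toy, lossless illustration on tiny greyscale frames stored as lists of lists; the function names are hypothetical, and real codecs use motion compensation and transform coding rather than a per-pixel dictionary.

```python
def frame_delta(prev, curr):
    """Return {(row, col): new_value} for every pixel that changed."""
    return {
        (r, c): curr[r][c]
        for r in range(len(curr))
        for c in range(len(curr[r]))
        if curr[r][c] != prev[r][c]
    }


def apply_delta(prev, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = [row[:] for row in prev]  # copy so the previous frame is untouched
    for (r, c), value in delta.items():
        frame[r][c] = value
    return frame


prev_frame = [[1, 1, 1], [1, 1, 1]]
curr_frame = [[1, 1, 1], [1, 9, 1]]

delta = frame_delta(prev_frame, curr_frame)
print(delta)                                          # {(1, 1): 9}
print(apply_delta(prev_frame, delta) == curr_frame)   # True
```

Only one of the six pixels changed, so only one entry needs to be stored for the second frame.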

Video Compression • Video compression typically reduces this redundancy using lossy compression. • Intraframe (spatial) compression: techniques that reduce spatial redundancy within a frame • Interframe (temporal) compression: motion compensation and other techniques that reduce temporal redundancy between frames (adapted from www.wikipedia.org)

Video Compression • In broadcast engineering, digital television (DVB, ATSC and ISDB) is made practical by video compression. • TV stations can broadcast not only HDTV, but multiple virtual channels on the same physical channel as well. • It also conserves precious bandwidth on the radio spectrum. • Nearly all digital video broadcast today uses the MPEG-2 standard video compression format, although H.264/MPEG-4 AVC and VC-1 are emerging contenders in that domain.

A Note On Container Formats • Container formats like AVI (Audio Video Interleave), OGM (Ogg Media), ASF (Advanced Streaming Format), MKV (Matroska), MOV and the MPEG System are NOT codecs. • They are instead media file formats which allow video, audio and even subtitles to be combined into one file. • Typically, to decode a stream, a media player first demuxes it. This means that it reads the container format and separates audio, video, and subtitles, if any. Then, each of these is passed to a decoder that does the mathematical processing to decompress the stream.

Video codec • A video codec is a device or software module that enables the use of compression (mostly lossy) for digital video. • Important issues/factors: • the balance between the video quality and the quantity of data needed to represent it (also known as the bit rate), • the complexity of the encoding and decoding algorithms, • robustness to data losses and errors, • ease of editing and random access, • the state of the art of compression algorithm design, • end-to-end delay, • other factors.

Common Video Codecs • H.261: Designed for videoconferencing and video telephony over ISDN lines. • MPEG-1 Part 2: Used for Video CDs. • MPEG-2 Part 2: Used on DVD and, in another form, for SVCD. • H.263: Used primarily for video conferencing, video telephony, and internet video. • MPEG-4 Part 2: An MPEG standard that can be used for internet, broadcast, and on storage media. • MPEG-4 Part 10 (H.264/AVC): Adopted into a number of company products, including for example the PlayStation Portable, the Nero Digital product suite, Mac OS X v10.4, as well as HD-DVD/Blu-Ray.

Common Video Codecs • DivX, XviD and 3ivx: Video codec packages that basically use the MPEG-4 Part 2 video codec, with the *.avi, *.mp4, *.ogm or *.mkv file container formats. • Sorenson 3: A codec that is popularly used by Apple's QuickTime, basically the ancestor of H.264. • WMV (Windows Media Video): Microsoft's family of video codec designs, including WMV 7, WMV 8, and WMV 9. • RealVideo: Developed by RealNetworks. • Cinepak: A very early codec used by Apple's QuickTime.

Missing Codecs • A common problem arises when an end user wants to watch a video stream encoded with a specific codec: if the exact codec is not present and properly installed on the user's machine, the video won't play (or won't play optimally). • Some video file and codec analysis tools have been made available to provide a user-friendly way to solve this common problem: • VideoInspector: Analyzes most containers (AVI, Matroska, MPEG, etc.) and gives direct download links for missing codecs. • GSpot: A pioneer in troubleshooting video applications.

Video codec • A variety of codecs can be implemented with relative ease on PCs and in consumer electronics equipment • It is therefore possible for multiple codecs to be available in the same product, avoiding the need to choose a single dominant codec for compatibility reasons. • In the end it seems unlikely that one codec will replace them all

H. 261 • Developed by CCITT (International Telegraph and Telephone Consultative Committee) in 1988-1990 • Designed for videoconferencing and video-telephone applications over ISDN telephone lines. • H.261 was the first practical digital video coding standard. • All subsequent international video coding standards (MPEG-1, MPEG-2/H.262, H.263, and even H.264) have been based closely on its design. • The bitrate is p x 64 kbit/s, where p ranges from 1 to 30 (that is, 64 kbit/s to 1.92 Mbit/s).

Overview Of H. 261 • Frame Sequence • Frame resolutions are CCIR 601 CIF (352 x 288) and QCIF (176 x 144)

Common Intermediate Format (CIF) • CIF is used to standardize the horizontal and vertical resolutions in pixels of YUV sequences in video signals. • QCIF means "Quarter CIF". These two formats are the most common in video coding. • To have one quarter of the area, as "quarter" implies, the height and width of the frame are both halved.
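A quick check of the "quarter" relationship, using the CIF and QCIF resolutions given for H.261. The helper name below is hypothetical and exists only for illustration.

```python
CIF = (352, 288)  # width, height in pixels


def quarter_format(resolution):
    """Halve both dimensions, which quarters the pixel area."""
    width, height = resolution
    return (width // 2, height // 2)


QCIF = quarter_format(CIF)
cif_pixels = CIF[0] * CIF[1]
qcif_pixels = QCIF[0] * QCIF[1]

print(QCIF)                       # (176, 144)
print(cif_pixels // qcif_pixels)  # 4
```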

Overview Of H.261 • Two frame types: Intra-frames (I-frames) and Inter-frames (P-frames): • An intra-frame (I-frame) provides an access point; it is basically coded like a JPEG image. • Inter-frames (P-frames) use "pseudo-differences" from the previous frame (they are "predicted"), so the frames depend on each other.

H. 263 • A video codec designed as a low-bitrate encoding solution for videoconferencing • Developed as an evolutionary improvement based on experience from H.261 • Completed in 1995, it provided a suitable replacement for H.261 at all bitrates • It was further enhanced in projects known as H.263v2 • The next enhanced codec after H.263 is the H.264 standard, also known as AVC and MPEG-4 Part 10. • As H.264 provides a significant improvement in capability beyond H.263, the H.263 standard is now considered primarily a legacy design.


MPEG 1. What is MPEG? • "Moving Picture Experts Group", established in 1988 to create standards for the delivery of video and audio. • MPEG-1 Target: VHS (Video Home System) quality on a CD-ROM or Video CD; • 352 x 240 video + near-CD audio (224 kb/s mp2) = about 1.5 Mbits/sec • The standard has three parts: Video, Audio, and System (to control interleaving of streams)

MPEG • Typical pattern is IBBPBBPBBIBBPBBPBBIBBPBBPBB • Actual pattern is up to encoder, and need not be regular.

I & P Frames • I frames (intra-coded pictures) are coded without using information about other frames. • An I frame is treated as a still image and is coded with JPEG-like techniques. • I frames form the anchors for random access. • P frames (predictive coded pictures) require information about previous I and/or P frames for encoding and decoding. • Decoding a P frame requires decompression of the last I frame and any intervening P frames. • In return, the compression ratio is higher than for I frames. • A P frame allows the following P frame to be accessed if there are no intervening I frames.
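The dependency rule for P frames (the last I frame plus any intervening P frames must be decoded first) can be sketched as below. The function name and the one-letter-per-frame pattern string are illustrative assumptions; B frames are rejected as targets because they also need a following reference frame.

```python
def frames_to_decode(pattern, target):
    """Indices of the I/P frames that must be decoded to show an I or P frame.

    `pattern` is a display-order string such as "IBBPBBPBB";
    `target` is the index of an I or P frame in that string.
    """
    if pattern[target] == "B":
        raise ValueError("B frames also need a following reference frame")
    needed = []
    i = target
    while i >= 0:
        if pattern[i] in "IP":
            needed.append(i)
            if pattern[i] == "I":  # random-access anchor reached
                break
        i -= 1
    return needed[::-1]


print(frames_to_decode("IBBPBBPBB", 6))  # [0, 3, 6]
print(frames_to_decode("IBBPBBPBB", 0))  # [0]
```

The further a P frame is from the last I frame, the more frames must be decoded to reach it, which is why I frames act as the random-access points.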

B & D Frames • B frames (bi-directionally predictive coded pictures) require information from previous and following I and/or P frames. • B frames have the highest compression ratio in MPEG. • A B frame is defined as the difference from a prediction based on a previous and a following I or P frame. • A B frame can never, however, serve as a reference for the prediction coding of other pictures. • D frames (DC coded pictures) are intraframe-coded and can be used for efficient fast forward. • For D frames, only the DC coefficients of the DCT (Discrete Cosine Transform) are coded.
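Because a B frame needs a following I or P frame, the encoder has to send that reference frame before the B frames that depend on it. The sketch below reorders a display-order pattern into one possible coded (transmission) order. The function name is hypothetical, and trailing B frames are simply emitted last, whereas a real stream would send them after the next group's reference frame.

```python
def coded_order(display_pattern):
    """Reorder frames so each B frame follows both of its reference frames."""
    out = []
    pending_b = []
    for index, frame_type in enumerate(display_pattern):
        label = f"{frame_type}{index}"
        if frame_type == "B":
            pending_b.append(label)   # hold until the next reference arrives
        else:
            out.append(label)         # I or P: send the reference first
            out.extend(pending_b)     # then the B frames that needed it
            pending_b = []
    out.extend(pending_b)             # simplification for trailing B frames
    return out


print(coded_order("IBBPBBPBBI"))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5', 'I9', 'B7', 'B8']
```

The decoder reverses this reordering before display, which is one source of the end-to-end delay that B frames introduce.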

MPEG 2. Differences from H. 261 • Larger gaps between I and P frames, so the motion vector search range must be expanded. • To get better encoding, motion vectors can be specified to a fraction of a pixel (1/2 pixel). • The bitstream syntax must allow random access, forward/backward play, etc. • Adds the notion of a slice for resynchronization after lost or corrupted data.
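To illustrate what a motion vector search does, here is a minimal full-search block-matching sketch using the sum of absolute differences (SAD) as the cost. It works at integer-pixel precision only; the half-pixel refinement mentioned above would additionally interpolate the reference frame. The function names, block size and search range are assumptions chosen for the example, not values from any standard.

```python
def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(
        abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
        for j in range(n)
        for i in range(n)
    )


def best_motion_vector(ref, cur, cx, cy, n, search):
    """Full search: return (cost, dx, dy) of the best match within +/- search."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= width - n and 0 <= ry <= height - n:
                cost = sad(ref, cur, rx, ry, cx, cy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best


# Reference frame with a unique value per pixel; the current frame is the
# same picture moved two pixels to the right (with wrap-around at the edge).
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [[ref[y][(x - 2) % 8] for x in range(8)] for y in range(8)]

print(best_motion_vector(ref, cur, cx=4, cy=2, n=2, search=3))  # (0, -2, 0)
```

A larger gap between reference frames means objects can move further, so `search` has to grow, and the cost of a full search grows with its square.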

MPEG • An MPEG picture is made up of one or more "slices," and a slice is intended to be the unit of recovery from data loss or corruption. • An MPEG-compliant decoder will normally advance to the beginning of the next slice whenever an error is encountered in the stream.
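Resynchronizing at the next slice can be sketched as a search for the next slice start code. This assumes the MPEG-1/MPEG-2 video convention of a 0x000001 start-code prefix followed by a code byte in the range 0x01 to 0xAF for slices; these values are not stated in the slides, and the function name is hypothetical.

```python
START_PREFIX = b"\x00\x00\x01"


def next_slice_offset(data, start=0):
    """Return the offset of the next slice start code, or None if absent."""
    pos = start
    while True:
        pos = data.find(START_PREFIX, pos)
        if pos < 0 or pos + 3 >= len(data):
            return None                      # no prefix, or no code byte left
        if 0x01 <= data[pos + 3] <= 0xAF:    # assumed slice start-code range
            return pos
        pos += 1                             # some other start code: keep going


# A picture start code (code byte 0x00), some payload, then a slice (0x05).
stream = b"\xff\x00\x00\x01\x00junk\x00\x00\x01\x05rest"
print(next_slice_offset(stream))  # 9
```

After an error, a decoder following this approach would discard data up to the returned offset and resume decoding there.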

MPEG 3. Decoding MPEG Video in Software • Software decoder goals: portable, multiple display types • Breakdown of decoding time (IDCT = Inverse DCT)

VCD • A VCD (Video Compact Disc) is a standard compact disc, like an audio CD, which contains video and sound. • VCDs have the capacity to hold 74, 80, 90 and even up to 99 minutes of motion video with sound. • VCDs typically use MPEG-1 compression for the video stream (1.15 Mbps) and MPEG-1 Layer 2 compression for the audio stream (224 Kbps). • A VCD can be played on almost all hardware VCD and DVD players and on computer CD-ROM/DVD-ROM drives. • The quality of a VCD is comparable to VHS (Video Home System, introduced by JVC in 1976) but has the advantage of being digital.

VCD 1.1 Specifications • The basic VCD specification dating back to 1993/1994 • Up to 98 multiplexed MPEG-1 audio/video streams • Up to 500 MPEG sequence entry points used as chapter divisions • The VCD specification requires the multiplexed MPEG-1 stream to have a CBR (constant bitrate) of less than 174300 bytes (1394400 bits) per second in order to accommodate single speed (1x) CD-ROM drives

VCD 1.1 Specifications • It allows for the following two resolutions: • 352 x 240 @ 29.97 fps (NTSC) • 352 x 240 @ 23.976 fps (NTSC FILM) • The CBR (constant bitrate) MPEG-1 Layer 2 audio stream is fixed at 224 kbps with 1 stereo or 2 mono channels • It is recommended to keep the video bitrate under 1151929.1 bps
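The VCD numbers fit together as a simple bit budget, sketched below. The multiplex limit, audio rate and recommended video rate come from the specification figures above; the sector size (2324 bytes) and sector rate (75 per second at 1x) are an assumption added here to show where 174300 bytes per second comes from.

```python
# Assumed CD-ROM parameters at single speed (not stated in the slides).
SECTORS_PER_SECOND = 75
BYTES_PER_SECTOR = 2324

mux_limit_bytes = SECTORS_PER_SECOND * BYTES_PER_SECTOR
mux_limit_bps = mux_limit_bytes * 8

# Figures from the VCD specification slides.
audio_bps = 224_000
video_bps = 1_151_929.1

headroom_bps = mux_limit_bps - (audio_bps + video_bps)

print(mux_limit_bytes)         # 174300
print(mux_limit_bps)           # 1394400
print(round(headroom_bps, 1))  # 18470.9
```

The remaining headroom is what is left for multiplexing overhead once the audio and video streams are combined.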

VCD 2.0 Specifications • An improved Video CD 2.0 standard was published in 1995. It included added functionality as indicated below, among other things. • Support for mixing NTSC and PAL content. • By adding PAL support to the Video CD 1.1 specification, the following resolutions became available: • 352 x 240 @ 29.97 fps (NTSC) • 352 x 240 @ 23.976 fps (NTSC Film) • 352 x 288 @ 25 fps (PAL)

Newer MPEG Standards 1. MPEG-2 • Unlike MPEG-1 which is basically a standard for storing and playing video on a single computer at low bit-rates, MPEG-2 is a standard for digital TV. • It meets the requirements for HDTV and DVD (Digital Video/Versatile Disc).

MPEG-2 Specifications

Newer MPEG Standards • Other differences of MPEG-2 from MPEG-1: • Supports both field prediction and frame prediction. • Besides 4:2:0, also allows 4:2:2 and 4:4:4 chroma subsampling. • MPEG-2 usually uses a variable bitrate (as on DVDs), while MPEG-1 on VCD has a fixed bitrate (CBR).
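The effect of the three chroma subsampling schemes on raw frame size can be sketched as follows, assuming 8 bits per sample and one luma sample per pixel. The 720 x 480 frame size is an illustrative assumption, and the table of sample counts reflects the usual meaning of the J:a:b notation rather than anything specific to these slides.

```python
# Average number of samples (Y + Cb + Cr) stored per pixel.
SAMPLES_PER_PIXEL = {
    "4:4:4": 3.0,  # full-resolution chroma
    "4:2:2": 2.0,  # chroma halved horizontally
    "4:2:0": 1.5,  # chroma halved horizontally and vertically
}


def raw_frame_bytes(width, height, scheme):
    """Uncompressed frame size in bytes at 8 bits per sample."""
    return int(width * height * SAMPLES_PER_PIXEL[scheme])


for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, raw_frame_bytes(720, 480, scheme))
# 4:4:4 1036800
# 4:2:2 691200
# 4:2:0 518400
```

4:2:0 halves the raw data before any further compression is applied, which is why it is the default for consumer formats, while 4:2:2 and 4:4:4 trade size for colour fidelity.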

Newer MPEG Standards • Frame sizes could be as large as 16383 x 16383 • MPEG-3 (do not confuse with mp3 = MPEG-1 layer 3 audio): • Originally planned for HDTV, got folded into MPEG-2

Newer MPEG Standards 2. MPEG-4 • Version 1 approved Oct. 1998 • Originally targeted at very low bit-rate communication (4.8 to 64 Kb/sec), it now aims at the following ranges of bit-rates: • video -- 5 Kb to 10 Mb per second • audio -- 2 Kb to 64 Kb per second

MPEG-4 • It emphasizes the concept of Visual Objects and the Video Object Plane (VOP) • Objects can be of arbitrary shape; VOPs can be non-overlapped or overlapped • Supports content-based scalability • Supports object-based interactivity • Individual audio channels can be associated with objects

MPEG-4 • Good for video composition, segmentation, and compression; networked VRML, audiovisual communication systems (e.g., text-to-speech interface, facial animation), etc. • Standards are being developed for shape coding, motion coding, texture coding, etc. • Many codecs use MPEG-4 compression technology, such as DivX and Xvid, mainly to make high quality copies of DVD movies at sizes that fit on a single CD-ROM (700 MB). • MPEG-4 Part 10 (H.264) has superior quality at the same sizes but isn't as popular yet because of limited device support. • One drawback of all these is that more processing power is required to decompress the video.

MPEG-7 3. MPEG-7 (Multimedia Content Description Interface) • MPEG-7 is a content representation standard for multimedia information search, filtering, management and processing. • It defines Descriptors for multimedia objects and Description Schemes for the descriptors and their relationships. • It also defines a Description Definition Language (DDL) for specifying Description Schemes. • For visual content, the lower-level descriptions are colour, texture, shape, size, etc.; the higher level could include a semantic description such as "this is a scene with cars on the highway".

MPEG As A Container • MPEG is also a container format, sometimes referred to as the MPEG System. • There are several types of MPEG stream, namely ES, PS, and TS. • When you play an MPEG video from a DVD, for instance, the MPEG stream is actually composed of several streams (called Elementary Streams, ES): • there is one stream for video, one for audio, another for subtitles, and so on.

MPEG As A Container • These different streams are mixed together into a single Program Stream (PS). • So, the .VOB files you find on a DVD are actually MPEG-PS files. • But the PS format is not well suited to streaming video through a network or by satellite. • So, another format called Transport Stream (TS) was designed for streaming MPEG videos through such channels.
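As a rough illustration of why the Transport Stream suits lossy channels, it is carried in small fixed-size packets that each identify which elementary stream they belong to. The sketch below reads the header of one packet, assuming the MPEG-2 TS layout of 188-byte packets that begin with the sync byte 0x47 and carry a 13-bit packet identifier (PID). These details are not given in the slides, and the function name is hypothetical.

```python
TS_PACKET_SIZE = 188
TS_SYNC_BYTE = 0x47


def parse_ts_header(packet):
    """Extract a few header fields from a single transport stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != TS_SYNC_BYTE:
        raise ValueError("not a valid transport stream packet")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],  # 13-bit stream id
        "continuity_counter": packet[3] & 0x0F,        # detects lost packets
    }


# A hand-built packet: sync byte, PID 256, counter 10, then empty payload.
example_packet = bytes([0x47, 0x41, 0x00, 0x1A]) + bytes(184)
print(parse_ts_header(example_packet))
# {'payload_unit_start': True, 'pid': 256, 'continuity_counter': 10}
```

A demuxer would route each packet to the video, audio or subtitle decoder according to its PID, and a gap in the continuity counter signals that a packet was lost.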

How To Apply Video Compression Effectively • With the wide variety of video formats and codecs available (MPEG, AVI, MOV, etc.), it becomes difficult to decide which codec should be used in a multimedia system. • Choosing the right one, or perhaps even a combination, can bring many benefits for both the developer and the client. • These are some of the things, in no particular order, that should be considered when deciding what compression to use for your video content.

How To Apply Video Compression Effectively • Storage capacity • Client system capability • Purpose of the multimedia system • Nature of the video (people, places, animals, colour etc.) • Quality of the video • Disc access time (CD-ROM, hard drive) • Problems with the original video stream

Storage Capacity • Higher quality video means a larger file size. • Better compression technology can yield higher quality video at the same file size as a poorer compression technology. • For example, MPEG-4 video will produce better quality video (in terms of resolution and colour reproduction) than an older codec such as MPEG-1 at the same file size of 5 MB. • Everything else being equal, though, a larger file can usually be expected to produce higher quality video because it holds more data and less data was lost during compression. • So good compression technology must be used, wherever possible, to obtain both high quality and small file sizes.

Storage Capacity [Figure: two screenshots from the same DVD source. First: DivX Pro 5.0.4 codec, near-DVD quality compression. Second: MPEG-1 codec, VHS quality compression.] • The source of the video stream also matters. • If the source is VHS, the quality will be lower than if it were DVD. • In this example, both screenshots are from a DVD source. • The DivX picture, however, is larger and clearer, and comes from a smaller file than the MPEG-1 one. • The only drawback is that it might not be playable on as many machines as the MPEG-1 movie.

Client System Capability • We must always bear in mind that high quality compression and low file sizes take their toll in the form of system requirements. • A powerful computer would have no problem playing a complete, near-DVD quality movie file, 700 MB in size, 90 minutes long, at a resolution of 640x480 pixels, compressed with the DivX 5.0.5 codec. • However, a slower machine, say, under 500 MHz clock speed, would hardly manage. • Also, certain operating systems do not allow certain video codecs to be installed or do not function properly with them. • All of this must be carefully considered before deciding on what compression to use. • How about the ability to scan or search through the video? • Highly compressed videos are often difficult to scan through, even with powerful computers.
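The 700 MB, 90 minute example implies an average bitrate that the client machine must be able to read and decode in real time. A minimal sketch of that calculation follows, assuming 1 MB = 1,000,000 bytes and counting audio and video together; the function name is hypothetical.

```python
def average_bitrate_bps(file_size_bytes, duration_seconds):
    """Average combined bitrate of a file in bits per second."""
    return file_size_bytes * 8 / duration_seconds


movie_bps = average_bitrate_bps(700 * 1_000_000, 90 * 60)
print(f"{movie_bps / 1e6:.2f} Mbit/s")  # 1.04 Mbit/s
```

This is slightly lower than the roughly 1.4 Mbit/s multiplex rate of a VCD, yet the file is described as near-DVD quality at 640x480, which illustrates how much of the burden a newer codec shifts from storage to processing power.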

Purpose Of The Multimedia System • What is the purpose of the multimedia system? • Is it an entertainment system? An educational one? • Scientific and educational multimedia systems would benefit from high quality video footage, but it is not essential if what you have is sufficient to get the message across. • Still, the video must be at an acceptable level of quality. • Entertainment systems are different. • People expect nothing less than high quality video from them. Entertainment systems also cannot just make do with clips. • Provide full content wherever it would be appreciated. • If the system is neither educational nor entertainment, the general rule of applying high quality video using suitable, good compression still applies.

Nature Of The Video • What kind of video do we want to include in the system? • Is it people, places, animals or something beyond that? • If we want to show a colony of ants, a codec prone to pixellation, such as Cinepak, might not be a good idea. • A music video would have to be big and clear enough so as not to disappoint fans. • The artiste must be seen at a certain level of clarity. • Places and sceneries usually do not require much detail, so they may benefit from higher compression (less emphasis on quality) or the use of less advanced codecs (bigger file size, but more compatible).

Nature Of The Video [Figure captions] • Ants: video that suffers from the blocky effect would not be suitable to depict the ants. • Music video: music videos require high quality codecs at good file sizes so that the fans (and artiste) are not disappointed; the example shown isn't good enough. • Scenery: sceneries do not pack much detail, so sometimes even a poor codec can do the job.
