Video coding standards


Video Coding Standards

Heejune AHN

Embedded Communications Laboratory

Seoul National Univ. of Technology

Fall 2011

Last updated 2011. 5. 13


Agenda

  • History and Concepts

  • JPEG and JPEG-2000

  • MPEG-1 and MPEG-2

  • MPEG-4

  • H.261 and H.263

  • H.264

  • Beyond H.264


1. Standards and Standards Bodies

  • VCEG (Video Coding Experts Group) in ITU-T (formerly CCITT)

    • Focus on real-time, two-way video communication

  • MPEG (Moving Picture Experts Group) and JPEG (Joint Photographic Experts Group) in ISO/IEC

    • Focus on multimedia storage and distribution for entertainment

  • Some standards are shared between the two bodies

  • ITU VCEG: H.261, H.263, H.264

  • ISO MPEG/JPEG: MPEG-1, MPEG-2, JPEG, MPEG-4, JPEG-2000, MPEG-7, MPEG-21

  • Shared standards: MPEG-2 Video is also H.262; H.264 is also MPEG-4/AVC

  • Later work: H.264 High Profile, H.264 SVC, H.264 MVC, HEVC (H.265)


History of Video Coding Standards

[Figure: timeline of the standards listed below, extended to 2011 with H.264 High Profile (HP), SVC, MVC, and HEVC]


  • ISO-MPEG/JPEG

    • JPEG (1992) : compression of still images (DCT)

    • MPEG-1 (1993) : real-time playback of VHS-quality video on Video CD (1.4 Mbps)

    • MPEG-2 (1995) : broadcast-quality video service (3~5 Mbps)

    • MPEG-4 (1998) : wide range of bit rates (20 kbps and up) and object-based coding

    • JPEG-2000 (2000) : higher-quality still image compression

  • ITU-VCEG

    • H.261 (1990) : video telephony over ISDN (p×64 kbps)

    • H.263 (1995) : video telephony over circuit-switched and packet networks, from 20 kbps upward

    • H.264 (2003) : general-purpose video coding with higher coding efficiency

  • Others

    • MPEG-7 (Multimedia Content Description Interface) for search and retrieval in multimedia databases

    • MPEG-21 (Multimedia Framework) for interoperable multimedia delivery


Standards process and usage

  • Standards process

  • Understanding standards

    • A standard defines only the bitstream syntax and the decoder.

    • Encoder design, application, and implementation are left open to users.

    • Standards provide "profiles and levels" and recommended usage to help users choose among the many technical options.

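[Figure: standards process. Scope and aim of the standard -> proposals from companies and universities -> performance and complexity evaluation -> test model (documents and reference software) -> improvement proposals (repeated) -> draft standard -> international standard]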


2. JPEG

  • ISO/IEC IS 10918

    • By ISO/IEC JTC1/SC29/WG10 (1984~1992)

    • Widely used on the WWW and in digital photography

    • Motion-JPEG is just a successive stream of JPEG images


Baseline JPEG Codec

  • RGB or YCbCr components are coded either separately or in interleaved order

  • Encoding pipeline (input image split into 8x8 blocks)

    • Level offset: [0,255] => [-128,127]

    • 8x8 DCT

    • Uniform scalar quantization (quantization tables)

    • DC quantization indices: differential coding, then VLC with DC Huffman tables (SSSS value) -> bits

    • AC quantization indices: zig-zag scan, run-level coding, then VLC with AC Huffman tables (RRRRSSSS value) -> bits
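The per-block steps of the baseline codec can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a conforming encoder: the function names (`dct_8x8`, `zigzag_order`, `encode_block`) and the flat quantization table in the usage note are made up for this example, and the final Huffman (VLC) stage is left out.

```python
import math

N = 8  # block size

def dct_8x8(block):
    """2-D DCT-II of an 8x8 block (direct, unoptimized form)."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

def zigzag_order():
    """Return the 64 (row, col) positions in zig-zag scan order."""
    order = []
    for s in range(2 * N - 1):
        diag = [(i, s - i) for i in range(N) if 0 <= s - i < N]
        # even anti-diagonals run bottom-left to top-right, odd ones the reverse
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def encode_block(pixels, qtable, prev_dc):
    """Return (DC difference, AC (run, level) pairs, DC index) for one block."""
    shifted = [[p - 128 for p in row] for row in pixels]      # level offset
    coeffs = dct_8x8(shifted)                                 # 8x8 DCT
    q = [[round(coeffs[u][v] / qtable[u][v]) for v in range(N)]
         for u in range(N)]                                   # uniform quantization
    scan = [q[u][v] for u, v in zigzag_order()]               # zig-zag scan
    dc_diff = scan[0] - prev_dc                               # differential DC coding
    pairs, run = [], 0
    for level in scan[1:]:                                    # run-level coding of AC
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return dc_diff, pairs, scan[0]
```

For a flat block of value 136 with every quantization step set to 8, only the DC index is non-zero, so the AC run-level list comes back empty.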


  • Lossless JPEG

    • DPCM is used, with prediction from three neighboring pixels

  • Optional mode

    • Progressive encoding

      • Store image data in order of DC only, low-frequency AC, high frequency AC

    • Hierarchical encoding

      • Store image data from low resolution to high resolution

  • Motion-JPEG

    • Just a sequence of JPEG still images

    • Low complexity, error tolerance, market awareness

    • Used for video conferencing and surveillance before cheap MPEG-1/2/4 solutions became widely available
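The DPCM idea behind the lossless mode listed above can be illustrated as follows. The predictor `a + b - c` (left, above, above-left) is one of the selectable predictors in lossless JPEG; the function names and the treatment of missing border neighbors as zero are simplifying assumptions for this sketch, and entropy coding of the residuals is omitted.

```python
def dpcm_residuals(img):
    """Prediction residuals using the three-neighbor predictor a + b - c."""
    h, w = len(img), len(img[0])
    res = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = img[y][x - 1] if x > 0 else 0                 # left
            b = img[y - 1][x] if y > 0 else 0                 # above
            c = img[y - 1][x - 1] if x > 0 and y > 0 else 0   # above-left
            res[y][x] = img[y][x] - (a + b - c)
    return res

def dpcm_reconstruct(res):
    """Invert dpcm_residuals exactly (lossless)."""
    h, w = len(res), len(res[0])
    img = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = img[y][x - 1] if x > 0 else 0
            b = img[y - 1][x] if y > 0 else 0
            c = img[y - 1][x - 1] if x > 0 and y > 0 else 0
            img[y][x] = res[y][x] + (a + b - c)
    return img
```

On a smooth ramp the residuals away from the border are all zero, which is what makes them cheap to entropy-code.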


JPEG-2000

  • Features

    • Better compression performance than JPEG

      • No blocking artifacts at high compression ratios

    • Good compression for both continuous-tone and bi-level (text) images

    • Both lossless and lossy compression in one framework

    • ROI (region of interest) support

    • Error resilience support (data partitioning)

    • Rather slow on current embedded systems due to its complexity

  • Encoding process

    • image -> (tiling) -> wavelet transform -> quantizer -> arithmetic encoder -> bits


[Figure: Lenna, 256x256 RGB, coded with baseline JPEG and with JPEG-2000 at the same size of 4572 bytes]
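As an illustration of the wavelet stage, here is one level of the reversible 5/3 wavelet that JPEG-2000 Part 1 specifies for lossless coding, written with integer lifting steps. The function names are hypothetical, the input is assumed to be a 1-D list of even length, and the 2-D transform (rows, then columns), tiling, quantization, and arithmetic coding are all omitted.

```python
def fwd_53(x):
    """One level of the reversible 5/3 lifting transform (even-length input)."""
    n = len(x)
    half = n // 2
    d = []  # high-pass (detail) coefficients: predict step
    for i in range(half):
        left = x[2 * i]
        # symmetric extension at the right border
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]
        d.append(x[2 * i + 1] - (left + right) // 2)
    s = []  # low-pass (smooth) coefficients: update step
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]  # symmetric extension at the left border
        s.append(x[2 * i] + (d_prev + d[i] + 2) // 4)
    return s, d

def inv_53(s, d):
    """Exact inverse of fwd_53: undo the update step, then the predict step."""
    half = len(s)
    x = [0] * (2 * half)
    for i in range(half):
        d_prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (d_prev + d[i] + 2) // 4
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < 2 * half else x[2 * i]
        x[2 * i + 1] = d[i] + (left + right) // 2
    return x
```

Because each lifting step is undone with the same integer arithmetic, the round trip is exact, which is how lossless and lossy coding can share one framework.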


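MPEG-1/2

  • MC-DCT Hybrid Coding

[Figure: hybrid encoder block diagram. The prediction (motion-compensated for inter, 0 for intra) is subtracted from the input; the difference passes through the intra-frame DCT coder (DCT, Quant) to the entropy coder as DCT coefficients; an intra-frame decoder (DeQ, inverse DCT) feeds the motion-compensated predictor; the motion estimator supplies motion data; coder control supplies the intra/inter decision and control data]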


MPEG-1

  • MPEG-1

    • Targets VHS quality (352x288 at 25 fps or 352x240 at 30 fps, YCbCr 4:2:0) on Video CD (600MB)

    • 1.4 Mbps (1.2 Mbps video + 0.2 Mbps audio) on VCD, about 70 minutes

    • Three parts: Part 1 System, Part 2 Video, Part 3 Audio

  • Technology

    • MC-DCT Hybrid

      • Macro-block (16x16 pixels): Motion estimation unit

      • Block (8x8 pixels): DCT and Quant unit

    • GOP structure

      • I, P, B picture

      • Trade-off between random access and coding efficiency

    • Asymmetric complexity

      • The encoder needs more memory and much more computation than the decoder
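Most of the encoder-side cost comes from motion estimation, which the standard leaves to the implementer. A minimal full-search block-matching sketch, assuming frames stored as lists of rows of luma samples and the sum of absolute differences (SAD) as the matching cost; the function names and the default search range are hypothetical.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a current block and a reference block."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))

def full_search(cur, ref, cx, cy, size=16, search=7):
    """Integer-pel motion vector (dx, dy) minimizing SAD within +/- search."""
    h, w = len(ref), len(ref[0])
    best_mv = (0, 0)
    best_cost = sad(cur, ref, cx, cy, cx, cy, size)  # start from the zero vector
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # skip candidates that fall outside the reference picture
            if rx < 0 or ry < 0 or rx + size > w or ry + size > h:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The decoder only has to fetch the block the vector points to, which is why complexity is asymmetric.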


MPEG-1 Structure

  • Syntax Hierarchy

    • Sequence layer

    • GOP layer

    • Picture Layer

    • Slice Layer

    • MB Layer

    • Block Layer


  • Picture Coding

    • I Picture: no interframe prediction

    • P Picture: interframe prediction from one causal (previous) reference picture

    • B Picture: interframe prediction from one previous and one future picture

  • GOP and picture order

    • display order (input at encoder)

    • Transmission order (Encoding/decoding order)

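[Figure: the pictures I1, B1, B2, P1, B4, B5, P2, B6, B7, I2 shown in display order and in transmission order, where each reference picture (I or P) is sent before the B pictures that depend on it]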
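The reordering from display order to transmission order can be sketched as below. The picture labels and their display order in the usage note are an assumed reading of the slide's figure, and the function name is hypothetical; the rule itself is that a B picture can only be sent after the future reference picture it predicts from.

```python
def display_to_transmission(display):
    """Reorder pictures so each I/P reference precedes the B pictures before it."""
    out, pending_b = [], []
    for pic in display:
        if pic.startswith("B"):
            pending_b.append(pic)      # hold B pictures until the next reference
        else:
            out.append(pic)            # send the I or P reference first
            out.extend(pending_b)      # then the B pictures that needed it
            pending_b = []
    out.extend(pending_b)              # trailing B pictures, if any
    return out
```

With the display order I1 B1 B2 P1 B4 B5 P2 B6 B7 I2, this gives I1 P1 B1 B2 P2 B4 B5 I2 B6 B7.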


MPEG-2

  • Major target application

    • Digital television quality (720x576/480, 25/30 fps) at 3 ~ 4Mbps

  • Interlaced video support

    • Frame picture vs field picture : motion compensation unit

    • Frame DCT vs field DCT in frame picture

[Figure: a frame picture versus its two field pictures, and frame DCT versus field DCT block organization within a frame picture]


  • Scalability Support

    • Spatial scalability

      • Low resolution at Base layer and high resolution at Enhancement layer

      • BL is used for prediction of EL

      • E.g. SD resolution at BL, HD resolution at EL

    • Temporal scalability

      • 30 fps at BL, 60 fps at EL

    • SNR scalability

      • Same resolution but different quality

    • Data partitioning

      • Coded data is packed into separate streams

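[Figure: two-layer scalable codec. The input video is downsampled and coded by the BL encoder into the BL bit stream, which the BL decoder turns into lower-quality video; the EL encoder produces the EL bit stream, which adds the higher-quality layer]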


  • Profile & Level

    • MPEG-2 has many options; not every implementation needs all of them

    • Profiles

      • Simple : 4:2:0 input, I and P picture only, low complexity & low perf.

      • Main : 4:2:0 input, I,P,B Picture, interlaced

      • 4:2:2 : 4:2:2 input (chroma has the same vertical resolution as luma)

      • SNR : SNR scalable

      • Spatial : Spatial scalable

      • High : Spatial and 4:2:2

    • Level

      • Low (352x288), Main(720x576), High 1440 (1440x1152), High (1920x1152)

    • E.g.

      • MPEG-1 : Main profile & Low Level

      • SD DTV, DVD : Main profile & Main Level

      • HDTV : Main profile & High Level (Historically MPEG-3’s target application)


MPEG-4

  • Features

    • Support for low bit rates (from 20 kbps)

    • Support for object-based coding

      • Reuse of components, composition, and interactivity support.

    • In practice, object-based coding is rarely used

  • Object-based Coding

    • Video Object

    • Shape Coding : transparent/opaque region, binary or grey scale

    • Texture coding with arbitrary shape

      • DCT after zero filling in inter blocks and extrapolation in intra blocks

[Figure: a scene composed of three video objects, VO1, VO2, and VO3]



H.261

  • ITU standards mostly focus on real-time communication

  • H.261

    • First widely used video coding standard (1990)

    • N-ISDN (1990s)

      • p×64 kbps (p = 1, ..., 30), typically 64 ~ 384 kbps

      • Circuit network based: low delay, reliable

  • H.261 key features

    • YCbCr420 CIF, QCIF input

    • MC-DCT

    • Integer-pel motion

    • Optional loop filter (for deblocking)

      • Filtering at 8x8 block boundary

    • Forward error correction (FEC) used


H.261 syntax structure

  • H.261 Bit structure


H.263

  • H.263 Versions

    • Version 1 (1995): improvement to H.261, 4 optional modes

    • Version 2 (1998, H.263+): 12 optional modes

    • Version 3 (2000, H.263++): 19 optional modes

  • Key Features

    • Targets bit rates down to 20 kbps, and packet-based networks as well

    • Half-pel prediction

    • Redesigned 3-D VLC code
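Half-pel prediction means the motion vector may point halfway between reference samples, with the missing values obtained by bilinear interpolation. A minimal sketch, assuming coordinates given in half-pel units and rounding to nearest with halves rounded up; the function name is hypothetical, and border handling and the optional rounding control are omitted.

```python
def half_pel_sample(ref, x2, y2):
    """Reference sample at half-pel coordinates (x2, y2 in units of 1/2 pixel)."""
    x, y = x2 // 2, y2 // 2          # integer-pel position
    fx, fy = x2 % 2, y2 % 2          # 1 if the position lies halfway in x or y
    a = ref[y][x]
    if not fx and not fy:
        return a                                     # integer position
    if fx and not fy:
        return (a + ref[y][x + 1] + 1) // 2          # horizontal half position
    if fy and not fx:
        return (a + ref[y + 1][x] + 1) // 2          # vertical half position
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) // 4
```

For the 2x2 patch [[10, 20], [30, 40]], the centre half-pel sample is (10 + 20 + 30 + 40 + 2) // 4 = 25.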


  • H.263 Optional Modes

    • Annex D: Unrestricted motion vectors

    • Annex E: Syntax-based arithmetic coding

    • Annex F: Advanced Prediction

    • Annex G: PB Frames

    • Annex I : Advanced Intra Coding

    • Annex J: Deblocking Filter

    • Annex K: Slice Structured Mode

    • Annex L: Supplemental enhancement information

    • Annex M: Improved PB frames

    • Annex N: Reference Picture Selection

    • Annex O: Scalability

    • Annex P: Reference Picture Resampling

    • Annex Q: Reduced Resolution Update

    • Annex R: Independent Segment Decoding

    • Annex S: Alternative Inter VLC

    • Annex T: Modified Quantization

    • Annex U: Enhanced Reference Picture Selection

    • Annex V: Data-Partitioned Slice

    • Annex W: Additional Supplemental Enhancement Information



H.264

  • Name

    • ITU H.264 = ISO MPEG-4 Part 10/AVC

    • H.26L : the long-term enhancement project, not compatible with H.263

    • Now adopted in DMB-T/S and IPTV, replacing many MPEG-2 solutions

    • Aimed at a 50% coding gain over H.263+


  • Key features

    • Smaller processing units (down to 4x4 pixel blocks)

    • Intra prediction

    • Inter prediction

      • Macroblock based Interframe prediction selection

      • ¼ pixel motion vector support

      • Motion vector options for subblocks

    • 4x4 Integer DCT

    • Deblocking filter

    • Universal VLC

    • CABAC (context-based adaptive binary arithmetic coding)
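The universal VLC in H.264 is built on Exp-Golomb codes, so one regular code construction replaces the per-symbol Huffman tables of earlier standards. A minimal sketch that returns codewords as bit strings; the function names are hypothetical.

```python
def ue(value):
    """Unsigned Exp-Golomb codeword for value >= 0, as a string of bits."""
    info = bin(value + 1)[2:]                 # binary form of value + 1
    return "0" * (len(info) - 1) + info       # zero prefix, then the binary form

def se(value):
    """Signed Exp-Golomb: maps 0, 1, -1, 2, -2, ... to code numbers 0, 1, 2, 3, 4, ..."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue(code_num)

def decode_ue(bits):
    """Decode one unsigned codeword from the front of a bit string."""
    zeros = len(bits) - len(bits.lstrip("0"))          # count the zero prefix
    return int(bits[zeros:2 * zeros + 1], 2) - 1
```

Small values get short codewords: 0 is coded as "1", 1 as "010", 2 as "011", and 3 as "00100".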


Intra-frame Prediction

[Figure: 4x4 luma prediction from the neighboring samples A-H (above and above-right), I-L (left), and M (above-left), with directional modes and a mean (DC) mode; 16x16 prediction from the row above (H) and the column to the left (V), including a mode that uses the mean of H and V]

  • luma

    - 4x4: 9 modes

    - 16x16: 4 modes

  • chroma

    - 8x8: 4 modes

    - The same prediction mode is always applied to both chroma blocks
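Three of the nine 4x4 luma modes are easy to write down: vertical, horizontal, and DC. The sketch below assumes both neighbor sets are available, with `above` holding samples A-D and `left` holding samples I-L; the function name and the mode labels are hypothetical, and the six diagonal modes are omitted.

```python
def predict_4x4(mode, above, left):
    """4x4 intra prediction block from the row above (A-D) and the column left (I-L)."""
    if mode == "vertical":
        return [list(above) for _ in range(4)]        # copy A-D down each column
    if mode == "horizontal":
        return [[left[r]] * 4 for r in range(4)]      # copy I-L across each row
    if mode == "dc":
        dc = (sum(above) + sum(left) + 4) >> 3        # rounded mean of 8 neighbors
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("unsupported mode: " + mode)
```

An encoder would typically try every mode, keep the one with the smallest residual cost, and code only the mode number plus the residual.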


Inter-frame Prediction

[Figure: inter-frame prediction among I, P, and B pictures]


Transform and Quantization

  • Integer DCT

    • No encoder decoder mismatch

  • Three types of transform, each followed by quantization

    - Type 1: for the 4x4 array of luma DC coefficients in intra MBs predicted in 16x16 mode (block -1)

    - Type 2: for the 2x2 arrays of chroma DC coefficients (blocks 16-17)

    - Type 3: for all other 4x4 blocks (blocks 0-15 and 18-25)

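[Figure: numbering of the blocks within a macroblock]

  • Block -1: luma DC (16x16 intra mode only)

  • Blocks 0-15: luma 4x4 blocks (4 pixels x 4 pixels each), laid out as

         0  1  4  5
         2  3  6  7
         8  9 12 13
        10 11 14 15

  • Blocks 16-17: chroma DC

  • Blocks 18-21 and 22-25: chroma 4x4 blocks, laid out as

        18 19    22 23
        20 21    24 25

*Data is transmitted in the numbered order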


Transform and Quantization

  • 4×4 DCT (X: input, Y: output)

  • 4×4 integer transform

    - forward

    - backward

  • W: output of the integer transform, before the post-scaling factor (PF) is applied
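A sketch of the forward 4×4 integer transform and of how the post-scaling factor can be folded into quantization. The core matrix and the three PF values (a², ab/2, b²/4 with a = 1/2, b = √(2/5)) follow the commonly published description of H.264; the function names are hypothetical, and the floating-point `quantize` is for illustration only, since real codecs use an integer multiply and shift instead.

```python
import math

CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]  # forward core transform matrix

def forward_core(x):
    """W = CF * X * CF^T, computed with integer arithmetic only."""
    t = [[sum(CF[i][k] * x[k][j] for k in range(4)) for j in range(4)]
         for i in range(4)]
    return [[sum(t[i][k] * CF[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]

A = 0.5
B = math.sqrt(2 / 5)

def post_scale(i, j):
    """Post-scaling factor PF for coefficient position (i, j)."""
    if i % 2 == 0 and j % 2 == 0:
        return A * A
    if i % 2 == 1 and j % 2 == 1:
        return B * B / 4
    return A * B / 2

def quantize(w, qstep):
    """Z = round(W * PF / Qstep); floating point is used here for clarity only."""
    return [[round(w[i][j] * post_scale(i, j) / qstep) for j in range(4)]
            for i in range(4)]
```

Because the core transform uses only small integers, encoder and decoder compute exactly the same values, which is the reason there is no encoder/decoder mismatch.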



Deblocking Filters

  • A boundary-strength (BS) parameter is assigned to every 4×4 block

    • BS = 0: no filtering

    • BS = 1-3: slight filtering

    • BS = 4: strong filtering

  • Filtering is applied only when all three conditions hold

    • |P0-Q0| < α

    • |P1-P0| < β

    • |Q1-Q0| < β

  • Thresholds α and β depend on the average quantization parameter (QP)

  • The deblocking filter accounts for about 1/3 of the computational complexity of a decoder.
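The filtering decision for one line of samples across a block edge can be written directly from the conditions above. In this sketch `p1, p0` are the two samples on one side of the edge and `q0, q1` the two on the other; the function name is hypothetical, and the QP-dependent tables for α and β as well as the filter taps themselves are not shown.

```python
def should_filter(p1, p0, q0, q1, bs, alpha, beta):
    """True if the edge between p0 and q0 should be deblocked."""
    return (bs > 0
            and abs(p0 - q0) < alpha    # step across the edge is small
            and abs(p1 - p0) < beta     # each side is locally smooth
            and abs(q1 - q0) < beta)
```

A small step across the edge is treated as a blocking artifact and smoothed, while a large step is assumed to be a real image edge and left alone.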


Network Adaptation

  • VCL & NAL

    • VCL (video coding layer)

    • NAL (network abstraction layer)

  • Error Resilient Tools

    • Flexible macroblock ordering (FMO)

      • Allows MBs to be assigned to slices in an order other than scan order

    • Arbitrary slice ordering (ASO)

      • Reduces end-to-end delay in real-time applications

    • Redundant slices (RS)

      • Redundant representations are coded using different coding parameters

[Figure: FMO example with macroblocks assigned to Slice Group #0 and Slice Group #1]
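One way to picture FMO is a checkerboard map over two slice groups. The pattern is chosen here for illustration and is not taken from the slide, and the function name is hypothetical. If the packet carrying one group is lost, every missing macroblock still has received neighbors to conceal from.

```python
def checkerboard_slice_groups(mb_cols, mb_rows):
    """Map each macroblock (row-major) to slice group 0 or 1 in a checkerboard."""
    return [[(x + y) % 2 for x in range(mb_cols)] for y in range(mb_rows)]
```

For a picture 4 macroblocks wide and 2 high, the map is [[0, 1, 0, 1], [1, 0, 1, 0]].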


Profile & Level

  • Main application

    • Baseline : Video telephony

    • Main : DTV and Storage

    • Extended : Streaming

  • Profile & tools




Conclusion

  • Many video coding standards

    • Standards reflect both coding technology and implementation technology

    • Coding performance has improved more than fourfold since H.261 (1990)

  • What’s next

    • SVC (Scalable Video Coding) in H.264 (done)

    • H.264ext (further improvement of H.264)

    • 3-D video and MVC (Multi-View Coding) are ongoing.

    • UDTV (ultra-definition TV: 3840x2160)

    • And what’s next?

