Video Coding Standards. Heejune AHN Embedded Communications Laboratory Seoul National Univ. of Technology Fall 2011 Last updated 2011. 5. 13. Agenda . History and Concepts JPEG and JPEG-2000 MPEG-1 and MPEG-2 MPEG-4 H.261 and H.263 H.264 Beyond H.264.


Video Coding Standards

Heejune AHN

Embedded Communications Laboratory

Seoul National Univ. of Technology

Fall 2011

Last updated 2011. 5. 13

Agenda
  • History and Concepts
  • JPEG and JPEG-2000
  • MPEG-1 and MPEG-2
  • MPEG-4
  • H.261 and H.263
  • H.264
  • Beyond H.264
1. Standards and Standards Bodies
  • VCEG (Video Coding Experts Group) in ITU-T (formerly CCITT)
    • Focus on real-time, two-way video communication
  • MPEG (Moving Picture Experts Group) and JPEG (Joint Photographic Experts Group) in ISO/IEC
    • Focus on multimedia storage and distribution for entertainment
  • Some standards are joint work of both bodies (e.g., MPEG-2 Video = H.262, MPEG-4 AVC = H.264)

[Figure: timeline of standards by body. ITU VCEG: H.261, H.262 (= MPEG-2), H.263, H.264 (= MPEG-4/AVC), H.264 High Profile, H.264 SVC, H.264 MVC, HEVC (H.265). ISO MPEG/JPEG: MPEG-1, MPEG-2, JPEG, MPEG-4, JPEG-2000, MPEG-7, MPEG-21.]

  • ISO-MPEG/JPEG
    • JPEG (1992) : compression of still image (DCT)
    • MPEG-1 (1993) : real-time playback of VHS-quality video on Video CD (1.4 Mbps)
    • MPEG-2 (1995) : broadcast-quality video service (3~5 Mbps)
    • MPEG-4 (1998) : wide range of bit rates (from 20 kbps upward) and object-based coding
    • JPEG-2000 (2000) : better quality still image
  • ITU-VCEG
    • H.261 (1990) : video telephony over ISDN (p×64 kbps)
    • H.263 (1995) : video telephony over circuit and packet networks, from 20 kbps to high bandwidth
    • H.264 (2003) : multipurpose, better quality video coding
  • Others
    • MPEG-7 (Multimedia content description interface) for search and retrieval in multimedia DB
    • MPEG-21(Multimedia Framework) for multimedia delivery for interoperability
Standards process and usage
  • Standards process
    • [Figure: standardization flow. Scope and aim of the standard → proposals from companies and universities → performance and complexity evaluation → test model (documents and reference software) → improvement proposals → draft standard → international standard]
  • Understanding standards
    • Only the bitstream syntax and the decoder are defined in the standards.
    • Encoder, application, and implementation are left open to users.
    • Standards provide "profiles and levels" and recommended usage to help users choose among the many technical options.
2. JPEG
  • ISO IS-10918
    • By ISO/IEC JTC1/SC29/WG10, (1984~1992)
    • Widely used in WWW and digital photography
    • Motion-JPEG is just a successive stream of JPEG images
Baseline JPEG Codec

  • Color components (RGB or YCbCr) are coded either separately or in interleaved order
  • [Figure: baseline JPEG encoder. Input image → 8x8 blocks → level offset ([0,255] => [-128,127]) → 8x8 DCT → uniform scalar quantization (quantization tables). DC quantization indices → differential coding → VLC (DC Huffman tables, SSSS-value) → bits. AC quantization indices → zig-zag scan → run-level coding → VLC (AC Huffman tables, RRRRSSSS-value) → bits.]
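
The baseline encoder path (level offset, 8x8 DCT, uniform quantization, zig-zag scan, run-level coding) can be sketched for a single block as follows. This is a minimal illustration, not the normative procedure: the function names are hypothetical, one quantizer step (`step`) stands in for the 8x8 quantization table, and the DC differential coding and Huffman stages are left out.

```python
import math

N = 8

def dct_8x8(block):
    """Naive 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

def zigzag_order(n=N):
    """Positions sorted by anti-diagonal, alternating direction per diagonal."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def encode_block(pixels, step=16):
    """Return (dc_index, [(run, level), ...]) for one 8x8 block.

    Assumption: a single uniform step replaces the quantization table.
    """
    shifted = [[p - 128 for p in row] for row in pixels]   # level offset
    coeffs = dct_8x8(shifted)
    q = [[round(cf / step) for cf in row] for row in coeffs]
    scan = [q[r][c] for r, c in zigzag_order()]
    dc, ac = scan[0], scan[1:]
    pairs, run = [], 0
    for level in ac:                 # run-level coding of the AC indices
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return dc, pairs                 # trailing zeros are implied (end of block)
```

A flat block produces only a DC index and no run-level pairs, which is why smooth image areas are cheap to code.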

  • Lossless JPEG
    • DPCM is used, with prediction from three neighboring pixels
  • Optional modes
    • Progressive encoding
      • Stores image data in the order DC only, low-frequency AC, high-frequency AC
    • Hierarchical encoding
      • Stores image data from low resolution to high resolution
  • Motion-JPEG
    • Just a sequence of JPEG still images
    • Low complexity, error tolerance, market awareness
    • Used for video conferencing and surveillance before cheap MPEG-1/2/4 solutions became widely available
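
The DPCM prediction from three neighbors used by lossless JPEG can be sketched as below. This is an illustrative sketch only: it uses the predictor a + b - c (left + above - above-left), which is one of the predictors lossless JPEG offers, and the border handling and function names are simplifying assumptions.

```python
def _predict(img, y, x, precision):
    """Predict sample (y, x) from already-known neighbors."""
    if y == 0 and x == 0:
        return 1 << (precision - 1)        # mid-grey for the first sample
    if y == 0:
        return img[y][x - 1]               # first row: left neighbor
    if x == 0:
        return img[y - 1][x]               # first column: neighbor above
    a, b, c = img[y][x - 1], img[y - 1][x], img[y - 1][x - 1]
    return a + b - c                       # planar predictor

def dpcm_residuals(img, precision=8):
    """Prediction residuals that an entropy coder would then compress."""
    return [[img[y][x] - _predict(img, y, x, precision)
             for x in range(len(img[0]))] for y in range(len(img))]

def dpcm_reconstruct(res, precision=8):
    """Invert dpcm_residuals exactly (lossless)."""
    h, w = len(res), len(res[0])
    img = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            img[y][x] = res[y][x] + _predict(img, y, x, precision)
    return img
```

On a smooth gradient the interior residuals are all zero, and reconstruction returns the original samples exactly.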
JPEG-2000
  • Features
    • Better compression performance than JPEG
      • No blocking artifacts at high compression ratios
    • Good compression for both continuous-tone and bi-level (text) images
    • Both lossless and lossy compression in one framework
    • ROI (region of interest) support
    • Error resilience support (data partitioning)
    • Rather slow on current embedded systems due to its complexity
  • Encoding process

    • [Figure: image → (tiling) → wavelet transform → quantizer → arithmetic encoder → bits]

Comparison between JPEG and JPEG-2000
  • [Figure: Lenna, 256x256 RGB, coded at the same size by both codecs]
    • Baseline JPEG: 4572 bytes
    • JPEG-2000: 4572 bytes
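
To put the 4572-byte figure in perspective, the compression ratio and bits per pixel can be worked out from the numbers on this slide. The helper name below is hypothetical, and 8 bits per color sample is assumed.

```python
def compression_stats(width, height, channels, coded_bytes, bits_per_sample=8):
    """Return (raw size in bytes, compression ratio, coded bits per pixel)."""
    raw_bytes = width * height * channels * bits_per_sample // 8
    ratio = raw_bytes / coded_bytes
    bits_per_pixel = coded_bytes * 8 / (width * height)
    return raw_bytes, ratio, bits_per_pixel

raw, ratio, bpp = compression_stats(256, 256, 3, 4572)
print(raw, round(ratio, 1), round(bpp, 3))   # 196608 43.0 0.558
```

Both images are therefore compressed about 43:1, a regime where the block-based DCT of baseline JPEG tends to show blocking.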

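MPEG-1/2
  • MC-DCT Hybrid Coding
  • [Figure: MC-DCT hybrid encoder. The prediction (0 in intra mode, otherwise the output of the motion-compensated predictor) is subtracted from the input; the residual goes through the intra-frame DCT coder (DCT, quantization) and the entropy coder. An embedded intra-frame decoder (dequantization, inverse DCT) rebuilds the reference used by the motion-compensated predictor. The motion estimator supplies motion data, and the coder control makes the intra/inter decision and outputs control data.]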
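
The key point of the hybrid loop is that the encoder predicts from its own reconstructed picture, exactly as the decoder will, so quantization errors do not accumulate. A toy sketch under strong assumptions: frames are short 1-D lists, prediction uses zero motion (no motion search), the DCT is omitted, and all names are hypothetical.

```python
def encode_sequence(frames, step=8):
    """Return quantized residual levels for each frame."""
    recon_prev, coded = None, []
    for frame in frames:
        # Intra for the first frame (predict from 0), inter afterwards.
        pred = [0] * len(frame) if recon_prev is None else recon_prev
        levels = [round((s - p) / step) for s, p in zip(frame, pred)]
        # Embedded decoder: rebuild what the receiver will see.
        recon_prev = [p + l * step for p, l in zip(pred, levels)]
        coded.append(levels)
    return coded

def decode_sequence(coded, step=8):
    """Mirror of the encoder's reconstruction loop."""
    recon_prev, out = None, []
    for levels in coded:
        pred = [0] * len(levels) if recon_prev is None else recon_prev
        recon_prev = [p + l * step for p, l in zip(pred, levels)]
        out.append(recon_prev)
    return out
```

Because both sides track the same reconstruction, the error in every decoded frame stays within half a quantizer step.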
MPEG-1
  • MPEG-1
    • Targeted VHS quality (SIF: 352x288 at 25 fps or 352x240 at 30 fps, YCbCr 4:2:0) on VCD (600 MB)
    • 1.4 Mbps (1.2 Mbps video + 0.2 Mbps audio) VCD, 70 minutes
    • Three parts: Part 1 System, Part 2 Video, Part 3 Audio
  • Technology
    • MC-DCT Hybrid
      • Macro-block (16x16 pixels): Motion estimation unit
      • Block (8x8 pixels): DCT and Quant unit
    • GOP structure
      • I, P, B picture
      • Trade-off between random access and coding efficiency
    • Asymmetric complexity
      • Larger memory and high computation required at Encoder
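
A quick calculation shows how much compression the 1.2 Mbps video budget implies. The sketch assumes 8-bit samples and SIF video of 352x288 at 25 fps (the 352x240, 30 fps variant has the same pixel rate); the function name is hypothetical.

```python
def raw_bitrate_bps(width, height, fps, bits_per_sample=8):
    """Uncompressed bit rate of 4:2:0 video.

    In 4:2:0 each chroma plane has a quarter of the luma samples,
    so there are 1.5 samples per pixel.
    """
    samples_per_frame = width * height * 3 // 2
    return samples_per_frame * bits_per_sample * fps

raw = raw_bitrate_bps(352, 288, 25)
print(raw, round(raw / 1.2e6, 1))   # 30412800 25.3
```

The video coder therefore has to reach a ratio of roughly 25:1, which is what motivates inter-frame prediction on top of the DCT.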
MPEG-1 Structure
  • Syntax Hierarchy
    • Sequence layer
    • GOP layer
    • Picture Layer
    • Slice Layer
    • MB Layer
    • Block Layer
  • Picture Coding
    • I picture: no interframe prediction
    • P picture: interframe prediction from one causal (previous) reference picture
    • B picture: interframe prediction from one previous and one future picture
  • GOP and picture order
    • Display order (input at encoder)
    • Transmission order (encoding/decoding order)
    • [Figure: the same group of pictures (I1, I2, P1, P2 and B pictures) shown once in display order and once in transmission order]
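
Since a B picture needs its future reference before it can be decoded, the transmission order moves each I or P picture ahead of the B pictures that precede it in display order. A minimal sketch follows; the picture labels are hypothetical, numbered by display position rather than taken from the figure.

```python
def display_to_coding_order(display):
    """Reorder pictures so every B follows both of its reference pictures."""
    coding, pending_b = [], []
    for pic in display:
        if pic.startswith("B"):
            pending_b.append(pic)        # wait for the next reference picture
        else:
            coding.append(pic)           # I or P goes out first
            coding.extend(pending_b)     # then the B pictures it closes
            pending_b = []
    coding.extend(pending_b)             # trailing Bs would wait for the next GOP
    return coding

print(display_to_coding_order(["I1", "B2", "B3", "P4", "B5", "B6", "P7"]))
# ['I1', 'P4', 'B2', 'B3', 'P7', 'B5', 'B6']
```

The reordering is the source of the extra encoder and decoder delay that B pictures introduce.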

MPEG-2
  • Major target application
    • Digital television quality (720x576/480, 25/30 fps) at 3 ~ 4Mbps
  • Interlaced video support
    • Frame picture vs field picture : motion compensation unit
    • Frame DCT vs field DCT in frame picture

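[Figure: an interlaced frame coded either as two field pictures or as one frame picture; within a frame picture, a macroblock uses either frame DCT or field DCT]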
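
The difference between frame DCT and field DCT lies only in how the 16 luma lines of a macroblock are grouped into 8x8 blocks before the transform. A sketch under the assumption that the macroblock is a list of 16 rows of 16 samples (function names are illustrative):

```python
def split_fields(rows):
    """Top field = even lines, bottom field = odd lines."""
    return rows[0::2], rows[1::2]

def luma_blocks(mb, field_dct):
    """Return the four 8x8 luma blocks that are fed to the DCT."""
    if field_dct:
        top, bottom = split_fields(mb)
        rows = top + bottom       # upper blocks: top field, lower: bottom field
    else:
        rows = mb                 # frame DCT keeps the lines interleaved
    upper, lower = rows[:8], rows[8:]
    return [[r[:8] for r in upper], [r[8:] for r in upper],
            [r[:8] for r in lower], [r[8:] for r in lower]]
```

With motion between the two fields, field DCT keeps each block vertically smooth, so fewer high-frequency coefficients are produced.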

  • Scalability Support
    • Spatial scalability
      • Low resolution at the base layer (BL) and high resolution at the enhancement layer (EL)
      • The BL is used for prediction of the EL
      • E.g. SD resolution at BL, HD resolution at EL
    • Temporal scalability
      • 30 fps at BL, 60 fps at EL
    • SNR scalability
      • Same resolution but different quality
    • Data partitioning
      • Coded data is split into separate streams
    • [Figure: layered codec. Input video → down-sampling → BL encoder → BL bit stream → BL decoder (lower quality); input video → EL encoder → EL bit stream (higher quality)]
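
SNR scalability can be illustrated with two quantizers: the base layer codes coefficients coarsely, and the enhancement layer codes what the base layer missed with a finer step. This is a toy sketch on a list of transform coefficients; the step sizes and function names are assumptions.

```python
def encode_snr_layers(coeffs, base_step=16, enh_step=4):
    """Return (base-layer levels, enhancement-layer levels)."""
    base = [round(c / base_step) for c in coeffs]
    base_rec = [b * base_step for b in base]
    # The enhancement layer refines the base-layer reconstruction error.
    enh = [round((c - r) / enh_step) for c, r in zip(coeffs, base_rec)]
    return base, enh

def decode_snr_layers(base, enh=None, base_step=16, enh_step=4):
    """Decode the base layer alone, or base plus enhancement."""
    rec = [b * base_step for b in base]
    if enh is not None:
        rec = [r + e * enh_step for r, e in zip(rec, enh)]
    return rec

base, enh = encode_snr_layers([100, -37, 5])
print(decode_snr_layers(base))        # [96, -32, 0]
print(decode_snr_layers(base, enh))   # [100, -36, 4]
```

A receiver that gets only the BL bit stream still decodes a usable, lower-quality picture.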

  • Profile & Level
    • MPEG-2 has many options; not every implementation needs all of them
    • Profiles
      • Simple : 4:2:0 input, I and P pictures only, low complexity and low performance
      • Main : 4:2:0 input, I, P, B pictures, interlaced
      • 4:2:2 : 4:2:2 input (chroma keeps the full vertical resolution)
      • SNR : SNR scalable
      • Spatial : spatially scalable
      • High : spatial scalability and 4:2:2
    • Levels
      • Low (352x288), Main (720x576), High 1440 (1440x1152), High (1920x1152)
    • E.g.
      • MPEG-1 : Main profile & Low level
      • SD DTV, DVD : Main profile & Main level
      • HDTV : Main profile & High level (historically MPEG-3's target application)
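
Choosing a level amounts to a table lookup on the picture size. The sketch below uses only the maximum resolutions listed on this slide; real level definitions also bound frame rate and bit rate, which this hypothetical helper ignores.

```python
# (level name, max width, max height), ordered from lowest to highest
MPEG2_LEVELS = [
    ("Low", 352, 288),
    ("Main", 720, 576),
    ("High 1440", 1440, 1152),
    ("High", 1920, 1152),
]

def lowest_level_for(width, height):
    """Return the lowest level whose picture size covers the video, or None."""
    for name, max_w, max_h in MPEG2_LEVELS:
        if width <= max_w and height <= max_h:
            return name
    return None

print(lowest_level_for(720, 480))     # Main
print(lowest_level_for(1920, 1080))   # High
```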
MPEG-4
  • Features
    • Support for low bit rates (from 20 kbps)
    • Support for object-based coding
      • Reuse of components, composition, and interactivity support
    • In practice, object-based coding is rarely used
  • Object-based Coding
    • Video Object
    • Shape coding : transparent/opaque region, binary or grey scale
    • Texture coding with arbitrary shape
      • DCT after zero filling in inter blocks and extrapolation in intra blocks
    • [Figure: a scene composed of video objects VO1, VO2 and VO3]

H.261
  • ITU standards mostly focus on real-time communication
  • H.261
    • First video coding standard (1990)
    • N-ISDN (1990's)
      • p×64 kbps (p = 1, ..., 30), typically 64 ~ 384 kbps
      • Circuit network based: low delay, reliable
  • H.261 key features
    • YCbCr 4:2:0 CIF, QCIF input
    • MC-DCT
    • Integer-pel motion
    • Optional loop filter (for deblocking)
      • Filtering at 8x8 block boundary
    • FEC used
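
The p×64 kbps channel rates and the macroblock counts of the two input formats are simple arithmetic. A small sketch with hypothetical helper names; the CIF (352x288) and QCIF (176x144) sizes are the standard ones for these formats.

```python
PICTURE_FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}

def h261_bitrate_bps(p):
    """Channel rate for p ISDN B-channels of 64 kbps each."""
    if not 1 <= p <= 30:
        raise ValueError("p must be in 1..30")
    return p * 64_000

def macroblocks_per_picture(fmt):
    """Number of 16x16 macroblocks in one picture."""
    width, height = PICTURE_FORMATS[fmt]
    return (width // 16) * (height // 16)

print(h261_bitrate_bps(6))              # 384000
print(macroblocks_per_picture("CIF"))   # 396
print(macroblocks_per_picture("QCIF"))  # 99
```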
H.261 syntax structure
  • H.261 Bit structure
H.263
  • H.263 Versions
    • Version 1 (1995): improvement to H.261, 4 optional modes
    • Version 2 (1998, H.263+): 12 optional modes
    • Version 3 (2000, H.263++): 19 optional modes
  • Key Features
    • Targets bit rates down to 20 kbps, also for packet-based networks
    • Half-pel prediction
    • Redesigned 3-D VLC code
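
Half-pel prediction reads the reference picture at positions between integer samples, using bilinear interpolation. The sketch below assumes coordinates given in half-pel units, rounding upward, and positions that stay inside the picture; the function name is illustrative, and the rounding-control option of later H.263 versions is not modeled.

```python
def sample_half_pel(ref, y2, x2):
    """Reference sample at (y2/2, x2/2), with y2 and x2 in half-pel units."""
    y, x = y2 // 2, x2 // 2
    fy, fx = y2 % 2, x2 % 2          # 1 when the position lies between samples
    a = ref[y][x]
    b = ref[y][x + fx]
    c = ref[y + fy][x]
    d = ref[y + fy][x + fx]
    # Reduces to a, (a+b+1)//2, (a+c+1)//2 or (a+b+c+d+2)//4 as appropriate.
    return (a + b + c + d + 2) // 4

ref = [[10, 20],
       [30, 40]]
print(sample_half_pel(ref, 0, 1))   # 15, halfway between 10 and 20
print(sample_half_pel(ref, 1, 1))   # 25, center of the four samples
```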
H.263 Optional Modes
    • Annex D: Unrestricted motion vectors
    • Annex E: Syntax-based arithmetic coding
    • Annex F: Advanced Prediction
    • Annex G: PB Frames
    • Annex I : Advanced Intra Coding
    • Annex J: Deblocking Filter
    • Annex K: Slice Structured Mode
    • Annex L: Supplemental enhancement information
    • Annex M: Improved PB frames
    • Annex N: Reference Picture Selection
    • Annex O: Scalability
    • Annex P: reference picture resampling
    • Annex Q: Reduced resolution update
    • Annex R: Independent Segment Decoding
    • Annex S: Alternative inter VLC
    • Annex T: Modified Quantization
    • Annex U: Enhanced reference picture selection
    • Annex V: Data partition slice
    • Annex W: Additional supplemental enhancement information
H.264
  • Name
    • ITU-T H.264 = ISO/IEC MPEG-4 Part 10 (AVC)
    • H.26L : the long-term enhancement project; not backward compatible with H.263
    • Now adopted in DMB-T/S and IPTV, replacing many MPEG-2 solutions
    • Targets about 50% bit-rate reduction relative to H.263+
  • Key features
    • Smaller processing units (down to 4x4 pixel blocks)
    • Intra prediction
    • Inter prediction
      • Macroblock-based interframe prediction selection
      • ¼-pixel motion vector support
      • Motion vector options for sub-blocks
    • 4x4 integer DCT
    • Deblocking filter
    • Universal VLC
    • CABAC (context-based adaptive binary arithmetic coding)
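
The universal VLC in H.264 is built on Exp-Golomb codes: one regular code construction serves many syntax elements instead of a separate table for each. A minimal sketch, with bit strings used for readability and function names following the ue/se naming of the syntax descriptors:

```python
def ue(code_num):
    """Unsigned Exp-Golomb code: M zeros, then the (M+1)-bit value code_num + 1."""
    info = bin(code_num + 1)[2:]
    return "0" * (len(info) - 1) + info

def se(value):
    """Signed mapping: positive k -> 2k - 1, zero or negative k -> -2k."""
    return ue(2 * value - 1 if value > 0 else -2 * value)

def decode_ue(bits):
    """Decode one unsigned code word from the start of a bit string."""
    zeros = len(bits) - len(bits.lstrip("0"))
    return int(bits[zeros:2 * zeros + 1], 2) - 1

print([ue(n) for n in range(4)])   # ['1', '010', '011', '00100']
```

Small values, which are the most frequent, get the shortest code words.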
Intra-frame Prediction
  • Luma
    • 4x4: 9 modes
    • 16x16: 4 modes
  • Chroma
    • 8x8: 4 modes
    • The same prediction mode is always applied to both chroma blocks
  • [Figure: 4x4 luma block with its neighboring samples A-H (above), I-L (left) and M (above-left), illustrated for several prediction modes including mean (DC) prediction; 16x16 luma block with neighboring samples H and V, including the Mean(H, V) mode]
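
Three of the nine 4x4 luma modes (vertical, horizontal and DC) are simple enough to sketch directly. Here `above` holds the samples A-D, `left` holds I-L, and mode selection by sum of absolute differences (SAD) is an illustrative encoder choice, not something the standard prescribes.

```python
def predict_4x4(mode, above, left):
    """Build the 4x4 prediction block for one of three modes."""
    if mode == "vertical":                     # copy A-D down each column
        return [list(above) for _ in range(4)]
    if mode == "horizontal":                   # copy I-L across each row
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":                           # rounded mean of A-D and I-L
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(mode)

def sad(block, pred):
    return sum(abs(b - p) for brow, prow in zip(block, pred)
               for b, p in zip(brow, prow))

def best_mode(block, above, left):
    """Pick the mode with the smallest SAD (a common encoder heuristic)."""
    return min(("vertical", "horizontal", "dc"),
               key=lambda m: sad(block, predict_4x4(m, above, left)))
```

Only the chosen mode and the prediction residual are transmitted; the decoder rebuilds the same prediction from already decoded neighbors.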

Transform and Quantization
  • Integer DCT
    • No encoder/decoder mismatch
  • Three types of transform, each followed by quantization
    • Type 1: for the 4x4 array of luma DC coefficients in intra MBs predicted in 16x16 mode (block -1)
    • Type 2: for the 2x2 array of chroma DC coefficients (blocks 16-17)
    • Type 3: for all other 4x4 blocks (blocks 0-15, 18-25)

[Figure: block numbering within a macroblock. Block -1: luma DC (16x16 intra mode only); blocks 0-15: 4x4 luma blocks; blocks 16-17: chroma DC; blocks 18-25: 4x4 chroma blocks. Data is transmitted in the numbered order.]

Transform and Quantization
  • 4×4 DCT (X: input, Y: output)
  • 4×4 integer transform
    • Forward
    • Backward
  • W: output of the integer core transform, before scaling
  • Post-scaling factor (PF)
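
The forward path can be sketched as an integer core transform W = Cf·X·Cfᵀ followed by an element-wise post-scaling by PF. The matrix below is the usual H.264 4×4 core matrix and the PF values (a², ab/2, b²/4 with a = 1/2, b = √(2/5)) follow the common textbook derivation; in a real codec PF is folded into the quantizer, so the separate `post_scale` step here is for illustration only.

```python
import math

CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def core_transform(x):
    """W = Cf * X * Cf^T, computed with integer arithmetic only."""
    return matmul(matmul(CF, x), transpose(CF))

def post_scale(w):
    """Y = W (x) PF, element by element (illustrative floating-point form)."""
    a, b = 0.5, math.sqrt(2 / 5)
    def pf(i, j):
        if i % 2 == 0 and j % 2 == 0:
            return a * a
        if i % 2 == 1 and j % 2 == 1:
            return b * b / 4
        return a * b / 2
    return [[w[i][j] * pf(i, j) for j in range(4)] for i in range(4)]

print(core_transform([[1] * 4 for _ in range(4)])[0])   # [16, 0, 0, 0]
```

Because the core transform is exact in integers, encoder and decoder cannot drift apart through rounding differences.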

Deblocking Filters
  • A boundary-strength (BS) parameter is assigned to every 4×4 block boundary
    • BS = 0: no filtering
    • BS = 1-3: slight filtering
    • BS = 4: strong filtering
  • Filtering is applied only when all of the following hold
    • |P0-Q0| < α
    • |P1-P0| < β
    • |Q1-Q0| < β
  • Thresholds α and β depend on the average quantization parameter (QP)
  • The deblocking filter accounts for about 1/3 of the computational complexity of a decoder.
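
The filtering decision for one line of samples across a block edge can be written as a single predicate. In the sketch below, p1, p0 are the two samples on one side of the edge and q0, q1 those on the other; `alpha` and `beta` are passed in directly with made-up values, whereas the standard derives them from QP via tables that are not reproduced here.

```python
def should_filter(bs, p1, p0, q0, q1, alpha, beta):
    """True when the edge looks like a coding artifact rather than a real edge."""
    return (bs > 0
            and abs(p0 - q0) < alpha    # step across the edge is small
            and abs(p1 - p0) < beta     # both sides are locally smooth
            and abs(q1 - q0) < beta)

print(should_filter(2, 60, 62, 66, 67, alpha=10, beta=4))   # True: smooth it
print(should_filter(2, 40, 40, 120, 120, alpha=10, beta=4)) # False: real edge
```

A large step across the boundary is taken to be genuine image content and is left untouched, which keeps real edges sharp.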
Network Adaptation
  • VCL & NAL
    • VCL (Video Coding Layer)
    • NAL (Network Abstraction Layer)
  • Error Resilience Tools
    • Flexible macroblock ordering (FMO)
      • Allows MBs to be assigned to slice groups in an order other than scan order
    • Arbitrary slice ordering (ASO)
      • Improves end-to-end delay in real-time applications
    • Redundant slices (RS)
      • Redundant representations are coded using different coding parameters
  • [Figure: a picture whose macroblocks are divided between Slice Group #0 and Slice Group #1]
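
One way to picture FMO is a checkerboard map that alternates macroblocks between two slice groups. The sketch below is a simplified illustration with hypothetical function names; it does not reproduce the standard's slice-group map types.

```python
def checkerboard_slice_groups(mb_cols, mb_rows):
    """Slice-group index (0 or 1) for every macroblock position."""
    return [[(x + y) % 2 for x in range(mb_cols)] for y in range(mb_rows)]

def macroblocks_in_group(group_map, group):
    """(x, y) positions of the macroblocks carried by one slice group."""
    return [(x, y) for y, row in enumerate(group_map)
            for x, g in enumerate(row) if g == group]

group_map = checkerboard_slice_groups(4, 2)
print(group_map)                            # [[0, 1, 0, 1], [1, 0, 1, 0]]
print(macroblocks_in_group(group_map, 0))   # [(0, 0), (2, 0), (1, 1), (3, 1)]
```

If the packets of one slice group are lost, every missing macroblock still has received neighbors on its sides, which helps error concealment.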

Profile & Level
  • Main application
    • Baseline : Video telephony
    • Main : DTV and Storage
    • Extended : Streaming
  • Profile & tools
Conclusion
  • Many video coding standards
    • Standards reflect both coding technology and implementation technology
    • Coding performance has improved more than fourfold since H.261 (1990)
  • What's next
    • SVC (Scalable Video Coding) in H.264 (done)
    • H.264ext (further improvement of H.264)
    • 3-D and MVC (Multi-View Coding) are ongoing
    • UDTV (ultra-high-definition TV: 3840x2160)
    • And what's next?