**Operational Rate-Distortion information theory in optimization of advanced digital video codec**
Dragorad A. Milovanović (DragoAM@Gmail.com), Zoran S. Bojković (z.bojkovic@yahoo.com)
University of Belgrade

**CONTENTS**
1. Rate-Distortion theory
1.1 Source coding and R-D function
1.2 Operational R-D framework
1.3 Formulation of efficient video coding
2. Operational control of standard-based encoder
2.1 Operational MPEG framework
2.2 Performance/efficiency of digital video codec
2.3 Bitrate control and joint optimization

**1. Rate-Distortion theory**
• Information transmission system (message, symbol encoding, entropy)
• Source coding: perceptual signals and distortion criterion $D \le D_{\max}$
• Average distortion: the expected value of the distortion measure between source symbols and their reconstructions.
• Rate-distortion theory calculates the minimum transmission bitrate R for a required video quality D.
• Mutual information is the information that source symbols and destination symbols convey about each other.
• Average mutual information: the mutual information averaged over the joint distribution of the two sets of symbols.
• Channel coding: the channel capacity C is the maximum of the mutual information I between source and destination.
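The quantities named on this slide can be written out in their standard textbook form. The notation is an assumption, since the slide's own symbols are not preserved: $S$ and $S'$ stand for the source and the reconstructed (destination) symbols, $d(\cdot,\cdot)$ for a per-symbol distortion measure, and $p(\cdot)$ for the corresponding probability mass functions.

```latex
\begin{align}
D &= E\{d(S,S')\} = \sum_{s}\sum_{s'} p(s,s')\, d(s,s') \\
I(S;S') &= \sum_{s}\sum_{s'} p(s,s') \log_2 \frac{p(s,s')}{p(s)\,p(s')} \\
C &= \max_{p(s)} I(S;S')
\end{align}
```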

**1.1 Source coding and R-D function**
• For a given maximum average distortion $D_{\max}$, the rate-distortion function $R(D)$ is a lower bound on the transmission bitrate.
• The Shannon lower bound $R_L(D)$ assumes statistical independence between the distortion and the reconstruction.
• $R(D)$ is a non-increasing and convex function of $D$.
• For a continuous source $S$, $R(D)$ approaches infinity as $D$ approaches zero.
• For a discrete source $S$, the minimum rate required for lossless transmission equals the entropy rate, $R(0) = H(S)$ (lossless coding).
• Stochastic model of a Gauss-Markov source (correlation $0 < \rho < 0.9$): $D_L(R) = (1 - \rho^2)\,\sigma^2\,2^{-2R}$
• Stochastic model of a source with Laplacian pdf (variance $\sigma^2 = 1$): $D_L(R) = \frac{e}{\pi}\,\sigma^2\,2^{-2R}$
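As a rough numerical illustration of the two lower-bound formulas on this slide, the sketch below evaluates $D_L(R)$ for a few rates and converts it to SNR in dB. The function names and the choice $\rho = 0.5$ are assumptions made for the example only.

```python
import math

def d_lower_gauss_markov(rate, rho, var=1.0):
    """D_L(R) = (1 - rho^2) * sigma^2 * 2^(-2R) for a Gauss-Markov source."""
    return (1.0 - rho ** 2) * var * 2.0 ** (-2.0 * rate)

def d_lower_laplacian(rate, var=1.0):
    """D_L(R) = (e / pi) * sigma^2 * 2^(-2R) for a Laplacian pdf."""
    return (math.e / math.pi) * var * 2.0 ** (-2.0 * rate)

def snr_db(distortion, var=1.0):
    """Signal-to-noise ratio in dB for signal variance var."""
    return 10.0 * math.log10(var / distortion)

# Tabulate both bounds for rates of 1 to 4 bits per sample.
for rate in (1, 2, 3, 4):
    gm = snr_db(d_lower_gauss_markov(rate, rho=0.5))
    lap = snr_db(d_lower_laplacian(rate))
    print(rate, round(gm, 2), round(lap, 2))
```

Both formulas share the factor $2^{-2R}$, so each additional bit per sample lowers the bound by about 6.02 dB regardless of the source model.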

**1.2 Operational (R,D) framework**
• In a practical coding framework, the structure of the coder is fixed and a finite set of encoding modes is defined. In addition, it is usually difficult or simply impossible to find closed-form expressions for the R(D) and D(R) functions of general sources.
• Each choice of encoding parameters then leads to a pair of rate and distortion values, an operational point in the R-D plane. The lower bound of all these rate-distortion pairs is referred to as the operational rate-distortion (ORD) function.
• Block diagram for a typical lossy source coding system:
  • block code $Q_N = \{\alpha_N, \beta_N, \gamma_N\}$ (blocks of N consecutive input samples are coded independently)
  • bitrate R (average number of bits per source symbol)
  • additive distortion measure D (MSE between source and reconstructed symbols)

**Operational R-D function**
• For a given source $S$ and code $Q$, the operational point $(R, D)$ is defined by $R = r(Q)$ and $D = \delta(Q)$.
• A rate-distortion point $(R, D)$ is achievable if there is a code $Q$ with $r(Q) \le R$ and $\delta(Q) \le D$, so the operational R-D plane can be partitioned into a region of achievable points and its complement. The function $R(D)$ that describes this fundamental bound for a given source $S$ is the operational function ORD.
• The ORD boundary of the region of achievable rate-distortion points specifies:
  • the minimum rate R required to represent the source S with a distortion less than or equal to a given value D or, alternatively,
  • the minimum distortion D that can be achieved if the source S is coded at a rate less than or equal to a given value R.
Figure: region of achievable rate-distortion points (R, D), bounded by the operational R(D) function; axis marks Rmin, R=Max, Dmin, D=Max.
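The definition above can be sketched in a few lines: given the (R, D) pairs produced by a finite set of codes, keep only the points that no other code beats in both rate and distortion, then read off the minimum distortion for a rate budget. The sample points and function names are hypothetical.

```python
def operational_rd(points):
    """Return the non-dominated (rate, distortion) points, sorted by rate."""
    frontier = []
    best_dist = float("inf")
    for rate, dist in sorted(points):
        # A point is kept only if it improves on every cheaper code.
        if dist < best_dist:
            frontier.append((rate, dist))
            best_dist = dist
    return frontier

def min_distortion_at(frontier, rate_budget):
    """Minimum distortion achievable with rate <= rate_budget, or None."""
    feasible = [dist for rate, dist in frontier if rate <= rate_budget]
    return min(feasible) if feasible else None

# Hypothetical operational points (rate in bits/sample, distortion as MSE).
codes = [(1.0, 9.0), (2.0, 5.0), (2.5, 6.0), (3.0, 2.0), (4.0, 2.5), (5.0, 1.0)]
frontier = operational_rd(codes)
print(frontier)
print(min_distortion_at(frontier, 3.5))
```

The two dominated codes, (2.5, 6.0) and (4.0, 2.5), fall inside the achievable region but not on its boundary.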

**Quantization**
• Uniform scalar quantizer ($\Delta = \text{const}$, $D \approx \Delta^2/12$, optimal $\gamma$)
• Non-uniform optimal quantizer (Lloyd–Max, centroids of the pdf)
• Asymptotic performance: $D_L(R) = \sigma^2\,\varepsilon_S^2\,2^{-2R}$ (Shannon lower bound)
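The approximation $D \approx \Delta^2/12$ can be checked numerically. The sketch below assumes an input that is evenly spread over the quantizer range, which is the case where the approximation is essentially exact. For other densities it holds only when the pdf is nearly flat within each quantization cell. All names and parameter values are illustrative.

```python
def uniform_quantize(x, delta):
    """Mid-tread uniform quantizer with step size delta."""
    return delta * round(x / delta)

def measured_mse(delta, n=100_000, lo=-4.0, hi=4.0):
    """MSE of the quantizer for n evenly spaced samples in [lo, hi)."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step          # midpoint of the i-th sub-interval
        total += (x - uniform_quantize(x, delta)) ** 2
    return total / n

delta = 0.5
print(measured_mse(delta), delta ** 2 / 12)
```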

**Entropy coding (γ)**
• Variable length code (VLC):
  • The Huffman code minimizes the average code length
  • $L_s = \sum_i p(s_i)\,\mathrm{length}(s_i)$ [bits/symbol]
  • For the optimal code with $p^*(s_i)$, the average length is bounded below by the first-order entropy
  • $H_s = -\sum_i p^*(s_i)\,\log_2 p^*(s_i)$ [bits/symbol]
  • $K = 2$: $p(s_1) = P_1$, $p(s_2) = 1 - P_1$
  • $H_s = -P_1 \log_2 P_1 - (1 - P_1) \log_2 (1 - P_1)$ bits/symbol
  • $P_1 = 0.5$: $\max H_s = 1$, redundancy $= \log_2 K - H_s = 0$
• Arithmetic encoder (CABAC):
  • adaptive estimation of the statistical distribution $p(s_i)$
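To make the relation between average code length and first-order entropy concrete, here is a minimal Huffman sketch that returns only the codeword lengths. The example distribution is hypothetical and chosen with dyadic probabilities, for which the Huffman code reaches the entropy exactly.

```python
import heapq
import math

def entropy(probs):
    """First-order entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Return the Huffman codeword length of each symbol."""
    if len(probs) == 1:
        return [1]
    # Heap items: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1              # merged symbols move one level deeper
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

probs = [0.5, 0.25, 0.125, 0.125]
lengths = huffman_lengths(probs)
avg_len = sum(p * n for p, n in zip(probs, lengths))
print(lengths, avg_len, entropy(probs))
```

For the binary case $K = 2$ with $P_1 = 0.5$, `entropy([0.5, 0.5])` evaluates to 1 bit/symbol, so the redundancy $\log_2 K - H_s$ is zero, as stated on the slide.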

**Predictive coding**
• Differential coder
• Predictive coder (DPCM)
• Linear prediction $\hat{S}_n$:
  • prediction coefficients $p_i$
  • prediction error $U_n$
  • reconstruction error $U'_n$
• Optimal linear prediction ($U_n$ orthogonal to $\hat{S}_n$)
• Prediction error variance: $\sigma'^2 = \varepsilon_\alpha^2\,\sigma^2 \ge \gamma_S^2\,\varepsilon_\alpha^2\,\sigma_S^2$, where $\gamma_S^2$ is the spectral flatness measure (sfm)
• Asymptotic performance: coding gain $CG = 1/\gamma_S^2$
• $N = 1$: $p_{1,opt} = \rho_1$, $CG = 1/(1 - \rho_1^2)$
• $N = 2$: $p_{1,opt} = \rho_1 (1 - \rho_2)/(1 - \rho_1^2)$, $p_{2,opt} = (\rho_2 - \rho_1^2)/(1 - \rho_1^2)$
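The second-order predictor can be checked against the normal (Yule-Walker) equations. The sketch assumes that $\rho_1$ and $\rho_2$ are the normalized autocorrelation coefficients at lags 1 and 2. The function names are illustrative.

```python
def predictor_order2(rho1, rho2):
    """Solve [[1, rho1], [rho1, 1]] @ [p1, p2] = [rho1, rho2] in closed form."""
    den = 1.0 - rho1 ** 2
    p1 = rho1 * (1.0 - rho2) / den
    p2 = (rho2 - rho1 ** 2) / den
    return p1, p2

def error_variance_ratio(rho1, rho2, p1, p2):
    """Prediction error variance divided by the signal variance."""
    return (1.0 + p1 ** 2 + p2 ** 2
            - 2.0 * p1 * rho1 - 2.0 * p2 * rho2
            + 2.0 * p1 * p2 * rho1)

# First-order Gauss-Markov source: rho2 = rho1^2, so the second tap vanishes.
rho = 0.9
p1, p2 = predictor_order2(rho, rho ** 2)
ratio = error_variance_ratio(rho, rho ** 2, p1, p2)
print(p1, p2, ratio, 1.0 / ratio)
```

For this source the result reduces to the $N = 1$ case: the error variance ratio is $1 - \rho_1^2$ and the coding gain is its reciprocal.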

**Transform coding**
• Linear transformation
  • A: transform matrix
  • B: inverse transform matrix
  • A orthogonal: $A^{-1} = A^T$, $A^T A = A A^T = I$
  • B orthonormal: $B = A^{-1} = A^T$ (sum of the N coefficient variances = variance of s)
• Optimal linear transformation: KLT (built from the eigenvectors of the autocovariance matrix $R_{SS}$; its eigenvalues are the coefficient variances)
• Asymptotic performance: coding gain $CG = 1/\gamma_S^2$
• Optimal bitrate allocation R between N quantizers:
  • N = 2
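For the $N = 2$ case, a small sketch can illustrate the transform coding gain and the bit allocation between two quantizers. It relies on the standard high-rate approximation, which is an assumption here: each coefficient receives the mean rate plus half the base-2 logarithm of its variance relative to the geometric mean. The helper names are hypothetical.

```python
import math

def klt_variances_2(var, rho):
    """Eigenvalues of the 2x2 autocovariance matrix var * [[1, rho], [rho, 1]]."""
    return var * (1.0 + rho), var * (1.0 - rho)

def transform_coding_gain(variances):
    """Arithmetic mean over geometric mean of the coefficient variances."""
    n = len(variances)
    arith = sum(variances) / n
    geo = math.exp(sum(math.log(v) for v in variances) / n)
    return arith / geo

def high_rate_allocation(variances, mean_rate):
    """R_i = R + 0.5 * log2(var_i / geometric mean); may go negative at low R."""
    n = len(variances)
    log_geo = sum(math.log2(v) for v in variances) / n
    return [mean_rate + 0.5 * (math.log2(v) - log_geo) for v in variances]

variances = klt_variances_2(1.0, 0.9)
rates = high_rate_allocation(variances, mean_rate=2.0)
print(variances, transform_coding_gain(variances), rates)
```

The allocated rates always average to the requested mean rate. A practical allocator would additionally clip negative rates to zero and redistribute the remainder.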

**1.3 Formulation of efficient video coding**
• A standard-based codec requires an optimization procedure over the set of allowed operating parameters, as well as additional criteria that arise from real-time operation (complexity, delay).
• The goal of operational information theory is to find a set of encoder operating parameters that is optimal in the R(D) sense. An efficient optimization procedure, based on fast algorithms rather than a full search of the parameter space, is also required.
• The practical trade-off between the allowed distortion D and the available bitrate R in encoder design is based on a discrete optimization procedure that finds locally optimal operational (R, D) points.

**Lagrange multiplier method**
• Formulation of the R-D problem:
  • cost function with constraint
  • necessary condition for the existence of a minimum
  • the solution
• Unconstrained Lagrangian cost function:
  • necessary condition for the existence of a minimum
  • the solution is a simultaneous iteration of R and λ
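The steps listed on this slide can be written out in their standard form. The notation is an assumption, since the slide's own equations are not preserved: $D(R)$ is a differentiable distortion-rate function, $R_{\max}$ is the rate budget, and $\lambda \ge 0$ is the Lagrange multiplier.

```latex
\begin{align}
&\min_{R}\; D(R) \quad \text{subject to} \quad R \le R_{\max} \\
&J(R,\lambda) = D(R) + \lambda R \\
&\frac{\partial J}{\partial R} = \frac{dD}{dR} + \lambda = 0
  \;\Longrightarrow\; \frac{dD}{dR} = -\lambda
\end{align}
```

In this form λ is the negative slope of $D(R)$ at the solution, and it is adjusted until the resulting rate meets $R_{\max}$.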

**Geometrical interpretation**
The operational R-D function is the convex border that connects a subset of locally optimal operational points (the connected operational points are sub-optimal solutions of the Lagrange method).
The optimal operational point (D, R), obtained as the solution of the Lagrange method min(D + λR) for constant λ, is the operational point at which a line whose slope is set by λ touches the convex border.
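A small sketch of this interpretation: for each constant λ the operational point with the smallest cost D + λR is selected, and sweeping λ visits only points on the convex border. The operational points and multiplier values are hypothetical.

```python
def lagrangian_pick(points, lam):
    """Return the (rate, distortion) point minimising D + lam * R."""
    return min(points, key=lambda p: p[1] + lam * p[0])

def hull_points(points, lambdas):
    """Operational points selected by at least one multiplier value."""
    return sorted({lagrangian_pick(points, lam) for lam in lambdas})

# (3.0, 4.5) is non-dominated but lies above the line joining its neighbours.
points = [(1.0, 9.0), (2.0, 5.0), (3.0, 4.5), (4.0, 2.0), (5.0, 1.0)]
selected = hull_points(points, lambdas=[5.0, 2.0, 1.2, 0.5])
print(selected)
```

The point (3.0, 4.5) is never selected for any λ, which illustrates that the Lagrange method reaches only operational points on the convex border.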

**Optimal bit allocation**
• Formulation: optimal bit allocation with constraint
• Unconstrained Lagrangian cost function
• Necessary condition for the existence of a minimum
• The solution is a simultaneous iteration of $R_i$ and λ
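One way to sketch the iteration over $R_i$ and λ for a discrete set of quantizers: for a trial λ each component independently picks its operating point with minimal $D_i + \lambda R_i$, and λ is bisected until the total rate fits the budget. The quantizer tables, the search interval and the function names are assumptions for illustration. The slide's own update rule is not preserved.

```python
def allocate(components, lam):
    """Pick, per component, the operating point minimising D_i + lam * R_i."""
    return [min(pts, key=lambda p: p[1] + lam * p[0]) for pts in components]

def allocate_for_budget(components, rate_budget, lo=0.0, hi=1e6, iters=60):
    """Bisect on lam; assumes the allocation at lam = hi fits the budget."""
    best = allocate(components, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        choice = allocate(components, mid)
        if sum(rate for rate, _ in choice) <= rate_budget:
            best, hi = choice, mid      # feasible: try a smaller multiplier
        else:
            lo = mid                    # too many bits: increase the multiplier
    return best

# Hypothetical (rate in bits, distortion) tables for two components.
components = [
    [(0, 16.0), (1, 4.0), (2, 1.0), (3, 0.25)],
    [(0, 4.0), (1, 1.0), (2, 0.25), (3, 0.0625)],
]
best = allocate_for_budget(components, rate_budget=5)
print(best, sum(r for r, _ in best), sum(d for _, d in best))
```

The component with the larger variance receives more bits, and at the solution both components operate at the same trade-off between rate and distortion.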

**Joint hierarchical optimization**
• Optimal image decomposition and bitrate allocation:
  • discrete version of the Lagrange multiplier method,
  • deterministic dynamic programming (forward/backward).
• The solution:
  • The image is decomposed into a pre-specified number of levels.
  • For the adopted value of the quality parameter λ = const, the operational point min(D + λR) is calculated at each level of decomposition for each partition and the specified set of quantizers.
  • At each level of decomposition, a split/merge decision is made (principle of optimality) by comparing the Lagrangian costs of successive levels of decomposition.
  • A binary search (Newton method) determines the optimal λ* for a given bitrate $R_{\max}$ and the initial search interval.
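The split/merge step for a fixed λ can be sketched as a bottom-up pass over a decomposition tree. The tree layout (dictionaries with hypothetical `codes` and `children` keys) and the numbers are assumptions, and the rate needed to signal the split decisions is ignored for brevity.

```python
def prune(node, lam):
    """Bottom-up split/merge decision; returns (cost, rate, dist, tree)."""
    # Best quantizer for coding this block as a single (merged) unit.
    rate, dist = min(node["codes"], key=lambda p: p[1] + lam * p[0])
    merged_cost = dist + lam * rate
    children = node.get("children")
    if not children:
        return merged_cost, rate, dist, "leaf"
    results = [prune(child, lam) for child in children]
    split_cost = sum(r[0] for r in results)
    if split_cost < merged_cost:
        # Splitting is cheaper in the Lagrangian sense: keep the sub-blocks.
        return (split_cost,
                sum(r[1] for r in results),
                sum(r[2] for r in results),
                [r[3] for r in results])
    return merged_cost, rate, dist, "leaf"

leaf = {"codes": [(1, 1.0), (2, 0.4)]}
root = {"codes": [(2, 10.0), (4, 6.0)], "children": [leaf, leaf, leaf, leaf]}
print(prune(root, 1.0))
print(prune(root, 5.0))
```

With the small multiplier the block is split into its four sub-blocks, while the large multiplier makes rate expensive and the block stays merged. An outer search over λ would then select the value that meets the rate budget.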

**2. Operational control of standard-based encoder**
• A digital video encoder exploits the statistical redundancy of the source as well as the perceptual irrelevancy for the user.
• Block-adaptive hybrid transform-entropy encoder with motion estimation and compensation
Figure: scope of standardization.

**2.1 Operational MPEG framework**
• ITU/MPEG process of standardization:
• Encoding techniques and operational parameters:

**Set of operational parameters**
• The task of encoder control is to determine the values of the standardized syntax elements, and thus the bitstream b, for a given input sequence such that the distortion between the input sequence and its reconstruction is minimized subject to a set of constraints on the average and maximum bit rate.
• Let $B_c$ be the set of all conforming bitstreams that obey the given set of constraints. For a distortion measure D, the optimal bitstream in the rate–distortion sense is the bitstream in $B_c$ that minimizes D.
• Due to the huge parameter space and encoding delay, the minimization cannot be applied directly. Instead, the overall minimization problem is split into a series of K smaller minimization problems (p is a subset of the operational parameters).
• The constrained minimization problem can be reformulated as an unconstrained minimization with a Lagrange multiplier; Q denotes the quantization step size, which is controlled by the quantization parameter QP.
Figure: R(QP) and D(QP).
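The three statements above can be put into formulas as follows. This is a sketch in assumed notation: $s$ is the input sequence, $s'(b)$ its reconstruction from bitstream $b$, and $D_k$, $R_k$ are the distortion and rate of the k-th sub-problem. The last line is a relation commonly used in reference encoders, with a constant $c$ that is not given in this document.

```latex
\begin{align}
b^{*} &= \arg\min_{b \in B_c} D\big(s, s'(b)\big) \\
p_k^{*} &= \arg\min_{p_k} \; D_k(p_k) + \lambda\, R_k(p_k), \qquad k = 1, \dots, K \\
\lambda &= c \cdot Q^{2}
\end{align}
```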

**2.2 Performance/efficiency of digital video codec**
HD720 test sequences 1 to 3, H.265 vs. H.263:

| Sequence | Codec | QP | BR (kbps) | PSNR (dB) |
|---|---|---|---|---|
| 1 | H.265 | 30 | 512 | 39.66 |
| 1 | H.263 | 20 | 512 | 34.00 |
| 2 | H.265 | 30 | 512 | 39.36 |
| 2 | H.263 | 31 | 512 | 30.94 |
| 3 | H.265 | 30 | 512 | 39.24 |
| 3 | H.263 | 25 | 512 | 32.78 |

**Coding gain BRCG, PSNR=const**
• Three test sequences (1/2/3) with typical video conferencing content were selected for the experiments (Vidyo, 1280x720, 60 fps, 10 s).
• Each test sequence was coded at 12 different bitrates. The ORD functions $PSNR_{YUV}(BR)$ are shown for bitrates BR = 0.256, 0.384, 0.512, 0.850, 1.500 Mbps.
• The combined $PSNR_{YUV}$ is calculated as the weighted sum of the per-picture PSNR of the individual components, $PSNR_{YUV} = (6 \cdot PSNR_Y + PSNR_U + PSNR_V)/8$, where the individual components are computed as $PSNR = 10 \log_{10}\big((2^B - 1)^2 / MSE\big)$, $B = 8$.
Figure (sequences 1 to 3): bitrate reduction of HEVC vs. AVC based on subjective MOS performance for typical video conferencing bitrates.
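The two PSNR formulas above translate directly into code. This is a minimal sketch with hypothetical function names and input values. It works on already computed per-component MSE and PSNR figures rather than on pictures.

```python
import math

def psnr(mse, bit_depth=8):
    """PSNR = 10 * log10((2^B - 1)^2 / MSE) in dB."""
    peak = (2 ** bit_depth - 1) ** 2
    return 10.0 * math.log10(peak / mse)

def psnr_yuv(psnr_y, psnr_u, psnr_v):
    """Combined PSNR with weights 6:1:1 for Y, U and V."""
    return (6.0 * psnr_y + psnr_u + psnr_v) / 8.0

# Hypothetical per-picture MSE values for the three components.
y, u, v = psnr(8.0), psnr(3.0), psnr(3.5)
print(round(y, 2), round(u, 2), round(v, 2), round(psnr_yuv(y, u, v), 2))
```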

**Coding gain PSNRCG, BR=const**
• Variability of $PSNR_Y$ per frame (over time) for BR = const (BR ≈ 0.512 Mbps: $QP_{HEVC} = 30$, $QP_{AVC} = 32$, $QP_{H.263} = 20/31/25$)
Figure: panels for sequences 1, 2 and 3.

**Complexity of encoder/decoder**
• The encoding and decoding times for the representative HD720 sequences (60 fps, 10 s) are shown. Times are recorded in tens of seconds, so as to illustrate the ratio to real-time operation:
  • the HEVC encoding time exceeds 1000 times real-time,
  • the decoding time exceeds 4 times real-time on an Ultrabook (x86-64 Core i5, 2/4@1.7GHz, 4GB RAM).
Figure: panels for sequences 1, 2 and 3.

**2.3 Bitrate control**
• The objective of rate control is to regulate the MPEG coded bitstream so that it satisfies given conditions (variable/constant bit budget constraints, prevention of buffer overflow and underflow).
• Variable/constant bitrate (VBR/CBR) is controlled by a constant/variable quantization parameter QP in an open/closed loop.
• A typical rate-control scheme consists of two basic operations:
  • bit allocation (R-D model), and
  • bit rate control (buffer occupancy measure).
• To achieve the target bit rate R, the rate control scheme chooses an appropriate quantization parameter Q. Accuracy depends on the rate-quantization (R-Q) model. Together with the distortion-quantization (D-Q) function, the R-Q function characterizes the rate-distortion (R-D) behavior of video encoding.
• The first step in deriving a rate control formula is to approximate the rate-quantization function R-Q by an inverse proportional curve, as shown in the figure.
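The inverse proportional approximation can be sketched as a first-order model R = X / Q, where the constant X is re-estimated from the previously coded picture. The model form, the function names and the clipping range for Q are assumptions for illustration and not the formula derived in the presentation.

```python
def fit_model(prev_bits, prev_q):
    """Estimate X in R = X / Q from the previously coded picture."""
    return prev_bits * prev_q

def choose_q(target_bits, x, q_min=1.0, q_max=31.0):
    """Invert the model for the target and clip to the allowed range."""
    q = x / target_bits
    return max(q_min, min(q_max, q))

# The previous picture used Q = 10 and produced 40000 bits.
x = fit_model(prev_bits=40000, prev_q=10)
print(choose_q(20000, x))   # half the bits -> twice the step size
print(choose_q(1000, x))    # unreachable target -> clipped to q_max
```

In a closed-loop scheme the target for each picture would additionally be corrected by the buffer occupancy before the model is inverted.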

**Joint encoding (Det/StatMux)**
• Deterministic multiplex of L video sequences, CBR encoded with constant bitrate $R_i$ (variable $D_i$ and picture quality) in a fixed channel capacity $R_c$.
• Statistical multiplex of $L + SM_{CG}$ video sequences, VBR encoded with variable bitrate $R_i$ (constant $D_i$ and picture quality). The criterion is a joint buffer occupancy measure.
Figure: multiplexing of video sequences 1, 2, 3, ...

**References**
[1] K.R. Rao, Z.S. Bojkovic, D.A. Milovanovic, Introduction to multimedia communications: applications – middleware – networking, Wiley, 2005.
[2] K.R. Rao, Z.S. Bojkovic, D.A. Milovanovic, Multimedia communication systems: techniques, standards, and networks, Prentice Hall, 2002.
[3] Y. Shoham, A. Gersho, "Efficient bit allocation for an arbitrary set of quantizers," IEEE Trans. ASSP, vol. 36, pp. 1445-1453, Sep. 1988.
[4] T. Berger, Rate-Distortion theory: A mathematical theory for data compression, Prentice-Hall, 1971.
[5] D.P. Bertsekas, Constrained optimization and Lagrange multiplier methods, Athena Scientific, 1996.
[6] R. Bellman, Dynamic Programming, Princeton University Press, 1957.
[7] D.A. Milovanovic, Z.S. Bojkovic, From information theory to standard codec optimization for digital visual multimedia, Seminar on Computer science and Applied mathematics, June 2013, Mathematical institute of the Serbian Academy of science and arts, and IEEE Chapter Computer Science (CO-16), Belgrade, Serbia.
[8] D. Milovanović, Z. Milićević, Z. Bojković, MPEG video deployment in digital television: HEVC vs. AVC codec performance study, 11th International Conference on Telecommunications in Modern Satellite, Cable and Broadcasting Services TELSIKS 2013, Nis, Serbia, Oct. 2013.