
Roadmap


fleta

Presentation Transcript


  1. Roadmap • Introduction • Intra-frame coding • Inter-frame coding • Object-based and scalable video coding* • Why object-based? • motion segmentation, shape coding, R-D optimization • scalability issues • Spatial/temporal/quality scalabilities EE569 Digital Video Processing

  2. Object-based Video Coding • The waveform-based coding discussed so far uses a simple source model (e.g., H.261/263/264, MPEG-1/-2) • It does not consider the semantic content of the video (e.g., objects and their shapes) • Object-based video coding identifies objects (or regions) in a video and encodes them. Potential benefits include • Improved coding efficiency • Improved visual quality (e.g., no blocking artifacts) • Content description • Content-based interactivity • Also called “content-dependent video coding” • The buzzword for MPEG-4, but less successful than expected (so the important question is why it does not work so well)

  3. Essential Tasks in Object-based Video Coding • Object/region segmentation • Separate pixels based on their color, texture, and motion characteristics • Closely related to motion detection and segmentation • Intrinsically ill-defined and in need of a breakthrough • 2D shape modeling and coding • Not all shapes are equally probable • Subtle implications for video coding (hidden pitfalls) • 2D texture modeling and coding • Extension of existing block-based MCP to region-based MCP • Deformable textures (tradeoff between spatial and temporal prediction)

  4. Object/Region Segmentation • The major challenge in content/object-based coding • Common approaches for segmentation in a still image: gray-level thresholding, clustering, edge detection, region growing, splitting and merging • Object segmentation in video • Motion information can be utilized, but how? • Should we rely more on motion cues or on spatial cues?

  5. Motion-based Segmentation • Motion-based segmentation: segmenting an image using motion information • We can first estimate the motion field and then segment the motion field • However, estimation and segmentation are like two sides of the same coin

  6. A Mind-bothering Example • [Figure: Frame 1 and Frame 2] • It is easy to convince yourself that the tree branches are moving, but how do we know the sky is still? • What if it were also moving at the same speed (shouldn’t we observe the same intensity patterns, because the sky is a smooth region)?

  7. Implications for Video Coding • True motion representation might be useful to computer vision and motion perception, but it is not indispensable in video coding • The fundamental reason lies in the relationship between motion representation and video coding: how to tolerate the uncertainty in motion? • The same issue remains in object-based image coding: how to tolerate the uncertainty in shape? (we will discuss this in more detail later)

  8. Simplified Segmentation: Change Detection • To detect the changing parts of a video from time ti to time tj, we compute a difference image between f(x, y, ti) and f(x, y, tj) and threshold the difference by T to obtain a binary map dij(x, y) • dij(x, y) can be further processed, e.g., to remove isolated 1’s, or to group 1’s that are close to each other
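The thresholding step on slide 8 can be sketched in a few lines of Python. This is a minimal illustration rather than the course's implementation. It assumes frames are lists of rows of gray levels, that the absolute difference is what gets compared against T, and that a 1 is "isolated" when none of its 8 neighbours is set. The names `change_mask` and `remove_isolated` are hypothetical.

```python
def change_mask(frame_i, frame_j, threshold):
    """Binary map d_ij: 1 where |f(x,y,tj) - f(x,y,ti)| exceeds the threshold T."""
    return [
        [1 if abs(b - a) > threshold else 0 for a, b in zip(row_i, row_j)]
        for row_i, row_j in zip(frame_i, frame_j)
    ]


def remove_isolated(mask):
    """Clear every 1 that has no 8-connected neighbour equal to 1."""
    height, width = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1:
                continue
            neighbours = sum(
                mask[yy][xx]
                for yy in range(max(0, y - 1), min(height, y + 2))
                for xx in range(max(0, x - 1), min(width, x + 2))
                if (yy, xx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned
```

Calling `remove_isolated(change_mask(f_i, f_j, T))` keeps connected groups of changed pixels and drops single-pixel detections, which are often noise.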

  9. Change Detection: Pros and Cons • Simple to implement; fast • Detects all changes • Detects even unwanted changes • Positive and negative changes detected (occlusion) • Difficult to quantify motion • Requires a static reference frame

  10. Change Detection: An Example • Traffic monitoring

  11. Without a Static Reference Frame • Background extraction methods • Ad-hoc median detector (your CA#6) • To eliminate the impact of (small) moving objects, use the “robust estimator” approach to iteratively remove the outliers • More sophisticated approaches model the background by a mixture of Gaussian distributions and use graph-cut based optimization
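A median-based background estimate, in the spirit of the "ad-hoc median detector" bullet, might look like the following. This is a sketch under assumptions: frames are equally sized lists of rows of gray levels, the camera is still, and each pixel shows the background in more than half of the frames. The function name `median_background` is hypothetical and is not taken from the course assignment.

```python
from statistics import median


def median_background(frames):
    """Per-pixel temporal median over a list of equally sized gray-level frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]
```

Because the median ignores values that occur in fewer than half of the samples, a small object passing through a pixel for a few frames does not affect the estimate at that pixel.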

  12. Simplified Segmentation: Global Motion Estimation • Planar homography (feature-based) • Homogeneous coordinates • Conditions for planar homography • Homography estimation from feature correspondence • Hierarchical model-based GME (feature-less) • Directly minimize an energy function (the MSE of MCP errors) • Solve the optimization problem in a coarse-to-fine fashion (more robust and efficient)

  13. Plane Homography
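Slide 12 lists homogeneous coordinates as the basis for planar homography. As a hedged sketch of what that means in practice, the code below maps an image point through a 3x3 homography matrix: the point is lifted to (x, y, 1), multiplied by the matrix, and divided by the third coordinate. The matrix layout (row-major nested lists) and the name `apply_homography` are assumptions made for illustration.

```python
def apply_homography(H, x, y):
    """Map (x, y) through the 3x3 homography H using homogeneous coordinates."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Dividing by w returns from homogeneous to ordinary image coordinates.
    return xh / w, yh / w
```

A homography is defined only up to scale, so multiplying every entry of `H` by the same nonzero constant leaves the mapped point unchanged.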

  14. Model-based GME • Target function for minimization [equation] • Solution: Gauss-Newton method [equations] • Bergen, J. R., Anandan, P., Hanna, K. J., and Hingorani, R., “Hierarchical Model-Based Motion Estimation,” in Proc. of the Second European Conference on Computer Vision, pp. 237-252, 1992

  15. Multi-resolution GME

  16. Numerical Example

  17. Summary for Change Detection and Global Motion Estimation • Motion segmentation becomes relatively easier to solve when either the camera is still or the background objects lie on a plane • Latest advances include joint motion segmentation and estimation using level-set methods (PDE-based formulation) • Mansouri, A.-R., and Konrad, J., “Multiple motion segmentation with level sets,” IEEE Transactions on Image Processing, vol. 12, no. 2, pp. 201-220, Feb. 2003

  18. 2-D Shape Modeling and Coding • Bitmap coding: a binary map specifying whether or not a pixel belongs to an object • A special case of the general alpha-map • Contour coding: code only the contour of the object or the region • Chain codes • Polygon approximation • Spline approximation
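To make the "chain codes" bullet concrete, here is a small sketch of an 8-direction (Freeman-style) chain code. It assumes the contour is given as an ordered list of 8-connected pixel coordinates, and it numbers directions counter-clockwise starting from +x with y pointing up. That numbering is one common convention chosen for illustration, not necessarily the one used in the course. The names `chain_code` and `decode_chain` are hypothetical.

```python
# Direction code for each unit step (dx, dy); code 0 is +x, counter-clockwise, y up.
STEP_TO_CODE = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}
CODE_TO_STEP = {code: step for step, code in STEP_TO_CODE.items()}


def chain_code(points):
    """Encode an ordered contour as a list of direction codes (3 bits each)."""
    return [
        STEP_TO_CODE[(x1 - x0, y1 - y0)]
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


def decode_chain(start, codes):
    """Rebuild the contour from its starting point and direction codes."""
    points = [start]
    for code in codes:
        dx, dy = CODE_TO_STEP[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points
```

Only the starting point and one 3-bit symbol per contour pixel need to be stored, which is why contour coding can be cheaper than a full bitmap for simple shapes.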

  19. Image Matting (Soft Segmentation) • Not for coding but for interactive editing

  20. 2-D Texture Modeling and Coding* • Shape-adaptive DCT • Shape-adaptive wavelet transform

  21. Roadmap • Introduction • Intra-frame coding • Review of JPEG • Inter-frame coding • Conditional Replenishment (CR) • Motion Compensated Prediction (MCP) • Scalable video coding • 3D subband/wavelet coding and recent trend

  22. Scalable vs. Multicast • What is scalable coding? • [Figure: Multicast: foreman.yuv is encoded into separate streams foreman128k.cod, foreman256k.cod, foreman512k.cod, and foreman1024k.cod. Scalable coding: foreman.yuv is encoded into a single stream foreman.cod serving the rates 128, 256, 512, and 1024]

  23. Spatial scalability

  24. Temporal scalability • 30 Hz: frames 0, 1, 2, 3, 4, 5, … • 15 Hz: frames 0, 2, 4, 6, 8, … • 7.5 Hz: frames 0, 4, 8, 12, …
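The frame lists on slide 24 follow from simple subsampling of the full-rate sequence. The sketch below reproduces them. It assumes the target rate divides the full rate by an integer factor, and the function name `frames_at_rate` is hypothetical.

```python
def frames_at_rate(num_frames, full_rate_hz, target_rate_hz):
    """Indices of the frames a decoder keeps when playing at a reduced frame rate."""
    step = full_rate_hz / target_rate_hz
    if step < 1 or step != int(step):
        raise ValueError("target rate must divide the full rate by an integer factor")
    return list(range(0, num_frames, int(step)))
```

For a 30 Hz source, `frames_at_rate(13, 30, 7.5)` keeps every fourth frame and `frames_at_rate(9, 30, 15)` keeps every second frame, matching the 7.5 Hz and 15 Hz lists on the slide.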

  25. SNR (Rate) scalability • [Figure: three quality levels with PSNRavg = 30 dB, 35 dB, and 40 dB] • PSNRi: PSNR of frame i
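For reference, the per-frame PSNR and its average over a sequence can be computed as sketched below. The sketch assumes 8-bit gray-level frames (peak value 255) stored as lists of rows, and it assumes PSNRavg is the arithmetic mean of the per-frame PSNRi values, which the slide does not state explicitly. The function names are hypothetical.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized gray-level frames."""
    squared_error = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for a, b in zip(row_o, row_d):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


def average_psnr(per_frame_psnr):
    """Arithmetic mean of the per-frame PSNR values (an assumed definition)."""
    return sum(per_frame_psnr) / len(per_frame_psnr)
```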

  26. Scalability via Bit-Plane Coding • Sign bit plus magnitude bits: $|A| = a_0 + a_1 2 + a_2 2^2 + \cdots + a_7 2^7$, where $a_0$ is the Least Significant Bit (LSB) and $a_7$ is the Most Significant Bit (MSB) • Example: $A = 129$ → sign = +, $a_0 a_1 a_2 \ldots a_7 = 10000001$ • Example: sign = −, $a_0 a_1 a_2 \ldots a_7 = 00110011$ → $A = -(4 + 8 + 64 + 128) = -204$
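The two examples on slide 26 can be checked with a short sketch. It uses the slide's bit ordering (a0 first, so the list starts at the LSB). It also assumes that a scalable decoder receives the bit planes from the MSB downward and treats missing planes as zero; that decoding rule is an illustration of why bit planes give scalability, not something stated on the slide. The function names are hypothetical.

```python
def to_bit_planes(value, num_bits=8):
    """Split an integer into a sign and magnitude bits [a0, a1, ..., a7] (LSB first)."""
    sign = "-" if value < 0 else "+"
    magnitude = abs(value)
    if magnitude >= (1 << num_bits):
        raise ValueError("magnitude does not fit in num_bits")
    bits = [(magnitude >> k) & 1 for k in range(num_bits)]
    return sign, bits


def from_bit_planes(sign, bits, planes_received=None):
    """Rebuild the value from the planes_received most significant bit planes."""
    num_bits = len(bits)
    if planes_received is None:
        planes_received = num_bits
    magnitude = 0
    # Walk from the MSB down; planes that were not received contribute zero.
    for k in range(num_bits - 1, num_bits - 1 - planes_received, -1):
        magnitude += bits[k] << k
    return -magnitude if sign == "-" else magnitude
```

With all eight planes the value is recovered exactly. With fewer planes the decoder gets a coarser approximation, so truncating the bitstream lowers quality gradually instead of breaking decoding.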

  27. Why Is DPCM Bad for Scalability? • Frame number: 1, 2, 3, … • Base layer: Ibase P P P • Enhancement layer 1: Ienh1 P P P • Enhancement layer 2: Ienh2 P P P • [Figure annotations: “suffer from drifting problem” and “suffer from coding efficiency loss”]

  28. Efficiency Gap • Fine Granular Scalability (FGS): base layer at 20 kbps, enhancement layer at variable bit-rate • [Figure: H.264 with/without FGS option, Foreman sequence (5 fps), ~2 dB gap]

  29. 3D Wavelet/Subband Coding • [Figure: video volume with axes x, y, t] • 2D spatial WT + 1D temporal WT

  30. Wavelet Video Coder • [Figure: original video frames 0 to 7; blocks: Temporal Wavelet Transform (temporal subbands H, LH, LLH, LLL), Spatial Wavelet Transform, Embedded Quantization & Entropy Coding] • [Taubman & Zakhor, 1994] [Ohm, 1994] [Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] . . . and others

  31. Motion-Adaptive 3D Wavelet Transform • Recall the Haar transform and its lifting-based implementation • Motion-adaptive Haar transform • W, W^-1: forward and backward motion vector
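The lifting idea on slide 31 can be sketched as follows. The slide's own equations are not reproduced in this transcript, so the code uses a common unnormalized form as an assumption: predict with high = odd - W(even), then update with low = even + W^-1(high)/2. Frames are flat lists of pixel values, and `warp` and `inverse_warp` are hypothetical stand-ins for motion compensation along W and W^-1 that default to the identity, which gives the plain temporal Haar transform.

```python
def identity(frame):
    """No motion: a copy of the frame."""
    return list(frame)


def haar_lifting_forward(even, odd, warp=identity, inverse_warp=identity):
    """One temporal lifting stage: returns (low band, high band)."""
    predicted = warp(even)
    high = [o - p for o, p in zip(odd, predicted)]           # predict step
    back = inverse_warp(high)
    low = [e + b / 2 for e, b in zip(even, back)]            # update step
    return low, high


def haar_lifting_inverse(low, high, warp=identity, inverse_warp=identity):
    """Undo the lifting steps in reverse order: returns (even, odd)."""
    back = inverse_warp(high)
    even = [l - b / 2 for l, b in zip(low, back)]            # undo update
    predicted = warp(even)
    odd = [h + p for h, p in zip(high, predicted)]           # undo predict
    return even, odd
```

Because each step is undone by subtracting exactly what was added, the transform is invertible for any choice of warping functions. That property is what lets motion compensation be placed inside the wavelet transform.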

  32. Lifting • [Figure: Analysis: even frames and odd frames pass through predict (P) and update (U) steps with motion compensation to produce the low band and the high band. Synthesis: the low band and the high band pass through U and P steps to recover the even frames and odd frames] • [Secker & Taubman, 2001] [Popescu & Bottreau, 2001]

  33. MC Wavelet Coding vs. H.264/AVC • [Figure: luminance PSNR (dB) vs. bit-rate (Mbps); curves: non-scalable H.264/AVC and scalable MC 5/3 wavelet] • Sequence: Mobile CIF • H.264/AVC • high complexity RD control • CABAC • PBBPBBP . . . • 5 prev/3 future reference frames • data courtesy of M. Flierl • [Taubman & Secker, VCIP 2003], courtesy D. Taubman

  34. Wavelet Synthesis with Lossy Motion Vector • [Figure: blocks between video in and video out: MC Wavelet Transform, Motion Estimator, two Embedded Encoding/Decoder pairs (each minimizing J = D + λR), Inverse Wavelet Transform] • [Taubman & Secker, ICIP03]

  35. R-D Performance with Lossy Motion Vector • [Figure: video PSNR (dB) vs. bit rate (kbps), CIF Foreman; curves: non-embedded single-rate; embedded wavelet coefficients with lossless motion; embedded wavelet coefficients with lossy motion] • [Taubman & Secker, VCIP 2003], courtesy D. Taubman

  36. Surprising Success of ITU-T Rec. H.263 • What H.263 was developed for . . . : analog videophone • . . . and what it was used for: Internet video streaming

  37. What is Streaming Video? • Download mode: no delay bound • Streaming mode: delay bound • [Figure: a source (cnn.com) sends data over the Internet through Domains A, B, and C and access switches (Access SW) to Receiver 1 and Receiver 2 (RealPlayer)]

  38. Outline • Challenges for quality video transport • An architecture for video streaming • Video compression • Application-layer QoS control • Continuous media distribution services • Streaming server • Media synchronization mechanisms • Protocols for streaming media • Summary

  39. Time-varying Available Bandwidth • No bandwidth reservation • [Figure: data path from the source (cnn.com) through Domain A and Domain B and access switches (Access SW) to the receiver (RealPlayer); labels: 56 kb/s, R >= 56 kb/s, R < 56 kb/s]

  40. Time-varying Delay • Delayed packets regarded as lost • [Figure: data path from the source (cnn.com, 56 kb/s) through Domain A and Domain B and access switches (Access SW) to the receiver (RealPlayer)]

  41. Effect of Packet Loss • No retransmission • [Figure: data path from the source through Domain A and Domain B and access switches (Access SW) to the receiver; labels: no packet loss, loss of packets]

  42. Unicast vs. Multicast • [Figure: unicast and multicast delivery] • Pros and cons?

  43. Heterogeneity for Multicast • Network heterogeneity • Receiver heterogeneity • [Figure: the source reaches Receivers 1, 2, and 3 through Domains A, B, and C, access switches (Access SW), a gateway, Ethernet, telephone networks, and the Internet; link rates shown: 64 kb/s, 256 kb/s, 1 Mb/s] • What quality?

  44. Outline • Challenges for quality video transport • An architecture for video streaming • Video compression • Application-layer QoS control • Continuous media distribution services • Streaming server • Media synchronization mechanisms • Protocols for streaming media • Summary

  45. Architecture for Video Streaming

  46. Video Compression • [Figure: a layered coder produces Layer 0 (64 kb/s), Layer 1 (256 kb/s), and Layer 2 (1 Mb/s); three decoders (D) and two adders (+) combine the layers] • Layered video encoding/decoding. D denotes the decoder.

  47. Application of Layered Video • IP multicast • [Figure: the source reaches Receivers 1, 2, and 3 through Domains A, B, and C, access switches (Access SW), a gateway, Ethernet, telephone networks, and the Internet; link rates shown: 64 kb/s, 256 kb/s, 1 Mb/s]
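How a receiver might pick layers in the scenario of slides 46 and 47 can be sketched as below. The sketch assumes the rates on slide 46 (64 kb/s, 256 kb/s, 1 Mb/s) are cumulative, meaning the rate needed to receive that layer together with all lower layers. It also assumes 1 Mb/s = 1000 kb/s and that layers must be taken in order starting from the base layer. The names `CUMULATIVE_RATE_KBPS` and `layers_to_subscribe` are hypothetical.

```python
# Assumed cumulative rates (kb/s) for receiving layers 0, 0-1, and 0-2.
CUMULATIVE_RATE_KBPS = [64, 256, 1000]


def layers_to_subscribe(available_kbps, cumulative_rates=CUMULATIVE_RATE_KBPS):
    """Number of layers, starting from the base layer, that fit the available rate."""
    count = 0
    for rate in cumulative_rates:
        if rate > available_kbps:
            break  # higher layers depend on this one, so stop here
        count += 1
    return count
```

Under these assumptions a 64 kb/s receiver decodes only the base layer, a 256 kb/s receiver adds the first enhancement layer, and a 1 Mb/s receiver takes all three, so one multicast source can serve heterogeneous receivers.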

  48. Application-layer QoS Control • Congestion control (using rate control): • Source-based, requires • rate-adaptive compression or • rate shaping • Receiver-based • Hybrid • Error control: • Forward error correction (FEC) • Retransmission • Error resilient compression • Error concealment

  49. Congestion Control • Window-based vs. rate control (pros and cons?) • [Figure: window-based control and rate control]

  50. Source-based Rate Control
