
A Picture is Worth a Thousand Words



Presentation Transcript


  1. A Picture is Worth a Thousand Words Milton Chen

  2. What’s a Picture Worth? • A thousand words - Descartes (1596-1650) • A thousand bytes - modern translation • 1000 * 5 * 5 / 3 ≈ 8,000 bits • 75,000 bytes - ATSC/MPEG-2 • 20 M / 30 ≈ 600,000 bits
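
The arithmetic on this slide can be checked with a few lines of Python. The slide does not say what its factors mean, so the reading below is an assumption: 1000 words of about 5 letters each, 5 bits per letter, and roughly 3:1 text compression, set against a channel of about 20 Mbit/s carrying 30 frames per second. All variable names are illustrative.

```python
# Assumed reading of the slide's factors; the source does not label them.
words = 1000
letters_per_word = 5       # assumption
bits_per_letter = 5        # assumption
text_compression = 3       # assumption: about 3:1 text compression

text_bits = words * letters_per_word * bits_per_letter / text_compression

channel_bits_per_second = 20_000_000   # "20 M" read as bits per second (assumption)
frames_per_second = 30
frame_bits = channel_bits_per_second / frames_per_second
frame_bytes = frame_bits / 8

ratio = frame_bits / text_bits

print(f"thousand words : about {text_bits:,.0f} bits")
print(f"one video frame: about {frame_bits:,.0f} bits ({frame_bytes:,.0f} bytes)")
print(f"frame / words  : about {ratio:.0f}x")
```

Under these assumptions the exact quotients come out a little above the slide's rounded figures of 8,000 and 600,000 bits, which is consistent with the slide giving order-of-magnitude estimates.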

  3. Frequency Response of the Eye • Lens - low pass • Photoreceptors - low pass • Lateral inhibition - high pass • edge is important

  4. Today’s Video Coding • YUV (lossy) -> Motion -> DCT -> Quantize (lossy) -> Order -> Entropy • Designed for natural scenes => Higher frequency DCT coefficients are quantized more => Sharp edges are not well preserved
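
The last point can be illustrated with a one-dimensional sketch: take an 8-sample step edge, apply a DCT, quantize the coefficients, and reconstruct. The step sizes below are hypothetical (they are not taken from any MPEG quantization table); the only property that matters is that `coarse_steps` grows with frequency while `fine_steps` does not.

```python
import math

N = 8  # 8-point transform, matching the 8x8 blocks of block-based DCT coding

def dct(x):
    """Orthonormal DCT-II of an N-sample signal."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N)
                               for n in range(N)))
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    out = []
    for n in range(N):
        s = math.sqrt(1 / N) * c[0]
        s += sum(math.sqrt(2 / N) * c[k] * math.cos(math.pi * (n + 0.5) * k / N)
                 for k in range(1, N))
        out.append(s)
    return out

def quantize(c, steps):
    """Round each coefficient to the nearest multiple of its step size."""
    return [round(ck / q) * q for ck, q in zip(c, steps)]

def squared_error(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

edge = [0, 0, 0, 0, 255, 255, 255, 255]       # a sharp edge
fine_steps = [8] * N                          # hypothetical: same step everywhere
coarse_steps = [8 + 24 * k for k in range(N)] # hypothetical: step grows with frequency

fine = idct(quantize(dct(edge), fine_steps))
coarse = idct(quantize(dct(edge), coarse_steps))

print("fine  :", [round(v) for v in fine])
print("coarse:", [round(v) for v in coarse])
print("error fine vs coarse:", squared_error(edge, fine), squared_error(edge, coarse))
```

A step edge puts much of its energy into the higher-frequency coefficients, so quantizing those more heavily leaves a visibly larger reconstruction error around the edge than uniform quantization does.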

  5. What’s Wrong with Today’s Video Coding • Poor performance for • text (channel logo, stock ticks) • graphics • anything with sharp edges

  6. Desirable Features • Postproduction support • Personalized delivery / presentation • Interactive • Error resilience • More compression • Facilitate search / indexing (MPEG-7)

  7. Outline • Why • MPEG-4 Overview • Systems Layer • Visual Coding • Arbitrarily shaped video • Meshed video • Face and body

  8. Goals of MPEG-4 • One content • convergence of DTV, computer graphics, and WWW • broadcast, internet, local • User interactivity • Higher compression rates • Robustness in mobile environment

  9. MPEG-4 Applications • Interactive TV (broadcast) • Home-shopping, Interactive game show • Virtual workspace (internet) • virtual meeting, collaborative design • Infotainment (local) • Virtual-City-Guide

  10. MPEG-4 Key Concepts • Independent coding of objects • allow user interactivity (client & server) • higher compression rates • Provide tools as well as solutions • allow content specific and user defined compression algorithms

  11. MPEG-4 History • Started in July 1993 • Originally for low-bit-rate applications • Version 1 to be standardized by January 1999 • Continue work on version 2, etc.

  12. MPEG-4 Standard 1) Systems (manage streams, composition) 2) Visual (natural and synthetic) 3) Audio (natural and synthetic) 4) Conformance Testing 5) Reference Software 6) Delivery Multimedia Integration Framework (medium abstraction layer)

  13. Previous Work in Object Coding • Synthetic High System (Schreiber ‘59) • Contour-Texture Approach (Kocher & Kunt ‘82) • Object-Based Video Coder (Musmann et al. ‘89) • Talisman (Torborg & Kajiya ‘96) • Blue screen matting (Vlahos ‘64)

  14. Shape Coding • Bitmap-based • 1 means in, 0 means out • Chroma-keying, GIF89a • G4 fax standard • Contour-based • chain code • polygon/curve approximation • Fourier descriptor

  15. Chain Code • Follows the contour and encodes the direction of the next boundary pel • 4 or 8 directions for an avg. of 1.2 or 1.4 bits per boundary pel • Extensions • length • angular resolution
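
A minimal sketch of an 8-direction chain code, assuming the contour is already available as an ordered list of neighbouring boundary pels. The direction numbering in `DIRECTIONS` is one common convention chosen for illustration, not something the slide specifies.

```python
# Map a step (dx, dy) between neighbouring pels to a direction code 0..7.
# The numbering is an illustrative convention.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}
STEPS = {code: step for step, code in DIRECTIONS.items()}

def chain_code(contour):
    """Encode a closed contour as one direction code per boundary pel."""
    codes = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:] + contour[:1]):
        codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return codes

def decode(start, codes):
    """Rebuild the contour from its start pel and chain code."""
    points = [start]
    for code in codes[:-1]:          # the last code only closes the loop
        dx, dy = STEPS[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points

square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
codes = chain_code(square)
print(codes)
print(decode(square[0], codes) == square)
```

Stored naively, each code takes 2 bits (4 directions) or 3 bits (8 directions). The lower averages quoted on the slide presumably rely on further coding of the direction sequence, which this sketch does not attempt.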

  16. Polygon Approximation • Add control points until maximum error is below threshold • Threshold <= 1.4 pel for CIF (352*288) video • Extension • curves of various order
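
One way to realize "add control points until the maximum error is below threshold" is a recursive split in the style of the Ramer-Douglas-Peucker algorithm: keep the two end points, find the contour point farthest from the segment joining them, and add it as a control point if it exceeds the threshold. This is a sketch of that general technique on an open contour, not necessarily the procedure evaluated for MPEG-4; the function names are illustrative.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))        # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def approximate(points, threshold):
    """Return control points so that no contour point is farther than
    `threshold` from the polygon through them."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    worst_index, worst_distance = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], a, b)
        if d > worst_distance:
            worst_index, worst_distance = i, d
    if worst_distance <= threshold:
        return [a, b]
    # Add the worst point as a control point and refine both halves.
    left = approximate(points[:worst_index + 1], threshold)
    right = approximate(points[worst_index:], threshold)
    return left[:-1] + right

# An L-shaped contour: along the x axis, then up.
contour = [(x, 0) for x in range(11)] + [(10, y) for y in range(1, 11)]
polygon = approximate(contour, threshold=1.4)
print(polygon)
```

With the slide's 1.4 pel threshold, the 21-point L-shaped contour reduces to its three corner points.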

  17. Fourier Descriptor • Translation, rotation, and scale invariant • Sample contour -> (x_i, y_i) • i, (y_{i+1} - y_i) / (x_{i+1} - x_i) • Compute Fourier Series coefficients • Good for recognition, but not an efficient shape coder
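
The invariance properties can be demonstrated with a short sketch. Note the substitution: instead of the slope-based contour function listed on the slide, this example uses the common complex-coordinate variant, where each sample becomes z_i = x_i + j*y_i. Dropping the DC term removes translation, taking magnitudes removes rotation, and dividing by the first harmonic removes scale. The function names and the choice of four coefficients are illustrative.

```python
import cmath
import math

def fourier_descriptor(contour, n_coeffs=4):
    """Invariant descriptor from a closed contour.
    Assumes counterclockwise traversal, so the first harmonic is nonzero."""
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    coeffs = [sum(z[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
              for k in range(n)]
    magnitudes = [abs(c) for c in coeffs[1:]]   # skip k = 0: translation invariance
    scale = magnitudes[0]                       # normalize: scale invariance
    return [m / scale for m in magnitudes[:n_coeffs]]

def transform(contour, scale, angle, shift):
    """Scale, rotate (radians), and translate a contour."""
    rot = cmath.rect(scale, angle)
    moved = []
    for x, y in contour:
        w = complex(x, y) * rot + complex(*shift)
        moved.append((w.real, w.imag))
    return moved

square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
original = fourier_descriptor(square)
moved = fourier_descriptor(transform(square, scale=3.0, angle=0.7, shift=(5, -2)))

print([round(v, 6) for v in original])
print([round(v, 6) for v in moved])
```

Both lines print the same values up to rounding, which is why descriptors of this kind suit recognition. They describe the shape only approximately when truncated, which fits the slide's remark that they are not an efficient shape coder.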

  18. MPEG-4 Experiments • Chroma-keying • color bleeding • need to decode whole frame to get shape • Bitmap and contour-based coding are similar in: • error resilience • coding efficiency • Bitmap-based is simpler for hardware due to regular memory access

  19. MPEG-4 Shape Coding • Three types of macroblocks • transparent, opaque, and object boundary • Context-based arithmetic encoder • Macroblocks can be subsampled • Texture padded with 0 or mean value • Transparency • constant: one 8 bit value • arbitrary: treat it like color
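
The three macroblock types can be derived from a binary shape mask as sketched below. This only shows the classification step, assuming a mask stored as a list of rows of 0/1 values and 16x16 macroblocks by default; the context-based arithmetic coding of the boundary blocks is not shown, and the function name is illustrative.

```python
def classify_macroblocks(mask, block=16):
    """Label each block of a binary mask as transparent, opaque, or boundary."""
    rows, cols = len(mask), len(mask[0])
    labels = []
    for top in range(0, rows, block):
        row_labels = []
        for left in range(0, cols, block):
            pels = [mask[y][x]
                    for y in range(top, min(top + block, rows))
                    for x in range(left, min(left + block, cols))]
            if not any(pels):
                row_labels.append("transparent")   # entirely outside the object
            elif all(pels):
                row_labels.append("opaque")        # entirely inside the object
            else:
                row_labels.append("boundary")      # shape must be coded pel by pel
        labels.append(row_labels)
    return labels

# Tiny mask with 2x2 blocks so the result is easy to read.
mask = [
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
labels = classify_macroblocks(mask, block=2)
for row in labels:
    print(row)
```

Only the boundary blocks need their shape coded in detail; transparent and opaque blocks are fully described by their label.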

  20. Meshed Video • 2D mesh tessellates the video into patches • Motion vector for each vertex • Texture warped in each patch
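
A sketch of how per-vertex motion vectors can warp one triangular patch: the displacement of any point inside the patch is interpolated from the three vertex motion vectors using barycentric weights, which amounts to an affine warp per triangle. A triangular mesh and the function names are assumptions made for illustration.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb

def warp_point(p, triangle, motion_vectors):
    """Move p by the weighted average of the three vertex motion vectors."""
    weights = barycentric(p, *triangle)
    dx = sum(w * mv[0] for w, mv in zip(weights, motion_vectors))
    dy = sum(w * mv[1] for w, mv in zip(weights, motion_vectors))
    return (p[0] + dx, p[1] + dy)

triangle = [(0, 0), (4, 0), (0, 4)]
# Motion vectors that rotate the patch by 90 degrees about the origin:
# (0,0) stays, (4,0) -> (0,4), (0,4) -> (-4,0).
motion_vectors = [(0, 0), (-4, 4), (-4, -4)]

warped = warp_point((1, 1), triangle, motion_vectors)
print(warped)
```

The interior point (1, 1) lands at (-1, 1), its position after the same 90 degree rotation, so three motion vectors per patch are enough to carry a rotation that a single translational block vector cannot represent.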

  21. Meshed Video - Motivation • Motion Modeling • Translational-block motion does not model rotation, scaling, reflection, and shear • Shape Modeling • Possible without depth

  22. Meshed Video - Applications • Compression • better motion compensation • transmit texture only at key frames • spatio-temporal interpolation (zooming, frame-rate up-conversion) • Manipulation • augmented reality • transfiguration (replace billboards) • Indexing / searching

  23. Face • Face object • Default face model with terminal • Facial Definition Parameter or user supplied model/texture • Facial Animation Parameter plus Amplification and Filters • Lip Shape Animation from phoneme

  24. Facial Definition Parameter

  25. Facial Animation Parameter

  26. Body • Like the face

  27. Ultimate Compression Technique - Computer Graphics ??? • Block based DCT (MPEG-1/2) • Arbitrary shaped video (MPEG-4) • Meshed video (MPEG-4) • Image based rendering • Textured 3D graphics • Geometry only 3D graphics
