
Introduction



  1. Introduction • Much information comes in the form of images • Machines handle an image as a matrix of digital picture elements, or pixels • The appearance of an image depends on its type and its resolution • Gabriele Monfardini, Multimedia Databases course (Corso di Basi di Dati Multimediali), academic year 2004-2005

  2. Types of images & resolution • Image types: bilevel (black & white, e.g. faxes), grayscale, color • Resolution is measured in dots per inch (dpi) • 600 x 600: current medium-quality laser printer • 1200 x 1200: low-cost phototypesetter • 4800 x 4800: high-resolution phototypesetter

  3. Bilevel images: CCITT fax standard • fax: short for facsimile • CCITT (Comité Consultatif International Téléphonique et Télégraphique) was part of the ITU (International Telecommunication Union), one of the specialized agencies of the United Nations; it has since been renamed ITU-T • In the late 1970s CCITT started working on a standard for fax transmission • 1980: CCITT Group 3 standard • Groups 1 and 2 were earlier attempts that used simpler encoding and modulation techniques, resulting in very slow transmission

  4. CCITT Group 3 - I • It is the most common standard for fax transmission • It is accepted worldwide: almost every fax machine supports it • It uses compression algorithms designed for bilevel images

  5. CCITT Group 3 - II • Paper size: international A4 (not US letter) • Standard resolution: 204x98 dpi (nominally 200x100) • High resolution: 204x196 dpi (nominally 200x200) • Bilevel image → 1 bit/pixel • Image size at standard resolution: 1728 bits/line x 1188 lines/page, about 2 Mbit • Transmission rate: 4.8 kbit/s originally; today it is usually higher, 14.4 to 33.6 kbit/s • At 4.8 kbit/s an uncompressed standard-resolution page would take about 430 s, but only about 1 minute on average with the Group 3 algorithms
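The 2 Mbit and 430 s figures follow directly from the page geometry and the line rate. A minimal sketch of the arithmetic, using only the numbers quoted on the slide (the function name is illustrative):

```python
# Page geometry and line speed as stated on the slide.
BITS_PER_LINE = 1728
LINES_PER_PAGE = 1188      # standard resolution
RATE_BIT_PER_S = 4800      # 4.8 kbit/s

def uncompressed_seconds(bits_per_line, lines, rate):
    """Time to send a 1 bit/pixel page with no compression."""
    return bits_per_line * lines / rate

page_bits = BITS_PER_LINE * LINES_PER_PAGE
seconds = uncompressed_seconds(BITS_PER_LINE, LINES_PER_PAGE, RATE_BIT_PER_S)
print(page_bits)        # 2052864 bits, about 2 Mbit
print(round(seconds))   # 428 s, i.e. roughly the 430 s quoted
```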

  6. Run-length coding • Each scan line is a sequence of runs, i.e. stretches of consecutive pixels of the same color • Count the number of pixels in each run • Example: 3w 4b 9w 2b 2w 6b 5w 2b 5w ...
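Run-length extraction can be sketched in a few lines. The representation of a scan line as a string of 'w' and 'b' characters is an assumption made for illustration only:

```python
from itertools import groupby

def run_lengths(scan_line):
    """Return [(color, length), ...] for a string of 'w'/'b' pixels."""
    return [(color, len(list(run))) for color, run in groupby(scan_line)]

# First six runs of the example on the slide.
line = "w" * 3 + "b" * 4 + "w" * 9 + "b" * 2 + "w" * 2 + "b" * 6
runs = run_lengths(line)
print(" ".join(f"{n}{c}" for c, n in runs))   # 3w 4b 9w 2b 2w 6b
```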

  7. G3 1D • Group 3 One-Dimensional coding (G3 1D) is called Modified Huffman (MH), as it encodes run lengths using a predefined Huffman code • To maintain black/white synchronization, each line begins with a white run, possibly of zero length

  8. G3 1D • Example: the runs 3w 4b 9w 2b 2w 6b ... are coded as 1000 011 10100 11 0111 0010 ... • The predefined Huffman codewords were derived from the probabilities of the runs in typical handwritten documents

  9. G3 1D • As one line has 1728 bits, we would have to define a codeword for every possible black and white run length up to 1728 • As shorter runs occur more frequently than longer runs, each run length is instead coded in additive form • There are terminating codewords and makeup codewords • Lengths from 0 to 63 are coded with a single terminating codeword • Longer runs are coded with one or more makeup codewords followed by a terminating codeword • Each line is terminated by an EOL symbol composed of eleven 0s and one 1
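The additive form can be sketched as a split of each run length into a makeup part and a terminating part. This sketch assumes makeup codewords exist for every multiple of 64 up to 1728, so a single makeup codeword is enough within a 1728-pixel line; the actual codeword bit patterns are not reproduced here.

```python
def split_run(length):
    """Split a run length into (makeup_lengths, terminating_length).

    Assumption: makeup codewords exist for multiples of 64 up to 1728,
    which is enough for a 1728-pixel line.
    """
    if not 0 <= length <= 1728:
        raise ValueError("run length outside a 1728-pixel line")
    makeup = 64 * (length // 64)
    terminating = length % 64
    return ([makeup] if makeup else []), terminating

EOL = "0" * 11 + "1"   # eleven 0s followed by a 1

print(split_run(9))      # ([], 9)
print(split_run(200))    # ([192], 8)
print(split_run(1728))   # ([1728], 0)
```

Note that a run whose length is an exact multiple of 64 still needs a terminating codeword, here for length 0.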

  10. G3 2D • Group 3 Two-Dimensional coding (G3 2D) is called Modified READ (MR), as it is a variant of a previously defined code called READ (Relative Element Address Designate) • Many images have a high degree of vertical coherence between consecutive lines • Changing elements are coded w.r.t. a "nearby" change position of the same color in the previous (reference) line

  11. G3 2D • "Nearby" means within 3 pixels on either side • If the current line has changing elements with no counterpart in the reference line → switch to horizontal mode (1D) • Conversely, if the reference line has a run with no counterpart in the current line → special pass code
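The choice among the three modes can be sketched as a simple decision on pixel positions. The names a1, b1 and b2 and the sign convention of the offset are assumptions for this illustration; finding those positions on the two lines and emitting the actual codewords are left out.

```python
def choose_mode(a1, b1, b2):
    """Pick the 2D coding mode for the next changing element.

    a1: position of the next changing element on the current line
    b1: position of the nearby same-color change on the reference line
    b2: position of the change that follows b1 on the reference line
    """
    if b2 < a1:
        return ("pass",)                 # reference run has no counterpart
    if abs(a1 - b1) <= 3:
        return ("vertical", a1 - b1)     # offset in -3..+3
    return ("horizontal",)               # fall back to 1D run lengths

print(choose_mode(a1=10, b1=11, b2=15))  # ('vertical', -1)
print(choose_mode(a1=20, b1=5, b2=8))    # ('pass',)
print(choose_mode(a1=10, b1=30, b2=34))  # ('horizontal',)
```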

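  12. G3 2D • [Figure: a reference line and the current line, whose changing elements are coded in vertical mode, horizontal mode, pass code, vertical mode, ... together with the generated code] • Vertical mode: the offset (in the figure -1, 0, +2, -2) is coded from a Huffman table with codewords for -3, -2, -1, 0, +1, +2, +3 • Horizontal mode: <mode | length of preceding white run | length of black run> • Pass code: 0001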

  13. G3 2D • Two-dimensional coding is more sensitive to transmission errors • In G3 1D an error may corrupt an entire line, but synchronization is restored by the EOL codeword • In 2D an error in the reference line is likely to propagate to all the lines that follow • For this reason one line in every k is coded in 1D and serves as a reference line, so at most k-1 lines are coded w.r.t. earlier lines • standard resolution → k=2 • high resolution → k=4

  14. CCITT fax standard: compression performance • Standard resolution (~200x100 dpi): G3 1D → 0.13 bits/pixel, 57 s for an A4 page at 4.8 kbit/s; G3 2D (k=2) → 0.11 bits/pixel, 47 s for an A4 page at 4.8 kbit/s • High resolution (~200x200 dpi): G3 2D (k=4) → 0.09 bits/pixel, 74 s for an A4 page at 4.8 kbit/s • Compression is very good for office documents, where runs are long • It would be very poor for bilevel natural images, where runs are short

  15. Continuous-tone images: why lossless compression? • Lossy compression is often preferred because it gives much smaller files at good quality • However, in some situations an approximation is not adequate: medical images, historical documents, images with legal relevance

  16. Continuous-tone images: lossless compression • GIF standard • PNG standard • JPEG-LS: a fairly recent standard. The original JPEG standard included a lossless mode, but its performance was not close to the state of the art • JPEG-LS estimates each pixel value from a fairly simple context: an effective and low-cost solution • www.hpl.hp.com/loco

  17. GIF image format - I • Adopted by CompuServe to minimize the time required to download images over a modem link • The most widely used lossless image format until 1995 • 8-bit pixel description: at most 256 colors per image, selected through a color map

  18. GIF image format - II • The color map can be specified for each image or can be omitted • If specified, it is stored uncompressed as a header in the image file • The color map is composed of 256 24-bit entries, which specify 256 RGB colors • The compression scheme is LZW • The alphabet symbols are the 256 colors of the color map plus a "clear" code and an "end-of-information" code
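A minimal sketch of LZW over this alphabet is shown below. It is a simplification, not a GIF writer: codes are returned as plain integers, the numbering 256 for "clear" and 257 for "end-of-information" assumes 8-bit pixels, and variable-width bit packing, byte-counted blocks and dictionary resets are all left out.

```python
CLEAR = 256   # "clear" code
EOI = 257     # "end-of-information" code

def lzw_encode(pixels):
    """LZW-encode a list of 8-bit color indices into a list of integer codes.

    Simplification: the dictionary is never reset after the initial
    clear code and the codes are not packed into bits.
    """
    table = {(i,): i for i in range(256)}
    next_code = 258
    out = [CLEAR]
    current = ()
    for p in pixels:
        candidate = current + (p,)
        if candidate in table:
            current = candidate            # keep extending the match
        else:
            out.append(table[current])     # emit the longest known string
            table[candidate] = next_code   # learn the new string
            next_code += 1
            current = (p,)
    if current:
        out.append(table[current])
    out.append(EOI)
    return out

print(lzw_encode([7, 7, 7, 7, 7, 7]))   # [256, 7, 258, 259, 257]
```

Six identical pixels collapse to three data codes because each emitted code refers to a longer string than the previous one.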

  19. GIF image format - III • Although this feature is not widely used, a GIF file may contain more than one image, and the images can share the color map • LZW-coded information is grouped into blocks preceded by a byte count, so that an image can be skipped without decompressing it • In 1995 Unisys announced that royalties would be due on GIF implementations, because of an old patent it held on LZW • This catalyzed the development of a new lossless image format, designed to be free of patents and to include the latest improvements

  20. PNG image format - I • Portable Network Graphics (pronounced "ping") • It uses the same compression scheme as gzip • Thanks to some improvements, compression is about 10-30% better than GIF • By default it encodes the pixels in raster scan order, but other methods are available • It is possible to code the horizontal difference, i.e. the difference between the current pixel value and the previous one, or the vertical difference, i.e. the difference w.r.t. the pixel above • Average difference: the difference from the average of the pixel above and the previous pixel • ...
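The three differences can be sketched on a single row of bytes. This sketch assumes one byte per pixel (8-bit grayscale), treats missing neighbors as 0, and takes the average over the left and above neighbors, as the PNG specification does; all arithmetic is modulo 256.

```python
def filter_sub(row):
    """Horizontal difference: each byte minus the byte to its left."""
    return [(x - (row[i - 1] if i else 0)) % 256 for i, x in enumerate(row)]

def filter_up(row, above):
    """Vertical difference: each byte minus the byte above it."""
    return [(x - a) % 256 for x, a in zip(row, above)]

def filter_average(row, above):
    """Difference from the floor of the mean of the left and above bytes."""
    out = []
    for i, x in enumerate(row):
        left = row[i - 1] if i else 0
        out.append((x - (left + above[i]) // 2) % 256)
    return out

above = [10, 12, 14, 16]
row = [11, 13, 15, 17]
print(filter_sub(row))             # [11, 2, 2, 2]
print(filter_up(row, above))       # [1, 1, 1, 1]
print(filter_average(row, above))  # [6, 2, 2, 2]
```

On smooth image data the differences are small and repetitive, which is what makes the subsequent general-purpose compressor more effective.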

  21. PNG image format - II • It is possible to use more than 256 colors, up to 16-bit grayscale and 48-bit color • GIF uses one special pixel value to indicate transparency; PNG allows 256 different transparency values per pixel, so a picture can fade progressively into the background • It seems inevitable that PNG will gradually take over the role of standard lossless image format for the WWW, replacing GIF

  22. Continuous-tone images: why lossy compression? • Digital images are already an approximation of the real analog phenomenon • Lossy techniques achieve very good compression with a modest loss of detail • This is useful for storing and transmitting images

  23. Continuous-tone images: lossy compression • JPEG • JPEG2000: a new image coding system that uses state-of-the-art compression techniques based on wavelet technology; file extension .jp2 • At high compression and equal file size, the perceived quality of JPEG2000 images is better than that of JPEG images

  24. JPEG format - I • JPEG is a standard defined by the Joint Photographic Experts Group in 1992 • It was conceived to transmit images at 64 kbit/s • It has a lossy mode and a lossless mode (little used, and today replaced by the JPEG-LS standard) • In lossy mode it gives very good quality at about 1 bit/pixel • Implementation complexity is reasonable

  25. JPEG format - II • It can be used with graylevel and color images • Each channel of the color space (RGB, YUV, ...) is treated separately • It allows progressive transmission, which is much better suited to the WWW than raster transmission

  26. Raster vs. progressive transmission

  27. JPEG Coder - I • [Block diagram: Discrete Cosine Transform → Quantization → Binary Encoder]

  28. JPEG Coder - II • The image is divided into 8x8-pixel squares • Preprocessing • The Discrete Cosine Transform is applied to each square • Coefficient quantization • Bit stream encoding

  29. Preprocessing: color space transformation & downsampling • From RGB to YUV • The Y component represents the brightness of a pixel, and the U and V components together represent the hue and saturation • The human eye sees more detail in the Y component than in U and V, so U and V can be compressed more aggressively • 4:4:4: no downsampling • 4:2:2: horizontal downsampling by a factor of 2 • 4:2:0: both horizontal and vertical downsampling by a factor of 2
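Both steps can be sketched as follows. The conversion uses the full-range YCbCr coefficients commonly used in JPEG files (what the slide loosely calls YUV); these coefficients, and the choice of averaging 2x2 blocks for 4:2:0, are assumptions of this sketch rather than something stated on the slide.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr), full range 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))

def downsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    Assumes a list of equal-length rows with even width and height.
    """
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1] + 2) // 4
            for c in range(0, len(plane[0]), 2)
        ]
        for r in range(0, len(plane), 2)
    ]

print(rgb_to_ycbcr(255, 255, 255))   # (255, 128, 128)
print(rgb_to_ycbcr(255, 0, 0))       # (76, 85, 255)
print(downsample_420([[100, 102], [104, 106]]))   # [[103]]
```

With 4:2:0 each chroma plane keeps one sample per four pixels, so the three planes together hold half as many samples as the original RGB image.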

  30. Discrete Cosine Transform - I • The discrete cosine transform (DCT) is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using only real numbers • It is used in JPEG because it is fast and fairly easy to implement efficiently

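  31. Discrete Cosine Transform - II • \[ B(k_1,k_2) = \frac{2}{N}\, C(k_1)\, C(k_2) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} A(i,j) \cos\frac{(2i+1)\,k_1\pi}{2N} \cos\frac{(2j+1)\,k_2\pi}{2N} \] with \(C(0) = 1/\sqrt{2}\) and \(C(k) = 1\) for \(k > 0\) (the two-dimensional DCT in the normalization of the JPEG standard; the symbols N, B, k_1 and k_2 are reconstructed), where • the block is N x N pixels (in JPEG, 8x8) • A(i,j) is the value of the pixel at position (i,j) • B(k_1,k_2) is the DCT coefficient at position (k_1,k_2) • low values of k_1 correspond to low vertical frequencies, low values of k_2 to low horizontal frequencies • Generally the coefficients of the higher frequencies have very low values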
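A direct, unoptimized sketch of the two-dimensional DCT of an N x N block follows. It assumes the normalization of the JPEG standard, a factor 2/N with an extra 1/sqrt(2) for each zero index, and it is written for clarity rather than speed; real coders use fast factorizations.

```python
import math

def dct_2d(block):
    """2-D DCT-II of an N x N block, JPEG-style normalization."""
    n = len(block)

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * n for _ in range(n)]
    for k1 in range(n):            # vertical frequency index
        for k2 in range(n):        # horizontal frequency index
            s = 0.0
            for i in range(n):
                for j in range(n):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * k1 * math.pi / (2 * n))
                          * math.cos((2 * j + 1) * k2 * math.pi / (2 * n)))
            out[k1][k2] = (2 / n) * c(k1) * c(k2) * s
    return out

flat = [[100] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0]))   # 800: a flat block has only the (0,0) term
```

A perfectly uniform block puts all of its energy into the (0,0) coefficient, which illustrates why smooth image regions yield very small high-frequency coefficients.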

  32. Discrete Cosine Transform - III • [Figure: the DCT basis functions] • Each 8x8 square is represented by 64 coefficients

  33. Discrete Cosine Transform - IV • Knowing the 64 DCT coefficients with infinite precision, it is possible to reconstruct the pixels of the square exactly • But in practice precision is finite and the coefficients are always quantized • Some high-frequency coefficients are not transmitted at all. This allows higher compression without sacrificing too much quality, as the human eye is less sensitive to high frequencies

  34. Quantization - I • Each component of the DCT matrix is scaled separately, by dividing it by its own factor • The factor for each component was chosen according to human sensitivity to changes at that frequency • In practice the matrix of factors is usually a standard one [matrix shown on the original slide]

  35. Quantization - II • Next, all values are rounded to the nearest integer • This produces a large number of 0s in the high-frequency zone, where the factors are larger
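Quantization and its inverse can be sketched as an element-wise divide-and-round. The table below is the example luminance table from Annex K of the JPEG standard; using it here is an assumption, since the matrix shown on the slide is not part of this transcript.

```python
# Example luminance quantization table (JPEG standard, Annex K).
# Assumed for illustration; the slide's own matrix is not available.
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize(coeffs, table=Q_LUMA):
    """Divide each coefficient by its factor and round to an integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize(levels, table=Q_LUMA):
    """Decoder side: multiply back; the rounding error is not recovered."""
    return [[v * q for v, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = 800.0, 30.0, 30.0
levels = quantize(coeffs)
restored = dequantize(levels)
print(levels[0][0], levels[0][1], levels[7][7])        # 50 3 0
print(restored[0][0], restored[0][1], restored[7][7])  # 800 33 0
```

The same value 30 survives as 3 at a low frequency but becomes 0 at the highest frequency, where the factor is 99.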

  36. Zig-zag scan • Low-frequency coefficients are transmitted before higher-frequency coefficients • This allows progressive display of each 8x8 block
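The zig-zag order visits the anti-diagonals of the block one after another, alternating direction. A minimal sketch that generates the (row, column) sequence:

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):            # s = row + col, one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)         # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order

order = zigzag_order()
print(order[:6])   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Because the quantized high-frequency coefficients are mostly 0, this order tends to end each block with a long run of zeros, which the binary coder can represent compactly.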

  37. Raster vs. progressive transmission • Raster transmission: the DCT coefficients of the upper-left block, then those of all the other blocks in the upper part of the image, and so on • Progressive transmission: first all the (0,0) coefficients, then all the (0,1) coefficients, and so on, following the zig-zag scan in each block

  38. Binary coding • DCT(0,0) usually varies slowly from one block to the next, as it is proportional to the mean value of the block • For this reason it is convenient to encode its difference from the value in the previous block • Typically the bit stream is coded with Huffman codes • Arithmetic coding can be used instead, gaining some compression at the cost of decoding speed • Huffman tables are either predefined, or optimal tables can be built and inserted in the stream
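The differential coding of the (0,0) coefficient can be sketched as below. Starting the prediction from 0 for the first block is an assumption of this sketch, and the Huffman coding of the resulting differences is left out.

```python
from itertools import accumulate

def dc_differences(dc_values):
    """Replace each DC value by its difference from the previous block's DC."""
    prev = 0                      # assumed initial prediction
    diffs = []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

def dc_restore(diffs):
    """Decoder side: a running sum gives the DC values back exactly."""
    return list(accumulate(diffs))

dcs = [800, 804, 801, 801, 790]
diffs = dc_differences(dcs)
print(diffs)               # [800, 4, -3, 0, -11]
print(dc_restore(diffs))   # [800, 804, 801, 801, 790]
```

After the first block the differences are small numbers, which get short codewords.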

  39. JPEG Decoder • [Block diagram: Binary Decoder → Dequantization → Inverse DCT] • Some values are lost: quality is good, but the reconstruction is not exact

  40. JPEG performance - I

  41. JPEG performance - II • [Figure: the same image shown as the original and at quality factors 75, 20 and 3]
