
Tong Zhang ( zhant@rpi )

ECSE-6640 Digital Picture Processing Prof. George Nagy (nagy@ecse.rpi.edu) Image and Document Formats & Conversion. Tong Zhang ( zhant@rpi.edu ). Coverage. Image representations with different formats ( meta-information ) GIF, BMP, JPEG, TIFF, PBM, PGM, PPM, PS, EPS


Presentation Transcript


  1. ECSE-6640 Digital Picture Processing. Prof. George Nagy (nagy@ecse.rpi.edu). Image and Document Formats & Conversion. Tong Zhang (zhant@rpi.edu)

  2. Coverage • Image representations with different formats (meta-information): GIF, BMP, JPEG, TIFF, PBM, PGM, PPM, PS, EPS • Document representations that combine images, text, music, video, etc.: LaTeX, SGML, XML, HTML, PDF (Portable Document Format), ODA (Office Document Architecture)

  3. Outline • Why discuss this topic? • What are the categories? • What are the common types of image formats? • What are the tools to view and manipulate them? • Which one should I pick to use? • How to guess an image format? • Document formats

  4. 1. Why discuss this topic? • Understand the advantages and disadvantages of various image formats • Know the available tools • Be aware of what’s going on when using these tools • Choose the appropriate image format for your own work

  5. 2. What are the categories? One categorization: • Raster Image Formats • Vector Image Formats Another categorization: • Binary Image Formats • ASCII Image Formats

  6. 2.1 Raster Image Formats • Break the image into a series of color dots called “pixels” • The number of bits at each pixel determines the maximum number of colors: 1 bit = 2 (2^1) colors; 2 bits = 4 (2^2) colors; 4 bits = 16 (2^4) colors; 8 bits = 256 (2^8) colors; 16 bits = 65,536 (2^16) colors; 24 bits = 16,777,216 (2^24) colors!
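The bit-depth rule above is simply 2 raised to the number of bits per pixel. A minimal sketch of that arithmetic follows; the function name `max_colors` is made up for illustration and does not come from the slides.

```python
def max_colors(bits_per_pixel: int) -> int:
    """Number of distinct values an n-bit pixel can encode (2^n)."""
    return 2 ** bits_per_pixel


# Reproduce the table from the slide.
for bits in (1, 2, 4, 8, 16, 24):
    print(f"{bits:2d} bits = {max_colors(bits):,} colors")
```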

  7. Instead …

  8. 2.2 Vector Image Formats • Break the image into a set of mathematical descriptions of shapes: curve, arc, rectangle, sphere etc. • Resolution-independent: scalable without the problem of “pixelating”. • Not all images are easily described in a mathematical form. How to describe a photograph?

  9. 2.3 Comparison • Raster: resolution-dependent; suitable for photographs; smooth tones and subtle details; larger size • Vector: resolution-independent; suitable for line drawings, CAD, logos; smooth curves; smaller size

  10. 3. What are the common types of image formats? • Raster: GIF (Graphics Interchange Format), BMP (Bitmap), JPEG, TIFF, PBM (Portable Bit Map – bilevel), PGM (Portable Gray Map – grayscale), PPM (Portable Pixel Map – color), PNM (Portable Any Map – any of the three), PCD (Photo CD), PNG (Portable Network Graphics), etc. • Vector: PS (PostScript), EPS (Encapsulated PostScript), CDR (CorelDRAW), WMF (Windows Metafile), SVG (Scalable Vector Graphics), etc.

  11. 3.1 CompuServe GIF – Graphics Interchange Format • First standardized in 1987 by CompuServe (called GIF87a) • Updated in 1989 to add transparency and animation control (called GIF89a) • Uses the LZW (Lempel-Ziv-Welch) algorithm for compression • A maximum of 256 colors, so it doesn’t work well for photographs • Suitable for small images such as icons • Simple animations • Interlacing vs. non-interlacing

  12. 3.2 Bitmaps • Can create great images with 24 or even 32 bits per pixel • File size is large: for example, a bitmap image of size 1024x768 with 24 bits per pixel needs at least 1024x768x3 = 2,359,296 bytes (about 2.25 MB) • How to reduce size? Run Length Encoding (RLE) – lossless • What about even smaller size? Lossy encoding such as JPEG.
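The size estimate and the idea behind run-length encoding can be sketched as below. This is a generic (count, value) RLE written for illustration, not the exact RLE variant stored inside BMP files, and the function names are assumptions rather than anything from the slides. The size figure covers pixel data only and ignores file headers.

```python
def raw_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed pixel data size, ignoring headers and row padding."""
    return width * height * bits_per_pixel // 8


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse runs of equal bytes into (count, value) pairs, count <= 255."""
    runs: list[tuple[int, int]] = []
    for value in data:
        if runs and runs[-1][1] == value and runs[-1][0] < 255:
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            runs.append((1, value))
    return runs


def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand (count, value) pairs back into the original bytes (lossless)."""
    return b"".join(bytes([value]) * count for count, value in runs)


print(raw_size_bytes(1024, 768, 24))          # pixel bytes for the slide's example
row = b"\x00" * 5 + b"\xff" * 3               # a row with two flat runs
print(rle_encode(row))
```

RLE only pays off when the image has long runs of identical pixels (line art, screenshots); on noisy photographs the (count, value) pairs can take more space than the raw data.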

  13. 3.3 JPEG (Joint Photographic Experts Group) • Lossy encoding • Like interlaced GIFs, there are progressive JPEGs

  14. 3.4 TIFF (Tagged Image File Format) • Tag-based image format • Originated in 1986 at Aldus Corp. (PageMaker); the latest version is 6.0 • Developed by Aldus and Microsoft • Platform-independent • Mostly used by scanners and desktop publishing • http://www.libtiff.org/ for a TIFF library • Supports several compression schemes: CCITT Fax 3 & 4, LZW, JPEG, etc. • Supports multiple color spaces: Grayscale, RGB, YCbCr, CMYK, etc.

  15. Some details • File Header: byte order (2 bytes): MM or II; version (2 bytes): 42 (deep philosophical reason!); pointer to first IFD (4 bytes) • IFD (Image File Directory): pointer count (2 bytes); tagged pointer 0 (12 bytes); tagged pointer 1 (12 bytes); …; pointer to next IFD (4 bytes; 0000 if none)

  16. Some details - continued • Tagged pointer (12 bytes) • Tag code (2 bytes): listed in the specs • Type of data (2 bytes): 1 (BYTE), 2 (ASCII), 3 (SHORT), 4 (LONG), 5 (RATIONAL) • Length (4 bytes): number of values • Data pointer or data field (4 bytes)
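The header and IFD layout described on these two slides can be read with a few `struct` calls. The sketch below is illustrative only: the function names are made up, the sample bytes are hand-built rather than taken from a real file, and the last 4 bytes of each entry are returned raw (whether they hold the data itself or a pointer depends on the type and length).

```python
import struct


def parse_tiff_header(data: bytes) -> tuple[str, int]:
    """Return (struct byte-order prefix, offset of first IFD)."""
    order = data[:2]
    if order == b"II":        # little-endian ("Intel")
        endian = "<"
    elif order == b"MM":      # big-endian ("Motorola")
        endian = ">"
    else:
        raise ValueError("not a TIFF byte-order mark")
    version, first_ifd = struct.unpack(endian + "HI", data[2:8])
    if version != 42:
        raise ValueError("TIFF version field must be 42")
    return endian, first_ifd


def read_ifd(data: bytes, endian: str, offset: int):
    """Return ([(tag, type, length, raw_value_or_pointer), ...], next_ifd)."""
    (count,) = struct.unpack_from(endian + "H", data, offset)
    entries = []
    for i in range(count):
        # Each tagged pointer is 12 bytes: 2 + 2 + 4 + 4.
        entries.append(
            struct.unpack_from(endian + "HHII", data, offset + 2 + 12 * i)
        )
    (next_ifd,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)
    return entries, next_ifd


# Hand-built sample: header, then one IFD with a single entry
# (tag 256 = ImageWidth, type 3 = SHORT, length 1, value 640).
sample = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 1)
    + struct.pack("<HHII", 256, 3, 1, 640)
    + struct.pack("<I", 0)
)
endian, first_ifd = parse_tiff_header(sample)
print(read_ifd(sample, endian, first_ifd))
```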

  17. 3.5 PBM, PGM, PPM (Portable Bit Map, Portable Gray Map, Portable Pixel Map) • ASCII / Binary format • Easy to edit • Magic numbers (ASCII/binary): a.pbm: P1/P4; a.pgm: P2/P5; a.ppm: P3/P6
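Because the ASCII variants are plain text, a grayscale image can be produced with nothing more than string formatting. The following is a minimal sketch of an ASCII PGM (P2) writer; the function name and the tiny 2x2 image are made up for illustration.

```python
# Magic numbers from the slide: (ASCII, binary) per format.
PNM_MAGIC = {"pbm": ("P1", "P4"), "pgm": ("P2", "P5"), "ppm": ("P3", "P6")}


def ascii_pgm(rows: list[list[int]], max_value: int = 255) -> str:
    """Serialize a grayscale image (list of pixel rows) as ASCII PGM text."""
    height = len(rows)
    width = len(rows[0])
    lines = [PNM_MAGIC["pgm"][0], f"{width} {height}", str(max_value)]
    lines += [" ".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


image = [[0, 128], [255, 64]]
print(ascii_pgm(image), end="")
```

The resulting text can be saved as `a.pgm` and opened or edited in any text editor, which is what makes these formats convenient for small experiments.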

  18. 3.6 PS (PostScript) • A programming language from Adobe for printing graphics and text (stack based, interpreted language using RPN – Reverse Polish Notation) • A page description language that is device-independent (introduced in 1985 by Adobe) • Different levels: Level 1, 2, 3 • Change coordinate system, scaling, translation, rotation, filling, clipping, etc. • Main unit: point (1/72 of an inch)

  19. 3.7 EPS (Encapsulated PostScript) • A PostScript file with additional rules • For putting PostScript in a document • Essential information: what is the size of the image
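In practice the image size is carried by a `%%BoundingBox` comment, given in points (1/72 of an inch). The sketch below builds a minimal EPS file that draws the outline of its own bounding box; the helper names and the 2 in x 1 in size are assumptions for illustration only.

```python
def inches_to_points(inches: float) -> int:
    """PostScript's main unit: 1 point = 1/72 inch."""
    return round(72 * inches)


def minimal_eps(width_pt: int, height_pt: int) -> str:
    """Return EPS text whose bounding box is width_pt x height_pt points."""
    w, h = width_pt, height_pt
    return "\n".join([
        "%!PS-Adobe-3.0 EPSF-3.0",
        f"%%BoundingBox: 0 0 {w} {h}",   # lower-left x y, upper-right x y
        "newpath",
        f"0 0 moveto {w} 0 lineto {w} {h} lineto 0 {h} lineto closepath",
        "stroke",
        "%%EOF",
    ]) + "\n"


eps = minimal_eps(inches_to_points(2), inches_to_points(1))
print(eps, end="")
```

The drawing commands are written in RPN: operands come first (`0 0`), then the operator (`moveto`).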

  20. 3.8 SVG (Scalable Vector Graphics) • A language for describing 2D graphics and applications in XML • SVG specification and current implementations: http://www.w3.org/TR/SVG/ • Adobe SVG Viewer http://www.adobe.com/svg/main.html
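Since SVG is XML, a drawing can be generated with any XML library. The sketch below uses Python's standard `xml.etree.ElementTree` to build a one-circle image; the sizes and colors are arbitrary illustration values.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Root element: the viewBox defines the coordinate system, so the image
# can be rendered at any width/height without pixelating.
svg = ET.Element("svg", {
    "xmlns": SVG_NS,
    "width": "200",
    "height": "200",
    "viewBox": "0 0 100 100",
})
# One shape described mathematically: center and radius, not pixels.
ET.SubElement(svg, "circle", {
    "cx": "50", "cy": "50", "r": "40",
    "fill": "none", "stroke": "black",
})

svg_text = ET.tostring(svg, encoding="unicode")
print(svg_text)
```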

  21. 4. What are the tools to view and manipulate them? • Use image editors • For raster images: Adobe Photoshop, Paint Shop Pro, xv • For vector images: Adobe Freehand, Adobe Illustrator, ghostview, xfig • For conversion between different image formats: ImageMagick (free, available on multiple platforms)

  22. 5. Which one should I use? • No unique answer • A small image such as an icon, a grayscale image – GIF • A large image, a photograph, an image with many colors – JPEG • Scalability required – PS, EPS

  23. 6. How to guess an image format? • Image magic words at the start of the file: GIF: “GIF” (followed by 87a or 89a); TIFF: “II” or “MM”; BMP: “BM”; JPEG: FF D8 (hexadecimal) – Start of Image marker (SOI); PS: %!PS-Adobe-3.0; EPS: %!PS-Adobe-3.0 EPSF-3.0
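The magic words translate directly into a prefix check on the first few bytes of a file. The function below is a rough sketch (its name and return strings are made up); it also checks the TIFF version number 42 after the byte-order mark and the P1 to P6 magic numbers of the PBM/PGM/PPM family.

```python
def guess_image_format(header: bytes) -> str:
    """Guess a format from the first bytes of a file; 'unknown' if no match."""
    if header.startswith((b"GIF87a", b"GIF89a")):
        return "GIF"
    # Byte-order mark followed by 42 in that byte order.
    if header.startswith((b"II*\x00", b"MM\x00*")):
        return "TIFF"
    if header.startswith(b"BM"):
        return "BMP"
    if header.startswith(b"\xff\xd8"):          # SOI marker
        return "JPEG"
    if header.startswith(b"%!PS-Adobe"):
        first_line = header.split(b"\n", 1)[0]
        return "EPS" if b"EPSF" in first_line else "PS"
    if header[:2] in (b"P1", b"P2", b"P3", b"P4", b"P5", b"P6"):
        return "PNM"
    return "unknown"


print(guess_image_format(b"GIF89a\x01\x00\x01\x00"))
print(guess_image_format(b"%!PS-Adobe-3.0 EPSF-3.0\n%%BoundingBox: 0 0 10 10\n"))
```

To use it on a real file, read a short prefix, for example `open(path, "rb").read(32)`, and pass it in. A short prefix check like this is only a guess: a two-byte magic word such as “BM” can also occur at the start of unrelated files.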

  24. 7. Document Representations • Text: ASCII, Unicode • Page composition languages: word processing (WYSIWYG), RTF; typesetting (TeX, FrameMaker) • Document Interchange Formats: DIF (Document Interchange Format), SGML, ODA (Office Document Architecture) • Presentation Formats: HTML, PDF

  25. 7.1 TeX & LaTeX • A high-quality typesetting system • Designed to produce technical and scientific documentation • Based on Donald E. Knuth's TeX typesetting language • First developed in 1985 by Leslie Lamport • Cross-platform • Useful if you are writing your thesis!

  26. 7.2 DIF (Document Interchange Format) • Text only, no graphics or complex structures • ASCII stream of text and instructions (prefixed by ESC)

  27. 7.3 Office Document Architecture (ODA) • A market code standard by ISO • For actual coding, it has a companion called Office Document Interchange Format (ODIF) • Describes the logical structure and layouts.

  28. 7.4 Standard Generalized Markup Language (SGML) • “Meta” language: used to define markup languages • Established by the International Organization for Standardization (ISO) in 1986 • SGML is not a markup standard, but a framework for devising such a standard • http://xml.coverpages.org/sgml.html

  29. 7.5 Hypertext Markup Language (HTML) • An application of SGML • An HTML file is in ASCII • Has standard codes • Can be edited by a simple text editor, but dedicated authoring tools are usually much more convenient

  30. 7.6 Portable Document Format (PDF) • Adobe’s de facto standard for secure and reliable distribution and exchange of electronic documents • Can embed fonts, images, graphics, forms, controls, layouts, media, etc. • Supports searching, hyperlinks, and digital signatures • Application and platform independent • http://partners.adobe.com/asn/tech/pdf/index.jsp

  31. 7.7 RTF (Rich Text Format) • Developed by Microsoft • Text & graphics • Uses ANSI, PC-8, Macintosh, or IBM PC character sets • Currently the documents can be transferred between Windows and Macs
