Lecture 4

  1. ITEC 1000 “Introduction to Information Technology” Lecture 4 Data Formats

  2. Lecture Template: • Data Forms • Data conversion and representation • Data Formats • Alphanumeric Data • Image Data • Audio Data • Data Input • Data Compression • Internal Computer Data Format

  3. Data Forms • Human communication • Includes language, images and sounds • Computers • Process and store all forms of data in binary format • Conversion to computer-usable representation using data formats • Define the different ways human data may be represented, stored and processed by a computer

  4. Data conversion and representation

  5. Data formats • Proprietary formats • Unique to a product or company • E.g., Microsoft Word, WordPerfect • Standards (evolve in two ways): • Proprietary formats become de facto standards (e.g., Adobe PostScript) • Invented by an international standards organization (e.g., Moving Picture Experts Group, MPEG)

  6. Common Data Representations

  7. Alphanumeric Data • Characters (r, T), number digits (0..9), punctuation (!, ;), special purpose characters ($, &) • Four codes/standards to represent letters and numbers: • BCD (Binary-Coded Decimal) • Unicode • ASCII (American Standard Code for Information Interchange) • EBCDIC (Extended Binary Coded Decimal Interchange Code)

  8. Standard Alphanumeric Formats • BCD • ASCII • EBCDIC • Unicode • Next 2 slides: BCD

  9. Binary-Coded Decimal (BCD) • Four bits per digit • Note: the following six bit patterns are not used: 1010, 1011, 1100, 1101, 1110, 1111

  10. BCD: Example • 7093₁₀ = ? (in BCD) • 7 → 0111, 0 → 0000, 9 → 1001, 3 → 0011 • Result: 0111 0000 1001 0011
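The digit-by-digit conversion on this slide can be sketched in a few lines of Python. The function name `to_bcd` is an illustrative choice, not something from the lecture, and the sketch assumes a non-negative integer.

```python
def to_bcd(number: int) -> str:
    """Encode a non-negative integer as BCD: one 4-bit group per decimal digit."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # Each decimal digit is converted independently to a 4-bit pattern.
    return " ".join(format(int(digit), "04b") for digit in str(number))


print(to_bcd(7093))  # 0111 0000 1001 0011
```

Because each digit is encoded on its own, the patterns 1010 through 1111 never appear in the output.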

  11. Standard Alphanumeric Formats • BCD • ASCII • EBCDIC • Unicode • Next 13 slides: ASCII

  12. ASCII Features • Developed by ANSI (American National Standards Institute) • Defined in ANSI document X3.4-1977 • 7-bit code • 8th bit is unused (or used for a parity bit or to indicate an “extended” character set) • 2^7 = 128 different codes • Two general types of codes: • 95 are “Printing” codes (displayable on a console) • 33 are “Control” codes (control features of the console or communications channel) • Represents • Latin alphabet, Arabic numerals, standard punctuation characters • Plus a small set of accents and other European special characters (Latin-1, an 8-bit extension of ASCII)
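As a rough check of the counts above, and of one possible use of the 8th bit, the following Python sketch counts printing and control codes and attaches an even-parity bit. The helper name `with_even_parity` is hypothetical, and even parity is only one convention; the slide does not say which one is meant.

```python
def with_even_parity(ch: str) -> int:
    """Return the 8-bit value of a 7-bit ASCII character with an even-parity bit on top."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("not a 7-bit ASCII character")
    # The parity bit is 1 when the 7-bit code contains an odd number of 1s,
    # so the full 8-bit value always has an even number of 1s.
    parity = bin(code).count("1") % 2
    return (parity << 7) | code


# Printing codes run from space (0x20) to tilde (0x7E); the rest are control codes.
printing = [c for c in range(128) if 0x20 <= c <= 0x7E]
control = [c for c in range(128) if c < 0x20 or c == 0x7F]

print(len(printing), len(control))           # 95 33
print(format(with_even_parity("a"), "08b"))  # 11100001
```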

  13. ASCII Table

  14. ASCII Table (columns labelled from most significant bit to least significant bit)

  15. ASCII Table e.g., ‘a’ = 1100001

  16. ASCII Table 95 Printing codes

  17. ASCII Table 33 Control codes

  18. ASCII Table Alphabetic codes

  19. ASCII Table Numeric codes

  20. ASCII Table Punctuation, etc.

  21. ASCII Table • 74₁₆ = 111 0100 (‘t’)

  22. Example: “Hello, world”
      Char     Binary   Hex  Decimal
      H        1001000  48   72
      e        1100101  65   101
      l        1101100  6C   108
      l        1101100  6C   108
      o        1101111  6F   111
      ,        0101100  2C   44
      (space)  0100000  20   32
      w        1110111  77   119
      o        1101111  6F   111
      r        1110010  72   114
      l        1101100  6C   108
      d        1100100  64   100
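The same three views of each character (binary, hexadecimal, decimal) can be produced with Python's built-in `ord` and `format`. This is only an illustrative sketch; `ascii_table` is a made-up helper name.

```python
def ascii_table(text: str) -> list:
    """Return (character, 7-bit binary, hex, decimal) for each ASCII character in text."""
    rows = []
    for ch in text:
        code = ord(ch)  # numeric ASCII code of the character
        if code > 0x7F:
            raise ValueError(f"{ch!r} is not a 7-bit ASCII character")
        rows.append((ch, format(code, "07b"), format(code, "02X"), code))
    return rows


for row in ascii_table("Hello, world"):
    print(*row)
```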

  23. Common Control Codes (hexadecimal) • CR (0D): carriage return • LF (0A): line feed • HT (09): horizontal tab • DEL (7F): delete • NUL (00): null

  24. ASCII Table: Common Control Codes

  25. Standard Alphanumeric Formats • BCD • ASCII • EBCDIC • Unicode • Next 3 slides: EBCDIC

  26. EBCDIC • 8-bit code • Developed by IBM • IBM and compatible mainframes only • Rarely used today (common in archival data) • Character codes differ from ASCII • Conversion software to/from ASCII available
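To illustrate that the character codes differ and that conversion is mechanical, here is a small sketch using Python's standard codecs. EBCDIC exists in many code pages; the choice of `cp037` (a US/Canada variant) is an assumption made for the example, not something the slide specifies.

```python
text = "DIR"

ascii_bytes = text.encode("ascii")   # ASCII codes
ebcdic_bytes = text.encode("cp037")  # EBCDIC codes (code page 037, assumed here)

print(ascii_bytes.hex())   # 444952
print(ebcdic_bytes.hex())  # c4c9d9

# Converting EBCDIC data back to text, then to ASCII, is a decode followed by an encode.
round_trip = ebcdic_bytes.decode("cp037").encode("ascii")
print(round_trip == ascii_bytes)  # True
```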

  27. EBCDIC Table (1 out of 2)

  28. EBCDIC Table (2 out of 2)

  29. Standard Alphanumeric Formats • BCD • ASCII • EBCDIC • Unicode • Next 2 slides: Unicode

  30. Unicode • Most common 16-bit form represents 65,536 characters • ASCII and Latin-1 are a subset of Unicode • Values 0 to 255 in the Unicode table • Multilingual: defines codes for • Nearly every character-based alphabet • Large set of ideographs for Chinese, Japanese and Korean • Composite characters for vowels and syllabic clusters required by some languages • Allows software to be adapted for local languages
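A short Python sketch of these points: it prints the code point and the two-byte (UTF-16, big-endian) form of three sample characters. The sample characters and the helper name `describe` are illustrative choices only.

```python
def describe(ch: str) -> tuple:
    """Return the Unicode code point label and the 16-bit encoding of one character."""
    code_point = ord(ch)
    return f"U+{code_point:04X}", ch.encode("utf-16-be").hex()


# "A" is ASCII, "\u00e9" (e with acute accent) is Latin-1, "\u4e2d" is a CJK ideograph.
for ch in ["A", "\u00e9", "\u4e2d"]:
    print(describe(ch))

# Characters with values 0 to 255 keep the same number in Latin-1 and in Unicode.
print("\u00e9".encode("latin-1")[0] == ord("\u00e9"))  # True
```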

  31. Two-byte Unicode Assignment Table

  32. Collating Sequence • Collating Sequence – the order of the codes in the representation table • Determines sorting and selection of the alphanumeric data • Collating Sequences are different in ASCII and EBCDIC: • Small letters precede capitals in EBCDIC; reverse in ASCII • Numbers collate first in ASCII; in EBCDIC, last
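The effect of the collating sequence on sorting can be seen by sorting the same items by their ASCII codes and by their EBCDIC codes. As before, `cp037` is an assumed EBCDIC code page chosen for the sketch.

```python
items = ["a", "A", "1"]

# Sort by the byte value each character has in the given code.
ascii_order = sorted(items, key=lambda s: s.encode("ascii"))
ebcdic_order = sorted(items, key=lambda s: s.encode("cp037"))

print(ascii_order)   # ['1', 'A', 'a']  digits, then capitals, then small letters
print(ebcdic_order)  # ['a', 'A', '1']  small letters, then capitals, then digits
```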

  33. Two Classes of Codes • Printing characters • Produce output on the screen or printer • Control characters • Control position of output on screen or printer • Cause action to occur • Communicate status between computer and I/O device

  34. Control Code Definitions (ASCII Table)

  35. Escape Sequences • Extend the capability of the ASCII code set • For controlling terminals and formatting output • Defined by ANSI in documents X3.41-1974 and X3.64-1977 • The escape code is ESC = 1B₁₆ • An escape sequence begins with two codes: ESC (1B₁₆) followed by [ (5B₁₆)

  36. Escape Sequences: Examples • Erase display: ESC [ 2 J • Erase line: ESC [ K
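The two examples can be written out byte by byte. The sketch below only builds the sequences and shows their hexadecimal codes; writing them to a terminal that understands ANSI escape sequences would actually erase the display or the line. The constant names are illustrative.

```python
ESC = "\x1b"     # escape code, 1B hexadecimal
CSI = ESC + "["  # the two codes that begin an escape sequence: 1B 5B

ERASE_DISPLAY = CSI + "2J"  # ESC [ 2 J
ERASE_LINE = CSI + "K"      # ESC [ K

print(ERASE_DISPLAY.encode("ascii").hex(" "))  # 1b 5b 32 4a
print(ERASE_LINE.encode("ascii").hex(" "))     # 1b 5b 4b
```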

  37. Alphanumeric Input: Keyboard • Scan code • Two different binary scan codes generated: one when the key is struck and one when it is released • Converted to Unicode, ASCII or EBCDIC by software in the terminal or PC • Received by the host as a stream of text and other characters, i.e. in the sequence typed • Advantage • Easily adapted to different languages or keyboard layouts • Separate scan codes for key press/release allow multiple-key combinations • Examples: shift and control keys

  38. Shift Key • Inhibits (clears) bit 5 in the ASCII code, counting bits from 0 at the least significant end • Example: a = 1100001; Shift + a = 1000001 (A)

  39. Control Key • Inhibits (clears) bits 5 and 6 in the ASCII code • Example: c = 1100011; Ctrl + c = 0000011 (control code ETX)
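The bit manipulation described on the Shift and Control slides can be sketched with two masks. Bits are numbered from 0 at the least significant end, and the sketch applies only to letters; real keyboards handle shifted digits and punctuation through lookup tables. The names `shift` and `ctrl` are hypothetical.

```python
SHIFT_MASK = 0b1011111  # keeps every bit except bit 5
CTRL_MASK = 0b0011111   # keeps every bit except bits 5 and 6


def shift(ch: str) -> str:
    """Clear bit 5 of a lowercase letter's ASCII code, giving the capital letter."""
    return chr(ord(ch) & SHIFT_MASK)


def ctrl(ch: str) -> int:
    """Clear bits 5 and 6 of a letter's ASCII code, giving a control code."""
    return ord(ch) & CTRL_MASK


print(shift("a"))               # A
print(format(ctrl("c"), "07b"))  # 0000011
```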

  40. Keyboard Input • Three letters are typed: “D”, “I”, “R”, followed by the carriage return • Four scan codes translated to ASCII binary codes: 1000100, 1001001, 1010010, 0001101

  41. OCR (optical character recognition) • Scans text and inputs it as character data • Special OCR software required • Used to read specially encoded characters • Example: magnetically printed check numbers • Attempts to recognize hand-written input (limited, only carefully printed)

  42. Bar Code Readers • Used in applications that require fast, accurate and repetitive input with minimal employee training • Examples: supermarket checkout counters and inventory control • Alphanumeric data in a bar code (e.g., 780471 108801 90000) is read optically using a wand that converts it into electrical binary signals • A bar code translation module converts the binary input into a sequence of number codes, one code per digit, which are then translated to Unicode or ASCII

  43. Other Alphanumeric Input • Magnetic stripe reader: alphanumeric data from credit cards • Voice • Digitized audio recording common but conversion to alphanumeric data difficult • Requires knowledge of sound patterns in a language (phonemes) plus rules for pronunciation, grammar, and syntax

  44. Image Data • Photographs, figures, icons, drawings, charts and graphs • Two approaches: • Bitmap or raster images of photos and paintings with continuous variation (e.g., GIF, JPEG) • Object or vector images composed of graphical shapes like lines and curves defined geometrically • Differences include: • Quality of the image • Storage space required • Time to transmit • Ease of modification

  45. Image Input • Image scanning (moves over the image converting dot by dot into a stream of binary numbers, pixels, representing black or white, or levels of gray, or of a colour): bitmap image • Digital/video cameras: bitmap image • Pointing devices (mouse, pen): object image

  46. Bitmap Images • Each individual pixel (pi(x)cture element) in a graphic stored as a binary number • Pixel: A small area with associated coordinate location • Example: each point below represented by a 4-bit code corresponding to 1 of 16 shades of gray

  47. Bitmap Display • Monochrome: black or white • 1 bit per pixel • Gray scale: black, white or 254 shades of gray • 1 byte per pixel • Color graphics: 16 colors, 256 colors, or 24-bit true color (16.7 million colors) • 4, 8, and 24 bits respectively
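The relationship between bits per pixel and the number of distinct values is simply a power of two, as this small sketch shows. The function name `colors_for_depth` is illustrative.

```python
def colors_for_depth(bits_per_pixel: int) -> int:
    """Number of distinct values a pixel can take with the given number of bits."""
    return 2 ** bits_per_pixel


for bits in (1, 4, 8, 24):
    print(bits, colors_for_depth(bits))

# Gray scale at 1 byte per pixel: black + white + 254 shades of gray = 256 values.
print(colors_for_depth(8) == 254 + 2)  # True
```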

  48. Storing Bitmap Images • Frequently large files • Example: 600 rows of 800 pixels with 1 byte for each of 3 colors gives 1,440,000 bytes (about 1.5 MB) • File size affected by • Resolution (the number of pixels per inch) • Amount of detail affecting clarity and sharpness of an image • Levels: number of bits for displaying shades of gray or multiple colors • Palette: color translation table that uses a code for each pixel rather than the actual color value • Data compression
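The example size, and the saving from a palette, can be worked out directly. This sketch ignores file headers and compression; the 3 bytes per palette entry is an assumption made for the example, and `bitmap_bytes` is a hypothetical helper.

```python
def bitmap_bytes(rows: int, cols: int, bits_per_pixel: int, palette_entries: int = 0) -> int:
    """Uncompressed size in bytes of the pixel data plus an optional RGB palette."""
    pixel_bytes = (rows * cols * bits_per_pixel + 7) // 8  # round up to whole bytes
    palette_bytes = palette_entries * 3                    # assumed 3 bytes (R, G, B) per entry
    return pixel_bytes + palette_bytes


# 600 rows of 800 pixels, 1 byte for each of 3 colors (24 bits per pixel).
print(bitmap_bytes(600, 800, 24))      # 1440000

# The same image with an 8-bit code per pixel and a 256-entry palette.
print(bitmap_bytes(600, 800, 8, 256))  # 480768
```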

  49. GIF (Graphics Interchange Format) • First developed by CompuServe in 1987 • GIF89a enabled animated images • Allows images to be displayed sequentially at fixed time intervals • Color limitation: 256 • Image compressed by the LZW (Lempel-Ziv-Welch) algorithm • Preferred for line drawings, clip art and pictures with large blocks of solid color • Lossless compression
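To show why LZW suits images with large blocks of solid color, here is a minimal textbook LZW sketch in Python. It is not the exact variant used inside GIF files (which packs variable-width codes and uses clear and end-of-information codes); it only illustrates the dictionary-building idea and the lossless round trip. Function names are illustrative.

```python
def lzw_compress(data: bytes) -> list:
    """Compress bytes into a list of integer codes using basic LZW."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate                # keep extending the known run
        else:
            codes.append(table[current])       # emit code for the longest known run
            table[candidate] = next_code       # learn the new, longer run
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the original bytes from LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            entry = previous + previous[:1]    # code refers to the entry being built
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)


row = b"\x00" * 40 + b"\xff" * 40  # one row with two solid blocks of color
codes = lzw_compress(row)
print(len(row), len(codes))
print(lzw_decompress(codes) == row)  # True: nothing is lost
```

Long runs of identical pixel values collapse into a handful of codes, which is why solid-color artwork compresses well while the decompressed data stays identical to the input.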

  50. GIF (Graphics Interchange Format)
