PowerPoint Slideshow about 'Digitizing Discrete Information' - maura


Digitizing Discrete Information
  • Digitize
    • Represent info with digits (symbols)
    • Digits: { 0, 1, 2, …, 9 }
    • Or digits: { A, B, C, …, Z }
    • Or any set of distinct symbols
Symbols, Briefly
  • Prefer short names for symbols
    • One, two, …,
    • Instead of “asterisk”, “closing parenthesis”, etc.
  • Aside: we shorten many names in IT
    • exclamation point => bang
    • asterisk => star
    • open parenthesis => open paren
    • open curly brace => open brace
Ordering Symbols
  • Want order for the digits/symbols
    • 0 – 9 has obvious order
    • But what about { !, @, #, …, ) }?
      • Define a collating sequence
  • Digitize
    • Represent info with symbols
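A collating sequence can be sketched as a lookup from symbol to position. The particular order below (the symbols above the digit keys on a US keyboard) and the names `COLLATING_SEQUENCE`, `RANK`, and `sort_symbols` are assumptions for illustration; any agreed-upon order would serve.

```python
# Hypothetical collating sequence: any agreed-upon order would do.
COLLATING_SEQUENCE = "!@#$%^&*()"

# Map each symbol to its position in the sequence.
RANK = {symbol: position for position, symbol in enumerate(COLLATING_SEQUENCE)}


def sort_symbols(symbols):
    """Sort symbols according to the collating sequence."""
    return sorted(symbols, key=RANK.__getitem__)


print(sort_symbols(["(", "#", "!", "&"]))
```

Once the sequence is fixed, "less than" for symbols is just a comparison of positions, exactly as it is for 0 – 9.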
Fundamental Information Representation
  • Given digital info, how to store it?
    • Use physical phenomena
      • Light
      • Current
      • Magnetism
Fundamental Information Representation
  • In digital world
    • Don’t care how much, just presence
  • In logical world (basis of computing)
    • True and false
Fundamental Information Representation
  • Physical world can implement logical world
    • Presence => “true”
    • Absence => “false”
The PandA Representation
  • We will use “PandA” for presence and absence representation
    • Only two states
    • Could use false for absent, true for present
    • Or 0 for absent, and 1 for present
The PandA Representation
  • Such a formulation is said to be discrete
  • Discrete means “distinct” or “separable”
    • Opposite of continuous
    • No “shades of gray”
Analog vs. Digital

Analog is continuous data/information

Sound waves

Analog vs. Digital

Digital is discrete info

Obtained by sampling
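Sampling can be sketched as measuring a continuous signal at evenly spaced instants. The 1 Hz sine wave, the 8 samples-per-second rate, and the names `sample` and `tone` are assumptions chosen only to keep the output short.

```python
import math


def sample(signal, duration_s, rate_hz):
    """Measure a continuous signal at evenly spaced instants."""
    count = int(duration_s * rate_hz)
    return [signal(i / rate_hz) for i in range(count)]


def tone(t):
    """A 1 Hz sine wave standing in for an analog sound wave."""
    return math.sin(2 * math.pi * t)


samples = sample(tone, duration_s=1.0, rate_hz=8)
print([round(s, 2) for s in samples])
```

The continuous wave has a value at every instant; the sampled version keeps only a finite list of measurements, which is what makes it discrete.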

A Binary System
  • PandA encoding is binary
Bits Form Symbols
  • PandA unit is a binary digit (bit)
  • Bit sequences form binary numbers
Encoding Bits on a CD-ROM

PandA bit values are pits and lands

Bits in Computer Memory
  • Memory is a long sequence of bits
  • Sidewalk Analogy
Sidewalk Memory
  • Imagine clean sidewalk consisting of squares
    • Presence of a stone on a square => 1
    • Absence of a stone => 0
  • Sidewalk: sequence of bits
Sidewalk Memory
  • Writing info
    • Put stone on square (1)
    • Remove stone from square (0)
  • Reading info
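The sidewalk analogy can be sketched as a list of squares. The `Sidewalk` class and its method names are hypothetical, and reading is assumed to mean simply looking at whether a stone is present.

```python
class Sidewalk:
    """A row of squares; a stone on a square means 1, no stone means 0."""

    def __init__(self, squares):
        self.stones = [False] * squares  # clean sidewalk: no stones

    def write(self, square, bit):
        # Put a stone down for 1, remove it for 0.
        self.stones[square] = bool(bit)

    def read(self, square):
        # Look at the square: stone present => 1, absent => 0.
        return 1 if self.stones[square] else 0

    def bits(self):
        """Return the whole sidewalk as a string of 0s and 1s."""
        return "".join(str(self.read(i)) for i in range(len(self.stones)))


walk = Sidewalk(8)
for square in (1, 4, 6):
    walk.write(square, 1)  # put stones on three squares
walk.write(4, 0)           # remove one of them again
print(walk.bits())
```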
Alternative PandA Encodings
  • Other ways to encode two states
    • Color of stone
    • Number of stones
    • Another?
Combining Bit Patterns
  • One bit with two states isn’t enough
  • So we combine them
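A short sketch of what combining buys: every extra bit doubles the number of distinct patterns. The helper name `bit_patterns` is an assumption.

```python
from itertools import product


def bit_patterns(n):
    """Return every distinct sequence of n bits, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


for n in (1, 2, 3):
    patterns = bit_patterns(n)
    print(n, len(patterns), patterns)
```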
Hex Explained
  • Hex numbers are base-16
  • A bit sequence may be

1111111110011000111000101010

    • Error prone
    • Instead use hex
The 16 Hex Digits
  • Hex digits
    • { 0, 1, 2, …, 9, A, B, C, D, E, F }
    • Can represent 4-bit sequences
      • 0000 = 0 hex
      • 0001 = 1 hex
      • 1001 = 9 hex
      • 1010 = A hex
      • 1111 = F hex
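The 4-bits-per-digit correspondence can be sketched as two small conversions. The function names `bits_to_hex` and `hex_to_bits` are assumptions; the long bit sequence is the error-prone one quoted earlier in the deck.

```python
HEX_DIGITS = "0123456789ABCDEF"


def bits_to_hex(bits):
    """Convert a bit string (spaces allowed) to hex, 4 bits per digit."""
    bits = bits.replace(" ", "")
    if len(bits) % 4 != 0:
        raise ValueError("bit count must be a multiple of 4")
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_bits(hex_digits):
    """Convert hex digits to 4-bit groups separated by spaces."""
    digits = hex_digits.replace(" ", "")
    return " ".join(format(int(d, 16), "04b") for d in digits)


print(bits_to_hex("1010"))
print(bits_to_hex("1111111110011000111000101010"))
print(hex_to_bits("F"))
```

Seven hex digits are far easier to read aloud or copy than 28 bits, which is the whole point of the notation.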
Hex to Bits and Back Again
  • Each hex digit corresponds to 4 bits
    • 0010 1011 1010 1101 => 2 B A D
    • F A B 4 => 1111 1010 1011 0100
    • 1 9 C 6 => ?

Digitizing Numbers in Binary
  • Need binary representations for
    • Numbers
    • Characters
  • But also
    • image
    • video
    • sound
Counting in Binary
  • Binary numbers (base 2) use digits 0 and 1
  • Decimal numbers (base 10) use 0 through 9

Counting to ten
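As a small sketch, counting from zero to ten in both bases can be printed side by side. The helper name `count_in_binary` is an assumption.

```python
def count_in_binary(limit):
    """Return the binary numerals for 0 through limit, inclusive."""
    return [format(n, "b") for n in range(limit + 1)]


for decimal, binary in enumerate(count_in_binary(10)):
    print(f"{decimal:>2}  {binary:>4}")
```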

Counting in Binary
  • Place value representation
Place Value in a Decimal Number
  • Example, 1010 (base 10) is

(1 × 1000) + (0 × 100) + (1 × 10) + (0 × 1)

Place Value in a Binary Number
  • Binary is base 2, so powers of 2 are used
Place Value in a Binary Number
  • 1010 in binary
    • (1 × 8) + (0 × 4) + (1 × 2) + (0 × 1)
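The same place-value rule works for any base, which a short sketch makes concrete. The helper names `place_values` and `to_number` are assumptions; the digits 1010 are the example from the slides, read once in base 2 and once in base 10.

```python
def place_values(digits, base):
    """Return (digit, place value) pairs, most significant digit first."""
    top = len(digits) - 1
    return [(int(d, base), base ** (top - i)) for i, d in enumerate(digits)]


def to_number(digits, base):
    """Add up digit x place value for every position."""
    return sum(d * place for d, place in place_values(digits, base))


for digit, place in place_values("1010", 2):
    print(f"{digit} x {place}")
print(to_number("1010", 2))
print(to_number("1010", 10))
```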
Digitizing Text
  • # of bits determines # of symbols that can be represented
    • n bits => 2^n symbols
Digitizing Text
  • To digitize English text
    • Roman letters
    • Arabic numbers
    • Punctuation
    • Arithmetic symbols
Assigning Symbols
  • So we need to represent
    • 26 uppercase
    • 26 lowercase letters
    • 10 numerals
    • 20 punctuation characters
    • 10 arithmetic characters
    • 3 other characters (new line, tab, and backspace)
    • 95 symbols…enough for English
Assigning Symbols
  • To represent 95 distinct symbols we need how many bits?
  • Need to represent control characters too
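One way to work the question out is to keep doubling until the number of patterns covers the symbols. This is a sketch; the name `bits_needed` and the dictionary layout are assumptions, while the counts are the ones listed on the previous slide.

```python
def bits_needed(symbols):
    """Smallest n such that 2**n patterns cover the given symbol count."""
    n = 0
    while 2 ** n < symbols:
        n += 1
    return n


counts = {
    "uppercase": 26,
    "lowercase": 26,
    "numerals": 10,
    "punctuation": 20,
    "arithmetic": 10,
    "other": 3,
}
total = sum(counts.values())
print(total, bits_needed(total), 2 ** bits_needed(total) - total)
```

The last number printed is how many patterns are left over once the printable symbols are assigned, which is the room available for control characters.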
Assigning Symbols
  • ASCII stands for American Standard Code for Information Interchange
    • Widely used 7-bit code
  • Advantages of a “standard”
    • Interoperability of h/w
    • Communications among programs
Extended ASCII: An 8-Bit Code
  • For other languages 7 bits aren’t enough
  • IBM developed an 8-bit ASCII
    • Uses 1 byte
    • Uses 0 in leftmost bit followed by 7-bit ASCII codes
    • Allows 128 more codes that start with 1
    • Can handle most Western languages
ASCII Character Set (Decimal)

Decimal - Character

0 NUL 1 SOH 2 STX 3 ETX 4 EOT 5 ENQ 6 ACK 7 BEL

8 BS 9 HT 10 NL 11 VT 12 NP 13 CR 14 SO 15 SI

16 DLE 17 DC1 18 DC2 19 DC3 20 DC4 21 NAK 22 SYN 23 ETB

24 CAN 25 EM 26 SUB 27 ESC 28 FS 29 GS 30 RS 31 US

32 SP 33 ! 34 " 35 # 36 $ 37 % 38 & 39 '

40 ( 41 ) 42 * 43 + 44 , 45 - 46 . 47 /

48 0 49 1 50 2 51 3 52 4 53 5 54 6 55 7

56 8 57 9 58 : 59 ; 60 < 61 = 62 > 63 ?

64 @ 65 A 66 B 67 C 68 D 69 E 70 F 71 G

72 H 73 I 74 J 75 K 76 L 77 M 78 N 79 O

80 P 81 Q 82 R 83 S 84 T 85 U 86 V 87 W

88 X 89 Y 90 Z 91 [ 92 \ 93 ] 94 ^ 95 _

96 ` 97 a 98 b 99 c 100 d 101 e 102 f 103 g

104 h 105 i 106 j 107 k 108 l 109 m 110 n 111 o

112 p 113 q 114 r 115 s 116 t 117 u 118 v 119 w

120 x 121 y 122 z 123 { 124 | 125 } 126 ~ 127 DEL

ASCII Character Set (Hexadecimal)

Hexadecimal - Character

00 NUL 01 SOH 02 STX 03 ETX 04 EOT 05 ENQ 06 ACK 07 BEL

08 BS 09 HT 0A NL 0B VT 0C NP 0D CR 0E SO 0F SI

10 DLE 11 DC1 12 DC2 13 DC3 14 DC4 15 NAK 16 SYN 17 ETB

18 CAN 19 EM 1A SUB 1B ESC 1C FS 1D GS 1E RS 1F US

20 SP 21 ! 22 " 23 # 24 $ 25 % 26 & 27 '

28 ( 29 ) 2A * 2B + 2C , 2D - 2E . 2F /

30 0 31 1 32 2 33 3 34 4 35 5 36 6 37 7

38 8 39 9 3A : 3B ; 3C < 3D = 3E > 3F ?

40 @ 41 A 42 B 43 C 44 D 45 E 46 F 47 G

48 H 49 I 4A J 4B K 4C L 4D M 4E N 4F O

50 P 51 Q 52 R 53 S 54 T 55 U 56 V 57 W

58 X 59 Y 5A Z 5B [ 5C \ 5D ] 5E ^ 5F _

60 ` 61 a 62 b 63 c 64 d 65 e 66 f 67 g

68 h 69 i 6A j 6B k 6C l 6D m 6E n 6F o

70 p 71 q 72 r 73 s 74 t 75 u 76 v 77 w

78 x 79 y 7A z 7B { 7C | 7D } 7E ~ 7F DEL

Beyond ASCII

Unicode

Uses up to 4 bytes to handle how many characters?

Allows all modern scripts (Kanji, Arabic, Cyrillic, Hebrew, etc.)

Contains 7-bit ASCII as the low 128 characters (and 8-bit Latin-1 as the low 256) for compatibility

Allows ancient scripts like Egyptian hieroglyphics
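A brief sketch of how the byte count grows with the character. It assumes the UTF-8 encoding of Unicode (the slides do not name a specific encoding), and the helper name `describe` and the sample characters are illustrative choices.

```python
def describe(ch):
    """Return (code point, UTF-8 byte count, UTF-8 bytes in hex)."""
    utf8 = ch.encode("utf-8")
    return ord(ch), len(utf8), " ".join(f"{b:02X}" for b in utf8)


samples = [
    "A",           # Latin capital A, also plain ASCII
    "\u00e9",      # e with acute accent
    "\u6f22",      # a Han character
    "\U00013000",  # an Egyptian hieroglyph
]
for ch in samples:
    code_point, size, hex_bytes = describe(ch)
    print(f"U+{code_point:04X}  {size} byte(s)  {hex_bytes}")
```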

ASCII Coding of Phone Numbers

How to encode 888 555 1212 in ASCII?

Encode each digit with its ASCII byte

8 8 8 5 5 etc.

00111000 00111000 00111000 00110101 00110101 etc.
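The digit-by-digit encoding can be sketched in a few lines. The helper name `ascii_bits` is an assumption, and the spaces in the phone number are left out, as on the slide.

```python
def ascii_bits(text):
    """Return one 8-bit string per character of ASCII text."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


print(" ".join(ascii_bits("8885551212")))
```

Note that the character "8" is stored as the byte for code 56, not as the number eight.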

Another ASCII Example
  • From Lab 1

CSCI ftw!

Takes ? bytes to store.

Representation in ASCII?

43 53 43 49 20 66 74 77 21 0A

In Binary?

0100 0011 0101 0011 0100 0011 0100 1001 ... 0010 0001 0000 1010
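The bytes above can be reproduced with a short sketch. It assumes the stored text ends with a newline character, which is what the final 0A in the listing suggests; the helper name `ascii_hex` is hypothetical.

```python
def ascii_hex(text):
    """Return the ASCII bytes of text as two-digit hex values."""
    return " ".join(f"{byte:02X}" for byte in text.encode("ascii"))


message = "CSCI ftw!\n"  # trailing newline assumed (the final 0A)
print(len(message.encode("ascii")), "bytes")
print(ascii_hex(message))
```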

Advantages of Long Encodings
  • Short encodings save memory
  • Examples of longer encodings
    • NATO Broadcast Alphabet
    • Bar Codes
NATO Broadcast Alphabet
  • NATO alphabet
    • Used for radio communication
    • Purposely inefficient
    • Distinctive amid noise (‘m’ versus ‘n’)
  • Letters represented with word “symbols”
    • a => alpha, b => bravo, c => charlie
  • Digits keep their usual names
    • Except 9 => niner
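The word-for-letter encoding can be sketched as a lookup table. The names `CODE_WORDS` and `spell` are assumptions, the spelling "alpha" follows the slide, and digits keep their usual names except 9, as stated above.

```python
NATO_LETTERS = (
    "alpha bravo charlie delta echo foxtrot golf hotel india juliett kilo lima "
    "mike november oscar papa quebec romeo sierra tango uniform victor whiskey "
    "xray yankee zulu"
).split()
DIGIT_NAMES = "zero one two three four five six seven eight niner".split()

# One long, distinctive word per single character.
CODE_WORDS = {chr(ord("a") + i): word for i, word in enumerate(NATO_LETTERS)}
CODE_WORDS.update({str(i): word for i, word in enumerate(DIGIT_NAMES)})


def spell(text):
    """Spell out letters and digits; other characters are skipped."""
    return [CODE_WORDS[ch] for ch in text.lower() if ch in CODE_WORDS]


print(" ".join(spell("mn")))
print(" ".join(spell("CAB 9")))
```

"m" and "n" differ by one easily lost sound, while "mike" and "november" are hard to confuse even over a noisy channel.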
Bar Codes
  • Universal Product Codes (UPC) use more bits than necessary
  • UPC-A encoding uses 7 bits to encode the digits 0 – 9
Bar Codes
  • Encodes manufacturer (left side) and product (right side)
    • Different bit combinations are used for each side
    • One side is complement of the other
    • Bit patterns were chosen to appear as different as possible
Bar Codes
  • Encodings for each side make it possible to recognize whether code is upside down
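A sketch of why orientation is detectable. The 7-bit left-side patterns below are the UPC-A digit patterns as commonly tabulated; they are not given in the slides, so treat them as an assumption to check against the symbology specification. The names `LEFT`, `RIGHT`, `complement`, and `side` are hypothetical.

```python
# Left-side (manufacturer) patterns, 1 = dark bar module, 0 = light.
# Assumed from the commonly published UPC-A table.
LEFT = {
    0: "0001101", 1: "0011001", 2: "0010011", 3: "0111101", 4: "0100011",
    5: "0110001", 6: "0101111", 7: "0111011", 8: "0110111", 9: "0001011",
}


def complement(bits):
    """Flip every bit: 0 becomes 1 and 1 becomes 0."""
    return "".join("1" if b == "0" else "0" for b in bits)


# Right-side (product) patterns are the complements of the left-side ones.
RIGHT = {digit: complement(pattern) for digit, pattern in LEFT.items()}


def side(pattern):
    """Odd count of 1s => left-side pattern; even count => right-side."""
    return "left" if pattern.count("1") % 2 == 1 else "right"


print(LEFT[5], side(LEFT[5]))
print(RIGHT[5], side(RIGHT[5]))
# Scanned upside down, the first pattern seen is a reversed right-side one.
print(RIGHT[5][::-1], side(RIGHT[5][::-1]))
```

Reversing a pattern does not change how many 1s it holds, so a scanner that meets an even-parity pattern first can tell it is reading the code backwards.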
Metadata and the OED
  • To represent info
    • Need to convert to binary
    • Need to describe its properties
  • Characteristics of the content also need to be encoded
    • How is the content structured?
    • What other content is it related to?
    • Where was it collected?
    • When was it created or captured?
    • What units is it given in?
    • How should it be displayed?
    • And so on…
Metadata and the OED
  • Metadata
    • info describing info
    • often specified with tags (like with HTML)
Properties of Data
  • ASCII encodes characters
  • Metadata gives properties of data
    • font style
    • color
    • justification
    • margins
    • etc.
Properties of Data
  • Content and metadata example
Using Tags for Metadata
  • Oxford English Dictionary (OED)
    • Definitive reference for every English word’s meaning, etymology, and usage
    • Printed version is 20 volumes, weighs 150 pounds, and fills 4 feet of shelf space
Structure Tags
  • Digital OED uses tags to indicate structure
    • <hw> for a headword (word defined)
    • <pr> for pronunciation
    • <ph> for phonetic notations
    • <ps> for part of speech
    • <hm> for homonym numbers
    • <e> for entire entry
    • <hg> for head group (all info at start of definition)
Structure Tags
  • Algorithms utilize tags
    • Search
    • Formatting
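A rough sketch of how search and formatting might use the tags. The sample entry is invented for illustration and is far simpler than real OED markup; `find_tag` and `strip_tags` are hypothetical helpers that assume a tag is never nested inside itself.

```python
import re

# Hypothetical entry using the tag names listed above.
ENTRY = (
    "<e><hg><hw>byte</hw> <pr><ph>baIt</ph></pr> <ps>n.</ps></hg>"
    " A group of eight bits.</e>"
)


def find_tag(text, tag):
    """Search: return the contents of every <tag>...</tag> span."""
    name = re.escape(tag)
    return re.findall(rf"<{name}>(.*?)</{name}>", text, flags=re.DOTALL)


def strip_tags(text):
    """Formatting: drop the metadata and keep only the content."""
    return re.sub(r"</?\w+>", "", text)


print(find_tag(ENTRY, "hw"))
print(find_tag(ENTRY, "ps"))
print(strip_tags(ENTRY))
```

Searching inside `<hw>` finds only headwords, so a query for "byte" does not match every definition that happens to mention the word.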
Quiz
  • What’s the first step in debugging?
    • check for obvious
    • isolate the problem
    • reproduce the problem
    • pinpoint
  • Fix the error in this CSS
    • body { color; red }
Quiz
  • Like all engineers, programmers begin with a _____________ – a precise description of the input, how the system should behave, and how the output should be produced.
Summary
  • Digitizing info
  • Storing info using PandA
    • Bits, bytes, hex
  • ASCII
  • Metadata