
CIS 234: Character Codes

Dr. Ralph D. Westfall, April 2011


Presentation Transcript


  1. CIS 234: Character Codes, Dr. Ralph D. Westfall, April 2011

  2. Problem 1 (other PowerPoint)
     • computers only understand binary-coded data (zeros and ones)
     • 00000000, 11111111, 01010101
     • people like to count in decimal
       • 00000000 = 0, 11111111 = 255, 01010101 = 85
     • 1st problem: it is extremely hard for people to work with binary data
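The binary and decimal pairs on this slide can be checked with a few lines of Java. This is only a sketch: the class and method names (`BinaryDecimalDemo`, `toDecimal`, `toBinary8`) are made up for illustration and are not part of the course material.

```java
class BinaryDecimalDemo {
    // Interpret a string of 0s and 1s as a base-2 number.
    static int toDecimal(String bits) {
        return Integer.parseInt(bits, 2);
    }

    // Show a value from 0 to 255 as exactly eight binary digits.
    static String toBinary8(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        // Left-pad with spaces to width 8, then turn the padding into zeros.
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("00000000")); // 0
        System.out.println(toDecimal("11111111")); // 255
        System.out.println(toDecimal("01010101")); // 85
        System.out.println(toBinary8(85));         // 01010101
    }
}
```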

  3. Problems 2a and 2b
     • since computers only work with numbers, they need to use numbers to identify the letters to print or show on screen
       • e.g., 01000001 = 65 = A
     • people who don't read English also use computers
     • next problem: what kind of numbering should be used for different languages?

  4. Problem 2 Solution
     • using binary data to display characters
     • make up a "coding scheme" that assigns a number to each character
     • ASCII code: 7 bits, normally stored in one 8-bit byte
     • Unicode: originally 16 bits (2 bytes); it has since grown beyond 16 bits

  5. ASCII Code
     • used on teletypes (teleprinters) before it became standard on computers
     • 128 characters in original ASCII
     • 0 to 31 (decimal) control the machine
       • 7 (BEL) rings bell
       • 8 (BS) backspace key
       • 10 (LF) line feed (go down 1 line)
       • 13 (CR) carriage return (to left of page)
     • Java: '\n' = 10 (LF) and '\r' = 13 (CR); a Windows line ending is both together, "\r\n" (2 bytes)
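A quick check of these control codes in Java, where '\n' is the single LF code (10) and '\r' is CR (13). The class name and the `isAsciiControl` helper are hypothetical. The helper also counts 127 (DEL), which is a control code in ASCII although the slide lists only 0 to 31.

```java
class ControlCharDemo {
    // True for the ASCII control codes: 0 to 31, plus 127 (DEL).
    static boolean isAsciiControl(int code) {
        return (code >= 0 && code <= 31) || code == 127;
    }

    public static void main(String[] args) {
        System.out.println((int) '\b');          // 8  (BS)
        System.out.println((int) '\n');          // 10 (LF)
        System.out.println((int) '\r');          // 13 (CR)
        System.out.println("\r\n".length());     // 2  (CR followed by LF)
        System.out.println(isAsciiControl(7));   // true  (BEL)
        System.out.println(isAsciiControl('A')); // false (65 is printable)
    }
}
```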

  6. ASCII Characters
     • A = 41 hex (65 decimal), Z = 5A hex (90 decimal)
     • a = 61 hex (97 decimal), z = 7A hex (122 decimal)
     • see calculator (String or ASCII choices)
     • space character = 20 hex (32 decimal)
     • see how the space character code is used in the browser Address textbox (it appears as %20)
     • ; (semicolon) = 3B hex (59 decimal)
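The hex and decimal codes on this slide can be reproduced as follows. This is a sketch, and `HexCodeDemo` and `describe` are illustrative names only.

```java
class HexCodeDemo {
    // Build a line such as: 'A' = 41 hex (65 decimal)
    static String describe(char c) {
        String hex = Integer.toHexString(c).toUpperCase();
        return "'" + c + "' = " + hex + " hex (" + (int) c + " decimal)";
    }

    public static void main(String[] args) {
        System.out.println(describe('A')); // 'A' = 41 hex (65 decimal)
        System.out.println(describe('Z')); // 'Z' = 5A hex (90 decimal)
        System.out.println(describe('a')); // 'a' = 61 hex (97 decimal)
        System.out.println(describe('z')); // 'z' = 7A hex (122 decimal)
        System.out.println(describe(' ')); // ' ' = 20 hex (32 decimal)
        System.out.println(describe(';')); // ';' = 3B hex (59 decimal)
    }
}
```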

  7. Printable ASCII Characters
     • [image: table of the printable ASCII characters, starting with (space)]
     • ASCII image is from Wikipedia

  8. ASCII Numbers
     • codes are for characters on screen and do NOT equal the values of the digits they display
     • code values can NOT be used in calculations without adjustment
       • '0' = 30 hex (48 decimal)
       • '9' = 39 hex (57 decimal)
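The sketch below shows why the adjustment matters: adding two digit characters adds their codes, not their values. `DigitValueDemo` and `digitValue` are illustrative names, and the range check is an added safeguard that the slides do not mention.

```java
class DigitValueDemo {
    // Convert an ASCII digit character to the number it displays.
    static int digitValue(char digit) {
        if (digit < '0' || digit > '9') {
            throw new IllegalArgumentException("not an ASCII digit: " + digit);
        }
        return digit - '0'; // same as subtracting 48
    }

    public static void main(String[] args) {
        char seven = '7'; // code 55
        char two = '2';   // code 50
        System.out.println(seven + two);                         // 105 (codes added)
        System.out.println(digitValue(seven) + digitValue(two)); // 9   (values added)
    }
}
```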

  9. Unicode
     • ASCII is a 7-bit encoding scheme; 8-bit extensions add another 128 codes
     • 128-256 character limit
     • Unicode began as a 16-bit scheme
     • "Uni" comes from the word universal (also unique and unified)
     • 16 bits can code 65,536 characters; Unicode now allows over 1.1 million code points
     • Java uses Unicode (its char type is a 16-bit UTF-16 code unit) so that it can be used for many different languages

  10. Unicode - 2
     • Unicode has characters for many languages
     • Western alphabets: Latin (English), Greek, Cyrillic (Russian), etc.
     • Unicode uses 00000000 + the 8-bit ASCII code for English
       • 00000000 01000001 = A (65 decimal)
     • Asian characters: CJK (Chinese, Japanese, Korean) has over 20,000 characters
     • many character systems require installing special fonts on the user's computer
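A small sketch of two points from the Unicode slides: the 16-bit form of A, and the "actually more" remark about the 65,536 limit. In Java, a character above that limit takes two 16-bit `char` values (a surrogate pair). The class and helper names are illustrative, and U+1F600 was picked here only as an example of such a character.

```java
class UnicodeWidthDemo {
    // Show a char as sixteen binary digits.
    static String toBinary16(char c) {
        String bits = Integer.toBinaryString(c);
        return String.format("%16s", bits).replace(' ', '0');
    }

    // Count Unicode characters (code points) rather than 16-bit chars.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println(toBinary16('A'));       // 0000000001000001

        String beyond16 = "\uD83D\uDE00";          // U+1F600 as a surrogate pair
        System.out.println(beyond16.length());     // 2 (16-bit chars)
        System.out.println(codePoints(beyond16));  // 1 (Unicode character)
        System.out.println(beyond16.codePointAt(0)); // 128512 (1F600 hex)
    }
}
```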

  11. Using Unicode in Java
       char letter = 'A' ;       // easiest way
       char letter2 = '\u0041' ; // also = 'A'
       char one = '\u4E00' ;     // Chinese character for 1
       char one2 = '\u3220' ;    // or '\u3280' ; parenthesized and circled forms of the character for 1
     • \ (backslash) = escape character
     • \u means Unicode (#s are in hexadecimal)
       char sound = '\u0007' ;   // BEL
     • may sound the speaker when "printed" to the screen (depends on the console)
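The escapes can be checked without relying on the console's font or encoding by printing the numeric codes instead of the characters. This sketch assumes U+4E00 for the plain Chinese character for one. The class and constant names are illustrative.

```java
class UnicodeEscapeDemo {
    static final char LATIN_A = '\u0041';     // same character as 'A'
    static final char CHINESE_ONE = '\u4E00'; // the plain character for one
    static final char BEL = '\u0007';         // ASCII control code 7

    public static void main(String[] args) {
        System.out.println(LATIN_A == 'A');                    // true
        System.out.println((int) CHINESE_ONE);                 // 19968
        System.out.println(Integer.toHexString(CHINESE_ONE));  // 4e00
        System.out.println((int) BEL);                         // 7
    }
}
```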

  12. Review Questions
     • How many bits are there in ASCII code?
     • How many bits are there in Unicode?
     • True or False: All ASCII codes can be seen as characters on the screen
     • How many characters can be printed using ASCII? Using Unicode? (match 2)
       • around 90, around 12,000, over 50,000

  13. Review Questions - 2
     • Why was Unicode created to handle over 50,000 characters?
     • Give an example of what some non-printable ASCII character does on a computer or screen
     • How does Java code need to handle calculations on numeric characters entered on the screen by the user?

  14. Review Questions - 3
     • Is a space a character?
     • What is the Chinese character for the number 1? 2? 3?
       • this will NOT be on a test!
       • see answers on next slide

  15. Chinese Characters: 3, 2 and 1
     • 三 (3), 二 (2), 一 (1)

  16. Appendix • the following slides show how ASCII characters can be read from the keyboard and converted to values that can be used for mathematical calculations

  17. Reading Characters in DOS
       int iInit = System.in.read() ;
     • gets the numeric value of the character it reads
     • if the character is A, iInit = 65 (decimal)
       char cInit = (char) System.in.read() ;
     • (char) "casts" (converts) the numeric value to the character type
       System.out.println(iInit) ; // number
       System.out.println(cInit) ; // character
     • System.in.read() can throw IOException, so the enclosing method must declare or catch it
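A runnable version of this slide's idea might look like the sketch below. To keep it repeatable, it reads from an in-memory stream standing in for the keyboard. Passing `System.in` to the same methods would read real keystrokes. The class and method names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadCharDemo {
    // Return the numeric code of the next character.
    static int readCode(InputStream in) throws IOException {
        return in.read();
    }

    // Return the next character itself by casting its code to char.
    static char readChar(InputStream in) throws IOException {
        return (char) in.read();
    }

    public static void main(String[] args) throws IOException {
        // Simulated keyboard input: the user typed A and then B.
        InputStream in = new ByteArrayInputStream("AB".getBytes(StandardCharsets.US_ASCII));
        int iInit = readCode(in);
        char cInit = readChar(in);
        System.out.println(iInit); // 65
        System.out.println(cInit); // B
    }
}
```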

  18. Reading Characters in Java - 2
     • 2 characters are sent when the Enter key is hit: CR (13) and then LF (10 decimal)
     • when accepting keyboard input from a DOS window in Java, need to "absorb" both characters from the Enter keystroke
       System.in.read() ; System.in.read() ;
     • reads the characters, doesn't store (=) them
     • program is now ready to read the next input
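The "absorb" step can be sketched as a helper that reads one character and then throws away the two that Enter produced. It assumes a Windows-style CR LF pair after every keystroke, as the slide does. On systems that send only LF, the second discarded read would swallow real input. The names are illustrative, and the input is simulated with an in-memory stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class AbsorbEnterDemo {
    // Read one typed character, then discard the CR and LF from Enter.
    static char readThenAbsorbEnter(InputStream in) throws IOException {
        char typed = (char) in.read();
        in.read(); // absorb CR (13); the result is not stored
        in.read(); // absorb LF (10); the result is not stored
        return typed;
    }

    public static void main(String[] args) throws IOException {
        // Simulated session: Y, Enter, N, Enter.
        byte[] keys = "Y\r\nN\r\n".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(keys);
        System.out.println(readThenAbsorbEnter(in)); // Y
        System.out.println(readThenAbsorbEnter(in)); // N
    }
}
```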

  19. Using Characters for Math
     • numbers (characters) read from the keyboard have numeric code values
     • need to convert the character's code (decimal) value to its mathematical value
     • 0 = 30 hex (48 decimal), 9 = 39 hex (57 decimal)
     • math value = decimal value - 48
       int quantity = System.in.read() - 48 ;
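Putting the subtraction to work, here is a minimal sketch that reads one digit and uses it in a calculation. It follows the slide in doing no validation, so a non-digit or end of input gives a meaningless result. `CharMathDemo` and `readDigit` are illustrative names, and the keyboard is simulated with an in-memory stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class CharMathDemo {
    // Read one digit character and convert its code to its math value.
    static int readDigit(InputStream in) throws IOException {
        return in.read() - 48; // 48 is the code for '0'
    }

    public static void main(String[] args) throws IOException {
        // Simulated keyboard input: the user typed 7.
        InputStream in = new ByteArrayInputStream("7".getBytes(StandardCharsets.US_ASCII));
        int quantity = readDigit(in);
        System.out.println(quantity);     // 7
        System.out.println(quantity * 3); // 21
    }
}
```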
