
Unicode 3.0.1

Mark Davis

www.macchiato.com

New 3.0 Characters

Category                  V 2.1    V 3.0
Alphabetics, Symbols      6,511   10,236
CJK Ideographs           21,204   27,786
Hangul Syllables         11,172   11,172
Assigned characters      38,887   49,194
Unassigned code values   18,134    7,827

Sync’ed with ISO/IEC 10646, 2nd edition

Unicode 3.0

New 3.0 Blocks

Code points   Block
       80     Syriac
      192     Thaana
      128     Sinhala
      160     Myanmar
      384     Ethiopic
       96     Cherokee
      640     Unified Canadian Aboriginal Syllabics
       32     Ogham
       96     Runic
      128     Khmer
      176     Mongolian
      256     Braille
      128     CJK Radicals Supplement
      224     Kangxi Radicals
       16     Ideographic Description Characters
       32     Bopomofo Extended
    6,582     CJK Unified Ideographs Extension A
    1,168     Yi Syllables
       64     Yi Radicals

Property Updates (1)
  • Bidirectional properties
  • Byte order mark
  • Capital letters with iota adscript
  • Case
  • Combining classes
  • Decompositions
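
Several of the properties listed on this slide can be inspected with Python's standard `unicodedata` module. The following is a minimal sketch, not something taken from the slides. The helper name `describe` is hypothetical, and the module reports data for whichever Unicode version ships with the interpreter rather than 3.0.1, so values for other characters may differ from the 3.0.1 database.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few UnicodeData-style properties for one character.

    `describe` is a hypothetical helper written for this illustration.
    """
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "bidi": unicodedata.bidirectional(ch),
        "combining_class": unicodedata.combining(ch),
        "decomposition": unicodedata.decomposition(ch),
    }


# U+00E9 decomposes canonically into 'e' followed by a combining acute accent.
e_acute = describe("\u00e9")
# U+0301 COMBINING ACUTE ACCENT is a nonspacing mark with combining class 230.
acute = describe("\u0301")
# U+05D0 HEBREW LETTER ALEF has the right-to-left bidirectional class.
alef = describe("\u05d0")

for record in (e_acute, acute, alef):
    print(record)
```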

Property Updates (2)
  • Identifier Syntax
  • Layout controls
  • Linebreak properties
  • East-Asian width properties
  • Misc. Characters: Figure Space, Tilde,…
  • Ligature Control
  • Unassigned Code Points

Conformance
  • Unicode Transformation Formats
    • UTF-16BE, UTF-16LE, UTF-16, UTF-8
  • Unicode Bidirectional Behavior
  • Other normative character property values

Clause numbering maintained!

  • Stability Policies
  • Clarification of noncharacters
  • Normalization Conformance Test
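
To make the four transformation formats named on this slide concrete, here is a small sketch using Python's built-in codecs. The sample string is an arbitrary choice for illustration, and the behavior of the plain `utf-16` label (prepending a byte order mark in the machine's native order when encoding) is a property of Python's codec, not a statement about what the slides prescribe.

```python
# Sample text: 'A' (U+0041) followed by EURO SIGN (U+20AC).
text = "A\u20ac"

utf8 = text.encode("utf-8")         # 1 byte for 'A', 3 bytes for U+20AC
utf16be = text.encode("utf-16-be")  # big-endian, no byte order mark
utf16le = text.encode("utf-16-le")  # little-endian, no byte order mark
utf16 = text.encode("utf-16")       # Python prepends a BOM in native byte order

# With the plain "utf-16" label, a leading BOM tells the decoder the byte order.
decoded_from_bom = (b"\xfe\xff" + utf16be).decode("utf-16")

for label, data in [("UTF-8", utf8), ("UTF-16BE", utf16be),
                    ("UTF-16LE", utf16le), ("UTF-16", utf16)]:
    print(f"{label:9} {data.hex()}")
```

The same code point sequence yields different byte sequences in each format, which is why the byte order has to be known either from the label (UTF-16BE, UTF-16LE) or from a byte order mark (UTF-16).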

Unicode Standard Annexes (UAX)
  • Integral part of 3.0.1 Standard
  • UAX #09: BIDI
  • UAX #11: East Asian Width
  • UAX #13: Newline Guidelines
  • UAX #14: Line Breaking
  • UAX #15: Normalization
  • Included in any reference to version 3.0 or later
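
Two of these annexes can be tried out directly from Python's standard library. The sketch below covers UAX #15 (normalization forms) and UAX #11 (East Asian Width). It assumes the interpreter's bundled Unicode data, which is newer than 3.0.1; the specific characters were chosen only for illustration.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' + COMBINING ACUTE ACCENT
composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE

# Canonical composition and decomposition round-trip between the two spellings.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

# Compatibility normalization also folds U+FB01 LATIN SMALL LIGATURE FI.
nfkc = unicodedata.normalize("NFKC", "\ufb01")

# East Asian Width: fullwidth 'A' (U+FF21) versus ordinary 'A'.
wide = unicodedata.east_asian_width("\uff21")
narrow = unicodedata.east_asian_width("A")

print(nfc == composed, nfd == decomposed, nfkc, wide, narrow)
```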

Unicode Technical Standards (UTS)
  • UTS #06: Compression
    • IANA name: SCSU
  • UTS #10: Collation
    • Note: defined over all Unicode code points
    • Values will be updated soon for better ordering

Technical Reports
  • UTR #07: Language Tags
  • UTR #16: UTF-EBCDIC
  • UTR #17: Character Encoding Model
  • UTR #18: Regular Expressions
  • UTR #19: UTF-32
  • UTR #21: Case Mappings
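
As a rough illustration of two of these reports, the sketch below shows a case mapping that changes string length (the topic of UTR #21) and the fixed-width UTF-32 form (UTR #19). It relies on Python's built-in string methods and codecs with the interpreter's current Unicode data, which is an assumption beyond what the slides state.

```python
# Uppercasing U+00DF LATIN SMALL LETTER SHARP S yields two characters.
upper_sharp_s = "\u00df".upper()

# Case folding supports caseless matching across such mappings.
folded = "Stra\u00dfe".casefold()
is_match = "STRASSE".casefold() == folded

# UTF-32 uses one 32-bit unit per code point, even beyond U+FFFF.
utf32be = "\U00010000".encode("utf-32-be")

print(upper_sharp_s, folded, is_match, utf32be.hex())
```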

Draft Technical Reports
  • UTR #20: Unicode in XML…
  • UTR #22: Character Mapping Tables
  • UTR #24: Script Names
  • Open for public comment

Unicode Character Database
  • More Documentation, More Data
    • UnicodeData
    • ArabicShaping
    • CompositionExclusions
    • EastAsianWidth
    • Unihan
    • CaseFolding
    • Blocks
    • Jamo
    • SpecialCasing
    • LineBreak
    • BidiMirroring
    • NormalizationTest
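
The main data file, UnicodeData, is a plain-text file with one record per line and semicolon-separated fields. The sketch below parses a single record into a few named fields. The class and function names are hypothetical, only a subset of the 15 fields is kept, and the sample line is the familiar entry for U+0041.

```python
from typing import NamedTuple


class UcdRecord(NamedTuple):
    """A subset of the fields in one UnicodeData line (hypothetical type)."""
    code_point: int
    name: str
    general_category: str
    combining_class: int
    bidi_class: str
    decomposition: str
    simple_lowercase: str


def parse_unicode_data_line(line: str) -> UcdRecord:
    """Split one semicolon-delimited UnicodeData record."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    return UcdRecord(
        code_point=int(fields[0], 16),
        name=fields[1],
        general_category=fields[2],
        combining_class=int(fields[3]),
        bidi_class=fields[4],
        decomposition=fields[5],
        simple_lowercase=fields[13],
    )


SAMPLE = "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
record = parse_unicode_data_line(SAMPLE)
print(record)
```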

Website changes
  • New Look & Feel
  • New Navigation
  • Enhanced FAQ
  • Glossary
  • What is Unicode?
  • Where is my character?

Beyond 3.0
  • Characters
    • CJK characters, symbols, music systems, ancient scripts, extra characters, etc.
    • First allocated surrogate pairs
  • Properties
    • essential for Unicode enablement
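
Characters allocated to surrogate pairs are represented in UTF-16 as two 16-bit code units. The arithmetic can be sketched as follows; the function names are hypothetical, and U+1D11E is used only as a sample code point above U+FFFF.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in the supplementary range")
    offset = code_point - 0x10000           # 20-bit value
    high = 0xD800 + (offset >> 10)          # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)         # bottom 10 bits
    return high, low


def from_surrogate_pair(high: int, low: int) -> int:
    """Recombine a (high, low) surrogate pair into a code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


pair = to_surrogate_pair(0x1D11E)
print([f"{unit:04X}" for unit in pair])
```

The result can be cross-checked against Python's own codec: `chr(0x1D11E).encode("utf-16-be")` produces the same two code units.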

Unicode 3.0
  • Major new version
  • Over 10,000 new characters
  • Enhanced character data for implementations
  • Reorganized text for better reference
  • The version for normalization
  • Unicode Character Database 3.0.0
  • Available now!

Q & A

Backup Slides

ICU: Paid Advertisement
  • Open Source Unicode Enablement Library
    • ICU: C/C++ and Java Versions
    • IBM Public License
    • Friday, 10:00 Helena Shih
  • http://oss.software.ibm.com/icu

Enumerated Versions
  • Unicode 1.0.0, Unicode 1.0.1
  • Unicode 1.1.0, Unicode 1.1.5
  • Unicode 2.0.0
  • Unicode 2.1.2, Unicode 2.1.5, Unicode 2.1.8, Unicode 2.1.9
  • Unicode 3.0.0
    • www.unicode.org

Editorial Committee
  • Joan Aliprand
  • Julie Allen (editor)
  • Joe Becker
  • Mark Davis
  • Asmus Freytag
  • John Jenkins
  • Mike Ksar
  • Rick McGowan
  • Lisa Moore
  • Ken Whistler

New Characters (2)

Category                  V 2.1    V 3.0
Private Use               6,400    6,400
Surrogates                2,048    2,048
Controls                     65       65
Not Characters                2        2
Assigned code values     47,402   57,709
Unassigned code values   18,134    7,827
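
The tallies on the two New Characters slides are consistent with each other, which a short calculation can confirm. The sketch below only re-adds the numbers given on the slides; the dictionary keys and the function name are hypothetical labels. In both versions the assigned and unassigned code values add up to 65,536, the size of a 16-bit code space.

```python
CODE_SPACE = 2 ** 16  # 65,536 code values

# Figures copied from the slides, keyed by version.
TALLIES = {
    "2.1": {"alphabetics_symbols": 6_511, "cjk_ideographs": 21_204,
            "hangul_syllables": 11_172, "unassigned": 18_134},
    "3.0": {"alphabetics_symbols": 10_236, "cjk_ideographs": 27_786,
            "hangul_syllables": 11_172, "unassigned": 7_827},
}

# Categories that are identical in both versions.
OTHER_ASSIGNED = {"private_use": 6_400, "surrogates": 2_048,
                  "controls": 65, "not_characters": 2}


def summarize(version: str):
    """Return (assigned characters, assigned code values, total code values)."""
    t = TALLIES[version]
    assigned_characters = (t["alphabetics_symbols"] + t["cjk_ideographs"]
                           + t["hangul_syllables"])
    assigned_code_values = assigned_characters + sum(OTHER_ASSIGNED.values())
    return (assigned_characters, assigned_code_values,
            assigned_code_values + t["unassigned"])


v21 = summarize("2.1")
v30 = summarize("3.0")
new_characters = v30[0] - v21[0]
print(v21, v30, new_characters)
```

The difference in assigned characters, 10,307, matches the "Over 10,000 new characters" figure quoted for Unicode 3.0.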

Reference to Versions
  • Open repertoire, but backwards compatible
  • Characters only added, not removed
    • Two early exceptions: ISO sync. & Korean
  • Don’t overspecify the version:
    • “Version 2.1.0” vs. “Version 2.1” vs. “Version 2 or later”
  • Includes Technical Reports!!

Versions of the Standard
  • major - significant additions
    • published as a book
  • minor - character additions or more significant normative changes
    • published as a Technical Report
  • update - any other changes
    • on the website in /standard/versions/
  • Example: 2.1.9
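
The three-part numbering (major.minor.update) lends itself to simple comparison in code. This is a minimal sketch with hypothetical names; treating a missing component as 0, so that "3.0" parses like "3.0.0", is an assumption made for the example rather than a rule stated on the slide.

```python
from typing import NamedTuple


class UnicodeVersion(NamedTuple):
    major: int
    minor: int
    update: int


def parse_version(text: str) -> UnicodeVersion:
    """Parse 'major[.minor[.update]]'; missing parts default to 0 (assumption)."""
    parts = [int(p) for p in text.split(".")]
    if not 1 <= len(parts) <= 3 or any(p < 0 for p in parts):
        raise ValueError(f"not a major.minor.update version: {text!r}")
    parts += [0] * (3 - len(parts))
    return UnicodeVersion(*parts)


example = parse_version("2.1.9")

# Tuples compare field by field, so "version X or later" is a plain >= test.
is_3_0_or_later = parse_version("3.0.1") >= parse_version("3.0")
print(example, is_3_0_or_later)
```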

Versioning

Characters

Properties

Conformance

Technical Reports

Unicode Character Database

Future

Reorganized Text
  • 6: Punctuation
  • 7: European Alphabetics
  • 8: Middle Eastern
  • 9: South Asian
  • 10: East Asian
  • 11: Other (Mongolian, etc.)
  • 12: Symbols
  • 13: Formatting, Controls, Specials

Additionally
  • Shift-JIS Index
  • Full Radical Stroke Index
    • CJK split in several blocks
  • Improved Charts
    • Especially for CJK Ideographs
  • Improved Implementation Guidelines
  • General Clarifications
