
Windows™ 2000 Indian Language Developers Conference

F. Avery Bishop, Senior Program Manager for Multilingual Developer Communications, and David C. Brown, Development Lead for Complex Script Enabling in Windows™ Operating Systems, Microsoft™ Corporation.



  1. Windows™ 2000 Indian Language Developers Conference • F. Avery Bishop, Senior Program Manager for Multilingual Developer Communications • David C. Brown, Development Lead for Complex Script Enabling in Windows™ Operating Systems • Microsoft™ Corporation

  2. Agenda for the Day • Welcome and Keynote • International Features of Windows 2000 • Complex Script Processing in Windows 2000 • Uniscribe: The Unicode Script Processor • Lunch • Guidelines for supporting complex scripts in Win32 applications • Supporting Indian text in Enterprise applications • Introduction to Open Type Fonts • Microsoft developer programs in India

  3. Updates on Session Materials • Today’s presentations vary slightly from your session handouts • For updates to ppt files and demos, see www.microsoft.com/globaldev

  4. International Features in Microsoft Windows 2000™ • F. Avery Bishop • Senior Program Manager • Microsoft Corporation

  5. Agenda: International Features of Windows 2000* • Definitions of key concepts • Windows 2000 single-binary internationalization • Multilingual content • Windows 2000 Multilanguage version • New complex script support, including: • Support for Indian languages • Complex Scripts in web pages • Right-to-left layout of shell, applications *Old name: Windows NT 5.0

  6. Definitions • Script: A set of symbols used to write one or more languages • Locale: • A place or locality (dictionary definition) • Set of user preferences related to language and local customs • Language Group: Term used to describe the supported script families in Windows NT 5

  7. Definitions • System Locale: Not really a locale. Determines which script non-Unicode applications will support (e.g., which Windows 9x system Windows NT emulates) • User Locale: User preferences for formatting of dates, currencies, numbers, etc. • Input Locale: Pairing of input language and method of input; determines what language is currently being entered and how

  8. Definitions • Enabling for a script: Adding support for input, display, and output of the script • Localization: Translating user interface elements • Globalization: Developing software such that feature design and code design are not limited to a single locale or script

  9. Definitions • Complex Scripts: Scripts that require contextual processing for display, editing, and other processing

  10. All language versions of Windows 2000 use the same core binary files! So what? • Advantages to users: • Can enter text in any supported language on any version of Windows 2000 • Any language version of a well-written Win32 app runs on any language version of Windows 2000 • Advantages to developers: • Develop all language versions on one system • Can develop and ship a single binary for all languages

  11. More on Unified Language Support in Windows 2000 • Effect of system default locale on application: • ANSI applications require appropriate system locale setting • ANSI/Unicode applications may require system locale setting (more on this later) • Pure Unicode applications work with any system locale • Native Unicode support: • Important: New scripts will have no codepage; they are supported through Unicode only (e.g., Indian scripts, Armenian, Georgian)
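
The "no codepage" point on this slide can be illustrated with a minimal sketch. It is not from the presentation: Python codec names stand in for Windows code pages, and the helper `encodable` is a hypothetical name. It only shows that Devanagari text cannot be represented in a Western ANSI code page, while a Unicode encoding such as UTF-16 holds it without loss.

```python
# Sketch: Devanagari has no legacy Windows code page, so only Unicode encodings work.
# Python codec names stand in for Windows code pages here (illustrative assumption).

def encodable(text: str, codec: str) -> bool:
    """Return True if every character of text can be encoded with the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

hindi = "\u0939\u093f\u0928\u094d\u0926\u0940"  # the word "Hindi" written in Devanagari

ansi_ok = encodable(hindi, "cp1252")        # Western European ANSI code page
unicode_ok = encodable(hindi, "utf-16-le")  # UTF-16, the encoding behind Win32 "W" APIs
print(ansi_ok, unicode_ok)
```

An application that stores and passes text as Unicode end to end never hits the failing branch, which is why pure Unicode applications are independent of the system locale.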

  12. Unicode allows processing of Multilingual Content • System components: • Internet Explorer 5.0 can do amazing things! • Others: Winlogon, File system, Notepad, etc. • Unicode applications • Office 2000 • Your application!

  13. Windows 2000 Multilanguage Version • Language of menus and dialogs is a per-user-setting • Installable language modules • Sold through MOLP, Select, and Enterprise Agreement • Available to developers through MSDN

  14. Support for Complex Scripts in Windows 2000 A complex script is one that requires special processing, such as: • Bi-directional (BiDi) reordering (Arabic, Hebrew) • Contextual shaping (Arabic, Indic family) • Display of combining characters (Arabic, Thai, Indian) • Specialized word-break and justification rules (Thai) • Disallowing illegal character combinations (Indian, Thai)
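
Two of the traits listed above can be detected from Unicode character properties alone. The following is a rough sketch, not how Windows does it: the function name `complex_script_features` is hypothetical, and it checks only for strong right-to-left characters and combining marks using Python's standard `unicodedata` module.

```python
import unicodedata

def complex_script_features(text: str) -> dict:
    """Report which complex-script traits the characters in text exhibit."""
    return {
        # Strong right-to-left characters (Hebrew: R, Arabic: AL) need BiDi reordering.
        "needs_bidi": any(unicodedata.bidirectional(c) in ("R", "AL") for c in text),
        # General categories Mn/Mc/Me are combining marks drawn relative to a base.
        "has_combining_marks": any(unicodedata.category(c).startswith("M") for c in text),
    }

arabic = "\u0633\u0644\u0627\u0645"        # Arabic letters only
tamil = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"   # Tamil with a vowel sign and a pulli
english = "plain text"

for sample in (arabic, tamil, english):
    print(complex_script_features(sample))
```

Contextual shaping, word breaking, and validation of character combinations depend on script rules and font data, so they are not covered by a property lookup like this one.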

  15. RTL Orientation, or “Mirroring”

  16. Right-to-Left Mirroring API • One function call will “mirror” all windows in an application • Can also mirror selective windows • APIs to suppress mirroring of bitmaps • May need to modify coding practices

  17. Support for Indian Languages in Windows 2000 • APIs handle Devanagari and Tamil text through Unicode • Locale support • Time, Date, number, currency formats • Sorting • Conversion • Explicit function calls convert to/from ISCII • No Windows 98 compatibility mode

  18. How We Developed Indian Script Support in Windows 2000 • Worked with Government organizations • Consulted with NCST, CDAC, academics • Brought engineers from NCST • Added Indian shaping engines to Uniscribe • Helped define feature tables for Open Type • Hired Hindi/Tamil speakers to test

  19. Complex Scripts in Web pages • IE 5.0 supports complex scripts, including Devanagari and Tamil in: • Standard HTML text • DHTML – All properties in DOM • XML • Recommended encoding is UTF-8 • Place charset=utf-8 in HTTP header • Allows mixed scripts
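
As a minimal sketch of the UTF-8 recommendation, the snippet below encodes a mixed-script page as UTF-8 and declares that charset in the HTTP `Content-Type` header. The function name `build_response` and the sample page are assumptions made for illustration; a real server framework would normally build the header for you.

```python
def build_response(html: str) -> bytes:
    """Encode an HTML page as UTF-8 and declare that charset in the HTTP header."""
    body = html.encode("utf-8")
    header = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return header + body

# One page mixing Latin, Devanagari, and Tamil text.
page = "<p>Hello \u0928\u092e\u0938\u094d\u0924\u0947 \u0bb5\u0ba3\u0b95\u0bcd\u0b95\u0bae\u0bcd</p>"
response = build_response(page)

# Split the header from the payload and confirm the text survives the round trip.
head, _, payload = response.partition(b"\r\n\r\n")
print(payload.decode("utf-8") == page)
```

Because UTF-8 covers all of Unicode, the same header works regardless of which scripts appear in the page.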

  20. Demo!

  21. Questions?

  22. Further Information and Resources • http://www.microsoft.com/globaldev (Watch for updates!) • MSJ articles, e.g., • Uniscribe: http://www.microsoft.com/msj/1198/multilang/multilangtop.htm • Multilingual UI: Coming April 1999 • Send suggestions to nlshelp@microsoft.com

  23. Break!

  24. Complex Script Processing in Microsoft Windows 2000™ • David Brown • Development Lead • Microsoft Corporation

  25. Agenda • Overview • Implementation • Details

  26. 1. Overview • Distinct language groups • Mix any and all scripts • Most apps are easy to develop • CS = Complex Script

  27. Complex Script Language Groups • Arabic, Hebrew, Indic, Thai, Vietnamese • Part of ALL versions of Windows 2000 • Enable in Control Panel - Regional Settings • Turn it on today!

  28. All scripts, any mix • Unicode makes representation easy • Common framework and APIs • Individual script and font handlers • Multilingual for no extra effort

  29. Built into standard system APIs • Plain text • ExtTextOut, DrawText, TabbedTextOut • System edit control • Dialog boxes • Formatted text • Richedit • HTML control • See the Win32 SDK • Don’t write your own formatting

  30. Font fallback • Standard system fonts • For dialogs, plaintext edit controls and other plaintext display • Dialog boxes work automatically

  31. Summary • CS support is standard in Windows 2000 • No restrictions on script combinations • Easy (unless you are implementing your own formatting)

  32. 2. Implementation • Callouts from GDI and USER • Performance • Text broken by script and direction • Script handlers • LPK.DLL

  33. Callouts from GDI and USER • ExtTextOut, DrawText passed early to LPK.DLL • Plaintext edit control has many callouts • Caret placement • Text measurement • Line breaking • Word advance • Safe, stable changes to OS core

  34. Fast path for non-CS • Normal GDI • 1:1 char to glyph • Simple side-by-side placement • No CS characters • If right-to-left, no neutrals • If digit substitution, no digits • Performance is good
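
The three fast-path conditions on this slide can be read as a simple predicate. The sketch below is one interpretation, not the actual Windows test: the block table `COMPLEX_RANGES`, the set of bidi classes treated as "neutral", and the function names are all assumptions chosen for illustration.

```python
import unicodedata

# Hypothetical block table: a coarse stand-in for whatever test Windows applies.
COMPLEX_RANGES = [
    (0x0590, 0x05FF),  # Hebrew
    (0x0600, 0x06FF),  # Arabic
    (0x0900, 0x097F),  # Devanagari
    (0x0B80, 0x0BFF),  # Tamil
    (0x0E00, 0x0E7F),  # Thai
]

# Bidi classes treated as "neutrals" in this sketch (separators, whitespace, other neutrals).
NEUTRAL_BIDI_CLASSES = {"B", "S", "WS", "ON"}

def is_complex(ch: str) -> bool:
    """True if the character falls in one of the listed complex-script blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in COMPLEX_RANGES)

def can_use_fast_path(text: str, right_to_left: bool = False,
                      digit_substitution: bool = False) -> bool:
    """Apply the slide's three conditions for skipping complex-script processing."""
    if any(is_complex(c) for c in text):
        return False  # complex-script characters always need the script handlers
    if right_to_left and any(
        unicodedata.bidirectional(c) in NEUTRAL_BIDI_CLASSES for c in text
    ):
        return False  # neutrals in a right-to-left context need the bidi algorithm
    if digit_substitution and any(unicodedata.category(c) == "Nd" for c in text):
        return False  # digits may have to be replaced with national digit shapes
    return True

print(can_use_fast_path("Hello, world"))
```

The idea is that the common case, plain left-to-right Western text, is decided by a cheap scan and then rendered by the normal GDI path.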

  35. Split by script and direction • Separate e.g. Devanagari, Tamil, Western • Left-to-right or right-to-left • Unicode bidirectional algorithm • Atomic item of display
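
Splitting text into items can be sketched as a single pass that starts a new item whenever the script changes. This is a simplification for illustration only: the range table `SCRIPT_RANGES`, the helper names, and the rule that neutral characters (spaces, digits, punctuation) stay with the preceding item are assumptions, and the real Unicode bidirectional algorithm is considerably more involved.

```python
# Hypothetical, deliberately small script table (first code point, last code point, name).
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),
    (0x0061, 0x007A, "Latin"),
    (0x0590, 0x05FF, "Hebrew"),
    (0x0600, 0x06FF, "Arabic"),
    (0x0900, 0x097F, "Devanagari"),
    (0x0B80, 0x0BFF, "Tamil"),
]
RTL_SCRIPTS = {"Hebrew", "Arabic"}

def script_of(ch: str):
    """Return the script name for ch, or None for neutral characters."""
    cp = ord(ch)
    for lo, hi, name in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return None

def itemize(text: str):
    """Split text into (script, direction, run) items."""
    items = []  # each entry is [script, run]
    for ch in text:
        script = script_of(ch)
        if items and (script is None or script == items[-1][0]):
            items[-1][1] += ch          # same script, or a neutral: extend the item
        elif items and items[-1][0] is None:
            items[-1][0] = script       # leading neutrals adopt the first real script
            items[-1][1] += ch
        else:
            items.append([script, ch])  # script changed: start a new item
    return [(s, "rtl" if s in RTL_SCRIPTS else "ltr", run) for s, run in items]

mixed = "abc \u0928\u092e \u0ba4\u0bae"
for item in itemize(mixed):
    print(item)
```

Each resulting item has one script and one direction, so it can be handed to a single script handler and drawn as a unit.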

  36. Handler for each script • Script shaping and reordering • Devanagari - matra I reordered before consonant cluster • Tamil - vowel sign O surrounds consonant cluster • Urdu - initial, medial, final, isolated forms • Various font formats • Backward compatibility • Shaping - ligatures, contextual forms • Placement of marks • Script handlers understand scripts
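
The Devanagari reordering mentioned above can be sketched in a few lines. In stored (logical) order the vowel sign I, U+093F, follows the consonant cluster, but it is displayed to the left of it. The function below is a simplified illustration, not the Uniscribe shaping engine: the name `visual_order` is hypothetical, a cluster is assumed to be consonants joined by virama, and the output is a display-order illustration rather than valid Unicode text.

```python
VIRAMA = "\u094d"   # DEVANAGARI SIGN VIRAMA, joins consonants into a cluster
MATRA_I = "\u093f"  # DEVANAGARI VOWEL SIGN I, displayed before the cluster

def is_consonant(ch: str) -> bool:
    """True for the basic Devanagari consonants KA..HA."""
    return "\u0915" <= ch <= "\u0939"

def visual_order(text: str) -> str:
    """Move each matra I in front of the consonant cluster it follows."""
    out = []
    i = 0
    while i < len(text):
        if is_consonant(text[i]):
            # Gather consonant (+ virama + consonant)* as one cluster.
            j = i + 1
            while j + 1 < len(text) and text[j] == VIRAMA and is_consonant(text[j + 1]):
                j += 2
            cluster = text[i:j]
            if j < len(text) and text[j] == MATRA_I:
                out.append(MATRA_I + cluster)  # matra I goes before the whole cluster
                j += 1
            else:
                out.append(cluster)
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

sthiti = "\u0938\u094d\u0925\u093f\u0924\u093f"  # the word "sthiti" in logical order
print([f"U+{ord(c):04X}" for c in visual_order(sthiti)])
```

Note that the matra moves in front of the entire conjunct, not just the last consonant, which is why the handler has to identify cluster boundaries first.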

  37. Language Pack: LPK.DLL • Apply NLS settings (preferred digits) • Plaintext edit control • Calls to Uniscribe string handling • LPK.DLL is OS <> Uniscribe bridge

  38. [Diagram: Application → USER and GDI → LPK.DLL → Uniscribe]

  39. Summary • Callouts from GDI and USER • Performance issues • Split by script and direction • Script handlers • LPK.DLL

  40. 3. Details • Clusters • Caret placement and Mouse hits • Word breaking • Font metrics • Measuring text • Metafiles

  41. Clusters • Indivisible - Indian, Thai, Vietnamese • Divisible - Arabic

  42. Caret, mouse hits • For indivisible clusters • Arrow keys skip over clusters • Del deletes entire cluster • Backspace decomposes cluster one character at a time • Arrows and Mouse select whole clusters • Left click snaps to nearest boundary • For divisible clusters • Caret shows proportional position • Use system controls or query Uniscribe
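
The editing rules for indivisible clusters can be sketched as operations on cluster boundaries. This is a simplified model, not Uniscribe's: a cluster is assumed to be one base character plus any following combining marks (so virama conjuncts are not joined), and the function names are hypothetical.

```python
import unicodedata

def cluster_starts(text: str) -> list:
    """Indices where a cluster begins: every character that is not a combining mark."""
    return [
        i for i, ch in enumerate(text)
        if i == 0 or not unicodedata.category(ch).startswith("M")
    ]

def move_right(text: str, caret: int) -> int:
    """Right arrow: jump to the next cluster boundary, skipping over the marks."""
    later = [s for s in cluster_starts(text) if s > caret]
    return later[0] if later else len(text)

def delete_forward(text: str, caret: int) -> str:
    """Del: remove the entire cluster that begins at the caret."""
    return text[:caret] + text[move_right(text, caret):]

def backspace(text: str, caret: int):
    """Backspace: remove only the single character before the caret."""
    if caret == 0:
        return text, 0
    return text[:caret - 1] + text[caret:], caret - 1

# Thai: consonant KO KAI + vowel SARA II + tone mark MAI EK, then consonant NO NU.
thai = "\u0e01\u0e35\u0e48\u0e19"
print(cluster_starts(thai), move_right(thai, 0))
```

With this sample, Del at position 0 removes the consonant together with its vowel and tone mark, while Backspace at position 3 removes only the tone mark, matching the behaviour described on the slide.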

  43. Font metrics • Matching the body height

  44. Font metrics • Matching the ascender

  45. Font metrics • Matching the descender

  46. Matching fonts • When CS text is predominant • Full CS line spacing • Increase Western height • When Western text is predominant • Compromise line spacing • Accept some clipping • System edit control • Line spacing from single font • Richedit, HTML control • Line spacing adjusted for multiple fonts

  47. Measuring text • Adding characters can make text smaller

  48. Metafiles • Device independent • Store Unicode - Enhanced metafile • Use ExtTextOut(W) • Windows adjusts widths for different playback fonts • Device dependent • Avoid • Stores glyphs • Requires identical font for playback

  49. Summary • Caret placement and Mouse hits • Word breaking • Font metrics • Measuring text • Metafiles • Format with richedit, MSHTML

  50. Resources • Uniscribe - next talk • OpenType - later today • Win32 SDK • Richedit • RTF • messages • Text object model • HTML control • HTML • Document object model
