Software Globalization With Windows 2000/XP

Presentation Transcript


  1. Software Globalization With Windows 2000/XP • Houman Pournasseh, Lead Program Manager

  2. Agenda • Definitions • Why invest in World-Ready products? • Globalization – step-by-step • Universal encoding - Unicode • Locale aware • Handle different input methods • Complex script aware • Font independency • Multi-lingual UI aware • Mirroring aware • Conclusion & References

  4. Definitions • World-Ready: Properly globalized and localizable. • Globalization: The process of designing and implementing source code so that it can accommodate any local market (locale) or script. • Localizability: Designing software code and resources such that resources can be localized for any local market (locale) without changing the source code. • Localization: The process of adapting a product (including both text and non-text elements) to meet the language, cultural, and political expectations and/or requirements of a specific local market (locale).

  5. Users and Locales • To define formatting for date, time…, users set the user locale • To define their geographical location, users set the location • To select a UI language, users set the UI language • To run legacy applications (non-Unicode), users set the system locale • To enter text in different languages, users set the input locale

  6. New to Windows XP • Nine (9) new locales added to previous list of 126. • Punjabi, Gujarati, Telugu, Kannada, Kyrgyz, Mongolian (Cyrillic), Galician, Divehi, Syriac • New Indic and Arabic scripts • Gujarati, Gurmukhi, Telugu, Kannada, Syriac, Divehi • More robust font display for East Asian languages. • Improved Regional Settings options. • Largely improved MUI support • New location (GEO) • Support for GB18030

  8. Why invest in World Ready products? • Get into international market (World Wide Web era) • Create a single functionality binary to: • Reduce development effort and cost • Ease support and maintenance pain • Sim-ship and avoid being your own competitor

  10. Transforms of Unicode • UTF-7: 7 bit transformation format (rare) • UTF-8 • 8 bit transformation format • For transmission over unknown lines: e.g. Web pages • Codepage number CP_UTF8 = 65001 • UTF-16 and UCS-2 • Microsoft uses UTF-16 little-endian as its standard for Unicode encoding • UTF-32 and UCS-4
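To make the UTF-8 transform concrete, here is a minimal sketch in portable C of how a single code point maps to one to four bytes. The function name `utf8_encode` is hypothetical and is not part of Win32; on Windows the conversion itself would normally go through `WideCharToMultiByte` with `CP_UTF8`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * Writes up to 4 bytes into out and returns the byte count,
 * or 0 if cp is a surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                       /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)      /* surrogates are not characters */
        return 0;
    if (cp < 0x10000) {                    /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes: 11110xxx + three trailers */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

ASCII stays one byte per character, which is why UTF-8 passes safely through byte-oriented channels such as web servers.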

  11. Windows 2000/XP: Unicode & Single Binary • Built-in support for hundreds of languages • A (well-behaved) Win32 application in any language can run on any language version of Windows 2000/XP • Native Unicode support for new scripts • Support for supplementary characters
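Supplementary characters are code points above U+FFFF, which UTF-16 stores as a pair of 16-bit units (a surrogate pair). The following is a rough sketch in portable C; `utf16_encode` is a hypothetical name, not a Windows API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-16.
 * Returns the number of 16-bit units written (1 or 2),
 * or 0 if cp is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)   /* reserved for surrogates */
        return 0;
    if (cp < 0x10000) {                 /* BMP: one unit, stored as-is */
        out[0] = (uint16_t)cp;
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

The practical consequence is that one user-visible character can occupy two `WCHAR` elements, so code should not assume that one array element equals one character.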

  12. Unicode Encoding • The behavior of non-Unicode applications depends on the user's settings (the system locale), which makes data exchange between OS language versions impossible.

  13. Legacy systems support • Few exceptions for not fully Unicode apps: • App has to run on Win9x and NT • Existing Internet protocols and standards require special encoding • Supporting apps that need to run on Win9x • Create two separate binaries: one ANSI & one Unicode • Register as ANSI and internally convert to/from Unicode as needed • Use MSLU (Microsoft Layer for Unicode)!

  14. Data types • For 8 bit and double-byte characters:
      typedef char CHAR;             // 8 bit character
      typedef char *LPSTR;           // pointer to 8 bit string
      • For Unicode (“Wide”) characters:
      typedef unsigned short WCHAR;  // 16 bit character
      typedef WCHAR *LPWSTR;         // pointer to 16 bit string
      • Generic types: TCHAR maps to wchar_t (Unicode) or char (ANSI); LPTSTR maps to wchar_t * (Unicode) or char * (ANSI)

  15. Win32 API prototypes • Generic function prototypes:
      // winuser.h
      #ifdef UNICODE
      #define SetWindowText SetWindowTextW
      #else
      #define SetWindowText SetWindowTextA
      #endif // UNICODE
      • A routines behavior under Windows 2000/XP • W routines behavior under Win9x

  16. String manipulation functions and macros
      Generic CRT   | 8 bit codepage | Unicode
      _tcscpy       | strcpy         | wcscpy
      _tcscmp       | strcmp         | wcscmp
      Compile with -D_UNICODE to get Unicode version
      Generic Win32 | 8 bit codepage | Unicode
      lstrcpy       | lstrcpyA       | lstrcpyW
      lstrcmp       | lstrcmpA       | lstrcmpW
      Compile with -DUNICODE to get Unicode version
      • Text macro:
      #ifdef UNICODE
      #define TEXT(string) L##string
      #else
      #define TEXT(string) string
      #endif // UNICODE
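The generic-text mechanism can be imitated in portable C to show how one source file yields either build. This is only a sketch: `MY_UNICODE`, `my_tchar`, `MY_TEXT` and `my_tcslen` are hypothetical stand-ins for `UNICODE`, `TCHAR`, `TEXT` and `_tcslen`, chosen so the example compiles without the Windows headers.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical switch standing in for -DUNICODE / -D_UNICODE. */
#define MY_UNICODE 1

#ifdef MY_UNICODE
typedef wchar_t my_tchar;          /* wide build */
#define MY_TEXT_(s) L##s           /* paste the L prefix onto the literal */
#define my_tcslen   wcslen
#else
typedef char my_tchar;             /* 8 bit build */
#define MY_TEXT_(s) s
#define my_tcslen   strlen
#endif

/* Two-level macro so that an argument which is itself a macro expands first. */
#define MY_TEXT(s) MY_TEXT_(s)

/* The same source line works in both builds. */
size_t title_length(void)
{
    const my_tchar *title = MY_TEXT("World-Ready");
    return my_tcslen(title);       /* length in characters, not bytes */
}
```

Note the token-pasting operator `##`: a single `#` would stringify the argument instead of attaching the `L` prefix.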

  17. Unicode ↔ ANSI • Converting between ANSI and Unicode • MultiByteToWideChar for codepage → Unicode • WideCharToMultiByte for Unicode → codepage • CP can be any legal codepage number or a predefined value such as CP_ACP, CP_SYMBOL, CP_UTF8, etc. • Tips for writing Unicode: • Use generic data types and function prototypes • Replace p++/p-- with CharNext/CharPrev • Compute buffer sizes in TCHAR
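The tip about computing buffer sizes in characters is easy to get wrong once a character is wider than one byte. A small sketch in portable C, using `wchar_t` in place of `TCHAR`; `ARRAY_COUNT` and `copy_wide` are hypothetical helpers written for this illustration.

```c
#include <stddef.h>
#include <wchar.h>

/* Number of elements (characters) in a fixed-size array, not bytes. */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: copy src into dst, where dst_count is the
 * capacity of dst in characters. Always null-terminates and returns
 * the number of characters copied (excluding the terminator). */
size_t copy_wide(wchar_t *dst, size_t dst_count, const wchar_t *src)
{
    size_t n = 0;

    if (dst_count == 0)
        return 0;
    while (n + 1 < dst_count && src[n] != L'\0') {
        dst[n] = src[n];
        n++;
    }
    dst[n] = L'\0';
    return n;
}
```

Passing `sizeof(buf)` instead of `ARRAY_COUNT(buf)` would overstate the capacity by a factor of `sizeof(wchar_t)` and allow writes past the end of the buffer.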

  18. Demo! Porting an ANSI application to Unicode

  19. Encodings in Web pages • ANSI codepages or ISO character encodings • Mono-lingual or restricted to one script • Raw Unicode: UTF-16 • OK for Windows NT networks • Number entities: &#2325; (displays as क) • OK for occasional use • UTF-8: Recommended encoding • Supported by IE 4.0+ and Netscape 4.0+
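A number entity is simply the decimal code point wrapped in `&#` and `;`. As a small sketch in portable C (the helper name `numeric_entity` is hypothetical), the Devanagari letter क, U+0915, becomes `&#2325;`:

```c
#include <stdio.h>

/* Hypothetical helper: write the decimal numeric character reference
 * for code point cp into buf. Returns the length written,
 * or -1 if buf is too small. */
int numeric_entity(unsigned long cp, char *buf, size_t size)
{
    int n = snprintf(buf, size, "&#%lu;", cp);

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Because each character costs several ASCII bytes, this form suits an occasional foreign character in an otherwise single-codepage page rather than whole documents.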

  20. Setting web encoding • HTML/DHTML: Tag in the head of the document: <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=<value>"> • XML: <?xml version="1.0" encoding="<value>"?> • ASP: Specify the encoding using ASP directives (a numeric codepage, e.g. 65001 for UTF-8): • Per session: <%Session.CodePage=<codepage>%> • Per page: <%@CODEPAGE=<codepage>%>

  21. Setting encodings for .NET • Namespace: System.Text • Distinction between File, Request, and Response encodings • In code: Response.ContentEncoding=<value> • In page directive: <%@Page ResponseEncoding="<value>"%> • In configuration file: <globalization requestEncoding="<value>" responseEncoding="<value>" fileEncoding="<value>" />

  22. Universally encoded page

  24. Windows 2000/XP: NLS • NLS APIs allow you to automatically adjust to users' formatting preferences: • Date: 07/04/01 is 平成 13年7月4日 in Japan • Time: 9:00PM is 21:00 in France • Currency: $1,000.00 is 1.000,00 $ in Germany • Large Numbers: 123,456,789.00 is 12,34,56,789.00 in Hindi • Sort Order: in German, ä sorts after a; in Swedish, ä sorts after z
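The large-number example shows why grouping cannot be hard-coded as "every three digits". The sketch below, in portable C, applies a grouping rule supplied by the caller; `format_grouped` and its parameters are hypothetical, and in a real Windows application the NLS number-formatting API (`GetNumberFormat`) would supply and apply the user's rule instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: format n with digit grouping.
 * first is the size of the rightmost group, rest the size of every
 * group to its left, sep the separator character.
 * Returns 0 on success, -1 on bad arguments or if out is too small. */
int format_grouped(unsigned long n, int first, int rest, char sep,
                   char *out, size_t size)
{
    char tmp[64];            /* digits and separators, built in reverse */
    size_t len = 0;
    int in_group = 0;
    int group = first;

    if (first < 1 || rest < 1)
        return -1;
    do {
        if (in_group == group) {   /* current group is full: start a new one */
            tmp[len++] = sep;
            in_group = 0;
            group = rest;
        }
        tmp[len++] = (char)('0' + n % 10);
        n /= 10;
        in_group++;
    } while (n != 0);

    if (len + 1 > size)
        return -1;
    for (size_t i = 0; i < len; i++)   /* reverse into the output buffer */
        out[i] = tmp[len - 1 - i];
    out[len] = '\0';
    return 0;
}
```

Calling it with groups of 3 and 3 gives the Western form, while 3 and 2 reproduces the Hindi form from the slide; only the locale data changes, not the code.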

  25. Locale awareness • LCID layout (32 bits): Reserved (12 bits) | Sort ID (4 bits) | Language ID (16 bits), where Language ID = Sub-language (6 bits) | Primary language (10 bits) • Eliminate implicit locale assumptions from code, such as this macro, which assumes unaccented Latin letters:
      #define ToUpper(ch) \
          ((ch)<='Z' ? (ch) : (ch)+'A' - 'a')
      • Query system to format locale-dependent data using NLS APIs and LCIDs.
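The LCID bit layout can be checked with a few shifts and masks. The functions below are hypothetical portable re-implementations of what the Windows SDK macros `MAKELANGID` and `MAKELCID` compute, written out only to illustrate the layout; real code should use the SDK macros.

```c
#include <stdint.h>

/* Language ID: sub-language in the top 6 bits, primary language in the low 10. */
uint16_t make_langid(uint16_t primary, uint16_t sub)
{
    return (uint16_t)(((sub & 0x3F) << 10) | (primary & 0x3FF));
}

/* LCID: sort ID in bits 16..19, language ID in bits 0..15, the rest reserved. */
uint32_t make_lcid(uint16_t langid, uint16_t sort_id)
{
    return ((uint32_t)(sort_id & 0xF) << 16) | langid;
}

/* Field extractors, the inverse of the two functions above. */
uint16_t primary_lang(uint16_t langid)   { return (uint16_t)(langid & 0x3FF); }
uint16_t sub_lang(uint16_t langid)       { return (uint16_t)(langid >> 10); }
uint16_t langid_from_lcid(uint32_t lcid) { return (uint16_t)(lcid & 0xFFFF); }
uint16_t sort_from_lcid(uint32_t lcid)   { return (uint16_t)((lcid >> 16) & 0xF); }
```

For example, primary language 0x09 (English) with sub-language 0x01 (US) and the default sort gives 0x0409, and primary language 0x11 (Japanese) with sub-language 0x01 gives 0x0411, which is 1041 in decimal.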

  26. NLS APIs: Getting and setting locales • Querying locales • LCID GetSystemDefaultLCID() • EnumSystemLocales • LCID GetUserDefaultLCID() • LCID GetThreadLocale() • Setting locales • BOOL SetThreadLocale(LCID dwNewLocale) • BOOL SetLocaleInfo(LCID, …) // Works for standard locales only! • No APIs to set System locale, User locale, and UI language

  27. NLS APIs: Querying locale information • To retrieve information specific to a given locale: GetLocaleInfo • Gives information for any valid locale (takes an LCID). • LCTYPE input tells type of info to retrieve for a given locale (e.g. currency symbol, name of months…). • Returns info in string buffer (LPTSTR). • To retrieve information specific to a location: GetGeoInfo • Gives information for any valid location (takes a GEOID). • SYSGEOTYPE input tells type of info to retrieve for a given location (e.g. LCID, Time zones…).

  28. NLS APIs: Formatting data • To enumerate formats: • EnumCalendarInfo(Ex) • EnumDateFormats • EnumTimeFormats • To format data directly: • GetCurrencyFormat • GetDateFormat • GetTimeFormat

  29. String comparison • Locale-dependent comparison: • lstrcmp or lstrcmpi • Locale-independent comparison: • Windows 2000 and below:
      Locale = MAKELCID(MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US), SORT_DEFAULT);
      CompareString(Locale, …, …, …, …, …);
      • Windows XP:
      CompareString(LOCALE_INVARIANT, …, …, …, …, …);

  30. Demo! A locale aware application

  31. Locales in web pages • Defaults to the user locale • Supported by IE4.x and Netscape 4.x • A server variable that can be retrieved by: Request.ServerVariables("HTTP_ACCEPT_LANGUAGE") • A property of the navigator object: navigator.userLanguage

  32. Locale awareness in web pages • To retrieve user locale: • A server variable: Request.ServerVariables("HTTP_ACCEPT_LANGUAGE") • A property of the navigator object: navigator.userLanguage • To set a locale: • In DHTML:
      SetLocale("de")
      DateData = FormatDateTime(now(), vbShortDate)
      • In ASP:
      <% Session.LCID = 1041 %>
      <% Response.Write( FormatDateTime(dtNow) ) %>
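The HTTP_ACCEPT_LANGUAGE value is a comma-separated list of language tags with optional quality weights, for example `de-AT,de;q=0.8,en;q=0.5`. The sketch below, in portable C, extracts only the first tag; `first_language_tag` is a hypothetical helper, and it assumes the browser lists its preferred language first, which is common but not guaranteed (a fuller parser would compare the `q` values).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the first language tag of an
 * Accept-Language header value into out.
 * Returns the tag length, or 0 if there is no tag or out is too small. */
size_t first_language_tag(const char *header, char *out, size_t size)
{
    size_t len;

    header += strspn(header, " \t");     /* skip leading whitespace */
    len = strcspn(header, ",; \t");      /* tag ends at ',' or ';' or space */
    if (len == 0 || len + 1 > size)
        return 0;
    memcpy(out, header, len);
    out[len] = '\0';
    return len;
}
```

The resulting tag, such as `de-AT`, can then be mapped to a locale for formatting dates, numbers and currency on the server.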

  33. Locale awareness in .NET • Namespace: System.Globalization • Referenced as CultureInfo: a set of preferences based on language and culture. Pattern: xx-XX, such as fr-CA, de-AT (RFC 1766) • Setting the CultureInfo: • Implicit: Picked up from User Locale • Explicit: • In code: Thread.CurrentThread.CurrentCulture = new CultureInfo("de-DE") • In page directive: <%@Page Culture="<value>"%> • In config: <globalization culture="<value>" />

  34. Demo! Locale aware web site

  36. Handling Input methods • Easiest: Using edit controls (recommended) • Responding directly to user input • Input locales (language + input method): HKL • GetKeyboardLayout • ActivateKeyboardLayout • LoadKeyboardLayout • Windows messages: • WM_INPUTLANGCHANGEREQUEST • WM_INPUTLANGCHANGE • WM_IME_* (for IME support only) • WM_CHAR and WM_IME_CHAR

  38. Windows 2000/XP: Complex Scripts • Complex Scripts have one or more of the following attributes: • Bi-directional (BiDi) reordering (Arabic, Hebrew) • Contextual shaping (Arabic, Indic family) • Display of combining characters (Arabic, Thai, Indic) • Specialized word-breaking (Thai) • Text Justification (Arabic)

  39. Complex Scripts: BiDi reordering

  40. Complex Scripts: Contextual Shaping

  41. Complex Scripts: Combining Characters

  42. Complex Scripts: Justification

  43. Uniscribe • Clients: Windows 2000/XP, Trident, Microsoft Office 2000/XP • A collection of exported APIs (high and low level) • Hides implementation details • A shaping engine per language • Diagram components: Application, LPK.DLL, USER, GDI, USP

  44. Options to display text • Plain text in application • Standard edit control or • Win32 API (ExtTextOut / DrawText). • Simple formatted text • In Win32 apps, use Richedit control. • For Web pages, use Document Object Model (DHTML). • Advanced formatting • Use Uniscribe (see SDK and MSJ article).

  45. Special considerations • When dealing with BiDi, set RTL reading order and alignment • SetTextAlign / GetTextAlign with TA_RIGHT • ExtTextOut with ETO_RTLREADING • DrawText with DT_RTLREADING • To measure line lengths: • Do not sum cached character widths • Do use a GetTextExtent function or Uniscribe • When displaying typed text: • Do not output characters one at a time! • Do save text in a buffer and display the whole string with Uniscribe or Win32 API

  47. Windows 2000/XP: Font support • Introduction of OpenType fonts: • Extended TTF with glyphs for PE, ME, Thai, Greek, Turkish, Cyrillic… • Font fallback mechanism for complex scripts and East Asian scripts, used by Uniscribe • Font linking mechanism used by GDI

  48. Font independency: Win32 programming • Not to do: • Hard code font face names • Assume a given font is installed • Assume selected font supports the desired script • To do: • Use MS Shell Dlg face name in Dialog resources • Use EnumFontFamiliesEx or ChooseFont to select fonts

  49. Font independency: In Web pages • Avoid placing text formatting values into in-line style:
      <span style="font-size: 10pt; font-family: Arial;"> Hello </span>
      • Declare text style in CSS files:
      <style> .myStyle {font-size: 10pt; font-family: Arial;} </style>
      <span class="myStyle"> Hello </span>
      • Use WEFT to embed fonts to your web pages (IE only): http://www.microsoft.com/typography/web/default.htm
