Globalization Features in Whidbey’s CLR

Michael Kaplan

Technical Lead

Globalization Infrastructure, Fonts and Tools

Microsoft Windows International Division

http://blogs.msdn.com/michkap

April 25, 2005



Customized Cultures and Regions

  • CultureAndRegionInfoBuilder class

    • Create an override to an existing culture

    • Create based on an existing culture

    • Create from scratch

  • Must be an administrator to register

  • Can register the file on multiple machines


CultureAndRegionInfoBuilder sample

// Requires: using System.Globalization; and a reference to sysglobl.dll
CultureAndRegionInfoBuilder carib = new CultureAndRegionInfoBuilder("de-DE-MineMine", CultureAndRegionModifiers.None);

// Load all of the existing data for German and for Germany
carib.LoadDataFromCultureInfo(new CultureInfo("de-DE", false));
carib.LoadDataFromRegionInfo(new RegionInfo("de"));

// Change a property
carib.ThreeLetterISORegionName = "ZZZ";

// Register the culture on the machine (requires administrator rights)
carib.Register();

// Use the new culture
CultureInfo ci = new CultureInfo("de-DE-MineMine");


CaRIB serialization with LDML

  • Locale Data Markup Language

  • Described in UTS#35 at http://unicode.org/reports/tr35/

  • CaRIB objects can be saved as LDML files

  • Data can be loaded from LDML files

  • CaRIB will do its best with files it did not create


LDML Sample

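// Requires: using System.IO; using System.Globalization;

// Reserve a unique temporary file name, then delete the empty file so Save can create it
string file1 = Path.GetTempFileName();
File.Delete(file1);

// Culture and region data may come from different sources
CultureInfo ci = new CultureInfo("ar-EG");
RegionInfo ri = new RegionInfo("de-DE");

CultureAndRegionInfoBuilder carib = new CultureAndRegionInfoBuilder("x-en-US-Pepsi", CultureAndRegionModifiers.None);
carib.LoadDataFromCultureInfo(ci);
carib.LoadDataFromRegionInfo(ri);

// Save to LDML, load it back, and register the result
carib.Save(file1);
carib = CultureAndRegionInfoBuilder.CreateFromLdml(file1);
carib.Register();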


When Windows knows more than .NET

  • As of XPSP2, there are 25 new locales in Windows:

    • Bengali - India

    • Croatian - Bosnia and Herzegovina

    • Bosnian - Bosnia and Herzegovina

    • Serbian - Bosnia and Herzegovina (Latin)

    • Serbian - Bosnia and Herzegovina (Cyrillic)

    • Welsh - United Kingdom (more info in English, in Welsh)

    • Maori - New Zealand

    • Malayalam - India

    • Maltese - Malta

    • Quechua - Bolivia

    • Quechua - Ecuador

    • Quechua - Peru

    • Setswana / Tswana - South Africa

    • isiXhosa / Xhosa - South Africa

    • isiZulu / Zulu - South Africa

    • Sesotho sa Leboa / Northern Sotho - South Africa

    • Northern Sami - Norway

    • Northern Sami - Sweden

    • Northern Sami - Finland

    • Lule Sami - Norway

    • Lule Sami - Sweden

    • Southern Sami - Norway

    • Southern Sami - Sweden

    • Skolt Sami - Finland

    • Inari Sami - Finland

  • There will be more in future service packs

  • In Longhorn, there will be 75 or more new locales


Windows-only Cultures

  • The solution: Windows-only cultures!

    • Synthesizes a CultureInfo object when Windows supports a locale that the .NET Framework does not know how to create itself


Windows only culture test

foreach (CultureInfo culture in CultureInfo.GetCultures(CultureTypes.WindowsOnlyCultures))
{
    Console.WriteLine(culture.Name);
}

// New cultures on XP SP2 include:
// mt-MT, bs-BA-Latn, smn-FI, smj-NO, smj-SE, sms-FI, sma-NO,
// sma-SE, quz-BO, quz-EC, quz-PE, ml-IN, bn-IN, cy-GB, and more


Special CultureInfo support for SQL Server 2005 (Yukon)

  • SQL Server locale semantics:

    • One setting for UI and formatting

    • Another setting for collation/encoding

  • .NET/Windows semantics

    • One setting for UI

    • Another setting for formatting/collation

  • Solution

    • Special GetCultureInfo overload that takes two CultureInfo names for the two SQL Server settings
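
A minimal sketch of the two-name call, assuming it is the `CultureInfo.GetCultureInfo(string name, string altName)` overload. The class name `SqlStyleCultureDemo`, the helper `Create`, and the culture names are illustrative choices, not taken from the slides.

```csharp
using System;
using System.Globalization;

public static class SqlStyleCultureDemo
{
    // Hypothetical helper: builds one CultureInfo from two culture names.
    public static CultureInfo Create(string formattingName, string collationName)
    {
        // The first name supplies the UI and formatting data; the second
        // supplies the TextInfo and CompareInfo (casing and collation) data.
        return CultureInfo.GetCultureInfo(formattingName, collationName);
    }

    public static void Main()
    {
        CultureInfo ci = Create("en-US", "de-DE");

        // Formatting follows the first name
        Console.WriteLine(ci.DateTimeFormat.ShortDatePattern);

        // Collation and casing follow the second name
        Console.WriteLine(ci.CompareInfo.Name);
        Console.WriteLine(ci.TextInfo.CultureName);
    }
}
```

The returned object is a cached, read-only instance, so it suits lookups like this one rather than per-request customization.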


How Yukon uses this support

  • Microsoft.ReportingServices.Diagnostics.Localization

    • CatalogCulture

    • ClientPrimaryCulture

    • DefaultReportServerCulture

    • FallbackUICulture

    • InstalledCultureNames

    • ReportParameterCulture

    • SqlCulture


New locale properties/methods

  • TextInfo

    • CultureName

    • LCID

  • CompareInfo

    • Name

  • DateTimeFormatInfo

    • ShortestDayNames

    • MonthGenitiveNames

    • AbbreviatedMonthGenitiveNames

  • NumberFormatInfo

    • NativeDigits

    • DigitSubstitution

  • CultureInfo

    • IsCustomCulture

    • IetfLanguageTag

    • CultureTypes

    • GetCultureInfo()

    • GetCultureInfoByIetfLanguageTag()

  • RegionInfo

    • GeoId

    • NativeName

    • CurrencyEnglishName

    • (Can now create via full culture names)
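
As a rough illustration, the sketch below reads several of the properties listed above. The cultures (`ru-RU`, `ar-EG`, `de-DE`) and the class name `LocalePropertiesDemo` are arbitrary choices for the example, and the printed values depend on the culture data installed on the machine.

```csharp
using System;
using System.Globalization;

public static class LocalePropertiesDemo
{
    public static void Main()
    {
        // DateTimeFormatInfo: genitive month names and shortest day names
        CultureInfo ru = CultureInfo.GetCultureInfo("ru-RU");
        DateTimeFormatInfo dtf = ru.DateTimeFormat;
        Console.WriteLine(dtf.MonthNames[0]);
        Console.WriteLine(dtf.MonthGenitiveNames[0]);
        Console.WriteLine(string.Join(" ", dtf.ShortestDayNames));

        // NumberFormatInfo: native digits and digit substitution
        CultureInfo ar = CultureInfo.GetCultureInfo("ar-EG");
        Console.WriteLine(string.Join(" ", ar.NumberFormat.NativeDigits));
        Console.WriteLine(ar.NumberFormat.DigitSubstitution);

        // TextInfo and CompareInfo now expose the culture they belong to
        Console.WriteLine(ar.TextInfo.CultureName);
        Console.WriteLine(ar.CompareInfo.Name);

        // CultureInfo: IETF language tag and culture types
        Console.WriteLine(ru.IetfLanguageTag);
        Console.WriteLine(ru.CultureTypes);

        // RegionInfo can be created from a full culture name
        RegionInfo region = new RegionInfo("de-DE");
        Console.WriteLine(region.NativeName);
        Console.WriteLine(region.CurrencyEnglishName);
        Console.WriteLine(region.GeoId);
    }
}
```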


Updates to encodings

  • Now built into the BCL

    • Improved performance

    • More flexibility

    • Consistent results across supported platforms

  • Encoding enumeration API

  • UTF-32 support (little endian and big endian)

  • UTF-16 big endian support

  • Encoding/decoding fallbacks

    • Exception

    • Replacement

    • “Best fit”

    • Custom
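
The built-in fallbacks and the other encoding additions can be sketched as follows. This assumes the `Encoding.GetEncoding(name, encoderFallback, decoderFallback)` overload. The class name `FallbackDemo`, its helper methods, and the sample text are made up for the example.

```csharp
using System;
using System.Text;

public static class FallbackDemo
{
    // Replacement fallback: unencodable characters become the given string.
    public static string EncodeWithReplacement(string text, string replacement)
    {
        Encoding ascii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback(replacement),
            new DecoderReplacementFallback(replacement));
        byte[] bytes = ascii.GetBytes(text);
        return ascii.GetString(bytes);
    }

    // Exception fallback: an unencodable character raises EncoderFallbackException.
    public static bool IsPureAscii(string text)
    {
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(EncodeWithReplacement("caf\u00E9", "?")); // prints "caf?"
        Console.WriteLine(IsPureAscii("caf\u00E9"));                // prints "False"

        // Encoding enumeration API
        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            Console.WriteLine("{0}: {1}", info.CodePage, info.Name);
        }

        // UTF-32 in both byte orders (first argument selects big endian)
        byte[] le = new UTF32Encoding(false, false).GetBytes("A");
        byte[] be = new UTF32Encoding(true, false).GetBytes("A");
        Console.WriteLine(BitConverter.ToString(le)); // prints "41-00-00-00"
        Console.WriteLine(BitConverter.ToString(be)); // prints "00-00-00-41"
    }
}
```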


// Requires: using System; using System.Text;

// Custom encoder fallback that replaces unencodable characters with numeric entities
public class NumericEntitiesFallback : EncoderFallback {
    public override EncoderFallbackBuffer CreateFallbackBuffer() {
        return new NEFallbackBuffer();
    }

    // The longest entity is "&#1114111;" (U+10FFFF), which is 10 characters
    public override int MaxCharCount {
        get {
            return 10;
        }
    }
}

public class NEFallbackBuffer : EncoderFallbackBuffer {
    // Store the entity string for the character being replaced
    private String strEntity;
    // Characters left to return; negative means no fallback is in progress
    int fallbackCount = -1;
    // Index of the next character of strEntity to return
    int fallbackIndex = 0;

    // Fallback methods
    public override bool Fallback(char charUnknown, int index) {
        // If we had a buffer already we're being recursive, throw,
        // it's probably at the suspect character in our array.
        if (fallbackCount >= 0)
            ThrowLastCharRecursive(unchecked((int)charUnknown));

        // Go ahead and get our fallback
        strEntity = String.Format("&#{0};", (int)charUnknown);
        fallbackCount = strEntity.Length;
        fallbackIndex = 0;
        return fallbackCount != 0;
    }

    public override bool Fallback(char charUnknownHigh, char charUnknownLow, int index) {
        // Double check input surrogate pair
        if (!Char.IsHighSurrogate(charUnknownHigh))
            throw new ArgumentOutOfRangeException("charUnknownHigh",
                "supposed to be between 0xD800 and 0xDBFF");
        if (!Char.IsLowSurrogate(charUnknownLow))
            throw new ArgumentOutOfRangeException("charUnknownLow",
                "supposed to be between 0xDC00 and 0xDFFF");

        // If we had a buffer already we're being recursive, throw, it's
        // probably at the suspect character in our array.
        if (fallbackCount >= 0)
            ThrowLastCharRecursive(Char.ConvertToUtf32(charUnknownHigh, charUnknownLow));

        // Go ahead and get our fallback
        strEntity = String.Format("&#{0};", Char.ConvertToUtf32(charUnknownHigh, charUnknownLow));
        fallbackCount = strEntity.Length;
        fallbackIndex = 0;
        return fallbackCount != 0;
    }

    public override char GetNextChar() {
        // We want it to get < 0 because == 0 means that the current/last
        // character is a fallback and we need to detect recursion. We
        // could have a flag but we already have this counter.
        fallbackCount--;

        // Do we have anything left? 0 is now last fallback char, negative
        // is nothing left
        if (fallbackCount < 0)
            return (char)0;

        // Need to get it out of the buffer.
        return strEntity[fallbackIndex++];
    }

    public override bool MovePrevious() {
        // Back up only if a fallback is in progress and a character was already returned
        if (fallbackCount >= 0 && fallbackIndex > 0) {
            fallbackCount++;
            fallbackIndex--;
            return true;
        }
        return false;
    }

    public override int Remaining {
        get {
            return (fallbackCount < 0) ? 0 : fallbackCount;
        }
    }

    // Private helper methods
    private void ThrowLastCharRecursive(int charRecursive) {
        // Throw it, using our complete character
        throw new ArgumentException(
            String.Format("Last character \\u{0:X4} was a recursive fallback",
                charRecursive), "chars");
    }
}


Collation Improvements

  • OrdinalIgnoreCase

    • Same results as ToUpper/Ordinal

    • Matches OS file system results

  • Correct Serbian collation

    • Fixed in Windows XPSP2

    • Customer reported (MSDN Feedback Center)

  • Better handling of ignored/ignorable characters

    • IndexOf/LastIndexOf/IsPrefix/IsSuffix

    • StartsWith/EndsWith, too


OrdinalIgnoreCase sample

string strTest1 = "IamAString";
string strTest2 = "STRING";

// Case-insensitive ordinal comparison: "String" matches "STRING"
if (strTest1.EndsWith(strTest2, StringComparison.OrdinalIgnoreCase)) {
    Console.WriteLine("Successful test!");
}


Unicode normalization

  • Described in UAX#15 at http://www.unicode.org/reports/tr15/

  • String.IsNormalized()

  • String.IsNormalized(NormalizationForm normalizationForm)

  • String.Normalize()

  • String.Normalize(NormalizationForm normalizationForm)

  • NormalizationForm enumeration

  • FormC, FormD, FormKC, FormKD

  • õĥµ¨ (U+00f5 U+0068 U+0302 U+00b5 U+00a8): LATIN SMALL LETTER O WITH TILDE; LATIN SMALL LETTER H; COMBINING CIRCUMFLEX ACCENT; MICRO SIGN; DIAERESIS

  • FormC: õĥµ¨ (U+00f5 U+0125 U+00b5 U+00a8)

  • FormD: õĥµ¨ (U+006f U+0303 U+0068 U+0302 U+00b5 U+00a8)

  • FormKC: õĥμ ̈ (U+00f5 U+0125 U+03bc U+0020 U+0308)

  • FormKD: õĥμ ̈ (U+006f U+0303 U+0068 U+0302 U+03bc U+0020 U+0308)

  • In collation, õĥµ¨ ≅ õĥµ¨ ≅ õĥμ ̈ ≅ õĥμ ̈
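
The four forms listed above can be reproduced with a short sketch like the one below. The class name `NormalizationDemo` and the helper `CodePoints` are hypothetical names for the example; the input is the five-code-point string from the slide, written with escapes so the source file's own encoding cannot alter it.

```csharp
using System;
using System.Text;

public static class NormalizationDemo
{
    // Formats each UTF-16 code unit of s as U+XXXX, separated by spaces.
    public static string CodePoints(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.AppendFormat("U+{0:X4}", (int)c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // o-tilde, h, combining circumflex, micro sign, diaeresis
        string input = "\u00F5\u0068\u0302\u00B5\u00A8";

        NormalizationForm[] forms = {
            NormalizationForm.FormC, NormalizationForm.FormD,
            NormalizationForm.FormKC, NormalizationForm.FormKD
        };

        foreach (NormalizationForm form in forms)
        {
            Console.WriteLine("{0}: {1}", form, CodePoints(input.Normalize(form)));
        }
        // FormC: U+00F5 U+0125 U+00B5 U+00A8
        // FormD: U+006F U+0303 U+0068 U+0302 U+00B5 U+00A8
        // FormKC: U+00F5 U+0125 U+03BC U+0020 U+0308
        // FormKD: U+006F U+0303 U+0068 U+0302 U+03BC U+0020 U+0308

        // The input is not in FormC because h + circumflex is not composed
        Console.WriteLine(input.IsNormalized(NormalizationForm.FormC)); // prints "False"
    }
}
```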


namespace àáâãäå {
    using System;
    using System.Text;
    using System.Globalization;

    class àáâãäå
    {
        [STAThread]
        static void Main(string[] args) {
            àáâãäå(); àáâãäå(); àáâãäå(); àáâãäå(); àáâãäå(); àáâãäå(); àáâãäå();
        }

        static void àáâãäå(string àáâãäå) {
            StringBuilder àáâãäå = new StringBuilder();
            StringInfo àáâãäå = new StringInfo(àáâãäå);
            àáâãäå.Append(àáâãäå.Normalize(NormalizationForm.FormC));
            àáâãäå.Append(": ");
            for(int àáâãäå=0; àáâãäå < àáâãäå.LengthInTextElements; àáâãäå++) {
                string àáâãäå = àáâãäå.SubstringByTextElements(àáâãäå, 1);
                if(àáâãäå.IsNormalized(NormalizationForm.FormC)) {
                    àáâãäå.Append("C");
                } else if(àáâãäå.IsNormalized(NormalizationForm.FormD)) {
                    àáâãäå.Append("D");
                } else {
                    àáâãäå.Append("_");
                }
            }
            Console.WriteLine(àáâãäå.ToString());
            return;
        }

        static void àáâãäå() {
            àáâãäå.àáâãäå("àáâãäå");
        }

        static void àáâãäå() {
            àáâãäå.àáâãäå("àáâãäå");
        }

        static void àáâãäå() {
            àáâãäå.àáâãäå("àáâãäå");
        }

        static void àáâãäå() {
            àáâãäå.àáâãäå("àáâãäå");
        }

        static void àáâãäå() {
            àáâãäå.àáâãäå("àáâãäå");
        }

        static void àáâãäå() {
            àáâãäå.àáâãäå("àáâãäå");
        }

        static void àáâãäå() {
            àáâãäå.àáâãäå("àáâãäå");
        }
    }
}


IDN Mapping APIs

  • IdnMapping class

  • Based on three RFCs (standard based on Unicode 3.2)

    • 3490 - Internationalizing Domain Names in Applications (IDNA)

    • 3491 - Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)

    • 3492 - Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)

  • \u5B89\u5BA4\u5948\u7F8E\u6075-with-SUPER-MONKEYS becomes xn---with-SUPER-MONKEYS-pc58ag80a8qai00g7n9n

  • Properties

    • AllowUnassigned (allows new Unicode characters)

    • UseStd3AsciiRules (more like DNS rules)

  • Methods

    • GetAscii - Gets ASCII (Punycode) version of the string

    • GetUnicode - Gets Unicode version of the string, normalized and limited to IDNA characters.
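
A small round-trip sketch using the members listed above. The domain `bücher.example` and the wrapper class `IdnDemo` are assumptions chosen for the example rather than anything from the slides.

```csharp
using System;
using System.Globalization;

public static class IdnDemo
{
    // Hypothetical wrapper: converts a Unicode domain name to its ASCII (Punycode) form.
    public static string ToAscii(string unicodeName)
    {
        IdnMapping idn = new IdnMapping();
        idn.UseStd3AsciiRules = true;   // apply the stricter, DNS-like rules
        idn.AllowUnassigned = false;    // reject code points unassigned in Unicode 3.2
        return idn.GetAscii(unicodeName);
    }

    public static void Main()
    {
        string ascii = ToAscii("b\u00FCcher.example");
        Console.WriteLine(ascii); // prints "xn--bcher-kva.example"

        // GetUnicode reverses the mapping
        string unicode = new IdnMapping().GetUnicode(ascii);
        Console.WriteLine(unicode == "b\u00FCcher.example"); // prints "True"
    }
}
```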


Unicode property information

  • New CharUnicodeInfo class

  • Extends methods on Char

  • Official data from the Unicode Character Database at http://www.unicode.org/ucd/

    • IsWhiteSpace

    • GetNumericValue

    • GetDigitValue

    • GetDecimalDigitValue

    • GetUnicodeCategory

    • GetBidiCategory
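
A brief sketch of the lookups, limited to the four `CharUnicodeInfo` methods that can be assumed to be public (`GetUnicodeCategory`, `GetNumericValue`, `GetDigitValue`, `GetDecimalDigitValue`). `IsWhiteSpace` and `GetBidiCategory` from the list above are left out because their public availability is not certain. The class name and sample characters are arbitrary.

```csharp
using System;
using System.Globalization;

public static class CharInfoDemo
{
    public static void Main()
    {
        // General category from the Unicode Character Database
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory('A')); // prints "UppercaseLetter"

        // U+00BD VULGAR FRACTION ONE HALF has a numeric value of one half
        // (the printed decimal separator depends on the current culture)
        Console.WriteLine(CharUnicodeInfo.GetNumericValue('\u00BD'));

        // U+00B2 SUPERSCRIPT TWO is a digit but not a decimal digit
        Console.WriteLine(CharUnicodeInfo.GetDigitValue('\u00B2'));        // prints "2"
        Console.WriteLine(CharUnicodeInfo.GetDecimalDigitValue('\u00B2')); // prints "-1"

        // U+0663 ARABIC-INDIC DIGIT THREE is a decimal digit
        Console.WriteLine(CharUnicodeInfo.GetDecimalDigitValue('\u0663')); // prints "3"
    }
}
```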


New text element support in the StringInfo class

  • StringInfo ctor that takes a string

  • StringInfo.String

  • StringInfo.LengthInTextElements

  • StringInfo.SubstringByTextElements()

  • LengthInTextElements and SubstringByTextElements() both use ParseCombiningCharacters() to get their results


New StringInfo props/methods sample

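// Two base letters, each followed by three combining marks
StringInfo si = new StringInfo("A\u0300\u0301\u0300e\u0300\u0301\u0300");

Console.WriteLine(si.LengthInTextElements); // Length is two

// Print each text element (base letter plus its combining marks)
for (int ich = 0; ich < si.LengthInTextElements; ich++) {
    Console.WriteLine(si.SubstringByTextElements(ich, 1));
}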


New supplementary character support in lots of methods

  • New signature -- (String s, int index)

  • IsControl, IsDigit, IsLetter, IsLetterOrDigit, IsLower, IsNumber, IsPunctuation, IsSeparator, IsSurrogate, IsSymbol, IsUpper, IsWhiteSpace, GetUnicodeCategory, GetNumericValue, IsHighSurrogate, IsLowSurrogate, IsSurrogatePair

  • ConvertToUtf32, ConvertFromUtf32 methods
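
To make the `(String s, int index)` signature concrete, here is a sketch using one supplementary character, U+10400 (DESERET CAPITAL LETTER LONG I). The choice of character and the class name `SupplementaryDemo` are assumptions for illustration.

```csharp
using System;

public static class SupplementaryDemo
{
    public static void Main()
    {
        // A supplementary character occupies two UTF-16 code units
        string s = Char.ConvertFromUtf32(0x10400);
        Console.WriteLine(s.Length);                   // prints "2"

        // The (string, index) overloads can look at the pair starting at index
        Console.WriteLine(Char.IsSurrogatePair(s, 0)); // prints "True"
        Console.WriteLine(Char.IsHighSurrogate(s, 0)); // prints "True"
        Console.WriteLine(Char.IsLowSurrogate(s, 1));  // prints "True"

        // Recover the full code point from the pair
        Console.WriteLine(Char.ConvertToUtf32(s, 0).ToString("X")); // prints "10400"

        // Classification of the whole character rather than one surrogate
        Console.WriteLine(Char.IsLetter(s, 0));
        Console.WriteLine(Char.GetUnicodeCategory(s, 0));
    }
}
```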


References

  • MSDN Magazine Article

    • Make the .NET World a Friendlier Place with the Many Faces of the CultureInfo Class, March 2005 - http://msdn.microsoft.com/msdnmag/issues/05/03/CultureInfo/

  • SQL Server Books Online

    • “International Considerations for SQL Server” - http://whidbey.msdn.microsoft.com/library/en-us/icsql9/html/50dc4fa8-4772-46a8-a8ef-bc134502b4e0.asp

  • My Blog

    • http://blogs.msdn.com/michkap

  • Some other blogs for int’l support in Whidbey

    • http://blogs.msdn.com/AchimR

    • http://www.dasblonde.net/

    • http://blogs.msdn.com/BCLTeam

  • Other useful sites

    • http://www.microsoft.com/globaldev/

    • http://lab.msdn.microsoft.com/productfeedback/

    • http://www.unicode.org/


Globalization Features in Whidbey’s CLR: Questions
