Globalization Features in Whidbey’s CLR

Michael Kaplan

Technical Lead

Globalization Infrastructure, Fonts and Tools

Microsoft Windows International Division

http://blogs.msdn.com/michkap

April 25, 2005

Customized Cultures and Regions
  • CultureAndRegionInfoBuilder class
    • Create an override to an existing culture
    • Create based on an existing culture
    • Create from scratch
  • Must be an administrator to register
  • Can register the file on multiple machines

CultureAndRegionInfoBuilder sample

CultureAndRegionInfoBuilder carib = new CultureAndRegionInfoBuilder("de-DE-MineMine", CultureAndRegionModifiers.None);

// Load up all of the existing data for German and for Germany
carib.LoadDataFromCultureInfo(new CultureInfo("de-DE", false));
carib.LoadDataFromRegionInfo(new RegionInfo("de"));

// Change a property
carib.ThreeLetterISORegionName = "ZZZ";

// Register the culture on the machine
carib.Register();

// Use the new culture
CultureInfo ci = new CultureInfo("de-DE-MineMine");

CaRIB serialization with LDML
  • Locale Data Markup Language
  • Described in UTS#35 at http://unicode.org/reports/tr35/
  • CaRIB objects can be saved as LDML files
  • Data can be loaded from LDML files
  • CaRIB will do its best with files it did not create

LDML Sample

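// Get a unique temp file name, then delete the empty file so Save() can create it
string file1 = Path.GetTempFileName();
File.Delete(file1);

// Culture data and region data may come from different sources
CultureInfo ci = new CultureInfo("ar-EG");
RegionInfo ri = new RegionInfo("de-DE");

CultureAndRegionInfoBuilder carib = new CultureAndRegionInfoBuilder("x-en-US-Pepsi", CultureAndRegionModifiers.None);
carib.LoadDataFromCultureInfo(ci);
carib.LoadDataFromRegionInfo(ri);

// Save the builder as an LDML file
carib.Save(file1);

// Load the LDML file back into a new builder and register it
carib = CultureAndRegionInfoBuilder.CreateFromLdml(file1);
carib.Register();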

When Windows knows more than .NET
  • As of XPSP2, there are 25 new locales in Windows:
    • Bengali - India
    • Croatian - Bosnia and Herzegovina
    • Bosnian - Bosnia and Herzegovina
    • Serbian - Bosnia and Herzegovina (Latin)
    • Serbian - Bosnia and Herzegovina (Cyrillic)
    • Welsh - United Kingdom
    • Maori - New Zealand
    • Malayalam - India
    • Maltese - Malta
    • Quechua - Bolivia
    • Quechua - Ecuador
    • Quechua - Peru
    • Setswana / Tswana - South Africa
    • isiXhosa / Xhosa - South Africa
    • isiZulu / Zulu - South Africa
    • Sesotho sa Leboa / Northern Sotho - South Africa
    • Northern Sami - Norway
    • Northern Sami - Sweden
    • Northern Sami - Finland
    • Lule Sami - Norway
    • Lule Sami - Sweden
    • Southern Sami - Norway
    • Southern Sami - Sweden
    • Skolt Sami - Finland
    • Inari Sami - Finland
  • There will be more in future service packs
  • In Longhorn, there will be 75 or more new locales

Windows-only Cultures
  • The solution: Windows-only cultures!
    • Synthesizes a CultureInfo object when Windows supports a locale that the .NET Framework does not know how to create itself

Windows-only culture test

foreach (CultureInfo culture in CultureInfo.GetCultures(CultureTypes.WindowsOnlyCultures))
{
    Console.WriteLine(culture.Name);
}

// New cultures on XP SP2 include:
// mt-MT, bs-BA-Latn, smn-FI, smj-NO, smj-SE, sms-FI, sma-NO,
// sma-SE, quz-BO, quz-EC, quz-PE, ml-IN, bn-IN, cy-GB, and more

Special CultureInfo support for SQL Server 2005 (Yukon)
  • SQL Server locale semantics:
    • One setting for UI and formatting
    • Another setting for collation/encoding
  • .NET/Windows semantics
    • One setting for UI
    • Another setting for formatting/collation
  • Solution
    • Special GetCultureInfo overload that takes two culture names, one for each of the two SQL Server settings
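
A minimal sketch of the two-name lookup described above, assuming it corresponds to the shipped CultureInfo.GetCultureInfo(string name, string altName) overload, where the second name supplies the TextInfo and CompareInfo. The class name, the helper name, and the en-US/sv-SE pairing are illustrative only.

```csharp
using System;
using System.Globalization;

public static class TwoNameCultureDemo
{
    // Hypothetical helper: formatting data comes from formatName,
    // TextInfo and CompareInfo come from collationName.
    public static CultureInfo Create(string formatName, string collationName)
    {
        return CultureInfo.GetCultureInfo(formatName, collationName);
    }

    public static void Main()
    {
        CultureInfo ci = Create("en-US", "sv-SE");

        // Name and date/number formatting follow the first culture.
        Console.WriteLine(ci.Name);
        Console.WriteLine(ci.DateTimeFormat.ShortDatePattern);

        // Collation and casing follow the second culture.
        Console.WriteLine(ci.CompareInfo.Name);
        Console.WriteLine(ci.TextInfo.CultureName);
    }
}
```

The returned object is read-only and cached, so it suits lookups on a hot path better than repeated construction.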

How Yukon uses this support
  • Microsoft.ReportingServices.Diagnostics.Localization
    • CatalogCulture
    • ClientPrimaryCulture
    • DefaultReportServerCulture
    • FallbackUICulture
    • InstalledCultureNames
    • ReportParameterCulture
    • SqlCulture

New locale properties/methods
  • TextInfo
    • CultureName
    • LCID
  • CompareInfo
    • Name
  • DateTimeFormatInfo
    • ShortestDayNames
    • MonthGenitiveNames
    • AbbreviatedMonthGenitiveNames
  • NumberFormatInfo
    • NativeDigits
    • DigitSubstitution
  • CultureInfo
    • IsCustomCulture
    • IetfLanguageTag
    • CultureTypes
    • GetCultureInfo()
    • GetCultureInfoByIetfLanguageTag()
  • RegionInfo
    • GeoId
    • NativeName
    • CurrencyEnglishName
    • (Can now create via full culture names)
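
A short sketch that reads several of the properties listed above. It assumes the ru-RU, ar-EG and de-DE cultures are available on the machine; the actual strings printed depend on the culture data of the operating system, so no output is shown. The class name is illustrative.

```csharp
using System;
using System.Globalization;

public static class LocalePropertiesDemo
{
    public static void Main()
    {
        CultureInfo ru = CultureInfo.GetCultureInfo("ru-RU");
        DateTimeFormatInfo dtf = ru.DateTimeFormat;

        // Nominative vs. genitive month name (used when a day number is present)
        Console.WriteLine(dtf.MonthNames[0]);
        Console.WriteLine(dtf.MonthGenitiveNames[0]);
        Console.WriteLine(dtf.AbbreviatedMonthGenitiveNames[0]);
        Console.WriteLine(string.Join(" ", dtf.ShortestDayNames));

        // Native digits and the digit substitution setting
        NumberFormatInfo nfi = CultureInfo.GetCultureInfo("ar-EG").NumberFormat;
        Console.WriteLine(string.Join(" ", nfi.NativeDigits));
        Console.WriteLine(nfi.DigitSubstitution);

        // CultureInfo, TextInfo and CompareInfo additions
        Console.WriteLine(ru.IetfLanguageTag);
        Console.WriteLine(ru.CultureTypes);
        Console.WriteLine(ru.TextInfo.CultureName);
        Console.WriteLine(ru.CompareInfo.Name);

        // RegionInfo created from a full culture name
        RegionInfo ri = new RegionInfo("de-DE");
        Console.WriteLine(ri.GeoId);
        Console.WriteLine(ri.NativeName);
        Console.WriteLine(ri.CurrencyEnglishName);
    }
}
```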

Updates to encodings
  • Now built into the BCL
    • Improved performance
    • More flexibility
    • consistent results across supported platforms
  • Encoding enumeration API
  • UTF-32 support (little endian and big endian)
  • UTF-16 big endian support
  • Encoding/decoding fallbacks
    • Exception
    • Replacement
    • “Best fit”
    • Custom
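
The features above can be sketched with the encoding classes as they shipped in System.Text. This is an illustration under that assumption, not the slide author's code; the class and helper names are made up for the example. "Best fit" is not shown because it has no public fallback class.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Formats bytes as hyphen-separated hex, for example "00-41"
    public static string Hex(byte[] bytes)
    {
        return BitConverter.ToString(bytes);
    }

    // Replacement fallback: unencodable characters become "?"
    public static byte[] AsciiWithReplacement(string s)
    {
        Encoding enc = Encoding.GetEncoding("us-ascii",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        return enc.GetBytes(s);
    }

    // Exception fallback: unencodable characters raise EncoderFallbackException
    public static bool AsciiThrows(string s)
    {
        Encoding enc = Encoding.GetEncoding("us-ascii",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try { enc.GetBytes(s); return false; }
        catch (EncoderFallbackException) { return true; }
    }

    public static void Main()
    {
        // Encoding enumeration API
        foreach (EncodingInfo info in Encoding.GetEncodings())
            Console.WriteLine("{0}\t{1}", info.CodePage, info.Name);

        // UTF-32 little endian and big endian, UTF-16 big endian
        Console.WriteLine(Hex(new UTF32Encoding(false, false).GetBytes("A"))); // 41-00-00-00
        Console.WriteLine(Hex(new UTF32Encoding(true, false).GetBytes("A")));  // 00-00-00-41
        Console.WriteLine(Hex(new UnicodeEncoding(true, false).GetBytes("A"))); // 00-41

        // Fallbacks applied to U+00E9, which ASCII cannot represent
        Console.WriteLine(Hex(AsciiWithReplacement("caf\u00E9"))); // 63-61-66-3F
        Console.WriteLine(AsciiThrows("caf\u00E9"));               // True
    }
}
```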

// Custom encoder fallback sample: replaces each unencodable character
// with a decimal numeric character reference such as &#8364;
using System;
using System.Text;

public class NumericEntitiesFallback : EncoderFallback {
    public override EncoderFallbackBuffer CreateFallbackBuffer() {
        return new NEFallbackBuffer();
    }

    public override int MaxCharCount {
        get {
            // "&#" + up to 7 decimal digits (U+10FFFF is 1114111) + ";"
            return 10;
        }
    }
}

public class NEFallbackBuffer : EncoderFallbackBuffer {
    // Store our replacement string
    private String strEntity;
    int fallbackCount = -1;
    int fallbackIndex = 0;

    // Fallback methods
    public override bool Fallback(char charUnknown, int index) {
        // If we had a buffer already we're being recursive, throw,
        // it's probably at the suspect character in our array.
        if (fallbackCount >= 0)
            ThrowLastCharRecursive(unchecked((int)charUnknown));

        // Go ahead and get our fallback
        strEntity = String.Format("&#{0};", (int)charUnknown);
        fallbackCount = strEntity.Length;
        fallbackIndex = 0;
        return fallbackCount != 0;
    }

    public override bool Fallback(char charUnknownHigh, char charUnknownLow, int index) {
        // Double check input surrogate pair
        if (!Char.IsHighSurrogate(charUnknownHigh))
            throw new ArgumentOutOfRangeException("charUnknownHigh",
                "supposed to be between 0xD800 and 0xDBFF");
        if (!Char.IsLowSurrogate(charUnknownLow))
            throw new ArgumentOutOfRangeException("charUnknownLow",
                "supposed to be between 0xDC00 and 0xDFFF");

        // If we had a buffer already we're being recursive, throw, it's
        // probably at the suspect character in our array.
        if (fallbackCount >= 0)
            ThrowLastCharRecursive(Char.ConvertToUtf32(charUnknownHigh, charUnknownLow));

        // Go ahead and get our fallback
        strEntity = String.Format("&#{0};", Char.ConvertToUtf32(charUnknownHigh, charUnknownLow));
        fallbackCount = strEntity.Length;
        fallbackIndex = 0;
        return fallbackCount != 0;
    }

    public override char GetNextChar() {
        // We want it to get < 0 because == 0 means that the current/last
        // character is a fallback and we need to detect recursion. We
        // could have a flag but we already have this counter.
        fallbackCount--;

        // Do we have anything left? 0 is now last fallback char, negative
        // is nothing left
        if (fallbackCount < 0)
            return (char)0;

        // Need to get it out of the buffer.
        return strEntity[fallbackIndex++];
    }

    public override bool MovePrevious() {
        // Cannot back up past the start of the fallback string
        if (fallbackIndex <= 0)
            return false;
        fallbackCount++; fallbackIndex--;
        return true;
    }

    public override int Remaining {
        get {
            return (fallbackCount < 0) ? 0 : fallbackCount;
        }
    }

    // Private helper methods
    private void ThrowLastCharRecursive(int charRecursive) {
        // Throw it, using our complete character
        throw new ArgumentException(
            String.Format("Last character \\u{0:X4} was a recursive fallback",
                charRecursive), "chars");
    }
}

Collation Improvements
  • OrdinalIgnoreCase
    • Same results as ToUpper/Ordinal
    • Matches OS file system results
  • Correct Serbian collation
    • Fixed in Windows XPSP2
    • Customer reported (MSDN Feedback Center)
  • Better handling of ignored/ignorable characters
    • IndexOf/LastIndexOf/IsPrefix/IsSuffix
    • StartsWith/EndsWith, too

OrdinalIgnoreCase sample

string strTest1 = "IamAString";
string strTest2 = "STRING";

// Case-insensitive ordinal comparison: "String" matches "STRING"
if (strTest1.EndsWith(strTest2, StringComparison.OrdinalIgnoreCase)) {
    Console.WriteLine("Successful test!");
}

Unicode normalization
  • Described in UAX#15 at http://www.unicode.org/reports/tr15/
  • String.IsNormalized()
  • String.IsNormalized(NormalizationForm normalizationForm)
  • String.Normalize()
  • String.Normalize(NormalizationForm normalizationForm)
  • NormalizationForm enumeration
    • FormC, FormD, FormKC, FormKD
  • Example: õĥµ¨ (U+00f5 U+0068 U+0302 U+00b5 U+00a8): LATIN SMALL LETTER O WITH TILDE; LATIN SMALL LETTER H; COMBINING CIRCUMFLEX ACCENT; MICRO SIGN; DIAERESIS
    • FormC: õĥµ¨ (U+00f5 U+0125 U+00b5 U+00a8)
    • FormD: õĥµ¨ (U+006f U+0303 U+0068 U+0302 U+00b5 U+00a8)
    • FormKC: õĥμ ̈ (U+00f5 U+0125 U+03bc U+0020 U+0308)
    • FormKD: õĥμ ̈ (U+006f U+0303 U+0068 U+0302 U+03bc U+0020 U+0308)
  • In collation, all four forms compare as equivalent (FormC ≅ FormD ≅ FormKC ≅ FormKD)
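
The example string above can be checked with a small program. This is a sketch using String.Normalize and String.IsNormalized as listed; the class name and the CodePoints helper are illustrative.

```csharp
using System;
using System.Text;

public static class NormalizationDemo
{
    // Renders each UTF-16 code unit as U+XXXX, separated by spaces
    public static string CodePoints(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // o-tilde, h, combining circumflex, micro sign, diaeresis
        string input = "\u00F5\u0068\u0302\u00B5\u00A8";

        NormalizationForm[] forms = {
            NormalizationForm.FormC, NormalizationForm.FormD,
            NormalizationForm.FormKC, NormalizationForm.FormKD
        };
        foreach (NormalizationForm form in forms)
            Console.WriteLine("{0}: {1}", form, CodePoints(input.Normalize(form)));

        // The input is not in FormC because h + U+0302 composes to U+0125
        Console.WriteLine(input.IsNormalized()); // False
    }
}
```

The compatibility forms (FormKC, FormKD) change the micro sign to GREEK SMALL LETTER MU and split the diaeresis into a space plus a combining diaeresis, so they are not suitable where the original characters must round-trip.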

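// Normalization sample: every identifier and string literal below renders as àáâãäå.
// The sample appears to rely on the names differing only in how each character is
// encoded (precomposed letter vs. base letter plus combining mark).
namespace àáâãäå {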

using System;

using System.Text;

using System.Globalization;

class àáâãäå

{

[STAThread]

static void Main(string[] args) {

àáâãäå(); àáâãäå(); àáâãäå(); àáâãäå(); àáâãäå(); àáâãäå(); àáâãäå();

}

static void àáâãäå(string àáâãäå) {

StringBuilder àáâãäå = new StringBuilder();

StringInfo àáâãäå = new StringInfo(àáâãäå);

àáâãäå.Append(àáâãäå.Normalize(NormalizationForm.FormC));

àáâãäå.Append(": ");

for(int àáâãäå=0; àáâãäå < àáâãäå.LengthInTextElements; àáâãäå++) {

string àáâãäå = àáâãäå.SubstringByTextElements(àáâãäå, 1);

if(àáâãäå.IsNormalized(NormalizationForm.FormC)) {

àáâãäå.Append("C");

} else if(àáâãäå.IsNormalized(NormalizationForm.FormD)) {

àáâãäå.Append("D");

} else {

àáâãäå.Append("_");

}

}

Console.WriteLine(àáâãäå.ToString());

return;

}

static void àáâãäå() {

àáâãäå.àáâãäå("àáâãäå");

}

static void àáâãäå() {

àáâãäå.àáâãäå("àáâãäå");

}

static void àáâãäå() {

àáâãäå.àáâãäå("àáâãäå");

}

static void àáâãäå() {

àáâãäå.àáâãäå("àáâãäå");

}

static void àáâãäå() {

àáâãäå.àáâãäå("àáâãäå");

}

static void àáâãäå() {

àáâãäå.àáâãäå("àáâãäå");

}

static void àáâãäå() {

àáâãäå.àáâãäå("àáâãäå");

}

}

}

IDN Mapping APIs
  • IdnMapping class
  • Based on three RFCs (standard based on Unicode 3.2)
    • 3490 - Internationalizing Domain Names in Applications (IDNA)
    • 3491 - Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)
    • 3492 - Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)
  • \u5B89\u5BA4\u5948\u7F8E\u6075-with-SUPER-MONKEYS becomes xn---with-SUPER-MONKEYS-pc58ag80a8qai00g7n9n
  • Properties
    • AllowUnassigned (allows new Unicode characters)
    • UseStd3AsciiRules (more like DNS rules)
  • Methods
    • GetAscii - Gets ASCII (Punycode) version of the string
    • GetUnicode - Gets Unicode version of the string, normalized and limited to IDNA characters.
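
A minimal round trip through the IdnMapping class, as a sketch. The domain name bücher.example is an illustrative choice that is not from the slides, and the class and method names are made up for the example.

```csharp
using System;
using System.Globalization;

public static class IdnDemo
{
    // Converts a Unicode domain name to its ASCII (Punycode) form
    public static string ToAscii(string unicodeName)
    {
        IdnMapping idn = new IdnMapping();
        idn.UseStd3AsciiRules = true;   // stricter, DNS-like label rules
        idn.AllowUnassigned = false;    // reject code points unassigned in the IDNA tables
        return idn.GetAscii(unicodeName);
    }

    // Converts an ASCII (Punycode) domain name back to Unicode
    public static string ToUnicode(string asciiName)
    {
        return new IdnMapping().GetUnicode(asciiName);
    }

    public static void Main()
    {
        string ascii = ToAscii("b\u00FCcher.example");
        Console.WriteLine(ascii);            // xn--bcher-kva.example
        Console.WriteLine(ToUnicode(ascii)); // bücher.example
    }
}
```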

Unicode property information
  • New CharUnicodeInfo class
  • Extends methods on Char
  • Official data from the Unicode Character Database at http://www.unicode.org/ucd/
    • IsWhiteSpace
    • GetNumericValue
    • GetDigitValue
    • GetDecimalDigitValue
    • GetUnicodeCategory
    • GetBidiCategory
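
A short sketch of the numeric and category lookups, limited to the CharUnicodeInfo methods that are public in the shipped framework (GetNumericValue, GetDigitValue, GetDecimalDigitValue, GetUnicodeCategory). The sample characters and the class name are illustrative.

```csharp
using System;
using System.Globalization;

public static class CharUnicodeInfoDemo
{
    public static void Main()
    {
        char half = '\u00BD';        // VULGAR FRACTION ONE HALF
        char superTwo = '\u00B2';    // SUPERSCRIPT TWO
        char arabicThree = '\u0663'; // ARABIC-INDIC DIGIT THREE

        // Numeric value covers fractions; returns 0.5 here
        Console.WriteLine(CharUnicodeInfo.GetNumericValue(half)
            .ToString(CultureInfo.InvariantCulture));

        // SUPERSCRIPT TWO is a digit (2) but not a decimal digit (-1)
        Console.WriteLine(CharUnicodeInfo.GetDigitValue(superTwo));
        Console.WriteLine(CharUnicodeInfo.GetDecimalDigitValue(superTwo));

        // ARABIC-INDIC DIGIT THREE is a decimal digit with value 3
        Console.WriteLine(CharUnicodeInfo.GetDecimalDigitValue(arabicThree));
        Console.WriteLine(CharUnicodeInfo.GetUnicodeCategory(arabicThree));
    }
}
```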

New text element support in the StringInfo class
  • StringInfo ctor that takes a string
  • StringInfo.String
  • StringInfo.LengthInTextElements
  • StringInfo.SubstringByTextElements()
  • LengthInTextElements and SubstringByTextElements() both use ParseCombiningCharacters() to get their results

New StringInfo props/methods sample

// Two base letters, each followed by three combining marks
StringInfo si = new StringInfo("A\u0300\u0301\u0300e\u0300\u0301\u0300");

Console.WriteLine(si.LengthInTextElements); // Length is two

for (int ich = 0; ich < si.LengthInTextElements; ich++) {
    Console.WriteLine(si.SubstringByTextElements(ich, 1));
}

New supplementary character support in lots of methods
  • New signature -- (String s, int index)
  • IsControl, IsDigit, IsLetter, IsLetterOrDigit, IsLower, IsNumber, IsPunctuation, IsSeparator, IsSurrogate, IsSymbol, IsUpper, IsWhiteSpace, GetUnicodeCategory, GetNumericValue, IsHighSurrogate, IsLowSurrogate, IsSurrogatePair
  • ConvertToUtf32, ConvertFromUtf32 methods
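
A sketch of the (String s, int index) overloads applied to a supplementary character. The test character U+10400 (DESERET CAPITAL LETTER LONG I) and the class name are illustrative choices.

```csharp
using System;
using System.Globalization;

public static class SupplementaryDemo
{
    public static void Main()
    {
        // 'a' followed by U+10400, which takes two UTF-16 code units
        string s = "a\U00010400";
        Console.WriteLine(s.Length); // 3

        // The (string, int) overloads look at the whole surrogate pair
        Console.WriteLine(Char.IsSurrogatePair(s, 1));    // True
        Console.WriteLine(Char.IsHighSurrogate(s, 1));    // True
        Console.WriteLine(Char.IsLetter(s, 1));           // True
        Console.WriteLine(Char.GetUnicodeCategory(s, 1)); // UppercaseLetter

        // Convert between the pair and its UTF-32 code point
        int codePoint = Char.ConvertToUtf32(s, 1);
        Console.WriteLine(codePoint.ToString("X")); // 10400
        Console.WriteLine(Char.ConvertFromUtf32(codePoint) == s.Substring(1)); // True
    }
}
```

Passing the string and index rather than a single char is what lets these methods see both halves of the pair; the char overloads can only report that a lone surrogate was found.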

References
  • MSDN Magazine Article
    • "Make the .NET World a Friendlier Place with the Many Faces of the CultureInfo Class", March 2005 - http://msdn.microsoft.com/msdnmag/issues/05/03/CultureInfo/
  • SQL Server Books Online
    • "International Considerations for SQL Server" - http://whidbey.msdn.microsoft.com/library/en-us/icsql9/html/50dc4fa8-4772-46a8-a8ef-bc134502b4e0.asp

  • My Blog
    • http://blogs.msdn.com/michkap
  • Some other blogs for int’l support in Whidbey
    • http://blogs.msdn.com/AchimR
    • http://www.dasblonde.net/
    • http://blogs.msdn.com/BCLTeam
  • Other useful sites
    • http://www.microsoft.com/globaldev/
    • http://lab.msdn.microsoft.com/productfeedback/
    • http://www.unicode.org/
