Knotty problems in date/time parsing and formatting and time zones. Yoshito Umaoka IBM Globalization Center of Competency. 32nd Internationalization and Unicode Conference. Agenda. Challenges for Implementing Date and Time UI Understanding Time Zone Formatting Parsing.

Knotty problems in date/time parsing and formatting and time zones

Yoshito Umaoka

IBM Globalization Center of Competency

32nd Internationalization and Unicode Conference

Agenda
  • Challenges for Implementing Date and Time UI
  • Understanding Time Zone Formatting and Parsing
Challenges for Implementing Date and Time UI
  • Two examples
    • Google Calendar
    • IBM Lotus Notes
  • Walking through various requirements for displaying date and time
  • Solutions provided by CLDR
  • Design/Implementation Tips
Date Format Types

  • Basic / Relative
    • July 27, 2008 / Today
    • July 28, 2008 / Tomorrow
    • August 3, 2008 / August 3, 2008
  • Interval / Duration
    • July 27–28, 2008 / 1 day
    • July 27 – August 3, 2008 / 7 days

Mini Calendar
  • Month
    • Some locales use a different form of the month name when it appears without a day
    • E.g. Polish: lipiec (nominative) vs. lipca (genitive)
      • lipiec 2008
      • 28 lipca 2008
  • Day of week
    • Very short abbreviation
    • Not always the first letter of the day of week name
    • E.g. Chinese: 星期日 ⇒ 日
  • The first day of week
    • Sunday is the first day of the week in many regions, but not in all of them.
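
As a rough sketch of how the first day of week changes a mini calendar grid, Python's standard `calendar` module can lay out July 2008 with weeks starting on either Sunday or Monday. The helper name `month_grid` is made up for this illustration, and a real UI would take the first day from locale data rather than from a hardcoded argument.

```python
import calendar

def month_grid(year, month, first_weekday):
    """Return the weeks of a month as rows of day numbers (0 = padding).

    first_weekday follows Python's convention: calendar.MONDAY (0)
    through calendar.SUNDAY (6).
    """
    cal = calendar.Calendar(firstweekday=first_weekday)
    return cal.monthdayscalendar(year, month)

sunday_first = month_grid(2008, 7, calendar.SUNDAY)
monday_first = month_grid(2008, 7, calendar.MONDAY)
print(sunday_first[0])  # first row when weeks start on Sunday
print(monday_first[0])  # first row when weeks start on Monday
```

The same month produces different rows, so the grid cannot be precomputed independently of the user's region.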
Month/Day of Week Names in CLDR
  • 3 different widths - wide / abbreviated / narrow
  • 2 context types – format / stand-alone

Month name example - January

Day of week name example - Sunday
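
The format / stand-alone distinction can be sketched with the Polish names for July quoted on the mini calendar slide. The dictionary layout and the function `month_label` are assumptions made for this illustration; they are not CLDR's data format or ICU's API.

```python
# Wide names for July in Polish, keyed by CLDR context type.
POLISH_JULY = {
    "format": "lipca",        # genitive, used together with a day number
    "stand-alone": "lipiec",  # nominative, used when the month stands alone
}

def month_label(year, day=None):
    """Label for July of `year`, with or without a day of month."""
    if day is None:
        return f"{POLISH_JULY['stand-alone']} {year}"
    return f"{day} {POLISH_JULY['format']} {year}"

print(month_label(2008))      # lipiec 2008
print(month_label(2008, 28))  # 28 lipca 2008
```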

Date and Time Interval
  • When displaying a date interval, date fields that repeat in both the start and end date can be omitted.
    • 3 possible patterns depending on combination of start date and end date
      • July 20–26, 2008
      • July 20 – August 1, 2008
      • July 20, 2008 – July 19, 2009
    • Different combination patterns in different locales
      • 20–26 July 2008
      • 20 July – 1 August 2008
      • 20 July 2008 – 19 July 2009
Date/Time Interval in CLDR
  • Each <intervalFormatItem> is associated with a “skeleton” pattern and contains one or more patterns
  • A <greatestDifference> element contains the pattern to use when the greatest difference between the two given dates matches its “id” attribute

<intervalFormatItem id="yMMMd">

<greatestDifference id="y">MMM d, yyyy – MMM d, yyyy</greatestDifference>

<greatestDifference id="M">MMM d – MMM d, yyyy</greatestDifference>

<greatestDifference id="d">MMM d–d, yyyy</greatestDifference>
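
The selection rule can be sketched as follows, using the three English patterns from the excerpt above. The names `greatest_difference` and `format_interval` and the hardcoded month abbreviations are assumptions for this illustration. A real implementation would read the patterns and names from CLDR, and the same-day fallback here is a guess rather than CLDR behavior.

```python
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def greatest_difference(start, end):
    """Return the largest calendar field that differs: 'y', 'M', 'd' or None."""
    if start.year != end.year:
        return "y"
    if start.month != end.month:
        return "M"
    if start.day != end.day:
        return "d"
    return None

def format_interval(start, end):
    """Format a date interval with the English patterns for skeleton yMMMd."""
    s_mon, e_mon = MONTHS[start.month - 1], MONTHS[end.month - 1]
    diff = greatest_difference(start, end)
    if diff == "y":   # MMM d, yyyy – MMM d, yyyy
        return f"{s_mon} {start.day}, {start.year} – {e_mon} {end.day}, {end.year}"
    if diff == "M":   # MMM d – MMM d, yyyy
        return f"{s_mon} {start.day} – {e_mon} {end.day}, {end.year}"
    if diff == "d":   # MMM d–d, yyyy
        return f"{s_mon} {start.day}–{end.day}, {end.year}"
    # Same day: fall back to a plain date (assumption, not from the slides).
    return f"{s_mon} {start.day}, {start.year}"

print(format_interval(date(2008, 7, 20), date(2008, 7, 26)))
print(format_interval(date(2008, 7, 20), date(2008, 8, 1)))
print(format_interval(date(2008, 7, 20), date(2009, 7, 19)))
```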


Other Challenges
  • Various combinations of date fields and widths
    • “Sat 7/26”
      • The UI needs a short format that includes month, day of month and day of week, but not year
      • The pattern varies by locale
        • “Sat 26/7” for en_GB
        • “7/26(土)” for ja_JP
  • Week number
    • Week number is commonly used in European countries
    • The way of calculating week numbers in a year may vary depending on local conventions
Flexible Date Format Support in CLDR (1)
  • <availableFormats> contains various <dateFormatItem> elements
  • Each <dateFormatItem> has an id attribute representing a “skeleton”
  • A “skeleton” contains only field information, in a canonical order
  • A CLDR consumer provides a “skeleton”. When a matching “skeleton” is available in the locale, the associated pattern is returned. If not, the closest match that contains all requested fields is returned.


<dateFormatItem id="MMMEd" draft="provisional">E d MMM</dateFormatItem>

<dateFormatItem id="MMMMd" draft="provisional">d MMMM</dateFormatItem>

<dateFormatItem id="MMdd" draft="provisional">dd/MM</dateFormatItem>

<dateFormatItem id="Md" draft="provisional">d/M</dateFormatItem>

<dateFormatItem id="yyMMM" draft="provisional">MMM yy</dateFormatItem>

<dateFormatItem id="yyyyMM" draft="provisional">MM/yyyy</dateFormatItem>

<dateFormatItem id="yyyyMMMM" draft="provisional">MMMM yyyy</dateFormatItem>
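
A much simplified sketch of the lookup, using the skeleton/pattern pairs from the excerpt above: try an exact match first, otherwise pick the candidate that contains all requested fields with the fewest extra fields and the smallest width difference. The scoring rule and the names `field_widths` and `best_pattern` are assumptions for this illustration. The actual CLDR/ICU matching is more elaborate (for instance, it also adjusts field widths in the returned pattern).

```python
# Skeleton -> pattern pairs copied from the <availableFormats> excerpt.
AVAILABLE_FORMATS = {
    "MMMEd": "E d MMM",
    "MMMMd": "d MMMM",
    "MMdd": "dd/MM",
    "Md": "d/M",
    "yyMMM": "MMM yy",
    "yyyyMM": "MM/yyyy",
    "yyyyMMMM": "MMMM yyyy",
}

def field_widths(skeleton):
    """Map each field letter to its width, e.g. 'MMMd' -> {'M': 3, 'd': 1}."""
    return {letter: skeleton.count(letter) for letter in set(skeleton)}

def best_pattern(requested, available=AVAILABLE_FORMATS):
    """Return the pattern for `requested`, or the closest one, or None."""
    if requested in available:
        return available[requested]
    wanted = field_widths(requested)
    best, best_score = None, None
    for skeleton, pattern in available.items():
        have = field_widths(skeleton)
        if not set(wanted) <= set(have):
            continue  # candidate lacks a requested field
        extra_fields = len(have) - len(wanted)
        width_gap = sum(abs(have[f] - w) for f, w in wanted.items())
        score = (extra_fields, width_gap)
        if best_score is None or score < best_score:
            best, best_score = pattern, score
    return best

print(best_pattern("Md"))    # exact match
print(best_pattern("MMMd"))  # closest match containing M and d
print(best_pattern("Hm"))    # no candidate has hour and minute fields
```

When `best_pattern` returns None, no available format covers the request, which is the case the append rules on the next slide address.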


Flexible Date Format Support in CLDR (2)
  • When no <dateFormatItem> element satisfies the matching criteria, use the rules defined by <appendItems> to append the missing fields to one of the existing formats.


<appendItem request="Day">{0} ({2}: {1})</appendItem>

<appendItem request="Day-Of-Week">{0} {1}</appendItem>

<appendItem request="Era">{0} {1}</appendItem>

<appendItem request="Hour">{0} ({2}: {1})</appendItem>

<appendItem request="Minute">{0} ({2}: {1})</appendItem>

<appendItem request="Month">{0} ({2}: {1})</appendItem>

<appendItem request="Quarter">{0} ({2}: {1})</appendItem>

<appendItem request="Second">{0} ({2}: {1})</appendItem>

<appendItem request="Timezone">{0} {1}</appendItem>

<appendItem request="Week">{0} ({2}: {1})</appendItem>

<appendItem request="Year">{0} {1}</appendItem>
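
Applying an append rule might look like the sketch below. It assumes that {0} stands for the existing pattern, {1} for the pattern of the field being appended, and {2} for the display name of that field. That reading of the placeholders should be checked against the CLDR specification. The function `append_field` is hypothetical, and only three of the rules above are copied in.

```python
# A few <appendItem> rules copied from the excerpt above.
APPEND_ITEMS = {
    "Day-Of-Week": "{0} {1}",
    "Week": "{0} ({2}: {1})",
    "Year": "{0} {1}",
}

def append_field(base_pattern, request, field_pattern, field_name):
    """Append a missing field to an existing pattern using an appendItem rule."""
    template = APPEND_ITEMS[request]
    # Unused positional arguments (e.g. {2} for "Year") are simply ignored.
    return template.format(base_pattern, field_pattern, field_name)

print(append_field("d MMM", "Year", "yyyy", "Year"))
print(append_field("d MMM", "Week", "w", "Week"))
```

Literal text such as the field name would normally need quoting inside a real date pattern, which this sketch ignores.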


Week Data in CLDR
  • <weekData>
    • minDays: minimum days in the first week
    • firstDay: first day in a week
    • weekendStart/weekendEnd: start/end day of weekend


<minDays count="1" territories="001" />

<minDays count="4" territories="AT BE CA CH DE DK FI FR IT LI LT LU MC MT NL NO SE SK" />

<minDays count="4" territories="CD" draft="true" />

<firstDay day="mon" territories="001" />

<firstDay day="fri" territories="MV" />

<firstDay day="sat" territories="AE AF BH DJ DZ EG ER ET IQ IR JO KE KW LB LY MA OM QA SA SD SO TN YE" />

<firstDay day="sun" territories="AS AU AZ BW CA CN FO GE GL GU HK IE IL IS JM JP KG KR LA MH MN MO MP


<firstDay day="sun" territories="ET MW NG TJ" draft="true" />

<firstDay day="sun" territories="GB" draft="true" alt="variant" references="Shorter Oxford Dictionary (5th edition, 2002)"/>

<firstDay day="thu" territories="SY" />

<weekendStart day="sat" territories="001"/>

<weekendStart day="fri" territories="EG IL SY"/>

<weekendStart day="sun" territories="IN"/>

<weekendStart day="thu" territories="AE BH DZ IQ JO KW LB LY MA OM QA SA SD TN YE AF IR"/>

<weekendEnd day="sun" territories="001"/>

<weekendEnd day="fri" territories="AE BH DZ IQ JO KW LB LY MA OM QA SA SD TN YE AF IR"/>

<weekendEnd day="sat" territories="EG IL SY"/>
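
To illustrate how minDays and firstDay drive week numbering, here is a minimal sketch. It assumes that week 1 is the first week containing at least `min_days` days of the new year, and that weeks start on `first_day`, expressed in Python's weekday convention (Monday = 0, Sunday = 6). The function names are made up for this example.

```python
from datetime import date, timedelta

def week_one_start(year, first_day, min_days):
    """Return the date on which week 1 of `year` starts."""
    jan1 = date(year, 1, 1)
    # Days between the start of the week containing Jan 1 and Jan 1 itself.
    offset = (jan1.weekday() - first_day) % 7
    start = jan1 - timedelta(days=offset)
    # That week has (7 - offset) days in `year`; too few means it belongs
    # to the previous year and week 1 starts seven days later.
    if 7 - offset < min_days:
        start += timedelta(days=7)
    return start

def week_of_year(day, first_day, min_days):
    """Return (week_year, week_number) for `day`."""
    for year in (day.year + 1, day.year, day.year - 1):
        start = week_one_start(year, first_day, min_days)
        if day >= start:
            return year, (day - start).days // 7 + 1
    raise AssertionError("unreachable")

sample = date(2008, 7, 27)
print(week_of_year(sample, first_day=0, min_days=4))  # Monday start, minDays 4
print(week_of_year(sample, first_day=6, min_days=1))  # Sunday start, minDays 1
```

With a Monday start and minDays 4 the result should agree with ISO 8601 week numbering, so the same date gets a different week number under the Sunday-start, minDays 1 convention.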


Design/Implementation Tips
  • Keep internal date/time representation locale-independent
    • Localized format may vary depending on implementation
    • Use standard format such as ISO8601 for data exchange
  • Do not hardcode format patterns in your source code
  • Do not put format patterns in resource bundles with other localizable messages!
    • Locale support is more than UI translation
    • Translation vendors are usually not able to handle regional variants
    • You should be able to find solutions in CLDR/ICU; if none is available, file bugs to request new features
  • Avoid date/time data entry by text
    • Formatting date/time is complicated, and so is parsing
    • Use a UI widget to eliminate ambiguous data entry
  • Understand the regional conventions of the calendar system
    • Rules for calculating some calendar fields may vary
  • Be prepared to support non-Gregorian calendar systems
    • For example,
      • The Buddhist calendar is the preferred calendar system in Thailand
      • Japanese calendar support may be required depending on target sectors
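
As a small illustration of the first tip, the sketch below keeps the value as a locale-independent ISO 8601 string in UTC for exchange and only parses it back at the edge. The names `to_wire` and `from_wire` are hypothetical. Localized display would happen separately, in the UI layer.

```python
from datetime import datetime, timezone

def to_wire(moment):
    """Serialize an aware datetime as an ISO 8601 string in UTC."""
    return moment.astimezone(timezone.utc).isoformat()

def from_wire(text):
    """Parse a string produced by to_wire back into an aware datetime."""
    return datetime.fromisoformat(text)

meeting = datetime(2008, 7, 27, 9, 30, tzinfo=timezone.utc)
wire = to_wire(meeting)
print(wire)  # 2008-07-27T09:30:00+00:00
```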
Understanding Time Zone Formatting and Parsing
  • CLDR’s approach for supporting time zone formatting
  • Choosing the right time zone format type for your needs
  • Tips for processing date/time with time zone

Time Zone Implementations
  • The tz database (a.k.a. the Olson database)
    • 568 zones (436 unique zones / 132 aliases) (2008d)
    • Supports historic time transitions since the late 19th century
    • At least 1 zone per country/region
    • Time zone abbreviations for display (3 or 4 ASCII letters), such as “EST”, “JST”…
    • Used by *nix systems (Solaris, Linux, AIX, Mac OS X…) and Java
  • MS Windows time zone
    • 84 zones (Windows Vista), some of which are obsolete
    • Supports historic rules (2005 and beyond) in Vista/2008 Server (Dynamic DST)
    • A zone is shared by multiple cities/countries
    • Time zone display names including the standard offset and common name or exemplar cities, such as “(GMT-05:00) Eastern Time (US & Canada)”, “(GMT+09:00) Osaka, Sapporo, Tokyo”…
Time Zone Format Types in CLDR (1)
  • Generic location format
    • Designed for populating choice lists for time zones
    • Uniquely mapped to “canonical” zone IDs
    • Examples
      • Europe/Rome ⇔ Italy Time [en]
      • America/New_York ⇔ United States (New York) Time [en]
      • America/New_York ⇔ Hora de Estados Unidos (New York) [es]
  • Generic non-location format
    • Designed for recurring events, meetings, or anywhere people do not want to be overly specific
    • Two widths – long/short
    • Examples
      • America/New_York ⇒ ET [en/short]
      • America/New_York ⇒ Eastern Time [en/long]
      • America/Montreal ⇒ Eastern Time [en/long]
Time Zone Format Types in CLDR (2)
  • Generic partial location format
    • A variant of generic non-location format – used as a fallback name when the generic non-location format is not specific enough
    • Two widths – long/short
    • Examples
      • America/Mexico_City ⇒ Hora central (Ciudad de México) [es_US/short/Mar 9 – April 6, 2008]
      • America/Chicago ⇒ Hora central (Chicago) [es_MX/short/Mar 9 – April 6, 2008]
  • Specific (non-location) format
    • Designed to distinguish between standard time and daylight time
    • Two widths – long/short
    • Examples
      • America/New_York ⇒ EST [en/short/standard time]
      • America/New_York ⇒ EDT [en/short/daylight time]
      • America/New_York ⇒ Eastern Standard Time [en/long/standard time]
      • America/Montreal ⇒ Eastern Standard Time [en/long/standard time]
Time Zone Format Types in CLDR (3)
  • Localized GMT format
    • Designed for representing the offset from GMT
    • Local decimal digits are used
    • Examples
      • America/New_York ⇒ GMT-05:00 [en/standard time]
      • America/New_York ⇒ GMT-04:00 [en/daylight time]
      • America/New_York ⇒ Гриинуич-0500 [bg/standard time]
  • RFC 822 format
    • Locale-insensitive “fixed” format, defined by RFC 822, representing the offset from GMT
    • ASCII decimal digits are always used
    • Examples
      • America/New_York ⇒ -0500 [standard time]
      • America/New_York ⇒ -0400 [daylight time]
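
The two offset-based formats can be sketched from a plain UTC offset. The function names and the `gmt_pattern` parameter are assumptions for this illustration. The sketch always emits ASCII digits and drops any sub-minute part of the offset, so it does not cover the local decimal digits mentioned for the localized GMT format.

```python
from datetime import timedelta

def _split_offset(offset):
    """Return (sign, hours, minutes) for a UTC offset given as a timedelta."""
    sign = "-" if offset < timedelta(0) else "+"
    total_minutes = abs(int(offset.total_seconds())) // 60
    hours, minutes = divmod(total_minutes, 60)
    return sign, hours, minutes

def rfc822_offset(offset):
    """Format an offset in the fixed RFC 822 style, e.g. -0500."""
    sign, hours, minutes = _split_offset(offset)
    return f"{sign}{hours:02d}{minutes:02d}"

def localized_gmt(offset, gmt_pattern="GMT{0}"):
    """Format an offset in a localized GMT style, e.g. GMT-05:00."""
    sign, hours, minutes = _split_offset(offset)
    return gmt_pattern.replace("{0}", f"{sign}{hours:02d}:{minutes:02d}")

standard = timedelta(hours=-5)  # America/New_York, standard time
daylight = timedelta(hours=-4)  # America/New_York, daylight time
print(rfc822_offset(standard), localized_gmt(standard))
print(rfc822_offset(daylight), localized_gmt(daylight))
```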
CLDR Metazone
  • A metazone is a grouping of one or more internal zones that share common non-location display names
    • The following zones are currently associated with the metazone “America_Eastern” (CLDR 1.6.1): America/Nassau, America/Resolute, America/Coral_Harbour, America/Thunder_Bay, America/Nipigon, America/Toronto, America/Montreal, America/Iqaluit, America/Pangnirtung, America/Port-au-Prince, America/Jamaica, America/Cayman, America/Panama, America/Grand_Turk, America/Indiana/Vincennes, America/Indiana/Petersburg, America/Indiana/Marengo, America/Indiana/Winamac, America/Indianapolis, America/Louisville, America/Indiana/Vevay, America/Kentucky/Monticello, America/Detroit, America/New_York
  • Each metazone has a set of localizable names
    • The following names are used for the metazone “America_Eastern” (CLDR 1.6.1)
Time Zone Short Abbreviation Problem
  • Abbreviations of 2 to 4 ASCII letters are used for short names, such as ET, EST, PDT…
  • The extent to which time zone abbreviations are understood varies heavily by region
    • For example, how many people in the US recognize EAT (East Africa Time)?
  • CLDR’s solution: a boolean value, “commonlyUsed”, associated with a zone/metazone to enable or disable short abbreviations
    • Metazone “Africa_Eastern” has a short standard name “EAT” for English locales
    • For metazone “Africa_Eastern”
      • commonlyUsed = true in en_ZA [English (South Africa)]
      • commonlyUsed = false in en_US [English (United States)]
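
A sketch of how a formatter might consult the flag. The two tables below only restate the “Africa_Eastern” example from this slide, and their layout is invented. Falling back to a GMT offset string when the abbreviation is disabled is an assumption of this sketch, since the slide only says the flag enables or disables the short abbreviation.

```python
# Hypothetical tables holding only the example data from the slide.
SHORT_STANDARD_NAMES = {("en", "Africa_Eastern"): "EAT"}
COMMONLY_USED = {
    ("en_ZA", "Africa_Eastern"): True,
    ("en_US", "Africa_Eastern"): False,
}

def short_zone_name(locale, metazone, gmt_fallback):
    """Return the short abbreviation if it is commonly used in `locale`."""
    language = locale.split("_")[0]
    name = SHORT_STANDARD_NAMES.get((language, metazone))
    if name and COMMONLY_USED.get((locale, metazone), False):
        return name
    # Assumed fallback: an offset-based name supplied by the caller.
    return gmt_fallback

print(short_zone_name("en_ZA", "Africa_Eastern", "GMT+03:00"))
print(short_zone_name("en_US", "Africa_Eastern", "GMT+03:00"))
```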
Ambiguous Time with Generic Format
  • Daylight ⇒ Standard transition
    • Sunday, November 2, 2008 01:30:00 Pacific Time?
    • Valid, happens twice
    • Generic format cannot distinguish between 1:30 PST and 1:30 PDT
  • Standard ⇒ Daylight transition
    • Sunday, March 9, 2008 02:30:00 Pacific Time?
    • Invalid!
    • 30 minutes 1 second after 01:59:59? or 30 minutes before 03:00:00?
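
Both cases can be detected programmatically. The sketch below uses Python's `zoneinfo` module, which needs tz data to be available on the system, and the `fold` attribute of `datetime`. The function `classify_local_time` is a made-up helper, and "America/Los_Angeles" is assumed to be the zone behind “Pacific Time” in the examples above.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def classify_local_time(naive, zone_id):
    """Classify a wall-clock time as 'unique', 'ambiguous' or 'nonexistent'."""
    tz = ZoneInfo(zone_id)
    first = naive.replace(tzinfo=tz, fold=0)
    second = naive.replace(tzinfo=tz, fold=1)
    if first.utcoffset() == second.utcoffset():
        return "unique"
    # Near a transition the two readings disagree. If converting to UTC and
    # back does not reproduce the wall-clock time, the time was skipped.
    round_trip = first.astimezone(timezone.utc).astimezone(tz)
    if round_trip.replace(tzinfo=None) != naive:
        return "nonexistent"
    return "ambiguous"

print(classify_local_time(datetime(2008, 11, 2, 1, 30), "America/Los_Angeles"))
print(classify_local_time(datetime(2008, 3, 9, 2, 30), "America/Los_Angeles"))
print(classify_local_time(datetime(2008, 7, 27, 12, 0), "America/Los_Angeles"))
```

An application that accepts wall-clock input with a generic zone name could use such a check to ask the user for clarification instead of silently picking one interpretation.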
Tips for Processing Date/Time with Time Zone
  • For serializing future date/time data in text format, use RFC 822 format with zone ID
    • Time zone rules can change
    • GMT offset information along with zone ID is sufficient to fix up data
  • The result of java.util.Date#toString() might be ambiguous
    • “CST” is used for both “America/Chicago” and “Asia/Shanghai” in Java
    • CLDR does not use the same name for multiple time zones or metazones
  • Many zones in tz database use LMT (Local Mean Time) as initial offset
    • LMT is calculated from the longitude, so the GMT offset includes a fraction of a minute
    • ISO8601 / RFC822 / Java GMT formats do not have a seconds field, so such offsets may not round-trip
  • Minimize the dependencies on Windows time zone in multi-platform applications
    • Some Windows time zones are not well maintained
    • No historic time zone rule support before Vista/2008 Server
    • Mapping between Windows time zones and the tz database is 1-to-n
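
The first tip, storing the offset together with the zone ID, could be sketched as below. The bracketed text layout, the function names and the `rules_changed` flag are all assumptions for this illustration. The sketch relies on Python's `zoneinfo`, so tz data must be available on the system.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def serialize(moment):
    """Write local time, RFC 822 style offset and zone ID as one string."""
    stamp = moment.strftime("%Y-%m-%dT%H:%M:%S%z")
    return f"{stamp}[{moment.tzinfo.key}]"

def deserialize(text):
    """Return (local datetime, rules_changed) for a string from serialize()."""
    stamp, zone_id = text.rstrip("]").split("[")
    recorded = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")
    # Re-interpret the same wall-clock time under the current zone rules.
    local = recorded.replace(tzinfo=ZoneInfo(zone_id))
    rules_changed = local.utcoffset() != recorded.utcoffset()
    return local, rules_changed

original = datetime(2008, 7, 27, 9, 0, tzinfo=ZoneInfo("America/New_York"))
text = serialize(original)
print(text)
print(deserialize(text))
```

When `rules_changed` is true, the stored offset no longer matches the current rules for the zone, and the application can decide whether to keep the wall-clock time or the original instant.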
  • Unicode CLDR project -
  • ICU Project -
  • tz database -