18th International Unicode Conference. Documentum and UTF-8: Converting Content Management Software Product Line to Unicode. 27 April 2001, Donald Ziff. Agenda: What is Documentum?; Documentum’s I18N Problem; How Unicode UTF-8 Saved the Day; Other Success Factors; Demo.

18th International Unicode Conference

Documentum and UTF-8: Converting Content Management Software Product Line to Unicode

27 April 2001

Donald Ziff

Documentum Proprietary


Agenda

  • What is Documentum?

  • Documentum’s I18N Problem

  • How Unicode UTF-8 Saved the Day

  • Other Success Factors

  • Demo

About Documentum

  • Documentum: NASDAQ “DCTM”

  • The Leader in Web and Enterprise Content Management Solutions

  • More than $128M in revenue in 1999; more than 800 employees

  • More than 900 Global 2000 customers with strong vertical focus

  • Over 25 Offices in 10+ countries

DCTM’s I18N Problem

  • Everyone agrees: we need I18N to fuel growth – especially in Asia

  • Asian-certified product much more important than multi-lingual

    • Although demand for multi-lingual is growing…

  • So why not I18N?

I18N Perception Problems

  • Too Difficult – won’t fit into a development cycle

  • Too much Overhead – multiplies QA and Support

  • Not Sexy – no new functionality

    Let’s look at these problems…

“I18N is too difficult”

Product Layers:

  • Server (built on RDBMS + Verity)

  • DMCL: Client Library (C++)

  • DFC: Foundation Classes (Java)

  • DTC: Desktop Client – Win32 end-user client

  • WDK: Web Development Kit

  • RightSite: Legacy Web-Server Integration

  • Web Publisher: Web Content Management App

  • Legacy clients: Workspace (Win32), Intranet

History Lesson

  • Server v3.1.6.INT, created by consultants for Japanese market, was expensive and time-consuming

    • 3.1.6.INT attempted to internationalize all the layers in the DCTM architecture at once

  • 4.0 was released without I18N changes

  • 4.1 followed; the deltas from 3.1.6 to 3.1.6.INT became hard to apply…

“I18N requires too much overhead”

  • The DCTM server requires pharmaceutical-strength certification

  • Dimensions of certifications:

    • 3 RDBMS platforms: Oracle, Sybase, SQL-Server

    • 4 Server OS’s: NT, Solaris, HPUX, AIX

  • The 3.1.6.INT architecture introduced new dimensions, leading us to…

Certification Hell!

  • New certification dimensions:

    • 5 DCTM Server code-pages

    • 5 RDBMS code-pages

  • Market requires another dimension:

    • 5 Server OS Localizations

  • 125 new times 12 old = 1500 certs!

  • Exaggeration, of course… But still…
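
The arithmetic behind the 1500 figure can be checked with a short sketch. The dimension sizes come from the slides; the variable names are only illustrative, and the calculation assumes every combination would need its own certification, which the slide itself calls an exaggeration.

```python
from math import prod

# Dimension sizes as stated on the slides.
old_dimensions = {"RDBMS platforms": 3, "server OSs": 4}
new_dimensions = {
    "server code-pages": 5,
    "RDBMS code-pages": 5,
    "server OS localizations": 5,
}

# Each matrix is the product of its dimension sizes.
old_combinations = prod(old_dimensions.values())
new_combinations = prod(new_dimensions.values())
total_certifications = old_combinations * new_combinations

print(old_combinations, new_combinations, total_certifications)
```

Removing any one five-way dimension divides the total by five, which is why cutting dimensions matters more than trimming values within one.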

“I18N not sexy”

  • DCTM is a growth company, needs sizzle as well as steak

  • I18N grows markets, but doesn’t add much to marketing message

  • To be fair: new functionality is not just “sexy” – it is essential to DCTM’s continued survival

  • Other priorities will move to the top…

DCTM’s I18N Requirements

  • Crucial need: support Asia from the main code-line. One binary for the world

  • Backward compatibility essential

  • Multi-lingual features would be a side-benefit. High on the wish list for a few key customers

  • I18N project must be scoped down to be achievable

How UTF-8 Saved the Day

  • UTF-8 moves safely through the server because any byte that looks like ASCII actually is ASCII

  • Standardizing on UTF-8 as the only supported internal code-page cuts down certification matrix

Lessons from Double-Byte Experiments

  • EUC-KR: 4.1 server works (basically)

  • SJIS: problems! Some double-byte characters have second bytes that are ASCII: \ ` |

  • Lessons:

    • Non-ASCII moves through the server safely

    • String handling need not be double-byte aware, if ASCII always means ASCII

  • Solution: UTF-8!
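
The SJIS problem can be seen by inspecting encoded bytes. The sketch below uses Python's built-in `shift_jis` and `utf-8` codecs. The three katakana are illustrative choices, not taken from the slides; their Shift-JIS second bytes are 0x5C, 0x60, and 0x7C, the three ASCII characters listed above. The helper name is hypothetical.

```python
def ascii_range_bytes(text: str, encoding: str) -> list[str]:
    """List the bytes below 0x80 found in the encoded form of `text`."""
    return [chr(b) for b in text.encode(encoding) if b < 0x80]

# Illustrative characters (an assumption, not from the slides):
# one katakana per problem byte.
for ch in ["ソ", "チ", "ポ"]:
    print(
        ch,
        ch.encode("shift_jis").hex(),
        ascii_range_bytes(ch, "shift_jis"),  # one ASCII-valued trail byte
        ascii_range_bytes(ch, "utf-8"),      # empty: every byte is >= 0x80
    )
```

In UTF-8 every byte of a multi-byte sequence has its high bit set, so the second list is empty for any non-ASCII character.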

UTF-8: ASCII is ASCII

  • No need for special string handling

    • Server 3.1.6.INT replaced all standard C string handling with calls to a 3rd-party library

    • With UTF-8, we stick with standard string handling – yacc and other legacy tools work fine

  • Greatly improved perception (and reality) of how difficult I18N would be

    • Now, it’s relatively low-impact
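
To illustrate why code-page-unaware string code keeps working, here is a minimal sketch in Python standing in for legacy C-style byte handling. The function and the sample string are hypothetical and are not Documentum code. It splits on the backslash byte without knowing the encoding.

```python
def split_on_backslash(raw: bytes) -> list[bytes]:
    """Byte-oriented split, standing in for encoding-unaware string code."""
    return raw.split(b"\\")

# Hypothetical sample: two components separated by a single backslash.
name = "ソフト\\文書"

utf8_parts = split_on_backslash(name.encode("utf-8"))
sjis_parts = split_on_backslash(name.encode("shift_jis"))

# UTF-8: the only 0x5C byte is the real separator, so both parts decode.
print([p.decode("utf-8") for p in utf8_parts])

# Shift-JIS: the second byte of the first character is also 0x5C, so the
# split lands inside that character and yields an extra, broken fragment.
print(len(utf8_parts), len(sjis_parts))
```

The same reasoning applies to any ASCII delimiter or quote character a parser looks for, which is consistent with the point above that yacc and similar tools need no changes.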

It’s UTF-8, dummy!

  • Use UTF-8 everywhere, cut down on certification dimensions

  • Provides safe character-handling for Asia

  • Even though multi-lingual is not a requirement

  • Easier to support

Other Success Factors

  • Rely on RDBMS services to translate between RDBMS code-page and UTF-8

  • Market research cut back on OS localization constraints

  • Transcoding infrastructure

RDBMS transcodes to/from UTF-8

  • Oracle and Sybase transcode automatically – SQL Server is a problem

  • No need for new transcoding calls between Server and RDBMS – lower impact

  • Upgrade customers have non-Unicode RDBMS – no need for them to convert

  • One less certification dimension!

Cut back on Localized OS certs

  • Limit RDBMS for Asia – for 4.2, just Oracle

  • Localized OS certification not necessary for Europe

Transcoding Infrastructure

  • Server must be aware of interface code-pages

  • Transcoding done at the interfaces

  • 3rd party transcoding used: Uniscape’s GlobalC
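
Transcoding at the interfaces might look roughly like the sketch below. It uses Python's built-in codecs rather than GlobalC, and the function names are hypothetical; the only detail taken from the slides is that UTF-8 is the single internal encoding.

```python
INTERNAL_ENCODING = "utf-8"  # the single internal code-page named on the slides

def inbound(client_bytes: bytes, client_codepage: str) -> bytes:
    """Transcode bytes arriving from a client into the internal encoding."""
    return client_bytes.decode(client_codepage).encode(INTERNAL_ENCODING)

def outbound(internal_bytes: bytes, client_codepage: str) -> bytes:
    """Transcode internal bytes for a client.

    Raises UnicodeEncodeError if the client's code-page cannot
    represent a character.
    """
    return internal_bytes.decode(INTERNAL_ENCODING).encode(client_codepage)

# A Shift-JIS client and an EUC-KR client both land in UTF-8 internally.
from_japan = inbound("文書".encode("shift_jis"), "shift_jis")
from_korea = inbound("한글".encode("euc_kr"), "euc_kr")
print(from_japan.decode(INTERNAL_ENCODING), from_korea.decode(INTERNAL_ENCODING))
```

Keeping the conversion in two boundary functions means nothing between them needs to know which code-page a client uses.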

New I18N Architecture

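[Architecture diagram; the component labels recovered from it are listed below, with the encoding where the slide gives one]

  • Desktop Client

  • Custom WebApp

  • Web Publisher

  • Intranet Client

  • Administrator

  • WDK (Unicode)

  • RightSite (NCS)

  • WorkSpace

  • DFC (Unicode)

  • Web Cache

  • ARP (NCS)

  • DMCL 4.2 (UTF8)

  • DMCL ≤ 4.1 (NCS)

  • e-Content Server (UTF8)

  • File System

  • Verity

  • RDBMS (Unicode)

Legend: National Character Set (NCS), Unicode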

Demo

  • Demo – multilingual WDK

  • If there’s time, a quick look at localized Desktop Client (Win32 Client)

Conclusion

UTF-8 was a crucial technology in DCTM’s I18N strategy:

  • Provided an easy path for legacy C++

  • Supported specific Asian languages consistently, minimizing certifications

  • Prepared infrastructure for multi-lingual requirements
