
XML: The Promise and the Reality

Presentation Transcript


  1. XML: The Promise and the Reality K. Scott Morrison IBM Pacific Development Centre Vancouver

  2. XML Hype • XML will replace HTML • XML is for documents • XML is for data • XML will replace all message formats • And clear the standards log jam!

  3. Profile Edit Screen NAME: K. Scott Morrison ADDRESS: 8999 Nelson Way CITY: Burnaby STATE/PROV: B.C. COUNTRY: Canada CODE: V5A 4B5 TEL: (604) 293-5753 FAX: (604) 473-5807 CREDIT CARD1 TYPE: VISA NUM: 123456789 EXP: 12/00 CREDIT CARD2 TYPE: AMEX NUM: 987654321 EXP: 04/01

  4. Screen Scrape [Diagram: Remote System, Screen Scrape Server, Persistent Store]

  5. Binary Representation

  6. Binary Representation • Issues • Can’t determine structure from data • Portability • Fixed field length • Brittle interfaces • Must modify all clients and servers simultaneously • Mapping code typically buried in applications • Significant maintenance problem • Distribution of message map • Not human readable
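The brittleness described on this slide can be sketched with Python's `struct` module. The field widths and the `RECORD_V1`/`RECORD_V2` layouts below are hypothetical, chosen only for illustration; the slides do not give the actual binary format.

```python
import struct

# Hypothetical fixed-width layout: 20-byte name, 10-byte city, 2-byte card count.
RECORD_V1 = struct.Struct("<20s10sH")

def pack_profile(name: str, city: str, cards: int) -> bytes:
    # struct pads short byte strings with NULs and silently truncates long ones.
    return RECORD_V1.pack(name.encode("ascii"), city.encode("ascii"), cards)

def unpack_profile(record: bytes) -> tuple:
    # The reader must already know the layout; nothing in the bytes describes it.
    name, city, cards = RECORD_V1.unpack(record)
    return (
        name.rstrip(b"\0").decode("ascii"),
        city.rstrip(b"\0").decode("ascii"),
        cards,
    )

record = pack_profile("K. Scott Morrison", "Burnaby", 2)
print(len(record), unpack_profile(record))

# If the server widens the name field to 30 bytes, every v1 client breaks.
RECORD_V2 = struct.Struct("<30s10sH")
try:
    RECORD_V1.unpack(RECORD_V2.pack(b"K. Scott Morrison", b"Burnaby", 2))
    v1_reads_v2 = True
except struct.error:
    v1_reads_v2 = False
print("v1 client can read v2 record:", v1_reads_v2)
```

The mapping between bytes and fields lives only in the code, which is why clients and servers have to change together.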

  7. ANSI_X3.4-1968 (US-ASCII) Text Representation

  8. US-ASCII Text Representation • Standards-based, reasonably portable • Human readable • Can make conjectures about semantics • Issues: • Limited character set • NLS problems • No real structure: hierarchy, lists, etc • Still very brittle • Distribution of message maps

  9. Proprietary Tagging

  10. Proprietary Tagging • Human understandable • Less brittle interface • Delimited text using tag and <CR> • Issues • Distribution of tag semantics • Non-standard • Character escaping issues • E.g. <CR>, :, etc • No sense of hierarchy • Programmer intensive: parsers, handlers, etc
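A minimal sketch of a hand-written parser for this kind of tagging, assuming records of the form `TAG: value` separated by carriage returns. The exact wire format is not shown in the slides, and the "Suite 5" value is invented to show the escaping problem.

```python
def parse_tagged(text: str) -> dict:
    """Parse hypothetical 'TAG: value' records separated by carriage returns."""
    fields = {}
    for line in text.split("\r"):
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain ':' themselves.
        tag, _, value = line.partition(":")
        fields[tag.strip()] = value.strip()
    return fields

message = "NAME: K. Scott Morrison\rCITY: Burnaby\rEXP: 12/00\r"
profile = parse_tagged(message)
print(profile["CITY"])

# A value that contains the record delimiter is split into a bogus extra record.
broken = parse_tagged("ADDRESS: 8999 Nelson Way\rSuite 5\r")
print(broken)
```

Every application has to repeat this kind of parsing and decide its own escaping rules, which is the "programmer intensive" issue on the slide.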

  11. Formalized Tagging: Markup • Markup is meta-data • Adds information about text • What it means, how to interpret, how to render, etc • Markup delimits • START-----------END • Interface is less brittle • Markup works as a container • Markup adds structure • Hierarchy, etc

  12. Markup Can Be Stylistic Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. <FONT FACE="Times New Roman"> Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad <B> minim </B> veniam, quis nostrud exerci tation <I> ullamcorper </I> suscipit lobortis nisl ut aliquip ex ea <U> commodo </U> consequat. </FONT>

  13. Markup Can Be Structural Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. <P> Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. </P> <P> Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. </P>

  14. Markup Can Be Semantic Lorem ipsum dolor sit amet. consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. <TITLE> Lorem ipsum dolor sit amet. </TITLE> <BODY> consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. </BODY>

  15. Simple Markup Example #1 <Message> Hello, World! </Message>

  16. Simple Markup Example #2 <MessageContainer> <Message> Hello, World! </Message> <Message> Goodbye, World! </Message> </MessageContainer>

  17. Profile Using Markup <Profile> <Name> K. Scott Morrison </Name> <Address> 8999 Nelson Way </Address> <City> Burnaby </City> <StateProvince> BC </StateProvince> <Country> Canada </Country> <ZipPostalCode> V5A 1B5 </ZipPostalCode> <Telephone> (604) 293-5753 </Telephone> <FAX> (604) 473-5807 </FAX>

  18. Profile Using Markup (cont.) <Card> <Type> VISA </Type> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card> <Card> <Type> AMEX </Type> <Number> 987654321 </Number> <Expiry> 0401 </Expiry> </Card> </Profile>
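As a sketch of how a generic parser can read this profile without custom mapping code, the following uses Python's standard `xml.etree.ElementTree`. The document is abbreviated from the two slides above, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

PROFILE_XML = """<Profile>
  <Name> K. Scott Morrison </Name>
  <City> Burnaby </City>
  <Card> <Type> VISA </Type> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card>
  <Card> <Type> AMEX </Type> <Number> 987654321 </Number> <Expiry> 0401 </Expiry> </Card>
</Profile>"""

root = ET.fromstring(PROFILE_XML)

# Scalar fields are direct children of <Profile>.
name = root.findtext("Name").strip()

# The repeated <Card> elements form a list; each card becomes a dict of its children.
cards = [
    {child.tag: child.text.strip() for child in card}
    for card in root.findall("Card")
]

print(name)
for card in cards:
    print(card["Type"], card["Number"], card["Expiry"])
```

Adding a third `<Card>` or a new optional element does not require changing this reader, which is the sense in which the interface is less brittle.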

  19. eXtensible Markup Language • Extensible • Tag set is not fixed • Structural • Deep, hierarchical nesting of structures • Ordered lists (unordered with Schema) • Can infer meaning from structure • Well-formed document requirement • Validation (DTDs and Schema) • Can check structure against a schema

  20. eXtensible Markup Language • Portable • Text-based Unicode • All parsers must support UTF-8 (a superset of US-ASCII) and UTF-16 • Parsers may support: EBCDIC, UCS-4, ISO 646, ISO 8859, Shift-JIS, EUC, etc • Human readable • Machine understandable • W3C standard • Rich set of emerging tools
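A small sketch of the portability point: the same document serialized in two Unicode encodings parses to the same content. The `<City>` element and its non-ASCII value are invented for illustration, and Python's standard `xml.etree.ElementTree` stands in for "a parser".

```python
import xml.etree.ElementTree as ET

BODY = "<City>Montr\u00e9al</City>"  # illustrative value containing a non-ASCII character

# Each byte string carries an XML declaration naming its own encoding.
utf8_bytes = ('<?xml version="1.0" encoding="UTF-8"?>' + BODY).encode("utf-8")
utf16_bytes = ('<?xml version="1.0" encoding="UTF-16"?>' + BODY).encode("utf-16")

city_from_utf8 = ET.fromstring(utf8_bytes).text
city_from_utf16 = ET.fromstring(utf16_bytes).text

# The byte streams differ, but the parsed text is identical.
print(utf8_bytes == utf16_bytes, city_from_utf8 == city_from_utf16)
```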

  21. XML Tooling: IE5

  22. Isn’t This Just Like HTML? • HTML is a markup language based on SGML • Key differences • HTML has a fixed set of tags • HTML mixes stylistic, structural, and semantic tags • HTML does not support deep nesting and hierarchy • HTML documents are often not well formed (browsers tolerate unclosed and mismatched tags)

  23. XML Issues • XML documents must be well formed • Means that structure can be inferred • Well formed: <Message> Hello, World! </Message> • Not well formed: <Message> Hello, World! • Not well formed: Hello, World! </Message> • Not well formed: <Message> Hello, World! </MESSAGE>
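The four cases on this slide can be checked mechanically. In XML terminology the property being tested is well-formedness: every start tag has a matching, case-sensitive end tag. This sketch assumes Python's standard `xml.etree.ElementTree` as the parser; `is_well_formed` is an illustrative helper, not part of any library.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = [
    "<Message> Hello, World! </Message>",   # matching start and end tags
    "<Message> Hello, World!",              # missing end tag
    "Hello, World! </Message>",             # missing start tag
    "<Message> Hello, World! </MESSAGE>",   # tag names are case sensitive
]

results = [is_well_formed(sample) for sample in samples]
for sample, ok in zip(samples, results):
    print(ok, sample)
```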

  24. XML Issues • Knowledge of document organization • Distribution of document organization • Character set issues • Document focus • Parser speed and complexity

  25. Message Organization: DTDs • Document Type Definition • Used to validate documents • Issues: • Doesn’t use XML syntax, not extensible • Writing good DTDs is hard • Very awkward and limited language constructs, uses an EBNF-like grammar • No namespaces • No inheritance or ranges; defaults and enums only for attributes

  26. Profile DTD Example <?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE Profile [ <!ELEMENT Profile (Name, Address, City, StateProvince, Country, ZipPostalCode, Telephone, FAX, Card*)> <!ELEMENT Name (#PCDATA)> <!ELEMENT Address (#PCDATA)> <!-- Lines removed for clarity --> <!ELEMENT Card (Type, Number, Expiry)> <!ELEMENT Type (#PCDATA)> <!ELEMENT Number (#PCDATA)> <!ELEMENT Expiry (#PCDATA)> ]> <Profile> <Name> K. Scott Morrison </Name> <Address> 8999 Nelson Way </Address> <!-- Lines removed for clarity --> <Card> <Type> VISA </Type> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card> <!-- Lines removed for clarity --> </Profile>

  27. Message Organization: Schema • Alternative to DTDs • XML syntax • Much more expressive • Has namespace support • Has inheritance, defaults, ranges (min/max), enumerations, sequences, unordered lists

  28. Transforms: XSL • Source Document: <Profile> <Name> K. Scott Morrison </Name> <Address> 8999 Nelson Way </Address> <City> Burnaby </City> <StateProvince> BC </StateProvince> <Country> Canada </Country> <ZipPostalCode> V5A 1B5 </ZipPostalCode> <Telephone> (604) 293-5753 </Telephone> <FAX> (604) 473-5807 </FAX> <Card> <Name> VISA </Name> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card> <Card> <Name> AMEX </Name> <Number> 987654321 </Number> <Expiry> 0401 </Expiry> </Card> </Profile> • XSL Engine • Destination Document: <VisaCard> <CardNumber> 123456789 </CardNumber> <Expiry> 1200 </Expiry> <Client> <Name> K. Scott Morrison </Name> <Telephone> (604) 293-5753 </Telephone> </Client> </VisaCard>
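To make the source-to-destination mapping concrete, here is a hand-written equivalent in Python. It is not XSLT and does not use an XSL engine; it only sketches what such a transform has to do (select the VISA card, rename and rearrange elements). The function name `to_visa_card` is hypothetical, and the source document is abbreviated.

```python
import xml.etree.ElementTree as ET

SOURCE = """<Profile>
  <Name> K. Scott Morrison </Name>
  <Telephone> (604) 293-5753 </Telephone>
  <Card> <Name> VISA </Name> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card>
  <Card> <Name> AMEX </Name> <Number> 987654321 </Number> <Expiry> 0401 </Expiry> </Card>
</Profile>"""

def to_visa_card(profile: ET.Element) -> ET.Element:
    # Select the first card whose <Name> is VISA; discard the others.
    visa = next(
        card for card in profile.findall("Card")
        if card.findtext("Name").strip() == "VISA"
    )
    out = ET.Element("VisaCard")
    # Copy card fields, renaming <Number> to <CardNumber>.
    ET.SubElement(out, "CardNumber").text = visa.findtext("Number").strip()
    ET.SubElement(out, "Expiry").text = visa.findtext("Expiry").strip()
    # Rearrange the owner's details under a new <Client> element.
    client = ET.SubElement(out, "Client")
    ET.SubElement(client, "Name").text = profile.findtext("Name").strip()
    ET.SubElement(client, "Telephone").text = profile.findtext("Telephone").strip()
    return out

result = to_visa_card(ET.fromstring(SOURCE))
result_xml = ET.tostring(result, encoding="unicode")
print(result_xml)
```

An XSL stylesheet expresses the same selection and rearrangement declaratively, so the mapping can be changed without recompiling application code.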

  29. eXtensible Stylesheet Language • XSL = XSL Transformations (XSLT) + Formatting Objects and Properties • Some basic XSLT functions: • Insertion of static text (like templates) • Copy, discard, or rearrange source text • Compute new text from source

  30. XSLT Example: XML to HTML <?xml version="1.0"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/"> <html> <body> <P> Address is: <xsl:value-of select="Profile/Address"/> </P> <P> Name is: <xsl:value-of select="Profile/Name"/> </P> </body> </html> </xsl:template> </xsl:stylesheet>

  31. XML as Data • XML Document: <Profile> <ID> 123456789 </ID> <Name> K. Scott Morrison </Name> <Address> 8999 Nelson Way </Address> <Card> <Name> VISA </Name> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card> </Profile> • XML-Db Extender • Database: Profile Table, Card Table
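A minimal sketch of the document-to-tables mapping on this slide, using Python's standard `sqlite3` and `xml.etree.ElementTree` in place of the database extender. The table and column names are assumptions; the slide names only a Profile table and a Card table.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = """<Profile>
  <ID> 123456789 </ID>
  <Name> K. Scott Morrison </Name>
  <Address> 8999 Nelson Way </Address>
  <Card> <Name> VISA </Name> <Number> 123456789 </Number> <Expiry> 1200 </Expiry> </Card>
</Profile>"""

def store_profile(conn: sqlite3.Connection, xml_text: str) -> None:
    root = ET.fromstring(xml_text)
    profile_id = root.findtext("ID").strip()
    # Scalar children of <Profile> become one row in the profile table.
    conn.execute(
        "INSERT INTO profile VALUES (?, ?, ?)",
        (profile_id, root.findtext("Name").strip(), root.findtext("Address").strip()),
    )
    # Each repeated <Card> becomes one row in the card table, keyed by profile.
    for card in root.findall("Card"):
        conn.execute(
            "INSERT INTO card VALUES (?, ?, ?, ?)",
            (
                profile_id,
                card.findtext("Name").strip(),
                card.findtext("Number").strip(),
                card.findtext("Expiry").strip(),
            ),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profile (id TEXT PRIMARY KEY, name TEXT, address TEXT)")
conn.execute(
    "CREATE TABLE card (profile_id TEXT REFERENCES profile(id), "
    "name TEXT, number TEXT, expiry TEXT)"
)
store_profile(conn, DOC)

rows = conn.execute(
    "SELECT p.name, c.name, c.expiry FROM profile p JOIN card c ON c.profile_id = p.id"
).fetchall()
print(rows)
```

The nesting in the document maps naturally onto a one-to-many relationship between the two tables.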

  32. XML as Data: Messaging • Client System sends Request XML Message <…> <…> … </…></…> to Server System (RDBMS) • Server System returns Response XML Message <…> <…> … </…></…> • Transport Examples: • HTTP over sockets • MQSeries • etc • Message Infrastructure: • ebXML • SOAP • etc • Message Formats: • OTA • TravelFrame • etc

  33. Schema Distribution • Source System sends XML Message <…> <…> … </…></…> to Destination System • Embedded: DTD travels with the XML Message • Replicated: DTD copied to both systems • Centralized Schema Repository

  34. Applications: Interoperability [Diagram: Thick PC Clients, AS400s, SNA, TCP/IP, Gateway, S390, S390, UNIX Servers]

  35. Applications: Interoperability [Diagram: Thick PC Clients, AS400s, S390, S390, UNIX Servers, connected through MQSeries Integrator routing and transforming XML messages]

  36. Applications: eBusiness [Diagram: AS400s, SNA, TCP/IP, S390, S390, Internet, Web/WML/XML Servers]

  37. Summary • XML is here to stay • There will be heavy vendor support for it because: 1. XML is standardized 2. XML will be the B2B message format • It can be leveraged now in most domains
