
Efficient XML Interchange



Presentation Transcript


  1. Efficient XML Interchange: What is it? Why is it? How does it fit in?

  2. What is Efficient XML Interchange? • Alternative Representation of XML Infoset • support full XML (Infoset) data model • not a subset • no really, not a subset! • Interchange Format • optimized for data exchange • transmission, storage, processing • can use Schema, conventional compression

  3. Why? • Expand the Web • limited uptake of XML & friends in certain domains • performance is a problem • noteworthy domains • mobile, embedded, scientific, … • Lesson From Binary XML Formats • real need, and real solutions • widely applicable, win-win • multiple formats cause segregation, limit adoption

  4. Integration into XML Stack • Same Data Model • merely an alternative encoding • Open Issues • format, or encoding? • content negotiation? • schema knowledge vs content negotiation • modes, configurability (e.g. simple types)

  5. WebAPI / EXI? • Impact on… • APIs • initialisation: encoding modes, schema info? • XMLHttpRequest • again: modes, schema info? • diversity of formats? • Are data models in sync? • HTML as XML? • REX • fragment support?

  6. Efficient XML Interchange Format Basics

  7. Efficient XML Interchange • Goal(s) • maintain XML (Infoset) data model • seamless integration into XML software stack • improve compaction AND processing • Observation: • ‘smallness’ has multiple benefits • e.g. energy consumption during transmission • allows XML deployment in new scenarios • Underlying Philosophy: • exploit a-priori knowledge of (likely) content

  8. How does it work? • Exploit Knowledge, at Several Different Levels • XML knowledge • copious syntactic redundancy • Schema knowledge • schema describes content in detail • heuristics • e.g. (declared) elements >> processing instructions • e.g. repeated string elements • e.g. small numbers >> large numbers • Cooperation with Conventional Compression • heavily biased data stream as compressor input
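
The "small numbers >> large numbers" heuristic can be illustrated with a variable-length integer coding. The following is a minimal sketch, not code from the presentation: it stores 7 payload bits per octet, least significant group first, and uses the high bit as a continuation flag, so small values take a single octet. The function names `encode_unsigned` and `decode_unsigned` are made up for this example.

```python
def encode_unsigned(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per octet, low-order group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        low = n & 0x7F          # next 7 payload bits
        n >>= 7
        if n:
            out.append(low | 0x80)  # high bit set: more octets follow
        else:
            out.append(low)         # high bit clear: last octet
            return bytes(out)


def decode_unsigned(data: bytes) -> int:
    """Inverse of encode_unsigned; reads until an octet without the high bit."""
    value = 0
    shift = 0
    for octet in data:
        value |= (octet & 0x7F) << shift
        shift += 7
        if not octet & 0x80:
            return value
    raise ValueError("truncated input")
```

With this scheme any value below 128 costs one octet, while larger values grow by one octet per additional 7 bits, which is the bias toward small numbers that the slide describes.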

  9. EXI Base Format • Coding Grammars • ‘generic’ grammar: describes full XML Infoset • arbitrary elements, PIs, comments, entity references, etc. • schema-derived grammar • describes a specific format • content-derived grammar • adds rules depending on encountered elements • splice these together, at very fine granularity • allow anything, but know what is (currently) likely • likely content: more efficient encoding
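
The content-derived idea ("add rules depending on encountered elements") can be sketched as a grammar that starts generic and learns element names as they appear. This is a deliberately simplified illustration: the class `LearningGrammar`, its initial production list, and the single-level fixed-width event codes are assumptions made for the example and do not reproduce the actual EXI event-code layout.

```python
class LearningGrammar:
    """Toy element-content grammar that adds a rule for each new element name."""

    def __init__(self):
        # Generic productions only; SE(*) matches any start-element event.
        self.productions = ["EE", "SE(*)", "CH"]

    def code_width(self) -> int:
        """Bits needed to distinguish the current productions."""
        return (len(self.productions) - 1).bit_length()

    def encode(self, event: str):
        """Return (event_code, width_in_bits, literal_name_or_None)."""
        width = self.code_width()
        if event in self.productions:
            # Known event: the code alone identifies it, no name is sent.
            return (self.productions.index(event), width, None)
        if event.startswith("SE(") and event.endswith(")"):
            # Unknown element: fall back to SE(*) and send the name literally,
            # then learn a dedicated rule so the next occurrence is cheap.
            code = self.productions.index("SE(*)")
            name = event[3:-1]
            self.productions.insert(0, event)
            return (code, width, name)
        raise ValueError(f"event not allowed here: {event}")
```

The first `SE(item)` is coded through the wildcard and carries the name; every later `SE(item)` is a bare event code. Anything is still allowed via `SE(*)`, but content that has already been seen is encoded more compactly.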

  10. EXI Base Format: Built-in, Generic Element Grammar • [Diagram: grammar states and transitions; labels include Element, StartTag, SE(*), CH, ER, CM, PI, AT(*), NS, EE]

  11. EXI Base Format: A Schema-Based Grammar • Element Content Model: • (optional) attribute “color” • (optional) element “desc” • (mandatory) elements quantity, price • [Diagram: grammar transitions labelled AT(color), SE(desc), SE(quantity), SE(price), EE]
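
The content model on this slide can be written down as a small state machine. The sketch below assumes the order `AT(color)?`, `SE(desc)?`, `SE(quantity)`, `SE(price)`, `EE`, which the slide does not state explicitly; the state names, the `GRAMMAR` table, and the fixed-width codes are hypothetical. It tracks only the parent element's own events; the content of each child element would be handled by that child's grammar.

```python
# Hypothetical schema-derived grammar: state -> [(event, next_state), ...]
GRAMMAR = {
    "start": [("AT(color)", "after_color"),
              ("SE(desc)", "after_desc"),
              ("SE(quantity)", "after_quantity")],
    "after_color": [("SE(desc)", "after_desc"),
                    ("SE(quantity)", "after_quantity")],
    "after_desc": [("SE(quantity)", "after_quantity")],
    "after_quantity": [("SE(price)", "after_price")],
    "after_price": [("EE", "done")],
}


def code_width(choices: int) -> int:
    """Bits needed to pick one of `choices` alternatives (0 if only one)."""
    return (choices - 1).bit_length()


def encode_events(events):
    """Map an event sequence to (event_code, width_in_bits) pairs."""
    state = "start"
    codes = []
    for event in events:
        productions = GRAMMAR[state]
        names = [name for name, _ in productions]
        if event not in names:
            raise ValueError(f"{event} not allowed in state {state}")
        index = names.index(event)
        codes.append((index, code_width(len(names))))
        state = productions[index][1]
    if state != "done":
        raise ValueError("element content is incomplete")
    return codes


def total_bits(codes) -> int:
    """Total number of bits spent on event codes."""
    return sum(width for _, width in codes)
```

In states where the schema leaves only one possibility (the mandatory `quantity` followed by `price`, then `EE`), the event code is zero bits wide: the structure costs nothing to transmit because the grammar already knows it.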

  12. EXI Base Format: Merged Generic & Schema-Derived Grammar • [Diagram: schema-derived transitions SE(quantity), SE(price), SE(desc), EE, with the generic transitions SE(*), CH, ER, CM, PI available alongside them; further labels include quantity, desc]

  13. Other Major EXI Features • Simple Type Values • optimized codecs • type assignment through grammar • generic text coding always available • string / value tables • Bit-packed vs byte-aligned codec • biased input into “deflate” compression
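
A string table can be sketched as follows: the first occurrence of a string is sent literally and assigned the next free index, and later occurrences are sent as that index only. This is an illustrative sketch under simplifying assumptions (a single global table, and "hit"/"miss" tuples instead of an actual bit-level encoding); the class names `StringTableEncoder` and `StringTableDecoder` are made up for the example.

```python
class StringTableEncoder:
    """Replaces repeated strings with the index of their first occurrence."""

    def __init__(self):
        self._index = {}

    def encode(self, value: str):
        hit = self._index.get(value)
        if hit is not None:
            return ("hit", hit)           # seen before: send compact index
        self._index[value] = len(self._index)
        return ("miss", value)            # first occurrence: send literal


class StringTableDecoder:
    """Rebuilds the same table on the receiving side."""

    def __init__(self):
        self._strings = []

    def decode(self, token):
        kind, payload = token
        if kind == "hit":
            return self._strings[payload]
        self._strings.append(payload)     # learn the literal in the same order
        return payload
```

Because encoder and decoder grow their tables in the same order, no table needs to be transmitted. Repeated values shrink to small integers, which also yields the heavily biased stream that a conventional compressor such as deflate handles well.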

  14. Impact on the XML Stack • Questions • content negotiation, header • HTTP integration? • what do you need? what would be a problem? • pre-shared schemas • which formats? samples? • (X)HTML? AJAX? • need ‘hooks’ in the specification? • options / variables • different schemas, different options?
