
Office Open XML Overview

Office Open XML Overview. Štěpán Bechynský. Open XML Specification. Part 1: Fundamentals, 173 pages Part 2: Open Packaging Convention, 129 pages Part 3: Primer, 472 pages Part 4: Markup Language Reference, 5219 pages Part 5: Markup Compatibility and Extensibility, 43 pages.


Presentation Transcript


  1. Office Open XML Overview Štěpán Bechynský

  2. Open XML Specification • Part 1: Fundamentals, 173 pages • Part 2: Open Packaging Convention, 129 pages • Part 3: Primer, 472 pages • Part 4: Markup Language Reference, 5219 pages • Part 5: Markup Compatibility and Extensibility, 43 pages http://www.ecma-international.org/

  3. Programmer View of Open XML Files • ZIP Archive • Document Parts • XML Parts • Binary Parts • Typed (RFC 2616) • Relationships • Connections between parts • Content Type Stream • A specially-named stream • Defines mappings from part names to content types • Not itself a part, not URI addressable • Folder structure for convenience only DEMO
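The programmer view above can be sketched with Python's standard library alone. This is a minimal illustration, not a conformant OPC reader: the package built by `build_demo_package` is a hypothetical stand-in (it is not a valid Office file), and the helper names are invented for this example. It shows that the content type stream maps part names to content types and is not itself listed as a part.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
WORD_MAIN = ("application/vnd.openxmlformats-officedocument."
             "wordprocessingml.document.main+xml")


def build_demo_package():
    """Build a tiny, hypothetical OPC-style ZIP in memory (demo data only)."""
    content_types = (
        f'<Types xmlns="{CT_NS}">'
        '<Default Extension="jpg" ContentType="image/jpeg"/>'
        f'<Override PartName="/word/document.xml" ContentType="{WORD_MAIN}"/>'
        '</Types>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", content_types)
        zf.writestr("word/document.xml", "<document/>")      # XML part
        zf.writestr("word/media/image1.jpg", b"\xff\xd8\xff")  # binary part
    return buf.getvalue()


def part_content_types(package):
    """Return {part name: content type} using Default and Override entries."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
        names = zf.namelist()
    defaults = {d.get("Extension").lower(): d.get("ContentType")
                for d in root.findall(f"{{{CT_NS}}}Default")}
    overrides = {o.get("PartName"): o.get("ContentType")
                 for o in root.findall(f"{{{CT_NS}}}Override")}
    types = {}
    for name in names:
        if name == "[Content_Types].xml":
            continue  # the content type stream is not itself a part
        part_name = "/" + name
        extension = name.rsplit(".", 1)[-1].lower()
        # An Override for the exact part name wins over the extension Default.
        types[part_name] = overrides.get(part_name, defaults.get(extension))
    return types


if __name__ == "__main__":
    for part, content_type in part_content_types(build_demo_package()).items():
        print(part, "->", content_type)
```

The ZIP entry paths are only used to derive part names; nothing in the lookup depends on the folder layout.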

  4. How to think about OPC packages Files and folders – NO! Parts and relationships – YES
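One way to read "parts and relationships" in code is to navigate by relationship Id rather than by path. The sketch below assumes the usual OPC layout, where the relationships for a source part live in a `_rels` folder next to it; the function names and the sample relationship Ids are hypothetical.

```python
import posixpath
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def rels_part_for(part_name):
    """Name of the relationships part that belongs to a source part."""
    folder, base = posixpath.split(part_name)
    return posixpath.join(folder, "_rels", base + ".rels")


def resolve_targets(source_part, rels_xml):
    """Map relationship Id -> absolute part name, or the URI if external."""
    base_dir = posixpath.dirname(source_part)
    targets = {}
    for rel in ET.fromstring(rels_xml).findall(f"{{{REL_NS}}}Relationship"):
        target = rel.get("Target")
        if rel.get("TargetMode") != "External":
            # Internal targets are resolved relative to the source part.
            target = posixpath.normpath(posixpath.join(base_dir, target))
        targets[rel.get("Id")] = target
    return targets


# Hypothetical relationships part for /word/document.xml.
SAMPLE_RELS = (
    f'<Relationships xmlns="{REL_NS}">'
    f'<Relationship Id="rId4" Type="{DOC_REL}/image" Target="media/image1.jpg"/>'
    f'<Relationship Id="rId5" Type="{DOC_REL}/hyperlink" '
    'Target="http://www.openxmldeveloper.org" TargetMode="External"/>'
    '</Relationships>'
)

if __name__ == "__main__":
    print(rels_part_for("/word/document.xml"))
    print(resolve_targets("/word/document.xml", SAMPLE_RELS))
```

A consumer written this way keeps working if a producer stores the image under a different folder, because only the relationship target changes.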

  5. Ecma Office Open XML Specifications • Markup Languages: WordprocessingML, SpreadsheetML, PresentationML • Vocabularies: DrawingML, Custom XML, Bibliography, VML (legacy), Metadata, Equations • Open Packaging Convention: Relationships, Content Types, Digital Signatures • Core Technologies: ZIP, XML + Unicode

  6. WordprocessingML Document Architecture • A WordprocessingML file is a collection of multiple subdocuments: • The main story • Header(s) / Footer(s) • Footnote(s) / Endnote(s) • Subdocuments • Comment(s) • Diagram: Document (body, properties) with related parts: comments, images, footnotes/endnotes, numberingDefinitions, headers/footers, styles, fontTable, customXML

  7. Paragraph Example Simple text formatting at the paragraph/run levels: • Paragraph properties specify bold (default for the entire paragraph) • Run properties specify italics (override for this run) <w:p> <w:pPr> <w:b/> </w:pPr> <w:r> <w:t>The quick</w:t> </w:r> <w:r> <w:rPr> <w:i/> </w:rPr> <w:t>brown</w:t> </w:r> <w:r> <w:t>fox.</w:t> </w:r> </w:p>
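The paragraph/run layering in this slide can be read back with a few lines of Python. The sketch mirrors the slide's simplified markup and its reading that paragraph-level properties act as defaults for every run; it is not a schema validator, and the helper names are invented for this example. The only addition to the slide's XML is the `xmlns:w` declaration that a standalone fragment needs.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# The slide's paragraph, with the namespace declaration added.
PARAGRAPH = f"""<w:p xmlns:w="{W}">
  <w:pPr><w:b/></w:pPr>
  <w:r><w:t>The quick</w:t></w:r>
  <w:r><w:rPr><w:i/></w:rPr><w:t>brown</w:t></w:r>
  <w:r><w:t>fox.</w:t></w:r>
</w:p>"""


def local_names(parent):
    """Local names of the child elements, e.g. {'b', 'i'}; empty if absent."""
    if parent is None:
        return set()
    return {child.tag.split("}", 1)[1] for child in parent}


def runs_with_formatting(paragraph_xml):
    """Return [(run text, sorted property names)] for each run."""
    p = ET.fromstring(paragraph_xml)
    paragraph_defaults = local_names(p.find(f"{{{W}}}pPr"))
    runs = []
    for r in p.findall(f"{{{W}}}r"):
        text = "".join(t.text or "" for t in r.findall(f"{{{W}}}t"))
        # Run properties are layered on top of the paragraph-level defaults.
        props = paragraph_defaults | local_names(r.find(f"{{{W}}}rPr"))
        runs.append((text, sorted(props)))
    return runs


if __name__ == "__main__":
    for text, props in runs_with_formatting(PARAGRAPH):
        print(repr(text), props)
```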

  8. Images • An image is a w:pict element inside a run <w:r> • The v:imagedata element is defined in VML: xmlns:v="urn:schemas-microsoft-com:vml" • The actual image is referenced via a relationship: <w:pict> <v:shape id="_x0000_i1025" type="#_x0000_t75" style="width:250; height:200"> <v:imagedata r:id="rId4"/> </v:shape> </w:pict> • The relationship points to an image part in the package: <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="image1.jpg"/>

  9. Hyperlinks • A hyperlink is nested inside a paragraph, outside a run: <w:p> <w:hyperlink r:id="linkRel1"> <w:r> <w:rPr> <w:color w:val="0000FF" w:themeColor="hyperlink" /> <w:u w:val="single" /> </w:rPr> <w:t>Click here for OpenXmlDeveloper.org.</w:t> </w:r> </w:hyperlink> </w:p> • The destination is stored in a relationship: <Relationship Id="linkRel1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink" Target="http://www.openxmldeveloper.org" TargetMode="External" />
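Because the destination lives in the relationship rather than in the markup, a consumer has to join the two. A minimal sketch, assuming the document fragment and its relationships part have already been read as strings; the function name `hyperlinks` is hypothetical, and the run formatting from the slide is omitted for brevity.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"

DOCUMENT = f"""<w:p xmlns:w="{W}" xmlns:r="{R}">
  <w:hyperlink r:id="linkRel1">
    <w:r><w:t>Click here for OpenXmlDeveloper.org.</w:t></w:r>
  </w:hyperlink>
</w:p>"""

RELS = f"""<Relationships xmlns="{PKG_REL}">
  <Relationship Id="linkRel1" Type="{R}/hyperlink"
                Target="http://www.openxmldeveloper.org" TargetMode="External"/>
</Relationships>"""


def hyperlinks(document_xml, rels_xml):
    """Return [(link text, destination)] by joining r:id with the rels part."""
    targets = {rel.get("Id"): rel.get("Target")
               for rel in ET.fromstring(rels_xml).iter(f"{{{PKG_REL}}}Relationship")}
    links = []
    for link in ET.fromstring(document_xml).iter(f"{{{W}}}hyperlink"):
        text = "".join(t.text or "" for t in link.iter(f"{{{W}}}t"))
        # r:id is a namespaced attribute, so it is looked up by qualified name.
        links.append((text, targets.get(link.get(f"{{{R}}}id"))))
    return links


if __name__ == "__main__":
    print(hyperlinks(DOCUMENT, RELS))
```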

  10. SpreadsheetML • Diagram: Workbook with related parts: properties, styles, sharedStrings, calcChain, sheet1..N, table, chart, drawing

  11. Minimal Workbook/Worksheet • workbook.xml: <workbook> <sheets> <sheet name="Sheet1" sheetId="1" r:id="rId1"/> </sheets> </workbook> • sheet1.xml: <worksheet> <sheetData/> </worksheet> • The r:id on the sheet element is resolved through a relationship to the sheet1.xml part
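The two parts in this slide need some packaging around them before they form a package. The sketch below writes them into a ZIP together with a content type stream and two relationships parts. The `xl/` folder layout, the namespace declarations, and the helper names are assumptions added for the example (the slide omits them), and the result has not been checked against any particular spreadsheet application.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"
CT = "http://schemas.openxmlformats.org/package/2006/content-types"
SML = "application/vnd.openxmlformats-officedocument.spreadsheetml"

PARTS = {
    "[Content_Types].xml":
        f'<Types xmlns="{CT}">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        f'<Override PartName="/xl/workbook.xml" ContentType="{SML}.sheet.main+xml"/>'
        f'<Override PartName="/xl/worksheets/sheet1.xml" ContentType="{SML}.worksheet+xml"/>'
        '</Types>',
    # Package-level relationship: where the main workbook part is.
    "_rels/.rels":
        f'<Relationships xmlns="{PKG_REL}">'
        f'<Relationship Id="rId1" Type="{DOC_REL}/officeDocument" Target="xl/workbook.xml"/>'
        '</Relationships>',
    "xl/workbook.xml":
        f'<workbook xmlns="{MAIN}" xmlns:r="{DOC_REL}">'
        '<sheets><sheet name="Sheet1" sheetId="1" r:id="rId1"/></sheets>'
        '</workbook>',
    # Workbook-level relationship: what rId1 on the sheet element points to.
    "xl/_rels/workbook.xml.rels":
        f'<Relationships xmlns="{PKG_REL}">'
        f'<Relationship Id="rId1" Type="{DOC_REL}/worksheet" Target="worksheets/sheet1.xml"/>'
        '</Relationships>',
    "xl/worksheets/sheet1.xml":
        f'<worksheet xmlns="{MAIN}"><sheetData/></worksheet>',
}


def build_minimal_workbook():
    """Write the parts above into an in-memory ZIP and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, xml in PARTS.items():
            zf.writestr(name, xml)
    return buf.getvalue()


def sheet_part(package, sheet_name):
    """Follow the sheet's r:id through the workbook relationships part."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        workbook = ET.fromstring(zf.read("xl/workbook.xml"))
        rels = ET.fromstring(zf.read("xl/_rels/workbook.xml.rels"))
    targets = {rel.get("Id"): rel.get("Target") for rel in rels}
    for sheet in workbook.iter(f"{{{MAIN}}}sheet"):
        if sheet.get("name") == sheet_name:
            return "xl/" + targets[sheet.get(f"{{{DOC_REL}}}id")]
    return None


if __name__ == "__main__":
    print(sheet_part(build_minimal_workbook(), "Sheet1"))
```

Saving the returned bytes with an `.xlsx` extension is the natural next step for trying the file in a consumer.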

  12. Strings in SpreadsheetML Two ways a string can be stored: • Inline strings • Provided for ease of translation/conversion • Useful in XSLT scenarios • Excel and other consumers may convert to shared strings • An entry in the shared-strings table • May be either a simple string or formatted text • These approaches may be mixed/combined

  13. Inline Strings • Inline string support provides a very simple mechanism for programmatically populating a worksheet • Especially useful in XSLT scenarios • Excel 2007 converts to shared strings on save • If you're consuming Open XML documents, you must handle both cases: inline strings and/or shared strings • To convert our shared-strings example to inline strings, just replace sheetData: <sheetData> <row><c t="inlineStr"><is><t>Paris</t></is></c></row> <row><c t="inlineStr"><is><t>Seattle</t></is></c></row> <row><c t="inlineStr"><is><t>London</t></is></c></row> <row><c t="inlineStr"><is><t>Copenhagen</t></is></c></row> <row><c t="inlineStr"><is><t>Paris</t></is></c></row> <row><c t="inlineStr"><is><t>London</t></is></c></row> </sheetData>
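Handling both cases on the consuming side can be sketched as a single pass over the cells. The example assumes the shared-strings table has already been loaded into a Python list; the mixed sample sheet and the name `cell_strings` are hypothetical, and cells of any other type are simply skipped.

```python
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"m": MAIN}

# Hypothetical sheet mixing both storage styles.
SHEET_DATA = f"""<sheetData xmlns="{MAIN}">
  <row><c t="inlineStr"><is><t>Paris</t></is></c></row>
  <row><c t="s"><v>1</v></c></row>
  <row><c t="inlineStr"><is><t>London</t></is></c></row>
  <row><c t="s"><v>3</v></c></row>
</sheetData>"""

SHARED = ["Paris", "Seattle", "London", "Copenhagen"]


def cell_strings(sheet_data_xml, shared_strings):
    """Return the string value of every string cell, inline or shared."""
    values = []
    for cell in ET.fromstring(sheet_data_xml).iter(f"{{{MAIN}}}c"):
        kind = cell.get("t")
        if kind == "inlineStr":
            inline = cell.find("m:is", NS)
            if inline is not None:
                # Formatted text may split the string over several <t> elements.
                values.append("".join(t.text or "" for t in inline.iter(f"{{{MAIN}}}t")))
        elif kind == "s":
            # The cell value is a 0-based index into the shared-strings table.
            values.append(shared_strings[int(cell.findtext("m:v", namespaces=NS))])
    return values


if __name__ == "__main__":
    print(cell_strings(SHEET_DATA, SHARED))
```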

  14. Shared Strings • By default, strings are stored in a shared-strings part: • Each unique string is stored once • Cells store the index (0-based) of the string • This design is based on analysis of typical spreadsheet contents: highly repetitive strings are very common • Benefits: • Users: reduced file size, improved performance • Developers: all strings are in one part, simplifying search, localization, and other common string-handling objectives

  15. Shared Strings: example • sharedStrings.xml contents (6 string references, 4 unique strings): <sst xmlns="..." count="6" uniqueCount="4"> <si> <t>Paris</t> </si> <si> <t>Seattle</t> </si> <si> <t>London</t> </si> <si> <t>Copenhagen</t> </si> </sst> • Worksheet contents (Paris = string 0): <row r="1" spans="1:1"> <c r="A1" t="s"> <v>0</v> </c> </row>
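The counts in this example follow directly from deduplicating the cell values. As a rough sketch of the producing side, the function below (a hypothetical helper, not part of any Open XML library) assigns each unique string a 0-based index in first-seen order and serializes the table with `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# The six cell values from the inline-strings slide.
VALUES = ["Paris", "Seattle", "London", "Copenhagen", "Paris", "London"]


def build_shared_strings(values):
    """Return (sst XML text, per-cell indices) for a list of string values."""
    index_of = {}
    indices = []
    for value in values:
        # A new string gets the next free index; a repeat reuses its index.
        indices.append(index_of.setdefault(value, len(index_of)))

    sst = ET.Element(f"{{{MAIN}}}sst",
                     count=str(len(values)),
                     uniqueCount=str(len(index_of)))
    for string in index_of:  # dicts preserve insertion order
        si = ET.SubElement(sst, f"{{{MAIN}}}si")
        ET.SubElement(si, f"{{{MAIN}}}t").text = string

    ET.register_namespace("", MAIN)  # serialize with a default namespace
    return ET.tostring(sst, encoding="unicode"), indices


if __name__ == "__main__":
    xml_text, cell_indices = build_shared_strings(VALUES)
    print(xml_text)
    print(cell_indices)
```

Each returned index is what goes into the `<v>` element of a cell with `t="s"`.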
