XML technologies for text encoding

XML technologies for text encoding. Tamás Váradi, varadi@nytud.hu. Introduction: Processing XML files; CSS – getting the picture right; XPath – finding our way around; XSLT – extracting the right info; Encoding content the right way; Text Encoding Initiative; TEI Lite; Tools. Benefits of XML.

Presentation Transcript


  1. XML technologies for text encoding Tamás Váradi varadi@nytud.hu

  2. Introduction • Processing XML files • CSS – getting the picture right • XPath – finding our way around • XSLT – extracting the right info • Encoding content the right way • Text Encoding Initiative • TEI Lite • Tools

  3. Benefits of XML • makes structure and content clear • encoding is independent of display and device • portable, platform independent • ideal for exchange of data • with a DTD, validation of documents is easy

  4. Limitations of XML • Verbose annotation increases the size of the files (sometimes hugely) • Not very efficient format for fast access and recall

  5. Displaying XML files? • Style sheets • consistent design • easy to change • one stylesheet can serve many XML documents • one document can use different stylesheets

  6. Cascading Stylesheets Elements are associated with display styles. Example rule: h1 { font-size: 3em; } where h1 is the selector, font-size is the property and 3em is the value. A stylesheet is a collection of style rules

  7. Declaring the stylesheet • General form: <?xml-stylesheet type="text/css" href="url-of-stylesheet"?> • Example: <?xml version="1.0"?> <?xml-stylesheet type="text/css" href="cards.css"?>
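
The stylesheet declaration is an ordinary processing instruction that sits before the root element, so any XML parser can read it back. A minimal sketch using Python's standard library follows; the sample document and the helper name stylesheet_instructions are assumptions made for illustration, and only cards.css comes from the slide.

```python
from xml.dom import minidom

# Hypothetical document; only the stylesheet name cards.css is from the slide.
SOURCE = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet type="text/css" href="cards.css"?>\n'
    '<cards><card>Hello</card></cards>'
)


def stylesheet_instructions(xml_text):
    """Return the data of every top-level xml-stylesheet processing instruction."""
    doc = minidom.parseString(xml_text)
    return [
        node.data
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


found = stylesheet_instructions(SOURCE)
print(found)
```

Note that no space is allowed between "<?" and the target name; a parser is expected to reject a declaration written with one.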

  8. An example • Load the file letter.xml into Internet Explorer • Now load the file letter2.xml • View source • Open the file letter.css in notepad • Check that what you see corresponds to what is in the css file

  9. Cascading stylesheets • Features are inherited down the XML tree • Three levels of applying styles: • External stylesheets • Internal style definitions • Inline style settings
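
Inheritance down the tree can be pictured as each element starting from its parent's settings and overriding them with its own rules. The sketch below is a simplified model, not how a browser implements CSS: it assumes every property is inherited (real CSS inherits only some), and the letter markup and the rule table are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical letter and style rules, invented for this illustration.
LETTER = ET.fromstring(
    "<letter><body><p>Dear <name>Anna</name></p></body></letter>"
)
RULES = {
    "letter": {"font-family": "serif"},
    "name": {"font-weight": "bold"},
}


def computed_styles(root, rules):
    """Map each element tag to its own rules merged over the inherited ones."""
    result = {}

    def walk(element, inherited):
        # Own rules win over whatever came down from the parent.
        style = {**inherited, **rules.get(element.tag, {})}
        result[element.tag] = style
        for child in element:
            walk(child, style)

    walk(root, {})
    return result


styles = computed_styles(LETTER, RULES)
print(styles["name"])
```

Here the name element picks up the serif font declared on letter and adds its own bold weight.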

  10. Limitations of CSS • Elements are formatted in their original sequence • No means to reorder elements • No means to select a set of elements
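
To make the limitation concrete, here is the kind of selecting and reordering that CSS cannot express. The sketch uses Python's ElementTree as a stand-in for a transformation language such as XSLT; it is not XSLT itself, and the letter markup and the reorder helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical letter whose children are stored in an awkward order.
LETTER = ET.fromstring(
    "<letter>"
    "<closing>Yours</closing>"
    "<salutation>Dear Anna</salutation>"
    "<note>draft</note>"
    "<body>See you soon.</body>"
    "</letter>"
)


def reorder(root, wanted):
    """Build a new tree holding only the wanted children, in the wanted order."""
    out = ET.Element(root.tag)
    for tag in wanted:
        for child in root.findall(tag):
            out.append(child)
    return out


result = reorder(LETTER, ["salutation", "body", "closing"])
print([child.tag for child in result])
```

The note element is left out and the remaining elements appear in reading order, which a stylesheet that only formats elements in their original sequence cannot achieve.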

  11. More advanced techniques • XSL – Extensible Stylesheet Language • XSLT – XSL Transformations • XPath – a standard way to find elements in the XML hierarchy
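
A few path expressions show how XPath addresses elements in the hierarchy. The sketch below uses Python's ElementTree, which supports only a subset of XPath 1.0; the sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two divisions.
DOC = ET.fromstring(
    "<text>"
    "<div type='chapter'><head>One</head><p>First.</p><p>Second.</p></div>"
    "<div type='appendix'><head>Notes</head><p>Third.</p></div>"
    "</text>"
)

# Every <p> element anywhere below the root.
all_paragraphs = [p.text for p in DOC.findall(".//p")]

# Headings of divisions whose type attribute equals 'chapter'.
chapter_heads = [h.text for h in DOC.findall("./div[@type='chapter']/head")]

# The second <p> child of any <div> (positions count from 1).
second_paragraphs = [p.text for p in DOC.findall(".//div/p[2]")]

print(all_paragraphs, chapter_heads, second_paragraphs)
```

The same expressions serve inside XSLT templates to select the nodes a transformation should act on.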

  12. XSLT • See the excellent introduction to XSLT by Sebastian Rahtz available here

  13. Standard annotation of content • XML is an annotation standard • it is not designed for any particular domain • Need for a standard way of encoding typical text genres such as books, dictionaries, letters, radio news etc. • => TEXT ENCODING INITIATIVE (TEI)
