
CEAL Preconference Workshop: XML

CEAL Preconference Workshop: XML. Wooseob Jeong Assistant Professor School of Information Studies University of Wisconsin – Milwaukee March 2, 2004 San Diego, CA Sponsored by School of Information Studies, University of Wisconsin – Milwaukee University of California – San Diego Library.


Presentation Transcript


  1. CEAL Preconference Workshop: XML Wooseob Jeong Assistant Professor School of Information Studies University of Wisconsin – Milwaukee March 2, 2004 San Diego, CA Sponsored by School of Information Studies, University of Wisconsin – Milwaukee University of California – San Diego Library

  2. Why XML? • Simply because it’s already everywhere. • MS Office • XHTML - WYSIWYG • PDF • RDF - Dublin Core, RSS • MARC in XML • E-books

  3. What is XML? • Extensible Markup Language • XML is a concept, not an application. • Meta Language • Linguistics for individual languages • XHTML is an application of XML. • Brief history of XML • SGML – HTML • Not enough … why?

  4. Learning XML • No technical experience is needed. • Having no HTML experience is fine, too. • HTML vs. XHTML (different families) • Again, XML is a concept. • Good starting points for XML • http://www.infomotions.com/musings/getting-started/

  5. XML is simple but very strict. • You can make your own markup set as you like, with minimal requirements. • Every tag must be paired. • Tags must nest in a hierarchy. • However, once you establish the set, you have to follow it. It’s the law! No exceptions. Otherwise, your document won’t be displayed at all. • “Well-formedness” – the minimum requirement • DTD (Document Type Definition)
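
The pairing and hierarchy rules above can be checked mechanically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser as the checker; the helper name `is_well_formed` and the sample strings are illustrative and not part of the workshop files.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Every tag is paired and the tags nest in a proper hierarchy.
good = "<menu><food><name>Bibimbap</name></food></menu>"

# <name> is never closed, so the tags are not paired.
unpaired = "<menu><food><name>Bibimbap</food></menu>"

# <b> and <i> overlap instead of nesting inside one another.
overlapping = "<p><b><i>text</b></i></p>"

for label, sample in [("good", good), ("unpaired", unpaired), ("overlapping", overlapping)]:
    print(label, is_well_formed(sample))
```

A parser that reports only well-formedness says nothing about whether the document follows a particular DTD; that is the separate step called validation.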

  6. Philosophy of XML • Separation of presentation information from its content. No decorating information allowed in contents. • Presentation should be rendered by methods outside the document, currently either CSS or XSLT • CSS has been used in HTML as well as in XML. • Ex) http://www.uwm.edu/~dhedberg/MENU.xml • XSLT is more powerful. • Ex) http://web.utk.edu/~rgilmou1/xml4lita/ • More Examples

  7. Markup information • Presentational Markup: Describe Appearance <blockquote> 1234 N. Oakland Ave. Milwaukee, WI 53201 </blockquote> • Semantic Markup: Indicates Meaning <address> <street>1234 N. Oakland Ave.</street> <city>Milwaukee</city> <state>WI</state> <zip>53201</zip> </address>
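
One practical payoff of semantic markup is that a program can pick out individual fields. A small sketch, assuming Python's standard `xml.etree.ElementTree`; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Presentational markup: the whole address is one undifferentiated string.
presentational = "<blockquote>1234 N. Oakland Ave. Milwaukee, WI 53201</blockquote>"

# Semantic markup: each part of the address is labeled with its meaning.
semantic = """<address>
  <street>1234 N. Oakland Ave.</street>
  <city>Milwaukee</city>
  <state>WI</state>
  <zip>53201</zip>
</address>"""

# With presentational markup, all we get back is the raw text.
raw_text = ET.fromstring(presentational).text

# With semantic markup, each field can be addressed by name.
address = ET.fromstring(semantic)
fields = {child.tag: child.text for child in address}

print(raw_text)
print(fields["zip"])
```

Getting the ZIP code out of the `<blockquote>` version would require guessing at the text layout; in the `<address>` version it is simply the content of `<zip>`.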

  8. Your First XML Document • Using NotePad, please follow the instruction at • http://supervoca.com/xml/first.htm • The result should look like • http://supervoca.com/xml/first.xml

  9. Restaurant Menu Exercise • Well-formedness • CSS (Cascading Style Sheets) • Simple but not flexible • XSLT (Extensible Stylesheet Language Transformations) • An XSLT stylesheet is itself an XML document. • Complex but really powerful • Online Exercises

  10. Menu CSS Exercise • Use NotePad and type it yourself, please! • Watch out for the “Save As” option. • Modify “menu.xml” with your favorite foods, adding CSS info. • Modify “menu.css” with your preferences. • Comprehensive CSS reference • http://www.w3schools.com/css/default.asp

  11. Menu XSLT Exercise • Modify “menu2.xml” by adding XSLT info. • Modify “menu2.xsl” with your preferences. • XSLT is like a limited programming language. • The same data can be displayed selectively. • Examples • You may use HTML tags freely, but every attribute value must be quoted. • Watch out for typos!
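
The idea of "selective displays with the same data" can be imitated outside XSLT. The sketch below is not XSLT: Python's standard library has no XSLT processor, so it uses `xml.etree.ElementTree` to turn menu data into an HTML list, which is the kind of transformation an XSLT stylesheet would describe. The menu structure (`food`, `name`, `price`, `vegetarian`) and the function `render` are assumptions for illustration, not the contents of the workshop's “menu2.xml”.

```python
import xml.etree.ElementTree as ET

# Hypothetical menu data; the element names are assumed for this sketch.
MENU_XML = """<menu>
  <food><name>Bulgogi</name><price>12.00</price><vegetarian>no</vegetarian></food>
  <food><name>Bibimbap</name><price>9.50</price><vegetarian>yes</vegetarian></food>
</menu>"""


def render(menu_xml: str, vegetarian_only: bool = False) -> str:
    """Build an HTML <ul> from the menu, optionally keeping vegetarian items only."""
    menu = ET.fromstring(menu_xml)
    ul = ET.Element("ul")
    for food in menu.findall("food"):
        # Selective display: skip items that do not match the filter.
        if vegetarian_only and food.findtext("vegetarian") != "yes":
            continue
        li = ET.SubElement(ul, "li")
        li.text = f"{food.findtext('name')} (${food.findtext('price')})"
    return ET.tostring(ul, encoding="unicode")


full_menu = render(MENU_XML)
vegetarian_menu = render(MENU_XML, vegetarian_only=True)
print(full_menu)
print(vegetarian_menu)
```

The data is untouched in both cases; only the rule for presenting it changes, which is the separation of content and presentation that the exercise is meant to show.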

  12. Unicode in XML • Unicode is the default character set in XML. • What’s Unicode? • http://unicode.org/ • Why is it so important? • Where is ASCII? • Multilingual vs. Multiscript • WordPad or MS Word should be used for Unicode documents. • Save as “Unicode Text”

  13. “united.xml” <?xml version="1.0"?> <?xml-stylesheet type="text/css" href="united.css"?> <united> <English>Eradicate extreme poverty and hunger</English> <Chinese>消灭极端贫穷和饥饿</Chinese> <French>Réduire l'extrême pauvreté et la faim</French> <Russian>Ликвидация крайней нищеты и голода</Russian> </united>

  14. “united.css” English {display: block; color: red} French {display: block; color: blue} Chinese {display: block; color: green} Russian {display: block; color: purple}
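
To see why the encoding chosen at save time matters, the “united.xml” text can be written out in two Unicode encodings and parsed back. This is a sketch assuming Python's standard `xml.etree.ElementTree`; the variable names are illustrative. Python's `"utf-16"` codec writes a byte order mark at the start of the data, which is how an XML parser recognizes UTF-16 when no encoding is declared.

```python
import xml.etree.ElementTree as ET

UNITED = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="united.css"?>
<united>
  <English>Eradicate extreme poverty and hunger</English>
  <Chinese>消灭极端贫穷和饥饿</Chinese>
  <French>Réduire l'extrême pauvreté et la faim</French>
  <Russian>Ликвидация крайней нищеты и голода</Russian>
</united>"""

# The same characters stored as two different byte sequences.
utf8_bytes = UNITED.encode("utf-8")
utf16_bytes = UNITED.encode("utf-16")  # starts with a byte order mark

# Either byte sequence parses back to the same characters.
from_utf8 = ET.fromstring(utf8_bytes)
from_utf16 = ET.fromstring(utf16_bytes)

print(from_utf8.findtext("Chinese"))
print(from_utf16.findtext("Russian"))
print(len(utf8_bytes), len(utf16_bytes))
```

The two files differ in size on disk, but a conforming parser reads the same four sentences from each.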

  15. More Unicode Exercise • Multilingual/multiscript sources • United Nations • International Bible Society • Since an XSLT file is an XML document, you can use any language or script in your XSLT. • Only Windows 2000 and XP support Unicode fully. • CSS –“bible.xml” • XSLT –“biblecjk.xml”

  16. SMIL Exercise (1) • Synchronized Multimedia Integration Language • Still an XML application! • Multiple media are played together. • Example: Closed Captioning. • RealText Exercise • Based on Real Player setting

  17. SMIL Exercise (2) • Locate an audio source. • Ex) Voice of America at http://voanews.com • Locate its transcript. • Modify “example.smil” file according to your information. • Modify “example.rt” file with your transcript. • It can be a “Karaoke” application.

  18. SMIL Exercise (3) • Online Exercise • http://supervocab.com/xml • Locate any CJK real audio file on the web, and copy the URL to the form. • Ex) http://homepage.third-wave.com/didreat/kor/real.htm • Type the script in CJK. • Choose a character set and a font. • SMIL in Real Audio still supports only local character sets.

  19. Document Type Definition • What is a DTD? • The master plan that dictates all the rules for elements, attributes, and entities. • You may make your own DTD, but once you make it, you must follow its rules. No exceptions! • Why is a DTD important? • Data Exchange

  20. Elements, Attributes, and Entities • Elements • Building blocks of markup (tags) • Attributes • Qualifying Elements (properties) • Entities • Referencing External Content and Saving Typing • Ex) special characters
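
The three building blocks can be seen together in one small document. This is a sketch assuming Python's standard `xml.etree.ElementTree`; the `book` document, its `lang` attribute, and the `uwm` entity are made up for illustration. The entity is declared inside the document itself (an internal entity), which is the "saving typing" case; `&#233;` and `&amp;` show the "special characters" case.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE book [
  <!ENTITY uwm "University of Wisconsin - Milwaukee">
]>
<book lang="en">
  <title>Caf&#233; &amp; Bakery</title>
  <publisher>&uwm;</publisher>
</book>"""

book = ET.fromstring(DOC)

# Element: a building block of the markup, identified by its tag name.
element_name = book.tag

# Attribute: a property that qualifies the element.
language = book.get("lang")

# Entities and character references are replaced by the parser.
title = book.findtext("title")          # &#233; becomes é, &amp; becomes &
publisher = book.findtext("publisher")  # &uwm; becomes the declared text

print(element_name, language)
print(title)
print(publisher)
```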

  21. DTDs • XHTML • TEI (Text Encoding Initiative) • EAD (Encoded Archival Description) • RDF (Resource Description Framework) • Dublin Core; RSS

  22. Validation • To count as a document of a given type, a document must be valid against that type’s DTD. <!DOCTYPE TEI.2 PUBLIC "-//TEI//DTD TEI Lite XML ver. 1.1//EN" "http://www.tei-c.org/Lite/DTD/teixlite.dtd"> • Online validation tool • Well-formedness vs. Validation

  23. TEI Letter Transcript Exercise • The purpose of this exercise is to make a valid TEI document transcribing a letter. • Use a remote TEI DTD • TEIXLite DTD • <!DOCTYPE TEI.2 PUBLIC "-//TEI//DTD TEI Lite XML ver. 1.1//EN" "http://www.tei-c.org/Lite/DTD/teixlite.dtd"> • Modify “letter.xml” and “letter.css” with your text and preference. • Do a validation test, please.

  24. Dublin Core • Most Frequent Example in RDF • http://dublincore.org/ <?xml version="1.0"?> <!DOCTYPE rdf:RDF PUBLIC "-//DUBLIN CORE//DCMES DTD 2002/07/31//EN" "http://dublincore.org/documents/2002/07/31/dcmes-xml/dcmes-xml-dtd.dtd"> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/"> <rdf:Description rdf:about="http://www.ilrt.bristol.ac.uk/people/cmdjb/"> <dc:title>Dave Beckett's Home Page</dc:title> <dc:creator>Dave Beckett</dc:creator> <dc:publisher>ILRT, University of Bristol</dc:publisher> <dc:date>2002-07-31</dc:date> </rdf:Description> </rdf:RDF>
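
Reading the Dublin Core record above from a program mostly comes down to handling the two namespace prefixes. A sketch assuming Python's standard `xml.etree.ElementTree`; the DOCTYPE line is left out here so the example does not depend on the remote DTD, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.ilrt.bristol.ac.uk/people/cmdjb/">
    <dc:title>Dave Beckett's Home Page</dc:title>
    <dc:creator>Dave Beckett</dc:creator>
    <dc:publisher>ILRT, University of Bristol</dc:publisher>
    <dc:date>2002-07-31</dc:date>
  </rdf:Description>
</rdf:RDF>"""

# Map the prefixes used in our search paths to the namespace URIs.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

root = ET.fromstring(RDF_XML)
description = root.find("rdf:Description", NS)

# ElementTree stores qualified names as "{namespace-uri}localname".
about = description.get("{" + NS["rdf"] + "}about")

# Collect the Dublin Core elements, keyed by their local names.
record = {child.tag.split("}")[1]: child.text for child in description}

print(about)
print(record["title"], "/", record["date"])
```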

  25. RSS (RDF Site Summary) • “Rich Site Summary” • More practical and active example in RDF • http://supervocab.com/rss • There are many RSS feeds on the web. • http://mtgear.net/index.rdf • http://homepage.mac.com/cyberdog_to_go/iblog/B1549800066/rss.xml • http://blog.isism.net/b2rdf.php

  26. Other important parts in XML (1) • XSL • XSL Transformations • XSL Formatting Objects • Ex) PDF • URI (Uniform Resource Identifiers) • URL (Uniform Resource Locator) • ISBN/ISSN

  27. Other important parts in XML (2) • XLink • More than what HTML links do • Ex) inbound link information, behavior of links (when, how to activate) • XPointer • More than what HTML anchors do • XPointers refer to particular parts of or locations in XML documents. • Ex) linking to the third sentence of the seventeenth paragraph in a document

  28. Other important parts in XML (3) • Namespace • An XML namespace is a collection of names, identified by a URI reference • Problem: same element names • Ex) title in HTML and title of a book • Schema • Alternative to DTD • Data type
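
The "same element names" problem can be made concrete with two different `title` elements in one document. A sketch assuming Python's standard `xml.etree.ElementTree`; the book namespace URI `http://example.org/book` and the sample content are hypothetical, while the XHTML namespace URI is the real one.

```python
import xml.etree.ElementTree as ET

DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:bk="http://example.org/book">
  <head><title>Library Catalog</title></head>
  <body>
    <bk:book><bk:title>Learning XML</bk:title></bk:book>
  </body>
</html>"""

NS = {
    "x": "http://www.w3.org/1999/xhtml",
    "bk": "http://example.org/book",  # hypothetical namespace for book data
}

root = ET.fromstring(DOC)

# Both elements are called "title", but they live in different namespaces.
page_title = root.findtext("x:head/x:title", namespaces=NS)
book_title = root.findtext(".//bk:title", namespaces=NS)

# The parser keeps them apart by prefixing each name with its namespace URI.
title_tags = [el.tag for el in root.iter() if el.tag.endswith("}title")]

print(page_title)
print(book_title)
print(title_tags)
```

The URI only has to be unique; nothing needs to exist at that address for the namespace to do its job.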

  29. Popular E-book Formats • Adobe: basically PDF • Microsoft • Palm • Free E-book Projects • http://etext.lib.virginia.edu/ebooks/ebooklist.html • http://www.sois.uwm.edu/xml/ • Authoring tools • No format offers universal CJK support yet.

  30. Conclusion • XML is a concept. • There are many XML applications. • XML should separate its presentation information from its contents. • XML’s default character set is Unicode. • XML should be “well-formed” at least. • DTD/Schema is very important for data/information interchange.
