
Life after HTML

An introduction to the future of electronic publication. Lou Burnard, Humanities Computing Unit, Oxford University. http://users.ox.ac.uk/~lou


Presentation Transcript


  1. Life after HTML an introduction to the future of electronic publication Lou Burnard Humanities Computing Unit Oxford University http://users.ox.ac.uk/~lou

  2. What went wrong? The web today!!!

  3. who cares? • application developers and maintainers (the desperate perl hacker) • tools builders (the mythical CS grad student) • document creators and conservators • document managers • you and me, anxious to communicate

  4. Information Interchange (1) • five formats (A, B, C, D, E), each translated directly into every other • 20 translations required (n² − n, with n = 5)

  5. Information Interchange (2) • five formats (A, B, C, D, E), each translated to and from a Common Interchange Standard • 10 translations required (2n, with n = 5)
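
The arithmetic behind slides 4 and 5 can be checked with a short sketch. The function names below are illustrative only and do not come from the presentation: with n formats, direct pairwise conversion needs one translator per ordered pair (n² − n), while a common interchange standard needs one translator into and one out of the standard per format (2n).

```python
def pairwise_translations(n: int) -> int:
    """Translators needed when every format converts directly to every other."""
    return n * n - n


def hub_translations(n: int) -> int:
    """Translators needed when every format converts to and from one common standard."""
    return 2 * n


if __name__ == "__main__":
    # Print the two counts side by side for a few sizes, including the slides' n = 5.
    for n in (2, 3, 5, 10, 20):
        print(n, pairwise_translations(n), hub_translations(n))
```

The two counts are equal at n = 3; beyond that the common standard needs fewer translators, and the gap grows quadratically.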

  6. What is XML?  • eXtensible Markup Language • An activity of the World Wide Web Consortium (W3C) • original goal: delivering SGML on the web • new goals: refocus web development • Rewriting the rules of the game? • Adding intelligence to data • Database exchange • Client-side processing • Access to richer data • Better data management http://www.w3.org/pub/WWW/Markup/Activity

  7. The XML WG Hall of Fame • Jon Bosak, Sun (Chair) • Paula Angerstein, Texcel • Tim Bray, Textuality & Netscape • James Clark • Dan Connolly, W3C • Steve DeRose, INSO • Dave Hollander, HP • Eliot Kimber, Isogen • Tom Magliery, NCSA • Eve Maler, ArborText • Murray Maloney, Muzmo & Veo Systems • Makoto Murata, Fuji Xerox • Joel Nava, Adobe • Conleth O'Connell, Vignette • Jean Paoli, Microsoft • Peter Sharpe, SoftQuad • C. M. Sperberg-McQueen, UIC • John Tigue, DataChannel (plus a cast of hundreds on the SIG)

  8. making data into information

  9. What is a document? • content: the components (words, images, etc.) which make up a document • structure: the organization and inter-relationship of the components • presentation: how a document looks and what processes are applied to it

  10. Separating these things means... • the content can be re-used • the structure can be formally validated • the presentation can be customized for • different media • different audiences • … in short, the information can be uncoupled from its processing • This is not a new idea! But it’s a good one...

  11. The XML family  • XML (Extensible Markup Language): • A subset of SGML (ISO 8879) designed for easy implementation • XLink (Extensible Linking Language): • A set of standard hypertext mechanisms based on HyTime (ISO/IEC 10744) and the Text Encoding Initiative (TEI) • XSL (Extensible Stylesheet Language): • A standard stylesheet language for structured information derived from DSSSL (ISO/IEC 10179) and key CSS concepts

  12. like HTML, XML must... • be usable on the net (but not restricted to it!) • support a wide variety of applications • be compatible with SGML • be easy to process • have few optional features (ideally none) • be human-legible and reasonably clear • be specified in a way that is both formal and concise

  13. unlike HTML... • XML is an extensible markup language • XML markup can be verified • XML markup reflects the meaning of your data, not its appearance

  14. Some intelligent questions... Perec, Georges Life - a users manual. Collins, 1988. Translated from the French [La vie mode d’emploi] by David Bellos. xviii+581 pp. 841.941 Literature - French - 20th century • what’s the author’s name? • what titles have the classification …? • what authors have the name… ? • what translators are there ? • which books have more than 400 pages?

  15. … which non-extensible markup doesn’t help us answer <p><b>Perec, Georges</b> <I>Life - a users manual. Collins, 1988. Translated from the French </I>[La vie mode d’emploi] <I> by David Bellos. xviii+581 pp. 841.941</I> Literature - French - 20th century Perec, Georges Life - a users manual. Collins, 1988. Translated from the French [La vie mode d’emploi] by David Bellos. xviii+581 pp. 841.941 Literature - French - 20th century

  16. Extensible (user-defined) markup <author>Perec, Georges</author> <title>Life - a users manual</title><publisher>Collins</publisher><publDate>1988</publDate><note>Translated from the French [<title>La vie mode d’emploi</title>] by <translator>David Bellos</translator></note> <pages>xviii+581</pages> <ddc>841.941</ddc><keywords><term>Literature</term> <term>French</term> <term>20th century</term></keywords>
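
As a minimal sketch of why the markup on slide 16 helps with the questions on slide 14, the record can be queried with Python's standard `xml.etree.ElementTree` module. The wrapping `<book>` element and the variable names are assumptions added for illustration; the slide shows only the inner elements.

```python
import re
import xml.etree.ElementTree as ET

# The slide's record, wrapped in a hypothetical <book> root so it parses as one document.
RECORD = """<book>
<author>Perec, Georges</author>
<title>Life - a users manual</title>
<publisher>Collins</publisher><publDate>1988</publDate>
<note>Translated from the French [<title>La vie mode d’emploi</title>] by
<translator>David Bellos</translator></note>
<pages>xviii+581</pages>
<ddc>841.941</ddc>
<keywords><term>Literature</term> <term>French</term> <term>20th century</term></keywords>
</book>"""

book = ET.fromstring(RECORD)

# What's the author's name?
author = book.findtext("author")

# What translators are there? (iter() searches at any depth, so it finds the one inside <note>.)
translators = [t.text for t in book.iter("translator")]

# What classification does the book have?
classification = book.findtext("ddc")

# Does the book have more than 400 pages? Take the arabic-numbered part of "xviii+581".
main_pages = int(re.search(r"\d+", book.findtext("pages")).group())
more_than_400 = main_pages > 400

print(author, translators, classification, main_pages, more_than_400)
```

None of these queries is possible against the `<b>`/`<I>` version on slide 15 without guessing from punctuation and position.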

  17. Verifiable markup • well-formed XML markup • tags (etc.) are syntactically correct • every tag has an end-tag • tags are properly nested • valid XML markup • only declared tags are used • all tag occurrences conform to specified positional constraints

  18. Well-formedness <?xml version="1.0" standalone="yes"?> • <greeting>hello world!</greeting> • <greeting>hello world!</Greeting> • <grunting> <greeting>hello</greeting> world!</grunting> • <greeting><grunting>hello</greeting> world!</grunting> • <greeting type="loud">ho!</greeting> • <greeting type=loud>ho!</greeting> • <greeting file="ho.wav"/> • <greeting file="ho.wav">
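
The eight examples on slide 18 appear to alternate between well-formed and not well-formed. A quick way to check that reading, sketched here with Python's standard parser (the helper name `is_well_formed` is made up for this example), is to try parsing each one; the examples are typed with plain ASCII quotes, as XML requires.

```python
import xml.etree.ElementTree as ET

EXAMPLES = [
    '<greeting>hello world!</greeting>',
    '<greeting>hello world!</Greeting>',                       # names are case-sensitive
    '<grunting> <greeting>hello</greeting> world!</grunting>',
    '<greeting><grunting>hello</greeting> world!</grunting>',  # improperly nested
    '<greeting type="loud">ho!</greeting>',
    '<greeting type=loud>ho!</greeting>',                      # attribute value not quoted
    '<greeting file="ho.wav"/>',
    '<greeting file="ho.wav">',                                # start-tag never closed
]


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


results = [is_well_formed(example) for example in EXAMPLES]

for example, ok in zip(EXAMPLES, results):
    print("well-formed    " if ok else "NOT well-formed", example)
```

This checks well-formedness only; `ElementTree` does not validate against a dtd.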

  19. A Valid XML Document • invokes a Document Type Declaration (dtd) • a dtd specifies • names for all your tags • names and default values for their attributes • rules about how tags can nest • names for re-usable pieces of data (entities) • and a few other things • XML dtds are much simpler than SGML dtds

  20. A simple dtd <!ELEMENT greeting (#PCDATA)> a greeting consists of character data... <!ELEMENT name (#PCDATA)> <!ATTLIST name reg CDATA #IMPLIED> as does a name, which can also have an attribute called reg <!ELEMENT grunting (#PCDATA|greeting|name)* > a grunting contains zero or more of the other things, possibly mixed up with some character data
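
To make the three declarations on slide 20 concrete, here is a hand-written sketch of the constraints they express. It is not a dtd parser or a real validating processor (Python's standard library has none); the tables `CONTENT_MODEL` and `ALLOWED_ATTRS` and the function `violations` are assumptions that simply transcribe the slide's dtd by hand.

```python
import xml.etree.ElementTree as ET

# Child elements each declared element may contain (character data is allowed in all three).
CONTENT_MODEL = {
    "greeting": set(),                 # <!ELEMENT greeting (#PCDATA)>
    "name": set(),                     # <!ELEMENT name (#PCDATA)>
    "grunting": {"greeting", "name"},  # <!ELEMENT grunting (#PCDATA|greeting|name)*>
}

# Attributes each element may carry.
ALLOWED_ATTRS = {
    "greeting": set(),
    "name": {"reg"},                   # <!ATTLIST name reg CDATA #IMPLIED>
    "grunting": set(),
}


def violations(element: ET.Element) -> list[str]:
    """List the ways an element (and its descendants) breaks the hand-coded rules."""
    tag = element.tag
    if tag not in CONTENT_MODEL:
        return [f"undeclared element <{tag}>"]
    problems = []
    for attr in element.attrib:
        if attr not in ALLOWED_ATTRS[tag]:
            problems.append(f"undeclared attribute {attr} on <{tag}>")
    for child in element:
        if child.tag in CONTENT_MODEL and child.tag not in CONTENT_MODEL[tag]:
            problems.append(f"<{child.tag}> not allowed inside <{tag}>")
        problems.extend(violations(child))
    return problems


good = ET.fromstring('<grunting>hello <greeting>hi</greeting> <name reg="x">Lou</name></grunting>')
bad = ET.fromstring('<greeting><name>Lou</name></greeting>')
print(violations(good))
print(violations(bad))
```

The second document is well-formed but not valid: a greeting is declared to hold character data only.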

  21. When do you need a dtd? • at document preparation time (definitely) • validation, checking, consistency • at document processing time (probably) • simplifies generic/specific processing • may clarify intended semantics • at document delivery time (possibly) • strictly unnecessary for wf docs • but reduces processing effort

  22. Where do I get a dtd? • flood of industry announcements • some recent examples • Resource Description Framework (for metadata) • Channel Definition Format (for push technologies) • Electronic Data Interchange (banking etc.) • Handheld Device Markup Language (sic) • Chemical Markup Language (chemical modelling) • Math Markup Language (maths!) • Text Encoding Initiative (scholarly texts)

  23. The meaning of markup • ontologically speaking… • markup may be performative or descriptive • markup asserts an intention or interpretation which cannot be formally defined • tags have no predefined meaning • presentation or behaviour of an XML document is specified elsewhere

  24. Where is the behaviour of an XML document defined? • in a stylesheet • using XSL or CSS • possibly embedded in a program, applet, script, or Java bean • defined for that particular dtd, tagset, or tag • by reference to pre-existing mutual agreement amongst user communities • aka “namespaces” • by reference to a Document Object Model

  25. Xlink: the future of hypertext We believe in the interconnectedness of all things F. Braudel

  26. Some linking terminology • a link asserts a relationship between linkends • links may be typed • link behaviour is what happens when a link is activated • transclusion: new content appears without displacing current content • linkends may be single or multiple resources • linkends may be target or source with respect to each other

  27. Linking in HTML • link behaviour is tied to particular tags • only two types • <A> replace in same (or new) window • <IMG> transclude inline (usually) • link targets are always whole documents • cannot reassemble fragments • cannot add links to read-only documents • linkends are inherently fragile

  28. XLink aims to do better • formerly XLL, formerly XML-Link • two components • XLink • XPointer • working drafts at http://www.w3.org/TR/WD-xlink and http://www.w3.org/TR/WD-xptr • WARNING: This is all subject to change!

  29. XLink goals (1) • Provide advanced linking constructs within XML documents (XLink) • To anything

  30. XLink goals (2) • Provide advanced addressing into XML document structure (XPointer) • From anything

  31. XPointer is… • for pointing to subparts of XML resources (even if they don’t have IDs) • based on the Text Encoding Initiative (TEI) “extended pointer” notation • usable in association with URLs/URIs <a href="http://some.url.com/Thing/foo.xml#id(foo)"> <!ENTITY bar SYSTEM "http://some.url.com/Thing/foo.xml#id(foo)">

  32. An XPointer consists of • a series of location terms in the form termname(parms) • terms are separated by a dot id(foo).child(3,SEC).child(4,LIST) • each term is the location source for the next • you can also use terms which point at strings, attributes, etc.
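
The dotted syntax on slide 32 can be taken apart mechanically. The following is a rough sketch, not an implementation of the working draft: `parse_xpointer` is a made-up name, and the pattern assumes terms of the plain form termname(parms) with no nested parentheses, so it does not handle forms such as span().

```python
import re

# One location term: a name followed by a parenthesised, comma-separated parameter list.
TERM = re.compile(r"(\w+)\(([^()]*)\)")


def parse_xpointer(pointer: str) -> list[tuple[str, list[str]]]:
    """Split a dotted pointer into (termname, [parms]) pairs, left to right."""
    terms = []
    pos = 0
    while True:
        match = TERM.match(pointer, pos)
        if match is None:
            raise ValueError(f"expected a location term at offset {pos}")
        name, raw = match.groups()
        params = [p.strip() for p in raw.split(",")] if raw else []
        terms.append((name, params))
        pos = match.end()
        if pos == len(pointer):
            return terms
        if pointer[pos] != ".":
            raise ValueError(f"expected '.' at offset {pos}")
        pos += 1  # step over the dot separating two terms


print(parse_xpointer("id(foo).child(3,SEC).child(4,LIST)"))
```

Because each term is the location source for the next, a left-to-right list is all a resolver needs to walk the tree.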

  33. XPointer advantages • a compact syntax which scales well • as robust as possible • any changes “off the path” won’t (necessarily) break the link • IDs are as safe as it gets... • if there’s an ID nearby, point to it and walk down/up • if not, walk down from the root

  34. XPointers: a flavour • An XPointer addresses the tree that the markup represents, not the markup itself • Location terms address particular nodes in the tree • absolutely, e.g. id(), html() • relatively, e.g. child(), descendant(), ancestor(), psibling(), fsibling() • string and attribute matches • can also specify spans

  35. id() and html() id(concepts) html(baz)

  36. child() and descendant() child(1,chapter).child(2,section) descendant(1,abstract)

  37. XPointer examples • id(intro).child(3,div1): the third <div1> within the element with identifier INTRO • html(foo).child(2,div1).child(4,p).child(1,quote,lang,"LAT"): the first <quote> whose LANG attribute is set to "LAT" within the fourth <p> of the second <div1> of whatever element contains an HTML <A NAME="#foo"> • descendant(#all,para): every <para> within the current location source • span(child(1,pb,n,"14"),child(1,pb,n,"23")): everything between the first <pb> whose N attribute is "14" and the first one whose N attribute is "23"
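
The first example on slide 37 can be walked through against a small tree. This is a heavily simplified sketch: the sample document, the function `resolve`, and the assumption that identifiers live in an attribute literally named `id` are all illustrative (in XML proper, which attribute is an ID is declared in the dtd), and only the id() and child(n,type) terms are handled.

```python
import re
import xml.etree.ElementTree as ET

# A made-up document: the element identified as "intro" has three <div1> children.
DOC = ET.fromstring(
    '<doc>'
    '<div0 id="intro">'
    '<div1>one</div1><p>aside</p><div1>two</div1><div1>three</div1>'
    '</div0>'
    '</doc>'
)


def resolve(root, pointer):
    """Follow id() and child(n,type) terms from left to right; return an element or None."""
    current = root
    for name, raw in re.findall(r"(\w+)\(([^()]*)\)", pointer):
        params = [p.strip() for p in raw.split(",")]
        if name == "id":
            # Absolute term: search the whole tree for the identifier.
            matches = [e for e in root.iter() if e.get("id") == params[0]]
            current = matches[0] if matches else None
        elif name == "child":
            # Relative term: the n-th direct child of the given element type.
            n, kind = int(params[0]), params[1]
            kids = [c for c in current if c.tag == kind]
            current = kids[n - 1] if 0 < n <= len(kids) else None
        else:
            raise NotImplementedError(f"term {name}() is not covered by this sketch")
        if current is None:
            return None
    return current


target = resolve(DOC, "id(intro).child(3,div1)")
print(target.text)
```

Note that the intervening `<p>` does not disturb the count: child(3,div1) counts only `<div1>` children, which is part of what makes such pointers robust against changes "off the path".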

  38. Xlink proper • allows you to invent your own linking elements and define their behaviour • the xml:link attribute is used to specify the linking properties of your element • allows you to create link databases • “standoff” markup allows you to link to non-modifiable documents • inline vs out-of-line links

  39. Link behaviours • show attribute • new/replace/embed • actuate attribute • user/auto • behavior attribute • “for other instructions”

  40. The importance of XLink • Not just about fancy capabilities and new ways of associating information • Promotes the creation of advanced information structures and site management • Makes possible an industry devoted to knowledge management (that's us!) • For example: OED + LION

  41. XSL: bringing it all back home

  42. Transforming XML documents [diagram: valid XML documents and a script feed a Transformation Tool, which outputs XML or non-XML documents]

  43. XSL: the final piece • Standard Style Sheet Language • Combines DSSSL “flow objects” and CSS objects • Uses XML syntax (rather than Scheme) • Also uses ECMAscript for extensions • Automatic conversion from CSS

  44. DSSSL components

  45. XSL is the next step for publishing • XSL is not just about translation • user-configurability • enhanced clients • Single source for print and online delivery • XSL is intended to complete the internationalization of publishing

  46. Tools you can use now • Editing/creating documents • emacs + psgml; XED; any SGML editor • Parsers • free standing: SP • java applets: (many) • embedded in applications http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html

  47. Tools you can use now • Browsers and viewers • Hybrick; IE5; Netscape 4; Amaya, Xmetal… • Toolkits • DOM support now in Perl, TCL… • Transformers • Jade

  48. The big picture: empowerment

  49. The wider picture • XML is not just about exchanging data between machines • It's also about communication between humans • XML is not just about the web • It's about information in general • XML is not just about technology • It's also about the relationship between content creators and software vendors

  50. How we will use XML (1) • Heterogeneous clients interfacing with a single database via XML
