XHTML

Steven Pemberton

CWI, Amsterdam

Chair, W3C HTML Working Group

Overview
  • History
  • Philosophy
  • XML and related technologies
  • XHTML 1.0
  • Modularisation
  • XHTML Basic
  • XHTML 1.1
  • The Future
HTML 1
  • The original HTML was designed in the early 1990’s for scientific reports
  • Each document was a single resource (not even <IMG>)
  • (This explains much about HTTP by the way)
(HTML 1)
  • It is amazing how much we have been able to do with a language with such beginnings
  • It was described using SGML
HTML as an SGML Application
  • SGML: an international standard in 1986
  • It is a Meta-language that describes data formats, using DTD’s (Document Type Definitions)
  • Describes structure, not presentation:

<H1>HTML as SGML Application</H1>
Example of a DTD fragment

<!ELEMENT table
    (caption?, (col*|colgroup*), thead?,
     tfoot?, (tbody+|tr+))>

<!ELEMENT caption %Inline;>

<!ELEMENT thead (tr)+>

<!ATTLIST table
    %attrs;                    -- %coreattrs, %i18n, %events --
    summary  %Text;    #IMPLIED
    width    %Length;  #IMPLIED
    border   %Pixels;  #IMPLIED
>

<!ENTITY % fontstyle
    "TT | I | B | BIG | SMALL">

<!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;">

<!ENTITY % Length "CDATA" -- nn for pixels or nn% for percentage length -->

Problems with SGML
  • Arcane syntax
  • Very difficult to implement fully
  • No support for types
Changes to HTML
  • Netscape and Microsoft start adding to HTML: mostly presentation-oriented tags (like <BLINK>, <CENTER>), and frames
  • The World Wide Web Consortium (W3C) started an effort to:
    • Keep HTML Pure
    • Do presentation via Style Sheets
Separating content and presentation
  • HTML was designed as a data-structuring language, but the later changes undermined this.
  • Separating content from presentation has distinct advantages
For the author
  • Easier to write your documents
  • Easier to change your documents
  • Easy to change the look of your documents
  • Access to professional designs
  • Your documents are smaller
  • Visible on more devices
  • Visible to more people
For the webmaster
  • Separation of concerns
  • Simpler HTML, less training
  • Cheaper to produce, easier to manage
  • Easy to change house style
  • Reach more people
  • Search engines find your stuff easier
  • Visible on more devices
For the reader
  • Faster download (one of the top 4 reasons for liking a site)
  • Easier to find information
  • You can actually read the information if you are sight-impaired
  • Information more accessible
  • You can use more devices
For the implementor
  • Improves the implementation (separation of concerns)
  • Can produce smaller browsers
Changes to HTML (2)
  • Another change that Netscape made, with insufficient thought, was Frames
  • Frames create significant problems with web pages
The problems with frames
  • Can’t bookmark framesets
  • [Back] does odd things
  • [Page up] and [page down] work oddly
  • [Reload] often doesn’t work right
  • Security is compromised
  • Nested frames are hard to deal with (how do you get out?)
What frames can do
  • Search and show interfaces
  • Keeping script variables in a hidden frame
Style languages
  • W3C's first action was to start an activity on Style Sheets (Nov 1995)
  • This produced CSS1 initially (Dec 1996), then CSS2 (May 1998) (CSS3 is in preparation)
  • Later produced XSL, an XML-based language, as complementary to CSS
  • CSS is a separate language from HTML that allows you to specify how an HTML document, or set of documents, should look
  • Separates content from presentation
  • HTML can be a structure language again
Examples of CSS

h1 { font-weight: bold; font-size: 2em }

h2 { font-weight: bold; font-size: 1.5em }

em {background-color: yellow}

body {margin-left: 20%}

Using CSS
  • Use the following at the top of an XML document:

<?xml-stylesheet type='text/css' href='mystyle.css'?>

  • Or this in the <head> of an HTML document:

<link rel="stylesheet" type="text/css" href="mystyle.css" />

Advantages of CSS
  • Makes HTML easier to write (and read)
  • You can define a house style
  • Compatible: you can still see the content on non-CSS browsers
  • Pages are much smaller
  • Accessible to sight-impaired
  • ...
By the way...
  • Check your logs: more than 95% of people browsing now use a CSS-enabled browser
  • The current generation of browsers (IE 5, NS 6, Opera 4) have excellent support for CSS.
  • You never need to use the <FONT> and <FONTFACE> elements again!
  • As mentioned, HTML was designed for just one sort of document (scientific reports), but is now being used for all sorts of different documents
  • You could use SGML to define other sorts of document, but SGML is notoriously hard to fully implement
  • Enter XML
Enter XML
  • XML is a W3C effort to simplify SGML
  • It is a meta-language: a language for defining languages
  • It is a subset of SGML
  • One of the aims is to allow everyone to invent their own tags
  • DTD is optional: a DTD can be inferred from a document
  • The requirement of being able to infer a DTD from a document has an effect on the languages you can define:
    • Closing tags are now required: <LI>....</LI> <P>....</P>
    • Empty tags are marked specially: <IMG SRC="pic.gif"/> <BR/> <HR/> (or <HR></HR> etc)
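These two rules are exactly what an XML parser enforces. A minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample fragments are ours, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# HTML-style markup: <li> and <br> are opened but never closed.
html_style = "<ul><li>one<li>two<br></ul>"
# XML-style markup: every element is closed, empty ones with "/>".
xml_style = "<ul><li>one</li><li>two<br/></li></ul>"

print(is_well_formed(html_style))  # False
print(is_well_formed(xml_style))   # True
```

The long form `<hr></hr>` is also accepted by the parser; it is simply an element with no content.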
Consequences 2
  • CDATA sections must be marked as such (only necessary if they contain “<”, “&” etc.):

<script>
<![CDATA[
... script content ...
]]>
</script>

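The effect of marking a CDATA section can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the script body is a made-up example containing "<" and "&", not something taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing "<" and "&", which are markup in XML.
body = "if (a < b && b < c) { swap(a, c); }"

unmarked = "<script>" + body + "</script>"
marked = "<script><![CDATA[" + body + "]]></script>"

# Without the CDATA markers the parser treats "<" and "&" as markup and fails.
try:
    ET.fromstring(unmarked)
    unmarked_ok = True
except ET.ParseError:
    unmarked_ok = False

# With the markers the content is passed through as plain character data.
script = ET.fromstring(marked)

print(unmarked_ok)          # False
print(script.text == body)  # True
```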
By the way: <P> is not like <BR>

Not like this:

An underlying problem with HTML is that ...
<P>
You could use SGML to define ...

But like this:

<P>An underlying problem with HTML is that ...</P>
<P>You could use SGML to define ...</P>
Consequence of XML
  • Anyone can now design their own (Web-delivered) languages
  • CSS makes them viewable

<name>Steven Pemberton</name>
<street>Kruislaan 413</street>
<postcode>1098 SJ</postcode>

So do we still need HTML?
  • Workshop in May 1998
  • XML is still a meta-language
  • There is still a perceived need for a base-line mark-up
  • HTML has some useful semantics, both implied and explicit (search engines gladly use it, for instance)
HTML as XML application
  • Clean up (get rid of historical flotsam)
  • Modularise – split into separate parts
    • Allows other XML applications to use parts
    • Allows special purpose devices to use subset
  • Add any required new functionality (forms, better event handling, Ruby)
The HTML Working Group
  • International membership, around 20 members
  • Many major players (IBM, Microsoft, Netscape, etc)
  • Meets weekly by phone, quarterly face-to-face
Group experience
  • There was more to be worked out than we anticipated
  • XHTML is the first major application of XML, so the world’s eyes are on us
  • XML still needs the wrinkles ironed out
Philosophy of XHTML
  • Transition from ‘old world’ to XML
  • Clean up the language
  • Return to structure only
  • Use generic XML as much as possible
  • Modularise
  • Address wider needs (International, Accessibility)
  • Add new functionality
Plan of action
  • HTML 4.01: corrected version
  • XHTML 1.0: transitional version of HTML 4.01 in 3 flavours
  • Modularisation: agreement on split and methodology
  • XHTML Basic: Small devices
  • XHTML 1.1: clean version of 1.0 strict
(Plan of action)
  • Events: accessible and device-independent
  • Ruby: needed Asian markup
  • Forms: more control
  • XHTML 2.0: Putting it all together
Differences HTML:XHTML
  • Because of the difference between SGML and XML, there are some necessary differences, for instance:
    • Use lower case: <p> not <P>
    • Attributes are always quoted: <th colspan="2">
    • Anchors use id attribute not name (and not just on <a> by the way): <a id="index"> <p id="top">
Example XHTML 1.0

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head><title>Virtual Library</title></head>
<body>
<p>Moved to <a href="http://vlib.org/">vlib.org</a>.</p>
</body>
</html>

  • Namespaces have been added to XML to allow you to mix fragments from different languages (e.g. HTML + Maths)
  • In the same way that object-oriented languages allow you to identify which function you are using, namespaces allow you to identify which tags you are using.
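A namespace-aware parser makes this identification explicit. As a rough sketch, Python's standard `xml.etree.ElementTree` reports every element name as `{namespace-uri}localname`, so an `html` element and a `math` element can be told apart even when both appear unprefixed. The small document and the helper `split_tag` are ours; the two namespace URIs are the ones used in the slides.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/TR/REC-MathML"

doc = ET.fromstring(
    '<html xmlns="%s"><body>'
    '<p>The following is MathML markup:</p>'
    '<math xmlns="%s"><ci> x </ci></math>'
    '</body></html>' % (XHTML, MATHML)
)

def split_tag(element):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    uri, local = element.tag[1:].split("}")
    return uri, local

names = [split_tag(el) for el in doc.iter()]
for uri, local in names:
    print(local, "is in", uri)
```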
Example of nesting

<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>A Math Example</title></head>
<body>
<p>The following is MathML markup:</p>
<math xmlns="http://www.w3.org/TR/REC-MathML">
<apply><log/><logbase><cn> 3 </cn> </logbase>
<ci> x </ci>
</apply>
</math>
</body>
</html>

Example of colonising

<math xmlns="http://www.w3.org/TR/REC-MathML"
      xmlns:html="http://www.w3.org/1999/xhtml">
<apply><log/><logbase><cn> 3 </cn> </logbase>
<ci> x </ci>
</apply>
<html:p>This is a paragraph</html:p>
</math>

Namespaced attributes
  • Attributes normally come from the element itself:

<html:a href="next.xml">

  • But you may also use ‘global’ attributes from a namespace:

<pointer html:href="x.xml">

<music style="classical" html:style="color: red">Beethoven’s 5th</music>
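The distinction between an element's own attribute and a 'global' namespaced one is visible to a parser. A minimal sketch with Python's standard `xml.etree.ElementTree`, assuming the `html` prefix is bound to the XHTML namespace URI (the slide omits the binding): the unprefixed `style` keeps its plain name, while `html:style` is keyed by its namespace.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# The xmlns:html declaration is our assumption; the slide shows only the element.
music = ET.fromstring(
    '<music xmlns:html="%s" style="classical" '
    'html:style="color: red">Beethoven\'s 5th</music>' % XHTML
)

own_style = music.get("style")               # attribute defined by <music> itself
html_style = music.get("{%s}style" % XHTML)  # global attribute from the HTML namespace

print(own_style)   # classical
print(html_style)  # color: red
```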

XML ‘namespace’
  • XML also has its own pseudo-namespace for reserved attributes:

<para xml:lang="en">
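The `xml` prefix never needs declaring: it is permanently bound to the URI `http://www.w3.org/XML/1998/namespace`. A short sketch of how a parser exposes it, here Python's standard `xml.etree.ElementTree`; the element content is a made-up example.

```python
import xml.etree.ElementTree as ET

# The namespace URI that the reserved "xml" prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# No xmlns:xml declaration is needed in the document.
para = ET.fromstring('<para xml:lang="en">Hello</para>')
lang = para.get("{%s}lang" % XML_NS)

print(lang)  # en
```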

Using ‘generic’ XML
  • Presentation → use CSS
  • Links → use XLink or Schemas
  • Forms → use CSS?
  • Images etc. → use XLink or Schemas
  • (Natural) language of elements → use xml:lang attribute
  • HTML has several ‘built-in’ hyperlinks: <a>, <img>, <object>, <link>, etc.
  • Since XML allows you to define your own elements, a browser doesn’t know which are links
  • XLink was started to solve this problem.
  • XLink started as a method of describing which attributes of an element were a link
  • It later changed into a language of links, so it could no longer be used to describe XHTML
  • The current plan is now to introduce types into Schemas to describe links
Example of XLink

xlink:title="Student List"

Current List of Students

  • Schemas are a new technology to replace much of DTDs.
  • Schemas are expressed in XML
  • They have support for data types
  • Much easier to parse and implement than DTDs
Schemas: but
  • They don’t support the definition of entities (&eacute;)
  • Not easy to read (or write)
Schema fragment

<elementType name='table'>


<archetypeRef name='common'/>

<archetypeRef name='simpleBlockDisplay'/>



(schema fragment)


<elementTypeRef name='caption' minOccur='0' maxOccur='1'/>


<elementTypeRef name='col' minOccur='0' maxOccur='*'/>

<elementTypeRef name='colgroup' minOccur='0' maxOccur='*'/>


(schema fragment)



<elementTypeRef name='thead' minOccur='0' maxOccur='1'/>

<elementTypeRef name='tfoot' minOccur='0' maxOccur='1'/>

<elementTypeRef name='tbody' minOccur='1' maxOccur='*'/>


<elementTypeRef name='tr' minOccur='1' maxOccur='*'/>




(equivalent DTD)

<!ELEMENT table

(caption?, (col*|colgroup*), thead?,

tfoot?, (tbody+|tr+))>
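
A DTD content model is essentially a regular expression over child element names, which is why it is so much more compact than the schema form. As a rough sketch (not how a validating parser is actually implemented), the table content model can be transcribed into a Python regular expression and used to check a sequence of child names. The function name `valid_table_children` is ours.

```python
import re

# (caption?, (col*|colgroup*), thead?, tfoot?, (tbody+|tr+))
# Each child name is followed by one space so that "col" cannot match "colgroup".
TABLE_MODEL = re.compile(
    r"(caption )?((col )*|(colgroup )*)(thead )?(tfoot )?((tbody )+|(tr )+)"
)

def valid_table_children(names):
    """Check a list of child element names against the table content model."""
    encoded = "".join(name + " " for name in names)
    return TABLE_MODEL.fullmatch(encoded) is not None

print(valid_table_children(["caption", "col", "col", "thead", "tbody"]))  # True
print(valid_table_children(["tr", "tr"]))                                 # True
print(valid_table_children(["tbody", "tr"]))                              # False
print(valid_table_children(["col", "colgroup", "tr"]))                    # False
```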

XHTML 1.0
  • XHTML 1.0 is an XML-ised version of HTML 4.01
  • Just like HTML 4.01, there are 3 versions: ‘strict’, ‘transitional’ (‘loose’), and ‘frameset’
Transitional version
  • XHTML 1.0 has been carefully designed to make use of ‘quirks’ in existing HTML browsers
  • Use of a small number of guidelines allows XHTML to be served to HTML user agents as well as XML user agents
Examples of Guidelines
  • Use space before / of empty elements:

<br /> <hr /> <img src="foo.gif" />

  • Don’t use <hr></hr> form
  • Use both name= and id= on <a>:

<a name="index" id="index"> … </a>
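
Some XML tools already follow the first guideline when they write documents out. As an illustration, Python's standard `xml.etree.ElementTree` serialises an element with no content in the short form with a space before the slash. The sample paragraph below is ours.

```python
import xml.etree.ElementTree as ET

# Build <p>First line<br/>second line<img src="foo.gif"/></p> in memory.
p = ET.Element("p")
p.text = "First line"
br = ET.SubElement(p, "br")
br.tail = "second line"
img = ET.SubElement(p, "img", {"src": "foo.gif"})

# Empty elements are written as "<br />", not "<br/>" or "<br></br>".
markup = ET.tostring(p, encoding="unicode")
print(markup)
# <p>First line<br />second line<img src="foo.gif" /></p>
```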

Serving XHTML 1.0
  • An XHTML 1.0 document that follows the guidelines can be served up either as HTML, or as XML
  • But beware: CSS has slightly different rules for HTML and XML
  • Similarly, the DOM has differences for HTML and XML
  • XHTML has been divided into a number of modules.
  • A module is a collection of elements and/or attributes that can be used as building blocks to build a DTD.
  • A language can be built by using just XHTML modules, or adding your own
  • We had originally defined Modularisation just for our own use, but it has turned out useful for other groups as well
XHTML modules
  • Structure: html, head, title, body
  • Text: abbr, acronym, address, blockquote, br, cite, code, dfn, div, em, h1, h2, h3, h4, h5, h6, kbd, p, pre, q, samp, span, strong, var
  • Hypertext: a
  • List: ol, ul, dl, li, dt, dd
  • Applet (deprecated): applet, param
  • Presentation: b, i, hr, big, small, sub, sup, tt
  • Edit: del, ins
  • Bi-directional Text: bdo
  • Basic Forms: simple forms
  • Forms: full forms
  • Basic Tables: simple tables
  • Tables: full tables
  • Image: img
  • Client-side Image Map: map, +
  • Server-side Image Map: change to img
  • Object: object, param
  • Frames
  • Target: attribute
  • Iframe
  • Intrinsic Events: adds events attributes
  • Metainformation: meta
  • Scripting: script
  • Stylesheet: style
  • Style Attribute
  • Link: link
  • Base: base
  • Name Identification: name attribute
  • Legacy: basefont, center, font, s, strike, u, plus loads of attributes (eg align)
  • Ruby: Asian markup
Note on modules
  • Note that some modules consist of a single element, or just add some attributes to existing elements
  • Not all modules are independent: if you use some modules, they bring other modules with them, or change other modules
  • Future modules are planned (eg extended forms, events)
The XHTML family
  • To still be called an XHTML language you must use Structure, Hypertext, Basic Text, and List modules (you may define your own Structure module)
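The family rule amounts to a subset test on module names. A small sketch under that reading: the required set and the XHTML Basic module list are taken from the slides (using the name "Text" from the module list for what this slide calls "Basic Text"), while the function name and the `toy_language` example are hypothetical.

```python
# The four modules every XHTML family language must include.
REQUIRED_MODULES = {"Structure", "Text", "Hypertext", "List"}

# Module list of XHTML Basic as given in the slides.
XHTML_BASIC = {
    "Structure", "Text", "Hypertext", "List",
    "Basic Forms", "Basic Tables", "Image", "Object",
    "Metainformation", "Link", "Base",
}

def missing_required_modules(modules):
    """Return the required modules that a language definition leaves out."""
    return REQUIRED_MODULES - set(modules)

# Hypothetical language that forgets the List module.
toy_language = {"Structure", "Text", "Hypertext", "Image"}

print(missing_required_modules(XHTML_BASIC))   # set()
print(missing_required_modules(toy_language))  # {'List'}
```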
Example integration languages
  • SMIL is planning a module to integrate SMIL and HTML
  • Likewise for MathML
Creating a DTD
  • It is not expected that creating XHTML-based languages will be a daily activity
  • This is not the place to describe the method: it depends on understanding DTDs.
  • The Modularisation document has extensive examples
  • Future versions will also use Schemas (we hope…)
XHTML Basic
  • XHTML Basic is the first XHTML family-member to be defined using Modularisation
  • It is designed for small devices, typically mobile telephones
XHTML Basic Modules
  • Structure Module*
    • body, head, html, title
  • Text Module*
    • abbr, acronym, address, blockquote, br, cite, code, dfn, div, em, h1, h2, h3, h4, h5, h6, kbd, p, pre, q, samp, span, strong, var
(XHTML Basic Modules)
  • Hypertext Module*
    • a
  • List Module*
    • dl, dt, dd, ol, ul, li
  • Basic Forms Module
    • form, input, label, select, option, textarea
(XHTML Basic Modules)
  • Basic Tables Module
    • caption, table, td, th, tr
  • Image Module
    • img
  • Object Module
    • object, param
(XHTML Basic Modules)
  • Metainformation Module
    • meta
  • Link Module
    • link
  • Base Module
    • base
XHTML Basic usage

<!DOCTYPE html PUBLIC
  "-//W3C//DTD XHTML Basic 1.0//EN" "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">

XHTML 1.1
  • XHTML 1.1 is the second family member to be defined using Modularisation
  • Its main aim is to present a cleaned-up, non-transitional version of XHTML 1.0 strict (no frames)
  • It also adds Ruby markup
  • Otherwise: no new functionality
XHTML 1.1 Modules
  • Structure, Text, Hypertext, List, Object, Presentation, Edit, Bidirectional Text, Forms, Tables, Image, Client-side Image Map, Server-side Image Map, Intrinsic Events, Metainformation, Scripting, Stylesheet Module, Style Attribute (Deprecated), Link, Base, Ruby.
Example XHTML 1.1

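<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head> <title>Virtual Library</title> </head>
<body>
<p>Moved to <a href="http://vlib.org/">vlib.org</a>.</p>
</body>
</html>
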
Example Ruby markup

<ruby>
<rb>...</rb>
<rp>(</rp><rt>World Wide Web</rt><rp>)</rp>
</ruby>

  • (Use CSS to describe presentation)
XHTML 2.0
  • XHTML 2.0 is still in preparation
  • New forms
  • New events
  • More accessibility
  • Being produced by a separate group
  • Consists of three parts:
    • data model
    • instances
    • user interface
  • Will allow you to
    • save and restore forms
    • download multi-page forms
  • Will include much more client-side checking
  • Form data will be sent to the server as XML
  • Separates content from presentation (e.g. a radio button and a select box both allow you to select one from many, and you may want to use different choices on different devices)
  • Current events are almost all in terms of the mouse: onclick, onmouseover, onfocus, etc.
  • Future event model will be device independent, and allow you to define your own new events
  • Uses the DOM event model
The DOM
  • Document Object Model: how you access a document via scripting
  • Currently only an XML DOM
  • An XHTML DOM is being investigated
Accessibility and Internationalisation
  • W3C has an accessibility group that checks that new recommendations address people with accessibility needs
  • There is also an internationalisation group that does the same for cultural issues (which produced <ruby>)
Accessibility problems
  • A sighted person can work out the structure from the visual presentation
  • A non-sighted person cannot: the structure must be present in the markup
  • That is why new features were added to forms and tables in HTML 4, like <caption>
  • Text would also benefit from such a treatment: not h1, h2 etc (which are subject to misuse) but nested sections with their own headings
Example of structure







CSS can still handle it

section h { how an h1 should look }

section section h { h2 }

section section section h { h3 }

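The selectors work because a heading's level is simply the number of `section` elements that enclose it. A minimal sketch of the same computation in Python, assuming a document built from `section` and `h` elements as in the selectors; the sample document and the function name `heading_levels` are ours.

```python
import xml.etree.ElementTree as ET

# Made-up document using nested sections with generic <h> headings.
doc = ET.fromstring(
    "<section><h>Top</h>"
    "<section><h>Middle</h>"
    "<section><h>Inner</h></section>"
    "</section></section>"
)

def heading_levels(element, depth=0):
    """Yield (level, text) per <h>; level counts the enclosing <section>s."""
    if element.tag == "section":
        depth += 1
    for child in element:
        if child.tag == "h":
            yield depth, child.text
        else:
            yield from heading_levels(child, depth)

levels = list(heading_levels(doc))
print(levels)  # [(1, 'Top'), (2, 'Middle'), (3, 'Inner')]
```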

  • XML with related technologies gives you the freedom to define and deliver your own document types
  • HTML is still needed as a base-line markup
  • The new HTML gives a transition path to the future
The State of Things
  • New generation of XML+CSS browsers emerging
  • Many XML applications appearing
  • Major companies planning XML as output (Adobe PDF, MS Office 2000)
  • Now: HTML4, XHTML 1.0, Modularisation, Basic, 1.1
To Find Out More
  • All XHTML developments are made public at www.w3.org/Markup
  • Members of W3C can also look at www.w3.org/Markup/Group