CS 502: Computing Methods for Digital Libraries

Lecture 10

New Developments in XML:

MathML, Namespaces, RDF

- MathML objectives:
- Encode mathematical material for teaching and scientific communication at all levels
- Encode both mathematical notation and mathematical meaning
- Facilitate conversion to and from other math formats, both presentational and semantic (e.g., TeX)
- Be suitable for a wide range of output formats, including Braille
- Provide for extensibility
- Be human legible, and simple for software to generate and process
- Intended for use with both HTML and XML

Example: a + b

Presentation:

<mrow>

<mi>a</mi>

<mo>+</mo>

<mi>b</mi>

</mrow>

Content:

<apply>

<plus/>

<ci>a</ci>

<ci>b</ci>

</apply>

Example: $(a + b)^2$

Presentation:

<msup>

<mfenced>

<mrow>

<mi>a</mi>

<mo>+</mo>

<mi>b</mi>

</mrow>

</mfenced>

<mn>2</mn>

</msup>

Content:

<apply>

<power/>

<apply>

<plus/>

<ci>a</ci>

<ci>b</ci>

</apply>

<cn>2</cn>

</apply>
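Because content markup records meaning rather than layout, software can evaluate it directly. The following is a minimal sketch in Python, assuming the standard-library `xml.etree.ElementTree` parser; the `OPERATORS` table and the `evaluate` function are hypothetical names introduced for illustration, and only the handful of elements used in these examples are supported.

```python
import xml.etree.ElementTree as ET

# Content markup for (a + b)^2, as in the example above.
CONTENT = """
<apply>
  <power/>
  <apply>
    <plus/>
    <ci>a</ci>
    <ci>b</ci>
  </apply>
  <cn>2</cn>
</apply>
"""

# Hypothetical operator table: covers only the elements used in this lecture.
OPERATORS = {
    "plus": lambda args: sum(args),
    "power": lambda args: args[0] ** args[1],
    "divide": lambda args: args[0] / args[1],
}


def evaluate(node, env):
    """Recursively evaluate a content-markup element, given variable bindings."""
    if node.tag == "cn":          # numeric constant
        return float(node.text.strip())
    if node.tag == "ci":          # identifier, looked up in env
        return env[node.text.strip()]
    if node.tag == "apply":       # first child is the operator, the rest are operands
        operator, *operands = list(node)
        values = [evaluate(child, env) for child in operands]
        return OPERATORS[operator.tag](values)
    raise ValueError(f"unsupported element: {node.tag}")


result = evaluate(ET.fromstring(CONTENT), {"a": 1.0, "b": 2.0})
print(result)  # 9.0
```

The same traversal would not work on the presentation encoding, which only says how the symbols are laid out.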

Example: $\int_0^t \frac{dx}{x}$

Content:

<apply>

<int/>

<bvar><ci>x</ci></bvar>

<lowlimit><cn>0</cn></lowlimit>

<uplimit><ci>t</ci></uplimit>

<apply>

<divide/>

<cn>1</cn>

<ci>x</ci>

</apply>

</apply>

Presentation:

<mrow>

<msubsup>

<mo>∫</mo>

<mn>0</mn>

<mi>t</mi>

</msubsup>

<mfrac>

<mrow>

<mo>ⅆ</mo>

<mi>x</mi>

</mrow>

<mi>x</mi>

</mfrac>

</mrow>

<semantics>

Content encoding

<annotation-xml encoding="MathML-Presentation">

Presentation encoding

</annotation-xml>

</semantics>

- Namespaces:
- Allow those who publish XML to indicate explicitly where each element name comes from
- Avoid any confusion regarding the information's origin

- Examples:
- <bk:title>Cheaper by the Dozen</bk:title>
- <isbn:number>1568491379</isbn:number>
- The tag name consists of two parts:
- namespace prefix (bk, isbn)
- name within namespace (title, number)

Example 1

<xhtml xmlns = "http://www.w3.org/1999/xhtml">

....

</xhtml>

Example 2

<?xml version="1.0"?>

<!-- both namespace prefixes are available throughout -->

<bk:book xmlns:bk = "http://loc.gov:books"

xmlns:isbn = "urn:ISBN:0-395-36341-6">

<bk:title>Cheaper by the Dozen</bk:title>

<isbn:number>1568491379</isbn:number>

</bk:book>
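To see how software treats the prefixes, here is a minimal sketch in Python using the standard-library `xml.etree.ElementTree` parser. It parses Example 2 and prints each element's namespace URI and local name. The variable names and the alternative prefix `b` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Example 2 from above, unchanged apart from indentation.
DOC = """<?xml version="1.0"?>
<bk:book xmlns:bk="http://loc.gov:books"
         xmlns:isbn="urn:ISBN:0-395-36341-6">
  <bk:title>Cheaper by the Dozen</bk:title>
  <isbn:number>1568491379</isbn:number>
</bk:book>
"""

root = ET.fromstring(DOC)

# ElementTree expands each prefixed name to "{namespace-uri}local-name".
names = []
for element in root.iter():
    uri, _, local = element.tag[1:].partition("}")
    names.append((uri, local))
    print(uri, local)

# The prefix is only a local abbreviation: a different prefix bound to the
# same URI finds the same element.
title = root.find("b:title", {"b": "http://loc.gov:books"}).text
print(title)  # Cheaper by the Dozen
```

The parser discards the prefixes `bk` and `isbn` and keeps the URIs, which is why two documents can use different prefixes for the same namespace.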

"Shakespeare is the author of the play Hamlet."

Hamlet ---> creator ---> Shakespeare

Hamlet ---> type ---> play

RDF: Metadata Schemes

- "Shakespeare is the author of the play Hamlet."
- In the Dublin Core metadata scheme, this can be represented as:
- Resource ---> Property-type ---> Value
- Hamlet ---> creator ---> Shakespeare
- Hamlet ---> type ---> play

"Shakespeare is the author of the play Hamlet."

Hamlet ---> dc:creator ---> Shakespeare

Hamlet ---> dc:type ---> play

- Define a namespace for the metadata scheme
- Basic XML
- <creator>Shakespeare</creator>
- <type>play</type>
- With dc namespace
- <dc:creator>Shakespeare</dc:creator>
- <dc:type>play</dc:type>

- Suppose that Hamlet is referenced by the URL:
- "http://hamlet.org/"
- <rdf:Description rdf:about = "http://hamlet.org/">
- ..........
- </rdf:Description>

- Full RDF record, with XML mark-up:
- <rdf:RDF>
- <rdf:Description rdf:about = "http://hamlet.org/">
- <dc:creator>Shakespeare</dc:creator>
- <dc:type>play</dc:type>
- </rdf:Description>
- </rdf:RDF>

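- Full RDF record, with Namespace Definitions:
- <rdf:RDF xmlns:rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
- xmlns:dc = "http://purl.org/dc/elements/1.1/">
- <rdf:Description rdf:about = "http://hamlet.org/">
- <dc:creator>Shakespeare</dc:creator>
- <dc:type>play</dc:type>
- </rdf:Description>
- </rdf:RDF>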
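
The record can be read back as resource, property-type, value triples. The sketch below uses Python's standard-library `xml.etree.ElementTree`. It assumes the standard RDF syntax namespace and the Dublin Core elements 1.1 namespace, and the case-sensitive element names `rdf:RDF` and `rdf:Description`. The function name `triples` is hypothetical. It handles only simple literal values like those in this example and is not a general RDF parser.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs: the RDF syntax namespace and Dublin Core elements 1.1.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

RECORD = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://hamlet.org/">
    <dc:creator>Shakespeare</dc:creator>
    <dc:type>play</dc:type>
  </rdf:Description>
</rdf:RDF>
"""


def triples(xml_text):
    """Return (resource, property-type, value) tuples for simple literal properties."""
    root = ET.fromstring(xml_text)
    result = []
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        resource = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # prop.tag is the expanded name "{namespace-uri}local-name".
            result.append((resource, prop.tag, prop.text))
    return result


for resource, property_type, value in triples(RECORD):
    print(resource, property_type, value)
```

Each printed line corresponds to one arrow in the Hamlet diagram, with the property-type identified by its full namespace URI rather than by the `dc` prefix.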

- Markup
- SGML was slow to gain acceptance because it is complex
- HTML and the web gained acceptance because they were simple
- XML is gaining acceptance steadily for structural mark-up, but has a long way to go
- Style sheets
- DSSSL has not been accepted because of complexity
- CSS and XSL are slowly gaining acceptance, but have a long way to go

- Mathematics:
- MathML is complex but mathematics is complex. It may succeed.
- Metadata markup:
- XML is becoming the standard for metadata. It is simple and intuitive.
- RDF adds functionality, but a lot more complexity.
- Namespaces:
- Namespaces are a simple concept, but the notation adds a lot of complexity