
XML Lecture 1

XML Lecture 1. XML Motivation & Syntax Monica Farrow email : M.Farrow@hw.ac.uk. XML Topics. This lecture Motivation Storing XML Programming and XML Syntax Describing the document DTD, XML Schema Accessing the elements using XPath Transforming XML using XSLT. XML in One Slide.


Presentation Transcript


  1. XML Lecture 1 XML Motivation & Syntax Monica Farrow email : M.Farrow@hw.ac.uk

  2. XML Topics • This lecture • Motivation • Storing XML • Programming and XML • Syntax • Describing the document • DTD, XML Schema • Accessing the elements using XPath • Transforming XML using XSLT XML - Motivation & Syntax

  3. XML in One Slide • Basically, XML is an annotated text file. The data (an element) is surrounded by descriptive start and end tags. Elements can have attributes listed in the start tag. • Example:

         <person>
           <name id="42">Lisa Simpson</name>
           <tel>0131-828-1234</tel>
           <tel>078-4701-7775</tel>
           <email>lisa@macs.hw.ac.uk</email>
         </person>
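As a rough illustration of how a program sees the person example, the sketch below parses it with Python's standard `xml.etree.ElementTree` module. Python is an assumption here (the lecture does not prescribe a language), and the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# The person example from the slide, with plain ASCII quotes round the attribute.
PERSON_XML = """
<person>
  <name id="42">Lisa Simpson</name>
  <tel>0131-828-1234</tel>
  <tel>078-4701-7775</tel>
  <email>lisa@macs.hw.ac.uk</email>
</person>
"""

# fromstring parses the text and returns the root element (<person>).
person = ET.fromstring(PERSON_XML)

name = person.find("name")        # first child element called <name>
name_text = name.text.strip()     # the element's content
name_id = name.get("id")          # the attribute listed in the start tag

# findall returns every child with the given tagname, here both <tel> elements.
phones = [tel.text.strip() for tel in person.findall("tel")]

print(person.tag, name_text, name_id, phones)
```

The tags label each piece of data, so the program asks for items by name rather than by position in the file.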

  4. Motivation • XML allows us to create machine-readable text files. • In the file with Lisa’s data, without XML tags, how can we easily specify a semi-structured format? E.g. • Compulsory name • Between 0 and 4 telephone numbers • Optional email • Using XML, the data is labelled with tags, so it can be easily identified. • The next few slides show some uses of XML.

  5. Application data • Applications can use XML to store, transmit, and display data. • E.g. to keep track of the updates which have been downloaded • Version number, file names, installation time etc. • E.g. to specify start-up settings or parameters • These can be very extensive, can be generated by ‘wizards’ and modified by humans • E.g. to send data between the server and the client in web applications (jQuery and JavaScript) • More about this later

  6. Web services • “A Web service is a software function provided at a network address over the web or the cloud, it is a service that is ‘always on’” (Wikipedia) • It’s not used through a GUI by a person • A software developer could use a web service within an application. • Web services use XML to tag the data. • Protocols based on XML are used to: • Transfer the data (SOAP) • Describe the service (WSDL) • List available services (UDDI)

  7. Web services - SOAP • SOAP: Simple Object Access Protocol • For exchanging data between any web applications

         <?xml version="1.0" encoding="UTF-8"?>
         <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
           <soap:Header> SOAP Example </soap:Header>
           <soap:Body>
             <desks:NumberInStock> 200 </desks:NumberInStock>
           </soap:Body>
         </soap:Envelope>
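A receiving application has to look elements up by namespace as well as by tagname. The sketch below reads the stock figure out of a message shaped like the one on the slide. One assumption is needed: the slide never declares the `desks` prefix, so a hypothetical URI (`http://example.com/desks`) is declared here to make the message parseable. Python and its standard `xml.etree.ElementTree` module are also assumptions.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
DESKS_NS = "http://example.com/desks"  # hypothetical URI, not given on the slide

MESSAGE = f"""<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="{SOAP_NS}" xmlns:desks="{DESKS_NS}">
  <soap:Header>SOAP Example</soap:Header>
  <soap:Body>
    <desks:NumberInStock>200</desks:NumberInStock>
  </soap:Body>
</soap:Envelope>"""

# Parse from bytes so the encoding named in the XML declaration is honoured.
envelope = ET.fromstring(MESSAGE.encode("utf-8"))

# Map the prefixes used in the search path to their namespace URIs.
ns = {"soap": SOAP_NS, "desks": DESKS_NS}
stock = envelope.find("soap:Body/desks:NumberInStock", ns)
number_in_stock = int(stock.text)

print(number_in_stock)
```

Internally ElementTree stores each tag as `{namespace-uri}localname`, which is why the prefix map is required when searching.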

  8. Write Once Use Everywhere • Separation of content from presentation • “Write once read anywhere” • The same document can be transformed using XSL (eXtensible Stylesheet Language) into different formats

     [Diagram: one XML document, passed through three different XSL stylesheets, produces XHTML (browser for mobile), TEXT (Excel) and XHTML (web browser on PC)]

  9. Some existing XML-based languages • XHTML • XML-compatible version of HTML • DocBook • For any documentation. Tags such as title, chapter, para etc. • ODF (OpenDocument Format) • For office documents such as word processing or spreadsheets. Used by OpenOffice. • MathML • To describe mathematical formulae

  10. XML data file Storage – 3 options • As a text file – simple – used in this course • In a ‘native’ XML database (NXD) • Designed especially for XML, holds a collection of XML documents • Many different ones on the market – non-standard • Extract data with XPath, XSLT (introduced in 3rd XML lecture) or the XML query language XQuery, with its FLWOR expressions (not covered in course) • Using a relational DBMS (now SQL has XML functions too) • EITHER store the XML document as the value of some field within a row • OR store the XML in a shredded form across a number of fields and tables XML In and Out
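The two relational options can be sketched side by side. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module with an in-memory database, SQLite has no XML column type (the document is stored as plain text), and the table and column names are invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = ("<person><name>Lisa Simpson</name>"
       "<tel>0131-828-1234</tel><tel>078-4701-7775</tel></person>")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person_doc (id INTEGER PRIMARY KEY, xml TEXT)")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE tel (person_id INTEGER, number TEXT)")

# Option 1: the whole XML document is the value of one field in one row.
conn.execute("INSERT INTO person_doc (xml) VALUES (?)", (DOC,))

# Option 2: the document is "shredded" across several fields and tables.
root = ET.fromstring(DOC)
cur = conn.execute("INSERT INTO person (name) VALUES (?)",
                   (root.findtext("name"),))
person_id = cur.lastrowid
conn.executemany("INSERT INTO tel (person_id, number) VALUES (?, ?)",
                 [(person_id, tel.text) for tel in root.findall("tel")])

stored_docs = conn.execute("SELECT COUNT(*) FROM person_doc").fetchone()[0]
tel_rows = conn.execute(
    "SELECT number FROM tel WHERE person_id = ? ORDER BY rowid",
    (person_id,)).fetchall()
print(stored_docs, tel_rows)
```

Option 1 keeps the document intact but leaves its contents opaque to ordinary SQL queries. Option 2 makes the contents queryable but needs a separate table for the repeated `<tel>` element.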

  11. XML and Programming • To read an XML document in a programming language, the processing steps are: • Reading the raw data as a stream of characters • Parsing the raw data • Recognising tags, content, attribute pairs • Passing the result to a client class or function for application specific processing • Many programming languages have a library of functions using Document Object Model [DOM], a tree-based interface • The programmer can navigate up and down the tree. • Details not covered in the course
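To give a feel for DOM-style navigation, here is a minimal sketch using `xml.dom.minidom` from the Python standard library (one DOM implementation among many; choosing Python is an assumption). The document is written without whitespace between tags so that every child node is an element and not a whitespace text node.

```python
from xml.dom import minidom

# parseString performs the reading and parsing steps and returns a Document.
doc = minidom.parseString(
    "<person><name>Bart Simpson</name><tel>0131-444 7777</tel></person>")

root = doc.documentElement        # the <person> element
name = root.firstChild            # down the tree: first child, <name>
tel = name.nextSibling            # across: the next child of <person>
parent_of_tel = tel.parentNode    # back up the tree: <person> again

# In the DOM, element content is itself a child node (a text node).
name_text = name.firstChild.data

print(root.tagName, name.tagName, tel.tagName, name_text)
```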

  12. XML Syntax

  13. XML Overview • XML is a ‘human-legible’ simplified subset of the Standard Generalized Markup Language (SGML), on which HTML is also based • Data is divided into elements and attributes. Each element is surrounded by a start tag and an end tag. The end tag resembles the start tag but includes a forward slash before the tagname. • <tel>0131–444 7777</tel> • Tagnames are chosen to reflect the meaning of the element content • (In HTML, tagnames are chosen to indicate page structure)

     [Diagram: SGML, with XML and HTML both based on it]

  14. Elements • The segment of an XML document between an opening and a corresponding closing tag is called an element • Elements may contain text or other elements

         <person>
           <name>Bart Simpson</name>
           <tel>0131–444 7777</tel>
           <tel>078–4011 6022</tel>
           <email>bart@ed.ac.uk</email>
         </person>

     • Slide annotations: <person> is an element that contains other elements; <name> is an element that contains text; there can be more than one element with the same tagname (<tel>)

  15. XML Document is a Tree

         person
         +-- name:  Bart Simpson
         +-- tel:   0131-444 7777
         +-- tel:   078–4011 6022
         +-- email: bart@ed.ac.uk

     • XML documents are abstractly modeled as trees, as reflected by their nesting • Sometimes, XML documents are graphs (by using IDs and IDREFs to link elements)
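The tree view falls straight out of the nesting, which a short recursive walk makes visible. The sketch below is one possible way to print the tree, assuming Python's `xml.etree.ElementTree`; the function name `outline` is made up for the example.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return one line per node: the tagname, then any text, indented by depth."""
    lines = ["  " * depth + element.tag]
    text = (element.text or "").strip()
    if text:
        lines.append("  " * (depth + 1) + repr(text))
    for child in element:            # iterating an element yields its children
        lines.extend(outline(child, depth + 1))
    return lines

BART_XML = """
<person>
  <name>Bart Simpson</name>
  <tel>0131-444 7777</tel>
  <email>bart@ed.ac.uk</email>
</person>
"""

tree_lines = outline(ET.fromstring(BART_XML))
print("\n".join(tree_lines))
```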

  16. Elements Can Be Nested

         <addresses>
           <person>
             <name>Donald Duck</name>
             <tel>0131-8281345</tel>
             <tel>0131-8281374</tel>
             <email>donald@macs.hw.ac.uk</email>
           </person>
           <person>
             <name>Mickey Mouse</name>
             <tel>0141-4261142</tel>
           </person>
         </addresses>

  17. Semi-structured data • XML is ideal for semi-structured data • If an extra telephone number, add it in • If no email at all, leave it out • No need for empty fields or multiple tables. • In a corresponding database for up to 4 telephone numbers, the database design would include spaces for 4 numbers, or a separate phone number table.
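Code that reads semi-structured data has to cope with repeated and missing elements. The sketch below reads the address list from the previous slide, assuming Python's `xml.etree.ElementTree`; the function name and the dictionary layout are illustrative choices, not part of the lecture.

```python
import xml.etree.ElementTree as ET

ADDRESSES_XML = """
<addresses>
  <person>
    <name>Donald Duck</name>
    <tel>0131-8281345</tel>
    <tel>0131-8281374</tel>
    <email>donald@macs.hw.ac.uk</email>
  </person>
  <person>
    <name>Mickey Mouse</name>
    <tel>0141-4261142</tel>
  </person>
</addresses>
"""

def read_people(xml_text):
    people = []
    for person in ET.fromstring(xml_text).findall("person"):
        people.append({
            "name": person.findtext("name").strip(),
            # Repeated element: findall returns a list of any length, even empty.
            "tels": [tel.text.strip() for tel in person.findall("tel")],
            # Optional element: findtext returns None when <email> is absent.
            "email": (person.findtext("email") or "").strip() or None,
        })
    return people

people = read_people(ADDRESSES_XML)
for entry in people:
    print(entry)
```

No placeholder is needed for Mickey's missing email, and Donald's second number needs no extra column.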

  18. Attributes • An opening tag may contain attributes • These are typically used to describe the contents of an element

         <entry>
           <word language="en">cheese</word>
           <word language="fr">fromage</word>
           <word language="ro">branza</word>
           <meaning>A food made …</meaning>
         </entry>
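Since the attribute describes how to interpret the content, a program will typically select elements by attribute value. A small sketch, assuming Python's `xml.etree.ElementTree` and leaving out the `<meaning>` element for brevity:

```python
import xml.etree.ElementTree as ET

ENTRY_XML = """
<entry>
  <word language="en">cheese</word>
  <word language="fr">fromage</word>
  <word language="ro">branza</word>
</entry>
"""

entry = ET.fromstring(ENTRY_XML)

# Build a lookup table from the attribute value to the element content.
words_by_language = {w.get("language"): w.text for w in entry.findall("word")}

# ElementTree also accepts a limited XPath predicate on attribute values.
french = entry.find("word[@language='fr']").text

print(words_by_language, french)
```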

  19. When to Use Attributes • It’s not always clear when to use attributes. • How should ssno (social security number, American) be stored?

     As an attribute:

         <person ssno="123 4589">
           <name>L. Simpson</name>
           <email>lisa@macs.hw.ac.uk</email>
           ...
         </person>

     As an element:

         <person>
           <ssno>123 4567</ssno>
           <name>L. Simpson</name>
           <email>lisa@macs.hw.ac.uk</email>
           ...
         </person>

  20. When to Use Attributes • Using an attribute rather than elements might make the structure more difficult to alter in the future. In attributes: • Multiple values are not permitted • Tree structures are not permitted • General rule – avoid using attributes unless there is a good reason for using them • Use an attribute to describe how the data should be interpreted (e.g. language, currency) • Use an attribute for “IDs”, i.e., identifying data (covered later)

  21. A Complete XML Document

         <?xml version="1.0" encoding="UTF-8" ?>
         <addresses>
           <person ssno="113">
             <name>Lisa Simpson</name>
             <tel>0131-828 1234</tel>
             <tel>078-4701 7775</tel>
             <email>lisa@macs.hw.ac.uk</email>
           </person>
         </addresses>

     [Slide annotation: Required]

  22. Empty element, and case • There is a special shortcut for tags that have only attributes, with no text or sub-elements in between them (empty element, bachelor tag) • <img src="myPic.jpg" /> instead of • <img src="myPic.jpg"></img> • XML is case-sensitive, i.e., the following are different: <person>, <Person>, <PERSON>
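Both points can be checked with a parser. The sketch below assumes Python's `xml.etree.ElementTree`; the helper `parses` is a made-up name that reports whether a string parses without error.

```python
import xml.etree.ElementTree as ET

# The shortcut and the long form produce the same element.
short_form = ET.fromstring('<img src="myPic.jpg" />')
long_form = ET.fromstring('<img src="myPic.jpg"></img>')
same_element = (short_form.tag == long_form.tag
                and short_form.attrib == long_form.attrib
                and len(short_form) == len(long_form) == 0)  # no children

def parses(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Tagnames are case-sensitive, so <person> cannot be closed by </Person>.
matching_case = parses("<person>Lisa</person>")
mismatched_case = parses("<person>Lisa</Person>")

print(same_element, matching_case, mismatched_case)
```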

  23. Well Formed Documents • A document is well-formed if: • It has one top-level element (root element) • Tags come in properly nested, case-sensitive pairs • Empty elements may use the accepted shortcut (/>) • Attribute values are enclosed in quotes • Attribute names are not repeated within a tag
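Any conforming XML parser must reject a document that breaks these rules, so a parser doubles as a well-formedness checker. A minimal sketch, assuming Python's `xml.etree.ElementTree`; `is_well_formed` and the tiny test documents are invented for the illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

# One small document per rule; only the first follows all of them.
checks = {
    "single root": is_well_formed("<a><b/></a>"),
    "two roots": is_well_formed("<a/><b/>"),
    "bad nesting": is_well_formed("<a><b></a></b>"),
    "unquoted attribute": is_well_formed("<a id=1/>"),
    "repeated attribute": is_well_formed('<a id="1" id="2"/>'),
}

for rule, ok in checks.items():
    print(rule, ok)
```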

  24. Are these well-formed XML files? • File 1:

         <?xml version="1.0"?>
         <Question> Here is a question</Question>

     • File 2:

         <?xml version="1.0"?>
         <Question> Here is a question</Question>
         <Answer> Here is an answer</Answer>

  25. Why is this not well-formed?

         <?xml version ="1.0" encoding="UTF-8" ?>
         <person phone= 0131-828 1234 phone=078-4701 7775 >
           <Name>
             <first>Homer <second>Simpson </first></second>
           </name>
         <person phone= 0131-828 1235 >
           <Name>
             <first>Lisa <second>Simpson </first></second>
           </name>

  26. XML Authoring • There are many authoring tools available to facilitate the creation of XML documents. • Visual Studio for Windows is in the lab • However, you may as well start off using a simple text editor (not Word) which shows line numbers and is ideally XML-aware • XML is after all just a text file. • E.g. Notepad++ for Windows • Most Linux text editors are OK • You are then responsible for checking that the XML is correct!

  27. Viewing and checking XML • If well-formed XML is loaded into your browser it will be displayed as a tree structure • This is perhaps the simplest way to check that XML is well-formed

  28. Viewing and checking XML • If incorrect XML is loaded into your browser then error messages will be displayed
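A parser library reports the same kind of error as the browser, including where in the file the problem was detected. The sketch below assumes Python's `xml.etree.ElementTree`, whose `ParseError` carries a `position` attribute of (line, column). The broken document and the function name are made up for the example.

```python
import xml.etree.ElementTree as ET

# The <tel> element is never closed, so </person> arrives unexpectedly.
BROKEN_XML = """<person>
  <name>Homer Simpson</name>
  <tel>0131-828 1234
</person>"""

def first_error(xml_text):
    """Return (line, column, message) for the first error, or None if there is none."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None

error = first_error(BROKEN_XML)
print(error)
```

The reported line is where the parser noticed the problem, which is often later than the line containing the actual mistake. This is one reason an editor that shows line numbers is useful.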

  29. Exercise 1 • An XML file holds information about holiday homes for rent. Write an example of such an XML file containing 2 or 3 records. Invent appropriate element and attribute names. • Each home has an id, a name, a location and an optional url • Additionally, each home has one or more sets of contact details. Contact details consist of a name and a phone number, and optionally an email address. • People do not own more than one holiday home. • In your example, demonstrate optional or repeated elements. • How would you hold this information in a relational database?

  30. Referencing other elements • Unique elements (identified here by an attribute) can be referred to from other elements • In this way, relationships between elements can be shown without repetition • E.g. • Books and authors can be listed. But each book may have >1 author, each author might write >1 book. So the book can contain a reference to the author. See books.xml

  31. Extract from books.xml

         <book bookID="222KK" year="2000">       <!-- ** an id -->
           <title>Data on the Web</title>
           <Author>4</Author>                    <!-- **** element references an id -->
           <Author>2</Author>
           <publisher>Morgan Kaufmann Publishers</publisher>
           <price>39.95</price>
         </book>
         .....
         <author authID="4">                     <!-- **** an id -->
           <firstName>Mary</firstName>
           <lastName>Thomson</lastName>
           <Book>222KK</Book>                    <!-- ** element references an id -->
         </author>

     Asterisks show links between the data (in the same file)
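Following such references is the application's job: it looks up the element whose id matches the referencing text. The sketch below shows one way to do this with Python's `xml.etree.ElementTree`. Several things are assumptions: the extract does not show the root element of books.xml, so a hypothetical `<library>` root is used, and the reference to author 2 is left out because that author is not part of the extract.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """
<library>
  <book bookID="222KK" year="2000">
    <title>Data on the Web</title>
    <Author>4</Author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
  <author authID="4">
    <firstName>Mary</firstName>
    <lastName>Thomson</lastName>
    <Book>222KK</Book>
  </author>
</library>
"""

root = ET.fromstring(BOOKS_XML)

# Index the identified elements by their id attribute.
authors_by_id = {a.get("authID"): a for a in root.findall("author")}
books_by_id = {b.get("bookID"): b for b in root.findall("book")}

def author_names(book):
    """Resolve each <Author> reference in a book to the author's full name."""
    names = []
    for ref in book.findall("Author"):
        author = authors_by_id.get(ref.text.strip())
        if author is not None:        # skip references that cannot be resolved
            names.append(author.findtext("firstName") + " "
                         + author.findtext("lastName"))
    return names

names = author_names(books_by_id["222KK"])

# The link also works in the other direction, from author to books.
titles = [books_by_id[ref.text.strip()].findtext("title")
          for ref in authors_by_id["4"].findall("Book")]

print(names, titles)
```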

  32. Exercise – 2 (using ids) • An XML file holds information about holiday homes for rent. Write an example of such an XML file containing 2 or 3 records. Invent appropriate element and attribute names. Use books.xml as an example. • Each home has an id, a name, a location and an optional url • Each contact has a name, phone and optional email address • Each person can own many homes • Each home can be owned by more than one person • How would you hold this information in a relational database?

  33. Defining the structure of an XML file • We can check if an XML file is well-formed • By looking at it, maybe • By loading it into a browser • If well-formed, it will be displayed • However, how can we check that the well-formed file contains the correct elements in the correct quantities? E.g. • Mustn’t contain tagnames that aren’t expected • Must contain tagnames that are expected • Must contain the correct number of tags with the same tagname • We need to write a specification for the XML file • See the next lecture
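To see why a specification is worth having, consider what checking these three conditions by hand looks like. The sketch below hard-codes the person rules from the Motivation slide (compulsory name, 0 to 4 telephone numbers, optional email) in Python, using `xml.etree.ElementTree`. The `RULES` table and `check_person` are invented for the illustration and are no substitute for a DTD or XML Schema.

```python
import xml.etree.ElementTree as ET

# tagname -> (minimum, maximum) number of occurrences inside <person>
RULES = {"name": (1, 1), "tel": (0, 4), "email": (0, 1)}

def check_person(person):
    """Return a list of problems; an empty list means the rules are met."""
    problems = []
    for child in person:
        if child.tag not in RULES:                # unexpected tagname
            problems.append(f"unexpected element <{child.tag}>")
    for tag, (low, high) in RULES.items():
        count = len(person.findall(tag))
        if not low <= count <= high:              # missing or too many
            problems.append(
                f"<{tag}> occurs {count} times, expected {low}..{high}")
    return problems

good = ET.fromstring("<person><name>Lisa</name><tel>1</tel></person>")
bad = ET.fromstring("<person><tel>1</tel><fax>2</fax></person>")

print(check_person(good))
print(check_person(bad))
```

Every new document type would need its own hand-written checker like this one, which is the repetition a declarative specification avoids.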
