XML

XML. Internet Engineering Spring 2014 Bahador Bakhshi CE & IT Department, Amirkabir University of Technology. Questions. Q6) How to define the data that is transferred between web server and client? Q6.1) Which technology? Q6.2) Is data correctly encoded?

Presentation Transcript


  1. XML Internet Engineering Spring 2014 Bahador Bakhshi CE & IT Department, Amirkabir University of Technology

  2. Questions • Q6) How to define the data that is transferred between web server and client? • Q6.1) Which technology? • Q6.2) Is data correctly encoded? • Q6.3) How to access the data in web pages? • Q6.4) How to present the data?

  3. Homework • Homework 3 • Will be announced

  4. Outline • Introduction • Namespaces • Validation • Presentation • XML Processing (using JavaScript) • Conclusion

  5. Outline • Introduction • Namespaces • Validation • Presentation • XML Processing (using JavaScript) • Conclusion

  6. Introduction • HTML + CSS + JavaScript → Dynamic Web pages • Web server is not involved after page is loaded • JavaScript reacts to user events • However, most web applications need data from the server after the page is loaded • e.g., new email data in Gmail • A mechanism for communication: AJAX • A common (standard) format to exchange data • In most applications, the data is structured

  7. Introduction (cont’d) • In general (not only on the web), to store or transport data we need a common format that specifies the structure of the data; e.g., • Documents: PDF, HTML, DOCx, PPTx, ... • Objects: Java Object Serialization/Deserialization • How to define the data structure? • Binary format (similar to binary files) • Difficult to develop & debug, machine dependent, … • Text format (similar to text files) • Human readable, machine independent & easier

  8. Introduction (cont’d) • Example: Data structure of a class • Course name, teacher, # of students, each student's information • Ad hoc text format 1: IE Bakhshi 48 Ali Hassani 1111 Babak Hosseini 2222 …. • Ad hoc text format 2: Student num: 48 Name: IE Teacher: Bakhshi Ali Hassani 1111 Babak Hosseini 2222 …. • In-memory object: class Course{ string name; string teacher; integer num; Array st of Students; } c = new Course(); c.name = IE; c.teacher = Bakhshi; c.num = 48; st[1] = new Student(); st[1].name=Ali; st[1].fam=Hassani ….

  9. Introduction (cont’d) • W3C approach • XML: eXtensible Markup Language • A meta-markup language to describe data structure • In each application, a markup language (a set of tags & attributes) is defined using XML <course> <title> IE </title> <num> 48 </num> <teacher> Bakhshi </teacher> <students> <student><name>Ali</name> <fam>Hassani</fam> <id> 1111 </id></student> … </students> </course>
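A minimal sketch of how an application might read the course document above, using Python's standard `xml.etree.ElementTree` module. The choice of Python and the variable names are assumptions made for illustration (the slides themselves turn to JavaScript for processing later), and the elided students ("…") are simply left out.

```python
import xml.etree.ElementTree as ET

# The course document from the slide; the elided students are omitted.
COURSE_XML = """
<course>
  <title> IE </title>
  <num> 48 </num>
  <teacher> Bakhshi </teacher>
  <students>
    <student><name>Ali</name> <fam>Hassani</fam> <id> 1111 </id></student>
  </students>
</course>
"""

root = ET.fromstring(COURSE_XML)

# The parser keeps the spaces inside elements, so strip them here.
title = root.findtext("title").strip()
num = int(root.findtext("num"))  # int() tolerates surrounding spaces
students = [
    (s.findtext("name").strip(), s.findtext("fam").strip(), s.findtext("id").strip())
    for s in root.iter("student")
]

print(title, num, students)
```

The point of the sketch is that the structure (which value is the title, which is the student ID) comes from the tags, not from the position of values in the text.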

  10. Introduction (cont’d) • Standard Generalized Markup Language (SGML) • Expensive, complex to implement • XML: a subset of SGML • Goals: simplicity, generality, and usability • Simplifies SGML by: • leaving out many syntactical options and variants • SGML ~ 600pp, XML ~ 30pp • XML = SGML − {complexity, document perspective} + {simplicity, data exchange perspective}

  11. Why Study XML: Benefits • Simplify data sharing & transport • XML is text based and platform independent • Extensive tools to process XML • To validate, to present, to search, … • In web applications, data is separated from HTML • E.g., table structure by HTML, table data by XML • Extensible for different applications • A powerful tool to model/describe complex data • E.g., MS Office!!!

  12. XML Document Elements • Markup • Elements • Tag + Content • Attributes • Comments • Processing instructions • Content • Parsed Character Data • Unparsed Character Data (CDATA)

  13. XML Elements • XML element structure • Tag + content <tagname attribute="value"> Content </tagname> • No predefined tags • If the content is not CDATA, it is parsed by the parser • The content is either: • A value for this element • Child elements of this element

  14. XML Elements’ Attributes • Tags (elements) are customized by attributes • No predefined attributes <os install="factory">Windows</os> <os install="user">Linux</os> • Attributes vs. tags (elements) • Attributes can be replaced by elements • An attribute cannot be repeated on the same element • An attribute cannot have children • Attributes are mainly used for metadata, e.g., ID, class
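The attribute rules above can be sketched with Python's standard `xml.etree.ElementTree` module. The wrapper element `<computer>` is a hypothetical name added only so that the two `<os>` elements from the slide have a single root.

```python
import xml.etree.ElementTree as ET

# <computer> is a hypothetical root; the slide shows only the two <os> elements.
doc = ET.fromstring(
    "<computer>"
    '<os install="factory">Windows</os>'
    '<os install="user">Linux</os>'
    "</computer>"
)

# Attributes are metadata on the element; the element text is the data.
installs = {os_el.text: os_el.get("install") for os_el in doc.findall("os")}
print(installs)

# Repeating an attribute on one element is a well-formedness error,
# so the parser rejects the document.
try:
    ET.fromstring('<os install="factory" install="user">Windows</os>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False
print(duplicate_accepted)
```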

  15. Processing Instructions • Processing instructions (PIs) pass information (instructions) to the application that processes the XML file • They are not a part of user data <?Target String ?> • Common usage <?xml-stylesheet href="URL" type="text/xsl"?> • The XML declaration looks like a special PI <?xml version="1.0" encoding="UTF-16"?> • If present, the XML declaration must be the very first thing in the file

  16. Basic XML Document Structure <?xml version="1.0" encoding="UTF-16"?> <root-tag> <inner-tags> Data </inner-tags> <!-- Comment --> </root-tag>

  17. Example <?xml version="1.1" encoding="UTF-8" ?> <notebook> <name>ThinkPad</name> <model>T500</model> <spec> <hardware> <RAM>4GB</RAM> </hardware> <software> <OS>Linux, FC20 </OS> </software> </spec> </notebook>

  18. Example (CDATA) <?xml version="1.1" encoding="UTF-8" ?> <operator> <mathematic> + - * / % </mathematic> <comparison> <![CDATA[ < <= == >= > != ]]> </comparison> </operator>
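A minimal sketch of what CDATA buys, using Python's standard `xml.etree.ElementTree` module (an assumption; any conforming parser should behave the same way). The XML declaration from the slide is omitted to keep the snippet short.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<operator>"
    "<mathematic> + - * / % </mathematic>"
    "<comparison><![CDATA[ < <= == >= > != ]]></comparison>"
    "</operator>"
)

# Inside CDATA the characters < and > are plain data, not markup.
comparison = doc.findtext("comparison")
print(repr(comparison))

# The same characters outside CDATA are parsed as markup and rejected.
try:
    ET.fromstring("<comparison> < <= == </comparison>")
    bare_accepted = True
except ET.ParseError:
    bare_accepted = False
print(bare_accepted)
```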

  19. XML vs. HTML • Tags • HTML: Predefined fixed tags • XML: No predefined tags (meta-language) • User defined tags & attributes • Purpose • HTML: Information display • XML: Data structure & transfer • Rules’ strictness • HTML (not XHTML): loose • XML: strong/strict rule checking

  20. XML in General Application • XML by itself does not do anything • XML just describes the structure of the data • Other applications parse XML and use it • A similar approach is used for other formats (even user-defined formats); so, what are the advantages of XML? • XML is standard • Available XML tools & technologies • [Figure: XML document → XML processor (a.k.a. XML parser, e.g., Apache Xerces) → SAX / DOM interface → application]
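The two parser interfaces named on this slide, SAX and DOM, can be contrasted with a short sketch. It uses Python's standard `xml.sax` and `xml.dom.minidom` modules rather than Apache Xerces, which is an assumption made to keep the example self-contained; the class name `TagCounter` is hypothetical.

```python
import xml.sax
from xml.dom import minidom

NOTEBOOK_XML = (
    b"<notebook><name>ThinkPad</name><model>T500</model>"
    b"<spec><hardware><RAM>4GB</RAM></hardware>"
    b"<software><OS>Linux, FC20 </OS></software></spec></notebook>"
)


class TagCounter(xml.sax.ContentHandler):
    """SAX style: the parser calls back for each start tag as it streams by."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


counter = TagCounter()
xml.sax.parseString(NOTEBOOK_XML, counter)
print(counter.counts)

# DOM style: the whole document is loaded into a tree that can be navigated.
dom = minidom.parseString(NOTEBOOK_XML)
ram = dom.getElementsByTagName("RAM")[0].firstChild.data
print(ram)
```

SAX keeps no tree in memory and suits large streams; DOM costs memory but allows random access to any node.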

  21. XML Technology Components • Data structure (tree) representation • XML document (a text file) • Validation & Conformance • Document Type Definition (DTD) or XML Schema • Element access & addressing • XPath, DOM • Display and transformation • XSLT or CSS • Programming, Database, Query, …

  22. Outline • Introduction • Namespaces • Validation • Presentation • XML Processing (using JavaScript) • Conclusion

  23. Namespaces • In XML, element names are defined by developers • Results in a conflict when trying to mix XML documents from different XML applications • XML file 1 <table> <tr> <td>Apples</td> <td>Bananas</td> </tr> </table> • XML file 2 <table> <name>Dinner Table</name> <width>80</width> <length>120</length> </table>

  24. Namespaces • Name conflicts in XML can easily be avoided by using qualified names, i.e., names with a prefix • The prefix stands for the namespace • The qualified name is the prefixed name • Step 1: Namespace declaration • Defines a label (prefix) for the namespace and associates it with the namespace identifier • A URI/URL is used so that the identifier is universally unique • Step 2: Qualified name • namespace-prefix:local-name

  25. Namespaces <?xml version="1.0"?> <ceit:course xmlns:ceit="http://ceit.aut.ac.ir"> <ceit:department> <ceit:name> Computer Engineering &amp; Information Technology </ceit:name> </ceit:department> <ceit:name> Internet Engineering </ceit:name> </ceit:course> Name of the ceit:department tag as seen by the parser: http://ceit.aut.ac.ir:department
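A minimal sketch of how a namespace-aware parser resolves the prefix, using Python's standard `xml.etree.ElementTree` module. Note that ElementTree writes the expanded name as `{namespace-URI}local-name`, which is only a different notation for the "URI plus local name" idea on the slide. The local prefix `c` is an arbitrary, hypothetical choice; the ampersand in the department name is written as `&amp;`, as well-formed XML requires.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<ceit:course xmlns:ceit="http://ceit.aut.ac.ir">'
    "<ceit:department>"
    "<ceit:name>Computer Engineering &amp; Information Technology</ceit:name>"
    "</ceit:department>"
    "<ceit:name>Internet Engineering</ceit:name>"
    "</ceit:course>"
)

# The prefix is gone; the parser reports the namespace URI plus the local name.
print(doc.tag)

# The prefix used when querying is local to this code and may differ
# from the prefix used in the document; only the URI has to match.
ns = {"c": "http://ceit.aut.ac.ir"}
course_name = doc.findtext("c:name", namespaces=ns)
department_name = doc.findtext("c:department/c:name", namespaces=ns)
print(course_name, "|", department_name)
```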

  26. Default Namespaces <alltables> <table xmlns="http://www.w3.org/TR/html4/"> <tr> <td>Apples</td> <td>Bananas</td> </tr> </table> <table xmlns="http://www.dinnertable.com"> <name>Dinner Table</name> <width>80</width> <length>120</length> </table> </alltables>
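The default-namespace example can be checked with a short sketch in Python's standard `xml.etree.ElementTree` module: the two `<table>` elements have the same local name but different expanded names, so an application can tell them apart.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<alltables>
  <table xmlns="http://www.w3.org/TR/html4/">
    <tr><td>Apples</td><td>Bananas</td></tr>
  </table>
  <table xmlns="http://www.dinnertable.com">
    <name>Dinner Table</name><width>80</width><length>120</length>
  </table>
</alltables>
""")

# Same local name, different namespaces: no conflict.
tags = [child.tag for child in doc]
print(tags)

# A default namespace also applies to the descendants of the element
# that declares it, so <width> must be addressed with the same URI.
FURNITURE = "{http://www.dinnertable.com}"
furniture_table = doc.find(FURNITURE + "table")
width = int(furniture_table.findtext(FURNITURE + "width"))
print(width)
```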

  27. Outline • Introduction • Namespaces • Validation • Presentation • XML Processing (using JavaScript) • Conclusion

  28. Valid XML • XML is used to describe structured data • The description must be correct • A valid XML file • Correctness • Syntax • Syntax error → parser fails to parse the file • Syntax rules: e.g., all XML tags must be closed • Semantics (structure) • Application-specific rules, e.g., student must have ID • Error → Application failure

  29. XML Syntax Rules (Well-Formed) • Start-tag and end-tag, or self-closing tag • Tags can’t overlap • XML documents can have only one root element • XML naming conventions • Names can start with letters or the underscore (_) character • After the first character, digits, hyphens, and periods are also allowed • Names can’t start with “xml”, in uppercase or lowercase (reserved) • There can’t be a space after the opening < character • XML is case sensitive • Attribute values must be quoted • White space in content is preserved • &, <, > in content are written as &amp; &lt; &gt;
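Several of the rules above can be exercised with a small sketch: hand a snippet to a parser and see whether it is accepted. The helper name `is_well_formed` and the sample snippets are hypothetical; the parser is Python's standard `xml.etree.ElementTree`, which checks well-formedness only (not validity against a DTD or schema).

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "well-formed": "<a><b x='1'/></a>",
    "overlapping tags": "<a><b></a></b>",
    "two root elements": "<a/><b/>",
    "unquoted attribute": "<a x=1/>",
    "case mismatch": "<a></A>",
    "bare ampersand": "<a>R & D</a>",
    "name starts with hyphen": "<-a/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```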

  30. How to Validate XML? • 1) Application specific programs need to check structure of XML document • Different applications → different programs • Change in data structure → code modification • 2) General XML parser + reference document • Reference document • Tag names, attributes, tree structure, tag relations, … • Different reference documents • DTD, XML Schema, RELAX NG
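Approach 1 (an application-specific check) might look like the sketch below, which enforces the earlier example rule "student must have ID" by hand. The function name and the sample data are hypothetical. The drawback named on the slide is visible here: the rule lives in code, so any change to the data structure means editing the program.

```python
import xml.etree.ElementTree as ET


def students_missing_id(course_xml):
    """Return the names of <student> elements that lack a non-empty <id>."""
    root = ET.fromstring(course_xml)
    missing = []
    for student in root.iter("student"):
        student_id = student.findtext("id", default="").strip()
        if not student_id:
            missing.append(student.findtext("name", default="?").strip())
    return missing


SAMPLE = """
<course>
  <students>
    <student><name>Ali</name><fam>Hassani</fam><id>1111</id></student>
    <student><name>Babak</name><fam>Hosseini</fam></student>
  </students>
</course>
"""

print(students_missing_id(SAMPLE))
```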

  31. XML Validation (cont’d) • Document Type Definition (DTD) or XML Schema • A language to define document type • The rules of the structure of XML • Internal or External • [Figure: XML data and DTD / Schema → parser → parser interface → XML-based application]

  32. DTD • A DTD is a set of structural rules, called declarations, which specify • The set of elements and attributes that can appear in the XML • Where these elements and attributes may appear • <!keyword …> • ELEMENT: to define tags • For leaf nodes: Character pattern • For internal nodes: List of children • ATTLIST: to define tag attributes • Includes: name of the element, the attribute’s name, its type, and a default option

  33. ELEMENT Declaration • General form of internal nodes • <!ELEMENT element_name (list of children)> • To control the number of times a child may appear • + : One or more • * : Zero or more • ? : Zero or one • General form of leaf nodes • <!ELEMENT element_name (#PCDATA)>, or <!ELEMENT element_name ANY>, or <!ELEMENT element_name EMPTY> • Where, types • #PCDATA: Most commonly used; the content is parsed, i.e., < and & must be escaped • ANY: Any declared elements or character data are allowed • EMPTY: No content

  34. ATTLIST Declaration • <!ATTLIST element_name attribute_name attribute_type default_option> • element_name: The name of the corresponding element • attribute_name: The name of attribute • attribute_type: Commonly CDATA is used • default_option: • A value: The default value of the attribute • #REQUIRED: The attribute is mandatory

  35. Example: Internal DTD <?xml version="1.0"?> <!DOCTYPE name [ <!ELEMENT name (first, middle, last)> <!ATTLIST name nickname CDATA #REQUIRED> <!ELEMENT first (#PCDATA)> <!ELEMENT middle (#PCDATA)> <!ELEMENT last (#PCDATA)> ]> <name nickname="Jo"> <first>John</first> <middle>Johansen</middle> <last>Smith</last> </name>

  36. External DTD • System <!DOCTYPE root_name SYSTEM "URL"> • Public <!DOCTYPE root_name PUBLIC "-//name//DTD Name//EN" "URL"> • Common format is FPI (defined in the document ISO 9070) • Example <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

  37. Example: External DTD sample.dtd <!ELEMENT note (to+,from,heading*,main)> <!ELEMENT to (#PCDATA)> <!ELEMENT from (#PCDATA)> <!ELEMENT heading (#PCDATA)> <!ELEMENT main (#PCDATA)> -------------------------------------------------------------------------- external-dtd.xml <?xml version="1.0" ?> <!DOCTYPE note SYSTEM "sample.dtd" > <note> <to>Ali</to> <to>Hassan</to> <from>Babak</from> <main>This is message</main> </note>
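To make the content model `(to+,from,heading*,main)` concrete, the sketch below checks the order and count of the children of `<note>` with a regular expression. This is a hand-rolled stand-in written for illustration, not a DTD validator: Python's standard library does not validate against DTDs, and the function name is hypothetical. The DTD occurrence operators map directly onto regex operators (`+` one or more, `*` zero or more).

```python
import re
import xml.etree.ElementTree as ET

# Content model of <note> from sample.dtd: (to+, from, heading*, main).
# Each child name is followed by one space so names cannot run together.
NOTE_MODEL = re.compile(r"(to )+from (heading )*main ")


def matches_note_model(note_xml):
    """Check only the sequence of child element names of <note>."""
    note = ET.fromstring(note_xml)
    if note.tag != "note":
        return False
    child_names = "".join(child.tag + " " for child in note)
    return NOTE_MODEL.fullmatch(child_names) is not None


good = "<note><to>Ali</to><to>Hassan</to><from>Babak</from><main>This is message</main></note>"
no_recipient = "<note><from>Babak</from><main>This is message</main></note>"
wrong_order = "<note><to>Ali</to><main>This is message</main><from>Babak</from></note>"

print(matches_note_model(good), matches_note_model(no_recipient), matches_note_model(wrong_order))
```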

  38. XML Schema • XML Schema describes the structure of an XML file • Also referred to as XML Schema Definition (XSD) • XML Schema benefits (DTD disadvantages) • Created using basic XML syntax (DTD has its own syntax) • Validates text element content based on built-in and user-defined data types (DTD does not fully support data types) • Similar to OOP • Schema is a class & XML files are instances • Schema specifies • Elements and attributes, where and how often • Data type of every element and attribute

  39. Schema (cont’d) • XML Schema is itself an XML-based language • Has its own predefined tags & namespace xmlns:xs="http://www.w3.org/2001/XMLSchema" • Two categories of data types • Simple: Cannot have attributes or nested elements • Primitive: string, boolean, decimal, float, ... • Derived: integer, byte, long, unsignedInt, … • User defined: restriction of base types • Complex: Can have attributes and/or nested elements

  40. XML Schema (cont’d) • Simple element declaration <xs:element name="a name" type="a type" /> • Complex element declaration <xs:element name="a name"> <xs:complexType> <xs:sequence> or <xs:all> <xs:element name="…" minOccurs="…" maxOccurs="…"/> </xs:sequence> or </xs:all> </xs:complexType> </xs:element>

  41. XML Schema Example: note.xsd <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="date" type="xs:date"/> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>

  42. XML Schema Example: note.xml <?xml version="1.0"?> <note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="note.xsd"> <to>Ali</to> <from>Reza</from> <date>1391-01-01</date> </note>
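Because note.xsd declares `<date>` as `xs:date`, a schema validator checks the text of that element against the type's lexical form, which is `YYYY-MM-DD`; a slash-separated value such as `1391/1/1` would be rejected. The sketch below approximates that single check in plain Python. It is an illustration under stated assumptions, not a schema validator: the function name is hypothetical, and optional time zones, negative years, and years with more than four digits (all allowed by `xs:date`) are ignored.

```python
import re
from datetime import date

_XS_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def looks_like_xs_date(text):
    """Approximate check of the common YYYY-MM-DD form of xs:date."""
    match = _XS_DATE.fullmatch(text.strip())
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # rejects month 13, day 30 of February, ...
    except ValueError:
        return False
    return True


for candidate in ["1391-01-01", "1391/1/1", "2014-02-30"]:
    print(candidate, looks_like_xs_date(candidate))
```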

  43. XML Validation Tools • Online validators • validator.w3.org • www.xmlvalidation.com • XML tools & commands • xmllint commands in Linux • xmllint xmlfile --valid --dtdvalid DTD • xmllint xmlfile --schema schema • XML libraries • LibXML2 for C • Java & C# XML libraries

  44. Outline • Introduction • Namespaces • Validation • Presentation • XML Processing (using JavaScript) • Conclusion

  45. XML Presentation • By default, browsers parse & display XML files • Tree structure of XML • Syntax checking → Well-formed XML • Other presentations of XML • 1) Browsers support CSS for XML files • CSS is used to format the representation of XML • 2) Transform to HTML + CSS using XSLT • A powerful tool to separate data from HTML • 3) Use JavaScript to generate HTML for XML • Parse the XML and create HTML elements

  46. XML & CSS • Attach styling instructions directly to XML <?xml-stylesheet href="URL" type="text/css" ?> • Can style but not rearrange elements • Block or inline style • Bold, italic, underline, font, color, etc. • … Tag_name {color:red; font-weight:bold; font-family:serif;}

  47. CSS Example • XML file: <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/css" href="book.css" ?> <programming> Good programming books: <C> <book> <title>The C Programming Language </title> <author>Ritchie</author> </book> </C> <Java> <book> <title>Thinking in Java </title> <author>Eckel</author> </book> </Java> </programming> • book.css: *{display: block;} programming{font-family: Arial; font-size:20pt;} C{color: blue;} Java{color: green;} author{ font-style:italic;}

  48. XML & CSS • CSS is mainly designed to format HTML presentation • It does not work well with XML • XML does not have any predefined tags/attributes • No ID, no class • CSS for XML uses tag names • The same style for all tags with the same name • CSS can only present the data in XML • It cannot process & transform the data into other formats • e.g., presenting data as a table
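Turning the book data into a table requires code (or XSLT) rather than CSS. The sketch below illustrates the "parse the XML and create HTML elements" idea with the programming-books document; it is written in Python for consistency with the other sketches, whereas the slides propose JavaScript in the browser. The function name and the column headings are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

BOOKS_XML = """
<programming> Good programming books:
  <C><book><title>The C Programming Language </title><author>Ritchie</author></book></C>
  <Java><book><title>Thinking in Java </title><author>Eckel</author></book></Java>
</programming>
"""


def books_to_html_table(xml_text):
    """Build an HTML table with one row per <book>."""
    root = ET.fromstring(xml_text)
    rows = []
    for language in root:  # element children only: <C>, <Java>
        for book in language.findall("book"):
            cells = [
                language.tag,
                book.findtext("title", default="").strip(),
                book.findtext("author", default="").strip(),
            ]
            # Escape the data so that it cannot be mistaken for HTML markup.
            rows.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>")
    header = "<tr><th>Language</th><th>Title</th><th>Author</th></tr>"
    return "<table>" + header + "".join(rows) + "</table>"


html_table = books_to_html_table(BOOKS_XML)
print(html_table)
```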

  49. XSL • XSL stands for eXtensible Stylesheet Language, and is a style sheet language for XML documents • It started as XSL and led to XSLT, XPath, and XSL-FO • XPath • A language for navigating XML documents • XSLT (XSL Transformations) • Transforms XML into other formats, like HTML • XSL-FO (XSL Formatting Objects) • Not discussed here!

  50. XPath • XPath is a language for addressing different parts of XML document • XPath is a syntax for defining parts of an XML • XPath uses path expressions to navigate in XML documents
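A minimal sketch of path expressions, run against the programming-books document from the CSS example. Python's standard `xml.etree.ElementTree` supports only a limited subset of XPath, which is an assumption worth keeping in mind: the expressions below stay inside that subset, and a full XPath engine accepts much more.

```python
import xml.etree.ElementTree as ET

books = ET.fromstring("""
<programming>
  <C><book><title>The C Programming Language</title><author>Ritchie</author></book></C>
  <Java><book><title>Thinking in Java</title><author>Eckel</author></book></Java>
</programming>
""")

# .//title : every <title> at any depth below the root
all_titles = [t.text for t in books.findall(".//title")]

# ./Java/book/author : follow an explicit path from the root
java_authors = [a.text for a in books.findall("./Java/book/author")]

# .//book[author='Eckel'] : a predicate that filters on a child's text
eckel_titles = [b.findtext("title") for b in books.findall(".//book[author='Eckel']")]

print(all_titles, java_authors, eckel_titles)
```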
