
An Introduction to XML

An Introduction to XML. Zhu Maosheng, 2001-03-15. Main content: 1. XML Tutorial; 2. XML as a Data Representation Standard and Data Model; 3. XML as a Data Interchange Standard and Information Integration; 4. Repository and XML Application Server.


Presentation Transcript


  1. An Introduction to XML. Zhu Maosheng, 2001-03-15

  2. Main Content
     1. XML Tutorial
     2. XML as a Data Representation Standard and Data Model
     3. XML as a Data Interchange Standard and Information Integration
     4. Repository and XML Application Server

  3. XML Tutorial
     Background
     • Extending HTML (MathML, CML, VoiceXML)
     • Data interchange (product catalogs, health records…)
     Main Characteristics
     • Data semantics
     • Data independence
     • Semi-structured (schemaless, irregular)
     • Derived and others: flexible, local computing, data integration, structured text, license-free…

  4. XML's Goals
     • Enable internationalized, media-independent electronic publishing.
     • Allow industries to define platform-independent protocols for the exchange of data, especially the data of electronic commerce.
     • Deliver information to user agents in a form that allows automatic processing after receipt.
     • Make it easy for people to process data using inexpensive software.
     • Allow people to display information the way they want it.
     • Provide metadata (data about information) that will help people find information and help information producers and consumers find each other.

  5. Introduction to XML's Family
     • Document status passes through five phases: Note -> Working Draft -> Candidate Recommendation -> Proposed Recommendation -> Recommendation.
     • Does a given piece of software support a specification? Which version?
     • XML 1.0 Recommendation
     • DTD and XML Schema (Candidate Recommendation)
     • Namespace, XPath 1.0 Recommendation, XPointer, XLink
     • XSLT 1.0 Recommendation

  6. XML 1.0 Recommendation
     4.1 Basic Logical Structure
         document    ::= prolog element Misc*
         prolog      ::= XMLDecl? Misc* (doctypedecl Misc*)?
         XMLDecl     ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
         VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
         Eq          ::= S? '=' S?
         VersionNum  ::= ([a-zA-Z0-9_.:] | '-')+
         Misc        ::= Comment | PI | S
     4.2 Basic Physical Structure
         Entities: internal, external, general, parameter
     4.3 Notes on reading the specification (EBNF)
     4.4 Notes on writing a well-formed XML document

  7. XML 1.0 Recommendation (continued)
     EBNF (Extended BNF) notation:
         #xNNNN, [a-zA-Z], [#xNNNN-#xNNNN], [^a-zA-Z], "string", a b, a | b, a - b, a?, a+, a*
     One example:
         Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
     Notes on writing XML (eight points, including escaping < as &lt; and & as &amp;):
         <?xml version="1.0" encoding="UTF-8"?>
         <!DOCTYPE greeting [
           <!ELEMENT greeting (#PCDATA)>
         ]>
         <greeting time="morning">Hello, world!</greeting>
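The well-formedness and escaping notes above can be tried out with a short script. This is a minimal sketch using only Python's standard library. The helper name `is_well_formed` and the malformed sample strings are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(text):
    """Return True if `text` (str or bytes) parses as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The greeting document from the slide, passed as bytes so that the
# encoding declaration is honoured by the parser.
greeting = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<!DOCTYPE greeting [ <!ELEMENT greeting (#PCDATA)> ]>'
    b'<greeting time="morning">Hello, world!</greeting>'
)

# Character data must escape "<" and "&"; escape() does that for us.
raw_text = "1 < 2 & 3"
safe_text = escape(raw_text)

print(is_well_formed(greeting))                          # the slide's example
print(is_well_formed("<a>" + raw_text + "</a>"))         # unescaped "<" and "&"
print(is_well_formed("<a>" + safe_text + "</a>"))        # escaped version
```

Note that this only checks well-formedness. The standard-library parser does not validate against the DTD, so the undeclared `time` attribute is not reported.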

  8. XML Example
         <?xml version="1.0" standalone="yes"?>
         <BIB>
           <BOOK nickname="Dragon book">
             <AUTHOR id="aho"> Aho, A. V. </AUTHOR>
             <AUTHOR id="sethi"> Sethi, R. </AUTHOR>
             <AUTHOR id="ullman"> Ullman, J. D. </AUTHOR>
             <TITLE> Compilers: Principles, Techniques, and Tools </TITLE>
             <PUBLISHER> Addison-Wesley </PUBLISHER>
             <YEAR> 1985 </YEAR>
           </BOOK>
           <BOOK>
             <AUTHOR idref="ullman"/>
             <TITLE> Principles of Database and Knowledge-Base Systems, Vol. 1 </TITLE>
           </BOOK>
           ...
         </BIB>
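To show how an application might read the bibliography above, here is a minimal sketch with Python's standard `xml.etree.ElementTree`. It lists each book with its authors and follows `idref` links back to the `AUTHOR` element carrying the matching `id`. The elided `...` part of the slide is simply left out, and the helper name `author_names` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

BIB = """\
<BIB>
  <BOOK nickname="Dragon book">
    <AUTHOR id="aho"> Aho, A. V. </AUTHOR>
    <AUTHOR id="sethi"> Sethi, R. </AUTHOR>
    <AUTHOR id="ullman"> Ullman, J. D. </AUTHOR>
    <TITLE> Compilers: Principles, Techniques, and Tools </TITLE>
    <PUBLISHER> Addison-Wesley </PUBLISHER>
    <YEAR> 1985 </YEAR>
  </BOOK>
  <BOOK>
    <AUTHOR idref="ullman"/>
    <TITLE> Principles of Database and Knowledge-Base Systems, Vol. 1 </TITLE>
  </BOOK>
</BIB>
"""

root = ET.fromstring(BIB)

# Index every AUTHOR that carries an id, so idref attributes can be resolved.
authors_by_id = {
    a.get("id"): a.text.strip() for a in root.iter("AUTHOR") if a.get("id")
}


def author_names(book):
    """Names of a BOOK's authors, following idref links where present."""
    names = []
    for author in book.findall("AUTHOR"):
        ref = author.get("idref")
        names.append(authors_by_id[ref] if ref else author.text.strip())
    return names


books = [(b.findtext("TITLE").strip(), author_names(b)) for b in root.findall("BOOK")]
for title, names in books:
    print(title, "|", "; ".join(names))
```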

  9. DTD and XML Schema (Candidate Recommendation)
         <!DOCTYPE BIB [
           <!ELEMENT BIB (BOOK+)>
           <!ELEMENT BOOK (AUTHOR+, TITLE, PUBLISHER?, YEAR?)>
           <!ATTLIST BOOK isbn CDATA #IMPLIED
                          nickname CDATA #IMPLIED>
           <!ELEMENT AUTHOR (#PCDATA)>
           <!ATTLIST AUTHOR id ID #IMPLIED
                            idref IDREF #IMPLIED>
           <!ELEMENT TITLE (#PCDATA)>
           <!ELEMENT PUBLISHER (#PCDATA)>
           <!ELEMENT YEAR (#PCDATA)>
         ]>
     Drawbacks of DTDs: non-XML syntax, weak datatypes, no namespace awareness (e.g. <!ELEMENT mybib:BIB ...>)
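Python's standard library does not validate against DTDs, but the idea behind an element content model can still be illustrated. The sketch below hand-translates two content models from the DTD above into regular expressions over the sequence of child element names. The `CONTENT_MODELS` table and `check_content` helper are assumptions made for this illustration. This is not a DTD validator, and attributes and `#PCDATA` elements are not checked.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the DTD, hand-translated to regular expressions over
# the space-terminated sequence of child element names.
CONTENT_MODELS = {
    "BIB": r"(BOOK )+",                                # (BOOK+)
    "BOOK": r"(AUTHOR )+TITLE (PUBLISHER )?(YEAR )?",  # (AUTHOR+, TITLE, PUBLISHER?, YEAR?)
}


def check_content(root):
    """Return the tags of all elements whose children violate their content model."""
    errors = []
    for elem in root.iter():
        model = CONTENT_MODELS.get(elem.tag)
        if model is None:
            continue  # elements without a listed model are skipped in this sketch
        children = "".join(child.tag + " " for child in elem)
        if not re.fullmatch(model, children):
            errors.append(elem.tag)
    return errors


good = ET.fromstring("<BIB><BOOK><AUTHOR/><AUTHOR/><TITLE/><YEAR/></BOOK></BIB>")
bad = ET.fromstring("<BIB><BOOK><TITLE/><AUTHOR/></BOOK></BIB>")  # AUTHOR after TITLE

print(check_content(good))
print(check_content(bad))
```

The translation works because a DTD content model is itself a regular expression over element names: `+`, `?`, and `,` map directly onto repetition, optionality, and concatenation.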

  10. Schema
          <xsd:schema xmlns:xsd="http://www.w3.org/1999/XMLSchema">
            <xsd:element name="BOOK" type="BOOK_TYPE"/>
            <xsd:complexType name="BOOK_TYPE">
              <xsd:element name="AUTHOR" type="xsd:string" minOccurs="1" maxOccurs="unbounded"/>
              <xsd:element name="TITLE" type="xsd:string"/>
              <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0" maxOccurs="1"/>
              <xsd:element name="YEAR" type="xsd:decimal" minOccurs="0" maxOccurs="1"/>
              <xsd:attribute name="isbn" use="optional" type="xsd:string"/>
              <xsd:attribute name="nickname" use="optional" type="xsd:string"/>
            </xsd:complexType>
          </xsd:schema>

  11. Other Schema Languages
      • XDR (first XML-Data, then XML-Data Reduced <- DCD; Microsoft)
      • SOX (Schema for Object-Oriented XML <- DTD; Commerce One)
      • DSD (AT&T)
      • DCD (Document Content Description)
      • DDML (Document Definition Markup Language)
      Facets on which they differ: XML syntax, namespaces, include, import, datatypes, attributes, elements, inheritance, uniqueness
      XML Schema is complete but complex (Candidate Recommendation)

  12. Namespace, XPath 1.0 Recommendation, XPointer, XLink
      Namespace: why is it needed? To avoid name clashes.
          Declaration: <BIB xmlns:mybib="http://www.myserver.net/">
          Also: scope, default namespace, identifier
      XPath: a location path is composed of location steps; a location step contains an axis, a node test, and predicates.
          child::AUTHOR[position()<3]/attribute::id
          Abbreviations: @, //, /, ., ..
          .//para = self::node()/descendant-or-self::node()/child::para
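The namespace and location-path ideas above can be exercised with Python's standard `xml.etree.ElementTree`, which implements only a limited subset of XPath. In this sketch the sample document (including the use of the `mybib` prefix on `BIB` and `BOOK`, and the `section`/`para` elements) is an assumption made for illustration. Because ElementTree does not support `position()<3`, that predicate is emulated with a Python slice.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""\
<mybib:BIB xmlns:mybib="http://www.myserver.net/">
  <mybib:BOOK>
    <AUTHOR id="aho"/><AUTHOR id="sethi"/><AUTHOR id="ullman"/>
    <section><para>one</para><div><para>two</para></div></section>
  </mybib:BOOK>
</mybib:BIB>
""")

# A prefix is only shorthand: the parser expands it to the namespace URI.
print(doc.tag)  # {http://www.myserver.net/}BIB

# Prefix-to-URI mapping used when writing paths in this script.
ns = {"mybib": "http://www.myserver.net/"}
book = doc.find("mybib:BOOK", ns)

# child::AUTHOR[position()<3]/attribute::id, with the predicate done by slicing.
first_two_ids = [a.get("id") for a in book.findall("AUTHOR")[:2]]

# .//para selects para elements at any depth below the context node.
paras = [p.text for p in book.findall(".//para")]

print(first_two_ids, paras)
```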

  13. Namespace, XPath 1.0 Recommendation, XPointer, XLink
      XPointer: extends XPath with scoped locations, string matching, and URIs (URN, URL)
          <a xml:link="simple" href="#xpointer(id('foo'))"/>
          xpointer(id("sec2.1")/descendant::P[last()] to id("sec2.2")/descendant::P[last()])
      XLink types: simple, locator, arc, extended, group
          <AUTHOR xmlns:xlink="http://www.w3.org/1999/xlink"
                  xlink:type="simple"
                  xlink:href="http://www-cs-faculty.stanford.edu/~knuth/"
                  xlink:role="don_knuth_homepage"
                  xlink:show="embed"
                  xlink:actuate="onLoad">
            Donald Knuth
          </AUTHOR>

  14. XSLT 1.0 Recommendation
      An XSL stylesheet is a well-formed XML file containing a set of templates. A template is composed of a pattern (XPath) and directives.
          <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
            <xsl:template match="/">
              <HTML><xsl:apply-templates/></HTML>
            </xsl:template>
            <xsl:template match="BIB">
              <UL><xsl:apply-templates/></UL>
            </xsl:template>
            <xsl:template match="BOOK">
              <LI><xsl:apply-templates/></LI>
            </xsl:template>
            <xsl:template match="AUTHOR">
              <xsl:value-of select="."/>
            </xsl:template>
            <xsl:template match="TITLE">
              <EM><xsl:value-of select="."/></EM>
            </xsl:template>
          </xsl:stylesheet>
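Python's standard library has no XSLT processor, so the sketch below imitates the pattern-plus-directive idea of the stylesheet above in plain Python rather than running it. Each entry in the hypothetical `TEMPLATES` table plays the role of one `xsl:template`, and elements without an entry fall back to processing their children, roughly like XSLT's built-in rules. The shortened sample document is an assumption, and the sketch makes no claim to cover XSLT semantics in general.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def apply_templates(elem):
    """Process the text and child elements of `elem` in document order."""
    out = [escape(elem.text or "")]
    for child in elem:
        out.append(transform(child))
        out.append(escape(child.tail or ""))
    return "".join(out)


def value_of(elem):
    """String value of an element: all descendant text, concatenated."""
    return escape("".join(elem.itertext()))


# One entry per <xsl:template match="...">; keys are element names.
TEMPLATES = {
    "BIB": lambda e: "<UL>" + apply_templates(e) + "</UL>",
    "BOOK": lambda e: "<LI>" + apply_templates(e) + "</LI>",
    "AUTHOR": value_of,
    "TITLE": lambda e: "<EM>" + value_of(e) + "</EM>",
}


def transform(elem):
    """Apply the matching template, or fall back to recursing into children."""
    template = TEMPLATES.get(elem.tag, apply_templates)
    return template(elem)


def transform_document(root):
    """Counterpart of the match="/" template: wrap the result in <HTML>."""
    return "<HTML>" + transform(root) + "</HTML>"


bib = ET.fromstring(
    "<BIB><BOOK><AUTHOR>Aho, A. V.</AUTHOR>"
    "<TITLE>Compilers</TITLE><YEAR>1985</YEAR></BOOK></BIB>"
)
html = transform_document(bib)
print(html)
```

`YEAR` has no template of its own, so its text is copied through unchanged, which is also what the stylesheet's built-in rules would do.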

  15. XML as a Data Representation Standard and Data Model
      • Why do we need a model of XML? (design, programming, and implementation)
      • Conceptual model (modeling tools)
      • Three facets (data structures, operators, constraints)
      • Architecture (data model, operation algebra, syntax)
      • Data models: network, hierarchical, relational, object-oriented, OEM, XML (advantages and disadvantages)
      • Distinctions between them (navigational, structural)
      • ER -> Relation -> Relational algebra -> SQL
      • UML -> Object-Oriented -> Object algebra -> OQL
      • ERX -> XML -> XML algebra -> XQL

  16. Relational Data Model
      • Data structure: relation, key
      • Operations: theta select, project, theta join, divide, union, intersection, set difference; extended to bags
      • Constraints: entity integrity, referential integrity, user-defined
      • Relational algebra and relational calculus
      • SQL

  17. SQL
      • SELECT <attribute list>
      • FROM <relation list>
      • WHERE <condition>
      • GROUP BY <attribute list>
      • HAVING <condition>
      • ORDER BY <attribute list>
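The clause skeleton above can be seen in action with Python's built-in `sqlite3` module. This is a small sketch. The `book` table and its rows are made-up sample data, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (title TEXT, publisher TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [
        ("Book A", "Pub X", 1985),
        ("Book B", "Pub X", 1988),
        ("Book C", "Pub Y", 1990),
        ("Book D", "Pub X", 1979),
        ("Book E", "Pub Z", 1995),
        ("Book F", "Pub Z", 1999),
        ("Book G", "Pub Z", 2000),
    ],
)

# Publishers with at least two books from 1980 onwards, most prolific first.
rows = conn.execute(
    """
    SELECT publisher, COUNT(*) AS n   -- attribute list
    FROM book                         -- relation list
    WHERE year >= 1980                -- row-level condition
    GROUP BY publisher                -- grouping attributes
    HAVING COUNT(*) >= 2              -- group-level condition
    ORDER BY n DESC                   -- sort order
    """
).fetchall()

print(rows)
```

`WHERE` filters rows before grouping, while `HAVING` filters the groups afterwards, which is why "Pub Y" (one qualifying book) is dropped only at the `HAVING` step.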

  18. Object-Oriented Data Model
      • Data structure: class, object
      • Operations
          Class (method, property, inheritance…): define, create, access, modify, destroy
          Object (property…): create, access, update, delete, query
      • Constraints
          Uniqueness (OID, attribute name, method name)
          Existence (method implementation…)

  19. One Example

  20. XML Data Model
      • XML is commonly modeled as an edge-labeled or node-labeled directed graph.
      • Node-labeled directed graph

  21. Edge-labeled directed graph

  22. XML Data Model (XML InfoSet)
      • Data structure: Document, Elements, Attributes, Namespaces, Processing Instructions, Comments, Values
      • Operations (functional notation): general constructors and accessors; each kind of node has its own operations.
          A document has:
              uri      : DocNode -> URIRefValue
              children : DocNode -> [Ref(ElemNode) | Ref(PINode) | Ref(CommentNode)]
          An attribute has:
              name  : AttrNode -> Ref(QNameValue)
              value : AttrNode -> Ref(ValueNode)
      • Constraints: ID, IDREF, IDREFS

  23. Example:
          <?xml version="1.0"?>
          <p:part xmlns:p="http://www.mywebsite.com/PartSchema"
                  xsi:schemaLocation="http://www.mywebsite.com/PartSchema http://www.mywebsite.com/PartSchema"
                  name="nutbolt">
            <mfg>Acme</mfg>
            <price>10.50</price>
          </p:part>

  24. Data model instance for the example:
          children(D1)   = [ E1 ]
          root(D1)       = E1
          name(E1)       = QNameValue("http://www.mywebsite.com/PartSchema", "part", Ref(Def_QName))
          children(E1)   = [ E2, E3 ]
          attributes(E1) = { A1 }
          namespaces(E1) = { N1 }
          type(E1)       = Ref(Def_part_type)
          parent(E1)     = D1
          name(A1)       = QNameValue(null, "name", Ref(Def_QName))
          value(A1)      = StringValue("nutbolt", Ref(Def_string))
          parent(A1)     = E1
          prefix(N1)     = StringValue("p", Ref(Def_string))
          uri(N1)        = URIRefValue("http://www.mywebsite.com/PartSchema", Ref(Def_uriReference))
          parent(N1)     = E1
          name(E2)       = QNameValue(null, "mfg", Ref(Def_QName))
          children(E2)   = [ StringValue("Acme", Ref(Def_string)) ]
          attributes(E2) = {}
          namespaces(E2) = {}
          type(E2)       = Ref(Def_string)
          parent(E2)     = E1
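The accessor functions listed above can be approximated on top of Python's standard `xml.etree.ElementTree`. This is only a rough sketch. The function names mirror the slide, but their Python signatures and return values are assumptions. The `xsi:schemaLocation` attribute is left out because its prefix is not declared in the example, and ElementTree has no document node, so `parent` of the root is `None` here rather than D1. Type references such as `Ref(Def_string)` are not modeled.

```python
import xml.etree.ElementTree as ET

PART = """\
<p:part xmlns:p="http://www.mywebsite.com/PartSchema" name="nutbolt">
  <mfg>Acme</mfg>
  <price>10.50</price>
</p:part>
"""

root = ET.fromstring(PART)

# ElementTree keeps no parent pointers, so build the parent() lookup once.
PARENT = {child: node for node in root.iter() for child in node}


def name(node):
    """Qualified name as a (namespace URI, local name) pair."""
    if node.tag.startswith("{"):
        uri, local = node.tag[1:].split("}", 1)
        return (uri, local)
    return (None, node.tag)


def children(node):
    """Child elements, or the single text value for a leaf element."""
    kids = list(node)
    return kids if kids else [node.text]


def attributes(node):
    """Attributes as a name -> value mapping."""
    return dict(node.attrib)


def parent(node):
    """Parent element, or None for the root element."""
    return PARENT.get(node)


e1 = root
e2, e3 = children(e1)
print(name(e1), attributes(e1))
print(name(e2), children(e2), parent(e2) is e1)
```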

  25. Master Fundamentals
      • Hierarchy: parent/child, ancestor/descendant
      • Sequence: immediately precedes, precedes
      • Position: absolute, relative, ranges

  26. Transformations among these models
      • Between XML and relational
      • Between XML and object-oriented
      • Between relational and object-oriented
      • Between hierarchical and relational
      • Between network and relational
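As one illustration of a transformation between XML and the relational model, the sketch below stores `BOOK` elements as rows in an SQLite table and then rebuilds the XML from those rows. The table layout (one row per book, `NULL` for a missing optional element) is just one possible mapping chosen for this example, and the sample data is made up.

```python
import sqlite3
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    "<BIB>"
    "<BOOK><TITLE>Compilers</TITLE><YEAR>1985</YEAR></BOOK>"
    "<BOOK><TITLE>Database Systems</TITLE></BOOK>"
    "</BIB>"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT NOT NULL, year INTEGER)"
)

# XML -> relation: one row per BOOK; a missing optional element becomes NULL.
for book in bib.findall("BOOK"):
    year = book.findtext("YEAR")
    conn.execute(
        "INSERT INTO book (title, year) VALUES (?, ?)",
        (book.findtext("TITLE"), int(year) if year is not None else None),
    )

# Relation -> XML: rebuild elements from rows, omitting NULL columns.
out = ET.Element("BIB")
for title, year in conn.execute("SELECT title, year FROM book ORDER BY id"):
    b = ET.SubElement(out, "BOOK")
    ET.SubElement(b, "TITLE").text = title
    if year is not None:
        ET.SubElement(b, "YEAR").text = str(year)

round_trip = ET.tostring(out, encoding="unicode")
print(round_trip)
```

The auto-incremented `id` column is what preserves document order, which the relational model does not provide by itself.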

  27. XML Query
      • Why do we need XML query? (views, integration)
      • Query operations: union, intersection, difference, join, projection, selection, sorting, aggregation (XML Query Algebra, language, and use cases)
      • Nine features necessary for an XML query language:
          1. Clean semantics (select expr from path-expr where cond / for path-expr where cond return result-set)
          2. Path expressions
          3. Returns an XML document
          4. Queries and returns XML elements and attributes

  28. XML Query (continued)
          5. Type coercion (semi-structured data)
          6. Handles unexpected data (inexact matches)
          7. Queries XML without a schema or DTD (wildcards)
          8. Returns trees
          9. Preserves order
      • Five popular XML query languages: Lorel, XML-QL (AT&T), XML-GL (Politecnico di Milano), XSL, XQL
      • Who wins? XQuery (the W3C effort that combines them)
      • Views (maintenance, as in search engines)
      • Update language (management)
      • Triggers -> active views (e-commerce: actor, view, rule, notify)
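Several of the features listed above (path expressions, returning XML, type coercion, tolerating unexpected data, working without a schema, preserving order) can be imitated in a few lines of Python. The sketch below is not any of the named query languages. The `query` and `as_int` helpers and the sample data are assumptions that loosely follow the "for path-expr where cond return result" shape.

```python
import copy
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    "<BIB>"
    "<BOOK><AUTHOR>Aho</AUTHOR><TITLE>Compilers</TITLE><YEAR>1985</YEAR></BOOK>"
    "<BOOK><AUTHOR>Ullman</AUTHOR><TITLE>Database Systems</TITLE></BOOK>"
    "<BOOK><TITLE>Newer Book</TITLE><YEAR>not known</YEAR></BOOK>"
    "</BIB>"
)


def as_int(text):
    """Type coercion that tolerates missing or irregular values (returns None)."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


def query(root, path, where, result_tag="RESULT"):
    """For each node matched by `path`, keep it if `where(node)` holds."""
    result = ET.Element(result_tag)
    for node in root.findall(path):          # path expression, document order
        if where(node):
            result.append(copy.deepcopy(node))  # return XML, not flat tuples
    return result


def published_before_1990(book):
    year = as_int(book.findtext("YEAR"))     # works with no schema or DTD
    return year is not None and year < 1990


old_books = query(bib, ".//BOOK", published_before_1990)
print(ET.tostring(old_books, encoding="unicode"))
```

Books with a missing or non-numeric `YEAR` are skipped rather than causing an error, which is the behaviour features 5 and 6 ask for.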

  29. Comparison table for XML query languages

  30. XML as a Data Interchange Standard and Information Integration
      • Why XML? Because it is self-describing and flexible
      • Integration levels
          Same data model (relational data integration)
          Different data models (relational, object-oriented, text, HTML)
      • Integration methods: federated databases, warehousing (combiner/extractor), mediation (mediator/wrapper)
      • Semantic model
      • Example (OEM Plus Browse)
      • XSD (merge/separate, proximity search equivalent to an index)

  31. Warehousing
      [Diagram: datasource1 and datasource2 -> extractors -> combiner -> warehouse]

  32. Mediator-Wrapper
      [Diagram: Mediator -> Wrappers -> datasource1, datasource2]
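The mediator-wrapper arrangement can be sketched in a few lines: each wrapper translates one source's native format into a common XML shape, and the mediator presents the wrapped sources as a single view. Everything here (the CSV and record formats, the field names, and the functions `wrap_csv`, `wrap_records`, and `mediate`) is a hypothetical illustration, not a description of any particular system.

```python
import csv
import io
import xml.etree.ElementTree as ET


def wrap_csv(text):
    """Wrapper for a CSV source: one <BOOK> per row."""
    books = []
    for row in csv.DictReader(io.StringIO(text)):
        book = ET.Element("BOOK")
        ET.SubElement(book, "TITLE").text = row["title"]
        ET.SubElement(book, "YEAR").text = row["year"]
        books.append(book)
    return books


def wrap_records(records):
    """Wrapper for an in-memory source that uses different field names."""
    books = []
    for rec in records:
        book = ET.Element("BOOK")
        ET.SubElement(book, "TITLE").text = rec["name"]
        if rec.get("published") is not None:
            ET.SubElement(book, "YEAR").text = str(rec["published"])
        books.append(book)
    return books


def mediate(*wrapped_sources):
    """Mediator: present all wrapped sources as one integrated <BIB> view."""
    bib = ET.Element("BIB")
    for books in wrapped_sources:
        bib.extend(books)
    return bib


datasource1 = "title,year\nCompilers,1985\n"
datasource2 = [{"name": "Database Systems", "published": None}]

view = mediate(wrap_csv(datasource1), wrap_records(datasource2))
print(ET.tostring(view, encoding="unicode"))
```

Unlike warehousing, nothing is copied into a central store ahead of time: the view is assembled from the sources when it is requested.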

  33. Repository and XML Application Server
      • Goal: fast access
      • Storage methods
          File/DOM (drawbacks: 1. the file is parsed for every browse or query; 2. too much memory is required; 3. indexing; 4. updates)
          RDBMS/SQL
          OODBMS/OQL
          XML-enabled (SQL Server, Oracle8i/Servlet/XSQL)
          Native XML (Software AG Tamino; its indexing is very good)
          Why do Oracle and Microsoft build a native XML server on their mature relational database technology?
      • Flexible storage
      • Distributed XML storage systems
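The memory drawback of the File/DOM approach can be softened by streaming the file instead of loading the whole tree. The sketch below uses `iterparse` from Python's standard library and discards each `BOOK` subtree once it has been examined. The function name and sample data are assumptions, and this does nothing about the other drawbacks, such as re-parsing on every query or the lack of an index.

```python
import io
import xml.etree.ElementTree as ET


def count_books_streaming(source, min_year):
    """Count BOOK elements with YEAR >= min_year while parsing incrementally."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "BOOK":
            year = elem.findtext("YEAR")
            if year is not None and int(year) >= min_year:
                count += 1
            elem.clear()  # drop the children and text of the processed BOOK
    return count


# A file-like object stands in for a large XML file on disk.
data = io.BytesIO(
    b"<BIB>"
    b"<BOOK><TITLE>Compilers</TITLE><YEAR>1985</YEAR></BOOK>"
    b"<BOOK><TITLE>Database Systems</TITLE><YEAR>1988</YEAR></BOOK>"
    b"<BOOK><TITLE>Untitled</TITLE></BOOK>"
    b"</BIB>"
)

recent = count_books_streaming(data, 1986)
print(recent)
```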

  34. XML Server Manager
      • Goals: manageability, availability
      • GUI manager
      • Schema manager
      • Data browsing and maintenance
      • Splitting oversized files into smaller ones
      • Import and export
      • Backup
      • Recovery
      • Integrity maintenance
      • And so on

  35. Summary
      • Simple, easy programming interface
      • Efficient storage (relational query optimization)
      • Manageability
      • Availability

  36. What can we learn from RDBMS/OODBMS technology?
      • Storage manager
      • Buffer manager
      • Indexing
      • Transaction manager?
      • Concurrency?
      • Query optimization

  37. Database management system components

  38. Current developments in XML
      • Application-to-Application / Object Serialisation
      • Conversion Tools
      • Database Systems
      • Document / Content Management Systems
      • DTD/Schema Editors/Tools
      • Publishing Systems
      • Search engines
      • Utilities/Tools/APIs
      • XLink/XPointer Tools
      • XML Browsers
      • XML Editors
      • XML Parsers/Processors
      • XPath utilities
      • XSL formatters
      • XSLT editors
      • XSLT engines
      • XSLT utilities

  39. Reading on XML
      • General reading
          The first stop is http://www.w3c.org/xml (up to date).
          The first book is the XML Bible (somewhat dated).
      • Research and development reading, tools
          DBLP Bibliography: http://www.informatik.uni-trier.de/~ley/db/index.html
          http://www.acm.org/sigmod/
          http://www.xmlsoftware.com/
          http://www.acm.org/sigmod/databaseSoftware/index.html
          http://www.w3.org/Status
          http://xml.apache.org
      • Industrial reading: Microsoft, Oracle, Sun, IBM

  40. Thank you, everyone! Questions and discussion
