
XML Basics


dnassar

Presentation Transcript


  1. XML Basics Overview IT380

  2. What is XML? • Extensible Markup Language • A syntax for documents • A Meta-Markup Language • A Structural and Semantic language, not a formatting language • Not just for Web pages

  3. XML is a Meta Markup Language • Not like HTML, troff, LaTeX • Make up the tags you need as you need them • The tags you create can be documented in a Document Type Definition (DTD) • A meta syntax for domain-specific markup languages like MusicML, MathML, and CML

  4. XML describes structure and semantics, not formatting • XML documents form a tree • Element and attribute names reflect the kind of the element • Formatting can be added with a style sheet

  5. A Song Description in HTML <dt>Hot Cop <dd> by Jacques Morali, Henri Belolo, and Victor Willis <ul> <li>Producer: Jacques Morali <li>Publisher: PolyGram Records <li>Length: 6:20 <li>Written: 1978 <li>Artist: Village People </ul>

  6. A Song Description in XML <SONG> <TITLE>Hot Cop</TITLE> <COMPOSER>Jacques Morali</COMPOSER> <COMPOSER>Henri Belolo</COMPOSER> <COMPOSER>Victor Willis</COMPOSER> <PRODUCER>Jacques Morali</PRODUCER> <PUBLISHER>PolyGram Records</PUBLISHER> <LENGTH>6:20</LENGTH> <YEAR>1978</YEAR> <ARTIST>Village People</ARTIST> </SONG>
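
As a rough illustration of why the XML version is easier to work with than the HTML one, the SONG document can be loaded into a tree with Python's standard xml.etree.ElementTree module and queried by element name. The variable names are illustrative and not part of the slides.

```python
import xml.etree.ElementTree as ET

# The song description from the slide, as a string.
SONG_XML = """<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>"""

# Parsing yields a tree whose root is the SONG element.
song = ET.fromstring(SONG_XML)

# Element names say what each value means, so data is selected
# by name rather than by its position in a list.
title = song.findtext("TITLE")
composers = [c.text for c in song.findall("COMPOSER")]

print(title)
print(composers)
```

In the HTML version the same facts are only reachable by guessing at list positions and label text.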

  7. Style Sheets provide formatting SONG {display: block} TITLE {display: block; font-family: Helvetica, serif; font-size: 20pt; font-weight: bold} COMPOSER {display: block; font-family: Times, Times New Roman, serif; font-size: 14pt; font-style: italic} ARTIST {display: block; font-family: Times, Times New Roman, serif; font-size: 14pt; font-weight: bold; font-style: italic} PUBLISHER {display: block; font-size: 14pt; font-family: Times, Times New Roman, serif} LENGTH {display: block; font-family: Times, Times New Roman, serif; font-size: 14pt} YEAR {display: block; font-family: Times, Times New Roman, serif; font-size: 14pt}

  8. Attaching style sheets to documents • Processing Instruction • <?xml-stylesheet type="text/css" href="song.css"?> • Converter Program

  9. What is XML used for? • Domain-Specific Markup Languages • Self-Describing Data • Interchange of Data Among Applications • Structured and Integrated Data

  10. Domain-Specific Markup Languages • Non proprietary format • Don’t pay for what you don’t use

  11. Self-Describing Data • Much data is lost due to format problems • XML is very simple • XML is self-describing • XML is well documented

  12. <PERSON ID="p1100" SEX="M"> <NAME> <GIVEN>Judson</GIVEN> <SURNAME>McDaniel</SURNAME> </NAME> <BIRTH> <DATE>21 Feb 1834</DATE> </BIRTH> <DEATH> <DATE>9 Dec 1905</DATE> </DEATH> </PERSON>
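
To show what "self-describing" buys in practice, here is a small sketch that reads the PERSON record with Python's standard xml.etree.ElementTree module. Attribute and element names come from the slide; the Python variable names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The PERSON record from the slide.
PERSON_XML = """<PERSON ID="p1100" SEX="M">
  <NAME>
    <GIVEN>Judson</GIVEN>
    <SURNAME>McDaniel</SURNAME>
  </NAME>
  <BIRTH><DATE>21 Feb 1834</DATE></BIRTH>
  <DEATH><DATE>9 Dec 1905</DATE></DEATH>
</PERSON>"""

person = ET.fromstring(PERSON_XML)

# Attributes are read with get(); nested elements with a simple path.
person_id = person.get("ID")
full_name = " ".join(
    [person.findtext("NAME/GIVEN"), person.findtext("NAME/SURNAME")]
)
born = person.findtext("BIRTH/DATE")
died = person.findtext("DEATH/DATE")

print(person_id, full_name, born, died)
```

No separate format description is needed to tell the birth date from the death date; the enclosing element names carry that information.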

  13. Interchange of Data Among Applications • E-commerce • Syndication

  14. Structured and Integrated Data • Can specify relationships between elements • Can assemble data from multiple sources

  15. XML Applications • A specific markup language that uses the XML meta-syntax is called an XML application • Different XML applications have their own more constricted syntaxes and vocabularies within the broader XML syntax • Further syntax can be layered on top of this; e.g. data typing through DCDs or other schemas

  16. Example XML Applications • Web Pages • Mathematical Equations • Music Notation • Vector Graphics • Metadata • and more…

  17. Mathematical Markup Language

  18. Channel Definition Format • <?xml version="1.0"?> • <CHANNEL HREF="http://metalab.unc.edu/xml/index.html"> • <TITLE>Cafe con Leche</TITLE> • <ITEM HREF="http://metalab.unc.edu/xml/books.html"> • <TITLE>Books about XML</TITLE> • </ITEM> <ITEM HREF="http://metalab.unc.edu/xml/tradeshows.html"> • <TITLE>Trade shows and conferences about XML</TITLE> • </ITEM> <ITEM HREF="http://metalab.unc.edu/xml/lists.htm"> • <TITLE>Mailing Lists dedicated to XML</TITLE> • </ITEM></CHANNEL>
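
A consumer of a channel like this mostly needs the list of items and their addresses. The following sketch, using Python's standard xml.etree.ElementTree module, pulls each ITEM's HREF attribute and TITLE out of the slide's document; it is an illustration only, not how any particular CDF reader works.

```python
import xml.etree.ElementTree as ET

# The channel from the slide, laid out one element per line.
CDF_XML = """<?xml version="1.0"?>
<CHANNEL HREF="http://metalab.unc.edu/xml/index.html">
  <TITLE>Cafe con Leche</TITLE>
  <ITEM HREF="http://metalab.unc.edu/xml/books.html">
    <TITLE>Books about XML</TITLE>
  </ITEM>
  <ITEM HREF="http://metalab.unc.edu/xml/tradeshows.html">
    <TITLE>Trade shows and conferences about XML</TITLE>
  </ITEM>
  <ITEM HREF="http://metalab.unc.edu/xml/lists.htm">
    <TITLE>Mailing Lists dedicated to XML</TITLE>
  </ITEM>
</CHANNEL>"""

channel = ET.fromstring(CDF_XML)

# findtext("TITLE") only looks at direct children, so this is the
# channel's own title and not the title of one of its items.
channel_title = channel.findtext("TITLE")

# One (address, title) pair per ITEM child.
items = [(item.get("HREF"), item.findtext("TITLE"))
         for item in channel.findall("ITEM")]

for href, title in items:
    print(title, href)
```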

  19. Classic Literature • The Complete Plays of Shakespeare • The Bible • The Koran • The Book of Mormon

  20. Vector Graphics • Vector Markup Language (VML) • Internet Explorer 5.0 • Microsoft Office 2000 • Scalable Vector Graphics (SVG)

  21. The Resource Description Framework (RDF) • Meta-data • Dublin Core • Better Web searching

  22. An Example of RDF <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/DC/"> <rdf:Description about="http://metalab.unc.edu/xml/"> <dc:CREATOR>Elliotte Rusty Harold</dc:CREATOR> <dc:TITLE>Cafe con Leche</dc:TITLE> </rdf:Description> </rdf:RDF>
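
The rdf: and dc: prefixes are namespace prefixes, so a program has to look elements up by namespace URI rather than by the literal prefix. The sketch below uses Python's standard xml.etree.ElementTree module; the string is the slide's example with every attribute value closed by a quotation mark, which well-formed XML requires. The NS dictionary and variable names are assumptions for the example.

```python
import xml.etree.ElementTree as ET

RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/DC/">
  <rdf:Description about="http://metalab.unc.edu/xml/">
    <dc:CREATOR>Elliotte Rusty Harold</dc:CREATOR>
    <dc:TITLE>Cafe con Leche</dc:TITLE>
  </rdf:Description>
</rdf:RDF>"""

# Prefix-to-URI map used only for lookups on the Python side; the
# prefixes here do not have to match the ones in the document.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/DC/",
}

root = ET.fromstring(RDF_XML)
description = root.find("rdf:Description", NS)

# The about attribute carries no prefix, so it is read by its plain name.
about = description.get("about")
creator = description.findtext("dc:CREATOR", namespaces=NS)
title = description.findtext("dc:TITLE", namespaces=NS)

print(about, creator, title)
```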

  23. XML for XML • XSL: The Extensible Stylesheet Language • DCD: The Document Content Description Schema Language • XLL: The Extensible Linking Language

  24. XSL: The Extensible Stylesheet Language • XSL Transformations • XSL Formatting Objects

  25. DCD: The Document Content Description Schema Language • Data Typing in XML is Weak • <MONTH>9</MONTH> • <DCD> • <ElementDef Type="MONTH" • Model="Data" Datatype="i1" • Min="1" Max="12" /> • </DCD>
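
Without a schema, the parser sees the content of MONTH as plain text, so any range check has to be written by hand in the application. The sketch below shows roughly the constraint the ElementDef expresses (an integer from 1 to 12). It is not a DCD processor, and check_month is a hypothetical helper invented for this example.

```python
import xml.etree.ElementTree as ET


def check_month(xml_text, minimum=1, maximum=12):
    """Return True if xml_text is a MONTH element holding an integer
    between minimum and maximum inclusive."""
    element = ET.fromstring(xml_text)
    if element.tag != "MONTH":
        return False
    try:
        # element.text is None for an empty element, hence TypeError.
        value = int(element.text)
    except (TypeError, ValueError):
        return False
    return minimum <= value <= maximum


print(check_month("<MONTH>9</MONTH>"))
print(check_month("<MONTH>13</MONTH>"))
```

A schema language moves this kind of check out of each application and into a shared, declarative description of the document.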

  26. XLL: The Extensible Linking Language • Any element can be a link • Links can be bi-directional • Links can be separated from the documents they connect <footnote xlink:form="simple" href="footnote7.xml">7</footnote>

  27. File Formats, In-house applications, and other behind the scenes uses • Microsoft Office 2000 • Federal Express Web API • Netscape What’s Related

  28. Hello XML • Plain ASCII or UTF-8 text • .xml is standard file extension • Any standard text editor will work • <?xml version="1.0" standalone="yes"?> • <FOO> • Hello XML! • </FOO>

  29. The XML Declaration <?xml version="1.0" standalone="yes"?> • version attribute • required • always has the value 1.0 • standalone attribute • yes • no • encoding attribute • UTF-8 • 8859_1 • etc.
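
One way to see what the declaration tells a parser is to read it back after parsing. The sketch below relies on the version and standalone attributes that CPython's xml.dom.minidom sets on the document object; treat that as an implementation detail of this one parser, not something every XML library offers.

```python
from xml.dom import minidom

# The declaration from the slide followed by a one-element document.
DOC = '<?xml version="1.0" standalone="yes"?><FOO>Hello XML!</FOO>'

doc = minidom.parseString(DOC)

# Values taken from the XML declaration.
version = doc.version
standalone = doc.standalone

print("version:", version)
print("standalone:", standalone)
```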

  30. The FOO element • Start tag <FOO> • Contents "Hello XML!" • End tag </FOO> • <FOO> • Hello XML! • </FOO>
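
The start tag and end tag have to match for the document to be well-formed, and a parser will reject it otherwise. A minimal sketch with Python's standard xml.etree.ElementTree module follows; is_well_formed is a hypothetical helper written for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


foo = ET.fromstring("<FOO>\nHello XML!\n</FOO>")

name = foo.tag        # element name taken from the start tag
contents = foo.text   # character data between the two tags

print(name, repr(contents))
print(is_well_formed("<FOO>Hello XML!</BAR>"))
```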

  31. greeting.xml • <?xml version="1.0" standalone="yes"?> • <GREETING> • Hello XML! • </GREETING>

  32. Style sheets • Separate from the XML document • Different Languages • Cascading Style Sheets Level 1 (CSS1) • Internet Explorer 5.0 • Mozilla 5.0 • Cascading Style Sheets Level 2 (CSS2) • Internet Explorer 5 (partial) • Mozilla 5.0 (partial) • Extensible Style Language (XSL) • Internet Explorer 5.0 (older draft, buggy) • LotusXSL, XT, Other non-browser converters • Document Style and Semantics Language (DSSSL) • Jade

  33. xml-stylesheet • Style sheets are attached via an xml-stylesheet processing instruction in the prolog <?xml version="1.0" standalone="yes"?> <?xml-stylesheet type="text/css" href="greeting.css"?> <GREETING>Hello XML!</GREETING> • type attribute has the value text/css or text/xsl • href attribute is a URL to the stylesheet, possibly relative • Can also use non-browser converters like XT, LotusXSL, and Jade
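
The processing instruction sits in the prolog, before the root element, so a program that wants to honor it has to look at the document's top-level nodes. The sketch below uses Python's standard xml.dom.minidom module for that; the variable names are illustrative.

```python
from xml.dom import minidom

# greeting.xml with its style sheet attached, as on the slide.
GREETING_XML = (
    '<?xml version="1.0" standalone="yes"?>\n'
    '<?xml-stylesheet type="text/css" href="greeting.css"?>\n'
    '<GREETING>Hello XML!</GREETING>'
)

doc = minidom.parseString(GREETING_XML)

# Processing instructions in the prolog are children of the document
# node itself, alongside the root element.
stylesheet_pis = [
    node for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    and node.target == "xml-stylesheet"
]

# The type and href "attributes" are just text inside the instruction.
pi_data = stylesheet_pis[0].data
print(pi_data)
```

Because type and href are not real attributes, the parser hands them over as one string; pulling them apart is left to the application.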

  34. greeting.css GREETING {display: block; font-size: 24pt; font-weight: bold}

  35. A larger example: Baseball statistics • Examine the data • Design a vocabulary for the data • Write a style sheet

  36. Sample statistics http://cbs.sportsline.com/u/baseball/mlb/stats.htm

  37. Organizing the Data • XML documents are trees. • XML elements contain other elements as well as text • Within these limits there's more than one way to organize the data • Hierarchically • Relationally • Objects

  38. What is the Root Element? • The League? • The Season? • A custom Document element?

  39. The Root Element • Choose SEASON for the root element • Everything else will be a descendant of SEASON • This is not the only possible choice • <?xml version="1.0"?> • <SEASON> • </SEASON>

  40. What are the Immediate Children of the Root? • Leagues? • Teams? • Players? • Games?

  41. Child Elements • <?xml version="1.0"?><SEASON> <YEAR> 1998 </YEAR></SEASON>

  42. White space in XML is not especially significant • <?xml version="1.0"?> • <SEASON><YEAR>1998</YEAR></SEASON>
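
In practice a parser hands the white space inside YEAR to the application unchanged, and it is the application that decides it does not matter. The sketch below compares the two SEASON documents from the slides using Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# The same data with and without extra white space.
spaced = ET.fromstring(
    '<?xml version="1.0"?><SEASON> <YEAR> 1998 </YEAR></SEASON>'
)
compact = ET.fromstring(
    '<?xml version="1.0"?>\n<SEASON><YEAR>1998</YEAR></SEASON>'
)

spaced_year = spaced.findtext("YEAR")
compact_year = compact.findtext("YEAR")

# The raw strings differ, but they agree once the white space is trimmed.
print(repr(spaced_year), repr(compact_year))
print(spaced_year.strip() == compact_year)
```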

  43. Leagues • Major league baseball is divided into two leagues • Each league has • a name • three divisions

  44. Divisions • Each division has • name • 4-6 teams

  45. Teams • Each team has • Name • City • Players

  46. Player Data • Each player has • First name • Last name • Position • Statistics
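
The hierarchy described on these slides (season, league, division, team, player) can be sketched by building the tree in code with Python's standard xml.etree.ElementTree module. Every element name below other than SEASON and YEAR is an assumption made for illustration, and the player's details are placeholders, not real statistics.

```python
import xml.etree.ElementTree as ET

# Root element and its YEAR child, as chosen on the earlier slides.
season = ET.Element("SEASON")
ET.SubElement(season, "YEAR").text = "1998"

# League -> division -> team, each with a name (element names assumed).
league = ET.SubElement(season, "LEAGUE")
ET.SubElement(league, "LEAGUE_NAME").text = "National League"

division = ET.SubElement(league, "DIVISION")
ET.SubElement(division, "DIVISION_NAME").text = "East"

team = ET.SubElement(division, "TEAM")
ET.SubElement(team, "TEAM_CITY").text = "Atlanta"
ET.SubElement(team, "TEAM_NAME").text = "Braves"

# One player with placeholder values for first name, last name, position.
player = ET.SubElement(team, "PLAYER")
ET.SubElement(player, "GIVEN_NAME").text = "First"
ET.SubElement(player, "SURNAME").text = "Last"
ET.SubElement(player, "POSITION").text = "Shortstop"

xml_text = ET.tostring(season, encoding="unicode")
print(xml_text)
```

Each level of the outline becomes one level of nesting, which is the hierarchical organization the earlier slide lists as one of several options.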

  47. Player Batting Statistics • G: Games Played • GS: Games Started • AB: At Bats • R: Runs • H: Hits • 2B: Doubles • 3B: Triples • HR: Home Runs • RBI: Runs Batted In • SB: Stolen Bases • CS: Caught Stealing • SH: Sacrifice Hits • SF: Sacrifice Flies • Err: Errors • PB: Pitcher Balked • BB: Base on Balls (Walks) • SO: Strike Outs • HBP: Hit By Pitch

  48. What does a player look like? • Long names vs. short names
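
The trade-off is between element names that explain themselves and names that keep the file small. The sketch below converts a short-name player record into a long-name one with Python's standard xml.etree.ElementTree module. The short and long element names are hypothetical, loosely based on the batting abbreviations, and the numbers are placeholders rather than real statistics.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from short element names to long ones.
LONG_NAMES = {
    "P": "PLAYER",
    "G": "GAMES_PLAYED",
    "AB": "AT_BATS",
    "H": "HITS",
    "HR": "HOME_RUNS",
}

# A short-name record with placeholder numbers.
SHORT_PLAYER = "<P><G>10</G><AB>30</AB><H>9</H><HR>1</HR></P>"


def expand_names(element):
    """Return a copy of element with short tag names replaced by long ones.
    Tags missing from LONG_NAMES are kept unchanged."""
    expanded = ET.Element(LONG_NAMES.get(element.tag, element.tag),
                          element.attrib)
    expanded.text = element.text
    expanded.tail = element.tail
    for child in element:
        expanded.append(expand_names(child))
    return expanded


long_player = expand_names(ET.fromstring(SHORT_PLAYER))
long_xml = ET.tostring(long_player, encoding="unicode")

print(long_xml)
print(len(SHORT_PLAYER), len(long_xml))
```

The data is identical in both forms; only readability and size change, which matters once the record is repeated for every player in a season.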

  49. The Complete 1998 Major League • Long version

  50. A Style Sheet • 1998shortstats.xml • baseballstats.css • <?xml-stylesheet type="text/css" href="baseballstats.css"?> • styled1998shortstats.xml
