XML and SAX (A quick overview)

crohde

Presentation Transcript


  1. XML and SAX (A quick overview)
     • What is XML?
     • What are SAX and DOM?
     • Using SAX

  2. What is XML?
     • Text files built from text content marked up with text tags
     • The tags provide meaning
     • Although it is similar to HTML, all tags in XML must be well formed
     • Start tags must be balanced by end tags, or the tag must close itself:
         <movie> ... </movie>
         <LI> ... </LI>
         <command text="hello"/>
     • Unlike in HTML, tag names are case sensitive in XML: <LI> and <li> are different tags
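The well-formedness rules above can be checked mechanically by handing a document to any XML parser. The following is a minimal sketch, assuming the parser bundled with the JDK (obtained through JAXP's SAXParserFactory); the class name WellFormedCheck and the sample strings are made up for illustration. The parser reports a fatal error as a SAXException when a start tag has no matching end tag or when the case of the two tags differs.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a string is well-formed XML.
class WellFormedCheck {
    static boolean isWellFormed(String xml)
            throws ParserConfigurationException, IOException {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // DefaultHandler ignores all content; we only care whether parsing succeeds.
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Well-formedness violations surface as a SAXException.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormed("<movie>Brazil</movie>"));    // balanced tags
        System.out.println(isWellFormed("<command text=\"hello\"/>")); // self-closing tag
        System.out.println(isWellFormed("<movie>Brazil</Movie>"));    // case mismatch
        System.out.println(isWellFormed("<LI>item"));                 // missing end tag
    }
}
```

The first two documents should be accepted and the last two rejected, since `</Movie>` does not close `<movie>` and `<LI>` is never closed.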

  3. XML Documents
     • An XML document contains exactly one root element:
         <Person>
           <Name>Zippy the Pinhead</Name>
           <Profession>Politician</Profession>
           <Profession>Clown</Profession>
         </Person>
     • The tags within an XML document form a tree structure.
     • All tags must be contained within the root tags.

  4. Tags
     • Tags can have attributes:
         <Person name="Jimbo" age="97" accountNumber="12345"/>
     • The tags within an XML document form a tree structure.
     • All tags must be contained within the root tags.

  5. DTDs and Schemas
     • An XML document can be checked for "validity".
     • What makes an XML document "valid" can be defined within:
         • DTD: Document Type Definition
             • Defines which tags must be present and where they may appear
             • Convoluted syntax
         • Schema
             • Similar to DTDs, except that schemas are themselves written in XML
     • DTDs aren't used very much anymore due to the overly complex nature of their syntax
     • DTDs and Schemas are NOT required in order to use XML. They are required only if you choose to use a validating parser with validation enabled; in that case you must supply a DTD or Schema.
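To make the difference between "well formed" and "valid" concrete, here is a minimal sketch of DTD validation, assuming the JDK's bundled parser. The DTD itself is an invented example (it is not part of the slides) that requires a Person to contain one Name followed by at least one Profession; the class name DtdValidationSketch is likewise hypothetical. Validity problems are delivered to the handler's error() callback rather than thrown, so the sketch collects them in a list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example: validates a Person document against an inline DTD.
class DtdValidationSketch {
    // Assumed DTD, written only for this illustration.
    static final String DTD =
          "<!DOCTYPE Person [\n"
        + "  <!ELEMENT Person (Name, Profession+)>\n"
        + "  <!ELEMENT Name (#PCDATA)>\n"
        + "  <!ELEMENT Profession (#PCDATA)>\n"
        + "]>\n";

    static List<String> validate(String body) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation

        final List<String> errors = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                // Validity errors are recoverable, so record them and continue.
                errors.add(e.getMessage());
            }
        };
        factory.newSAXParser().parse(new InputSource(new StringReader(DTD + body)), handler);
        return errors;
    }
}
```

Calling `validate` with a Person that has a Name and a Profession should return an empty list, while a Person with no Name is still well formed but should produce at least one validity error.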

  6. What are SAX and DOM?
     • DOM (Document Object Model) describes a language-neutral object model which can represent an XML document.
     • The types defined in DOM have been defined using an interface definition language (IDL) as defined by the OMG (Object Management Group)
     • DOM parsers parse an XML file into the DOM types. The user/programmer traverses these structures to pull out relevant portions of the document
     • SAX (Simple API for XML) is an API for parsing XML documents
     • It is an event-based push model of parsing.
     • As the document is parsed, events are generated.
     • The programmer must create handlers to deal with these events
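For contrast with SAX, the DOM style of "parse everything, then traverse" might look like the sketch below. It assumes the JDK's JAXP DocumentBuilder as the DOM parser and reuses the Person document from the earlier slide; the class and method names are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example: build the whole tree first, then walk it.
class DomTraversalSketch {
    static List<String> professions(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The entire document is parsed into an in-memory tree here.
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Traverse the finished tree to pull out the parts we care about.
        NodeList nodes = doc.getElementsByTagName("Profession");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }
}
```

The trade-off is the usual one: DOM keeps the whole document in memory and lets the program navigate it freely, whereas SAX hands over each piece once, as it is read, and keeps nothing.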

  7. Using SAX
     • To get started with SAX, the developer must obtain a reference to an XMLReader object. This can be obtained in the following ways:
         XMLReader aReader = XMLReaderFactory.createXMLReader();
         XMLReader aReader = XMLReaderFactory.createXMLReader(
             "org.apache.xerces.parsers.SAXParser");
     • In the first call, the programmer is asking the system for an instance of the default parser. In the second, the programmer is asking for an instance of a specific parser.
     • The second call requires xerces.jar to be on the CLASSPATH.
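Both factory calls can fail with a SAXException, for example when the named parser class is not on the CLASSPATH, so real code has to handle that case. The following is a small sketch with hypothetical names (ReaderLookupSketch, defaultReader, namedReader). One caveat worth knowing: XMLReaderFactory has been marked deprecated since Java 9 in favour of SAXParserFactory, although the calls still work; the annotation below only silences the resulting compiler warning.

```java
import java.util.Optional;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical helper wrapping the two factory calls.
@SuppressWarnings("deprecation") // XMLReaderFactory is deprecated on Java 9+
class ReaderLookupSketch {
    // Ask the system for its default parser.
    static XMLReader defaultReader() throws SAXException {
        return XMLReaderFactory.createXMLReader();
    }

    // Ask for a specific parser class; empty if it cannot be loaded.
    static Optional<XMLReader> namedReader(String className) {
        try {
            return Optional.of(XMLReaderFactory.createXMLReader(className));
        } catch (SAXException e) {
            // Typically means the class (e.g. from xerces.jar) is not on the CLASSPATH.
            return Optional.empty();
        }
    }
}
```

A caller could try `namedReader("org.apache.xerces.parsers.SAXParser")` first and fall back to `defaultReader()` when the result is empty.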

  8. Defining Content Handlers
     • Once a reader has been instantiated, it must be provided with a ContentHandler
     • The ContentHandler interface defines the events that are generated as the XML document is parsed.
     • The interface is as follows:
         void setDocumentLocator(Locator locator);
         void startDocument() throws SAXException;
         void endDocument() throws SAXException;
         void startPrefixMapping(String prefix, String uri) throws SAXException;
         void endPrefixMapping(String prefix) throws SAXException;
         void startElement(String namespaceURI, String localName,
                           String qualifiedName, Attributes attrs) throws SAXException;
         void endElement(String namespaceURI, String localName,
                         String qualifiedName) throws SAXException;
         void characters(char[] ch, int start, int length) throws SAXException;
         void ignorableWhitespace(char[] ch, int start, int length) throws SAXException;
         void processingInstruction(String target, String data) throws SAXException;
         void skippedEntity(String name) throws SAXException;
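To see the push model in action, the sketch below implements ContentHandler directly, records the name of each event it receives, and registers itself with a reader via setContentHandler. The class name EventLogSketch and the event labels are invented for illustration, and the reader is obtained through JAXP's SAXParserFactory as one possible route; a reader from XMLReaderFactory would be wired up the same way. With this (non-namespace-aware) reader the element name arrives in the qualifiedName argument.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.XMLReader;

// Hypothetical handler that logs the events pushed to it by the parser.
class EventLogSketch implements ContentHandler {
    final List<String> events = new ArrayList<>();

    @Override public void startDocument() { events.add("startDocument"); }
    @Override public void endDocument() { events.add("endDocument"); }
    @Override public void startElement(String namespaceURI, String localName,
                                       String qualifiedName, Attributes attrs) {
        events.add("start:" + qualifiedName);
    }
    @Override public void endElement(String namespaceURI, String localName,
                                     String qualifiedName) {
        events.add("end:" + qualifiedName);
    }
    @Override public void characters(char[] ch, int start, int length) {
        // Only the slice [start, start + length) of the buffer is valid.
        events.add("text:" + new String(ch, start, length));
    }

    // Events this sketch does not care about still need (empty) bodies.
    @Override public void setDocumentLocator(Locator locator) { }
    @Override public void startPrefixMapping(String prefix, String uri) { }
    @Override public void endPrefixMapping(String prefix) { }
    @Override public void ignorableWhitespace(char[] ch, int start, int length) { }
    @Override public void processingInstruction(String target, String data) { }
    @Override public void skippedEntity(String name) { }

    static List<String> trace(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        EventLogSketch handler = new EventLogSketch();
        reader.setContentHandler(handler);                     // register for events
        reader.parse(new InputSource(new StringReader(xml)));  // events fire during parse
        return handler.events;
    }
}
```

Note that the parser is free to deliver the text between two tags in more than one characters() call, so a handler should not assume one call per text node.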

  9. The DefaultHandler Class
     • To implement a ContentHandler, you must provide an implementation for all of the methods defined in the ContentHandler interface.
     • However, not all methods are necessary for all contexts
     • The DefaultHandler class contains an empty (do-nothing) implementation of every method defined in the ContentHandler interface.
     • You can subclass DefaultHandler and override only those methods you require.
     • Generally, most people provide an implementation for:
         • startElement (called when a start tag is parsed)
         • endElement (called when an end tag is parsed)
         • characters (called when text between tags is parsed)
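A typical DefaultHandler subclass overriding just those three methods might look like the following sketch. It collects the text of every Profession element from the Person document shown earlier; the class name ProfessionHandler is hypothetical, and the reader comes from JAXP's SAXParserFactory as an assumption (any XMLReader would do). Text is accumulated in a buffer because the parser may split it across several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers the text content of <Profession> elements.
class ProfessionHandler extends DefaultHandler {
    private final List<String> professions = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0); // start collecting text afresh for each element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called more than once per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Profession".equals(qName)) {
            professions.add(text.toString().trim());
        }
    }

    static List<String> parse(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        ProfessionHandler handler = new ProfessionHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.professions;
    }
}
```

Parsing the Person document with this handler should yield the two professions in document order. All other events (document start and end, prefix mappings, processing instructions and so on) fall through to the empty methods inherited from DefaultHandler.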

  10. Design Elements to Think About
     • When you are doing your first assignment, I recommend that you use SAX
     • Think about the design of SAX while you are using it; we will be revisiting SAX at a later time
     • We will be evaluating its design to see if we can make improvements.
