
Dickson K.W. Chiu PhD, SMIEEE Thanks to Prof. SC Cheung (HKUST), Prof. Francis Lau (HKU) Reference: XML How To Program,

CSIT600b: XML Programming DOM Programming. Dickson K.W. Chiu PhD, SMIEEE Thanks to Prof. SC Cheung (HKUST), Prof. Francis Lau (HKU) Reference: XML How To Program, Deitel, Prentice Hall 2001. Overview of Java API for XML.



  1. CSIT600b: XML Programming DOM Programming Dickson K.W. Chiu PhD, SMIEEE Thanks to Prof. SC Cheung (HKUST), Prof. Francis Lau (HKU) Reference: XML How To Program, Deitel, Prentice Hall 2001

  2. Overview of Java API for XML
     • Java Web Services Developer Pack (Java WSDP) http://java.sun.com/webservices/tutorial.html
     • Now in J2EE 1.4 core
     • Document-oriented
        • Java API for XML Processing (JAXP) -- processes XML documents using various parsers
        • Java Architecture for XML Binding (JAXB) -- processes XML documents using schema-derived JavaBeans component classes
     • Procedure-oriented
        • Java API for XML-based RPC (JAX-RPC) -- sends SOAP method calls to remote parties over the Internet and receives the results
        • Java API for XML Messaging (JAXM) -- sends SOAP messages over the Internet in a standard way
        • Java API for XML Registries (JAXR) -- provides a standard way to access business registries and share information
     Dickson Chiu 2004

  3. The DOM Core (JAXP)
     • Fundamental Interfaces
        • Required in all implementations
        • DOMException, ExceptionCode, DOMImplementation, DocumentFragment, Document, Node, NodeList, NamedNodeMap, CharacterData, Attr, Element, Text, Comment
     • Extended Interfaces
        • For DOM implementations working with XML
        • CDATASection, DocumentType, Notation, Entity, EntityReference, ProcessingInstruction
     • http://java.sun.com/xml/jaxp/
     • Supports Schema and DTD validation + XSLT
     • More powerful than JDOM and dom4j

  4. DOM classes and interfaces (Ref). Deitel, XML How to Program, Fig 8.4

  5. Some Document methods. Deitel, XML How to Program, Fig 8.5

  6. XmlDocument methods. Deitel, XML How to Program, Fig 8.6

  7. Node methods. Deitel, XML How to Program, Fig 8.7

  8. Some node types. Deitel, XML How to Program, Fig 8.8

  9. Element methods. Deitel, XML How to Program, Fig 8.9

  10. DOM API of JAXP
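
The JAXP route to a DOM tree is factory, then builder, then Document. The following is a minimal sketch of that pipeline, assuming an in-memory XML string instead of a file; the class name ParseSketch and the helper firstText are hypothetical names introduced only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
class ParseSketch {

    // Factory -> builder -> Document: the three JAXP steps for DOM parsing.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Text of the first element with the given tag name, or null if absent.
    static String firstText(Document doc, String tag) {
        NodeList hits = doc.getElementsByTagName(tag);
        return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse("<contact gender=\"F\"><FirstName>Sue</FirstName>"
                + "<LastName>Green</LastName></contact>");
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName() + " gender=" + root.getAttribute("gender"));
        System.out.println(firstText(doc, "LastName"));
    }
}
```

Wrapping the checked exceptions in an unchecked one is a choice made here to keep the sketch short; an application would normally report parse errors to the user.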

  11. Merging 2 DOM trees (for assignment)

      import java.net.*;
      import java.io.*;
      import org.w3c.tidy.*;
      import org.w3c.dom.*;
      import javax.xml.transform.*;
      import javax.xml.transform.stream.*;
      import javax.xml.transform.dom.*;
      import org.w3c.dom.Node;
      import javax.xml.parsers.*;
      import javax.imageio.metadata.*;

      public class tester {
        public tester() { }

        static public void main(String[] arg) {
          try {
            String urlStr1 = "http://finance.yahoo.com/q/cp?s=^HSI";
            String urlStr2 = "http://finance.yahoo.com/q/cp?s=^DJI";

            // open the connection with that url
            URL url1 = null;
            URL url2 = null;
            url1 = new URL(urlStr1);
            url2 = new URL(urlStr2);
            URLConnection cn1 = null;
            URLConnection cn2 = null;
            cn1 = url1.openConnection();
            cn2 = url2.openConnection();

            // parse the html file into dom
            Tidy tidy = new Tidy();
            tidy.setCharEncoding(Configuration.UTF8);
            tidy.setIndentContent(true);
            tidy.setXHTML(true);
            tidy.setWraplen(Integer.MAX_VALUE);
            Document doc1 = tidy.parseDOM(cn1.getInputStream(), null);
            Document doc2 = tidy.parseDOM(cn2.getInputStream(), null);

  12. Merging 2 DOM trees - cont

            // xsl
            File xslFile1 = new File("xslt1.xsl");
            File xslFile2 = new File("xslt1.xsl");

            // transform obj
            TransformerFactory t = TransformerFactory.newInstance();
            Transformer transformer1 = t.newTransformer(new StreamSource(xslFile1));
            Transformer transformer2 = t.newTransformer(new StreamSource(xslFile2));

            // transform
            DOMSource source1 = new DOMSource(doc1);
            DOMResult result1 = new DOMResult();
            DOMSource source2 = new DOMSource(doc2);
            DOMResult result2 = new DOMResult();
            transformer1.transform(source1, result1);
            transformer2.transform(source2, result2);
            Document resultDoc1 = (Document) result1.getNode();
            Document resultDoc2 = (Document) result2.getNode();

  13. Merging 2 DOM trees - cont

            // merge two dom
            // 1. create new root
            Node newRoot = resultDoc1.createElement("indexes");

            // 2. get the root element from both dom trees
            Node hsiIndex = resultDoc1.getFirstChild();
            Node djiIndex = resultDoc2.getFirstChild();

            // 3. *must* make all the nodes belong to the same document
            //    (second argument true => deep copy)
            djiIndex = resultDoc1.importNode(djiIndex, true);

            // 4. change the pointers
            newRoot.appendChild(hsiIndex);
            newRoot.appendChild(djiIndex);
            resultDoc1.appendChild(newRoot);

            t.newTransformer().transform(new DOMSource(resultDoc1),
                                         new StreamResult(System.out));
          }
          // report failures instead of silently swallowing them
          catch (MalformedURLException ex) { ex.printStackTrace(); }
          catch (IOException ex1) { ex1.printStackTrace(); }
          catch (TransformerConfigurationException ex2) { ex2.printStackTrace(); }
          catch (TransformerException ex3) { ex3.printStackTrace(); }
        }
      }
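
The merge step can be tried without the network or JTidy. The following is a minimal sketch, assuming two small in-memory documents in place of the transformed Yahoo pages; MergeSketch, parse, and merge are hypothetical names, and the element names hsi and dji are made up for the example. In the JDK's default DOM implementation, appending a node owned by another document without importNode typically fails with a DOMException (WRONG_DOCUMENT_ERR), which is why step 3 is required.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
class MergeSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Wraps the root elements of both documents in a new <indexes> root
    // that lives in `first`; `second` is left unchanged.
    static Document merge(Document first, Document second) {
        Element newRoot = first.createElement("indexes");
        Node firstRoot = first.getDocumentElement();
        // importNode copies the foreign subtree into `first`; true = deep copy.
        Node secondRoot = first.importNode(second.getDocumentElement(), true);
        newRoot.appendChild(firstRoot);   // moves firstRoot out of the document
        newRoot.appendChild(secondRoot);
        first.appendChild(newRoot);       // the document has a single root again
        return first;
    }

    public static void main(String[] args) {
        Document merged = merge(parse("<hsi><stock>A</stock></hsi>"),
                                parse("<dji><stock>B</stock></dji>"));
        Element root = merged.getDocumentElement();
        System.out.println(root.getTagName() + " has "
                + root.getChildNodes().getLength() + " children");
    }
}
```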

  14. Operation on the Command Line
     • Compiling
        H:\Sun\AppServer\jdk\bin\javac -classpath H:\Sun\Appserver\lib\tidy.jar tester.java
     • Running
        H:\Sun\AppServer\jdk\bin\java -classpath H:\Sun\Appserver\lib\tidy.jar;. tester > out.wml
     • Callout on the -classpath argument: Library

  15. Building an XML Document with DOM
     • Desired Output

      <root>
        <!--This is a simple contact list-->
        <contact gender="F">
          <FirstName>Sue</FirstName>
          <LastName>Green</LastName>
        </contact>
        <?myInstruction action silent?>
        <![CDATA[I can add <, >, and ?]]>
      </root>

  16. Building an XML Document with DOM

      // Deitel, XML How to Program, Fig. 8.14 : BuildXml.java.
      import java.io.*;
      import org.w3c.dom.*;
      import org.xml.sax.*;
      import javax.xml.parsers.*;
      import com.sun.xml.tree.XmlDocument;

      public class BuildXml {
        private Document document;

        public BuildXml() {
          DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

          try {
            // get DocumentBuilder
            DocumentBuilder builder = factory.newDocumentBuilder();

            // create root node
            document = builder.newDocument();
          }
          catch ( ParserConfigurationException pce ) {
            pce.printStackTrace();
          }

     Notes:
     • import specifies the location of classes needed by the application
     • DocumentBuilderFactory.newInstance() uses the JAXP default parser
     • DocumentBuilder loads and parses XML documents
     • builder.newDocument() obtains an XML Document reference

  17. Building an XML Document with DOM

          Element root = document.createElement( "root" );
          document.appendChild( root );

          // add a comment
          Comment simpleComment =
            document.createComment( "This is a simple contact list" );
          root.appendChild( simpleComment );

          // add a child element
          Node contactNode = createContactNode( document );
          root.appendChild( contactNode );

          // add processing instruction
          ProcessingInstruction pi = document.createProcessingInstruction(
            "myInstruction", "action silent" );
          root.appendChild( pi );

          // add a CDATA section
          CDATASection cdata =
            document.createCDATASection( "I can add <, >, and ?" );
          root.appendChild( cdata );

          try {
            // write the XML document to a file
            ( (XmlDocument) document).write(
              new FileOutputStream( "myDocument.xml" ) );
          }
          catch ( IOException ioe ) {
            ioe.printStackTrace();
          }
        }

     Notes:
     • Create root Element and append to Document
     • Call method createContactNode (next slide) to create child node
     • Create ProcessingInstruction node with target myInstruction and value "action silent"
     • Create CDATA node and append to root node
     • Write XML Document to myDocument.xml. XmlDocument.write() is out of date: use a Transformer instead

  18. Building an XML Document with DOM

        public Node createContactNode( Document document ) {
          // create FirstName and LastName elements
          Element firstName = document.createElement( "FirstName" );
          firstName.appendChild( document.createTextNode( "Sue" ) );

          Element lastName = document.createElement( "LastName" );
          lastName.appendChild( document.createTextNode( "Green" ) );

          // create contact element
          Element contact = document.createElement( "contact" );

          // create an attribute
          Attr genderAttribute = document.createAttribute( "gender" );
          genderAttribute.setValue( "F" );

          // append attribute to contact element
          contact.setAttributeNode( genderAttribute );
          contact.appendChild( firstName );
          contact.appendChild( lastName );
          return contact;
        }

        public static void main( String args[] ) {
          BuildXml buildXml = new BuildXml();
        }
      }

     Notes:
     • createContactNode creates and returns an Element node
     • Create Element FirstName with text Sue
     • Create Element LastName with text Green
     • Create Element contact with attribute gender
     • Append Elements FirstName and LastName to Element contact

  19. Write an XML file with Transformer

      // write XML document to disk
      import java.io.*;
      import javax.xml.transform.*;
      import javax.xml.transform.stream.*;
      import javax.xml.transform.dom.*;

      try {
        // create DOMSource for source XML document
        Source xmlSource = new DOMSource( document );

        // create StreamResult for transformation result
        // Write to console: Result result = new StreamResult( System.out );
        Result result = new StreamResult(
          new FileOutputStream( new File( "test.xml" ) ) );

        // create TransformerFactory
        TransformerFactory transformerFactory =
          TransformerFactory.newInstance();

        // create Transformer for transformation
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty( "indent", "yes" );

        // transform and deliver content to client
        transformer.transform( xmlSource, result );
      }
      catch ( IOException | TransformerException e ) {
        e.printStackTrace();
      }
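
Putting the build steps and the Transformer together gives a version that runs without the old com.sun.xml.tree.XmlDocument class. The following is a minimal sketch, assuming the output goes to a String rather than a file; BuildSketch, build, and serialize are hypothetical names, setAttribute is used as a shortcut for createAttribute plus setAttributeNode, and only the FirstName child is created to keep it short.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name, used only for this illustration.
class BuildSketch {

    // Builds a shortened version of the contact-list document from the slides.
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            root.appendChild(doc.createComment("This is a simple contact list"));

            Element contact = doc.createElement("contact");
            contact.setAttribute("gender", "F");
            Element firstName = doc.createElement("FirstName");
            firstName.appendChild(doc.createTextNode("Sue"));
            contact.appendChild(firstName);
            root.appendChild(contact);

            root.appendChild(
                    doc.createProcessingInstruction("myInstruction", "action silent"));
            root.appendChild(doc.createCDATASection("I can add <, >, and ?"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the DOM tree with an identity Transformer (no stylesheet).
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(build()));
    }
}
```

Swapping the StringWriter for a FileOutputStream gives the file-writing variant from the slide.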

  20. Modifying XML Document with DOM

      // Fig 8.10 : ReplaceText.java
      import java.io.*;
      import org.w3c.dom.*;
      import javax.xml.parsers.*;
      import com.sun.xml.tree.XmlDocument;
      import org.xml.sax.*;

      public class ReplaceText {
        private Document document;

        public ReplaceText() {
          try {
            // obtain the default parser
            DocumentBuilderFactory factory =
              DocumentBuilderFactory.newInstance();

            // set the parser to validating
            factory.setValidating( true );
            DocumentBuilder builder = factory.newDocumentBuilder();

            // set error handler for validation errors
            builder.setErrorHandler( new MyErrorHandler() );

            // obtain document object from XML document
            document = builder.parse( new File( "intro.xml" ) );

  21. Modifying XML Document with DOM

            // fetch the root node
            Node root = document.getDocumentElement();

            if ( root.getNodeType() == Node.ELEMENT_NODE ) {
              Element myMessageNode = ( Element ) root;
              NodeList messageNodes =
                myMessageNode.getElementsByTagName( "message" );

              if ( messageNodes.getLength() != 0 ) {
                Node message = messageNodes.item( 0 );

                // create a text node
                Text newText = document.createTextNode( "New Message!!" );

                // get the old text node
                Text oldText = ( Text ) message.getChildNodes().item( 0 );

                // replace the text
                message.replaceChild( newText, oldText );
              }
            }

            ( (XmlDocument) document).write(
              new FileOutputStream( "intro1.xml" ) );
          }

     Notes:
     • Cast root node as Element (subclass), then get list of all message elements
     • If a message element exists, replace the old text node with the new one
     • item() returns type Node, so the result needs casting to Text
     • Write new XML document to intro1.xml. XmlDocument.write() is out of date: use a Transformer instead

     Input (intro.xml):

      <myMessage>
        <message>Welcome to XML!</message>
      </myMessage>

     Output (intro1.xml):

      <myMessage>
        <message>New Message!!</message>
      </myMessage>

  22. Modifying XML Document with DOM

          catch ( SAXParseException spe ) {
            System.err.println( "Parse error: " + spe.getMessage() );
            System.exit( 1 );
          }
          catch ( SAXException se ) {
            se.printStackTrace();
          }
          catch ( FileNotFoundException fne ) {
            System.err.println( "File \'intro.xml\' not found. " );
            System.exit( 1 );
          }
          catch ( Exception e ) {
            e.printStackTrace();
          }
        }

        public static void main( String args[] ) {
          ReplaceText d = new ReplaceText();
        }
      }
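
The replacement logic can be isolated from file handling and validation. The following is a minimal sketch, assuming an in-memory document; ReplaceSketch and replaceFirstMessage are hypothetical names. It also covers one case the slide code does not: an empty message element, which has no old text node to replace.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
class ReplaceSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Replaces the first child of the first <message> element with new text.
    // Returns false when the document has no <message> descendant.
    static boolean replaceFirstMessage(Document doc, String newText) {
        NodeList messages = doc.getDocumentElement().getElementsByTagName("message");
        if (messages.getLength() == 0) {
            return false;
        }
        Node message = messages.item(0);
        Node oldChild = message.getFirstChild();
        Text replacement = doc.createTextNode(newText);
        if (oldChild == null) {
            message.appendChild(replacement);          // empty element: just add
        } else {
            message.replaceChild(replacement, oldChild);
        }
        return true;
    }

    public static void main(String[] args) {
        Document doc = parse("<myMessage><message>Welcome to XML!</message></myMessage>");
        replaceFirstMessage(doc, "New Message!!");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```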

  23. Handling Complexities

      + ELEMENT: sentence
         + TEXT: The
         + ENTITY REF: projectName
            + COMMENT: The latest name we're using
            + TEXT: Eagle
         + CDATA: <i>project</i>
         + TEXT: is
         + PI: editor: red
         + ELEMENT: bold
            + TEXT: important
         + PI: editor: normal

     • To be more robust, a DOM application must handle more cases.
        • When data comes from the outside world
     • When searching for an element:
        • Ignore comments, attributes, and processing instructions.
        • Allow for the possibility that subelements do not occur in the expected order.
        • Skip over TEXT nodes that contain ignorable whitespace, if not validating.
     • When extracting text for a node:
        • Extract text from CDATA nodes as well as text nodes.
        • Ignore comments, attributes, and processing instructions when gathering the text.
        • If an entity reference node or another element node is encountered, recurse (that is, apply the text-extraction procedure to all subnodes).
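
The text-extraction rules above can be sketched as a small recursive method. This is an illustration under stated assumptions, not the course's reference solution: TextSketch and textOf are hypothetical names, and the sample document omits the entity reference so that it needs no DTD. The DOM Level 3 method Node.getTextContent() performs a similar concatenation for element nodes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
class TextSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Gathers all character data below the given node.
    static String textOf(Node node) {
        StringBuilder text = new StringBuilder();
        collect(node, text);
        return text.toString();
    }

    private static void collect(Node node, StringBuilder text) {
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            switch (child.getNodeType()) {
                case Node.TEXT_NODE:
                case Node.CDATA_SECTION_NODE:
                    text.append(child.getNodeValue());   // text and CDATA alike
                    break;
                case Node.ELEMENT_NODE:
                case Node.ENTITY_REFERENCE_NODE:
                    collect(child, text);                 // recurse into subnodes
                    break;
                default:
                    break;                                // comments, PIs: ignored
            }
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<sentence>The <!--latest name--><![CDATA[<i>project</i>]]>"
                + " is <?editor red?><bold>important</bold><?editor normal?></sentence>");
        System.out.println(textOf(doc.getDocumentElement()));
    }
}
```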

  24. Error Handler for Validation Errors

      // Fig 8.11 : MyErrorHandler.java
      // Error Handler for validation errors.
      import org.xml.sax.ErrorHandler;
      import org.xml.sax.SAXException;
      import org.xml.sax.SAXParseException;

      public class MyErrorHandler implements ErrorHandler {

        // throw SAXException for fatal errors
        public void fatalError( SAXParseException exception )
          throws SAXException {
          throw exception;
        }

        public void error( SAXParseException e )
          throws SAXParseException {
          throw e;
        }

        // print any warnings
        public void warning( SAXParseException err )
          throws SAXParseException {
          System.err.println( "Warning: " + err.getMessage() );
        }
      }
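
An error handler does not have to throw on the first validation error. The following is a minimal sketch of an alternative that collects validation errors in a list and stops only on fatal (well-formedness) errors, assuming a DTD embedded in the document itself; ValidationSketch and validate are hypothetical names, and the exact message texts depend on the parser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical class name, used only for this illustration.
class ValidationSketch {

    // Parses with DTD validation on and returns every problem reported.
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);   // validate against the document's DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    problems.add("warning: " + e.getMessage());
                }
                public void error(SAXParseException e) {
                    problems.add("error: " + e.getMessage());   // keep parsing
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e;                                     // cannot continue
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            problems.add("fatal: " + e.getMessage());
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE myMessage [<!ELEMENT myMessage (message)>"
                + "<!ELEMENT message (#PCDATA)>]>";
        System.out.println(validate(dtd + "<myMessage><message>hi</message></myMessage>"));
        System.out.println(validate(dtd + "<myMessage><other/></myMessage>"));
    }
}
```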

  25. Traversing the DOM

      // Deitel, XML How to Program, Fig. 8.15 : TraverseDOM.java
      import java.io.*;
      import org.w3c.dom.*;
      import org.xml.sax.*;
      import javax.xml.parsers.*;
      import com.sun.xml.tree.XmlDocument;

      public class TraverseDOM {
        private Document document;

        public TraverseDOM( String file ) {
          try {
            // obtain the default parser
            DocumentBuilderFactory factory =
              DocumentBuilderFactory.newInstance();
            factory.setValidating( true );
            DocumentBuilder builder = factory.newDocumentBuilder();

            // set error handler for validation errors
            builder.setErrorHandler( new MyErrorHandler() );

            // obtain document object from XML document
            document = builder.parse( new File( file ) );
            processNode( document );
          }

     Notes:
     • Obtain JAXP default parser and DocumentBuilder to load and parse XML documents
     • Require parser to validate documents
     • Load and parse XML document
     • Pass Document to method processNode

  26. Traversing the DOM

          catch ( SAXParseException spe ) {
            System.err.println( "Parse error: " + spe.getMessage() );
            System.exit( 1 );
          }
          catch ( SAXException se ) {
            se.printStackTrace();
          }
          catch ( FileNotFoundException fne ) {
            System.err.println( "File \'" + file + "\' not found. " );
            System.exit( 1 );
          }
          catch ( Exception e ) {
            e.printStackTrace();
          }
        }

  27. Traversing the DOM

        public void processNode( Node currentNode ) {
          switch ( currentNode.getNodeType() ) {

            // process a Document node
            case Node.DOCUMENT_NODE:
              Document doc = ( Document ) currentNode;
              System.out.println( "Document node: " + doc.getNodeName() +
                "\nRoot element: " + doc.getDocumentElement().getNodeName() );
              processChildNodes( doc.getChildNodes() );
              break;

            // process an Element node
            case Node.ELEMENT_NODE:
              System.out.println( "\nElement node: " +
                currentNode.getNodeName() );
              NamedNodeMap attributeNodes = currentNode.getAttributes();

              for ( int i = 0; i < attributeNodes.getLength(); i++ ) {
                Attr attribute = ( Attr ) attributeNodes.item( i );
                System.out.println( "\tAttribute: " +
                  attribute.getNodeName() + " ; Value = " +
                  attribute.getNodeValue() );
              }

              processChildNodes( currentNode.getChildNodes() );
              break;

     Notes:
     • switch statement determines Node type
     • If document node, output document node and process child nodes
     • If element node, output element’s attributes and process child nodes

  28. Traversing the DOM

            // process text node / CDATA section
            case Node.CDATA_SECTION_NODE:
            case Node.TEXT_NODE:
              Text text = ( Text ) currentNode;
              if ( !text.getNodeValue().trim().equals( "" ) )
                System.out.println( "\tText: " + text.getNodeValue() );
              break;
          }
        }

        public void processChildNodes( NodeList children ) {
          if ( children.getLength() != 0 )
            for ( int i = 0; i < children.getLength(); i++ )
              processNode( children.item( i ) );
        }

        public static void main( String args[] ) {
          if ( args.length < 1 ) {
            System.err.println( "Usage: java TraverseDOM <filename>" );
            System.exit( 1 );
          }

          TraverseDOM traverseDOM = new TraverseDOM( args[ 0 ] );
        }
      }

     Notes:
     • If CDATA or text node, output node’s text content
     • Method processChildNodes calls method processNode for each Node in NodeList
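
The same recursive traversal can return its result instead of printing it, which makes it easier to check. The following is a minimal sketch, assuming an in-memory document and no validation; TraverseSketch and outline are hypothetical names, and the output format (two spaces of indentation per level) is made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
class TraverseSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Returns an indented outline of the elements and text below `node`.
    static String outline(Node node) {
        StringBuilder out = new StringBuilder();
        walk(node, 0, out);
        return out.toString();
    }

    private static void walk(Node node, int depth, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {                 // skip whitespace-only text
                line(out, depth, "Text: " + text);
            }
            return;
        }
        int childDepth = depth;
        if (type == Node.ELEMENT_NODE) {
            StringBuilder label = new StringBuilder("Element: " + node.getNodeName());
            NamedNodeMap attributes = node.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Node attribute = attributes.item(i);
                label.append(" [").append(attribute.getNodeName()).append('=')
                     .append(attribute.getNodeValue()).append(']');
            }
            line(out, depth, label.toString());
            childDepth = depth + 1;
        }
        // Document and element nodes have children; comments and PIs have none.
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            walk(child, childDepth, out);
        }
    }

    private static void line(StringBuilder out, int depth, String content) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(content).append('\n');
    }

    public static void main(String[] args) {
        System.out.print(outline(parse(
                "<contact gender=\"F\">\n  <FirstName>Sue</FirstName>\n</contact>")));
    }
}
```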

  29. DOM with JavaScript (Reference) (MS XML Parser) (Deitel Sec 8.3)

      <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
      <html>
      <!-- Fig. 8.3 : DOMExample.html -->
      <head>
        <title>A DOM Example</title>
      </head>
      <body>
      <script type = "text/javascript" language = "JavaScript">
        var xmlDocument = new ActiveXObject( "Microsoft.XMLDOM" );
        xmlDocument.load( "article.xml" );

        // get the root element
        var element = xmlDocument.documentElement;

        document.writeln( "<p>Here is the root node of the document:" );
        document.writeln( "<strong>" + element.nodeName + "</strong>" );
        document.writeln( "<br>The following are its child elements:" );
        document.writeln( "</p><ul>" );

  30. DOM with JavaScript (2)

        // traverse all child nodes of root element
        for ( i = 0; i < element.childNodes.length; i++ ) {
          var curNode = element.childNodes.item( i );

          // print node name of each child element
          document.writeln( "<li><strong>" + curNode.nodeName
            + "</strong></li>" );
        }

        document.writeln( "</ul>" );

        // get the first child node of root element
        var currentNode = element.firstChild;   // firstChild = childNodes.item(0)

        document.writeln( "<p>The first child of root node is:" );
        document.writeln( "<strong>" + currentNode.nodeName + "</strong>" );
        document.writeln( "<br>whose next sibling is:" );

        // get the next sibling of first child
        var nextSib = currentNode.nextSibling;

        document.writeln( "<strong>" + nextSib.nodeName + "</strong>." );
        document.writeln( "<br>Value of <strong>" + nextSib.nodeName
          + "</strong> element is:" );

  31. DOM with JavaScript (3)

        var value = nextSib.firstChild;

        // print the text value of the sibling
        document.writeln( "<em>" + value.nodeValue + "</em>" );
        document.writeln( "<br>Parent node of " );
        document.writeln( "<strong>" + nextSib.nodeName + "</strong> is:" );
        document.writeln( "<strong>" + nextSib.parentNode.nodeName
          + "</strong>.</p>" );
      </script>
      </body>
      </html>

     Input document:

      <?xml version = "1.0"?>
      <!-- Fig. 8.2: article.xml -->
      <article>
        <title>Simple XML</title>
        <date>December 6, 2000</date>
        <author>
          <fname>Tem</fname>
          <lname>Nieto</lname>
        </author>
        <summary>XML is pretty easy.</summary>
        <content>Once you have mastered HTML, XML is easily learned. You must remember that XML is not for displaying information but for managing information. </content>
      </article>
