1 / 16

Extracting Typed Values from XML Data

Extracting Typed Values from XML Data. Fabio Simeoni David Lievens Paolo Manghi Steve Neely Richard Connor. XML and Language Values. Language Bindings. Programming Models over XML. Low-Level Bindings. Architecture. The SNAQue Prototype. A High-Level Binding. Related Work.

bevis
Download Presentation

Extracting Typed Values from XML Data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Extracting Typed Values from XML Data Fabio Simeoni David Lievens Paolo Manghi Steve Neely Richard Connor

  2. XML and Language Values Language Bindings Programming Models over XML Low-Level Bindings Architecture The SNAQue Prototype A High-Level Binding Related Work Conclusions Outline OOPSLA 2001 Workshop

  3. irregular structure Semistructured Documents unstable structure regular structure Portable Documents self-describing XML and Language Values Language values are increasingly generated and manipulated outside the language jurisdiction, and occur instead as fragments of XML documents OOPSLA 2001 Workshop

  4. d’ d <staff> <member code = “123517”> <name>Richard Connor</name> <home>www.cis.strath...</home> </member> <member code = “123345”> <name>Steve Neely</name> <ext>4565</ext> <project> <name>SNAQue</name> </project> </member> <member code = “175417”> <ext>4566</ext> <name>Fabio Simeoni</name> </member></staff> <staff><member code = “123517”> <name>Richard Connor</name> </member> <member code = “123345”> <name>Steve Neely</name> </member> <member code = “175417”> <name>Fabio Simeoni</name> </member></staff> Portable Document XML and Language Values Semistructured Document OOPSLA 2001 Workshop

  5. d’ Class Member { private String name; private int code; String getName() {...} void setName (String n) {...} int getCode() {...} void setCode (int c) {...}...} <staff><member code = “123517”> <name>Richard Connor</name> </member> <member code = “123345”> <name>Steve Neely</name> </member> <member code = “175417”> <name>Fabio Simeoni</name> </member></staff> Class Staff { Private Member[] members; Member[] getMembers() {...} void setMembers(Member[] m) {...} ...} staff XML and Language Values OOPSLA 2001 Workshop

  6. simple safe efficient A Computational Requirement Programming over d’ should be as: as programming over staff with existing programming languages. OOPSLA 2001 Workshop

  7. computationally incomplete Dedicated QLs (XQL, XML-QL, Lorel, etc.) Research Models(XDuce,Tequila, etc.) Language Bindings (DOM, SAX) ... Programming Models over XML not well suited to complex tasks essentially untyped graph types vs. language types steep learning curve/lack of tool support OOPSLA 2001 Workshop

  8. SAX & DOM SAX: a string served as a temporal sequence of tokens Data Structure: DOM:a tree SAX: events and call-backs Programming Model: DOM: tree navigation OOPSLA 2001 Workshop

  9. Think as a Botanist! SAX & DOM Limitations Data structures do not supportdata semantics (staff members are neither strings nor tree nodes) SAX supports the syntactic structure of the data. Think as a Parser! DOM supports the logical structure of the data. Good for syntactic (SAX) and structural (DOM) manipulations, not domain-specific tasks. OOPSLA 2001 Workshop

  10. SAX Programming Class SaxTask extends DefaultHandler { private String name, code; private boolean inProject; private CharArrayWriter buffer = new CharArrayWriter();  public void characters (char[ ] ch, int start, int length) {buffer.write(ch,start,length);}  public void startElement (String uri, String name, String qName, Attributes atts) {if (name.equals("member")) code=atts.getValue("code"); if (name.equals("project")) inProject=true; buffer.reset();} public void endElement (String uri, String name, String qName) { if (name.equals("project")) inProject=false; if (name.equals("name") && !inProject) name=buffer.toString().trim() … do something with name and code…}} OOPSLA 2001 Workshop

  11. DOM Programming String name=null; int code; Element staff = d.getDocumentElement(); NodeList members =staff.getElementsByTagName("member"); int membCount = members.getLength(); for (int i=0;i<membCount;i++) { Element member = (Element) members.item(i); code = Integer.parseInt(member.getAttribute("code")); NodeList children = member.getChildNodes(); int length = children.getLength(); for (int j=0;j<length;j++) { Node child = children.item(j); if (child.getNodeType()==Node.ELEMENT_NODE) { String tagName = ((Element) child).getTagName(); if (tagName.equals("name")) name= ((character data) child.getFirstChild()).getData();}} … do something with name and code…} OOPSLA 2001 Workshop

  12. SAX & DOM Programming How easy is to write, maintain, and evolve the code? • Programming is tedious, redundant and error-prone • Code is convoluted even for simple tasks How safe is the code? Nodelist members = staff.getElementsByTagName(“mebmer”); • Safety is responsibility of the programmer • Programmatic checks worsen code readability • Programmatic checks are simply not enough... OOPSLA 2001 Workshop

  13. SAX & DOM Programming Compare with: Member[] members = staff.getMembers(); for (int i=0;i<members.length;i++) { code = members[I].code; name=members[i].name; {...do something with name and code...} • Simple because the programming algebra is domain-specific • Safe and efficient because static knowledge is domain-specific Think Straight! OOPSLA 2001 Workshop

  14. staff High-Level Bindings Conclusion: • DOM & SAX are low-level bindings • We need high-level bindings that automate the conversion of XML fragments into their counterpart within the language d’ OOPSLA 2001 Workshop

  15. d d’.. staff A High-Level Binding ExtractionMechanism Types Staff and Member Programming Language Binding as type projection OOPSLA 2001 Workshop

  16. OOPSLA 2001 Workshop

More Related