
An Extension to XML Schema for Structured Data Processing

Presented by: Jacky Ma. Date: 10 April 2002.


Presentation Transcript


  1. An Extension to XML Schema for Structured Data Processing • Presented by: Jacky Ma • Date: 10 April 2002

  2. Presentation Outline • The Problems • Research Objectives • The Schema Extension: MMX • MMX Query System • Discussion • Conclusion

  3. The Problems • Mapping XML data into relational tables: not natural to the XML structure; efficient, but may not be an effective method • Legacy application-specific structured data: similar modeling but proprietary implementations; not interoperable and difficult to maintain; lacks modular design and is thus difficult to combine into more complex data structures • Meta-data can serve a wide range of needs, yet XML Schema is currently used solely for physical data validation

  4. Research Objectives • To facilitate more effective searching and storing of XML content by making use of meta-data (XML Schema) • To propose a data-oriented model that allows different storage mechanisms, processing models, and query models on XML content

  5. Our Approach – MMX • Use meta-data to map XML data into structured data objects • Define the structured data models “conceptually” and link the models to XML document structure “syntactically” • Meta-data is defined as an extension of XML Schema • The extension is called MMX (Multi Model XML)

  6. Program Driven vs. Data Driven • [Diagram: a spectrum from program driven to data driven with four stages: Raw Data, Structured Data (XML), Data with Modeling Information, Data with Program Codes; MMX is marked on the spectrum] • Program driven: information for processing is hard-coded in the program • Data driven: processing instructions are hard-coded in the data?

  7. A Glance at the XML Data

  8. A Glance at the Linked Schema

  9. Schema Extension • The extended schema is associated with a namespace • The extension goes within a schema element, like <tree:element> in the example • <tree:element> specifies a single structure object instance • Name association for elements and attributes • Class hierarchy: <tree:element> -> <tree:internal> -> <tree:leafNode>, and finally the structure specified in <tree:leafNodeValue> • Additional properties go in <rootNodeAttr>, <internalNodeAttr> and <leafNodeAttr> • The schema writer has to know the structure model specification, while the XML writer only needs to know the given schema
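
The schema shown on slide 8 did not survive in this transcript, so the following is only a rough sketch of what "name association" along the class hierarchy could look like. The namespace URI, the `name` attribute, the element names (`cities`, `region`, `city`, `info`) and the placement inside `xs:annotation`/`xs:appinfo` are all assumptions made for illustration; only the `tree:` tag names come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the slides give only the "tree" prefix.
TREE_NS = "http://example.org/mmx/tree"

# Hypothetical schema fragment. Placing the extension in xs:appinfo is an
# assumption; the "name" attributes are invented for this sketch.
SCHEMA_FRAGMENT = f"""
<xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:tree="{TREE_NS}" name="cities">
  <xs:annotation>
    <xs:appinfo>
      <tree:element name="cities">
        <tree:internal name="region">
          <tree:leafNode name="city">
            <tree:leafNodeValue name="info"/>
          </tree:leafNode>
        </tree:internal>
      </tree:element>
    </xs:appinfo>
  </xs:annotation>
</xs:element>
""".strip()


def name_associations(schema_xml):
    """Walk tree:element -> tree:internal -> tree:leafNode -> tree:leafNodeValue
    and return (role, associated XML element name) pairs in that order."""
    root = ET.fromstring(schema_xml)
    node = root.find(f".//{{{TREE_NS}}}element")
    chain = []
    while node is not None:
        role = node.tag.split("}", 1)[1]          # strip the "{namespace}" prefix
        chain.append((role, node.get("name")))
        # Follow the first child that belongs to the tree namespace, if any.
        node = next((c for c in node if c.tag.startswith(f"{{{TREE_NS}}}")), None)
    return chain


associations = name_associations(SCHEMA_FRAGMENT)
```

In this sketch the schema writer supplies the role-to-name chain once, and an XML author only has to use the element names `cities`, `region`, `city` and `info`.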

  10. Modeling • An instance of an “MMX data object” can be viewed: as an encapsulated information object accessible only from its root, thus as a “single tree node”; as a mapping from root node, query method and query parameters to the values at leaf nodes • Leaf nodes may contain any valid XML content defined in the Schema, i.e. they may contain another “MMX data object” • A query is modeled as a 3-tuple: [accessing-node, query-method, query-parameters] • Accessing-node is specified by an XPath • Query-method is specified as a string value • Query-parameters are multi-dimensional, depending on the current model
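
The query tuple on this slide can be written down directly. The sketch below is illustrative only: the class names `Query` and `ToyPointObject`, the XPath `/map/tree` and the stored leaf value are hypothetical, and the dictionary lookup merely stands in for whatever structure model sits behind the root.

```python
from typing import Any, NamedTuple, Tuple


class Query(NamedTuple):
    accessing_node: str                 # XPath of the node the query enters through
    query_method: str                   # method name as a plain string
    query_parameters: Tuple[Any, ...]   # arity depends on the structure model


class ToyPointObject:
    """Hypothetical stand-in for an MMX data object: reachable only via its
    root, mapping (root, method, parameters) to a leaf value."""

    def __init__(self, root_xpath, leaves):
        self.root_xpath = root_xpath
        self._leaves = dict(leaves)     # parameters -> leaf content

    def answer(self, query):
        if query.accessing_node != self.root_xpath:
            raise KeyError("object is only accessible from its root")
        if query.query_method != "spatial-search":
            raise ValueError("unsupported method: " + query.query_method)
        return self._leaves.get(tuple(query.query_parameters))


obj = ToyPointObject("/map/tree", {(3, 5): "<city>B</city>"})
result = obj.answer(Query("/map/tree", "spatial-search", (3, 5)))
```

Here `result` is the leaf content stored for the coordinates (3, 5); a query with unknown coordinates returns `None`.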

  11. Modeling (2) • Tree(1) is accessible from point A. A query (e.g. [A, “spatial-search”, (3, 5)], assuming Tree(1) accepts spatial-search with two coordinates) may return point B as the answer, either as the XPath of B or as the XML subtree of B. From point B, the user may drill down the tree by issuing another query on Tree(2). • [Diagram: Tree(1), point B, Tree(2), XML elements]
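
The drill-down described here, where an answer from Tree(1) becomes the entry point to Tree(2), might be sketched as follows. The `Tree` and `Leaf` classes, the XPaths and the coordinates stored in Tree(2) are hypothetical; only the example query [A, "spatial-search", (3, 5)] comes from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Tree:
    root: str                                        # XPath of the accessing node
    leaves: Dict[Tuple[int, int], "Leaf"] = field(default_factory=dict)

    def query(self, node, method, params):
        # Only the root is a valid entry point, and this toy model
        # accepts a single method.
        if node != self.root or method != "spatial-search":
            return None
        return self.leaves.get(tuple(params))


@dataclass
class Leaf:
    xpath: str                                       # XPath of the answer node
    subtree: Optional[Tree] = None                   # nested object, if any


# Tree(2) hangs below point B, which is itself a leaf of Tree(1).
tree2 = Tree(root="/A/B", leaves={(1, 1): Leaf("/A/B/item[1]")})
tree1 = Tree(root="/A", leaves={(3, 5): Leaf("/A/B", tree2)})

b = tree1.query("/A", "spatial-search", (3, 5))      # answer: point B
inner = b.subtree.query(b.xpath, "spatial-search", (1, 1))  # drill down from B
```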

  12. Query with and without MMX • From the original XML data alone, we cannot assume the semantics of the data: we can ONLY do XML-based queries such as XPath; we can do a spatial query ONLY IF we can map the data into an R-Tree • After mapping the data into an R-Tree • Spatial queries: give me the point at (2,7); give me the point nearest to (4,4) • Nearest neighbor search: give me the point nearest to “Franklin”
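
To make the three example queries concrete, here is a small sketch over made-up data. The XML layout and the points "Adams" and "Grant" are invented for illustration, and a plain linear scan is used in place of an R-Tree; it returns the same answers on this data but has none of the index's performance properties.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical data; only the name "Franklin" and the query coordinates
# appear on the slide.
XML = """<points>
  <point name="Franklin" x="2" y="7"/>
  <point name="Adams" x="5" y="4"/>
  <point name="Grant" x="1" y="6"/>
</points>"""


def load_points(xml_text):
    """Read (name, x, y) triples out of the XML document."""
    return [(p.get("name"), float(p.get("x")), float(p.get("y")))
            for p in ET.fromstring(xml_text).iter("point")]


def point_at(points, x, y):
    """Exact-match spatial query: the point at (x, y), or None."""
    return next((n for n, px, py in points if (px, py) == (x, y)), None)


def nearest_to(points, x, y, exclude=None):
    """Nearest point to (x, y) by Euclidean distance (linear scan)."""
    candidates = [p for p in points if p[0] != exclude]
    return min(candidates, key=lambda p: math.hypot(p[1] - x, p[2] - y))[0]


def nearest_neighbor(points, name):
    """Nearest other point to the named point."""
    _, x, y = next(p for p in points if p[0] == name)
    return nearest_to(points, x, y, exclude=name)


points = load_points(XML)
```

None of these three functions could be derived from XPath alone, since nothing in the raw XML says that `x` and `y` are coordinates; that knowledge is what the schema extension is meant to supply.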

  13. Processing • Users might not know the “type” of a node, and do not need to; they are interested in what they can do with it • Users retrieve the list of possible operations by issuing a LIST-OPERATION method to the root element of an MMX object • Possible operations may include queries, updates, and other model-specific operations
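
One way to picture LIST-OPERATION is as an ordinary method that every object answers, so a client can discover the rest. The sketch below assumes a simple name-to-function table; the class `MMXObject`, its `register` and `invoke` methods, and the operation names other than LIST-OPERATION are hypothetical.

```python
class MMXObject:
    """Hypothetical root of an MMX object exposing its operations by name."""

    def __init__(self):
        # LIST-OPERATION is always available, whatever the structure model.
        self._ops = {"LIST-OPERATION": self._list_operations}

    def register(self, name, fn):
        """Add a model-specific operation (query, update, ...)."""
        self._ops[name] = fn

    def invoke(self, method, *params):
        if method not in self._ops:
            raise ValueError("unsupported method: " + method)
        return self._ops[method](*params)

    def _list_operations(self):
        return sorted(self._ops)


store = []
obj = MMXObject()
obj.register("insert", lambda x, y: store.append((x, y)))
obj.register("spatial-search", lambda x, y: (x, y) in store)

operations = obj.invoke("LIST-OPERATION")
obj.invoke("insert", 2, 7)
found = obj.invoke("spatial-search", 2, 7)
```

A client never needs to know which structure model backs `obj`; it asks for the operation list first and then calls whatever it finds there.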

  14. MMX Query System • To show that the schema, modeling, and processing of the MMX extension are workable • To illustrate how it assists in querying XML data • To serve as a platform for testing implementations of arbitrary structured models • Implemented with JDK 1.4

  15. System Design • [Diagram components: Clients, XML DOM, MMXDocument, Node, Data, Schema, MMX Element, ParseSchema, FetchClasses, AbstractMMX Element, and concrete elements R-Tree, X-Tree, VP-Tree, …; edges labelled “Extends class”, “(Partly) Defines”, and “Maps” (R-Tree Schema to R-Tree)] • The abstract class defines the common interface that has to be implemented in each MMX Element, such as LIST-OPERATION, QUERY, BUILD, etc.
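
The original system was written in Java; the outline below uses Python only to sketch the shape of the design and is not the author's code. The method names mirror the operations named on the slide, while `REGISTRY`, `fetch_class` and the list-backed `RTreeElement` are assumptions; in particular, the placeholder class does not implement a real R-tree.

```python
from abc import ABC, abstractmethod


class AbstractMMXElement(ABC):
    """Common interface every structure model has to implement."""

    @abstractmethod
    def build(self, records):
        """BUILD: load the data of one object instance."""

    @abstractmethod
    def list_operations(self):
        """LIST-OPERATION: names of the supported operations."""

    @abstractmethod
    def query(self, method, params):
        """QUERY: run one named operation."""


class RTreeElement(AbstractMMXElement):
    """Placeholder model: keeps points in a list, not in a real R-tree."""

    def build(self, records):
        self._points = [tuple(r) for r in records]

    def list_operations(self):
        return ["exact-match"]

    def query(self, method, params):
        if method != "exact-match":
            raise ValueError("unsupported method: " + method)
        return [p for p in self._points if p == tuple(params)]


# Hypothetical counterpart of the FetchClasses step: the model name found
# in the schema selects the implementing class.
REGISTRY = {"r-tree": RTreeElement}


def fetch_class(model_name):
    return REGISTRY[model_name]


element = fetch_class("r-tree")()
element.build([(2, 7), (5, 4)])
hits = element.query("exact-match", (2, 7))
```

Adding an X-Tree or VP-Tree model would then mean one more subclass and one more registry entry, which matches the "additional implementation effort" noted among the cons.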

  16. Discussion - Pros • Compatible with the relational approach, and supersedes it • Modular design promotes reusability and maintainability • XML “flattens” legacy structured data, making it text-editable and easy to transport and process by different systems

  17. Discussion - Cons • There is no generic syntax that precisely describes all kinds of structure models • An XML file is often larger than the legacy data file • Each structure model needs additional implementation effort • The schema specification grows quickly as the number of supported models increases

  18. Conclusion • Propose a representation to encapsulate data structures • Describe XML data with the Schema conceptually as well as syntactically • Map legacy structure models into Schema, and map XML data to the structure models by the Schema • Structured data repository with increased interoperability, reusability, and transportability

  19. Q&A
