
CS4221 Presentation

CS4221 presentation by Group P08 on the XML Semi Structure Extractor project. Project members: Tran Duy Thien (A0096031M), Nguyen Thi Mai Huong (A0075106M), Truong Hoang Phuoc (A0074527B), Daniyar Kosmukhanbetov (A0075100Y).



  1. CS4221 Presentation • Presentation Group P08

  2. P08 – XML Semi Structure Extractor • Project: XML Semi Structure Extractor • Project Members: • Tran Duy Thien, A0096031M • Nguyen Thi Mai Huong, A0075106M • Truong Hoang Phuoc, A0074527B • Daniyar Kosmukhanbetov, A0075100Y

  3. XML • While HTML is for presentation, XML is for data. • Semi-structured • User-defined tags • HTML-like tree structure of tags • Created and consumed by applications • XML vs. RDBMS • More flexible structure • Schema can be changed more easily

  4. DTD & XML Schema • Two methods to capture the semi-structure of XML • DTD is part of the original XML specification. • XML Schema (XSD) provides a more detailed and powerful way to capture the structure of XML, • but it is often criticized for its complexity.

  5. Web-based application and its 3-tier architecture • Web-based application: client-server application • 3-tier architecture • Presentation Layer • Business Layer • Service Layer

  6. Presentation Layer • UI content is built from Facelets (XHTML), CSS and JS. • Configuration items • web.xml: configures the application settings and contexts • faces-config.xml: configuration specific to the JSF/Facelets platform • persistence.xml: database configuration • Data is entered by the user and captured in various JSF components.

  7. Business Layer • Maps data transferred from the UI level to programming-level items. • Handles business logic, processing data passed from the UI level. • Handles file upload and file download. • Consists of managed beans and their components. • Invokes services from the Service Layer to perform operations. • Handles exceptions and passes their messages back to the UI.

  8. Service Layer • Also known as the Data Layer. • Provides services to the Business Layer. • Handles data processing, writing to and reading from backend data storage. • Contains logic to process low-level data forms. • Exceptions are thrown to the Business Layer.
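
The exception flow between the Business Layer and the Service Layer described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class names (`DocumentService`, `UploadBean`, `StorageException`) are hypothetical, an in-memory map stands in for the backend storage, and an ordinary class stands in for a JSF managed bean.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; plain Java stands in for JSF managed beans.
final class LayerSketch {

    /** Checked exception the service layer throws to the business layer. */
    static class StorageException extends Exception {
        StorageException(String message) {
            super(message);
        }
    }

    /** Service (data) layer: reads from and writes to the backing store. */
    static class DocumentService {
        // Assumption: an in-memory map replaces the real backend storage.
        private final Map<String, String> store = new HashMap<>();

        void save(String name, String xml) throws StorageException {
            if (xml == null || xml.trim().isEmpty()) {
                throw new StorageException("Document '" + name + "' is empty");
            }
            store.put(name, xml);
        }
    }

    /** Business layer: invokes the service and turns failures into UI messages. */
    static class UploadBean {
        private final DocumentService service = new DocumentService();
        private String message = "";

        void upload(String name, String xml) {
            try {
                service.save(name, xml);
                message = "Stored " + name;
            } catch (StorageException e) {
                // The exception stops here; only its message travels to the UI.
                message = "Upload failed: " + e.getMessage();
            }
        }

        String getMessage() {
            return message;
        }
    }

    public static void main(String[] args) {
        UploadBean bean = new UploadBean();
        bean.upload("books.xml", "<books/>");
        System.out.println(bean.getMessage());
        bean.upload("empty.xml", "");
        System.out.println(bean.getMessage());
    }
}
```

In a JSF application the page would read `getMessage()` through the bean, so the UI never sees the service-level exception type.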

  9. Storing XML • The XML file is uploaded to the server. • XML data and structure are broken down and stored in a DOM-based data structure. • Front-end JavaScript ensures the uploaded document is of XML type. • Back-end logic enforces well-formed and valid XML documents.
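
A minimal sketch of the back-end check, assuming the JDK's built-in DOM parser: the uploaded bytes are parsed into a `Document`, and anything that is not well-formed is rejected. The class and method names are hypothetical. Validity against a DTD or schema is not checked here, since the slides do not say how the project does it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class XmlStoreSketch {

    /** Parses uploaded bytes into a DOM tree; empty if the input is not well-formed. */
    static Optional<Document> parse(byte[] uploaded) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            return Optional.of(builder.parse(new ByteArrayInputStream(uploaded)));
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        byte[] good = "<library><book id=\"b1\"/></library>".getBytes();
        byte[] bad = "<library><book></library>".getBytes();
        System.out.println(parse(good).isPresent()); // well-formed input
        System.out.println(parse(bad).isPresent());  // mismatched tags
    }
}
```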

  10. Analyze and write DTD • Main data structures: TreeMap and Stack. • Parsing is done using XMLReader. • The writing process follows the tree structure to navigate the XML elements. • Each element is written, followed by its list of attributes. • The schema is stored on the server and a download link is displayed to the user.
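
The approach above might look roughly like the following sketch, which uses the JDK's SAX `XMLReader`, sorted maps and a stack of open elements. It is not the project's code: the class name is hypothetical, every content model is deliberately loose (`(a | b)*`), every attribute is declared `CDATA #IMPLIED`, and declarations are written in alphabetical order rather than document order.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class DtdSketch extends DefaultHandler {
    // Sorted maps keep the generated DTD deterministic.
    private final Map<String, Set<String>> children = new TreeMap<>();
    private final Map<String, Set<String>> attributes = new TreeMap<>();
    private final Set<String> withText = new TreeSet<>();
    // Stack of open elements; the top is the parent of the next start tag.
    private final Deque<String> open = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        children.computeIfAbsent(qName, k -> new TreeSet<>());
        Set<String> names = attributes.computeIfAbsent(qName, k -> new TreeSet<>());
        for (int i = 0; i < atts.getLength(); i++) {
            names.add(atts.getQName(i));
        }
        if (!open.isEmpty()) {
            children.get(open.peek()).add(qName);
        }
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Whitespace-only runs between tags do not count as text content.
        if (!open.isEmpty() && !new String(ch, start, length).trim().isEmpty()) {
            withText.add(open.peek());
        }
    }

    /** Writes each element declaration followed by its attribute list. */
    String toDtd() {
        StringBuilder dtd = new StringBuilder();
        for (Map.Entry<String, Set<String>> e : children.entrySet()) {
            String name = e.getKey();
            Set<String> kids = e.getValue();
            String model;
            if (kids.isEmpty()) {
                model = withText.contains(name) ? "(#PCDATA)" : "EMPTY";
            } else {
                String prefix = withText.contains(name) ? "#PCDATA | " : "";
                model = "(" + prefix + String.join(" | ", kids) + ")*";
            }
            dtd.append("<!ELEMENT ").append(name).append(' ').append(model).append(">\n");
            for (String att : attributes.get(name)) {
                dtd.append("<!ATTLIST ").append(name).append(' ').append(att)
                   .append(" CDATA #IMPLIED>\n");
            }
        }
        return dtd.toString();
    }

    static String infer(String xml) {
        try {
            DtdSketch handler = new DtdSketch();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.toDtd();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(infer(
            "<library><book id=\"b1\"><title>XML</title></book><book id=\"b2\"/></library>"));
    }
}
```

A real extractor would also have to infer cardinality (`?`, `+`, `*`) and child order, which this sketch sidesteps by allowing any number of children in any order.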

  11. Analyze and write XSD • The document tree is built using XOM (XML Object Model). • Elements are processed recursively, using the document tree. • The application attempts to capture ID/IDREF relations under the “foreignrelation” attribute.
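
The recursive processing can be sketched as follows. Note the substitution: this sketch walks the JDK's built-in DOM tree rather than XOM, so that it runs without third-party libraries. The class name is hypothetical, and several simplifications are assumed: only the first occurrence of each child element name is inspected, children are allowed in any order and number, and every attribute and leaf is typed `xs:string`. The "foreignrelation" attribute is not reproduced, since the slides do not specify its format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class XsdSketch {

    /** Recursively writes an xs:element declaration for the given element. */
    static void declare(Element el, String indent, StringBuilder out) {
        // Keep the first occurrence of each distinct child element name.
        Map<String, Element> kids = new LinkedHashMap<>();
        boolean hasText = false;
        for (Node n = el.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                kids.putIfAbsent(n.getNodeName(), (Element) n);
            } else if (n.getNodeType() == Node.TEXT_NODE
                    && !n.getNodeValue().trim().isEmpty()) {
                hasText = true;
            }
        }
        NamedNodeMap atts = el.getAttributes();
        String open = indent + "<xs:element name=\"" + el.getNodeName() + "\"";
        if (kids.isEmpty() && atts.getLength() == 0) {
            // Leaf without attributes: a plain string is enough.
            out.append(open).append(" type=\"xs:string\"/>\n");
            return;
        }
        out.append(open).append(">\n");
        out.append(indent).append("  <xs:complexType")
           .append(hasText ? " mixed=\"true\"" : "").append(">\n");
        if (!kids.isEmpty()) {
            out.append(indent)
               .append("    <xs:choice minOccurs=\"0\" maxOccurs=\"unbounded\">\n");
            for (Element kid : kids.values()) {
                declare(kid, indent + "      ", out); // recursion follows the tree
            }
            out.append(indent).append("    </xs:choice>\n");
        }
        for (int i = 0; i < atts.getLength(); i++) {
            out.append(indent).append("    <xs:attribute name=\"")
               .append(atts.item(i).getNodeName()).append("\" type=\"xs:string\"/>\n");
        }
        out.append(indent).append("  </xs:complexType>\n");
        out.append(indent).append("</xs:element>\n");
    }

    static String infer(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            StringBuilder out = new StringBuilder(
                    "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">\n");
            declare(root, "  ", out);
            return out.append("</xs:schema>\n").toString();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(infer(
            "<library><book id=\"b1\"><title>XML</title></book><book id=\"b2\"/></library>"));
    }
}
```

Because only the first occurrence of each child name is inspected, a later sibling with extra children or attributes would not be covered; merging all occurrences is the obvious next refinement.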

  12. Explicit vs. Implicit Relationship • The application can capture foreign key relationships via ID and IDREF(S), and even mark them. • Implicit relationships are much more difficult to discover. • They require understanding the semantics and role of each entity. • One approach: a matrix of relationships between entities. • Due to time constraints, this is left as a future improvement.
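
For the explicit case, one possible heuristic is sketched below. It is an assumption, not the project's method: without a DTD no attribute is typed as ID or IDREF, so the sketch treats any attribute named `id` as an ID and reports every other attribute whose whitespace-separated tokens all match known ids as a candidate IDREF/IDREFS reference. The class name and the output format are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class IdRefSketch {

    /** Returns sorted entries of the form "element/@attribute -> targetElement". */
    static List<String> findReferences(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        NodeList all = doc.getElementsByTagName("*");

        // Pass 1: map each id value to the name of the element that owns it.
        // Assumption: an attribute literally named "id" plays the role of an ID.
        Map<String, String> owners = new HashMap<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            if (el.hasAttribute("id")) {
                owners.put(el.getAttribute("id"), el.getNodeName());
            }
        }

        // Pass 2: an attribute is a candidate reference if all its tokens are ids.
        Set<String> found = new TreeSet<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            NamedNodeMap atts = el.getAttributes();
            for (int j = 0; j < atts.getLength(); j++) {
                Node att = atts.item(j);
                if (att.getNodeName().equals("id")) {
                    continue;
                }
                Set<String> targets = new TreeSet<>();
                boolean allKnown = true;
                for (String token : att.getNodeValue().trim().split("\\s+")) {
                    String owner = owners.get(token);
                    if (owner == null) {
                        allKnown = false;
                        break;
                    }
                    targets.add(owner);
                }
                if (allKnown) {
                    for (String target : targets) {
                        found.add(el.getNodeName() + "/@" + att.getNodeName()
                                + " -> " + target);
                    }
                }
            }
        }
        return new ArrayList<>(found);
    }

    public static void main(String[] args) {
        String xml = "<shop><customer id=\"c1\"/><customer id=\"c2\"/>"
                + "<order id=\"o1\" customer=\"c1\"/>"
                + "<order id=\"o2\" customer=\"c2\" related=\"o1 c1\"/></shop>";
        findReferences(xml).forEach(System.out::println);
    }
}
```

The heuristic can report false positives when an ordinary attribute value happens to equal an id, and it says nothing about implicit relationships, which still need knowledge of what each entity means.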

  13. Demonstration
