The Role of XML in Cloud Data Integration

The Role of XML in Cloud Data Integration. Presenter: David RR Webber, Oracle Corporation. October 15th, 2010.


Presentation Transcript


  1. The Role of XML in Cloud Data Integration. Presenter: David RR Webber, Oracle Corporation. October 15th, 2010.

  2. Introduction. Cloud services introduce new challenges for information sharing. While certain aspects are familiar and yet still unresolved, new techniques can be utilized. A canonical approach underpinning Cloud data exchanges is vital to ensure consistent understanding and enhanced interoperability. Developers face many challenges and complexities in using today’s industry standards for information exchanges. How can we simplify this and rapidly develop consistent and conforming information exchanges in the Cloud? Open source tools will be discussed and government and industry example exchanges presented.

  3. Agenda. Challenges, New and Old (why a canonical approach; adaptive, agile and context aware infrastructure; avoiding the n(n-1)/2 dilemma). Ensuring Simplicity at the Foundation (never underestimate the ability of engineers to add complexity). Open source and open standard solutions (examples from the emergency management domain with NIEM*). Summary and Q&A. * National Information Exchange Model (NIEM) approach.

  4. Challenges, New and Old. Why a Canonical Approach? Adaptive, agile and context aware infrastructure. Avoiding the n(n-1)/2 dilemma.

  5. Why a Canonical Approach? Traditional XML information exchange has been schema driven, which raises many issues for Cloud integration: W3C XML Schema is inflexible, static, brittle, localized and expensive. Canonical dictionaries exploit the Cloud approach with distributed availability, flexible collaboration, and dynamic updating and referencing. The Amazon Web Services (AWS) catalog is an example of the dynamic approach. Canonical dictionaries provide the components that underpin the exchanges while leaving the precise exchange formatting open for implementers: the “what”, not the “how”. Content can change hourly on AWS! A neutral XML-based syntax is future proof.

  6. Canonical XML dictionary. A collection of distinct components that represent discrete business information for an application domain. It includes singleton components and combinations of related components together as sub-assemblies. Information is represented in a simple, neutral conceptual data format that captures the critical concepts about the data, e.g. name, description, content type, contextual usage pattern, hierarchy. Wikipedia definition: http://en.wikipedia.org/wiki/Canonical#Computer_science

  7. Baking in Interoperability. Using consistent component definitions dramatically improves interoperability and reuse. Having formal design methods makes development faster, easier, predictable and repeatable. Aligning local practice to an industry domain dictionary can reduce complexity and reinforce best practices. Dictionary definitions can be automatically evaluated for common mistakes, which reduces the opportunity for errors during the design phase. Generating software artifacts from neutral dictionary definitions ensures reliable information exchange results across user communities and their particular systems, platforms and tools.
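The automated evaluation mentioned on this slide can be pictured with a small sketch. It is not the CAM toolkit's NDR checker; the three rules (UpperCamelCase names, no duplicate names, non-empty definitions) and the dictionary layout as a list of Python dicts are assumptions chosen for illustration.

```python
import re

# Hypothetical naming rule: UpperCamelCase, no spaces or underscores.
# (Deliberately simplified: acronyms such as "ID" would be rejected.)
UPPER_CAMEL = re.compile(r"^[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)*$")


def check_dictionary(components):
    """Return (name, problem) pairs found in a list of component dicts."""
    problems = []
    seen = set()
    for comp in components:
        name = comp.get("name", "")
        if not UPPER_CAMEL.match(name):
            problems.append((name, "name is not UpperCamelCase"))
        if name in seen:
            problems.append((name, "duplicate component name"))
        seen.add(name)
        if not comp.get("definition", "").strip():
            problems.append((name, "missing definition"))
    return problems


# Illustrative dictionary entries, not taken from any real dictionary.
sample = [
    {"name": "PersonName", "definition": "The name of a person."},
    {"name": "person_name", "definition": "Same concept, wrong casing."},
    {"name": "LastName", "definition": ""},
]

for name, problem in check_dictionary(sample):
    print(f"{name}: {problem}")
```

Running checks like these at design time is one way the "common mistakes" could be caught before any schema or code is generated from the dictionary.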

  8. Neutral Content Model Representation. Neutral representations allow business stakeholders to participate in dictionary development without technology barriers. Concise neutral formats can be viewed as simple spreadsheets, as they have no special syntax dependencies. Based on open public standard specifications, semantic concepts and leading knowledge domain techniques. Neutral representation prevents lock-in by vendor, syntax, tooling or platform. Maximizes flexibility and future proofing of dictionary definitions.

  9. Linguistic and Semantic Alignment. Formal community domain naming and design rules provide consistency of definitions. Consistency of definitions minimizes duplication and overlapping of dictionary components. Dictionaries allow collaboration on component development to improve the overall results. Formal component content detail drives alignment. Design best practices ensure logical, self-contained components that can be selected contextually. This avoids an explosion of complexity and excessive over-definition (e.g. “kitchen-sink” schema).

  10. What is a Canonical Approach? There are several flavors of canonical approaches, some more complex than others, e.g. UBL vs. OAGi vs. CCTS. Avoid dependence on W3C XML Schema mechanisms. The Core Components Technical Specification (CCTS) uses simple components with a basic hierarchy: parent components with child entities and/or components, and associated attributes that denote context and related factors. In CCTS parlance these are ABIE, BBIE and ASBIE: Parent = Aggregate Business Information Entity; Child = Basic Business Information Entity; Attribute = Association Business Information Entity.

  11. Conceptual Information Model. Follows Naming and Design Rule (NDR) principles and guidelines. [Diagram: a canonical components dictionary in XML. Each compound component is a Parent (ABIE) item; each atomic component is a Child (BBIE) item; Attributes (ASBIE) are optional attributes of a component.] ebXML CCTS terms (ABIE, BBIE, ASBIE): Parent = Aggregate Business Information Entity; Child = Basic Business Information Entity; Attribute = Association Business Information Entity. * CCTS: Core Components Technical Specification.

  12. Example – Person Name. Person Name (ABIE): Language Code (ASBIE), Verified Details? (ASBIE), Has Alias? (ASBIE), First Name (BBIE), Middle Name (BBIE), Last Name (BBIE), Previous Name? (ASBIE). Language Code may exist independently of Person Name. Verified Details and Previous Name are flags that denote additional information about the entity they are associated with. Component items have three aspects: structure relationships, content rules, and definitions. Naming and Design Rules (NDR) are also important in ensuring shorter names that are not tied to a specific context, e.g. compare PersonName to IncidentPersonName.
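To make the Person Name example concrete, here is a minimal sketch that writes the component out as neutral XML. The element and attribute names (`Component`, `Child`, `Attribute`, `name`, `type`) are invented for illustration and are not the CAM dictionary format; only the component names and their ABIE/BBIE/ASBIE roles come from the slide.

```python
import xml.etree.ElementTree as ET


def build_person_name():
    """Build the Person Name parent component with its children and attributes."""
    # Parent component (ABIE). The XML vocabulary here is hypothetical.
    abie = ET.Element("Component", {"name": "PersonName", "type": "ABIE"})

    # Associated attributes (ASBIE) that carry context about the component.
    for attr in ("LanguageCode", "VerifiedDetails", "HasAlias", "PreviousName"):
        ET.SubElement(abie, "Attribute", {"name": attr, "type": "ASBIE"})

    # Atomic child items (BBIE) that hold the actual content.
    for child in ("FirstName", "MiddleName", "LastName"):
        ET.SubElement(abie, "Child", {"name": child, "type": "BBIE"})

    return abie


person_name = build_person_name()
print(ET.tostring(person_name, encoding="unicode"))
```

Because the representation is plain data, the same component could just as easily be shown as one spreadsheet row per item, which is the point of keeping the dictionary syntax-neutral.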

  13. Methods for creating a Canonical Dictionary. Harvest from a collection of domain exchange schemas. Export from a SQL database to schema; harvest; rename. Export from a modelling tool to schema; harvest; rename. Create manually in XML or a spreadsheet.
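The "harvest from schema" method can be sketched roughly as follows. This is not how the CAM toolkit performs harvesting; it is a simplified illustration that walks a W3C XML Schema with Python's standard library and treats any element with an inline complex type as a parent (ABIE) and any other named element as a child (BBIE). The sample schema and that classification rule are assumptions.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Tiny illustrative exchange schema (hypothetical).
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="PersonName">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string"/>
        <xs:element name="LastName" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def harvest(xsd_text):
    """Collect named element declarations as candidate dictionary entries."""
    root = ET.fromstring(xsd_text)
    entries = []
    for elem in root.iter(XS + "element"):
        name = elem.get("name")
        if name is None:  # skip ref="..." usages, they declare nothing new
            continue
        is_parent = elem.find(XS + "complexType") is not None
        entries.append({
            "name": name,
            "kind": "ABIE" if is_parent else "BBIE",
            "content_type": elem.get("type"),
        })
    return entries


for entry in harvest(SAMPLE_XSD):
    print(entry)
```

The renaming step listed on the slide would follow the harvest, for example applying naming and design rules to the collected names before they enter the dictionary.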

  14. Sample Dictionary Building Processes. [Diagram. Legend: automated and manual steps. Option 1, from an Enterprise Data Model (EDM): import the EDM model components (collection of objects from the model); export components in XSD syntax; import the XSD and refactor for use with OASIS CAM (OASIS CAM template); NDR evaluation, refactor and renaming tool; analyst review, applying Naming and Design Rule (NDR) checks and edits; generate the standard components dictionary XML, ebXML CCTS compatible (ABIE, BBIE, ASBIE), giving a dictionary of exchange components. Option 2, derive from existing exchange XSD schema: import each exchange XSD schema and merge into the CAM dictionary (CAM templates); NDR evaluation, refactor and renaming tool; analyst review; merge and generate the dictionary XML, ebXML CCTS compatible (ABIE, BBIE, ASBIE), giving a dictionary of exchange components.]

  15. Ensuring Simplicity at the Foundation. Never underestimate the ability of engineers to add complexity.

  16. Adaptive, agile and context aware infrastructure. An XML validation framework that is configurable dynamically through the use of XML templates and rules. “In today's complex information exchanges with XML and associated large XSD schema, coupled with an array of trading partners, it becomes a significant challenge to support and maintain accurate handling of all incoming transactions”. “With a more adaptive and fault tolerant process, the application is able to handle a wider variation in content and, hence, more easily support a broad set of interaction partners with reduced support and maintenance costs”. Source: http://www.ibm.com/developerworks/library/x-camval/index.html
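The idea of a validator driven by templates and rules, rather than by a fixed schema, might look something like the sketch below. It does not use the CAM engine or its template syntax; the rule table, the `Order` message and the policy of downgrading unknown elements to warnings are all assumptions made to illustrate "adaptive and fault tolerant" handling.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule template: "Root/Child" path -> rule. Held as data so it
# can be swapped per partner or context without changing code.
TEMPLATE = {
    "Order/PartNumber": {"required": True, "mask": r"[A-Z]{2}\d{4}"},
    "Order/Quantity": {"required": True, "mask": r"\d+"},
    "Order/Note": {"required": False, "mask": r".*"},
}


def validate(xml_text, template):
    """Return (errors, warnings) for a document checked against the rules."""
    root = ET.fromstring(xml_text)
    errors, warnings = [], []
    known = set()
    for path, rule in template.items():
        root_tag, child_tag = path.split("/")
        known.add(child_tag)
        node = root.find(child_tag) if root.tag == root_tag else None
        if node is None:
            if rule["required"]:
                errors.append(f"missing {path}")
            continue
        if not re.fullmatch(rule["mask"], node.text or ""):
            errors.append(f"bad content in {path}: {node.text!r}")
    # Fault tolerance: unknown content is reported, not rejected outright.
    for child in root:
        if child.tag not in known:
            warnings.append(f"unexpected element {child.tag} ignored")
    return errors, warnings


doc = ("<Order><PartNumber>AB1234</PartNumber>"
       "<Quantity>two</Quantity><Color>red</Color></Order>")
errors, warnings = validate(doc, TEMPLATE)
print(errors)
print(warnings)
```

Keeping the rules as data is what allows them to be reconfigured dynamically; a different trading partner could be served by loading a different rule table into the same function.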

  17. Avoiding the n(n-1)/2 dilemma. New XML validation framework. Automotive parts repair with STARBOD example. Utilizing a validation framework with singleton validation templates that are context rule driven. Source: http://www.ibm.com/developerworks/library/x-camval/index.html#figure2
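Assuming the dilemma named on this slide is the familiar count of pairwise point-to-point mappings between n partners, n(n-1)/2, the short sketch below compares it with a canonical approach in which each partner maps once to the shared form. The function names are illustrative, and the "one mapping per partner" figure is a simplifying assumption.

```python
def point_to_point_mappings(n):
    """Distinct partner pairs when every partner maps directly to every other."""
    return n * (n - 1) // 2


def canonical_mappings(n):
    """Mappings needed when each partner maps only to the canonical form."""
    return n


for n in (3, 10, 50):
    print(f"{n} partners: {point_to_point_mappings(n)} pairwise, "
          f"{canonical_mappings(n)} via canonical form")
```

The pairwise count grows quadratically while the canonical count grows linearly, which is the motivation for routing exchanges through shared, context-driven templates.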

  18. Agile Solution Components. [Diagram, steps numbered 1 to 4. Legend: automated and manual steps. Elements shown: canonical dictionaries (industry dictionary and domain dictionary, each formatted as XML) in a definitions repository (XML); the blueprint toolkit with the Exchange Designer tool user interface, used to pick components (wantlist), review and build the structure assembly; business context rules, content hints and optional WSDL actions; generated templates and XML Schema; the agile validation engine (CAMV engine) serving domain applications; and a unit test harness that tests the exchange structure with realistic XML exchange test examples and interchanges.]

  19. Leveraging Cloud Deployment strengths. Collaboration tools for sharing canonical component dictionaries. Repositories of templates and code lists. Fault tolerant deployment architectures with redundancy. Machine accessible APIs to allow real time updates and propagation of changes. Standards based implementations that provide open access. Open source resources for shared implementation support.

  20. Open Source and Open Standard solutions. Examples from the Emergency Management domain with NIEM, OASIS EDXL, LEXS.

  21. Example Emergency Management Scenario. Emergency Response Services workflow using OASIS EDXL exchanges. Haiti demonstrated the need for agile exchanges to cope rapidly with unfamiliar scenarios and environment changes. Cloud-based sharing of open, adaptive common infrastructure components.

  22. Top Down Solution Approach. [Diagram, steps numbered 1 to 7. Legend: automated and manual steps. Elements shown: build the Enterprise Data Model (EDM); import and refactor it for use with CAM; exchange components in a dictionary repository (industry dictionaries and a local domain dictionary, formatted as XML); the Exchange Blueprint Designer user interface, used to pick components into a structure outline blueprint; exchange generator tools (CAM) that expand the structure into an exchange structure; the exchange package components definition (XML); and the target applications.]

  23. Assembling Components from dictionaries. Determine your business information exchange components at the conceptual level. Search and locate candidate components from appropriate domain dictionary collections. Catalogue the parts to be used. Dictionary components can be referenced individually or as collections by an assembly blueprint that puts them all together to create a complete information exchange. Components can be selected from multiple dictionaries, including multiple physical dictionary files. Note any new extension pieces as needed. Blueprints themselves also have high re-use value: they can be sub-assemblies and patterns, not just exchange models.
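The assembly steps on this slide can be sketched with plain data structures. This is only an illustration of the idea, not the CAM blueprint format: the `dictionary:Component` reference syntax, the dictionary contents and the `local:ShelterCapacity` extension are all hypothetical.

```python
# Hypothetical dictionaries: component name -> list of child item names.
DICTIONARIES = {
    "core": {
        "PersonName": ["FirstName", "MiddleName", "LastName"],
        "Address": ["Street", "City"],
    },
    "emergency": {
        "Incident": ["IncidentDate", "IncidentLocation"],
    },
}

# A blueprint catalogues the parts to use, drawn from several dictionaries.
BLUEPRINT = ["emergency:Incident", "core:PersonName", "local:ShelterCapacity"]


def expand(blueprint, dictionaries):
    """Resolve blueprint references; unresolved ones are noted as extensions."""
    exchange, extensions = {}, []
    for ref in blueprint:
        dict_name, component = ref.split(":")
        children = dictionaries.get(dict_name, {}).get(component)
        if children is None:
            extensions.append(ref)  # new piece, to be defined locally
        else:
            exchange[component] = list(children)
    return exchange, extensions


exchange, extensions = expand(BLUEPRINT, DICTIONARIES)
print(exchange)
print(extensions)
```

Since the blueprint is just a list of references, it can be reused as a sub-assembly in a larger blueprint, which matches the re-use point made on the slide.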

  24. Example Assembly Blueprint Outlines. [Diagram: two blueprint outlines, the LEXS messaging blueprint and the OASIS EDXL HAVE message, with callouts: reusable messaging envelope constructs; business functional components; message handling, delivery and control; individual component; payload goes here; top level sets of business information components.] These examples are available from the CAM editor install package: ~ CAMeditor\eclipse\workspace\CAMEditor\dictionary\blueprints\ LEXS: Law Enforcement eXchange System, http://www.lexs.gov

  25. Exchange Development Process Tools. [Diagram, steps numbered 1 to 5. Elements shown: component definitions from an Excel domain dictionary and from an industry dictionary web tool; Blueprint Designer; search tools; expander tool (insert dictionary parent components); completed exchange template.]

  26. Summary and Q&A. Review. Resource links.

  27. Summary. Canonical XML component dictionaries. Neutral representation of components. Deployment to target environments and architectures. Collaborative development and open source. Uses open public standards and government guidelines (NIEM). Available resources and tools. Illustrative use cases. Leverage strengths of cloud-based collaboration resources.

  28. Resources. Resource links. Supporting supplemental slides.

  29. Links and Resources. Downloads: CAM Toolkit download (https://sourceforge.net/projects/camprocessor). Supporting materials: NIEM Naming and Design Rules (NDR) 1.3 (http://www.niem.gov/pdf/NIEM-NDR-1-3.pdf). Resources: UN/CEFACT Core Components Technical Specification (http://www.unece.org/cefact/ebxml/CCTS_V2-01_Final.pdf). Tutorials: wiki.oasis-open.org/cam/CAM_Tutorials. Specifications: www.oasis-open.org/committees/cam, docs.oasis-open.org/cam, www.oasis-open.org/committees/emergency. NIEM site: www.niem.gov. LEXS site: www.lexs.gov.

  30. Available XML Dictionaries. LEXS 3.1.4 dictionary; OASIS EDXL dictionary; OASIS EML dictionary; NIEM 2.1 dictionaries (CBRN, Emergency, Family, Immigration, Infrastructure, Intelligence, Justice, Maritime, Screening, Trade, and NIEM core); Immigration blueprint. Packaged with the CAM editor: see the dictionary folder of the install, plus spreadsheet and blueprint samples. Note: those marked in bold on the original slide are model style dictionaries with recursive components. Available from the download site, direct link: http://sourceforge.net/projects/camprocessor/files (includes spreadsheets and sample blueprint).

  31. Conceptual Information View. [Diagram: CAM toolkit processing of dictionary components and domain data components; apply tools in the desktop CAM toolkit editor. A CAM template holds structure, rules and definitions. Items (ABIE, BBIE, ASBIE) carry Properties (Name, Unique ID, Component Type, Cardinality, Content Type, Content Mask, Children), with further labels: Group; Structure; Context; Where from; Definition; Rules; Language, Label, Notes. Required items are shown in blue on the original slide.]

  32. XML View of Dictionary Content. [Screenshot with callouts: Name; Unique ID; Component Type; Cardinality; Content Type; Content Mask; Items; Parent / Child linkage where referenced.] * See slide notes for explanation.

  33. Excel Spreadsheet View. [Screenshot with callouts: Type (ABIE, BBIE); properties as columns; children; an item per row.]

  34. Mapping to Dictionaries. You can compare a template of components to a dictionary: check within a domain for alignment to the dictionary; check between domains for interoperability; merge new/existing components with the dictionary. Matches on physical names. Reports matching items and details. Reports statistics and percentages of matching. Generates a crosswalk XML file. Compatible with Microsoft Excel. The report can be used to do spell checking.
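A comparison of this kind can be sketched in a few lines. The code below is not the CAM toolkit's mapping tool and does not reproduce its crosswalk file format; the `Crosswalk` and `Item` element names, the exact-match rule on names and the sample data are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def crosswalk(template_names, dictionary_names):
    """Match template item names against a dictionary and report as XML."""
    dictionary = set(dictionary_names)
    root = ET.Element("Crosswalk")  # hypothetical report vocabulary
    matched = 0
    for name in template_names:
        hit = name in dictionary  # exact match on the physical name
        if hit:
            matched += 1
        ET.SubElement(root, "Item", {"name": name, "matched": str(hit).lower()})
    total = len(template_names)
    percent = 100.0 * matched / total if total else 0.0
    # Summary statistics are stored on the root element.
    root.set("matched", str(matched))
    root.set("total", str(total))
    root.set("percent", f"{percent:.1f}")
    return root


report = crosswalk(
    ["PersonName", "IncidentDate", "ShelterCapacity", "LastName"],
    ["PersonName", "LastName", "Address"],
)
print(ET.tostring(report, encoding="unicode"))
```

Unmatched names in such a report are worth a second look: they may be genuinely new components, or simply misspellings of existing ones, which is how a crosswalk can double as a spell check.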

  35. Example cross-reference spreadsheet. Formatted view in Microsoft Excel of the imported cross-reference report details (from the generated XML file). Matched details: item and alignment, definition.
