Validation of HL7 v3 instances

Validation of HL7 v3 instances: A post-mortem from the caCIS Implementation • Dan Kokotov, Todd Parnell, 5AM Solutions

Who We Are / Acknowledgements • Enterprise Service Development (ESD) for the caCIS project • 5AM is one of three companies involved in ESD • Other disciplines on caCIS: A&A, QA, Deployment, Documentation, … • The content of this presentation is authored by 5AM – any errors or omissions are our sole responsibility • Acknowledgements • Architecture & Analysis – John Koisch, Paul Boyes, Jean-Henri Duteau, Lorraine Constable and others • Enterprise Service Development – SemanticBits and Agilex teams

Context – Architecture and Methodology • HL7 v3 using R2 datatypes • CDA (and possibly R1) in scope for project but out of scope for our solution • Roughly 70 RMIMs • Project-specific datatype specification with roughly 50 custom datatype flavors • Terminology Worksheet with mix of explicit and referential definitions for roughly 35 vocabulary types • XML ITS used at implementation layer • SOA infrastructure – SOAP services with WS-* • Roughly 40 WSDL interfaces in 13 functional areas • Project-specific contract/fault specification, governing reporting of business and system exceptional behavior, including contract tracing • SAIF specification methodology • A&A delivered CIM/PSM specs, in the form of RMIM models, Interface description, and accompanying documentation • Also included XSDs and WSDLs (non-normative but implied by V3 tooling and choice of ITS)

Context - Technology • Tech Stack • JSE 1.6, JEE 1.5 (JTA only) • JAX-WS, as implemented by CXF • JAXB • JPA, as implemented by Hibernate • Spring 3.0 • Tolven

Challenge • Validate incoming messages for compliance to model • Structural • Base datatype rules • Flavors* • Later in the project • Vocabulary* • Mostly beyond scope of this presentation • QA generated test cases based on PIM specs/models to validate compliance

Initial approach • Architecture: AP, AO, CO, CS • “RIM-inspired” application data model • AP<->AO: ORM (JPA2), AO<->CO: Bean mapping (Dozer), CO<->CS: XML Serialization (JAXB) • Validation • Schema validation – implicit from ITS, enforced via CXF interceptor • JPA Bean validation – “message-independent” invariants, enforced by JPA • e.g. “a patient must have a name” • “External” Bean validation – “message-specific” constraints, enforced by custom AOP interceptor • e.g. “order must have an identifier in the REPC_MT000001US RMIM”
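The split between the two bean-validation layers can be sketched in plain Java. This is only an illustration: the `Order` class, its fields, and the method names are hypothetical, and the project itself used JPA Bean validation and an AOP interceptor rather than hand-written checks like these.

```java
import java.util.ArrayList;
import java.util.List;

final class TwoLayerValidation {

    /** Hypothetical application object; the field names are illustrative only. */
    static final class Order {
        final String identifier;
        final String patientName;

        Order(String identifier, String patientName) {
            this.identifier = identifier;
            this.patientName = patientName;
        }
    }

    /** "Message-independent" invariant: checked no matter which message carried the data. */
    static List<String> checkInvariants(Order order) {
        List<String> errors = new ArrayList<String>();
        if (order.patientName == null || order.patientName.trim().isEmpty()) {
            errors.add("a patient must have a name");
        }
        return errors;
    }

    /** "Message-specific" constraint: applies only when the data arrived in the named RMIM. */
    static List<String> checkForMessage(String rmimId, Order order) {
        List<String> errors = new ArrayList<String>();
        if ("REPC_MT000001US".equals(rmimId) && order.identifier == null) {
            errors.add("order must have an identifier in " + rmimId);
        }
        return errors;
    }
}
```

An order without an identifier passes `checkInvariants` but fails `checkForMessage("REPC_MT000001US", order)`, which is the distinction the two layers were meant to capture.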

Initial Approach – with pictures

Initial approach - problems • Uncertainty on how to decide when something could be promoted as “message-independent invariant” • Occasional duplication between JPA Bean validation and “External” bean validation • Basic R2/ISO 21090 datatype validation • Would require extensive bean validation implementation • Vocabulary compliance required definition of explicit enumerated lists from worksheet • Was not sufficient for referential definitions

R2/ISO 21090 datatypes - solution • Leverage schematron definitions of constraints embedded in official iso_21090_types.xsd schema • To do so had to overcome several roadblocks and challenges: • Embedded schematron did not have any context, as XML ITS / ISO 21090 schema only defines XSD ComplexTypes for each datatype, not a standard element • XSLT2 supports a schema type axis, but no open source Java XSLT processor implements this • Therefore, have to define context as explicit OR of possible paths to the datatype from any message of a given SOAP service • Potential recursion in datatypes makes this very tricky • Embedded schematron did not use prefixes for element names, thus they were not bound to the HL7 namespace, and schematron does not permit binding the empty prefix to a namespace • Had to use a regular expression to inject a prefix to element names in schematron XPath expressions • Embedded schematron had a variety of typos/bugs • Fixed directly in the schema • Miscellaneous (ANY type, inheritance, bugs in Xerces’ XS Schema reader)
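Defining the context as an OR of paths amounts to a graph walk with a recursion guard. The sketch below assumes a deliberately simplified, hypothetical schema model (a map from complex type name to child elements) instead of a real XSD reader, and it ignores supertypes and xsi:type substitution. The guard stops at the first repeated type on a path, so deeper recursive nestings are not covered.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified schema model: complex type name -> child element declarations. */
final class DatatypePathFinder {

    /** One child element declared inside a complex type. */
    static final class Child {
        final String elementName;
        final String typeName;

        Child(String elementName, String typeName) {
            this.elementName = elementName;
            this.typeName = typeName;
        }
    }

    private final Map<String, List<Child>> childrenByType = new LinkedHashMap<String, List<Child>>();

    void declare(String typeName, Child... children) {
        childrenByType.put(typeName, Arrays.asList(children));
    }

    /** Collects every element path from the root element to an element typed as targetType. */
    List<String> pathsTo(String rootElement, String rootType, String targetType) {
        List<String> found = new ArrayList<String>();
        walk(rootElement, rootType, targetType, new ArrayDeque<String>(), found);
        return found;
    }

    private void walk(String path, String typeName, String targetType,
                      Deque<String> typesOnPath, List<String> found) {
        if (typeName.equals(targetType)) {
            found.add(path);
        }
        // Recursion guard: a type already on the current path is not expanded again.
        if (typesOnPath.contains(typeName)) {
            return;
        }
        typesOnPath.push(typeName);
        List<Child> children = childrenByType.get(typeName);
        if (children != null) {
            for (Child child : children) {
                walk(path + "/" + child.elementName, child.typeName, targetType, typesOnPath, found);
            }
        }
        typesOnPath.pop();
    }

    /** Joins the paths with the XPath union operator to form one rule context. */
    static String asContext(List<String> paths) {
        StringBuilder context = new StringBuilder();
        for (String path : paths) {
            if (context.length() > 0) {
                context.append(" | ");
            }
            context.append(path);
        }
        return context.toString();
    }
}
```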

Meet ExtractSchematron.java • Part of build-time toolchain to generate schematron for the ISO 21090 datatypes • Pseudocode: • Walk the iso-21090 schema, extract schematron annotations • “Fix” the schematron by injecting hl7: prefix to element names • Write the “abstract” schematron rule file with all the extracted schematron rules • Single sch:pattern called “abstract rules” • One sch:rule per datatype rule • Walk the service schemas, determine possible paths to a datatype • Must include paths to a datatype’s supertype, and account for abstract types which can have xsi:type declarations at runtime • Write the “concrete” schematron rule file which references the “abstract” rules • One sch:pattern per datatype rule, whose context is the OR of all possible paths to an element which is of that datatype • For the win – regexp for injecting hl7: prefix • (^|or |and |::|/|\\(|\\|)([^@naocmspxt()&\\.\\[\\\\=*+>!\\-0-9]|n(?!ot[ \\(])|a(?!nd[ \\(])|o(?!r[ \\(])|c(?!ount\\()|m(?!atches\\()|s(?!tring-length\\(|elf|tarts-with\\()|t(?!ext\\()|p(?!lain')|x(?!si:))
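The prefix-injection step can be tried in isolation. The pattern below is the regular expression from the slide, read as a Java string literal. The replacement string `$1hl7:$2` and the class and method names are assumptions about how it was applied, not taken from the project source.

```java
import java.util.regex.Pattern;

final class Hl7PrefixInjector {

    // Group 1: a token that can precede a name test (start of string, "or ", "and ", "::", "/", "(", "|").
    // Group 2: the first character of an element name. Attributes, digits, operators, and the
    // XPath keywords and functions listed in the lookaheads are skipped.
    private static final Pattern UNPREFIXED_NAME = Pattern.compile(
          "(^|or |and |::|/|\\(|\\|)"
        + "([^@naocmspxt()&\\.\\[\\\\=*+>!\\-0-9]"
        + "|n(?!ot[ \\(])|a(?!nd[ \\(])|o(?!r[ \\(])|c(?!ount\\()|m(?!atches\\()"
        + "|s(?!tring-length\\(|elf|tarts-with\\()|t(?!ext\\()|p(?!lain')|x(?!si:))");

    /** Inserts "hl7:" in front of each unprefixed element name (assumed replacement string). */
    static String injectPrefix(String xpath) {
        return UNPREFIXED_NAME.matcher(xpath).replaceAll("$1hl7:$2");
    }
}
```

Applied to `(@nullFlavor and not(any|low|high|width)) or (not(@nullFlavor) and (any|low|high|width))`, this yields the prefixed assertion shown for the IVL_PQ-0 rule. The expression is tuned to the XPath found in that one schema and is not idempotent: run on an already prefixed expression, it adds a second prefix.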

ExtractSchematron – the output
• “Abstract”
<sch:rule abstract="true" id="IVL_PQ-0">
  <sch:assert test="(@nullFlavor and not(hl7:any|hl7:low|hl7:high|hl7:width)) or (not(@nullFlavor) and (hl7:any|hl7:low|hl7:high|hl7:width))">null rules</sch:assert>
</sch:rule>
• “Concrete”
<sch:pattern name="concrete rules">
  <sch:rule context="ns0:buildTemplateResponse/responseEnvelope/hl7:subject2/hl7:sequenceNumber/hl7:uncertainty[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'PQ')]
      | ns0:buildTemplateResponse/responseEnvelope/hl7:subject2/hl7:priorityNumber/hl7:uncertainty[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'PQ')]
      | ns0:buildTemplate/templateParameter/hl7:parameterItem/hl7:value[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'QTY')][@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'PQ')]
      | ns0:buildTemplate/templateParameter/hl7:parameterItem/hl7:value[@xsi:type and fn:resolve-QName(@xsi:type, self::node())=fn:QName('urn:hl7-org:v3', 'PQ')]">
    <sch:extends rule="PQ-0"/>
  </sch:rule>
</sch:pattern>
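How a concrete rule of this shape might be assembled is sketched below. The class and method names are hypothetical. The predicate text and the `sch:extends` layout are copied from the output above, and the attribute escaping is an added precaution, not something the slides mention.

```java
import java.util.List;

final class ConcreteRuleWriter {

    /** Predicate selecting elements whose runtime xsi:type resolves to the given HL7 type. */
    static String xsiTypePredicate(String localTypeName) {
        return "[@xsi:type and fn:resolve-QName(@xsi:type, self::node())"
             + "=fn:QName('urn:hl7-org:v3', '" + localTypeName + "')]";
    }

    /** Renders one concrete rule whose context is the union of all paths to the datatype. */
    static String concreteRule(List<String> contextPaths, String abstractRuleId) {
        StringBuilder context = new StringBuilder();
        for (String path : contextPaths) {
            if (context.length() > 0) {
                context.append(" | ");
            }
            context.append(path);
        }
        return "<sch:rule context=\"" + escapeAttribute(context.toString()) + "\">\n"
             + "  <sch:extends rule=\"" + escapeAttribute(abstractRuleId) + "\"/>\n"
             + "</sch:rule>";
    }

    /** Minimal escaping for a double-quoted XML attribute value. */
    static String escapeAttribute(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }
}
```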

Integrating into the SOAP Stack • Applying the generated schematron at runtime • CXF Interceptor to apply the schematron • CXF Interceptor to detect if errors occurred and raise fault • Need two interceptors because we work at different spots in the CXF processing chain
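The two runtime steps, applying the rules and then detecting failures, can be sketched without CXF using only JDK classes. Everything here is an assumption made for illustration: the stylesheet is a small hand-written stand-in for a compiled Schematron rule set, it reports in SVRL-style elements, and an exception stands in for the SOAP fault. In the project, each step lived in its own CXF interceptor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class SchematronCheck {

    static final String SVRL_NS = "http://purl.oclc.org/dsdl/svrl";

    // Hand-written stand-in for a compiled rule set (assumption): one "null rules" check on hl7:value.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:hl7='urn:hl7-org:v3'"
        + " xmlns:svrl='" + SVRL_NS + "'>"
        + "<xsl:template match='/'>"
        + "<svrl:schematron-output><xsl:apply-templates select='//hl7:value'/></svrl:schematron-output>"
        + "</xsl:template>"
        + "<xsl:template match='hl7:value'>"
        + "<xsl:if test='not((@nullFlavor and not(hl7:low|hl7:high))"
        + " or (not(@nullFlavor) and (hl7:low|hl7:high)))'>"
        + "<svrl:failed-assert><svrl:text>null rules</svrl:text></svrl:failed-assert>"
        + "</xsl:if></xsl:template></xsl:stylesheet>";

    /** Step 1: run the rule stylesheet over the message and return the report as XML text. */
    static String applyRules(String messageXml) {
        try {
            Transformer rules = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter report = new StringWriter();
            rules.transform(new StreamSource(new StringReader(messageXml)), new StreamResult(report));
            return report.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not apply the rules", e);
        }
    }

    /** Step 2: count the failed assertions recorded in the report. */
    static int countFailedAsserts(String reportXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document report = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(reportXml)));
            return report.getElementsByTagNameNS(SVRL_NS, "failed-assert").getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not read the report", e);
        }
    }

    /** Both steps together; the exception plays the role of the SOAP fault. */
    static void rejectIfInvalid(String messageXml) {
        int failures = countFailedAsserts(applyRules(messageXml));
        if (failures > 0) {
            throw new IllegalArgumentException(failures + " Schematron assertion(s) failed");
        }
    }
}
```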

Results • Were now able to successfully validate for the built-in ISO 21090 datatype constraints • With shiny new schematron facility, decided to start using it for custom validation as well • But not 100% rosy • Slow (ish) • Memory intensive • Can cause problems with Xalan/Saxon on the Classpath

The monkey-wrench • Architecture change – switch to Tolven backend • Now RP, RO, CO, CS (kind of), still using conversion for RO <-> CO • No more JPA Bean validation • Still use some “External” bean validation • Datatype Specification added, project RMIMs start using flavored datatype • One sprint later, we had 200 QA bugs for flavor validation

Revised Architecture

How to validate datatype flavors? • Fully MIF-driven • We did not have time to build this • Some off the shelf stuff was available, but not on our platform • Write all the rules by hand • Seemed painful • What if we could leverage ExtractSchematron? • XML ITS does not have explicit types for flavors • But if we add them, we could annotate them with schematron rules and use ExtractSchematron to harvest them • So this is the approach we took

The approach • Each flavor derives from the base datatype by restriction • And have to do it for container types as well • Flavor definitions go into flavors.xsd, which imports iso-21090.xsd • Each flavor type is then annotated with schematron, just like iso-21090.xsd • Still have to write the actual schematron rules by hand based on Datatype specification • MIF representation not available, and OCL-schematron translation would be beyond our scope • RMIM Schema modified to reference the flavor XSD types • Because the derivation is by restriction, this is fully backwards compatible – valid instances look the same • For abstract types and flavors, can use either xsi:type or flavorId (for backward compatibility) to specify flavor
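An annotated flavor type, and the harvesting of its rule, might look like the sketch below. The flavor name `II.EXAMPLE`, its assertion, the annotation layout, and the Schematron namespace URI are all assumptions for illustration. The real flavors come from the project's Datatype specification and are not reproduced here. The schema is only parsed as XML, so the reference to iso-21090.xsd is left out.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FlavorRuleHarvester {

    static final String XS = "http://www.w3.org/2001/XMLSchema";
    static final String SCH = "http://purl.oclc.org/dsdl/schematron"; // assumed namespace URI

    // Hypothetical flavor: restricts the base II type and carries its rule as an annotation.
    static final String FLAVORS_XSD =
          "<xs:schema xmlns:xs='" + XS + "' xmlns:sch='" + SCH + "'"
        + " xmlns='urn:hl7-org:v3' targetNamespace='urn:hl7-org:v3'>"
        + "<xs:complexType name='II.EXAMPLE'>"
        + "<xs:annotation><xs:appinfo>"
        + "<sch:pattern name='root and extension required'>"
        + "<sch:rule abstract='true' id='II.EXAMPLE-0'>"
        + "<sch:assert test='@nullFlavor or (@root and @extension)'>"
        + "root and extension required</sch:assert>"
        + "</sch:rule></sch:pattern>"
        + "</xs:appinfo></xs:annotation>"
        + "<xs:complexContent><xs:restriction base='II'/></xs:complexContent>"
        + "</xs:complexType></xs:schema>";

    /** Returns rule id -> assertion test for every Schematron rule embedded in the schema. */
    static Map<String, String> harvest(String xsd) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document schema = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xsd)));
            Map<String, String> testsByRuleId = new LinkedHashMap<String, String>();
            NodeList rules = schema.getElementsByTagNameNS(SCH, "rule");
            for (int i = 0; i < rules.getLength(); i++) {
                Element rule = (Element) rules.item(i);
                NodeList asserts = rule.getElementsByTagNameNS(SCH, "assert");
                if (asserts.getLength() > 0) {
                    // Only the first assertion of each rule is kept in this sketch.
                    Element first = (Element) asserts.item(0);
                    testsByRuleId.put(rule.getAttribute("id"), first.getAttribute("test"));
                }
            }
            return testsByRuleId;
        } catch (Exception e) {
            throw new IllegalStateException("could not read the schema", e);
        }
    }
}
```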

Putting it into practice • Modify V3 Generator • StaticMifToXsd.xsl modified to use flavor names in RMIM Schemas • RimInfrastructureRootToXsd.xsl modified to add reference to flavors.xsd • Both changes conditional on build-time parameter, so backward-compatible • Modify implementation • JAXB now generates Java beans for flavor types • Have to update Dozer rules and other code accordingly • Remaining challenges • Permanent home for V3 Generator changes • Possible divergence from official ITS spec • JAXB flavor beans cause a lot of overhead and over-tight coupling

Lessons Learned • Schematron is a powerful tool but complex and has limits • Cannot do vocabulary • Suffers from lack of full implementations of XPath2 and XSLT2 • XML ITS for datatypes would be better off with a separate namespace and explicit element names • In the end the HDF and HL7 modeling approach strongly require an MDA-oriented implementation, with full MIF awareness. • Everything else is a band-aid • Therefore must invest in high quality MIF-based toolchains • Validation must distinguish between object model, document, and message perspectives of HL7 v3 • Same constructs are used to address all three, but the intent and semantics are different • Validation strategies should adapt accordingly

Resources • Source code: http://caehrorg.jira.com/svn/ESD/trunk • Contact info • Dan Kokotov – dkokotov@5amsolutions.com • Todd Parnell – tparnell@5amsolutions.com
