CBS OPAG-ISS Expert Team on the Assessment of Data Representation Systems (ET-ADRS) Washington DC, USA, 23 - 25 April, 2008. Data Representation System XML. Jan W. Noteboom Royal Netherlands Meteorological Institute (KNMI) noteboom@knmi.nl. Presentation. XML Overview XML SWOT Analysis
CBS OPAG-ISS Expert Team on the Assessment of Data Representation Systems (ET-ADRS)
Washington DC, USA, 23 - 25 April, 2008
Data Representation System XML
Jan W. Noteboom, Royal Netherlands Meteorological Institute (KNMI), noteboom@knmi.nl
Presentation
• XML Overview
• XML SWOT Analysis
• XML Practical Experiences
• Discussion
ET-ADRS, 23-25 April 2008, Washington DC, USA
XML Overview: Introduction
XML is:
• eXtensible Markup Language
• a W3C Recommendation since February 1998
• a subset of Standard Generalized Markup Language (SGML)
• a meta-language, used to create markup languages
• designed to represent and exchange data as structured documents across information systems, particularly via the Internet
• human readable (text based, Unicode support)
• an open standard, licence free
XML Overview: Structure and Semantics
Basic components
• elements, attributes, comments, PCDATA, processing information (e.g. declaration, namespaces)
Well-formed
• one root element (hierarchical structure)
• no open tags, proper nesting: <parentTag><childTag1> </childTag1></parentTag>
• attributes must be quoted
• element names are case sensitive
Valid
• document conforms to some semantic rules (e.g. Document Type Definition DTD or XML Schema XSD)
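The well-formedness rules above can be checked mechanically by any XML parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the sample strings are illustrative and not part of the presentation. Note that this module only checks well-formedness; validity against a DTD or XSD needs a separate validating tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested, single root element.
good = "<parentTag><childTag1> </childTag1></parentTag>"
# Overlapping tags: the child is closed after its parent.
bad_nesting = "<parentTag><childTag1></parentTag></childTag1>"
# Attribute value without quotes.
bad_attribute = "<parentTag id=1></parentTag>"
# Element names are case sensitive, so the closing tag does not match.
bad_case = "<parentTag></ParentTag>"
# More than one root element.
two_roots = "<a></a><b></b>"

for sample in (good, bad_nesting, bad_attribute, bad_case, two_roots):
    print(is_well_formed(sample), sample)
```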
XML Overview: Structure and Semantics
Semantic Rules - XML Schema XSD (W3C, 2001)
• uses XML syntax (well-formed)
• structural definitions, type definitions, defaults
• very powerful and flexible
• facilitates creation of own libraries of exchange data types
Schemas are also useful for:
• prior agreements between parties on data exchange
• development of applications that process the data
Alternatives: RELAX NG (OASIS)*, Schematron*
*Supported by DSDL: Document Schema Definition Languages (ISO 19757)
Namespaces (W3C, 2006):
• identify your vocabularies (usage: xmlns:prefix="URI")
• qualify element and attribute names to avoid name collisions
• allow modularization of schemas
• mix and match elements from multiple schemas in document instances
• import or include one XML Schema in another (re-use)
XML Example
<?xml version="1.0"?>
<swe:CompositePhenomenon
    xmlns:swe="http://www.opengis.net/swe/1.0.1"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xsi:schemaLocation="http://www.opengis.net/swe/1.0.1 http://schemas.opengis.net/sweCommon/1.0.1/swe.xsd"
    gml:id="weather1" dimension="6">
  <gml:name codeSpace="urn:ietf:rfc:2141">urn:ogc:def:phenomenon:SEEGrid:weather1</gml:name>
  <swe:base xlink:href="urn:ogc:def:phenomenon:OGC:Weather"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:AirTemperature"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:WindSpeed"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:WindDirection"/>
  <swe:component xlink:href="http://sweet.jpl.nasa.gov/ontology/property.owl#Visibility"/>
</swe:CompositePhenomenon>
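To show how the namespace prefixes in the example above are resolved by an application, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The document string is a shortened copy of the slide's example (the `xsi:schemaLocation` attribute is omitted); the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's example (schemaLocation omitted).
DOC = """<swe:CompositePhenomenon
    xmlns:swe="http://www.opengis.net/swe/1.0.1"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    gml:id="weather1" dimension="6">
  <gml:name codeSpace="urn:ietf:rfc:2141">urn:ogc:def:phenomenon:SEEGrid:weather1</gml:name>
  <swe:base xlink:href="urn:ogc:def:phenomenon:OGC:Weather"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:AirTemperature"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:WindSpeed"/>
  <swe:component xlink:href="urn:ogc:def:phenomenon:OGC:WindDirection"/>
  <swe:component xlink:href="http://sweet.jpl.nasa.gov/ontology/property.owl#Visibility"/>
</swe:CompositePhenomenon>"""

# Prefix-to-URI map used for lookups; the prefixes are local shorthand,
# only the URIs have to match those in the document.
NS = {
    "swe": "http://www.opengis.net/swe/1.0.1",
    "gml": "http://www.opengis.net/gml",
    "xlink": "http://www.w3.org/1999/xlink",
}

root = ET.fromstring(DOC)

# ElementTree stores qualified names as "{namespace-URI}local-name".
HREF = "{%s}href" % NS["xlink"]
GML_ID = "{%s}id" % NS["gml"]

phenomenon_id = root.get(GML_ID)
name = root.find("gml:name", NS).text
components = [c.get(HREF) for c in root.findall("swe:component", NS)]

print(phenomenon_id, name)
for href in components:
    print("  component:", href)
```

Because element and attribute names are matched on the namespace URI rather than the prefix, the same code keeps working if a producer chooses different prefixes.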
XML Overview: Processing
[Diagram labels: XML doc, XML schema, XML parser, XML validator, XML Infoset, XML API implementor, Application]
Programming APIs
• DOM - Document Object Model
• SAX - Simple API for XML
• StAX - Streaming API for XML
Data binding
• JAXB - Java Architecture for XML Binding
• Hibernate - relational/object/XML mapping tool
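The difference between the tree-based (DOM) and event-based (SAX) programming models can be sketched with Python's standard `xml.dom.minidom` and `xml.sax` modules. The tiny `<obs>` document and the handler class below are hypothetical, chosen only for illustration; pull-style parsing in the manner of StAX is not shown.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample data: three temperature readings.
DOC = "<obs><t unit='K'>281.5</t><t unit='K'>282.1</t><t unit='K'>280.9</t></obs>"

# DOM: the whole document is loaded into a tree that can be navigated freely.
dom = xml.dom.minidom.parseString(DOC)
dom_values = [float(node.firstChild.data) for node in dom.getElementsByTagName("t")]


class TemperatureHandler(xml.sax.ContentHandler):
    """SAX: the parser pushes events; the handler keeps only what it needs."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._buffer = None  # collects character data while inside <t>

    def startElement(self, name, attrs):
        if name == "t":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "t":
            self.values.append(float("".join(self._buffer)))
            self._buffer = None


handler = TemperatureHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_values = handler.values

print(dom_values)
print(sax_values)
```

The DOM version is shorter but holds the full tree in memory; the SAX version needs more code but processes the document as a stream, which matters for large files.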
XML Overview: XML extensions
• XPath, XPointer: for addressing XML subdocuments
• XLink: to create hyperlinks between resources
• XSLT: for rearranging and restructuring XML docs
• XQuery: for querying
• SOAP: XML protocol for message and object serialization and remote procedure calls
• RDF: to describe resource metadata
• XForms: for Web forms
• XMI: XML Metadata Interchange
• etc.
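As an illustration of addressing parts of a document with path expressions, the sketch below uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree` (full XPath, XSLT and XQuery need other tools). The `<observations>` document and its values are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<observations>
  <station id="A" name="North">
    <temp unit="C">7.4</temp>
    <wind unit="m/s">5.1</wind>
  </station>
  <station id="B" name="South">
    <temp unit="C">9.0</temp>
    <wind unit="m/s">3.2</wind>
  </station>
</observations>"""

root = ET.fromstring(DOC)

# Path expression with a predicate on an attribute value.
south_temp = root.find("./station[@id='B']/temp").text

# Descendant search: every <wind> element anywhere below the root.
winds = [float(w.text) for w in root.findall(".//wind")]

# Predicate on the presence of a child element.
stations_with_temp = [s.get("name") for s in root.findall("./station[temp]")]

print(south_temp, winds, stations_with_temp)
```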
XML Overview: Dialects
• Geography Markup Language (GML, ISO 19136)
• Keyhole Markup Language (KML)
• Digital Weather Markup Language (DWML)
• Climate Data Markup Language (CDML)
• Weather Markup Language (WxML)
• Emergency Data Exchange Language (EDXL)
• Water Markup Language (WaterML)
• Chemical Markup Language (CML)
• Electronic Business XML Initiative (ebXML)
• Scalable Vector Graphics (SVG)
• ...
XML SWOT Analysis
XML SWOT Analysis: Criteria
• Ability/suitability to present WMO data
  also includes: pictorial data, textual information (e.g. warnings), metadata
• Ability/suitability to exchange data
  operational data between NMHSs, and information to users outside NMHSs
• Ability/suitability to store data
  usage in storage systems
• Compliance with and status of existing standards
• Available support: skills and technology (tooling)
• Other abilities
  ability to translate back and forth to other DRSs
  ability/suitability to envelop objects or act as a pseudo-carrier
XML SWOT Analysis: Summary
[Summary chart: XML rated per criterion on a scale from poor to excellent]
XML SWOT Analysis: Presenting WMO data
(weather, climate, water, atmospheric constituents, oceanography, aviation, etc.)
Strengths:
• structured text format (hierarchical)
• self-documenting
• can represent common data structures: records, lists, trees
• human-readable
• very flexible: you can define and mix in other languages (e.g. GML)
• supports modularity (namespaces)
Weaknesses:
• fairly verbose and partially redundant
• binary data
• expressing non-hierarchical relationships is difficult
Opportunities & Threats:
• many languages and schemas are available that are useful for describing weather- and climate-related aspects
• potentially complex parsing (many namespaces)
XML SWOT Analysis: Presenting WMO data
Pictorial data:
• feature data, vector data (GML, KML)
• presenting raster data
Remark
• GML (ISO 19136) enables OGC services (WFS, WMS, WCS)
Text information (e.g. warnings):
• XML is human readable and multi-lingual (Unicode)
• EDXL (Emergency Data Exchange Language) by OASIS
• rearranging and restructuring with XSL (to text, XHTML, PDF, etc.)
Remark
• CAP: Common Alerting Protocol (OASIS), open, based on EDXL
Metadata:
• ISO 191xx support (ISO 19139 Metadata Schema)
Remark
• WMO Core Metadata Profile available (extension to ISO 19139)
XML SWOT Analysis: Exchanging operational data (NMHSs and centers)
Strengths:
• strict syntax and parser requirements
• XML schemas, useful for
  • validating
  • defining exchange formats
  • writing applications that process the data
• web enabling
Weaknesses:
• verbose, bandwidth consumption
• processing overhead (complex parsing, validation)
Opportunities & Threats:
• data compression (gzip)
• no agreed WMO schemas, except for Core Metadata
Usage: rare so far, e.g. Cyclone XML
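The trade-off between verbosity and compression can be tried out with a few lines of Python using the standard `gzip` module. This is only a sketch: the `<observation>` records are synthetic and highly repetitive, so the size reduction it prints will be more favourable than for real operational data.

```python
import gzip

# Build a synthetic, repetitive XML document (hypothetical element names).
records = "".join(
    '<observation station="S%03d">'
    '<airTemperature unit="K">%.1f</airTemperature>'
    '<windSpeed unit="m/s">%.1f</windSpeed>'
    "</observation>" % (i, 270 + i % 20, i % 15)
    for i in range(1000)
)
document = ("<observations>" + records + "</observations>").encode("utf-8")

compressed = gzip.compress(document)
restored = gzip.decompress(compressed)

print("plain bytes:     ", len(document))
print("gzip bytes:      ", len(compressed))
print("compressed/plain: %.2f" % (len(compressed) / len(document)))
```

Compression addresses bandwidth only; the receiver still has to decompress and parse the full document, so the processing overhead listed under Weaknesses remains.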
XML SWOT Analysis: Transmitting information (users outside NMHSs and centers)
Strengths:
• human readable (text)
• language support (Unicode)
• web enabling (Web services, SOA support, e-Business)
• interoperability
• translation to text, HTML or other XML (XSL)
• open standard, licence free
• technology support (tools)
Weaknesses:
• verbose, bandwidth consumption
• transmitting binary data
Opportunities & Threats:
• usage for metadata exchange is common practice (catalogue data)
• too many proprietary schemas to transmit data
Usage: growing practice, e.g. Road Weather Information Network, Canada
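The binary-data weakness listed above can be made concrete. The sketch below assumes base64 text encoding as the workaround, which is a common choice but not one the slides name; the `<grid>` element, its attributes and the values are made up for illustration.

```python
import base64
import struct
import xml.etree.ElementTree as ET

# 300 values packed as little-endian 32-bit floats: 4 bytes per value.
values = [i * 0.5 for i in range(300)]
raw = struct.pack("<%df" % len(values), *values)

# Binary payload has to be turned into text before it can sit inside XML.
elem = ET.Element("grid", {"encoding": "base64", "type": "float32-le"})
elem.text = base64.b64encode(raw).decode("ascii")
text = ET.tostring(elem, encoding="unicode")

# Receiver side: parse, decode, unpack.
decoded = base64.b64decode(ET.fromstring(text).text)
restored = list(struct.unpack("<%df" % len(values), decoded))

print("raw bytes:   ", len(raw))
print("base64 chars:", len(elem.text))
```

Base64 maps every 3 bytes to 4 characters, so the payload grows by about one third before any XML markup is added.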
XML SWOT Analysis: Usage in storage systems
Strengths:
• native XML DBMSs available
• many XML-enabled relational databases
• technology such as XPath, XQuery and SQL/XML
Weaknesses:
• normalizing/mapping XML data into RDBMS tables can be difficult
• a native XML DBMS requires more space (less efficient)
Opportunities & Threats:
• not much experience with XML for storage in NMHSs
Usage: writing XML queries instead of SQL (growing practice)
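Mapping XML into relational tables can be sketched with Python's standard `sqlite3` and `xml.etree.ElementTree` modules. The document, table layout and column names below are hypothetical; a flat structure like this one maps easily, whereas deeply nested or irregular documents are where the mapping becomes difficult.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<observations>
  <station id="A">
    <temp unit="C">7.4</temp>
    <wind unit="m/s">5.1</wind>
  </station>
  <station id="B">
    <temp unit="C">9.0</temp>
    <wind unit="m/s">3.2</wind>
  </station>
</observations>"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observation (station TEXT, quantity TEXT, unit TEXT, value REAL)"
)

# One row per measurement element: the hierarchy is flattened into columns.
root = ET.fromstring(DOC)
rows = [
    (station.get("id"), m.tag, m.get("unit"), float(m.text))
    for station in root.findall("station")
    for m in station
]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)

# Once mapped, ordinary SQL applies.
warmest = conn.execute(
    "SELECT station, value FROM observation "
    "WHERE quantity = 'temp' ORDER BY value DESC LIMIT 1"
).fetchone()
print(warmest)
```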
XML SWOT Analysis: Standards
Strengths:
• XML is an open standard (W3C)
• ISO 191xx standards are supported by XML schemas
• GML (ISO 19136): XML for geospatial aspects
• OGC Observations and Measurements (O&M) model supported by GML schemas
• ability to implement UML conceptual models (ISO 19103) in XML schemas (using XMI)
Weaknesses:
• no agreed WMO schemas available (except for Core Metadata)
Opportunities (& Threats):
• all harmonization initiatives are based on ISO, W3C and OGC standards (RA VI: EU-INSPIRE)
• OGC Observations and Measurements model used to develop WXXM (weather exchange model) for aviation (Eurocontrol)
• HollowWorld applied to develop the WMO Core Metadata Profile
XML SWOT Analysis: Other Abilities
Translating XML from and to other DRSs
• NetCDF <> XML tooling:
  • (to) NetCDF Markup Language (NcML), and NcML-GML
  • (to & from) LeoNetCDF
• BUFR <> XML tooling: unknown
• HDF5 <> XML tooling: (to) h5dump
Combining XML with other data formats (e.g. envelope, pseudo-carrier)
• XML metadata/header for HDF5, NetCDF or BUFR datasets
• SOAP (XML protocol) for exchanging data
XML SWOT Analysis: Available support (skills & technology/tools)
• Numerous XML extensions (XSL, XQuery, XForms, XPath, KML, DWML, SOAP, etc.)
• XMI to translate data models (UML) into XML schemas
• GML for geospatial aspects (GIS systems)
• XML technology is widespread, easily available and cheap
• Increasing number of individuals with XML skills
XML SWOT Analysis: Summary
[Summary chart repeated: XML rated per criterion on a scale from poor to excellent]
XML practical experiences
• Cyclone XML, TIGGE project
• RWIN, Canada
• WXXM, EUROCONTROL
XML practical experiences
Cyclone XML
• XML format for cyclone analyses and forecasts (CXML)
• to improve sharing of cyclone information with users other than NMHSs
• alternative to the BUFR/CREX format
• development recently started (TIGGE project)
• details: http://www.bom.gov.au/bmrc/projects/THORPEX/CXML/index.html
RWIN (Road Weather Information Network)
• XML format for road weather observations (CMML)
• interchange between Canadian transportation ministries and contractors (network maintenance)
• 200 observing sites every 20 minutes
• operational
• details: http://www.clarusinitiative.org/
XML practical experiences
WXXM Weather Exchange model
• for data and objects related to weather for aviation
• Conceptual Model (WXCM) based on OGC Observations and Measurements model
• following ISO 19100 principles and OGC recommendations
• using GML for compatibility with third-party GML applications
• under development
• details: http://www.eurocontrol.int/aim/public/standard_page/met.html
Aeronautical Information Exchange Model (AIXM)
Discussion
Discussion
• Is there a need to develop a "WMO Markup Language"?
• How to benefit from the extensive and cheap XML support?
• What synergy between XML and HDF5/BUFR/NetCDF is achievable?
• What governance should WMO offer to support XML?