
OPeNDAP Services for ESG September 25, 2014


Presentation Transcript


  1. Performance Measures x.x, x.x, and x.x: OPeNDAP Services for ESG. September 25, 2014. Peter Fox, Patrick West, Stephan Zednik (RPI). Earth System Grid Center for Enabling Technologies

  2. In ESG II (in regard to data)
     • Server-side aggregation
     • DAP object transfer via HTTP and GridFTP
     • GSIFTP integration, MyProxy support
     • OPeNDAP-g led to the architecture of OPeNDAP/Hyrax (the BES in particular comes primarily from OPeNDAP-g)
     Earth System Grid Center for Enabling Technologies (ESG-CET)

  3. Requirements leading to OPeNDAP-g
     • Separation of the core Data Access Protocol (DAP) from the transport protocol (HTTP).
     • High-performance computing (HPC). The previous CGI-based servers did not have the capacity required by ESG. Error and memory handling were added.
     • Security. Once OPeNDAP was independent of the transport protocol, security could be added by relying on the Globus gsiFTP system.
     • Aggregation. OPeNDAP 3.0 did not operate on aggregated datasets; OPeNDAP-g does.
     • Transport protocol independence and HPC were incorporated back into OPeNDAP, leading to the current version. Security and aggregation remain ESG-only features.

  4. OPeNDAP-g Architecture
     [Diagram: Dispatcher, BES, Data]
     • OPeNDAP-g Dispatcher (e.g. ESG Front-end Server)
       • Receives requests and asks the BES to fill them
       • Uses Apache modules
       • Does not directly "touch" data; handles URLs
     • Back End Server (BES)
       • Reads data files, databases, etc., and returns information
       • May return DAP objects or other data (netCDF)

  5. OPeNDAP Hyrax Architecture
     [Diagram: Client, OLFS, BES, Data]
     • OPeNDAP Lightweight Front end Server (OLFS)
       • Receives requests and asks the BES to fill them
       • Uses Java Servlets
       • Does not directly "touch" data
       • Multi-protocol
     • Back End Server (BES)
       • Reads data files, databases, etc., and returns information
       • May return DAP2 objects or other data
       • Does not require a web server

  6. [Diagram: OPeNDAP Lightweight Front end Server. Labels: Request from client; Request Formulation**; Response to client; BES; Info output; HTML form; ASCII output; DAP2 (GridFTP, HTTP); SOAP-DAP (HTTP); RDF, OWL, JSON (HTTP); THREDDS**; PML output]

  7. Hyrax / Back-end Server (BES)
     [Diagram: BES Framework. Labels: Network Protocol and Process start/stop activities; PPT/PPTS*; Initialization/Termination; Data Catalogs; DAP2 Access; BES Commands / XML Documents; Commands** (e.g. server-side, Ferret); Provenance; NetCDF3; HDF4; RDF/SPARQL; Data Store Interfaces (e.g. IOSP); Data]
     *PPT is built in (other protocols)
     **Some commands are built in

  8. OPeNDAP-g services for ESG
     • Data access via data portals. In this use case, users interact with the portal, browse the catalogs, and decide what data to download. The portal passes the request to OPeNDAP-g, which executes it and returns the data to the portal. The portal then returns a URL from which the user can download the data.
     • Data access via the netCDF library. In this use case, users link their applications with the OPeNDAP-g client library for netCDF. The user can "open" a URL that refers to ESG data. The library fetches the data for the user and presents it as a local netCDF file.
     • The performance requirements for ESG II were met. However, ESG-CET scales up these requirements.
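The "open a URL" use case relies on the DAP2 request convention, in which a response type is selected by a suffix on the dataset URL and a subset is selected by a constraint expression after `?`. The following is a minimal sketch of how such URLs are formed, not the OPeNDAP-g client code; the helper name `dap2_url`, the host `esg.example.org`, and the variable `tas` are hypothetical.

```python
# Standard DAP2 response suffixes: structure, attributes, binary data, ASCII data.
DAP2_SUFFIXES = {"dds", "das", "dods", "ascii"}


def dap2_url(dataset_url, response=None, variable=None, slices=()):
    """Build a DAP2 request URL (hypothetical helper, for illustration only).

    slices holds one (start, stride, stop) index triple per dimension;
    DAP2 index ranges are inclusive at both ends.
    """
    url = dataset_url
    if response is not None:
        if response not in DAP2_SUFFIXES:
            raise ValueError(f"unknown DAP2 response type: {response}")
        url += "." + response
    if variable is not None:
        # A hyperslab is written as name[start:stride:stop] per dimension.
        hyperslab = "".join(f"[{a}:{s}:{b}]" for a, s, b in slices)
        url += "?" + variable + hyperslab
    return url


# Hypothetical dataset location, not a real ESG endpoint.
BASE = "http://esg.example.org/opendap/model/tas.nc"
print(dap2_url(BASE, "dds"))
print(dap2_url(BASE, "dods", "tas", [(0, 1, 11), (10, 2, 20)]))
```

A DAP-enabled netCDF library is typically handed the bare dataset URL in place of a file path and issues requests of this shape on the caller's behalf.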

  9. Status of the Community OPeNDAP Server Software
     • Together, the OLFS and the BES are known as Hyrax
     • Hyrax 1.6 provides support for NcML-based aggregation
     • Data response streamed back as a netCDF file
     • RDF response type
     • Updated DDX response type (Data Definition XML)
     • Beginning development of DataDDX, a multi-MIME response with data and DDX
     • Full security audit and static code analysis certification to comply with NOAA and NASA requirements
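For readers unfamiliar with NcML-based aggregation, the sketch below generates a small NcML document that joins several files along an existing dimension. It is an illustration only: the helper name `join_existing_ncml` and the file names are hypothetical, and the slides do not say which aggregation types a given Hyrax release accepts, so that should be checked against the server documentation.

```python
import xml.etree.ElementTree as ET

# Namespace used by NcML 2.2 documents.
NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def join_existing_ncml(locations, dim_name="time"):
    """Return NcML text that aggregates the given files along dim_name.

    Hypothetical helper: it only writes the XML, it does not read any data.
    """
    ET.register_namespace("", NCML_NS)  # serialize with a default xmlns
    root = ET.Element(f"{{{NCML_NS}}}netcdf")
    agg = ET.SubElement(
        root,
        f"{{{NCML_NS}}}aggregation",
        {"dimName": dim_name, "type": "joinExisting"},
    )
    for location in locations:
        # Each member dataset is a nested <netcdf location="..."/> element.
        ET.SubElement(agg, f"{{{NCML_NS}}}netcdf", {"location": location})
    return ET.tostring(root, encoding="unicode")


# Hypothetical monthly files joined along the time dimension.
print(join_existing_ncml(["tas_2000_01.nc", "tas_2000_02.nc"]))
```

The client then sees the aggregation as a single dataset, which is what allows one request to span many underlying files.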

  10. ESG-CET and data
     • Large data sets, in both number and size
     • High performance
     • Flexible architecture covering both the client and several types and numbers of servers
     • Aggregation
     • Server-side operations
     • Multiple transport protocol options
     • Full ESG security support as well as loose federation
     • Read-only client access via API (netCDF/CDM)
     • To satisfy the new goals, the OPeNDAP services for ESG have been re-architected.
     • We now use parts of the standard OPeNDAP framework, Hyrax, focusing on high performance for the client side and extended flexibility.

  11. ESG-CET and Products (server-side functions)
     • Goal: drop-in replacement for the TDS part of FTDS in LAS
     • Requires a netCDF-Java Input-Output Service Provider (IOSP) adapter for Hyrax/BES
     • Use case examination will be required

  12. Security Infrastructure status
     • OPeNDAP BES Security
       • SSL authentication between the GridFTP middle tier and the BES. No persistent SSL connections are maintained.
     • RNI Integration with ESG Security Infrastructure
       • RNI client supports gsiFTP connections to ESG GridFTP servers
       • ESG GridFTP server handles authentication of the user
       • Neill's ESG GridFTP authz callout plugin handles authorization of the data request
       • ESG/RNI GridFTP DSI module handles the data request and forwards it to the ESG BES server running the RNI module

  13. Client status
     • RNI version 0.1 implemented using netCDF version 3 and OPeNDAP's libnc-dap
     • In communication with Unidata regarding integration of the RNI client with the new netCDF version 4
     • Developing NcML aggregation in both the RNI client and the RNI server
     • Full ESG security support

  14. The Remote NetCDF Invocation (RNI)
     • The client is the netCDF library. It has exactly the same API as the standard netCDF C library, but it can deal with local files or files reachable via HTTP, PPT, or GridFTP.
     • The third tier, the BES server, can be reached only via PPT. NetCDF services for all netCDF calls are implemented as a BES module.
     • The middle tier acts like a proxy between the RNI client and server and deals with security.

  15. RNI Architecture
     [Diagram labels: CLIENT; NetCDF Library; RNI Library; connection; GridFTP; acts like; OPeNDAP BES; RNI Module; DATA]

  16. Characteristics of the RNI as part of a data access system
     • Full support of standard OPeNDAP URLs. RNI is being developed with the integrated Unidata/OPeNDAP netCDF library (and CDM).
     • Transparent access to either standard netCDF files or aggregated datasets via the NetCDF Markup Language (NcML).
     • For remote containers, all write operations are disabled for security. That is, for HTTP/HTTPS, PPT, and GridFTP/gsiFTP the RNI system is a read-only API.
     • RNI utilizes just-in-time access. Caching is only for metadata. No pre-fetching of data.
     • RNI transparently accesses secure (gsiFTP, HTTPS) or insecure (GridFTP, HTTP) remote data.
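The access rules on this slide (remote containers are read-only; gsiFTP and HTTPS are secure, GridFTP and HTTP are not) can be summarized as a small lookup on the URL scheme. This is a sketch under stated assumptions rather than RNI code: the names `Access` and `classify`, the scheme spellings, and the example host are hypothetical, and PPT is left unclassified because the slide does not say whether it counts as secure.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit

# Scheme spellings are assumptions; the slide only names the protocols.
SECURE_SCHEMES = {"gsiftp", "https"}
INSECURE_SCHEMES = {"gridftp", "http"}
UNCLASSIFIED_SCHEMES = {"ppt"}  # remote, but security not stated on the slide


class Access(NamedTuple):
    remote: bool
    writable: bool
    secure: Optional[bool]  # None when not applicable or not stated


def classify(target: str) -> Access:
    """Classify a dataset target by URL scheme (hypothetical helper)."""
    scheme = urlsplit(target).scheme.lower()
    if scheme == "":
        # Plain POSIX-style paths are local files and keep the normal netCDF API.
        return Access(remote=False, writable=True, secure=None)
    if scheme in SECURE_SCHEMES:
        return Access(remote=True, writable=False, secure=True)
    if scheme in INSECURE_SCHEMES:
        return Access(remote=True, writable=False, secure=False)
    if scheme in UNCLASSIFIED_SCHEMES:
        return Access(remote=True, writable=False, secure=None)
    raise ValueError(f"unsupported scheme: {scheme}")


print(classify("/data/tas.nc"))
print(classify("gsiftp://esg.example.org/data/tas.nc"))
```

Keeping the rule in one place makes the read-only guarantee easy to check: no remote scheme ever yields `writable=True`.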

  17. Status of the RPI work (OPeNDAP)
     • The primary accomplishments for this subproject in the past year have been:
       • The complete request-response cycle for all netCDF API calls has been implemented; that is, the two ends have been developed.
       • We have highly optimized core components of the standard OPeNDAP framework to support the performance goals.
       • We have established how the middle tier (the proxy) will be incorporated into the complete system.
       • Our work was presented at AGU Fall 2007 and EGU 2008.
     • So far, all of the goals established for the first stage of the project have been completed.
     • Next stage
       • Integration with the product server for gateway and data node data access
       • Functional (with enhancements) replacement for TDS

  18. Future (will not elaborate)
     • Storage Resource Manager / DMLite as a client
     • Return as RDF
     • Return as PML (Provenance: Proof Markup Language)
