
GridAssist, making the Grid invisible

CEOS, March 2005, Argentina. GridAssist, making the Grid invisible. Ruud Grim, Mark ter Linden, Ivan Petiteville. Contents: History, Technical Details, Operational Experiences, Future Plans. A user-friendly service to support instrument calibration/validation and data (re-)processing.


Presentation Transcript


  1. CEOS, March 2005, Argentina. GridAssist, making the Grid invisible. Ruud Grim, Mark ter Linden, Ivan Petiteville

  2. Contents • History • Technical Details • Operational Experiences • Future Plans. A user-friendly service to support instrument calibration/validation & data (re-)processing. GridAssist, March 2005 CEOS Argentina

  3. History • 1997-2000 EC FP4 OASE project • Collaboration environments for the simulation and data processing of Earth Observation data • Chains of applications in a distributed environment • The CORBA technology used provided only limited functionality and was not properly secure (firewall ports had to be opened) [Diagram labels: application chain Atmosphere Model, OMI Simulator, Ground Data Processor, Total Ozone Column, UV Prediction; institutes Dutch Space, Dutch Space, DLR-DFD, KNMI, FMI]

  4. GREASE Project 2002-2003 (ESA) • Same concept, with a new chassis (Grid) and powered by a new engine (Globus Toolkit 2.x) • The environment should be easy to use and should hide the underlying Grid technology from the scientific user • Workflow and service oriented approach – more than simple chains of applications. [Diagram labels: Service A, Service D, Service F, Service B, Service C, Service E]

  5. Concept • User-friendly client tools run locally on the users' workstations for constructing workflows and monitoring jobs • Centralized controller executes the workflows on the Grid • Controller implemented as Web Service for easy and standardized access (even through firewalls) [Diagram labels: Grid resources, Workstations with client tools, Controller, SOAP, Grid, LAN]

  6. Use cases within ESA • Instrument validation • Mission simulation • Archive reprocessing • Instrument test data generation (via simulation) • Production-on-Demand • Concurrent design. Satisfying different functional needs: • Collaboration • Computing power • Controlled provision & access of services

  7. Grid implementations @ESA • Instrument validation (#2) • Mission simulation (#3) • Archive reprocessing • Instrument test data generation (#1) • Production-on-Demand • Concurrent design. Examples (#) • OMI test data generation • ENVISAT validation • GAIA mission analysis & Grid-on-Demand Concurrent Design Facility

  8. UC#1: OMI (NASA AURA), launched summer 2004 • Main products: Ozone columns, profiles • 6-7 GB / day (Level 0 data) [Image labels: Electronic Assembly, Optical Assembly]

  9. UC#1: Scanning the Earth daily • Continue global total ozone trends • Nominal 13 x 24 km spatial resolution or 13 x 13 km for detecting and tracking urban-scale pollution sources

  10. UC#1: Test data generation • Fall 2003: Generation of one month of simulated OMI data for Ground Segment Verification (starting at the beginning of 2004) • 230,000 simulation runs of 2 minutes each (total 7666 hours) • Between 50 and 80 CPUs were used in a 6 week period • 32 Gb telemetry data produced and transferred to NASA [Diagram labels: spectrum, NASA GS, CCD output, telemetry, Existing GOME Data, Level 0, OMI Instr. Simulator, Level 1, Raw Data Generator, Level 0 Processor, Level 1b Processor, Grid, Level 2 Algorithm]
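
The CPU-time total on this slide can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the slide (230,000 runs, 2 minutes each, 50 to 80 CPUs); the function name and the assumption of perfect parallelism are illustrative, so the resulting days are a lower bound on compute time, not a reconstruction of the actual six-week schedule.

```python
# Figures quoted on the slide.
runs = 230_000
minutes_per_run = 2

# Total CPU time in hours: 230,000 * 2 / 60.
total_cpu_hours = runs * minutes_per_run / 60


def ideal_wall_clock_days(cpu_hours, cpus):
    """Wall-clock days if the work were spread perfectly over `cpus` CPUs.

    Assumes no queueing, transfer, or idle time (an idealisation).
    """
    return cpu_hours / cpus / 24


print(int(total_cpu_hours))  # whole hours of CPU time
for cpus in (50, 80):
    print(cpus, round(ideal_wall_clock_days(total_cpu_hours, cpus), 1))
```

The truncated total matches the 7666 hours on the slide; the gap between the ideal figures and the six-week period is where scheduling, data transfer and shared use of the CPUs would show up.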

  11. UC#2: Instrument Validation. What is required? • Additional validated data • In-situ measurements • Aircraft • Balloon • Ground (lidar) • Other space instrumentation • Quality Assurance • Common data sets • Algorithms • Tools, converters, visualization tools • Good communication & collaboration

  12. UC#2: ECV Prototype (ESA THE VOICE project). Demonstrate possibilities of e-Collaboration for cal / val • Authorization & Authentication • Communication (agenda, documentation) • Access to • Meta data catalogue • Data store • Applications & tools • Under configuration control • In development • Workflow Management (GridAssist) • Publish & Subscribe

  13. UC#2: Validation Workflow • Access to data stores • GOME Level 2 • LIDAR (at IPSL or NILU) • On-demand processing • Publish/Subscribe to notify users

  14. UC#2: THE VOICE Workflow Environment [Screenshot labels: Workflow submission, Connecting Click-and-Drop, Data stores, Applications, Access to Data stores, Drag-and-Drop]

  15. UC#2: VOICE collaboration crossing boundaries [Map labels: NILU, ESTEC & Dutch Space, KNMI, RIVM, Univ Bremen, BIRA/IASB, NASA, IPSL, Genève, Tor Vergata, ESRIN, ESAC VillaFranca]

  16. UC#3: Gaia mission analysis. Science objectives • Map 10^9 stars in our Galaxy • Astrometry • Photometry • Spectra • Studies • Structure & kinematics of Galaxy • Stellar populations • Origin, formation & evolution of Galaxy • Stellar astrophysics • Cosmology • Extra-solar planetary science • Fundamental physics • Core Processing (Global Iterative Solution) using subset of 10^8 stars with • Raw data • Calibrated data • Attitude data • Science data • 500 TB over 5 yr • 10^20 flop CPU

  17. UC#3: Gaia Processing. Foreseen architecture (May 2004)

  18. UC#3: GAIA collaboration • Binary star simulation with the GASS (Gaia Simulator) • 5 year period, submitted as 5 jobs covering 1 year each • Executed on 23 CPUs in 8 institutes of 5 countries • Total of 3.8 million CPU seconds used • 16.5 Gb telemetry data produced and transferred to CESCA • >1,100 jobs submitted in 6 months • Data extraction from GDAAS database (Oracle) • Very flexible using Java as query language [Map labels: Lund Astrometry ESTEC Dutch Space Copenhague Cambridge Photometry Leiden Photometry RVS Heidelberg Quick Looks Bruxelles ABS Meudon RVS Turino Minor Planets Geneve Variable Stars Trieste RVS CNES? Nice Fundamental Algos ESRIN ESAC Barcelona Core Tasks Database]

  19. Benefits of GridAssist • Easy and secure access to applications, data and resources • Satisfying both collaboration & HPC needs • Unattended execution of large and/or complex jobs using workflows • Low failure rate (>95% of jobs are successfully completed) • Supports logging at three levels • Application, GridAssist, Globus • Little or no modification needed to existing applications; new applications can be added fast • The Grid environment can easily be extended with more resources • Ease of installation

  20. Lessons Learned • The GridAssist Workflow Tool proved to be a very user-friendly and intuitive tool; users can use it almost directly • It meets both High Performance Computing and collaboration needs within ESA; users are very enthusiastic • Interface problems between applications can be detected early in the development process • The approach of using GridAssist to run applications on the Grid is usable in many fields that have similar scientific data processing needs (Earth Observation, Astronomy, …?)

  21. Future plans • Continue development • Improve robustness • Improved workflow features, user management • Improved access to data stores • Interoperability (e.g. gLite) • Project operations support • Mission analysis • Instrument calibration / validation • Application development • Level 3 & 4 product processing • Archive re-processing

  22. More info? • Web site: http://www.gridassist.com/ • Contact persons: • Ivan Petiteville (ESA ESRIN), e-mail: Ivan.Petiteville@esa.int, telephone: +39-06.941.80.567 • Ruud Grim (GridAssist Project Manager), e-mail: r.grim@dutchspace.nl, telephone: +31-71-5.245.416 • Mark ter Linden (GridAssist Developer), e-mail: m.ter.linden@dutchspace.nl, telephone: +31-71-5.245.557 • Photos: courtesy ESA, NASA, KNMI and Internet

  23. Develop locally, compute and collaborate globally on the Grid. Questions?

  24. The Grid • Around 1998 the Grid concept was introduced: sharing resources in Virtual Organizations • Demand-driven access to computing power • Increased utilization of idle capacity • Greater sharing of computational results

  25. Grid Environment • Grid environment based on Globus Toolkit 2.x using: • Globus Resource Allocation and Management (GRAM) • Remote job submission and control • Interface to local job management systems (PBS, LSF, Condor) • GridFTP • High performance, secure, reliable data transfer • Grid Security Infrastructure (GSI) • Single sign-on and secure communication • Based on Public Key encryption and X.509 certificates

  26. Features • Workflow Tool • User interface implemented in Java (Windows, Linux, Unix, Mac) • To add / modify / remove applications, resources and properties • To create, start and monitor workflows • Embed additional (new) services, e.g. browsing in database, logging at 3 levels, converters, notification services, visualization • Embed batch programs, not (yet) interactive • No requirements on language (Java, Fortran, C, IDL, …). • User can configure runtime parameters • Central registry • Storage of information about applications and resources • Configuration control

  27. Architecture • Implementation in Java – cross platform (tested on Windows, Linux and Mac) [Diagram labels: MySQL Database, Globus specific protocols, SOAP, Grid, Grid, LAN, Data Processing Application, Apache Jakarta Tomcat Web Server, JDBC Connector, Apache AXIS, Apache AXIS, GridAssist Workflow Engine, Java CoG-kit, Globus Toolkit, GridAssist Workflow Tool, Controller, Grid Resource, User Workstation]

  28. Workflow Tool: Maintaining the registry [Screenshot labels: Resources, Resource or service details, Services]

  29. Workflow Tool: Creating the workflow [Screenshot labels: Workflow submission, Data stores, Connecting Click-and-Drop, Applications, Drag-and-Drop]

  30. Workflow Tool: Status Monitoring [Screenshot labels: Availability & Usage, Submitted workflows & status overview]

  31. Hiding Grid technology • Intuitive GUI preferred • DAG structured • Dynamic execution • Fault tolerance built in
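
The slide names three properties of the workflow engine: a DAG structure, dynamic execution and fault tolerance. The sketch below illustrates those ideas in general terms using Python's standard `graphlib` module; it is not GridAssist's implementation. The step names A to D, their dependencies, the retry count and the `run_workflow` function are all hypothetical.

```python
from graphlib import TopologicalSorter


def run_workflow(dependencies, actions, max_attempts=2):
    """Run steps in dependency order, retrying a failed step.

    dependencies maps each step to the set of steps it depends on.
    actions maps each step to a callable taking a dict of upstream results.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    results = {}
    while sorter.is_active():
        # Dynamic execution: only steps whose inputs are complete are offered.
        for step in sorter.get_ready():
            inputs = {dep: results[dep] for dep in dependencies.get(step, ())}
            for attempt in range(1, max_attempts + 1):
                try:
                    results[step] = actions[step](inputs)
                    break
                except Exception:
                    # Simple fault tolerance: retry, then give up.
                    if attempt == max_attempts:
                        raise
            sorter.done(step)
    return results


calls = {"B": 0}


def flaky_b(inputs):
    """Hypothetical step that fails once before succeeding."""
    calls["B"] += 1
    if calls["B"] == 1:
        raise RuntimeError("transient failure")
    return inputs["A"] + "b"


dependencies = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
actions = {
    "A": lambda inputs: "a",
    "B": flaky_b,
    "C": lambda inputs: inputs["A"] + "c",
    "D": lambda inputs: inputs["B"] + "+" + inputs["C"],
}

results = run_workflow(dependencies, actions)
print(results["D"])
```

A real engine would dispatch the ready steps to Grid resources in parallel instead of running them in a local loop; the ordering and retry logic stay the same.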

  32. Data Processing Applications • Batch programs, not interactive. • No requirements on language (Java, Fortran, C, IDL, …). • Applications do not have to be modified. • Applications can be configured by the user using runtime parameters. • A simple wrapper shell script can be written to handle the input, output and the runtime parameters. • The application itself can be stored on the Grid resource but also on a storage node (in this case only the wrapper script needs to be present on the Grid resource).
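
The wrapper idea on this slide can be sketched as follows. The slide describes a shell script; this illustration uses Python instead, and the option names (`--input`, `--output`, `--<key>`), the `./omi_sim` executable and the file names are hypothetical, not GridAssist conventions.

```python
import subprocess


def parse_params(pairs):
    """Turn ['key=value', ...] into a dict of runtime parameters."""
    params = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        params[key] = value
    return params


def build_command(app, input_path, output_path, params):
    """Build the argument list for the unmodified batch application.

    The option names used here are hypothetical placeholders.
    """
    command = [app, "--input", input_path, "--output", output_path]
    for key, value in sorted(params.items()):
        command += [f"--{key}", value]
    return command


def run_wrapped(app, input_path, output_path, pairs):
    """Run the application and return its exit code to the caller."""
    command = build_command(app, input_path, output_path, parse_params(pairs))
    return subprocess.run(command, check=False).returncode


# Show the command line the wrapper would run (nothing is executed here).
example = build_command(
    "./omi_sim", "orbit_0001.in", "orbit_0001.out",
    parse_params(["mode=nominal", "seed=42"]),
)
print(example)
```

Keeping the mapping from workflow parameters to command-line arguments in the wrapper is what lets the application itself stay unmodified.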
