
GAE (Grid Analysis Environment) Overview of Caltech effort

Slides for the Caltech GAE Workshop, June 2003.


Presentation Transcript


  1. Slides for the Caltech GAE Workshop, June 2003. GAE (Grid Analysis Environment): Overview of Caltech effort

  2. Overview
     • GAE crucial for LHC experiments
     • Utility of Grids proven for production
     • Their use for Analysis will be the Acid Test of Grids
     • Large, Diverse, Distributed community of users
     • Support for hundreds/thousands of analysis tasks
     • Widely varying requirements
     • Need for Priority Schemes, robust authentication and security
     • Operation in a severely resource-limited and constrained global system
     • GAE is where the physics gets done
     • Where physicists learn to collaborate on analysis at a distance

  3. Scope
     • Diagram shows a “snapshot” in time of analysis activities
     • Groups of individuals, geographically separated, work on specific analysis topics (e.g. Supersymmetry)
     • Resources in the Grid system are shared between the groups
     • Boundaries enclosing the groups move and change shape as the composition or requirements of the groups change

  4. Architecture
     • Several candidate computing system architectures have been proposed to support GAE
     • At Caltech we have defined the “CAIGEE” Architecture, in collaboration with UCSD, UCR, FNAL and UCD
     • Our work is focussed on developing critical missing components of the CAIGEE architecture, creating demonstration-grade applications to determine its validity, and working with other groups on integration of existing software into the CAIGEE scheme

  5. CAIGEE Architecture

  6. CAIGEE (continued)
     • Based on the use of Web Services or Portals to provide heterogeneous clients access to analysis tools and data
     • An important feature is support for even semi-infinitely thin clients, such as PDAs with very limited CPU/Memory
     • Grid Authentication and transport built in: mediates client/service (portal) traffic

  7. Web Services
     • Data/Processing services offered via the Web
     • Widely adopted in the commercial world
     • Good tools, de facto standard protocols, support etc.
     • We have been confirming their usefulness for scientific data and services
       • Access to RDBMS-resident Tags and nTuples (Oracle, SQLServer, PostgreSQL)
       • Access to ROOT files
       • Access to Objectivity object collections
     • To do this, we have updated existing tools to “talk” with Web Services:
       • ROOT
       • COJAC (3D event viewer)
       • Others

  8. Web Services - Principles
     • Publish makes the service description publicly available.
       • WSDL (Web Services Description Language) is the language used to create the service description.
     • Find discovers the web service.
       • UDDI (Universal Description, Discovery and Integration) is the directory technology used by service registries. The registries contain descriptions of web services and support lookup.
     • Bind allows the service to be used by the client.
       • SOAP (Simple Object Access Protocol) is the protocol through which the service provider, service registry and service requestor communicate.
     • Diagram: (1) Publish, between service provider and service registry; (2) Find, between service requestor and service registry; (3) Bind, between service requestor and service provider.
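The publish/find/bind cycle on this slide can be illustrated with a small in-memory sketch. It is not WSDL, UDDI or SOAP: the `ServiceDescription`, `ServiceRegistry`, `bind` and `count_tag_rows` names are hypothetical stand-ins invented for illustration, and a plain Python callable takes the place of a network endpoint.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Stand-in for a WSDL document: a name, keywords, and an endpoint."""
    name: str
    keywords: List[str]
    endpoint: Callable[..., object]  # stand-in for a SOAP endpoint URL


class ServiceRegistry:
    """Stand-in for a UDDI registry: stores descriptions, supports lookup."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        # Step 1 (Publish): the provider makes its description available.
        self._services[description.name] = description

    def find(self, keyword: str) -> List[ServiceDescription]:
        # Step 2 (Find): the requestor looks up services by keyword.
        return [d for d in self._services.values() if keyword in d.keywords]


def bind(description: ServiceDescription) -> Callable[..., object]:
    # Step 3 (Bind): the requestor obtains something it can call.
    return description.endpoint


def count_tag_rows(table: str) -> int:
    """Hypothetical provider operation returning canned row counts."""
    return {"tags": 3, "ntuples": 5}.get(table, 0)


registry = ServiceRegistry()
registry.publish(ServiceDescription("TagCount", ["tags", "rdbms"], count_tag_rows))
matches = registry.find("rdbms")
service = bind(matches[0])
print(service("tags"))  # prints 3
```

The point of the split is that the requestor only needs to know the registry: the provider can move or be replaced as long as it republishes a matching description.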

  9. Web Services: Experimental Setup (diagram; labels as recovered from the figure)
     • UDDI registry node (Service Registry): Oracle9i server, data (metadata); provided at the authentication and security layer of the Grid; data replication through SSL
     • Oracle9i server, data (metadata): available on the Fabric layer of the Grid
     • Java XML API to connect with the database server
     • Proxy server; SOAP HTTP server; SOAP processor
     • Distributed database: server with master database; server with materialized view database (MS-SQL, data and metadata)
     • WSDL file (Service Provider): available at the Connectivity and Resource layer of the Grid
     • SOAP bind with the provided service; UDDI; SOAP request and response
     • Client web application to connect with the database (Service Requestor)

  10. Example Web Services

  11. GAE Tools (1) Clarens
     • Our emphasis is on accommodating existing analysis tools in our CAIGEE architecture
     • To facilitate this, we use the “Clarens Dataserver”
     • Clarens is server software that makes datasets and services available to clients in a suitable lingua franca
     • Clients initially Grid-authenticate with a Clarens server, and are then able to make use of a wide set of data and analysis services on offer

  12. GAE Tools (2) Clarens
     • Clarens uses an interpreted Python framework running inside Apache
     • PKI security for CA certificates
     • Commodity protocols (http/https) used to talk with clients
     • Authorization of Web Service requests using hierarchical ACLs for Virtual Organisations
     • Distributed administration of VO/ACLs
     • Creating new Clarens services is straightforward: this was one of the design goals.
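One way to picture hierarchical ACLs for Virtual Organisations is the sketch below. The slides do not describe how Clarens actually evaluates its ACLs, so everything here is an assumption made for illustration: the dotted names for service paths and VO groups, the `ACLS` table, the "most specific entry wins" rule, and the default-deny fallback.

```python
from typing import Dict, List, Optional

# Hypothetical ACL table: dotted service path -> allow/deny lists of VO groups.
ACLS: Dict[str, Dict[str, List[str]]] = {
    "file": {"allow": ["cms"], "deny": []},
    "file.write": {"allow": ["cms.caltech"], "deny": ["cms"]},
}


def in_group(member_group: str, acl_group: str) -> bool:
    """A group matches itself and its ancestors (cms.caltech is inside cms)."""
    return member_group == acl_group or member_group.startswith(acl_group + ".")


def decide(entry: Dict[str, List[str]], group: str) -> Optional[bool]:
    """Return True/False if this entry mentions the group, else None."""
    allow = [g for g in entry["allow"] if in_group(group, g)]
    deny = [g for g in entry["deny"] if in_group(group, g)]
    if not allow and not deny:
        return None
    # The most specific (longest) matching group name wins; ties go to deny.
    best_allow = max((len(g) for g in allow), default=-1)
    best_deny = max((len(g) for g in deny), default=-1)
    return best_allow > best_deny


def is_authorized(method: str, group: str) -> bool:
    """Check ACL entries from the most specific service path to the least."""
    parts = method.split(".")
    for depth in range(len(parts), 0, -1):
        entry = ACLS.get(".".join(parts[:depth]))
        if entry is None:
            continue
        verdict = decide(entry, group)
        if verdict is not None:
            return verdict
    return False  # nothing matched: deny by default


print(is_authorized("file.write", "cms.caltech"))  # prints True
print(is_authorized("file.write", "cms.ucsd"))     # prints False
print(is_authorized("file.read", "cms.ucsd"))      # prints True
```

In this sketch a site administrator could own the `file.write` entry while the VO as a whole owns `file`, which is one reading of "distributed administration of VO/ACLs".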

  13. GAE Tools (4) Clarens
     • Services include:
       • Access to SOCATS (next slide)
       • Storage Resource Broker interface
       • Application execution (submit jobs to cluster schedulers)
       • Proxy escrow
       • File access to files in server filesystem or SRB files

  14. GAE Tools (5) SOCATS
     • “STL Optimized Caching and Transport System”
     • SOCATS is a general-purpose tool we have developed that is able to deliver large object collections (result sets) in response to an SQL query on an RDBMS
     • Targeted at C++ clients that wish to send an SQL query to a remote RDBMS (using the Clarens dataserver) and receive back the database rows/result set as a collection of C++ objects
     • Data delivered in binary format (avoids the heavy overhead of explicit XML encoding)
     • Large result sets are streamed efficiently to the client, allowing client processing to begin as soon as the first data are available
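The two ideas on this slide, fixed-layout binary rows and streaming in batches, can be sketched as follows. This is not SOCATS, which targets C++ clients and STL containers: it is a Python stand-in that uses an in-memory SQLite database in place of the remote RDBMS, and the `tags` table, the row layout and the `stream_rows` and `decode` helpers are all assumptions made for illustration.

```python
import sqlite3
import struct
from typing import Iterator, Tuple

# Assumed row layout: a 32-bit int (event id) and a 64-bit float (pt),
# little-endian with no padding, so every row is exactly 12 bytes.
ROW = struct.Struct("<id")


def stream_rows(conn: sqlite3.Connection, query: str, batch: int = 2) -> Iterator[bytes]:
    """Server side: run the query and yield packed rows one batch at a time."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(batch)
        if not rows:
            break
        yield b"".join(ROW.pack(event_id, pt) for event_id, pt in rows)


def decode(chunks: Iterator[bytes]) -> Iterator[Tuple[int, float]]:
    """Client side: unpack each chunk as soon as it arrives."""
    for chunk in chunks:
        yield from ROW.iter_unpack(chunk)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (event_id INTEGER, pt REAL)")
conn.executemany("INSERT INTO tags VALUES (?, ?)", [(1, 10.5), (2, 22.0), (3, 7.25)])

query = "SELECT event_id, pt FROM tags ORDER BY event_id"
events = list(decode(stream_rows(conn, query)))
print(events)  # prints [(1, 10.5), (2, 22.0), (3, 7.25)]
```

Because both sides are generators, the client can start working on the first batch before the server has fetched the last one, and each row costs 12 bytes on the wire rather than a set of XML elements.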

  15. GAE Tools (6) GroupMan
     • Developed in response to the need for user-friendly administration of LDAP-based “Virtual Organisations”
     • Import to the LDAP server of certificates from CA
     • User-friendly GUI allows ad hoc creation of user groups and VOs
     • VO data stored to allow easy extraction by standard Grid-based tools
       • E.g. creation of Globus gridmap files
     • Part of the DPE distribution
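As a rough illustration of extracting VO data into a Globus gridmap file, the sketch below turns group membership into lines of the customary `"certificate DN" local_account` form. It does not query LDAP and is not GroupMan code: the `VO_MEMBERS` and `GROUP_ACCOUNTS` tables, the example DNs and the `gridmap_lines` helper are hypothetical.

```python
from typing import Dict, List

# Hypothetical VO contents: group name -> certificate subject DNs.
VO_MEMBERS: Dict[str, List[str]] = {
    "susy-analysis": [
        "/O=Example Grid/OU=People/CN=Alice Example",
        "/O=Example Grid/OU=People/CN=Bob Example",
    ],
    "higgs-analysis": ["/O=Example Grid/OU=People/CN=Alice Example"],
}

# Hypothetical mapping from each VO group to a shared local account.
GROUP_ACCOUNTS: Dict[str, str] = {"susy-analysis": "susy", "higgs-analysis": "higgs"}


def gridmap_lines(members: Dict[str, List[str]], accounts: Dict[str, str]) -> List[str]:
    """Return one gridmap line per distinct DN, listing all of its accounts."""
    by_dn: Dict[str, List[str]] = {}
    for group in sorted(members):
        for dn in members[group]:
            by_dn.setdefault(dn, []).append(accounts[group])
    # The DN is quoted because it contains spaces; accounts are comma-separated.
    return [f'"{dn}" {",".join(accts)}' for dn, accts in sorted(by_dn.items())]


lines = gridmap_lines(VO_MEMBERS, GROUP_ACCOUNTS)
print("\n".join(lines))
```

Sorting groups and DNs keeps the generated file stable between runs, which makes it easier to see what changed when VO membership is edited.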

  16. GAE Tools (5) PDA Client
     • A handheld GAE client: fruits of collaboration between NUST and Caltech
     • Software is Java Analysis Studio (JAS) ported to the Pocket PC 2002 OS
     • Hardware is any Pocket PC 2002 device
     • This tool is still under development and currently lacks authentication/security components

  17. GAE Tools (6) Collaboration Desktop
     • Four-screen desktop analysis setup
     • Driven by a single server and single graphics card
     • Four flat panel monitors
     • Allows simultaneous work on:
       • Traditional analysis tools (e.g. ROOT)
       • Software development (e.g. VS.NET)
       • Event displays (e.g. IGUANA)
       • MonALISA monitoring displays
       • Persistent collaboration (e.g. VRVS)
       • Online event or detector monitoring
       • Web browsing, email
       • Chat windows, instant messaging
       • Shared whiteboards etc.
