
  1. Frank van Lingen (fvlingen@caltech.edu). Grid Enabled Analysis: Prototype, Status and Results (on behalf of the GAE collaboration). Caltech, University of Florida, NUST, UBP

  2. ARCHITECTURE

  3. System view
  • Support 100-1000s of analysis/production tasks
  • Batch and interactive use (interactive should be really interactive!)
  • "Chaotic" behavior (not only production-like workflows)
  • Resource limited and policy constrained: who is allowed to access which resources?
  • Real-time monitoring and trend analysis
  • Workflow tracking and data provenance
  • Collaborate on analysis (country-wide, world-wide)
  • Provide secure access to data and resources
  • Self-organizing (prevent the "100 system administrators" nightmare)
  • Detect bottlenecks within the grid (network, storage, CPU) and take action without human intervention
  • Secure, robust, fast data transfer
  • High-level services: autonomous replication, steering of jobs, workflow management (service flows, data analysis flows)
  • Create a robust end-to-end system for physics analysis
  • No single point of failure
  • Composite services
  • Provide a simple access point for the user, while performing complex tasks behind the scenes
  GAE is not all "new" development; it also focuses on integrating existing components, which can include experiment-specific applications.

  4. User view
  • Provide a transparent environment for a physicist to perform his/her analysis (batch or interactive) in a distributed, dynamic environment: identify your data (catalogs), submit your (complex) job (scheduling, workflow, JDL), get "fair" access to resources (priority, accounting), monitor job progress (monitoring, steering), get the results (storage, retrieval), then repeat the process and refine the results.
  • I want to share my results/code with a selected audience!
  • I want access to data as quickly as possible!
  (Diagram, simplistic view: identify and locate data in Catalogs; submit and execute on a Farm via the Scheduler; monitor/steer the job; store, notify and move results via Storage.)

  5. Example implementations associated with GAE components
  • Analysis Clients talk standard protocols (HTTP, SOAP, XML-RPC) to the "Grid Services Web Server", a.k.a. the Clarens data/services portal.
  • A simple web service API allows Analysis Clients (simple or complex) to operate in this architecture.
  • Typical clients: ROOT (analysis tool), web browser, IGUANA (CMS visualization tool), COJAC (detector visualization), Python scripts.
  • The Clarens portal hides the complexity of the Grid Services from the client, but can expose as much detail as required, e.g. for monitoring.
  • Key features: global scheduler, catalogs, monitoring, and a grid-wide execution service.
  (Architecture diagram: Analysis Clients connect over HTTP/SOAP/XML-RPC to the Clarens Grid Services Web Server (discovery, ACL management, certificate-based access), which fronts the Scheduler (Sphinx, with fully-abstract, partially-abstract and fully-concrete planners), Catalogs (metadata, RefDB, POOL, replica), Virtual Data (Chimera), Data Management (MCRunjob, MOPDB), Monitoring (MonALISA, BOSS), Applications (ORCA, FAMOS, ROOT) and Grid Execution (priority manager, VDT server, grid-wide execution service).)
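
  A minimal sketch of what such a client call can look like from Python, using only the standard library's XML-RPC support; the server URL and the echo/file service methods are taken from the example on slide 10 and may no longer be live (the dedicated Clarens client shown there wraps calls like these).

    import xmlrpc.client

    # Talk to the Clarens portal over plain XML-RPC; dotted names map to
    # service.method on the server side.
    portal = xmlrpc.client.ServerProxy("http://tier2c.cacr.caltech.edu:8080/clarens/")

    print(portal.echo.echo("alive?"))               # sanity check: is the server up?
    print(portal.file.ls("/web/system", "*.html"))  # remote file access service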

  6. Peer-to-Peer System
  • Allows a "peer-to-peer" configuration to be built, with the associated robustness and scalability features:
  • Discovery of services
  • No single point of failure
  • Robust file download
  (Diagram: a client discovers services (e.g. a catalog) through any reachable peer, queries the catalog for data, and downloads the file (File_x) from whichever peer holds a replica.)
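
  The robustness comes from not hard-wiring any single server into the client: any known peer can be asked where a service or file lives, and the client falls back to the next peer on failure. A small sketch of that idea, assuming hypothetical catalog.locate and file.read method names (the real Clarens APIs may differ).

    import xmlrpc.client

    # Bootstrap list of known peers; any one of them is enough to find the rest.
    PEERS = [
        "http://peer1.example.org:8080/clarens/",
        "http://peer2.example.org:8080/clarens/",
    ]

    def download_with_failover(logical_file, local_path):
        """Try each known peer until one can locate and serve the file."""
        for peer_url in PEERS:
            try:
                peer = xmlrpc.client.ServerProxy(peer_url)
                # Hypothetical calls: ask which hosts hold the file, then read
                # it from the first host that answers.
                for host in peer.catalog.locate(logical_file):
                    blob = xmlrpc.client.ServerProxy(host).file.read(logical_file)
                    with open(local_path, "wb") as out:
                        out.write(blob.data)   # XML-RPC binary payload
                    return host
            except Exception:
                continue   # peer down or file unknown there: try the next one
        raise RuntimeError("no peer could serve " + logical_file)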

  7. Self Organizing
  (Diagram: real-time feedback and trend analysis from monitoring drive job scheduling and job steering; autonomous replica management replicates heavily used datasets (Data_1, Data_2) to additional sites and removes replicas that are no longer used.)
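
  A toy sketch of the kind of policy an autonomous replica manager could apply; the thresholds and the shape of the monitoring input are invented, only the replicate/remove idea comes from the slide.

    # Replicate datasets whose access rate is trending up at a site that does not
    # hold them yet, and drop replicas that have gone cold (toy thresholds).
    REPLICATE_ABOVE = 100.0   # accesses/hour that justify an extra replica
    REMOVE_BELOW = 1.0        # accesses/hour below which a replica is dropped

    def plan_replication(access_rates, replicas):
        """access_rates: {(dataset, site): accesses/hour from trend analysis}
        replicas:     {dataset: set of sites currently holding a copy}"""
        actions = []
        for (dataset, site), rate in access_rates.items():
            holders = replicas.get(dataset, set())
            if rate > REPLICATE_ABOVE and site not in holders:
                actions.append(("replicate", dataset, site))
            elif rate < REMOVE_BELOW and site in holders and len(holders) > 1:
                actions.append(("remove", dataset, site))
        return actions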

  8. Development (see the talks on Clarens and Sphinx for more details)

  9. GAE Backbone: Clarens Service Framework
  • X509 certificate-based access
  • Good performance
  • Access control management
  • Remote file access
  • Dynamic discovery of services on a global scale
  • Available in Python and Java
  • Easy to install, as root or as a normal user, and part of the DPE distribution. As root:
    wget -q -O - http://hepgrid1.caltech.edu/clarens/setup_clump.sh |sh
    export opkg_root=/opt/openpkg
  • Interoperability with other web service environments, such as Globus, through SOAP
  • Interoperability with MonALISA (publication of service methods via MonALISA)
  (Screenshots: monitoring of Clarens parameters; service publication.)
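
  Certificate-based access means the client presents its grid certificate when talking to a Clarens server over HTTPS. A minimal sketch with the Python standard library; the HTTPS endpoint is hypothetical and the certificate paths are just the conventional ~/.globus locations.

    import ssl
    import xmlrpc.client

    # Present the user's grid certificate to the server; adjust the paths and
    # the trusted-CA setup to your grid installation.
    ctx = ssl.create_default_context()
    ctx.load_cert_chain(certfile="/home/user/.globus/usercert.pem",
                        keyfile="/home/user/.globus/userkey.pem")

    portal = xmlrpc.client.ServerProxy(
        "https://hepgrid1.example.org:8443/clarens/",  # hypothetical endpoint
        context=ctx)
    print(portal.echo.echo("authenticated?"))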

  10. Clarens (cont.): 3rd-party applications served through Clarens
  • POOL catalog
  • RefDB/PubDB (CERN)
  • BOSS
  • PhEDEx (CERN)
  • MCRunJob/MOPDB (FNAL)
  • Sphinx (grid scheduler) (UFL)
  • ...
  Python client example:
    dbsvr = Clarens.clarens_client('http://tier2c.cacr.caltech.edu:8080/clarens/')
    dbsvr.echo.echo('alive?')
    dbsvr.file.size('index.html')
    dbsvr.file.ls('/web/system', '*.html')
    dbsvr.file.find(['//web'], '*', 'all')
    dbsvr.catalog.getMetaDataSpec('cat4')
    dbsvr.catalog.queryCatalog('cat4', 'val1 LIKE "%val%"', 'meta')
    dbsvr.refdb.listApplications('cms', 0, 20, 'Simulation')
  Grid portal: secure certificate-based access to services through a web browser (http/https).
  Clients: Java client, ROOT (analysis tool), IGUANA (CMS viz. tool), ROOT-CAVES client (analysis sharing tool), CLASH (Giulio), ... any application that can make XML-RPC/SOAP calls.

  11. Clarens Grid Portals
  (Screenshots: job execution, catalog service, collaborative analysis desktop, and a PDA client.)

  12. MonALISA Integration
  • Query repositories for monitoring information
  • Gather and publish access patterns on collections of data
  • Publish web service information for discovery in other distribution systems

  13. SPHINX: Grid scheduler
  • Data warehouse: policies, account information, grid weather, resource properties and status, request tracking, workflows
  • Control process: a finite state machine; different modules modify jobs, graphs and workflows
  • Flexible, extensible, with simple sanity checks
  • 120 canonical virtual data workflows submitted to the US-CMS Grid, comparing two strategies (see the sketch below):
  • Round-robin strategy: distribute work equally to all sites
  • Upper-limit strategy: uses global information (site capacity) and throttles jobs using just-in-time planning; about 40% better throughput (for the given grid topology)
  (Diagram: the Sphinx client runs alongside Chimera, Condor-G/DAGMan and the VDT client software and talks through Clarens to the Sphinx server, whose request processing sits on a web-service backbone with data-warehouse and data-management modules; VDT server sites provide Globus resources, the Replica Location Service and the MonALISA monitoring service for resource information gathering.)
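
  A toy comparison of the two strategies, with invented site capacities; the real Sphinx planner works from the data warehouse and monitoring state rather than in-memory dictionaries.

    from itertools import cycle

    # Invented example: site -> number of job slots (the "upper limit").
    CAPACITY = {"site_a": 10, "site_b": 40, "site_c": 5}

    def round_robin(jobs):
        """Hand jobs to sites in turn, ignoring how loaded they are."""
        sites = cycle(sorted(CAPACITY))
        return {job: next(sites) for job in jobs}

    def upper_limit(jobs, running):
        """Just-in-time planning: only place a job on a site with a free slot,
        otherwise keep it queued (throttling). running: {site: busy slots}."""
        placement, queued = {}, []
        for job in jobs:
            free = [s for s in CAPACITY if running.get(s, 0) < CAPACITY[s]]
            if free:
                site = max(free, key=lambda s: CAPACITY[s] - running.get(s, 0))
                placement[job] = site
                running[site] = running.get(site, 0) + 1
            else:
                queued.append(job)   # planned later, when capacity opens up
        return placement, queued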

  14. CODESH
  • Virtual log-book for "shell" sessions (see the sketch below for the basic idea)
  • Parts can be local (private) or shared
  • Tracks environment variables, aliases, etc. during a session
  • Reproduces complete working sessions
  • First prototypes use popular tools (Python, ROOT and CVS); e.g. all ROOT commands and CAVES commands are available
  • Three-tier architecture isolates the client from back-end details; different implementations are possible
  • Lightweight clients (using ROOT, C++, Python, e.g. via the CVS API)
  • Back-ends: e.g. CVS pservers (remote stores) with read/write access control; ARCH, Clarens, etc.
  • Optional MySQL servers for metadata (fast search for large data volumes)
  • More info: http://bourilko.home.cern.ch/bourilko/dpf04caves.ppt
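
  Not CODESH itself, just a toy illustration of the virtual log-book idea: record the environment and every command of a session so the session can be stored in a shared back-end and replayed later.

    import json
    import os
    import subprocess
    import time

    class SessionLog:
        """Record the environment and the commands of a working session."""

        def __init__(self, path):
            self.path = path
            self.entries = {"environment": dict(os.environ), "commands": []}

        def run(self, command):
            result = subprocess.run(command, shell=True,
                                    capture_output=True, text=True)
            self.entries["commands"].append(
                {"time": time.time(), "cmd": command, "rc": result.returncode})
            return result.stdout

        def save(self):
            with open(self.path, "w") as out:
                json.dump(self.entries, out, indent=2)

    def replay(path):
        """Re-run a logged session under its recorded environment."""
        with open(path) as logfile:
            log = json.load(logfile)
        for entry in log["commands"]:
            subprocess.run(entry["cmd"], shell=True, env=log["environment"])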

  15. Physics Analysis
  • Both Florida and Caltech successfully moved DC04 data from FNAL
  • The DC04 catalog and data are distributed over several hosts, and the catalog is available as a web service
  • A complex CMS ORCA example is available within CODESH
  • ...

  16. Services within GAE
  (Diagram: map of GAE services and contributing sites. Services include Chimera, PubDB, the POOL catalog, BOSS, TMDB, RefDB, a Monte Carlo processing service, MOPDB, MCRunJob, CODESH, Sphinx, MonALISA, SRM, SRB, GROSS, service discovery, ACL management, VO management and file access; contributing sites include CERN, Caltech, FNAL, UFL and INFN. The legend distinguishes Clarens core services, services being developed, services accessible through a GAE web service, services with a JavaScript front end, and services on the wish list to become a service or to interoperate with one.)

  17. Deployment

  18. GAE testbed
  New Clarens distributions register automatically with MonALISA (note that the same server can appear as several entries, one per protocol). There are approximately 20+ installations worldwide, including CACR, CERN, Pakistan, UCSD and the UK, plus a conference user.

  19. Querying for datasets
  (1) The client discovers the POOL catalog, RefDB and grid scheduler services via discovery services.
  (2) It queries the POOL catalog replicas (hosted on several servers) and RefDB for the dataset.
  (3) It submits ORCA/ROOT job(s) with the selected dataset(s) for reconstruction/analysis to the grid scheduler (runjob); some of these services are not yet deployed.
  Client code has no knowledge of the location of services, except for a few URLs of discovery services; multiple clients can query and submit jobs concurrently. TODO: integration of PubDB. (See the sketch below for this flow in code.)
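
  A rough sketch of that three-step flow as a client script; the discovery and job-submission method names are assumptions, while the catalog call mirrors the example on slide 10.

    import xmlrpc.client

    DISCOVERY_URLS = ["http://discovery1.example.org:8080/clarens/"]  # bootstrap only

    def find_service(kind):
        """(1) Ask a discovery service where a given kind of service lives.
        'discovery.find' is a placeholder for the real discovery API."""
        for url in DISCOVERY_URLS:
            try:
                return xmlrpc.client.ServerProxy(url).discovery.find(kind)[0]
            except Exception:
                continue
        raise RuntimeError("no discovery service reachable")

    catalog = xmlrpc.client.ServerProxy(find_service("pool_catalog"))
    scheduler = xmlrpc.client.ServerProxy(find_service("grid_scheduler"))

    # (2) Query a catalog replica for the dataset (call shape as on slide 10).
    files = catalog.catalog.queryCatalog("cat4", 'dataset LIKE "%ttbar%"', "meta")

    # (3) Submit an ORCA/ROOT job over the selected files; placeholder method.
    job_id = scheduler.scheduler.submit({"application": "ORCA", "inputs": files})
    print("submitted job", job_id)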

  20. Scheduling: push model
  (1) The client submits ORCA/ROOT job(s) with dataset(s) for reconstruction/analysis to SPHINX.
  (2) SPHINX queries the MonALISA monitors of each farm for resource status.
  (3) SPHINX submits the job(s) to a farm through a uniform job submission layer (BOSS on top of PBS or Condor-G); potentially BOSS can also be used for global job submission.
  The push model has limitations once the system becomes resource limited. (A toy sketch of the placement decision follows below.)
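
  A toy sketch of the push decision: poll per-farm monitoring, pick the least loaded farm and push the job there. Farm names, the load metric and the submission callable are all invented.

    def push_schedule(job, farms, query_load, submit):
        """farms:          list of farm identifiers
        query_load(f)  -> fraction of slots busy on farm f (step 2, from monitoring)
        submit(f, job) -> hand the job to farm f's BOSS/batch layer (step 3)"""
        loads = {farm: query_load(farm) for farm in farms}   # (2) query status
        target = min(loads, key=loads.get)                   # pick least loaded farm
        if loads[target] >= 1.0:
            # Every farm is full: the push model has no good answer here, which
            # is exactly the resource-limited case noted on the slide.
            raise RuntimeError("all farms saturated; job has to wait at the scheduler")
        submit(target, job)                                  # (3) push the job
        return target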

  21. Scheduling: pull model
  (1) The client submits ORCA/ROOT job(s) with dataset(s) for reconstruction/analysis; the jobs are placed in a queue.
  (2) Farms with available resources announce "resources are available, give me a job".
  (3) The farm pulls job(s) from the queue and runs them through the uniform job submission layer (BOSS on top of PBS or Condor-G), monitored by MonALISA.
  Combining push and pull gives better scalability. (A toy farm-side loop follows below.)
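
  The complementary farm-side loop, as a toy sketch: whenever local batch slots free up, the farm asks the central queue for work. The queue API and the local submission callable are invented.

    import time

    def pull_agent(queue, free_slots, run_locally, poll_seconds=30):
        """queue.take(n)  -> up to n queued jobs (placeholder for the queue API)
        free_slots()   -> number of idle local slots (PBS/Condor-G via BOSS)
        run_locally(j) -> submit job j to the local batch system"""
        while True:
            idle = free_slots()
            if idle > 0:
                for job in queue.take(idle):   # (2)+(3): advertise capacity, pull jobs
                    run_locally(job)
            time.sleep(poll_seconds)           # otherwise just keep polling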

  22. Global and local managers (similarity with other approaches)
  (1) The client discovers a global manager. Client code and the global manager have no knowledge of the location of services, except for a few discovery URLs; multiple clients query and submit jobs.
  (2) The client requests a session for a dataset.
  (3) The global manager discovers a catalog service.
  (4) It gets the list of farms that hold this dataset.
  (5) It reserves processing time with the local managers of those farms.
  (6) It allocates time for the session.
  (7) The client submits its job(s); data is moved to the nodes (dCache) and access statistics are reported to MonALISA.
  (8) The local manager creates the job.
  (9) The local manager signals when the data is ready/moved.
  (10) An alive signal is sent during processing.
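
  A very rough sketch of the global manager's side of this session protocol; every function, object and method name here is invented purely to make the numbered steps concrete, and the real services would be discovered Clarens endpoints rather than Python objects.

    def open_session(discovery, dataset):
        """Steps (3)-(6): locate the dataset and reserve processing time."""
        catalog = discovery.find("catalog")                              # (3)
        farms = catalog.farms_holding(dataset)                           # (4)
        grants = [farm.reserve_process_time(dataset) for farm in farms]  # (5)
        return {"dataset": dataset, "farms": farms, "grants": grants}    # (6)

    def run_jobs(session, job, monitor):
        """Steps (7)-(10): stage data, create jobs, report to monitoring."""
        for farm in session["farms"]:
            farm.move_data_to_nodes(session["dataset"])        # (7) stage via dCache
            farm.create_job(job)                               # (8)
        monitor.report_access_statistics(session["dataset"])   # (7) stats to MonALISA
        while not all(farm.data_ready() for farm in session["farms"]):  # (9)
            monitor.alive_signal(session)                      # (10) heartbeat while waiting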

  23. SUMMARY

  24. Lessons learned
  • Quality of (the) service(s):
  • A lot of exception handling is needed for robust services (graceful failure of services)
  • Time-outs are important (see the sketch after this list)
  • Composite services need very good performance
  • A discovery service enables location-independent service composition
  • The semantics of services are important (different names, namespaces, and/or WSDL)
  • Web service design: not every application is developed with a web service interface in mind
  • Interfaces of 3rd-party applications change: rapid application development
  • Social engineering: finding out what people want/need
  • Overlapping functionality of applications (but not the same interfaces!)
  • Not one single solution for CMS
  • Not every problem has a technical solution; conventions are also important
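
  As a concrete illustration of the exception-handling and time-out points, here is a minimal, generic wrapper one might put around remote calls inside a composite service; it is not taken from the GAE code base.

    import socket
    import time
    import xmlrpc.client

    def call_with_timeout(url, method, *args, timeout=10, retries=2, backoff=2.0):
        """Call an XML-RPC method (e.g. 'catalog.queryCatalog') with a socket
        time-out and a few retries, so a slow or dead service fails gracefully
        instead of hanging the composite service that depends on it."""
        last_error = None
        for attempt in range(retries + 1):
            old = socket.getdefaulttimeout()
            socket.setdefaulttimeout(timeout)
            try:
                func = xmlrpc.client.ServerProxy(url)
                for part in method.split("."):
                    func = getattr(func, part)
                return func(*args)
            except (OSError, xmlrpc.client.Fault,
                    xmlrpc.client.ProtocolError) as err:
                last_error = err
                time.sleep(backoff * (attempt + 1))   # back off before retrying
            finally:
                socket.setdefaulttimeout(old)
        raise RuntimeError("%s on %s failed after retries: %s"
                           % (method, url, last_error))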

  25. Future work
  • Integrate runjob into the current deployment of services
  • Full chain of end-to-end analysis
  • Develop/deploy an accounting service (PPDG activity)
  • Steering service
  • Autonomous replication
  • Trend analysis using monitoring data
  • Improve exception handling
  • Integrate/interoperate mass storage applications (e.g. SRM) with the Clarens environment

  26. GAE Pointers
  • GAE web page: http://ultralight.caltech.edu/gaeweb/
  • Clarens web page: http://clarens.sourceforge.net
  • Service descriptions: http://hepgrid1.caltech.edu/GAE/services/
  • MonALISA: http://monalisa.cacr.caltech.edu/
  • SPHINX: http://www.griphyn.org/sphinx/Research/research.php
