The ARDA Project: Between Grid middleware and LHC experiments

Juha Herrala, ARDA Project

GridPP 10th Collaboration Meeting, CERN, 04 June 2004

EGEE is a project funded by the European Union under contract IST-2003-508833

  • Introduction to the LCG ARDA Project

    • History and mandate

    • Relation to EGEE project

    • ARDA prototypes

    • Relation to Regional Centres

  • Towards ARDA prototypes


  • Coordination and forum activities

    • Workshops and meetings

  • Conclusions and Outlook

GridPP@CERN, 04 June 2004

How ARDA evolved

  • The LHC Computing Grid (LCG) project’s Requirements and Technical Assessment Group (RTAG) for distributed analysis presented its ARDA report in November 2003.

    • ARDA = Architectural Roadmap for Distributed Analysis

    • Defined a set of collaborating Grid services and their interfaces

  • As a result the ARDA project was launched by LCG

    • ARDA = A Realisation of Distributed Analysis

    • Purpose is to coordinate different activities in the development of distributed analysis systems of the LHC experiments, which will be based on the new service-oriented Grid middleware and infrastructure.

  • But the generic Grid middleware is developed by the EGEE project

    • ARDA was sometimes also used as a synonym for this “second generation” Grid middleware, which was later (May 2004) renamed gLite.

    • Generic = no significant functionality that is of interest only to HEP or to any other single science/community.

Our starting point / mandate: recommendations of the ARDA working group

  • New service decomposition

    • Strong influence of the AliEn system

      • the Grid system developed by the ALICE experiment and used by a wide scientific community (not only HEP)

  • Role of experience, existing technology…

    • Web service framework

  • Interfacing to existing middleware to enable their use in the experiment frameworks

  • Early deployment of (a series of) end-to-end prototypes to ensure functionality and coherence

    • Middleware as a building block

    • Validation of the design

LCG ARDA project

  • LCG strongly linked with middleware developed/deployed in EGEE (continuation of EDG)

  • The core infrastructure of the EGEE Grid operation service will grow out of the LCG service

    • LCG includes many US and Asian partners

    • EGEE includes other sciences

    • Substantial part of infrastructure common to both

  • Parallel production lines

    • LCG-2 production Grid

      • 2004 data challenges

    • Pre-production prototype

      • EGEE/gLite MW

      • ARDA playground for the LHC experiments


ARDA prototypes

  • Support the LHC experiments in implementing their end-to-end analysis prototypes based on the EGEE/gLite middleware

    • ARDA will equally support each of the LHC experiments

    • Close collaboration with data analysis teams, ensuring end-to-end coherence of the prototypes

    • One prototype per experiment

  • Role of ARDA

    • Interface with the EGEE middleware

    • Adapt/verify components of analysis environments of the experiments (robustness/many users, performance/concurrent “read” actions)

    • A Common Application Layer may emerge in future

    • Feedback from the experiments to the middleware team

  • Final target beyond the prototype activity: sustainable distributed analysis services for the four experiments deployed at LCG Regional Centres.

ARDA @ Regional Centres

  • Regional Centres have valuable practical experience and know-how

    • Understand “deployability” issues, which is a key factor for (EGEE/gLite) middleware success

    • Database technologies

    • Web Services

  • Some Regional Centres will have the responsibility to provide early installation for the middleware

    • EGEE Middleware test bed

    • Pilot sites might enlarge the resources available and give fundamental feedback in terms of “deployability” to complement the EGEE SA1

  • Running ARDA pilot installations

    • ARDA test bed for analysis prototypes

    • Experiment data available where the experiment prototype is deployed

  • Stress and performance tests could ideally be located outside CERN

    • Experiment-specific components (e.g. a metadata catalogue) which might be used by the ARDA prototypes

    • Exploit local know-how of the Regional Centres

  • Final ARDA goal: sustainable analysis service for LHC experiments

ARDA project team

Massimo Lamanna

Birger Koblitz

Derek Feichtinger

Andreas Peters

Dietrich Liko

Frederik Orellana

Julia Andreeva

Juha Herrala

Andrew Maier

Kuba Moscicki

Andrey Demichev

Viktor Pose

Wei-Long Ueng

Tao-Sheng Chen

  • Experiment interfaces

    • Piergiorgio Cerello (ALICE)

    • David Adams (ATLAS)

    • Lucia Silvestris (CMS)

    • Ulrik Egede (LHCb)

Towards ARDA prototypes

  • Existing systems as starting point

    • Every experiment has different implementations of the standard services

    • Used mainly in production environments

    • Now more emphasis on analysis

Prototype activity

  • Provide fast feedback to the EGEE MW development team

    • Avoid uncoordinated evolution of the middleware

    • Coherence between users’ expectations and final product

  • Experiments may benefit from the new MW as soon as possible

    • Frequent snapshots of the middleware available

    • Expose the experiments (and the community in charge of the deployment) to the current evolution of the whole system

    • Experiments’ systems are very complex and still evolving

  • Move forward towards new-generation real systems (analysis!)

    • A lot of work (experience and useful software) is invested in current data challenges of the experiments, which makes them a concrete starting point

    • Whenever possible, adapt/complete/refactor the existing components: we do not need yet another system!

  • Attract and involve users

    • Prototypes need realistic workload and conditions, so real users from the LHC experiments are required!

Prototype activity

  • The initial prototype will have a reduced scope of functionality

    • Components are currently being selected for the first prototype

  • Not all use cases/operation modes will be supported

    • Every experiment has a production system (with multiple backends, like PBS, LCG, G2003, NorduGrid, …).

    • We focus on end-user analysis on an EGEE MW-based infrastructure

  • Informal Use Cases are still being defined, e.g. a generic analysis case:

    • A physicist selects a data sample (from current Data Challenges)

    • With an example/template as starting point (s)he prepares a job to scan the data

    • The job is split into sub-jobs and dispatched to the Grid; error recovery is performed automatically if necessary, and the results are finally merged back into a single output

    • The output (histograms, ntuples) is returned together with simple information on the job-end status
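The generic analysis case above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a local thread pool stands in for the Grid, the "histogram" is a plain Counter, and the names split, run_with_retry, analyse and toy_task are hypothetical, not part of any ARDA or experiment software.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(files, n_subjobs):
    """Split the input file list into at most n_subjobs non-empty chunks."""
    chunks = [files[i::n_subjobs] for i in range(n_subjobs)]
    return [chunk for chunk in chunks if chunk]


def run_with_retry(task, chunk, max_attempts=3):
    """Run one sub-job, resubmitting it if it fails (simple error recovery)."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task(chunk)
        except RuntimeError as err:
            last_error = err
    raise RuntimeError(f"sub-job failed after {max_attempts} attempts") from last_error


def analyse(files, task, n_subjobs=4):
    """Split, dispatch, recover and merge; returns (merged output, job status)."""
    chunks = split(files, n_subjobs)
    # The thread pool is a stand-in for dispatching sub-jobs to the Grid.
    with ThreadPoolExecutor(max_workers=n_subjobs) as pool:
        partials = list(pool.map(lambda c: run_with_retry(task, c), chunks))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds the counts together
    return merged, {"subjobs": len(chunks), "status": "done"}


def toy_task(chunk):
    """Hypothetical scan: histogram of the file-name lengths in one chunk."""
    return Counter(len(name) for name in chunk)
```

A call such as analyse(list_of_files, toy_task, n_subjobs=3) returns the merged histogram together with a small status dictionary, mirroring the last step of the use case.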

Towards ARDA prototypes

  • LHCb - ARDA

  • GANGA as a principal component

    • Friendly user interface for Grid services

  • The LHCb/GANGA plans match naturally with the ARDA mandate

    • Goal is to enable physicists (via GANGA) to analyse the data being produced during 2004 for their studies

    • Having the prototype where the LHCb data will be is key (CERN, RAL, …)

  • At the beginning, the emphasis will be on

    • Usability of GANGA

    • Validation of the splitting and merging functionality of user jobs

  • The DIRAC system is also an important component

    • LHCb grid system, used mainly in production so far

    • Useful target to understand the detailed behaviour of LHCb-specific grid components, like the file catalog.

  • Convergence between DIRAC and GANGA anticipated.

GANGA: Gaudi/Athena aNd Grid Alliance

  • Gaudi/Athena: LHCb/ATLAS frameworks

    • Athena uses Gaudi as its foundation

  • Single “desktop” for a variety of tasks

  • Help users configure and submit analysis jobs

  • Keep track of what users have done, completely hiding all technicalities

    • Resource Broker, LSF, PBS, DIRAC, Condor

    • Job registry stored locally or in the roaming profile

    • Automate config/submit/monitor procedures

  • Provide a palette of possible choices and specialized plug-ins (pre-defined application configurations, batch/grid systems, etc.)

  • Friendly user interface (CLI/GUI) is essential

    • GUI Wizard Interface

      • Help users to explore new capabilities

      • Browse job registry

    • Scripting/Command Line Interface

      • Automate frequent tasks

      • Python shell embedded into the Ganga GUI

[Figure labels: GAUDI Program, Internal Model, Grid Services]

ARDA contribution to GANGA

  • Release management procedure established

    • Software process and integration

      • Testing, bug fix releases, tagging policies, etc.

    • Infrastructure

      • Installation, packaging etc.

    • ARDA team member in charge

  • Integration with job managers/resource brokers

    • Waiting for the EGEE middleware, we developed an interface to Condor

    • Use of Condor DAGMAN for splitting/merging and error recovery capability

  • Design and development in the near future

    • Integration with EGEE middleware

    • Command Line Interface

    • Evolution of Ganga features
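To make the DAGMan bullet above more concrete, the sketch below generates the text of a DAGMan input file for a split/analyse/merge workflow with automatic retries. It uses only the standard DAGMan keywords JOB, VARS, RETRY and PARENT ... CHILD. The function name make_dag, the node names and the submit-file names (split.sub, analysis.sub, merge.sub) are assumptions made for illustration and do not describe the actual GANGA-Condor interface.

```python
def make_dag(n_subjobs, retries=3):
    """Return the text of a DAGMan input file: split -> N analysis nodes -> merge."""
    if n_subjobs < 1:
        raise ValueError("need at least one sub-job")
    names = [f"ana{i}" for i in range(n_subjobs)]
    # split.sub, analysis.sub and merge.sub are hypothetical Condor submit files.
    lines = ["JOB split split.sub"]
    for i, name in enumerate(names):
        lines.append(f"JOB {name} analysis.sub")
        # VARS passes a per-node macro that the submit file can reference as $(chunk).
        lines.append(f'VARS {name} chunk="{i}"')
        # RETRY asks DAGMan to rerun a failed node up to `retries` times.
        lines.append(f"RETRY {name} {retries}")
    lines.append("JOB merge merge.sub")
    # The analysis nodes start after the split; the merge waits for all of them.
    lines.append(f"PARENT split CHILD {' '.join(names)}")
    lines.append(f"PARENT {' '.join(names)} CHILD merge")
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and passing it to condor_submit_dag would, under these assumptions, run the splitting, the parallel sub-jobs with error recovery, and the final merge as one workflow.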

CERN/Taiwan tests on LHCb metadata catalogue


  • Clone BookkeepingDB in Taiwan

  • Install the WS layer

  • Performance Tests

    • Database I/O Sensor

    • Bookkeeping Server performance tests

      • Taiwan/CERN Bookkeeping Server DB

      • XML-RPC Service performance tests

      • CPU Load, Network send/receive sensor, Process time

    • Client Host performance tests

      • CPU Load, Network send/receive sensor, Process time

  • Feedback to LHCb metadata catalogue developers
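The client-side part of such a test (many virtual users issuing queries while the process time is recorded) could look roughly like the sketch below. It is not the tool used in the CERN/Taiwan tests: the function names are hypothetical, the query is any Python callable, and the URL and method name passed to make_xmlrpc_query are placeholders to be replaced by a real service. CPU load, network and database I/O sensors are outside its scope.

```python
import statistics
import time
import xmlrpc.client
from concurrent.futures import ThreadPoolExecutor


def make_xmlrpc_query(url, method, *args):
    """Build a query callable for an XML-RPC service (url and method are placeholders)."""
    def query():
        # One proxy per call, so that no connection is shared between threads.
        proxy = xmlrpc.client.ServerProxy(url)
        return getattr(proxy, method)(*args)
    return query


def virtual_user(query, n_calls):
    """One virtual user: issue n_calls queries in sequence and time each one."""
    timings = []
    for _ in range(n_calls):
        start = time.perf_counter()
        query()
        timings.append(time.perf_counter() - start)
    return timings


def load_test(query, n_users, n_calls):
    """Run n_users virtual users concurrently and summarise the time per query."""
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        per_user = list(pool.map(lambda _: virtual_user(query, n_calls), range(n_users)))
    timings = [t for user in per_user for t in user]
    return {
        "queries": len(timings),
        "mean_s": statistics.mean(timings),
        "max_s": max(timings),
    }
```

Repeating load_test with an increasing n_users, once against each server location, gives a simple picture of how the response time degrades with the number of concurrent clients.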

[Figure labels: Virtual Users; Network monitor; Bookkeeping Server (CPU load, network, process time; Web & XML-RPC Service performance tests); Oracle DB (DB I/O sensor)]

Towards ARDA prototypes




  • ATLAS has a relatively complex strategy for distributed analysis, addressing different areas with specific projects

    • Fast response (DIAL)

    • User-driven analysis (GANGA)

    • Massive production with multiple Grids, etc…

    • For additional information see the ATLAS Distributed Analysis (ADA) site:

  • The ATLAS system within ARDA has been agreed

    • Starting point is the DIAL service model for distributed interactive analysis; users will be exposed to a different user interface (GANGA)

  • The AMI metadata catalog is a key component in the ATLAS prototype

    • MySQL as a back end

    • Genuine Web Server implementation

    • Robustness and performance tests from ARDA

  • In the start-up phase, ARDA provided some assistance in developing production tools






AMI studies in ARDA

Studied behaviour using many concurrent clients:

  • ATLAS metadata catalogue, contains file metadata:

    • Simulation/Reconstruction-Version

    • Does not contain physical filenames

  • Many problems still open:

    • Large network traffic overhead due to schema-independent tables

    • SOAP Web Services proxy supposed to provide DB access

      • Note that Web Services are “stateless” (no automatic handles providing the concept of a session, transaction, etc.): 1 query = 1 (full) response

    • Large queries might crash the server

    • Should the SOAP front-end proxy re-implement all the database functionality?

  • Good collaboration in place with ATLAS-Grenoble
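The “1 query = 1 (full) response” constraint can be illustrated with a toy example. The sketch below is not how AMI works and is not a solution proposed in this talk; it only shows one common way to live with a stateless front end, where every request is self-contained and the client, not the server, remembers its position in a large result. An in-memory SQLite table stands in for the back-end database, and the schema and function names are hypothetical.

```python
import sqlite3


def make_catalogue(n_files):
    """Create a toy in-memory metadata table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, lfn TEXT, version TEXT)")
    db.executemany(
        "INSERT INTO files (lfn, version) VALUES (?, ?)",
        [(f"lfn_{i:04d}", "v1") for i in range(n_files)],
    )
    return db


def stateless_query(db, version, after_id=0, limit=100):
    """One self-contained request: no session or cursor survives between calls."""
    rows = db.execute(
        "SELECT id, lfn FROM files WHERE version = ? AND id > ? ORDER BY id LIMIT ?",
        (version, after_id, limit),
    ).fetchall()
    # A full page means there may be more rows; the client passes this id back.
    next_after = rows[-1][0] if len(rows) == limit else None
    return {"rows": rows, "next_after": next_after}


def fetch_all(db, version, limit=100):
    """Client-side loop that assembles a large result from bounded responses."""
    after_id, result = 0, []
    while after_id is not None:
        page = stateless_query(db, version, after_id, limit)
        result.extend(page["rows"])
        after_id = page["next_after"]
    return result
```

Each response is bounded by limit, so a large selection no longer has to be built and shipped in one piece, at the price of pushing part of the database cursor logic into the client.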

Towards ARDA prototypes








  • Strategy:

    • The ALICE-ARDA prototype will evolve the analysis system presented at SuperComputing 2003 (‘Grid-enabled PROOF’)

  • Where to improve:

    • Heavily connected with the middleware services

    • “Inflexible” configuration

    • No chance to use PROOF on federated grids like LCG

    • User libraries distribution

  • Activity on PROOF

    • Robustness

    • Error recovery

[Figure labels: Site A, Site B, Site C]

Improved PROOF system

  • Original problem: no support for hierarchical Grid infrastructure, only local cluster mode.

  • The remote PROOF slaves look like local PROOF slaves on the master machine

  • Booking service is usable also on local clusters

[Figure labels: Proxy rootd, Grid Services, Proxy proofd]

Towards ARDA prototypes

  • CMS - ARDA

[Figure labels: Summaries of successful jobs; T0 worker; GDB castor]

  • The CMS system within ARDA is still under discussion

  • Providing easy access to (and possibly sharing of) data for CMS users is a key issue in the discussions

CMS DC04 production

CMS RefDB


  • Potential starting point for the prototype

  • Bookkeeping engine to plan and steer the production across different phases (simulation, reconstruction, to some degree into the analysis phase)

  • Contains all necessary information except file physical location (RLS) and info related to the transfer management system (TMDB)

  • The actual mechanism to provide these data to analysis users is under discussion

  • Performance measurements are underway (same philosophy as for the LHCb metadata catalogue measurements)

Coordination and forum activities

  • Forum activities are seen as ‘fundamental’ in the ARDA project definition

    • ARDA will channel information to the appropriate recipients, especially to analysis-related activities and projects outside the ARDA prototypes

    • Ensures that new technologies can be exposed to the relevant community

  • ARDA should organise a set of regular meetings

    • Aim is to discuss results, problems, new/alternative solutions and possibly agree on some coherent program of work. Workshop every three months.

    • The ARDA project leader organises this activity, which will be truly distributed and led by the active partners

  • ARDA is embedded in EGEE

    • NA4, Application Identification and Support

  • Special relation with LCG

    • LCG GAG is a forum for Grid requirements and use cases

    • Experiment representatives coincide with the EGEE NA4 experiment representatives

Workshops and meetings

  • 1st ARDA workshop

    • January 2004 at CERN; open

    • Over 150 participants

  • 2nd ARDA workshop “The first 30 days of EGEE middleware”

    • June 21-23 at CERN; by invitation

    • Expected 30 participants

  • EGEE NA4 Meeting mid July

    • NA4/JRA1 (middleware) and NA4/SA1 (Grid operations) sessions

    • Organised by M. Lamanna and F. Harris

  • 3rd ARDA workshop

    • Currently scheduled for September 2004 close to CHEP; open

Next ARDA workshop: “The first 30 days of the EGEE middleware”

  • CERN, 21-23 June 2004

    • Exceptionally by invitation only

  • Monday, June 21

    • ARDA team / JRA1 team

    • ATLAS (Metadata database services for HEP experiments)

  • Tuesday, June 22

    • LHCb (Experience in building web services for grid)

    • CMS (Data management)

  • Wednesday, June 23

    • ALICE (Interactivity on the Grid)

    • Close out

  • Info on the web:


Conclusions and Outlook

  • LCG ARDA has started

    • Main objective: experiment prototypes for analysis

    • EGEE/gLite middleware becoming available

    • Good feedback from the LHC experiments

    • Good collaboration within EGEE project

    • Good collaboration with Regional Centres. More help needed.

  • Main focus

    • Prototyping distributed analysis systems of LHC experiments.

    • Collaborate with the LHC experiments, the EGEE middleware team and the Regional Centres to set up the end-to-end prototypes.

  • Aggressive schedule

    • Milestone for the first end-to-end prototypes is already December 2004.

