This proposal addresses the challenges of managing massive data in the BABAR project through a collaborative Grid system involving Tier A and B sites. Users can seamlessly access and process data, submit jobs, and retrieve results across multiple nodes. The proposal aims to create a unified system for data distribution, management, and job submission, leading to a streamlined computing process. Targets include a proof-of-principle demonstrator in 12 months and a production-quality system integrated into BABAR within 1-3 years.
A BABAR InterGrid Testbed
Proposal for discussion
Robin Middleton / Roger Barlow
Rome: October 2001
Background
• BABAR exists
• Already many millions of events, many Terabytes of data
• “Not data challenges but challenging data”
• Data increasing faster than SLAC computer center can buy computers
• Must use distributed computing model
• Tier A sites: SLAC and IN2P3, RAL…
• Tier B sites at universities
• Users will want to use such sites in a unified environment ‘as if’ they were working at SLAC
• Solution: the Grid
Many aspects
1. Data distribution
   • Smart network copying, tape archiving, etc.
2. Data management
   • Multiple copies, reprocessed data, selection of data, full and DST formats, with and without Objectivity
3. Job submission
   • “Run this job on this data” without specifying details
This proposal tackles Number 3 (job submission)
The sites
[Diagram of sites: MAN, RAL, SLAC, IN2P3, plus further sites marked ‘?’]
Use Case
• User prepares binary as usual
• Specify data to run on with Metadata description
• Press ‘Go’ button
• Wait for output
Behind the scenes
• Data split into many jobs running on many nodes (possibly at several sites)
• Take jobs to the data: transfer binaries (if necessary), control files, environment
• Run jobs, monitor, restart(?) failures…
• Collate output files
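The split / run / collate steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the proposed system: the names `DataFile`, `Job`, `split_into_jobs`, `run_with_retry` and `collate` are hypothetical, the "output" of each job is assumed to be a simple event count, and a single retry stands in for the open question of restarting failures.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    name: str
    site: str  # hypothetical: the site holding a copy of this file


@dataclass
class Job:
    site: str    # where the job runs (jobs go to the data)
    files: list  # names of the files this job processes


def split_into_jobs(files, max_files_per_job):
    """Group files by hosting site, then chunk each group into jobs."""
    by_site = defaultdict(list)
    for f in files:
        by_site[f.site].append(f.name)
    jobs = []
    for site in sorted(by_site):
        names = by_site[site]
        for i in range(0, len(names), max_files_per_job):
            jobs.append(Job(site, names[i:i + max_files_per_job]))
    return jobs


def run_with_retry(job, run, max_attempts=2):
    """Call run(job); on RuntimeError, try again up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return run(job)
        except RuntimeError as exc:
            last_error = exc
    raise RuntimeError(
        f"job at {job.site} failed after {max_attempts} attempts"
    ) from last_error


def collate(outputs):
    """Merge per-job outputs; here each output is assumed to be a count."""
    return sum(outputs)
```

A caller would build the file list from the metadata selection, call `split_into_jobs`, run each job through `run_with_retry` with a site-specific `run` function, and pass the results to `collate`.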
Issues
• Mutual recognition of
   • certificates
   • accounts
• Common environment at sites
   • DLLs
   • Databases
• How to match jobs to nodes (‘Want ads’)
   • What to specify
• Application to
   • Reprocessing
   • MC production
   • User analysis
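One way to picture the ‘Want ads’ idea of matching jobs to nodes is a pair of advertisements, one from the job and one from the node, compared field by field. The sketch below is an assumption-laden illustration: the field names (`dataset`, `release`, `free_slots`) and the greedy first-fit policy are hypothetical choices, and what should actually be specified is exactly the open issue listed above.

```python
def matches(job_ad, node_ad):
    """True if the node advertises everything the job asks for."""
    return (
        job_ad["dataset"] in node_ad["datasets"]
        and job_ad["release"] in node_ad["releases"]
        and node_ad["free_slots"] > 0
    )


def match_jobs_to_nodes(job_ads, node_ads):
    """Greedy first-fit matching; returns {job id: node name or None}."""
    # Track remaining slots separately so the input ads are not modified.
    slots = {node["name"]: node["free_slots"] for node in node_ads}
    assignment = {}
    for job in job_ads:
        assignment[job["id"]] = None
        for node in node_ads:
            candidate = dict(node, free_slots=slots[node["name"]])
            if matches(job, candidate):
                assignment[job["id"]] = node["name"]
                slots[node["name"]] -= 1
                break
    return assignment
```

A job that finds no suitable node is left unassigned (`None`) rather than being sent somewhere it cannot run; a real system would presumably queue it or trigger a data transfer instead.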
Who’s doing what
• Work within grids is proceeding
   • E.g. Manchester <-> RAL within GridPP
• BABAR computing effort specifically assigned to the problem
• UK-French discussion ongoing
• Aim of this InterGrid project is to bring all this together in one homogeneous system
Targets
• Proof-of-principle demonstrator in 12 months, leading to
• Production-quality system becoming part of BABAR way of life in 1-3 years