‘Real World’ issues from DC04

• DC04:
  • Trying to operate the CMS computing system at 25 Hz for one month
  • We are three days in!
  • We are using components that are ready NOW
    • Even if it’s not politically correct
    • Often using several different approaches for comparison
• This talk: concentrates on data management issues
  • ‘Real World’ issues that have come up during DC04 preparation
  • Stuff that is not (yet) well covered by the available tools
• I know that…
  • Some issues may be application problems, not middleware ones
  • Some issues may be covered by components under development
  • Some issues may be self-inflicted injuries
Dave Newbold, University of Bristol
GridPP Middleware Meeting

Directed data transfer
• Data management ‘type I’: replica management
  • The (automatic?) movement of data products to where they are needed; managing relevant system and application metadata
  • Best-effort optimisation of data location in response to dynamic workload needs
  • Well covered by current and future middleware
• Data transfer ‘type II’: bulk data management
  • The predictable straight(ish) ‘production line’ of data flow
    • Detector -> DAQ -> Buffer -> Reco farm -> T1 -> MSS -> calib -> …
  • Requirements are different to replica management
    • Robustness and reliability paramount (raw data is the ‘crown jewels’)
    • Throughput is very important: ‘best effort’ is not good enough
  • Not explicitly addressed by current middleware products
  • Data distribution is explicitly ‘directed’ by policy
    • ‘Seeds’ the replica management system from the Tier-1s

Directed data transfer
• Our current solution
  • Cooperating system of simple ‘agents’ at Tier-0 and Tier-1
  • They communicate only through a shared (Oracle) DB
  • They have little or no state - it’s all held in the central DB
  • Could this be useful as generic middleware?
• Other related issues:
  • Lack of a single consistent interface to MSS (in Europe and US) makes life difficult (being addressed?)
  • There are very many failure modes in the data management system that we must think of…
    • Would be good to factorise out the problems of failing storage components by having the MSS ‘remap’ our data when required
    • Predict at least one disk failure per day somewhere in DC04
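To make the 'agents plus shared DB' idea concrete, here is a minimal sketch of the pattern, not the DC04 implementation. It assumes SQLite from the Python standard library as a stand-in for the Oracle DB, and the table name `transfer_state`, the state labels and the function names are all hypothetical. Each agent holds no state of its own: it claims one file by changing its row in the shared table, does its work, and records the outcome in the same row.

```python
import sqlite3

# Hypothetical schema: one row per file, recording where it is in the production line.
SCHEMA = """
CREATE TABLE IF NOT EXISTS transfer_state (
    guid  TEXT PRIMARY KEY,
    state TEXT NOT NULL
)
"""

def open_db(path=":memory:"):
    # isolation_level=None puts sqlite3 in autocommit mode, so we issue BEGIN ourselves.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(SCHEMA)
    return conn

def register_file(conn, guid, state="at_t0"):
    conn.execute("INSERT INTO transfer_state (guid, state) VALUES (?, ?)", (guid, state))

def claim_next(conn, from_state, to_state):
    """Atomically move one file from from_state to to_state; return its GUID or None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT guid FROM transfer_state WHERE state = ? ORDER BY guid LIMIT 1",
            (from_state,),
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE transfer_state SET state = ? WHERE guid = ?", (to_state, row[0])
            )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return None if row is None else row[0]

def run_agent_once(conn, from_state, working_state, done_state, action):
    """One agent step. Returns (guid, succeeded), or None if there was nothing to do."""
    guid = claim_next(conn, from_state, working_state)
    if guid is None:
        return None
    try:
        action(guid)  # e.g. copy the file from Tier-0 to a Tier-1
    except Exception:
        # Put the file back so that any agent can retry it later.
        conn.execute(
            "UPDATE transfer_state SET state = ? WHERE guid = ?", (from_state, guid)
        )
        return (guid, False)
    conn.execute("UPDATE transfer_state SET state = ? WHERE guid = ?", (done_state, guid))
    return (guid, True)
```

Because all progress lives in the table, an agent that crashes can simply be restarted. A real system would also need a timeout to recover files left stranded in the working state, which this sketch omits.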

Data transfer tools
• Need low-level transfer tools that:
  • Log what is going on! (We have ad-hoc solutions here for DC04)
  • Adjust policy automatically for optimum throughput according to network conditions
  • Fail gracefully when something is wrong at an end-point
  • Play nice with firewalls, etc
  • NB: performance is not currently the problem, but the tools are…
• Checksumming
  • We would like a system that performs fast file-level checksum of data ON THE DISK
  • No, TCP checksum does not catch all errors
    • Silent disk problems, filesystem errors, NFS problems, etc etc
  • Checksumming data from MSS after-the-fact is very difficult
• Would also like:
  • Some SIMPLE means of distributed, authenticated, atomic, reliable message-passing between agents over the Grid
    • With a command-line level API for scripting
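As an illustration of the file-level checksum wish, the sketch below streams a file through Adler-32 from the Python standard library and compares source and destination after a copy. The choice of Adler-32 and the function names `file_adler32` and `verify_copy` are assumptions made for this example; the talk does not name an algorithm or tool.

```python
import zlib

def file_adler32(path, chunk_size=1 << 20):
    """Return the Adler-32 of a file as 8 hex digits, reading it in chunks."""
    checksum = 1  # Adler-32 starting value
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            checksum = zlib.adler32(chunk, checksum)  # running checksum
    return f"{checksum & 0xFFFFFFFF:08x}"

def verify_copy(source_path, dest_path):
    """True if both files have the same checksum (hypothetical post-transfer check)."""
    return file_adler32(source_path) == file_adler32(dest_path)
```

One caveat: reading a file back immediately after writing it may be served from the operating system's cache rather than the disk, so a check like this does not by itself prove that the bytes ON THE DISK are good.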

Other issues…
• Small files!
  • They seem to be inevitable, but play havoc with efficiency:
    • Huge lists of files in catalogues
    • Not dealt with efficiently by MSS, transfer tools, etc
  • Basic unit of information management: data produced by one MC, reco or filter job during its run (with unique GUID)
  • Do not want to make jobs too long… (too much state in the system)
  • Can aggregation help? Perhaps, but we need the tools
• Metadata
  • Currently a ‘hot topic’?
  • How to handle efficient distribution of system- and user-level metadata?
  • Which metadata are immutable after creation? Which need to be distributed widely? How to handle schema extension on a per-user basis?
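On the aggregation question, one possible shape for such a tool is sketched below: pack many small files into a single uncompressed tar archive and keep a manifest, so that the catalogue, MSS and transfer tools see one large file while individual members stay retrievable. This is only an illustration using the Python standard library; the function names and the manifest layout are hypothetical, and nothing here is part of DC04.

```python
import os
import tarfile

def aggregate(paths, archive_path):
    """Pack small files into one tar archive; return a manifest {member_name: size_in_bytes}."""
    manifest = {}
    with tarfile.open(archive_path, "w") as tar:  # "w" = uncompressed tar
        for path in paths:
            name = os.path.basename(path)
            if name in manifest:
                raise ValueError(f"duplicate member name: {name}")
            tar.add(path, arcname=name)
            manifest[name] = os.path.getsize(path)
    return manifest

def read_member(archive_path, name):
    """Return the bytes of one member without unpacking the whole archive."""
    with tarfile.open(archive_path, "r") as tar:
        member = tar.extractfile(name)  # raises KeyError if the name is not in the archive
        if member is None:
            raise ValueError(f"{name} is not a regular file")
        return member.read()
```

In practice the member names would probably be the job GUIDs, and the manifest would have to live in the catalogue alongside the archive entry, which is exactly the tooling the slide says is missing.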
