
R-GMA and DØ

Iain Bertram, RAL, 13 May 2004. Thanks to Jeff Templon at Nikhef.



  1. R-GMA and DØ. Iain Bertram, RAL, 13 May 2004. Thanks to Jeff Templon at Nikhef.

  2. Background
     • DØ uses SAM as its data grid (http://projects.fnal.gov/samgrid/).
     • All official MC production is carried out off-site, i.e. not at FNAL, and the output is stored in SAM.
     • A significant fraction of data reprocessing was carried out off-site, with data accessed and stored in SAM.

  3. DØ and EDG/LCG
     • The Nikhef group have implemented submission of DØ jobs on LCG:
     • MC production
     • Data reconstruction
     • Notes from Jeff Templon.
     • Caveat: Jeff is the expert. I am not! Therefore I may have trouble answering questions (my technical experts are at the 4 corners of the globe…).

  4. Monitoring using R-GMA
     • From within a Python script:
         worker_node = socket.getfqdn()
         site = worker_node[string.find(worker_node,'.')+1:]
         jstabl.set_val('site',site)
         jstabl.set_val('start_time',start_time)
         cmdline = string.join(sys.argv)
         jstabl.set_val('command',cmdline)
         jstabl.insert()
     • Under the hood: R-GMA (an EDG product).
     • It can easily be replaced as long as the script requires nothing more than “set_val” and “insert” …
     • R-GMA has an SQL-like structure.
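The slide's point that the back end can be swapped as long as the script only calls "set_val" and "insert" can be sketched as below. This is a minimal Python 3 illustration, not the Nikhef code: `JobStatusTable`, `site_from_fqdn` and `publish_job_start` are hypothetical names, and a plain Python list stands in for R-GMA as the place where rows end up.

```python
import socket
import time


class JobStatusTable:
    """Hypothetical stand-in for the slide's `jstabl` object.

    It exposes only set_val() and insert(), so the storage behind it
    (R-GMA on the slide, a plain list here) can be swapped freely.
    """

    def __init__(self, backend):
        self._backend = backend  # assumption: any object with append(dict)
        self._row = {}

    def set_val(self, column, value):
        # Stage one column of the row that insert() will publish.
        self._row[column] = value

    def insert(self):
        # Publish a copy of the staged row, then start a fresh one.
        self._backend.append(dict(self._row))
        self._row = {}


def site_from_fqdn(fqdn):
    """Everything after the first dot; a name without a dot is returned unchanged."""
    return fqdn[fqdn.find(".") + 1:]


def publish_job_start(table, argv, fqdn=None, start_time=None):
    """Publish the same three columns as the slide's snippet."""
    fqdn = socket.getfqdn() if fqdn is None else fqdn
    start_time = time.time() if start_time is None else start_time
    table.set_val("site", site_from_fqdn(fqdn))
    table.set_val("start_time", start_time)
    table.set_val("command", " ".join(argv))
    table.insert()
```

In a job wrapper this would be called as `publish_job_start(JobStatusTable(records), sys.argv)`, where `records` is whatever back end is in use.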

  5. J. Templon Comments
     • It was useful not to have to worry about details such as where the servers are; you just use commands such as "DEFINE TABLE", "INSERT" or "LATEST SELECT".
     • R-GMA looked like a giant distributed database.
     • The SQL model worked well for what we wanted to do.
     • The downside is that the archiver process (the thing that sucks in the published records from jobs and puts them in a database) is not ready for prime time.
     • It never stays up for more than a few days at a time, and it often dies in a way that fools the babysitting script into thinking that it is still alive.
     • This of course is deadly.
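The SQL model described in these comments can be illustrated locally. The sketch below uses Python's built-in `sqlite3` as a stand-in, not R-GMA itself; the table and column names are hypothetical, and reading "LATEST SELECT" as "the most recent record per job" is an assumption made for the example.

```python
import sqlite3


def make_job_status_db():
    """Create an in-memory table, playing the role of "DEFINE TABLE"."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_status ("
        " job_id TEXT, site TEXT, status TEXT, published_at REAL)"
    )
    return conn


def publish(conn, job_id, site, status, published_at):
    """Publish one monitoring record, playing the role of "INSERT"."""
    conn.execute(
        "INSERT INTO job_status VALUES (?, ?, ?, ?)",
        (job_id, site, status, published_at),
    )


def latest_per_job(conn):
    """Return the most recent record for each job_id (assumed "latest" semantics)."""
    return conn.execute(
        "SELECT js.job_id, js.site, js.status FROM job_status AS js"
        " JOIN (SELECT job_id, MAX(published_at) AS t"
        "       FROM job_status GROUP BY job_id) AS m"
        " ON js.job_id = m.job_id AND js.published_at = m.t"
        " ORDER BY js.job_id"
    ).fetchall()
```

A job that publishes "running" and later "done" shows up once in `latest_per_job`, with status "done", while the full history stays in the table.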

  6. LCG/EDG Problems
     • Single storage machine => bottleneck
     • “WP5” SEs
     • Traffic jams
     • R-GMA not really stable until end of December
     • Couldn’t submit jobs
     • Missed monitoring records
     • Software distribution reliable but extremely inefficient
     • Poor submission command throughput

  7. Plans
     • All MC and data production will be running on the SAM computational grid by summer.
     • MC by June 1.
     • Data reprocessing scheduled for later this year.
     • FNAL DØ farm will move to SAM-grid.
     • Plan to support interfaces to LCG for this processing.
     • Runjob will interface directly to LCG.

  8. Needs
     • Database proxy servers
     • Need to access trigger/calibration issues
     • Oracle database
     • The DB proxy design is in principle generic, being based on CORBA (Common Object Request Broker Architecture), which wraps the SQL queries. A two-stage cache is used, RAM and disk, and the size of each is configurable; the cache sizes we currently have configured are on the order of a couple of GB.
     • Interface between SE and SAM?
     • Can store our files directly to SAM from an LCG site
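The two-stage cache mentioned for the DB proxy can be sketched as follows. This is an illustration of the idea only, not the DØ proxy: `TwoStageCache` and its parameters are hypothetical, the `fetch` callable stands in for the real database query path, and the RAM stage is bounded by entry count rather than by bytes to keep the example short (the disk stage is left unbounded).

```python
import hashlib
import os
import pickle
import tempfile
from collections import OrderedDict


class TwoStageCache:
    """Look up query results in RAM first, then on disk, then at the source."""

    def __init__(self, fetch, disk_dir=None, ram_entries=128):
        self._fetch = fetch              # assumption: callable(query) -> picklable result
        self._ram = OrderedDict()        # least recently used entry first
        self._ram_entries = ram_entries
        self._disk_dir = disk_dir or tempfile.mkdtemp(prefix="dbproxy-cache-")
        os.makedirs(self._disk_dir, exist_ok=True)

    def _path(self, query):
        # One file per query text, named by its hash.
        digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
        return os.path.join(self._disk_dir, digest + ".pkl")

    def _remember(self, query, result):
        # Put the result in RAM and evict the oldest entries if over the limit.
        self._ram[query] = result
        self._ram.move_to_end(query)
        while len(self._ram) > self._ram_entries:
            self._ram.popitem(last=False)

    def get(self, query):
        if query in self._ram:           # stage 1: RAM
            self._ram.move_to_end(query)
            return self._ram[query]
        path = self._path(query)
        if os.path.exists(path):         # stage 2: disk
            with open(path, "rb") as fh:
                result = pickle.load(fh)
        else:                            # miss: ask the database, then persist
            result = self._fetch(query)
            with open(path, "wb") as fh:
                pickle.dump(result, fh)
        self._remember(query, result)
        return result
```

A query evicted from RAM is still answered from disk without another round trip to the database, which is the benefit of having the second stage.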
