SAMGrid

SAMGrid. GridPP11, Liverpool, Sept 2004. Gavin Davies, Imperial College London.

Presentation Transcript


  1. SAMGrid
     • GridPP11, Liverpool, Sept 2004
     • Gavin Davies, Imperial College London

  2. Introduction
     • Tevatron
        • Less data than LHC, but still PBs/experiment and growing
        • Running experiments
     • SAM (Sequential Access to Metadata)
        • Well-developed metadata and distributed data replication system
        • Developed by DØ & FNAL-CD
     • JIM (Job Information and Monitoring)
        • Handles job submission and monitoring (all but data handling)
     • SAM + JIM → SAMGrid (computational grid)
     • Runjob handles job workflow management
     • See http://cdinternal.fnal.gov/RUNIIRev2004/runIIMP.asp

  3. SAMGrid Architecture

  4. SAM plots
     • DØ usage
        • Up to 200 TB/month
        • Over 2 PB in the last year
     • CDF usage now similar
        • Has just topped the PB
     • Active SAM sites
        • 40 DØ, 26 CDF

  5. SAMGrid plots
     • JIM active execution sites: 11 DØ, 1 CDF in testing
     • http://samgrid.fnal.gov:8080/ (09/09/04)

  6. SAMGrid plots

  7. DØ – Production – MC
     • All DØ MC always produced off-site
     • SAMGrid now default (went into production in March 2004)
        • Based on request system and jobmanager-mc_runjob
        • MC software package retrieved via SAM
     • Currently running at (multiple) sites in Cz, Fr, UK, USA (10 in total + FNAL)
        • More on the way, including central farm
     • Average production efficiency ~90%
     • Average inefficiency due to grid infrastructure ~1-5%
     • For more details, see
        • GridPP10 DØ talk by Peter Love
        • http://www-d0.fnal.gov/computing/grid/deployment-issues.html

  8. DØ – Production – Reprocessing
     • P14, Autumn 2003
        • 25M events in UK
        • Based around mc_runjob
        • Distributed computing rather than Grid
        • UK effort key to project success
     • P17, Autumn 2004
        • ×10 larger, use of db proxy servers
        • SAMGrid as default
        • Use LCG resources

  9. DØ – Production – LCG
     • Increasing effort to ensure SAMGrid / LCG interoperability
     • MC generated on EDG/LCG and other shared resources (including Imperial, RAL) “by hand”
        • All Nikhef MC produced this way
     • Demo of sam_client functionality on LCG at London workshop in April
     • Will use LCG resources for p17 data reprocessing

  10. (DØ –) Runjob
      • Diagram labels: Runjob; CDFRunjob, CMSRunjob, DØRunjob
      • mc_runjob currently used by SAMGrid for MC and reprocessing
      • DØrunjob: the rewrite
         • Joint CDF, CMS, DØ, FNAL-CD project
         • Base classes from common Runjob package
         • DØrunjob available this autumn
         • Will incorporate Sandbox as a separate module
      • For details see: http://projects.fnal.gov/runjob/

  11. CDF – Production – I
      • See Mòrag Burgon-Lyon’s GridPP10 talk for details
      • Goal 1: 25% of computing off-site by June 2004
         • Done, using DCAF and SAM
         • DCAF = de-centralised CDF analysis farm; core of 7 sites, more on the way
      • Goal 2: 50% by June 2005, using Grid
         • Resources being identified / pledged
         • JIM deployment
            • Originally planned for Oct 15th
            • Problematic; looking at grid3 as a possible alternative

  12. CDF – Production – II
      • Migration of DCAF sites to Condor
      • Migration to SAM V6
         • Switch to new internal dbserve code under test
         • Roll-out to global sites expected soon
      • FroNTier: new way to serve database contents to remote institutes
         • Should lower load on central CDF Oracle servers
      • Studying methods to lower load and avoid fragmentation on remote file servers due to simultaneous network writes

  13. (CDF –) SAM TV
      • SAM TV used by CDF & DØ to monitor SAM and SAM stations
      • Currently created from log files
      • Version in development created from MIS database, filled by new MIS server

  14. Summary / plans
      • DØ
         • SAM & SAMGrid critical
         • GridPP key part of effort
         • SAMGrid default for
            • MC production
            • Data reprocessing from autumn
            • Analysis to follow
               • dØ tools, dØrte, sandboxing
         • Interoperability
      • CDF
         • Good progress
            • 25% of computing off-site
            • Most with DCAF/SAM
         • GridPP effort key part of effort
         • Increase to 50% for June 2005
            • More DCAF installations
            • Encourage user migration
      • UKLight, 10 Gbit/s: “data reprocessing”

  15. Backup – I
      • From Peter Love’s GridPP10 talk
