
DØSAR a Regional Grid within DØ


Presentation Transcript


  1. DØSAR, a Regional Grid within DØ. Jae Yu, Univ. of Texas, Arlington. THEGrid Workshop, July 8 – 9, 2004, Univ. of Texas at Arlington

  2. The Problem • High Energy Physics • Total expected data size is over 5 PB (a 5,000-inch stack of 100 GB hard drives) for CDF and DØ • Detectors are complicated → need many people to construct and operate them • Collaboration is large and scattered all over the world • Allow software development at remote institutions • Optimized resource management, job scheduling, and monitoring tools • Efficient and transparent data delivery and sharing • Use the opportunity of having a large data set to further grid computing technology • Improve computational capability for education • Improve quality of life

  3. DØ and CDF at Fermilab Tevatron [Aerial view of the Tevatron ring near Chicago with the CDF and DØ detectors and the p/pbar beams marked] • World's highest-energy proton-antiproton collider • Ecm = 1.96 TeV (= 6.3×10⁻⁷ J/p → ~13 MJ on ~10⁻⁶ m²) • Equivalent to the kinetic energy of a 20 t truck at a speed of 80 mi/hr (a quick check of this analogy follows)
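As a check of the truck analogy on the slide (the 80 mi/hr ≈ 36 m/s conversion and the arithmetic are mine), the kinetic energy of a 20 t truck at that speed is indeed about 13 MJ:

\[
E_K = \tfrac{1}{2} m v^2 \approx \tfrac{1}{2}\,(2\times10^{4}\ \mathrm{kg})\,(36\ \mathrm{m/s})^{2} \approx 1.3\times10^{7}\ \mathrm{J} \approx 13\ \mathrm{MJ}.
\]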

  4. DØ Collaboration: 650 collaborators, 78 institutions, 18 countries

  5. Centralized Deployment Models • Started with the Lab-centric SAM infrastructure in place… • …then transitioned to a hierarchically distributed model

  6. DØ Remote Analysis Model (DØRAM) [Diagram: the Central Analysis Center (CAC) at Fermilab is linked to Regional Analysis Centers (RACs), which serve Institutional Analysis Centers (IACs), which in turn serve Desktop Analysis Stations (DASs); normal and occasional interaction/communication paths are shown separately]
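For orientation, the tier structure can be written down as a small schematic tree. Only the tier names and the Fermilab CAC come from the slide; the nesting below that level is a placeholder, not the real deployment:

# Schematic of the DØRAM hierarchy: one CAC at Fermilab feeding RACs,
# each serving IACs, each serving desktop analysis stations (DASs).
# Everything below the CAC level is illustrative only.
DORAM_TIERS = {
    "CAC (Fermilab)": {
        "RAC (regional center)": {
            "IAC (institutional center)": ["DAS", "DAS"],  # desktop stations
        },
    },
}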

  7. DØ Southern Analysis Region (DØSAR) • One of the regional grids within the DØGrid • A consortium coordinating activities to maximize computing and analysis resources, complementing the European efforts • UTA, OU, LTU, LU, SPRACE, Tata, KSU, KU, Rice, UMiss, CSF, UAZ • MC farm clusters: a mixture of dedicated and multi-purpose, rack-mounted and desktop machines, with tens to hundreds of CPUs each • http://www-hep.uta.edu/d0-sar/d0-sar.html

  8. DØRAM Implementation [Map of DØRAM sites: UTA, OU/LU, KU, KSU, LTU, Rice, Ole Miss, UAZ, and Mexico/Brazil in the Americas; GridKa (Karlsruhe), Aachen, Bonn, Wuppertal, Mainz, and Munich in Europe] • UTA is the first US DØRAC • DØSAR formed around UTA

  9. UTA – RAC (DPCC) • 84 P4 Xeon 2.4 GHz CPUs = 202 GHz • 7.5 TB of disk space • 100 P4 Xeon 2.6 GHz CPUs = 260 GHz • 64 TB of disk space • Total CPU: 462 GHz • Total disk: 73 TB • Total memory: 168 GB • Network bandwidth: 68 Gb/s
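The quoted CPU total is simply the sum of the two clusters (my arithmetic, consistent with the slide's figures):

\[
84 \times 2.4\ \mathrm{GHz} \approx 202\ \mathrm{GHz},\qquad 100 \times 2.6\ \mathrm{GHz} = 260\ \mathrm{GHz},\qquad 202 + 260 = 462\ \mathrm{GHz}.
\]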

  10. The tools • Sequential Access via Metadata (SAM) • Data replication and cataloging system • Batch systems • FBSNG: Fermilab's own batch system • Condor • Three of the DØSAR farms consist of desktop machines under Condor • PBS • Most of the dedicated DØSAR farms use this manager • Grid framework: JIM = Job Inventory Management • Provides a framework for grid operation → job submission, matchmaking, and scheduling • Built upon Condor-G and Globus
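The batch-system layer described above (Condor on the desktop farms, PBS on most dedicated farms, JIM/Condor-G on top) can be illustrated with a generic sketch of submitting a job through Condor. This is not the DØSAR or JIM code; the executable name, arguments, and wrapper function below are invented for illustration:

# Illustrative only: a generic HTCondor submission wrapper, not the DØSAR/JIM
# tooling itself. The MC executable and its arguments are hypothetical.
import subprocess
import tempfile

SUBMIT_TEMPLATE = """\
universe   = vanilla
executable = {executable}
arguments  = {arguments}
output     = job_$(Cluster).out
error      = job_$(Cluster).err
log        = job_$(Cluster).log
queue {count}
"""

def submit_condor_job(executable, arguments="", count=1):
    """Write a Condor submit description file and hand it to condor_submit."""
    with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
        f.write(SUBMIT_TEMPLATE.format(executable=executable,
                                       arguments=arguments,
                                       count=count))
        submit_file = f.name
    # condor_submit must be available on the submit host.
    subprocess.run(["condor_submit", submit_file], check=True)

if __name__ == "__main__":
    # Hypothetical MC job script; a real DØSAR farm would run its own scripts.
    submit_condor_job("./mc_generate.sh", arguments="--events 1000", count=10)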

  11. Operation of a SAM Station [Diagram: producers/consumers and project managers interact with the station & cache manager; file storage clients, a file storage server, and file stager(s) move files between temp disk, cache disk, and MSS or other stations; data-flow and control paths are shown separately]
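A highly simplified sketch of the stage-into-cache pattern the station diagram depicts, assuming nothing about the real SAM implementation (the class, method names, and paths below are invented):

# Illustration of the stage-into-cache idea behind a SAM station.
# Not the actual SAM code; all names are invented for this sketch.
import shutil
from pathlib import Path

class StationCache:
    """Serve files from local cache disk, staging them in from MSS or
    another station on a cache miss."""

    def __init__(self, cache_dir, stage_from):
        self.cache_dir = Path(cache_dir)
        self.stage_from = Path(stage_from)   # stand-in for MSS or a peer station
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get_file(self, name):
        cached = self.cache_dir / name
        if not cached.exists():
            # Cache miss: the "file stager" copies the file into cache disk.
            shutil.copy(self.stage_from / name, cached)
        # Consumers read from the cached copy.
        return cached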

  12. Tevatron Grid Framework (JIM) [Figure showing the JIM framework with participating sites such as TTU and UTA]

  13. The tools cont'd • Local task management • DØSAR • Monte Carlo Farm (McFarm) management → cloned to other institutions • Various monitoring software • Ganglia resource monitoring • McFarmGraph: MC job status monitoring • McPerM: farm performance monitor • DØSAR Grid: requests are submitted on a local machine, transferred to a submission site, and executed at an execution site (see the sketch below) • DØGrid • Uses the mcrun_job request script • More adaptable to a generic cluster
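The DØSAR Grid path described above (local client machine → submission site → execution site) can be pictured as a toy forwarding chain; every name below is invented, and the real system of course moves jobs between machines rather than calling functions in one process:

# Toy illustration of the DØSAR Grid request flow: a request made on a local
# client machine is forwarded to a submission site and run at an execution site.
# Not the DØSAR software; site names and functions are placeholders.

def run_at_execution_site(request, execution_site):
    """Execution site: run the job and report where the output lives."""
    print(f"{execution_site}: running {request!r}")
    return f"output of {request} stored at {execution_site}"

def forward_to_execution(request, submission_site, execution_site):
    """Submission site: match the request to an execution site and dispatch it."""
    print(f"{submission_site}: dispatching {request!r} to {execution_site}")
    return run_at_execution_site(request, execution_site)

def submit_from_client(request, submission_site):
    """Client machine: hand the MC request to a submission site."""
    print(f"client: forwarding {request!r} to {submission_site}")
    return forward_to_execution(request, submission_site, execution_site="example-farm")

if __name__ == "__main__":
    submit_from_client("mc_request_001", submission_site="example-sub-site")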

  14. Ganglia Grid Resource Monitoring Operating since Apr. 2003

  15. Job Status Monitoring: McFarmGraph Operating since Sept. 2003

  16. Farm Performance Monitor: McPerM Operating since Sept. 2003 Designed, implemented and improved by UTA Students

  17. DØSAR MC Delivery Statistics (as of May 10, 2004) [Chart of Monte Carlo events delivered by DØSAR sites; credit: Joel Snow, Langston University, D0 Grid/Remote Computing, April 2004]

  18. DØSAR Computing & Human Resources

  19. How does the current Tevatron MC Grid work? [Diagram: a client site sends requests to submission sites on the global grid; jobs run at execution sites within regional grids, on dedicated and desktop clusters, with SAM handling the data]

  20. Actual DØ Data Re-processing at UTA

  21. Network Bandwidth Needs

  22. Summary and Plans • Significant progress has been made in implementing grid computing technologies for the DØ experiment • The DØSAR Grid has been operating since April 2004 • A large body of documentation and expertise has accumulated • Moving toward data reprocessing and analysis • A first partial reprocessing of 180 million events has been completed • A different level of complexity • Improved infrastructure is necessary, especially network bandwidth • LEARN will boost the stature of Texas in the HEP grid computing world • Started working with AMPATH and the Oklahoma, Louisiana, and Brazilian consortia (tentatively named the BOLT Network) → need the Texas consortium • UTA's experience with the DØSAR Grid will be an important asset for the expeditious implementation of THEGrid
