SAMGrid:JIM and CDF Development

SAMGrid:JIM and CDF Development. Rick St. Denis, University of Glasgow. CDF Accepts the Need for the Grid Requirements How to Meet the Need Status of SAMGrid for CDF. Spokespersons’ Requirements for CDF. Maximize physics output @ low Lumi L3 output rate: 80 -> 360Hz by 06.


Presentation Transcript


  1. SAMGrid:JIM and CDF Development Rick St. Denis, University of Glasgow • CDF Accepts the Need for the Grid • Requirements • How to Meet the Need • Status of SAMGrid for CDF GridPP 9th Collaboration Meeting

  2. Spokespersons’ Requirements for CDF • Maximize physics output @ low Lumi • L3 output rate: 80 -> 360 Hz by 2006 • CDF needs the Grid • Director’s review, International Finance Committee: 50% computing outside FNAL • CDFGrid supported by FNAL PAC

  3. Scale of CDF Requirements • 6-7 sites, 100 duals each, by 2006, plus 700 @FNAL
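
The external share implied by these numbers can be checked with a little arithmetic. The sketch below assumes that the 700 at FNAL are counted in the same unit (dual-processor nodes) as the remote sites, which the slide does not state; the function name is purely illustrative.

```python
def external_fraction(n_sites: int, duals_per_site: int = 100, fnal_duals: int = 700) -> float:
    """Fraction of all dual nodes that sit outside FNAL (assumes FNAL's 700 are duals too)."""
    external = n_sites * duals_per_site
    return external / (external + fnal_duals)


for n_sites in (6, 7):
    print(f"{n_sites} sites: {external_fraction(n_sites):.0%} of duals outside FNAL")
```

Under that assumption, 6 to 7 sites give roughly 46% to 50% of the capacity outside FNAL, which is consistent with the 50% external computing target quoted on slide 2.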

  4. CDF Computing Model • Develop analysis on desktop • Access to all CDF data from anywhere • Large scale processing on batch clusters • Submission from anywhere • Interactive tools: ls, top, head/tail/cat • Output to scratch space or desktop • Implemented now with CAF

  5. Use Cases for Summer 2004 • User Level MC Production • All CDF users have access • No data on site -> SAM write • User Level Data Access • All users have access • Selected samples on site: full SAM support • SAM essential for Summer 2004

  6. Medium Term Vision • Many Sites • Fully transparent submission to all of CDF resources: 75% FNAL, 25% outside • Fully transparent input and output of data

  7. Summer 04 Functionality • User selects submission site, saying what dataset they will use • System checks they can do this (privileges) • User access with SAM/dCache • User registers output with SAM

  8. October 04 • To extend beyond 25% outside computing, JIM is essential: JIM test for CDF June 04, production October 04 • HOWEVER: It already seems that the 25% resources are not sufficient for the production passes: will want JIM earlier.

  9. CDF Grid from a User Perspective • Diagram labels: CAF Gui/CLI CAF Gui/CLI Uses SAM Uses SAM Uses SAM AC++ Outside Lab Only Fermilab Grid Grid Italy Toronto Korea Taiwan FermiCAF UK

  10. CDF Grid Strategy • 25% of CDF Computing from external resources. All CDF computing on CDF Grid by April 15: Utilize resources fully controlled by CDF: Kerberos/fbsng: dCAF + SAM • October 15, 2004: JIM to capture shared resources • June 2005: 50% of Computing resources external

  11. Simple JIM • Diagram labels: Anywhere @ each site Desktop Globus GK CAF Submitter SAM Station Private LAN WN @regional centers Condor Submitter Private LAN dCache @FNAL June 2004 testing SAM DB Condor Matchmaker June 2005 required

  12. Detailed JIM • Flow of: data, meta-data, job • Diagram labels: User Interface User Interface User Interface User Interface Submission Submission Global Job Queue Resource Selector Grid Client Match Making Info Gatherer Info Collector Global DH Services SAM Naming Server Site Cache MSS SAM Log Server Cluster Data Handling Resource Optimizer Local Job Handling Info Manager Grid Gateway SAM Station (+other servs) SAM DB Server Web Serv MDS Local Job Handler (CAF, D0MC, BS, ...) RC MetaData Catalog Grid Monitoring SAM Stager(s) Info Providers Bookkeeping Service JIM Advertise User Tools XML DB server Worker Nodes Dist.FS Site Conf. Glob/Loc JID map AAA Site Site Site ...

  13. Meeting the Needs • Progress in SAM • JIM Status • RunJob • CDFGridWorkshop: “Nerd’s Paradise” • Strict Project Management and process to respond to operational issues

  14. Progress in SAM • Dbserver, the database server between applications and Oracle, was upgraded to use a common schema for CDF and D0. • All CDF data files are in SAM • SAM is in beta testing on the CDF CAF (1200 CPUs): passed 20 TB/day delivery • Minos uses SAM for its Data Handling • Steve Mrenna (Phenomenology) is depositing ALPGEN files in SAM for common CDF/D0 use.

  15. JIM Deployment Issues • Communication with the expert! Focus: • 200 jobs each getting 200 files generated 120000 requests simultaneously to the DBServer! • Sensible SAM: reliability went to 60%. Now add retries. • Training users: • D0 has D0Tools: big script; determines where the user is and copies files: harder to get into a sandbox • CAF conditions users! • Distribution and compatibility: • This has made great strides with SAM, now time for JIM
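
The slide says only that retries are being added after a burst of simultaneous requests to the DBServer reduced reliability. One common way to retry without re-creating the same burst is exponential backoff with random jitter; the sketch below illustrates that technique and is not the SAM client's actual code. `call_with_retries`, its parameters, and the use of `ConnectionError` as the failure signal are all hypothetical.

```python
import random
import time


def call_with_retries(request, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call request() until it succeeds or max_attempts is reached.

    Hypothetical helper: waits a random time between 0 and an exponentially
    growing cap before each retry, so that many clients that failed together
    do not all retry at the same instant.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and surface the last error to the caller
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

A job would wrap each file request, for example `call_with_retries(lambda: get_file_location(name))`, where `get_file_location` stands in for whatever call the client really makes.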

  16. RunJob • Dedicated farms at FNAL will go away and RunJob will be used for production processing of data • CDF will use RunJob for MC production • Dave Evans worked for CDF for 2 months and has made CDFRunJob, based on RunJob (Shakar), a tool common to CMS. Morag will work on this.

  17. Florida workshop: Now 20! • 11 installations in about 2 hours. Integrated with dCAF in 2 cases in 2 days. • 3 in Asia, 4 in Europe • 6 sites committed to summer 2004 usage of their facilities for all of CDF (mostly MC) • SAM installation now: initsam cdf <stationname> • Follow-up on April 1. • Each site has a local user support person to reduce load on core development team. • Generally: Security ate 80% of the effort!

  19. Florida Workshop: After 2 Days

  20. 2 TB/day: Karlsruhe

  21. CDF dCache on CAF • All CDF on CAF reads 20 TB/day
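
To put the daily volumes on slides 20 and 21 into more familiar units, the sketch below converts TB/day into an average sustained rate in MB/s. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a flat rate over the whole day; neither is stated on the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day_to_mb_per_s(tb_per_day: float) -> float:
    """Average rate in MB/s for a given daily volume (decimal TB and MB assumed)."""
    return tb_per_day * 1e12 / 1e6 / SECONDS_PER_DAY


for label, volume in (("Karlsruhe", 2), ("CDF CAF", 20)):
    print(f"{label}: {volume} TB/day is about {tb_per_day_to_mb_per_s(volume):.0f} MB/s")
```

Under those assumptions, 2 TB/day averages about 23 MB/s and 20 TB/day about 231 MB/s.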

  24. dCache and SAM • dCache shapes traffic into disk: if a SAM cache is large, need to use dCache instead of NFS mounts • dCache gives the user what is requested. 1 TB gets the same priority as 1 GB: CDF users must send email requesting data to be staged. • SAM examines consumption rate before staging next files – no email needed. • SAM uses dCache for its caching at FNAL. • This needs further work with SRM
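
The idea of looking at the consumption rate before staging more files can be illustrated with a small calculation: stage just enough files to keep the consumer busy for the time it takes to bring a file from tape. This is a hypothetical sketch of that idea, not SAM's algorithm; `files_to_stage` and all of its parameters are invented for illustration.

```python
import math


def files_to_stage(consumed_files: int, elapsed_s: float, stage_latency_s: float,
                   already_staged: int, max_ahead: int = 50) -> int:
    """Number of additional files to request so the consumer does not run dry.

    Hypothetical policy: keep enough files on disk to cover one staging
    latency at the observed consumption rate, capped at max_ahead.
    """
    if consumed_files <= 0 or elapsed_s <= 0:
        # No rate measured yet: prime the pipeline with a single file.
        return 1 if already_staged == 0 else 0
    # Files consumed during one staging latency at the observed rate.
    needed = math.ceil(consumed_files * stage_latency_s / elapsed_s)
    return max(0, min(needed, max_ahead) - already_staged)
```

For example, a job that consumed 10 files in 40 seconds, with a 60 second staging latency and 5 files already on disk, would request 10 more under this policy, while a job that has stalled requests none.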

  25. SAMGrid Management • Diagram labels: SAM Management Team; SAM Project Leaders; SAM Technical Leaders; SAM Design; SAM Operations and Projects

  26. SAMGrid Development Process • Diagram labels: Chaired by Project Leaders; Chaired by Technical Managers; SAMGrid Design; SAMGrid Operations/Projects; Issue Raised; Grid Deliverables; SAMGrid Management Team; Subproject

  27. Subproject Organization • Each subproject has a subproject leader (SPL) responsible for making a plan and reporting progress. • Each subproject has one of the technical leaders evaluating it against an assessment template. • No deliverable requires more than 3 months of work to deliver.

  28. SubProject Assessment Template • Background Documents • Project Definition/Mission Statement • Deliverables and timetable • Inter-project deliverables • Project status • Challenges and Critical Path Items • Lessons Learned • Project specific comments, alternate views

  29. SAMGrid Assigned SubProjects • Diagram labels: Housekeeping MC / Reconstruction Housekeeping Work Flow Package MCRequest H Stream for CDF JIM:MCD0 Test Harness Retire CDF Replica Catalog Database Server Rewrite Database Servers to Linux User analysis Apps JIM:D0Tools Configuration Management Caching Infrastructure Metadata Query with configurable Params Common API

  30. Status of Assessments • Subprojects defined • Interviews conducted on about ½ • Assessment reports being written

  31. Conclusions • CDF has embraced the need for the Grid to achieve its physics mission • Progress in deployment and robustness testing has SAM in CDF • JIM is rapidly solving its problems • … with the help of a review and management process