SAMGrid: JIM and CDF Development
Rick St. Denis, University of Glasgow
GridPP 9th Collaboration Meeting
• CDF Accepts the Need for the Grid
• Requirements
• How to Meet the Need
• Status of SAMGrid for CDF
Spokespersons’ Requirements for CDF
• Maximize physics output at low luminosity
• L3 output rate: 80 -> 360 Hz by 2006
CDF needs the Grid
Director’s review, International Finance Committee: 50% of computing outside FNAL
CDFGrid supported by FNAL PAC
Scale of CDF Requirements
6-7 sites with 100 dual-CPU nodes ("duals") each by 2006, plus 700 at FNAL
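The site counts on this slide can be checked against the 50% external-computing target with a little arithmetic. The sketch below assumes every dual contributes equal capacity, which the slide does not state; the function name is purely illustrative.

```python
def external_fraction(sites, duals_per_site=100, fnal_duals=700):
    """Fraction of all duals located outside FNAL.

    Assumption (not from the slides): every dual counts equally.
    """
    external = sites * duals_per_site
    return external / (external + fnal_duals)


for sites in (6, 7):
    # 6 sites -> 600 of 1300 duals, 7 sites -> 700 of 1400 duals
    print(sites, "sites:", round(external_fraction(sites), 3))
```

Under that assumption, six sites give a little under half of the capacity outside FNAL and seven sites give exactly half.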
CDF Computing Model
• Develop analysis on desktop
• Access to all CDF data from anywhere
• Large-scale processing on batch clusters
• Submission from anywhere
• Interactive tools: ls, top, head/tail/cat
• Output to scratch space or desktop
Implemented now with CAF
Use Cases for Summer 2004
• User-level MC production
  • All CDF users have access
  • No data on site -> SAM write
• User-level data access
  • All users have access
  • Selected samples on site: full SAM support
SAM essential for Summer 2004
Medium Term Vision
• Many sites
• Fully transparent submission to all of CDF resources: 75% FNAL, 25% outside
• Fully transparent input and output of data
Summer 04 Functionality
• User selects submission site, saying what dataset they will use
• System checks they can do this (privileges)
• User access with SAM/dCache
• User registers output with SAM
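The four steps above can be sketched as a toy workflow. Everything here is hypothetical: `ToyCatalog`, `submit`, and the privilege table are stand-ins invented for illustration and are not the SAM, dCache, or CAF interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for a metadata catalog; not the SAM API."""
    files: set = field(default_factory=set)

    def register(self, name):
        self.files.add(name)


def submit(user, site, dataset, privileges, catalog):
    """Walk through the four steps for one job (all names are assumptions)."""
    # Step 1: the user has chosen a site and named the dataset (the arguments).
    # Step 2: the system checks that this user may use that dataset at that site.
    if (site, dataset) not in privileges.get(user, set()):
        raise PermissionError(f"{user} may not run on {dataset} at {site}")
    # Step 3: the job would read the dataset here; the sketch only names its output.
    output = f"{user}-{dataset}-{site}.out"
    # Step 4: the output is registered so that others can find it.
    catalog.register(output)
    return output


catalog = ToyCatalog()
privileges = {"alice": {("glasgow", "top-sample")}}
print(submit("alice", "glasgow", "top-sample", privileges, catalog))
```

The privilege check comes before any data access, so a rejected job consumes no resources at the site.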
October 04
• To extend beyond 25% outside computing, JIM is essential: JIM test for CDF in June 04, production in October 04
• HOWEVER: it already seems that the 25% resources are not sufficient for the production passes, so JIM will be wanted earlier.
CDF Grid from a User Perspective
[Diagram. Labels: CAF GUI/CLI; AC++; Uses SAM; Outside Lab Only; Fermilab; Grid. Sites shown: Italy, Toronto, Korea, Taiwan, FermiCAF, UK.]
CDF Grid Strategy
• 25% of CDF computing from external resources. All CDF computing on CDF Grid by April 15: utilize resources fully controlled by CDF: Kerberos/fbsng: dCAF + SAM
• October 15, 2004: JIM to capture shared resources
• June 2005: 50% of computing resources external
Simple JIM
[Diagram. Location labels: Anywhere; @ each site; @ regional centers; @ FNAL. Components: Desktop, Globus GK, CAF Submitter, SAM Station, Private LAN, WN, Condor Submitter, dCache, SAM DB, Condor Matchmaker. Annotations: June 2004 testing; June 2005 required.]
Detailed JIM
[Diagram showing the flow of data, meta-data, and jobs. Components: User Interface, Submission, Global Job Queue, Resource Selector, Grid Client, Match Making, Info Gatherer, Info Collector, Global DH Services, SAM Naming Server, Site Cache, MSS, SAM Log Server, Cluster Data Handling, Resource Optimizer, Local Job Handling, Info Manager, Grid Gateway, SAM Station (+ other servs), SAM DB Server, Web Serv, MDS, Local Job Handler (CAF, D0MC, BS, ...), RC MetaData Catalog, Grid Monitoring, SAM Stager(s), Info Providers, Bookkeeping Service, JIM Advertise, User Tools, XML DB server, Worker Nodes, Dist. FS, Site Conf., Glob/Loc JID map, AAA, Site.]
Meeting the Needs
• Progress in SAM
• JIM Status
• RunJob
• CDFGridWorkshop: “Nerd’s Paradise”
• Strict project management and process to respond to operational issues
Progress in SAM
• Dbserver, the database server between applications and Oracle, was upgraded to use a common schema for CDF and D0.
• All CDF data files are in SAM.
• SAM is in beta testing on the CDF CAF (1200 CPUs): it passed 20 TB/day delivery.
• MINOS uses SAM for its data handling.
• Steve Mrenna (Phenomenology) is depositing ALPGEN files in SAM for common CDF/D0 use.
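To put the 20 TB/day delivery figure in more familiar units, the conversion below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and an even spread over the 1200 CAF CPUs; neither assumption comes from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day_to_mb_per_s(tb_per_day):
    """Average rate in MB/s, assuming decimal units (1 TB = 1e12 B, 1 MB = 1e6 B)."""
    return tb_per_day * 1e12 / SECONDS_PER_DAY / 1e6


aggregate = tb_per_day_to_mb_per_s(20)   # sustained average across the CAF
per_cpu = aggregate / 1200               # assumes an even spread over 1200 CPUs
print(f"aggregate: {aggregate:.0f} MB/s, per CPU: {per_cpu:.2f} MB/s")
```

The aggregate comes to roughly 231 MB/s sustained, or a little under 0.2 MB/s per CPU on average.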
JIM Deployment Issues
Communication with the expert! Focus:
• 200 jobs each getting 200 files generated 120000 requests simultaneously to the DBServer!
• Sensible SAM: reliability went to 60%. Now add retries.
Training users:
• D0 has D0Tools: a big script that determines where the user is and copies files; harder to get into a sandbox.
• CAF conditions users!
Distribution and compatibility:
• This has made great strides with SAM; now it is time for JIM.
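The "add retries" remedy for the overloaded DBServer can be illustrated with a generic retry loop using exponential backoff and jitter. This is a minimal sketch of a common pattern, not the SAM client's actual retry logic; `call_with_retries` and its parameters are hypothetical.

```python
import random
import time


def call_with_retries(request, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call request() and retry on ConnectionError with jittered backoff.

    Hypothetical helper: the slides only say that retries were added.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Double the delay cap on every failure, up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter: wait between 50% and 100% of the cap so that many jobs
            # failing at the same moment do not all retry in lockstep.
            sleep(delay * (0.5 + rng() / 2))
```

Without the jitter, thousands of jobs that fail together would also retry together and reproduce the original burst of simultaneous requests.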
RunJob
• Dedicated farms at FNAL will go away, and RunJob will be used for production processing of data.
• CDF will use RunJob for MC production.
• Dave Evans worked for CDF for 2 months and has made CDFRunJob, based on RunJob (Shakar), a tool common to CMS. Morag will work on this.
Florida Workshop: Now 20!
• 11 installations in about 2 hours. Integrated with dCAF in 2 cases in 2 days.
• 3 in Asia, 4 in Europe
• 6 sites committed to summer 2004 usage of their facilities for all of CDF (mostly MC)
• SAM installation now: initsam cdf <stationname>
• Follow-up on April 1.
• Each site has a local user support person to reduce the load on the core development team.
• Generally: security ate 80% of the effort!
Florida Workshop: After 2 Days
2TB/Day: Karlsruhe
CDF dCache on CAF
All CDF on CAF reads 20 TB/day
dCache and SAM
• dCache shapes traffic to disk: if a SAM cache is large, dCache should be used instead of NFS mounts.
• dCache gives the user what is requested. 1 TB gets the same priority as 1 GB, so CDF users must send email requesting that data be staged.
• SAM examines the consumption rate before staging the next files: no email needed.
• SAM uses dCache for its caching at FNAL.
• This needs further work with SRM.
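The idea of examining the consumption rate before staging more files can be illustrated with a simple prefetch rule: stage only enough files to cover a fixed look-ahead window at the observed rate. This is an illustrative sketch, not SAM's actual staging algorithm; the function, the look-ahead window, and the batch cap are all assumptions.

```python
import math


def files_to_stage(consumed, elapsed_s, staged_unread,
                   lookahead_s=600.0, max_batch=50):
    """How many more files to stage now (hypothetical policy, not SAM's).

    consumed      -- files the job has finished reading so far
    elapsed_s     -- seconds over which they were read
    staged_unread -- files already on disk that the job has not read yet
    """
    if consumed == 0 or elapsed_s <= 0:
        # No rate measured yet: prime the job with a single file.
        return 1 if staged_unread == 0 else 0
    # Files the job is expected to read during the look-ahead window.
    target = math.ceil(consumed * lookahead_s / elapsed_s)
    # Stage only the shortfall, never more than max_batch at once.
    return max(0, min(max_batch, target - staged_unread))


print(files_to_stage(consumed=6, elapsed_s=60.0, staged_unread=10))
```

A slow consumer therefore never has a large backlog staged on its behalf, which is the behaviour the slide contrasts with staging everything that was requested.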
SAMGrid Management
[Organization chart: SAM Management Team; SAM Project Leaders; SAM Technical Leaders; SAM Design; SAM Operations and Projects.]
SAMGrid Development Process
[Flow diagram. Labels: Chaired by Project Leaders; Chaired by Technical Managers; SAMGrid Design; SAMGrid Operations/Projects; Issue Raised; Grid Deliverables; SAMGrid Management Team; Subproject.]
Subproject Organization
• Each subproject has a subproject leader (SPL) responsible for making a plan and reporting progress.
• Each subproject has one of the technical leaders evaluating it against an assessment template.
• No deliverable requires more than 3 months of work to deliver.
Subproject Assessment Template
• Background documents
• Project definition/mission statement
• Deliverables and timetable
• Inter-project deliverables
• Project status
• Challenges and critical path items
• Lessons learned
• Project-specific comments, alternate views
SAMGrid Assigned Subprojects
[Chart. Group labels: Housekeeping; MC / Reconstruction; User analysis Apps. Subprojects: Workflow Package; MCRequest; H Stream for CDF; JIM:MCD0; Test Harness; Retire CDF Replica Catalog; Database Server Rewrite; Database Servers to Linux; JIM:D0Tools; Configuration Management; Caching Infrastructure; Metadata Query with configurable Params; Common API.]
Status of Assessments
• Subprojects defined
• Interviews conducted on about half of them
• Assessment reports being written
Conclusions
• CDF has embraced the need for the Grid to achieve its physics mission
• Progress in deployment and robustness testing has established SAM in CDF
• JIM is rapidly solving its problems
• … with the help of a review and management process