
Proto-GRID at Tevatron: a personal view
Stefano Belforte, INFN-Trieste



1. Proto-GRID at Tevatron: a personal view • Stefano Belforte, INFN-Trieste

2. Proto-GRID at Tevatron • Tevatron now means 2 experiments: CDF and D0 • Running experiments, started ~15 years ago. Now in Run2. • Started Run2 with ~same structure as Run1 • it works, don't fix it! • Run2: data = 10x Run1, 5 years later; a piece of cake? • Instead: CPU needs = 1000x Run1 • solution: 10x from technology, 100x from brute force (Linux farms) • Evolution toward full-fledged GRIDs is natural and in progress • can't wait for LCG tools to be production quality • Hence a "proto-GRID": building most GRID functionalities with simpler tools, exploring the fastest ways, testing effectiveness, users' response, cost/benefit ratio... • Our biggest contribution to LCG will come from our experience, not our designing. So I will only talk about what I really know: CDF, and in particular data analysis.

3. CDF situation • Motivated by a vast data sample and unexciting code performance, CDF is going to gather computing resources from "anywhere": "The CDF-GRID" • A project started recently • From "enabling remote institutions to look at some data" • To "integrating remote resources in a common framework" • Data Handling is at the core and is now a joint CDF-D0 project • SAM = Sequential Access through Metadata • Moves data around (fast and safely, CRC check enforced) and manages the local disk cache, while keeping track of locations and associating data and metadata (files to datasets, primarily) • users process hundreds of files in a single job (sketched below)
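As an illustration of the dataset-oriented access pattern described above, here is a minimal sketch in which a data handling layer resolves a dataset name to files, stages them, and enforces a CRC check before handing them to the job. All names are hypothetical; this is not the actual SAM API:

    # Conceptual sketch of SAM-style dataset access (hypothetical API, illustration only).
    import zlib

    def crc32_of(path):
        """Compute a CRC over a local file, mimicking the integrity check SAM enforces."""
        crc = 0
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                crc = zlib.crc32(chunk, crc)
        return crc

    class DataHandler:
        """Stand-in for the data handling layer: maps a dataset name to files and delivers them."""
        def __init__(self, catalog, cache):
            self.catalog = catalog   # dataset name -> list of (filename, expected_crc)
            self.cache = cache       # filename -> local path of already staged files

        def files_in(self, dataset):
            for name, expected_crc in self.catalog[dataset]:
                local = self.cache.get(name) or self.stage_from_tape(name)
                if crc32_of(local) != expected_crc:
                    raise IOError(f"CRC mismatch for {name}, refusing to deliver")
                yield local

        def stage_from_tape(self, name):
            raise NotImplementedError("tape staging is not modelled in this sketch")

    # A user job then loops over hundreds of files without ever listing them by hand:
    # for path in dh.files_in("bhadronic-skim-v1"):
    #     run_analysis(path)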

4. Why the CDF-GRID? Physics!! • CDF is increasing the DAQ "rate to tape" • Event compression: x2 rate in the same bandwidth • Increased bandwidth: 20 → 60 MBytes/sec by 2006 • Main motivation is B physics • Get the most out of the Tevatron • Increase and saturate L1/L2/L3/DAQ bandwidth • Many analyses (e.g. Bs) are statistics limited • This doubles the (already large) needs for analysis computing: • CPU • Disk • Tape drives • Analysis computing is by far the single most expensive item in CDF computing: ~50% of total cost (2~3 M$/year vs. 1.5 M$ available from Fermilab)

5. Convergence toward a GRID • Resources from Fermilab are not enough any more • All MC must be done offsite • At least 50% of user analysis has to be done offsite, i.e. at least all the hadronic-B sample • Thanks to SVT, lots of data to do beautiful physics • A huge sample whose size is independent of Tevatron luminosity • Collaborating institutions want to do more at home • They have more money, and/or computers are cheaper, and/or they want to spend more locally... • Want to tap into local "supercomputer centers": CM, UCSD • Want to tap into emerging LHC-oriented computing centers • Want independence and resource control • At last it is possible: the WAN is not a bottleneck any more • no data has moved on tape in/out of FNAL in Run2

6. What to do on the GRID • Reconstruction: limited need, one site is enough; mostly a logistics/bookkeeping and code robustness problem • rare bugs (1/10^6 events) slow down the farm significantly • Monte Carlo: not a very difficult problem, relatively easy to do offsite; a centralized/controlled activity, best limited to a few sites • just a matter of money • User analysis: the most demanding both in resources and in functional requirements; needs to reach everybody everywhere and be easy, fast, solid, effective • still, the most rewarding for users • main topic of the following discussion • DataBase: too often forgotten. At the heart of everything there is a DB that keeps track of it all. Very difficult, very unexciting and unrewarding to work on.

7. User analysis on the GRID • Very challenging • Have to cope immediately and effectively with: • Authentication/Authorization • Hundreds of users: fair share, priorities, quotas, short-lived data (a user produces little data at a time, but does it again and again), scratch areas, access to desktops... • Immediate response, robustness, ease of use, diagnostics • Why is my job not running after 1 hour, 5 hours, 2 days? • Why did my job crash? It was running on my desktop! • Full-cycle optimization: no point in making ntuples fast if the desktop cannot process them fast • Need to do it across many sites: "the CDF-GRID"

8. Starting point: CAF = CDF Analysis Farm • Compile/link/debug everywhere • Submit from everywhere • Execute on the CAF • Submission of N parallel jobs with a single command (sketched below) • Access local data from CAF disks • Access tape data via a transparent cache • Get job output everywhere • Store small output on a local scratch area for later analysis • Access the scratch area from everywhere • IT WORKS NOW at FNAL, in Italy, and elsewhere • ~10 CAFs all around the world • [Diagram: user desktop ("my favorite computer"), FNAL CAF (a pile of PCs), Enstore tape, dCache, SAM, GridFtp, ftp/rootd/NFS access, gateway, scratch server, and an INFN site with local data servers; N jobs go in, output and logs come back]
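The "one command submits N parallel jobs over a dataset" workflow can be sketched roughly as follows. The function and option names are invented for illustration and are not the real CAF submission interface:

    # Rough sketch of the "one command -> N parallel sections" idea behind CAF submission.
    # All names (caf_submit, sections, scratch_url, ...) are illustrative, not the actual CDF tools.

    def split_dataset(files, n_sections):
        """Divide a dataset's file list into n roughly equal sections, one per parallel job."""
        return [files[i::n_sections] for i in range(n_sections)]

    def caf_submit(executable, dataset_files, n_sections, scratch_url):
        """Describe n parallel sections of the same executable, each over its own file slice."""
        jobs = []
        for section, files in enumerate(split_dataset(dataset_files, n_sections)):
            jobs.append({
                "section": section,
                "cmd": [executable, "--files", ",".join(files)],
                "output": f"{scratch_url}/section_{section}.root",  # copied back to user scratch
            })
        return jobs  # in the real system these would be handed to the farm's batch scheduler

    if __name__ == "__main__":
        files = [f"file_{i:04d}.root" for i in range(500)]
        for job in caf_submit("./my_analysis", files, n_sections=20,
                              scratch_url="ftp://scratchserver/user"):
            print(job["section"], len(job["cmd"][2].split(",")), "files ->", job["output"])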

9. The VISION beyond many CAFs • Develop/debug the application on a desktop anywhere in the world • Submit to the CDF-GRID specifying the usual CAF stuff and a "dataset" • Data are (pre)fetched if/as needed from the central repository • DH takes care of striping datasets across physical volumes for optimal performance, load balancing, fault tolerance, etc. (see the sketch below) • Users' output data are also stored by DH on limited, recycled disk space for each user; backup on request, cataloguing, storing and associating metadata are also provided • An interactive grid to provide fast (~GB/sec) root access to final data • Organize the GRID around virtual analysis centers, not just "regional centers": each site hosts a copy of one (or more) datasets and supports everybody's analysis on those • More efficient than everyone having a piece of many datasets • Forces collaboration: you run my jobs, I run yours
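A minimal sketch of the striping idea referenced above: files of a dataset are placed round-robin across physical volumes so that parallel jobs spread their reads. This illustrates the concept only; it is not the actual DH placement algorithm:

    # Round-robin striping of a dataset's files across disk volumes (conceptual sketch only).
    from collections import defaultdict

    def stripe(files, volumes):
        """Assign each file to a volume in round-robin order to balance read load."""
        placement = defaultdict(list)
        for i, f in enumerate(files):
            placement[volumes[i % len(volumes)]].append(f)
        return placement

    if __name__ == "__main__":
        files = [f"bs_candidates_{i:03d}.root" for i in range(10)]
        for vol, fs in stripe(files, ["diskserver1", "diskserver2", "diskserver3"]).items():
            print(vol, fs)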

10. From Vision to Reality: POLITICS • The CDF International Finance Committee recently received a proposal from the collaboration: • Move 50% of the foreseen analysis load offsite • ~equivalent to a ~0.5-1 M$ contribution every year • Requires 6 sites tied into a CDF-GRID, each providing at least ~100 dual-CPU servers and ~20 TB of disk • Candidates: Italy, UK, Germany, UCSD, Canada, ... • Good response from the committee; no money committed yet, but most of that hardware is already in the planning • Means we will build the CDF-GRID and try to get more hardware into it as we go along • Financing bodies accept the idea that, e.g., hardware bought in Italy for INFN physicists can be expanded and shared with everybody

11. Software • CAF at FNAL: the basic brick • dCAFs: CAF clones around the world • SAM • Data management at the WAN scale • Metadata and data file catalog: • Datasets = documented file collections, handled as a single unit • no tcl files with hundreds/thousands of file names • dCache: our best (and only) solution for serving up to ~100 TB to ~1000 nodes without hitting NFS limits • JIM • Job brokering across many farms • From Kerberos to X.509 for authentication • PEAC (PROOF-enabled analysis cluster) (PROOF = parallel ROOT) • CPU need is a series of temporal "spikes"; how do we get the CPU? • Piggy-back on top of a large batch farm: allow high-priority PROOF to take ~10% of total time, handing a set of nodes to each user, who will accept a ~1% duty cycle (see the sketch below)
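The duty-cycle arithmetic behind the piggy-back idea can be made explicit. The numbers below are just the ones quoted on the slide (~10% of farm time for PROOF, ~1% duty cycle per user); the helper is illustrative and not part of PEAC:

    # Back-of-envelope for the PEAC piggy-back scheme (illustrative numbers from the slide).
    def max_sessions(farm_share_for_proof=0.10, duty_cycle=0.01, farm_fraction_per_session=1.0):
        """How many interactive sessions fit under the farm-time cap, given each session's duty cycle."""
        return farm_share_for_proof / (duty_cycle * farm_fraction_per_session)

    if __name__ == "__main__":
        # Each user is handed the full node set but keeps it busy only ~1% of the time:
        print(max_sessions())                                  # -> ~10 concurrent sessions
        # Hand out half the farm per session instead:
        print(max_sessions(farm_fraction_per_session=0.5))     # -> ~20 concurrent sessions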

12. PEAC • Initiate sessions in "minutes", perform queries in "seconds" • 1 GB / 5 sec "proofed" with 10 nodes (demonstrated on a real Bs sample)
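For scale, the demonstrated figure works out to roughly 200 MB/s aggregate, i.e. about 20 MB/s of ROOT I/O per node; a trivial check:

    # Throughput implied by the demonstrated PEAC query (1 GB in ~5 s on 10 nodes).
    data_gb, seconds, nodes = 1.0, 5.0, 10
    aggregate_mb_s = data_gb * 1000 / seconds   # ~200 MB/s across the cluster
    per_node_mb_s = aggregate_mb_s / nodes      # ~20 MB/s per node
    print(aggregate_mb_s, per_node_mb_s)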

13. Status • What works (extremely well): • CAF + dCAFs • SAM for data import • dCache at Fermilab • What is still in progress: • Usage of dCache outside FNAL • Integration of SAM and dCache • A tapeless, redundant dCache pool • Friendly tools to manage users' data in/out of SAM/dCache/Enstore • JIM • How are we doing? • Reasonably well • Too slow (as usual), but 2004 will be the year of the CDF-GRID

14. What we are learning • Real world means the 1st priority is authentication/authorization • We can't use any tool that does not have a solid and easy-to-use authentication method now • Do not try to outguess/outsmart users • Do not look for complete automation; expect some intelligence from users: shall I run this MC at FNAL or at FZKA? • Beware of providing a tool that they will not use • Be prepared for success: it started just to see if it works, and now no one can live without it, even if it is ugly and not "ready"; when will we do cleanup and documentation? • Do not only look at usage patterns in the past/present; try to imagine what they will be with the new tool, and try to figure out how it will affect the daily work of the student doing analysis, our real and only customer • Give users abundant monitoring/diagnostic tools and let them figure out by themselves why their jobs crash (the dCAF provides top, ls and tail of log files, and gdb; see the sketch below)
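A sketch of the self-service diagnostics idea in the last bullet: expose a small whitelist of inspection commands (top, ls, tail of logs) that a user can run against the place where their job section lives. The dispatcher below is invented for illustration; it is not the actual dCAF monitoring code:

    # Illustration of restricted self-service job diagnostics (not the actual dCAF tools).
    import subprocess

    ALLOWED = {
        "top":  ["top", "-b", "-n", "1"],   # one snapshot of node load
        "ls":   ["ls", "-l"],               # listing of the job's working directory
        "tail": ["tail", "-n", "50"],       # last lines of a log file
    }

    def diagnose(command, *args):
        """Run one of the whitelisted inspection commands and return its output as text."""
        if command not in ALLOWED:
            raise ValueError(f"command {command!r} not allowed")
        return subprocess.run(ALLOWED[command] + list(args),
                              capture_output=True, text=True, check=False).stdout

    if __name__ == "__main__":
        print(diagnose("ls", "."))   # a user would point this at their job's work area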

15. Example: Who needs a resource broker? • The Vision: submit your job to the grid; the grid will look for resources, run it, and bring back the result "asap" • The Reality: the real world is complex, and some information just is not on the web. It is very difficult to automate educated decisions (see the sketch after this slide): • Farm at site A is full now, but... • I have friends there who will let me jump ahead in priority • most jobs are from my colleague X, and I know they are going to fail • most jobs are from my students, and I will ask them to kill them • Farm in country B is free, but... • I know my colleague Y is preparing a massive MC that will swamp it for weeks starting tonight • Data I want are not cached at farm C now, so it will take longer if I run there, but... • I know I will run on those data again next week • because that farm has lots of CPU, or... • What is the point in giving users something that is not as good?
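To make the argument concrete, here is a deliberately naive broker that can only rank sites on published numbers (free slots, cached data). None of the human knowledge listed above (a colleague's doomed jobs, an imminent MC campaign, planned reuse of the same data) enters the score. The scoring function is invented for illustration:

    # A deliberately naive resource broker: it can only score what is published.
    def score(site, dataset):
        """Higher is better: prefer free CPUs, with a bonus if the data are already cached."""
        return site["free_slots"] + (100 if dataset in site["cached_datasets"] else 0)

    def choose_site(sites, dataset):
        return max(sites, key=lambda s: score(s, dataset))

    if __name__ == "__main__":
        sites = [
            {"name": "A", "free_slots": 0,   "cached_datasets": {"bhad"}},  # full, but friends could bump my priority
            {"name": "B", "free_slots": 300, "cached_datasets": set()},     # free, but a huge MC starts tonight
        ]
        # The broker picks B on published numbers alone; a user with the insider
        # knowledge listed on the slide might still choose A.
        print(choose_site(sites, "bhad")["name"])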

16. More learning • Sites are managed by people: opinions differ • Security concerns are different at different places • Most sites will not relinquish ownership of their systems: • Have to make software work in different environments rather than imposing an environment on users; we cannot distribute system installations. Let the local sysadmins deal with security patches, ssh versions, the default compiler, etc. • Live with firewalls, nodes on private networks, constraints on node names and user names, and sharing of the computer farm with other experiments (CDF and D0 cannot both have a user "sam" on the same cluster) • Lots of sites do not have full-time system managers dedicated to CDF • Experiment software installation, operation, and upgrades should not require system privileges; they must be doable by users • This includes much of the CDF-GRID infrastructure • SAM and CAF are operated by non-privileged users

17. Conclusion • Never forget that users already have a way to do analysis • It may be awkward and slow, but it works • Users' priority is to get results, not to experiment with tools • New tools have to provide significant advantages • CDF is using a bottom-up approach in which we introduce grid elements without breaking the currently working (although saturated) system, looking for just those tools that make analysis easier and letting users decide whether the new tools are better than the old ones. • This makes CDF a less cutting-edge place from a technical standpoint, but an excellent testing ground for the effectiveness and relative priority of various grid components. • We are a physics-driven collaboration! • Software improvement is graded in "time-to-publication" • We hope LCG learns something from us, while we try to incorporate new tools from them

18. Spare/additional slides

19. Hardware • CDF computing cost

20. Politics details • CDF has recently reviewed (internally) the possibility of upgrading the system (called CSL) that writes data to disk online. It presently peaks at 20 MB/s. The recommended upgrade would be capable of writing up to 40 MB/s and eventually 60 MB/s to disk. The main physics goal of this upgrade is to strengthen the B physics program associated with the silicon vertex trigger (SVT), developed largely by Italy (Ristori et al.) with collaboration from the US. The SVT has been very successful and we continue to plan how to best exploit this novel resource. The yield of charm and bottom is limited by the trigger and by the rate at which we write data to disk. • CDF-GRID proposal: CDF will pursue the increased-bandwidth upgrade. This upgrade will increase the charm and bottom physics program of CDF while maintaining the full high transverse momentum program at the highest luminosity. We will pursue a GRID model of computing and are asking our international colleagues to participate in building a world-wide network for CDF analysis. Each country would be welcome to contribute what is practical. Discussions are under way with the Fermilab CD for support of a local GRID team that will facilitate the plan. It is envisioned that this work will be beneficial to CDF and the LHC experiments. Making LHC software and CDF/Fermilab software more GRID-friendly is expected to require a large effort.
