
CDF Data Handling: Resource Management and Tests E.Buckley-Geer, S.Lammel, F.Ratnikov, T.Watts


Presentation Transcript


  1. CDF Data Handling: Resource Management and Tests. E.Buckley-Geer, S.Lammel, F.Ratnikov, T.Watts • Hardware and Resources • Organization of Data • User View of Access to Data • Batch queues • Disk Management • Tests CHEP2000 Paper 368

  2. Hardware Resources; Organization of Data • Mixed flavor unix cluster (CPU resource). • Fibre channel disk arrays on each node of cluster currently (disk resource). • Tape drives and robot tape library (tape drives resource). Drives connected directly on each node. • Concentrate in talk on resources during reading of data. • Datasets, filesets, files of 1 GB. • Datasets: raw, primary, secondary… • Tapes store a group of filesets. • Associations in Datafile Catalog (see Paper 367).

  3. User View

  4. User View of Access to Data • Batch queues to manage CPU cycles. • Access data only from disk, not tape. • Staging jobs in parallel. • Disk inventory manager package for shared disk space. • Batch queues to manage tape drives.

  5. Batch Queues • LSF (Platform Computing) proposed. • Fairshare scheduling. • Combined quotas across queues desirable. • CPU queues for analysis jobs: allocate CPU cycles by group, by user, by special project. • I/O queues for staging jobs: input, output, event pick. • 1 tape drive per queue slot. • I/O job CPU use is proportional to data volume. • Allocate drives and data volume by group, user, project.

  6. Disk Management • By fileset (reduce bookkeeping overhead). • Allow static filesets for important datasets. • Filesets remain on disk until space is needed. • Use-reservation prevents deletion of fileset. • Delete algorithm looks at frequency of use and time since last use-reservation. • Allocate space algorithm uses quotas by group and user.
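The deletion rules on this slide can be sketched as follows. This is a minimal illustration, not the CDF disk inventory manager: the `Fileset` fields, the function name `choose_deletions`, and in particular the scoring formula (idle time divided by one plus the use count) are assumptions made for the example. The slide only says that the algorithm looks at frequency of use and time since the last use-reservation, and that static or reserved filesets are not deleted.

```python
from dataclasses import dataclass


@dataclass
class Fileset:
    name: str
    size_gb: float
    use_count: int          # how often the fileset has been reserved
    last_reserved: float    # time of the last use-reservation (seconds)
    reservations: int = 0   # active use-reservations
    static: bool = False    # static filesets stay on disk


def choose_deletions(filesets, needed_gb, now):
    """Pick filesets to delete so that at least needed_gb is freed.

    Returns a list of names, or [] if the space cannot be freed.
    """
    # A use-reservation or the static flag protects a fileset.
    candidates = [f for f in filesets if not f.static and f.reservations == 0]
    # Hypothetical score: long-idle, rarely used filesets are deleted first.
    candidates.sort(
        key=lambda f: (now - f.last_reserved) / (1 + f.use_count),
        reverse=True,
    )
    chosen, freed = [], 0.0
    for f in candidates:
        if freed >= needed_gb:
            break
        chosen.append(f.name)
        freed += f.size_gb
    return chosen if freed >= needed_gb else []


disk = [
    Fileset("old_rare", 0.5, use_count=0, last_reserved=0.0),
    Fileset("recent_hot", 0.5, use_count=9, last_reserved=90.0),
    Fileset("in_use", 0.5, use_count=1, last_reserved=10.0, reservations=1),
    Fileset("pinned", 0.5, use_count=0, last_reserved=0.0, static=True),
]
print(choose_deletions(disk, needed_gb=0.5, now=100.0))
```

The quota-based space allocation mentioned in the last bullet is not modelled here.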

  7. User Job and Disk Management • User gives dataset. • Dataset converted to list of filesets. • Stager manages list and returns next fileset when asked. • Stager, part of user job: maintains small buffer of use-reservations to keep ahead of analysis job; adds use-reservations for filesets on disk or spawns input staging jobs to maintain buffer; releases use-reservations when fileset processed.
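The stager behaviour described on this slide can be sketched as a small class. This is an illustrative sketch only: the class name `Stager`, its methods, and the default buffer size of 2 are hypothetical, and staging is assumed to complete instantly, whereas a real input staging job would run through an I/O queue and take time. Filesets already on disk are handed out first.

```python
from collections import deque


class Stager:
    """Toy stager: keeps a small buffer of use-reservations ahead of the job."""

    def __init__(self, filesets, on_disk, buffer_size=2):
        self.on_disk = set(on_disk)
        # Filesets already on disk come first (stable sort keeps the rest in order).
        self.pending = deque(sorted(filesets, key=lambda fs: fs not in self.on_disk))
        self.buffer_size = buffer_size
        self.reserved = deque()       # use-reservations currently held
        self.staging_requests = []    # filesets that needed an input staging job

    def _fill(self):
        while self.pending and len(self.reserved) < self.buffer_size:
            fs = self.pending.popleft()
            if fs not in self.on_disk:
                # Stand-in for spawning an input staging job.
                self.staging_requests.append(fs)
                self.on_disk.add(fs)  # assumption: staging finishes immediately
            self.reserved.append(fs)  # add a use-reservation

    def next_fileset(self):
        """Return the next fileset to process, or None when the list is done."""
        self._fill()
        return self.reserved[0] if self.reserved else None

    def release(self, fs):
        """Release the use-reservation once the fileset has been processed."""
        self.reserved.remove(fs)


stager = Stager(["f1", "f2", "f3", "f4"], on_disk={"f3"}, buffer_size=2)
order = []
while (fs := stager.next_fileset()) is not None:
    order.append(fs)      # the analysis job would process the fileset here
    stager.release(fs)
print(order, stager.staging_requests)
```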

  8. Effects of Disk Management • Job processes filesets on disk first (different orders, different times). • Multiple jobs using same fileset share staging jobs. • Fast analysis job gets multiple staging jobs. • Only a fraction of a dataset is present on disk at one time (conserves disk space).
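One way several jobs could share a staging job for the same fileset is to keep a table of staging requests in flight. The sketch below is an assumption about how such sharing might be organised, not a description of the CDF software; `StagingPool`, `request`, and `complete` are hypothetical names.

```python
class StagingPool:
    """Toy registry of in-flight staging jobs, keyed by fileset."""

    def __init__(self):
        self.in_flight = {}   # fileset -> set of job ids waiting for it
        self.spawned = 0      # number of staging jobs actually started

    def request(self, job_id, fileset):
        """Register interest in a fileset.

        Returns True if a new staging job was spawned, False if the
        request joined a staging job that is already running.
        """
        waiters = self.in_flight.get(fileset)
        if waiters is not None:
            waiters.add(job_id)
            return False
        self.in_flight[fileset] = {job_id}
        self.spawned += 1
        return True

    def complete(self, fileset):
        """Mark staging as finished and return the jobs that were waiting."""
        return sorted(self.in_flight.pop(fileset, set()))


pool = StagingPool()
first = pool.request("jobA", "fileset1")    # spawns a staging job
second = pool.request("jobB", "fileset1")   # shares the same staging job
third = pool.request("jobC", "fileset2")    # different fileset, new staging job
waiting = pool.complete("fileset1")
print(pool.spawned, waiting)
```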

  9. Prototype Tests • Set of basic queues on workstation (LSF) • Basic staging software • Simulated analysis jobs which process dummy data • Set of big and small dummy datasets • Basic CDF Data Catalog software with contents for this simulation • Purpose is to test ideas on resource management, and evaluate how analysis jobs interact in a resource-limited environment.

  10. Prototype Scaled Down Environment • Single CPU workstation, b0ib04 • Staging disk 9 GB • Filesets of size 0.5 GB • 4 small datasets @ 1 GB, i.e. 10% of disk • 4 large datasets @ 10 GB, i.e. 100% of disk • 2 CPU queues, short & long • Analysis jobs with variable CPU time • 4 execution slots for each CPU queue • 2 simulated tape drives (2 slots in I/O queue) • 1 real tape drive in Emass robot
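The sizes on this slide imply how many filesets each dataset contains and how many fit on the staging disk at once. The short calculation below works this out from the slide's numbers; the variable and function names are illustrative. The exact ratios come out slightly above the slide's 10% and 100%, which are presumably rounded.

```python
DISK_GB = 9.0            # staging disk
FILESET_GB = 0.5         # size of one fileset
SMALL_DATASET_GB = 1.0
LARGE_DATASET_GB = 10.0


def filesets_in(size_gb, fileset_gb=FILESET_GB):
    """Number of whole filesets that fit in size_gb."""
    return int(size_gb // fileset_gb)


def percent_of_disk(size_gb, disk_gb=DISK_GB):
    """Dataset size as a rounded percentage of the staging disk."""
    return round(100 * size_gb / disk_gb)


disk_capacity = filesets_in(DISK_GB)         # filesets the disk can hold
small_filesets = filesets_in(SMALL_DATASET_GB)
large_filesets = filesets_in(LARGE_DATASET_GB)

print(disk_capacity, small_filesets, large_filesets)
print(percent_of_disk(SMALL_DATASET_GB), percent_of_disk(LARGE_DATASET_GB))
```

A single large dataset therefore cannot sit on the staging disk in full, so only part of it is on disk at any one time.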

  11. Simulation Scenarios • Purpose: Investigate effect of patterns of use by collaboration (CDF “spin” jobs, repetitive small dataset jobs) • Exercise data access features • Choosing scenarios: A. Short vs long job competition; B. Several jobs using same big dataset (CDF “spin” jobs); C. Competition for tape drives and disk space

  12. Some scenarios studied: • One long job vs a stream of short jobs • Three long jobs on same dataset, see figure • Ten long jobs on same dataset • Mixed set of different long jobs and users (6 jobs, 6 users, 4 datasets) • Stream of short jobs vs 4 different long jobs • The disk allowed 4 different big datasets to be processed together, as expected for this simulation. • Extra staging jobs for the streams of short jobs occurred when expected (when contesting against 4 or more different big datasets).

  13. Trial 45: Three Long Jobs

  14. Trial 29: Stream of shorts vs 1 long, 1 short

  15. Conclusions from Prototype Tests • DIM/Stager worked well. • Stager functions appropriate, simple. • Gave guidance for full implementation (client/server structure, cleanup, admin functions). • Limited test of LSF (batch queues) worked well.

  16. Mock Data Challenge 1 • During December 99 and January 00, CDF successfully tested the movement of MC simulated data from the online Level 3 trigger farm of processors to the tape library, and through the offline reconstruction farm back to the tape library. • Many sub-groups were involved. • The resource management methods discussed here were implemented and used but will not be stressed until the rate tests of Challenge 2 in Spring 2000.

  17. Summary • Resource management methods were explained. • Prototype tests were extolled. • Full implementation of methods is underway. More tests to come. • CDF Engineering run occurs in August 00.
