CDF Data Handling: Resource Management and Tests
E. Buckley-Geer, S. Lammel, F. Ratnikov, T. Watts

  • Hardware and Resources

  • Organization of Data

  • User View of Access to Data

  • Batch queues

  • Disk Management

  • Tests

CHEP2000 Paper 368

Hardware Resources; Organization of Data

Mixed-flavor Unix cluster (CPU resource).

Fibre Channel disk arrays, currently attached to each node of the cluster (disk resource).

Tape drives and a robotic tape library (tape drive resource). Drives are connected directly to each node.

This talk concentrates on the resources used while reading data.

Datasets, filesets, and files of 1 GB.

Datasets: raw, primary, secondary…

Tapes store a group of filesets.

Associations in Datafile Catalog (see Paper 367).
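
The hierarchy on this slide (datasets made of filesets, filesets made of files of about 1 GB, tapes holding groups of filesets) can be pictured with a minimal Python sketch. All class names, field names, and sample identifiers are illustrative assumptions; this is not the schema of the CDF Datafile Catalog, which is described in Paper 367.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

FILE_SIZE_GB = 1.0  # nominal file size quoted on the slide


@dataclass
class DataFile:
    name: str
    size_gb: float = FILE_SIZE_GB


@dataclass
class Fileset:
    fileset_id: str
    files: List[DataFile] = field(default_factory=list)

    def size_gb(self) -> float:
        # Total size of all files in this fileset.
        return sum(f.size_gb for f in self.files)


@dataclass
class Tape:
    label: str
    # A tape stores a group of filesets.
    filesets: List[Fileset] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    kind: str  # e.g. "raw", "primary", "secondary"
    filesets: List[Fileset] = field(default_factory=list)

    def fileset_ids(self) -> List[str]:
        # The list a staging layer would work through for this dataset.
        return [fs.fileset_id for fs in self.filesets]


def build_example() -> Tuple[Dataset, Tape]:
    # Hypothetical identifiers, for illustration only.
    fs1 = Fileset("FS0001", [DataFile("f1"), DataFile("f2")])
    fs2 = Fileset("FS0002", [DataFile("f3")])
    dataset = Dataset("example_raw", "raw", [fs1, fs2])
    tape = Tape("T00001", [fs1, fs2])
    return dataset, tape
```

Keeping the fileset as the unit shared by datasets and tapes mirrors the slide: a dataset resolves to filesets, and a tape is located through the filesets it holds.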


User view

User View of Access to Data

  • Batch queues to manage CPU cycles

  • Access data only from disk, not tape.

  • Staging jobs in parallel.

  • Disk inventory manager package for shared disk space.

  • Batch queues to manage tape drives.


Batch Queues

LSF (Platform Computing) proposed.

Fairshare scheduling.

Combined quotas across queues desirable.

CPU queues for analysis jobs:

Allocate CPU cycles by group, by user, by special project.

I/O queues for staging jobs: input, output, event pick.

1 tape drive per queue slot.

I/O job CPU use is proportional to data volume.

Allocate drives and data volume by group, user, project.
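
The idea behind fairshare allocation by group can be illustrated with a small Python sketch. This is a generic fairshare rule chosen for illustration, not LSF's actual priority formula or configuration; the function names and group names are hypothetical. For an I/O queue, the "usage" charged to a group could be the data volume staged, since the slide notes that I/O job CPU use is proportional to data volume.

```python
from typing import Dict, Optional


def fairshare_priority(shares: float, recent_usage: float) -> float:
    """More shares raise the priority; more recent usage lowers it."""
    return shares / (1.0 + recent_usage)


def pick_next_group(shares: Dict[str, float],
                    usage: Dict[str, float],
                    waiting: Dict[str, int]) -> Optional[str]:
    """Return the group with waiting jobs that has the highest priority.

    shares  -- configured share per group (assumed input)
    usage   -- recent consumption per group, e.g. CPU hours or GB staged
    waiting -- number of queued jobs per group
    """
    candidates = [g for g, n in waiting.items() if n > 0]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda g: fairshare_priority(shares[g], usage.get(g, 0.0)),
    )
```

With equal shares, a free queue slot (one tape drive, in the I/O case) goes to the group that has consumed less recently, provided it has a job waiting.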


Disk Management

  • Manage disk space by fileset (reduces bookkeeping overhead)

  • Allow static filesets for important datasets

  • Filesets remain on disk until space is needed.

  • Use-reservation prevents deletion of fileset.

  • Delete algorithm looks at frequency of use and time since last use-reservation.

  • Allocate space algorithm uses quotas by group and user.
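
The deletion rules in the list above can be sketched as follows. The slide names only the inputs (frequency of use, time since the last use-reservation, use-reservations, static filesets); the scoring formula, class, and field names below are assumptions made for illustration, not the disk inventory manager's actual algorithm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskFileset:
    fileset_id: str
    size_gb: float
    use_count: int               # how often the fileset has been reserved
    last_reserved: float         # time of the last use-reservation, in seconds
    active_reservations: int = 0 # current use-reservations held by jobs
    static: bool = False         # static filesets are never deleted


def deletion_score(fs: DiskFileset, now: float) -> float:
    # Assumed scoring: long idle time and low use frequency
    # both make a fileset a better deletion candidate.
    idle = now - fs.last_reserved
    return idle / (1 + fs.use_count)


def choose_victims(filesets: List[DiskFileset],
                   needed_gb: float,
                   now: float) -> List[DiskFileset]:
    """Pick filesets to delete until needed_gb is freed.

    Returns an empty list if the space cannot be freed.
    """
    candidates = [
        fs for fs in filesets
        if not fs.static and fs.active_reservations == 0
    ]
    candidates.sort(key=lambda fs: deletion_score(fs, now), reverse=True)

    victims: List[DiskFileset] = []
    freed = 0.0
    for fs in candidates:
        if freed >= needed_gb:
            break
        victims.append(fs)
        freed += fs.size_gb
    return victims if freed >= needed_gb else []
```

Filesets stay on disk until `choose_victims` is called for a space request, which matches the "remain on disk until space is needed" rule; quota checks by group and user would sit in front of this call and are not shown.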


User Job and Disk Management

User gives dataset

Dataset converted to list of filesets

Stager manages list and returns next fileset when asked.

Stager Part of User Job:

Maintains small buffer of use-reservations to keep ahead of analysis job

Adds use-reservations for filesets on disk or spawns input staging jobs to maintain buffer

Releases use-reservations when fileset processed
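
The stager behaviour described on this slide can be sketched in Python. The class and method names are hypothetical, the disk inventory is simulated with a plain set, and staging is assumed to finish instantly; the real stager would spawn input staging jobs in an I/O queue and wait for them.

```python
from collections import deque
from typing import Deque, Iterable, List, Optional, Set


class Stager:
    """Keeps a small buffer of use-reservations ahead of the analysis job."""

    def __init__(self, filesets: Iterable[str], on_disk: Iterable[str],
                 buffer_size: int = 2) -> None:
        self.pending: Deque[str] = deque(filesets)  # not yet reserved
        self.on_disk: Set[str] = set(on_disk)       # simulated disk inventory
        self.buffer_size = buffer_size
        self.reserved: Deque[str] = deque()         # reserved, not yet handed out
        self.in_use: Set[str] = set()               # handed out, not yet released
        self.staging_requests: List[str] = []       # staging jobs "spawned"

    def _fill_buffer(self) -> None:
        while self.pending and len(self.reserved) < self.buffer_size:
            # Prefer a fileset that is already on disk.
            choice = next((fs for fs in self.pending if fs in self.on_disk), None)
            if choice is None:
                # Nothing on disk: request staging of the next fileset.
                choice = self.pending[0]
                self.staging_requests.append(choice)
                self.on_disk.add(choice)  # assumption: staging completes at once
            self.pending.remove(choice)
            self.reserved.append(choice)  # add a use-reservation

    def next_fileset(self) -> Optional[str]:
        """Return the next fileset to process, or None when the dataset is done."""
        self._fill_buffer()
        if not self.reserved:
            return None
        fileset_id = self.reserved.popleft()
        self.in_use.add(fileset_id)
        self._fill_buffer()  # stay ahead of the analysis job
        return fileset_id

    def release(self, fileset_id: str) -> None:
        """Release the use-reservation once the fileset has been processed."""
        self.in_use.discard(fileset_id)
```

An analysis job would loop on `next_fileset()`, process the returned fileset, and call `release()` afterwards. Because on-disk filesets are chosen first, the processing order can differ from the catalog order.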


Effects of Disk Management

  • Job processes filesets on disk first (different orders, different times)

  • Multiple jobs using same fileset share staging jobs

  • Fast analysis job gets multiple staging jobs

  • Only a fraction of a dataset is present on disk at one time (conserves disk space).
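
The sharing of staging jobs mentioned in the list above can be illustrated with a short sketch: a second request for a fileset that is already being staged joins the existing staging job instead of spawning a new one. The registry class and its method names are assumptions for illustration, not part of the CDF software.

```python
from typing import Dict, List


class StagingRegistry:
    """Tracks in-flight staging jobs so that analysis jobs can share them."""

    def __init__(self) -> None:
        self._waiters: Dict[str, List[str]] = {}  # fileset id -> waiting jobs
        self.spawned = 0                           # staging jobs actually started

    def request(self, fileset_id: str, job_name: str) -> bool:
        """Register interest in a fileset.

        Returns True if a new staging job was spawned,
        False if an existing one is shared.
        """
        waiters = self._waiters.get(fileset_id)
        if waiters is None:
            self._waiters[fileset_id] = [job_name]
            self.spawned += 1
            return True
        waiters.append(job_name)
        return False

    def complete(self, fileset_id: str) -> List[str]:
        """Mark staging as finished and return the jobs that were waiting."""
        return self._waiters.pop(fileset_id, [])
```

With three analysis jobs reading the same dataset, each tape read is then paid for once, which is the effect the "spin" job scenarios later in the talk exercise.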


Prototype Tests

Set of basic queues on workstation (LSF)

Basic staging software

Simulated analysis jobs which process dummy data

Set of big and small dummy datasets

Basic CDF Data Catalog software with contents for this simulation

Purpose is to test ideas on resource management, and evaluate how analysis jobs interact in a resource limited environment.


Prototype Scaled-Down Environment

Single-CPU workstation, b0ib04

Staging disk of 9 GB

Filesets of 0.5 GB

4 small datasets of 1 GB each, i.e. about 10% of the disk

4 large datasets of 10 GB each, i.e. about 100% of the disk

2 CPU queues, short and long

Analysis jobs with variable CPU time

4 execution slots for each CPU queue

2 simulated tape drives (2 slots in the I/O queue)

1 real tape drive in Emass robot
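
The sizes on this slide translate into fileset counts as worked out below. The numbers are taken from the slide; the constant and function names are illustrative. The disk fractions come out near 11% and 111%, which the slide rounds to 10% and 100%.

```python
DISK_GB = 9.0            # staging disk
FILESET_GB = 0.5         # size of one fileset
SMALL_DATASET_GB = 1.0   # each of the 4 small datasets
LARGE_DATASET_GB = 10.0  # each of the 4 large datasets


def filesets_in(size_gb: float) -> int:
    """Number of 0.5 GB filesets that make up a given volume."""
    return int(round(size_gb / FILESET_GB))


def disk_fraction(size_gb: float) -> float:
    """Fraction of the staging disk a given volume would occupy."""
    return size_gb / DISK_GB


if __name__ == "__main__":
    print("filesets that fit on disk:", filesets_in(DISK_GB))
    print("filesets per small dataset:", filesets_in(SMALL_DATASET_GB))
    print("filesets per large dataset:", filesets_in(LARGE_DATASET_GB))
    print("small dataset disk fraction: %.2f" % disk_fraction(SMALL_DATASET_GB))
    print("large dataset disk fraction: %.2f" % disk_fraction(LARGE_DATASET_GB))
```

A single large dataset (20 filesets) cannot be fully resident on a disk that holds 18, so a job reading one must rely on filesets being deleted and staged as it goes.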


Simulation Scenarios


Investigate effect of patterns of use by collaboration (CDF “spin” jobs, repetitive small dataset jobs)

Exercise data access features

Choosing scenarios:

A. Short vs long job competition

B. Several jobs using same big dataset (CDF “spin” jobs)

C. Competition for tape drives and disk space


Some scenarios studied

One long job vs a stream of short jobs

Three long jobs on same dataset, see figure

Ten long jobs on same dataset

Mixed set of different long jobs and users (6 jobs, 6 users, 4 datasets).

Stream of short jobs vs 4 different long jobs

The disk allowed 4 different big datasets to be processed together, as expected for this simulation.

Extra staging jobs for the streams of short jobs occurred when expected (when competing against 4 or more different big datasets).


Trial 45: Three Long Jobs


Trial 29: Stream of Shorts vs 1 Long, 1 Short

Conclusions from Prototype Tests

  • DIM/Stager worked well.

  • Stager functions appropriate, simple.

  • Gave guidance for full implementation (client/server structure, cleanup, admin functions)

  • Limited test of LSF (batch queues) worked well.

Mock Data Challenge 1

  • During December 1999 and January 2000, CDF successfully tested the movement of Monte Carlo (MC) simulated data from the online Level 3 trigger processor farm to the tape library, and then through the offline reconstruction farm back to the tape library.

  • Many sub-groups were involved.

  • The resource management methods discussed here were implemented and used but will not be stressed until the rate tests of Challenge 2 in Spring 2000.



  • Resource management methods were explained.

  • Prototype tests were extolled.

  • Full implementation of methods is underway. More tests to come.

  • The CDF engineering run is scheduled for August 2000.
