BaBar Tier A @ CC-IN2P3

An example of a data access model in a Tier 1

Jean-Yves Nief

CC-IN2P3, Lyon

LCG Phase 2 Planning Meeting - Friday July 30th, 2004

Overview of BaBar @ CC-IN2P3 (I)
  • CC-IN2P3: mirror site of SLAC for BaBar since November 2001:
    • real data.
    • simulation data.

(total = 220 TB)

  • Provides the infrastructure end users need to analyze these data.
  • Open to all BaBar physicists.


Overview of BaBar @ CC-IN2P3 (II)
  • 2 types of data available:
    • Objectivity format (commercial OO database): being phased out.
    • ROOT format (ROOT I/O; Xrootd, developed @ SLAC).
  • Hardware:
    • 200 GB tapes (type: 9940): the permanent copy.
    • 20 tape drives (r/w rate = 20 MB/s).
    • 20 Sun servers.
    • 30 TB of disks serving as cache (ratio disk/tape = 15%).

In practice the ratio is ~30% (ignoring rarely accessed data).
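(A quick check of these figures, my arithmetic rather than the original slide's: 30 TB of disk against 220 TB on tape gives 30 / 220 ≈ 14%, consistent with the quoted 15%; the ~30% effective ratio would correspond to an actively accessed subset of roughly 100 TB.)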


BaBar usage @ CC-IN2P3
  • 2002–2004: ~20% of the available CPU (out of a total of ~1000 CPUs).
  • Up to 450-500 user jobs running in parallel.
  • Remote access to the Objy and ROOT files from the batch workers (BW), as sketched below:
    • random access to the files: only the objects needed by the client are transferred to the BW (~kB per request).
    • hundreds of connections per server.
    • thousands of requests per second.
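For illustration, client-side remote random access looks roughly like the ROOT macro below; the host name, file path and tree/branch names are hypothetical, and this is only a sketch of the access pattern, not code from the talk:

    // Sketch: open a file through the Xrootd daemon instead of a local
    // path, then read one branch entry by entry. Only the baskets that
    // are actually requested travel over the network (~kB per request).
    #include "TFile.h"
    #include "TTree.h"

    void read_remote() {
      // Hypothetical Xrootd server and path.
      TFile *f = TFile::Open("root://xrootd.example.in2p3.fr//babar/T1.root");
      if (!f || f->IsZombie()) return;

      TTree *t = 0;
      f->GetObject("events", t);               // hypothetical tree name
      if (!t) { f->Close(); return; }

      Float_t energy = 0;
      t->SetBranchAddress("energy", &energy);  // read only this branch

      for (Long64_t i = 0; i < t->GetEntries(); ++i) {
        t->GetEntry(i);                        // random access per entry
      }
      f->Close();
    }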


Data access model

[Diagram: the client sends a file request ("T1.root ?") to the master daemon (Xrootd / Objy); the master servers pick a data server, (1) + (2): dynamic load balancing; the client is redirected to the slave daemon (Xrootd / Objy) on that data server (3); if necessary the file is staged from HPSS to the data server disks, (4) + (5): dynamic staging; the client then reads the file, (6): random access to the data.]
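To make the numbered flow concrete, here is a toy model of it in C++; the structures, names and load metric are invented for illustration, and this is not the actual Xrootd/olbd implementation:

    // Toy model of the master/slave flow: load-balanced redirection (1-3),
    // dynamic staging from the MSS (4-5), then direct reads (6).
    // All types, names and the load metric are hypothetical.
    #include <iostream>
    #include <set>
    #include <string>
    #include <vector>

    struct DataServer {
      std::string host;
      int load;                      // e.g. number of open connections
      std::set<std::string> cached;  // files already on local disk

      void open(const std::string &file) {
        if (!cached.count(file)) {
          // (4)+(5) dynamic staging: copy the file from HPSS to disk.
          std::cout << "staging " << file << " from HPSS on " << host << "\n";
          cached.insert(file);
        }
        ++load;  // (6) client now reads objects at random offsets
      }
    };

    // (1)+(2) dynamic load balancing: the master prefers a server that
    // already caches the file, breaking ties on load; any server can
    // stage it, so the whole disk pool behaves like one cache.
    DataServer *redirect(std::vector<DataServer> &pool,
                         const std::string &file) {
      DataServer *best = 0;
      for (auto &s : pool) {
        bool has = s.cached.count(file) > 0;
        if (!best || (has && !best->cached.count(file)) ||
            (has == (best->cached.count(file) > 0) && s.load < best->load))
          best = &s;
      }
      return best;  // (3) the client is told to reopen the file here
    }

    int main() {
      std::vector<DataServer> pool{{"server1", 2, {}}, {"server2", 0, {}}};
      if (DataServer *s = redirect(pool, "T1.root"))
        s->open("T1.root");  // first access triggers staging
    }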


Dynamic staging
  • Average file size: 500 MB.
  • Average staging time: 120 s.
  • When the system was overloaded (before the dynamic load balancing era): 10-15 min delays (with only 200 jobs).
  • Up to 10k files from tape to disk cache per day (150k staging requests/month!).
  • Max of 4 TB from tape to disk cache per day.
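Checking these numbers against the hardware quoted earlier (my arithmetic, not from the talk): at 20 MB/s per drive, transferring a 500 MB file takes 500 / 20 = 25 s, so most of the 120 s average staging time goes to tape mounts, positioning and queueing rather than the transfer itself; and 10k files × 500 MB ≈ 5 TB of nominal daily demand is consistent with the observed 4 TB/day maximum, well below the 20 drives × 20 MB/s ≈ 34 TB/day theoretical ceiling.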


Dynamic load balancing
  • Up and running since December 2003 for Objectivity (before, a file could only be staged on a given server).
  • No more delayed jobs (even with 450 jobs in parallel).
  • More efficient management of the disk cache (the entire disk space is seen as a single file system).
  • Fault tolerance in case of server crashes.


Pros …
  • Mass Storage System (MSS) usage completely transparent to the end user.
  • No cache space management by the user.
  • Extremely fault tolerant (during server crashes or maintenance work).
  • Highly scalable + the entire disk space is used efficiently.
  • On the admin side: you can choose your favourite MSS and your favourite staging protocol (SLAC: pftp, Lyon: RFIO, …).


… and cons
  • The entire machinery relies on a lot of different components (especially an MSS).
  • In case of very high demand on the client side, response time can be very slow.

But this also depends on:

    • the number of data sets available.
    • a good data structure.


Data structure: the fear factor
  • A high-performance data access model also depends on the data structure.
  • Deep copies vs "pointer" files (containing only pointers to other files)?
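(A plausible reading of the "fear factor", my gloss rather than the slide's: a job reading a pointer file can touch many underlying files, so a single analysis job may fan out into many staging requests; deep copies keep each job's I/O confined to a few self-contained files, at the cost of duplicated storage.)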


What about other experiments ?
  • Xrootd is well adapted to user jobs using ROOT to analyze a large dataset.
  • Being included in the official version of ROOT.
  • Already set up in Lyon and being used or tested by other groups: D0, EUSO and INDRA.
  • Access to files stored in HPSS is transparent.
  • No need to manage the disk space.


Summary
  • Storage and data access are the main challenge.
  • A good disk/tape ratio is hard to find: it depends on many factors (users, number of tape drives, etc.).
  • Xrootd provides lots of interesting features for remote data access.

→ extremely robust (a great achievement for a distributed system).
