CASTOR: CERN’s data management system

CHEP03, 25/3/2003. Ben Couturier, Jean-Damien Durand, Olof Bärring (CERN).


Presentation Transcript


  1. CASTOR: CERN’s data management system
     CHEP03, 25/3/2003
     Ben Couturier, Jean-Damien Durand, Olof Bärring (CERN)

  2. Introduction
     • CERN Advanced STORage Manager
       • Hierarchical Storage Manager used to store user and physics files
       • Manages the secondary and tertiary storage
     • History
       • Development started in 1999 based on SHIFT, CERN's tape and disk management system since the beginning of the 1990s (SHIFT was awarded the 21st Century Achievement Award by Computerworld in 2001)
       • In production since the beginning of 2001
       • Currently holds more than 9 million files and 2000 TB of data
     • http://cern.ch/castor/

  3. Main Characteristics (1)
     • CASTOR Namespace
       • All files belong to the “/castor” hierarchy
       • The rights are standard UNIX rights
     • POSIX Interface
       • The files are accessible through a standard POSIX interface; all calls are rfio_xxx (e.g. rfio_open, rfio_close…)
     • RFIO Protocol
       • All remote file access is done using the Remote File IO protocol, developed at CERN
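
The two namespace rules on this slide (every file lives under “/castor”, and access follows standard UNIX rights) can be illustrated with a small Python sketch. This is not CASTOR code and does not use the RFIO library; the helper names `in_castor_namespace` and `may_read`, and the example path, are assumptions made only for illustration.

```python
import posixpath
import stat

CASTOR_ROOT = "/castor"  # root of the namespace, as stated on the slide


def in_castor_namespace(path: str) -> bool:
    """Return True if the normalized path lies under the /castor hierarchy."""
    normalized = posixpath.normpath(path)
    return normalized == CASTOR_ROOT or normalized.startswith(CASTOR_ROOT + "/")


def may_read(mode: int, file_uid: int, file_gid: int, uid: int, gids: set) -> bool:
    """Standard UNIX read check: owner bits first, then group bits, then others."""
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # Hypothetical path and ids, chosen only for the example.
    print(in_castor_namespace("/castor/cern.ch/user/file"))
    print(may_read(0o640, file_uid=100, file_gid=20, uid=200, gids={20}))
```

The path is normalized before the prefix test so that a path such as `/castor/../etc/passwd` is not mistaken for a namespace entry.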

  4. Main Characteristics (2)
     • Modularity
       • The components in CASTOR have well-defined roles and interfaces; it is possible to change a component without affecting the whole system
     • Highly Distributed System
       • CERN uses a very distributed configuration with many disk servers/tape servers
       • Can also run in a more limited environment
     • Scalability
       • The number of disk servers, tape servers, name servers… is not limited
       • Use of RDBMS (Oracle, MySQL) to improve the scalability of some critical components

  5. Main Characteristics (3)
     • Tape drive sharing
       • A large number of drives can be shared between users or dedicated to some users/experiments
       • Drives can be shared with other applications: with TSM, for example
     • High Performance Tape Mover
       • Use of threads and circular buffers
       • Overlaid device and network I/O
     • Grid Interfaces
       • A GridFTP daemon interfaced with CASTOR is currently in test
       • An SRM Interface (V1.0) for CASTOR has been developed
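
The "threads and circular buffers" idea behind the tape mover can be sketched in Python as follows. This is a generic producer/consumer ring buffer, not the actual RTCOPY implementation; the names `CircularBuffer`, `copy_stream`, `network_reader` and `device_writer` are hypothetical. One thread fills slots while the other drains them, so the two kinds of I/O can overlap instead of alternating.

```python
import threading


class CircularBuffer:
    """Fixed-size ring of slots shared by a producer and a consumer thread."""

    def __init__(self, slots: int) -> None:
        self._slots = [None] * slots
        self._head = 0   # index of the next slot to read
        self._tail = 0   # index of the next slot to write
        self._count = 0  # number of filled slots
        self._cond = threading.Condition()

    def put(self, block) -> None:
        with self._cond:
            while self._count == len(self._slots):  # ring full: wait for the consumer
                self._cond.wait()
            self._slots[self._tail] = block
            self._tail = (self._tail + 1) % len(self._slots)
            self._count += 1
            self._cond.notify_all()

    def get(self):
        with self._cond:
            while self._count == 0:  # ring empty: wait for the producer
                self._cond.wait()
            block = self._slots[self._head]
            self._slots[self._head] = None
            self._head = (self._head + 1) % len(self._slots)
            self._count -= 1
            self._cond.notify_all()
            return block


def copy_stream(blocks, slots: int = 4) -> list:
    """Move blocks from a simulated network reader to a simulated device writer."""
    ring = CircularBuffer(slots)
    written = []
    end = object()  # sentinel marking the end of the stream

    def network_reader() -> None:
        for block in blocks:
            ring.put(block)
        ring.put(end)

    def device_writer() -> None:
        while True:
            block = ring.get()
            if block is end:
                break
            written.append(block)  # stands in for a write to the tape device

    threads = [threading.Thread(target=network_reader),
               threading.Thread(target=device_writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return written
```

With a single producer and a single consumer, blocks leave the ring in the order they entered it, which is what a sequential tape write needs.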

  6. Hardware Compatibility
     • CASTOR runs on:
       • Linux, Solaris, AIX, HP-UX, Digital UNIX, IRIX
       • The clients and some of the servers run on Windows NT/2K
     • Supported drives
       • DLT/SDLT, LTO, IBM 3590, STK 9840, STK 9940A/B (and old drives already supported by SHIFT)
     • Libraries
       • SCSI Libraries
       • ADIC Scalar, IBM 3494, IBM 3584, Odetics, Sony DMS24, STK Powderhorn

  7. CASTOR Components
     • Central servers
       • Name Server
       • Volume Manager
       • Volume and Drive Queue Manager (manages the volume and drive queues per device group)
       • UPV (Authorization daemon)
     • “Disk” subsystem
       • RFIO (Disk Mover)
       • Stager (Disk Pool Manager and Hierarchical Resource Manager)
     • “Tape” subsystem
       • RTCOPY daemon (Tape Mover)
       • Tpdaemon (PVR)
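
The phrase "volume and drive queues per device group" can be pictured with a toy Python model. This is only a sketch of the general idea (one queue of waiting volume requests and one queue of idle drives for each device group) and makes no claim about how the real Volume and Drive Queue Manager works; the class name `DriveQueueSketch`, its methods, and the group, volume and drive names are all hypothetical.

```python
from collections import defaultdict, deque


class DriveQueueSketch:
    """Toy per-device-group queues; not the real VDQM interface."""

    def __init__(self) -> None:
        self._requests = defaultdict(deque)     # device group -> waiting volume requests
        self._free_drives = defaultdict(deque)  # device group -> idle drive names

    def request_volume(self, group: str, volume: str):
        """Queue a volume request, then try to pair it with an idle drive."""
        self._requests[group].append(volume)
        return self._match(group)

    def add_drive(self, group: str, drive: str):
        """Register an idle drive, then try to pair it with a waiting request."""
        self._free_drives[group].append(drive)
        return self._match(group)

    def _match(self, group: str):
        """Return (volume, drive) in FIFO order if both queues are non-empty."""
        if self._requests[group] and self._free_drives[group]:
            return self._requests[group].popleft(), self._free_drives[group].popleft()
        return None
```

Keeping the queues separate per device group means a request for one drive type never waits behind requests for another.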

  8. CASTOR Architecture
     [Diagram. Components shown: RFIO Client, STAGER, NAME server, VDQM server, VOLUME manager, CUPV, MSGD, RFIOD (DISK MOVER), RTCPD (TAPE MOVER), TPDAEMON (PVR), DISK POOL]

  9. CASTOR Setup at CERN
     • Disk servers
       • ~ 140 disk servers
       • ~ 70 TB of staging pools
       • ~ 40 stagers
     • Tape drives and servers
     • Libraries
       • 2 sets of 5 Powderhorn silos (2 x 27500 cartridges)
       • 1 Timberwolf (1 x 600 cartridges)
       • 1 L700 (1 x 600 cartridges)

  10. Evolution of Data in CASTOR

  11. Tape Mounts per group

  12. Tape Mounts per drive type

  13. ALICE Data Challenge
      • Migration rate of 300 MB/s sustained for a week
      • Using 18 STK T9940B drives
      • ~ 20 disk servers managed by 1 stager
      • A separate name server was used for the data challenge
      • See presentation of Roberto Divia
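
As a rough check of what these figures imply, the short Python calculation below derives the average rate per drive and the total volume migrated in one week. It assumes decimal units (1 TB = 1,000,000 MB) and that the 300 MB/s was spread evenly over the 18 drives; neither assumption is stated on the slide.

```python
MB_PER_TB = 1_000_000  # assumption: decimal units

rate_mb_s = 300                       # sustained migration rate from the slide
drives = 18                           # STK T9940B drives from the slide
seconds_per_week = 7 * 24 * 3600      # 604800 seconds

per_drive_mb_s = rate_mb_s / drives                   # average rate per drive
total_tb = rate_mb_s * seconds_per_week / MB_PER_TB   # volume moved in one week

if __name__ == "__main__":
    print(f"Average per drive: {per_drive_mb_s:.1f} MB/s")
    print(f"Total in one week: {total_tb:.2f} TB")
```

Under these assumptions each drive averages about 16.7 MB/s, and the week-long run moves roughly 181 TB.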
