
HEPiX FSWG Progress Report (Phase 2)





  1. HEPiX FSWG Progress Report (Phase 2) Andrei Maslennikov November 2007 – St. Louis

  2. Summary • Reminder: raison d’être • Active members • Workflow June 2007 - October 2007 • Phase 2 revelations • Plans for Phase 3 • Discussion

  3. Reminder: raison d’être • Commissioned by IHEPCCC at the end of 2006 • Officially supported by the HEP IT managers • The goal is to review the available file system solutions and storage access methods, and to disseminate the know-how among HEP organizations and beyond • Timescale: Feb 2007 – April 2008 • Milestones: 2 progress reports (Spring 2007, Fall 2007), 1 final report (Spring 2008)

  4. Active members • We currently have 22 people on the list, but only the following 17 participated in conference calls and/or actively contributed during Phase 2:
  CASPUR: A.Maslennikov (Chair), M.Calori (Web Master)
  CEA: J-C.Lafoucriere
  CERN: B.Panzer-Steindel
  DESY: M.Gasthuber, P.van der Reest
  FZK: J.van Wezel
  IN2P3: L.Tortay
  INFN: G.Donvito, V.Sapunenko
  LAL: M.Jouvin
  NERSC/LBL: C.Whitney
  RAL: N.White
  RZG: H.Reuter
  SLAC: R.Melen, A.May
  U.Edinburgh: G.A.Cowan
  • 9 phone conferences were held during this phase

  5. Workflow June – October 2007 • Started to look at the previously reduced set of data access solutions • We are concentrating on only two classes of data areas: • Shared Home Directories, currently: AFS, NFS • Large Scalable Shared Areas suitable for batch farms, currently: GPFS, Lustre, dCache, CASTOR, HPSS, Xrootd • We will also cover disk/tape migration mechanisms and the underlying hardware • Collected information is being archived in a newly established technology-tracking web site • We try to highlight the issues common to all architectures and capture the general trends

  6. HEPiX Storage Technology Web Site • Available at http://hepix.caspur.it/storage • Meant as a storage reference site for HEP • Not meant to become yet another storage Wikipedia • Requires time; is being filled on a best-effort basis • Volunteers wanted!

  7. Volunteers wanted!

  8. Observed trends - general • Most sites foresee an increase in Transparent File Access (TFA) storage in the near future. Lustre and GPFS dominate the field. • HSM functionality is seen as an important addition to TFA storage. Combinations like Lustre/HPSS, GPFS/HPSS and GPFS/TSM are being considered. • TFA-based solutions are proving to be competitive. See, for instance, these interesting talks by G.Donvito and V.Sapunenko:
  1) GPFS vs dCache and xrootd: http://hepix.caspur.it/storage/test-io-performance.pdf
  2) GPFS vs CASTOR v2: http://hepix.caspur.it/storage/GPFS-Spring2007.pdf

  9. Observed trends – Tier-1 sites • Balanced handling of data streams proves to be the most difficult part. This includes disk/tape migration scheduling and load-dependent data distribution over the storage elements. • Many factors have to be accounted for. Optimized real-time monitoring of system components is a key input for automated data migration and data access decisions, and these decisions have to be made very quickly. There is still a lot of room for improvement in this area. • On-site competence is of crucial importance for Tier-1 sites that choose technologies like CASTOR or dCache.

  10. Observed trends – Tier-2 sites • These sites are mostly disk-based and hence have greater freedom in choosing data access solutions. • Selection of the storage technology is influenced by the need to integrate with other sites; more specifically, an SRM interface is required. This essentially reduces the field to four practicable solutions, which dictate the ways in which the hardware is to be used:
  1) dCache (built-in SRM interface, aggregates multiple disk server nodes)
  2) Disk Pool Manager (same strategy as dCache)
  3) Multiple Xrootd servers with a stand-alone SRM interface like BestMan
  4) A solid distributed TFA file system + a stand-alone SRM interface like StoRM
  • Only a few comparative studies exist so far, covering a subset of these solutions (see the aforementioned talks by Donvito and Sapunenko). They nevertheless reveal interesting facts and indicate a pronounced need for further investigation in this direction.

  11. Plans for Phase 3 (T1 oriented) • We will perform an assessment of the mixed disk/tape data access solutions adopted at Tier-1 sites (performance + cost). • We will try to present things as they are and leave the readers to draw their own conclusions. The results will make up a chapter of our final report. • The overall picture is positive, but an accurate independent analysis of the situation may be of great help for the T1 IT managers. The same hardware base allows for multiple solutions, so it is never too late to review or improve.

  12. Plans for Phase 3 (T2 oriented) • We will perform a comparative analysis of the two most widely deployed file systems (Lustre and GPFS) on the same hardware base. The tests will have to include a significant number of disk servers and file system clients (a sketch of such a per-client throughput probe is given after this slide). Possible test sites: IN2P3, CERN, FZK, DESY or CNAF. • We will try to perform a comparative analysis of dCache, DPM, Xrootd and one of the TFA file system implementations on the same hardware base. • The test results will be used to provide practical recommendations for the T2 sites, which is the main goal of this working group. • We will try to rank the acceptable T2 solutions as a function of their performance and TCO, and to indicate the appropriate architecture(s) for different types of T2 sites.
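  To make the kind of comparison planned above concrete, here is a minimal sketch of a per-client sequential throughput probe that could be run unchanged against a Lustre and a GPFS mount point. The mount paths, file size and block size are purely illustrative assumptions and are not part of the working group's actual test plan.

```python
#!/usr/bin/env python
# Hypothetical per-client sequential I/O probe. Mount points and sizes are
# illustrative only; the same script would be run on each file system.
import os
import sys
import time

def stream_test(path, size_mb=1024, block_kb=1024):
    """Write then read one large file under 'path'; return (write, read) MB/s."""
    block = b"\0" * (block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    fname = os.path.join(path, "fswg_bench.dat")

    t0 = time.time()
    with open(fname, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())        # make sure the data has left the client cache
    write_mbs = size_mb / (time.time() - t0)

    t0 = time.time()
    with open(fname, "rb") as f:
        while f.read(block_kb * 1024):
            pass
    read_mbs = size_mb / (time.time() - t0)

    os.remove(fname)
    return write_mbs, read_mbs

if __name__ == "__main__":
    # Example (hypothetical mount points):  python fswg_bench.py /lustre/bench /gpfs/bench
    for mount in sys.argv[1:]:
        w, r = stream_test(mount)
        print("%-20s write %7.1f MB/s   read %7.1f MB/s" % (mount, w, r))
```

  In a realistic multi-client test the read phase would be performed from a different node (or after dropping the client cache), so that the local page cache does not inflate the read figures; aggregate numbers would then be collected while many such clients run in parallel.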

  13. Plans for Phase 3 (general) • We will continue working on the storage technology tracking pages on a best-effort basis (volunteers welcome!) • We will complete/update the Questionnaire; the summary numbers are to appear in the final report. • The final report is due for the Spring 2008 meeting at CERN. • We count on the active collaboration of all the sites involved!

  14. Discussion
