Presentation Transcript


  1. Storage Volume Freeing Service (SVFS)
Armenuhi Abramyan, Narine Manukyan
ALICE team of A.I. Alikhanian National Scientific Laboratory
{aabramya, nmanukya}@mail.yerphi.am

  2. (General) formulation of the problem and solution
Problem: replication of data files -> filling up (repletion) of the SEs -> prevention of further inflow of files to these SEs.
Solution: regular (partial) cleaning of the SEs by removing a certain portion of the replicas.
Step one: definition of the data files (DFs) and data file sets (FSs) that are subject to removal.
Step two: construction of the general removal and cleaning scheme.
Step three: determination of the order of the FSs' removal (replica-killing queuing).
(Steps two and three include an algorithm to determine the portion of replicas of the FSs to remove from the SEs.)

  3. Data files and data file sets subject to removal

  4. Data Files
Data File (DF) = any .root file.
'Centrally' created DFs: <*>ESD<*>.root; <*>AOD<*>.root; <*>QA<*>.root; OCDB.root.
Volumes occupied by DFs in the LHC12c directory:
LHC12c            286   TB  (100.00 %)
RAWs              205.0 TB   (71.7 %)
ESDs               64.4 TB   (22.5 %)
AODs               13.0 TB    (4.54 %)
QAs                 3.55 TB   (1.23 %)
OCDBs               0.03 TB   (0.02 %)
PWG directories     0.02 TB   (0.01 %)
PWG-created root files (?)
What about user-created root files (?) How much volume do they occupy?
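As an illustration of the DF categories above, a minimal Python sketch follows; the classify_df helper and the example LFN are hypothetical, while the filename patterns are the ones listed on the slide.

import fnmatch
import os

# Patterns of the 'centrally' created data files listed above.
# Any other .root file is treated as PWG- or user-created.
CENTRAL_DF_PATTERNS = {
    "ESD": "*ESD*.root",
    "AOD": "*AOD*.root",
    "QA": "*QA*.root",
    "OCDB": "OCDB.root",
}

def classify_df(lfn: str) -> str:
    """Return the DF category of a .root file, or 'other' for PWG/user files."""
    name = os.path.basename(lfn)
    if not name.endswith(".root"):
        return "not a DF"
    for category, pattern in CENTRAL_DF_PATTERNS.items():
        if fnmatch.fnmatch(name, pattern):
            return category
    return "other (PWG- or user-created)"

# Illustrative path only, not a real catalogue entry:
print(classify_df("/alice/data/2012/LHC12c/000184371/pass1/AliESDs.root"))  # -> ESD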

  5. File Sets
Elementary File Set (EFS) = a zip archive of DFs (the set of output .root files of each job). In AliEn, data replication is done in these units.
Types of archives in AliEn:
• root_archive.zip = set of ESD | AOD | QA | PWG-created | user-created DFs
• aod_archive.zip = set of AOD-type DFs
• QA_archive.zip = set of QA-related DFs
• log_archive | log_archive.zip = set of log files
Basic Data File = the lowest-order file whose content is considered as a whole. Examples: <*>AOD<*>.root; <*>ESD<*>.root; <*>QA<*>.root.
To discuss and decide: it is convenient to perform the removal not at the level of the EFSs themselves, but on their (meaningful) combinations = Composite File Sets (CFSs).
Suggested CFSs containing the same type of EFSs:
• Run_AOD, Period_AOD = the full set of AOD-type EFSs in a Run or Period
• Run_ESD, Period_ESD = the full set of ESD-type EFSs in a Run or Period
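A minimal sketch of how EFS archives could be grouped into the suggested Run-level CFSs; the LFN layout (/alice/data/<year>/<period>/<run>/...) and the helper names are assumptions for illustration, not the actual AliEn catalogue interface.

from collections import defaultdict

def cfs_key(lfn: str):
    """Derive a (period, run, archive type) key from an EFS archive LFN.

    Assumes an LFN of the form
    /alice/data/<year>/<period>/<run>/.../<archive>.zip (illustrative only).
    """
    parts = lfn.strip("/").split("/")
    period, run = parts[3], parts[4]
    archive = parts[-1]                      # e.g. aod_archive.zip
    if archive.startswith("aod_archive"):
        efs_type = "AOD"
    elif archive.startswith("root_archive"):
        efs_type = "ESD"                     # simplification: ESDs sit in root_archive.zip
    elif archive.startswith("QA_archive"):
        efs_type = "QA"
    else:
        efs_type = "other"
    return period, run, efs_type

def build_cfss(efs_lfns):
    """Group EFS archives into Run_<type> composite file sets (CFSs)."""
    cfss = defaultdict(list)
    for lfn in efs_lfns:
        cfss[cfs_key(lfn)].append(lfn)
    return cfss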

  6. CFSs’ removal principles

  7. Custodial replicas
• How many custodial replicas (CMS parlance) of CFSs should be kept?
• Should the number of custodial replicas depend on the type of CFS (AOD, ..., OCDB, ...)?
• Where should custodial replicas be kept? (On Tier-1s? On Tier-2s with low CPU?)

  8. Global removal of CFSs
Global removal = removal from AliEn, without taking care of the state of the individual SEs. In the global removal approach, any CFS can be removed entirely.
A problem: which one of the replicas of an archive.zip should be removed?
An ordered removal of the replicas of the same archive.zip:
• One keeps the custodial replica(s).
• One takes into account additional criteria related to the individual sites:
a) the repletion (fullness) state of the sites where the replicas are placed;
b) the number of failures to access the replicas on the sites where they are placed.
[Slide diagram: replicas of archive1.zip ... archiveN.zip spread over SE1, SE2, ..., SEN1, SEN2, labelled '1 CFS' and '1+1 CFS'.]
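A minimal sketch of the ordered removal of replicas of the same archive.zip described above; the Replica structure and the relative weighting of the two site criteria are assumptions, not the actual SVFS implementation.

from dataclasses import dataclass

@dataclass
class Replica:
    se_name: str
    is_custodial: bool
    se_used_fraction: float   # repletion state of the hosting SE (0.0 - 1.0)
    access_failures: int      # number of failed accesses to this replica

def order_replicas_for_removal(replicas):
    """Return the removable replicas of one archive.zip, most 'removable' first.

    Custodial replicas are never offered for removal; the rest are ordered
    by how full their SE is and by how often accesses to them have failed.
    """
    removable = [r for r in replicas if not r.is_custodial]
    return sorted(removable,
                  key=lambda r: (r.se_used_fraction, r.access_failures),
                  reverse=True)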

  9. Freeing storage space on individual SEs: local removal of CFSs
Since the EFSs entering a CFS are dispersed over different SEs, an entire removal of a CFS is not possible in the local approach. Removal of a whole CFS can be made possible only in a manner correlated with the other SEs.
No ordering of the replicas of the same archive.zip for removal is needed. A single restriction: custodial replicas are not removed.
The drawback of the local approach: if we give priority to local freeing rather than to the removal of an entire CFS, then only part of the EFSs entering a CFS can be removed. As a result, we will have 'amputated' CFSs.
[Slide diagram: after cleaning one SE, only some archives of a CFS are removed from SE1, SE2, ..., SEN1, SEN2, leaving an 'amputated' CFS.]
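A minimal sketch of local freeing of a single SE under the restriction above; the replica tuples and the free-space target are hypothetical stand-ins for the AliEn catalogue, and the returned set of touched CFSs illustrates how 'amputated' CFSs arise.

def free_space_on_se(se_name, replicas, bytes_to_free):
    """Select non-custodial replicas hosted on one SE until the target is reached.

    `replicas` is an iterable of (cfs_id, archive_lfn, se_name, is_custodial, size)
    tuples (a simplified stand-in for the catalogue).  Returns the replicas to
    delete and the CFSs that become 'amputated' by this local removal.
    """
    to_delete, touched_cfss, freed = [], set(), 0
    for cfs_id, lfn, se, is_custodial, size in replicas:
        if freed >= bytes_to_free:
            break
        if se != se_name or is_custodial:
            continue                      # only this SE; custodial replicas are kept
        to_delete.append(lfn)
        touched_cfss.add(cfs_id)          # these CFSs lose only part of their EFSs here
        freed += size
    return to_delete, touched_cfss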

  10. CMS approach

  11. File Sets and removal mechanism in CMS
Analog of the EFS: the Block, the smallest unit in computing space, corresponding to a group of files likely to be accessed together. Files are grouped into blocks for bulk data management reasons.
Dataset: the largest homogeneous collection of files related to the same trigger stream, processing campaign and output data format. The replica removal in CMS is done by datasets, the analog of the CFS. (Note: one entire replica of a dataset is stored on one SE.)
Replica removal in CMS is a local removal scheme. Only Tier-2 sites are cleaned, with the following conditions:
1. The removal procedure starts if the site has less than 10 % of free space (or 15 TB for small SEs).
2. The removal procedure ends when 30 % of the site space (or 25 TB) is freed.
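A minimal sketch of the two CMS trigger conditions quoted above; the thresholds are those on the slide, while the function names and the small-SE flag are made up for illustration.

def cleaning_should_start(total_tb: float, free_tb: float, small_se: bool = False) -> bool:
    """Start the removal procedure when free space falls below the threshold:
    10 % of the site capacity for normal SEs, an absolute 15 TB for small SEs."""
    threshold_tb = 15.0 if small_se else 0.10 * total_tb
    return free_tb < threshold_tb

def cleaning_should_stop(total_tb: float, freed_tb: float, small_se: bool = False) -> bool:
    """Stop once 30 % of the site space (or 25 TB for small SEs) has been freed."""
    target_tb = 25.0 if small_se else 0.30 * total_tb
    return freed_tb >= target_tb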

  12. Ordering of CFSs for removal
The ordering of CFSs for removal can be done using the notion of popularity. An example of a popularity measure for a CFS is the total number of calls to the Data Files (the total number of occurrences of their LFNs) entering that CFS.
We are considering several popularity-based models for the solution of the ordering problem. A realistic verification of our models and the final construction are possible only on the basis of fine-grained monitoring data on EFSs.
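A minimal sketch of the popularity measure described above: count LFN occurrences in the access records, sum them per CFS, and offer the least popular CFSs for removal first. The shape of the access records and the lfn_to_cfs mapping are assumptions for illustration.

from collections import Counter

def order_cfss_by_popularity(access_lfns, lfn_to_cfs):
    """Order CFSs for removal, least popular (fewest recorded accesses) first.

    `access_lfns` - iterable of LFNs as they appear in the FAMoS access records;
    `lfn_to_cfs`  - mapping from a data-file LFN to the CFS it belongs to.
    """
    popularity = Counter()
    for lfn in access_lfns:
        cfs = lfn_to_cfs.get(lfn)
        if cfs is not None:
            popularity[cfs] += 1        # summary number of calls to the CFS's DFs
    # CFSs that were never accessed get popularity 0 and come first.
    all_cfss = set(lfn_to_cfs.values())
    return sorted(all_cfss, key=lambda cfs: popularity[cfs])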

  13. File Access Monitoring Service (FAMoS)
The purpose of the File Access Monitoring Service is to monitor the frequency of accesses to the files in the AliEn File Catalogue.
A bit of history:
• The work on the development of FAMoS started in July 2012.
• The code has been included in AliEn v2.20 (October 2012) and AliEn v2.21 (January 2013).
• The part of FAMoS that records the details of the file accesses has been deployed on all AliEn central servers on 6 August 2013. Thanks to Miguel.

  14. The list of monitoring attributes
Your comments and suggestions would be highly appreciated!
The source: since all the accesses to files within the File Catalogue are authenticated by the AliEn Authentication (Authen) service, a plugin (called "attributes") has been included in the Authen service in order to record the values of the specified attributes (into the "Authen_ops" daily log files).

  15. The initial construction of CFSs (Composite File Sets)

  16. Does FAMoS need to monitor all the file accesses?
Suggested filters:
/alice/packages/*  *.par  */bin/*  *.log  *log_archive*  *validation.sh  *.rc  *.sh  *.jdl  *.xml  *.C  *.h  *.cxx
They account for ~94 % of the accesses!
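A minimal sketch of applying the suggested filters; the patterns are taken verbatim from the slide, while the is_monitored helper and the example LFNs are hypothetical.

import fnmatch

# Suggested filter patterns from the slide: accesses matching any of these
# (~94 % of all accesses) would not be recorded by FAMoS.
FILTER_PATTERNS = [
    "/alice/packages/*", "*.par", "*/bin/*", "*.log", "*log_archive*",
    "*validation.sh", "*.rc", "*.sh", "*.jdl", "*.xml", "*.C", "*.h", "*.cxx",
]

def is_monitored(lfn: str) -> bool:
    """Return True if an access to this LFN should be kept by FAMoS."""
    return not any(fnmatch.fnmatch(lfn, pattern) for pattern in FILTER_PATTERNS)

print(is_monitored("/alice/data/2012/LHC12c/000184371/pass1/AliESDs.root"))  # True
print(is_monitored("/alice/cern.ch/user/a/auser/myanalysis.C"))              # False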

  17. (1) First statistics gathered by FAMoS over 06-09.08.2013

  18. (2) First statistics gathered by FAMoS over 06-09.08.2013

  19. Thanks
