
Changes in PD2P replication strategy

S. Campana (CERN IT/ES) on behalf of ADC.


Presentation Transcript


  1. Changes in PD2P replication strategy. S. Campana (CERN IT/ES) on behalf of ADC

  2. Current PD2P algorithm
     • For T1s: distributes proportionally to the T1 MoU share
       • This is OK, we do not discuss this today
     • For T2s: same algorithm as job brokering
       • Send a copy to the place which has the higher chance to run a job once the dataset gets there
       • Weight ~ #CPUs * #Running / #Waiting
     • The current algorithm is optimized for re-brokering
       • And for offloading the T1
       • It is not necessarily optimal for a balanced data distribution
     • The main purpose of PD2P is to replicate popular data for reuse
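The T2 weight on this slide can be sketched in a few lines of Python. This is only an illustration, not the PanDA brokerage code: the `SiteLoad` type, the function names, the clamp on the waiting count, and the choice to take the single highest-weight site are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SiteLoad:
    """Hypothetical snapshot of a T2 site's batch state."""
    name: str
    cpus: int      # number of CPUs at the site
    running: int   # jobs currently running
    waiting: int   # jobs currently waiting


def brokerage_weight(site: SiteLoad) -> float:
    """Weight ~ #CPUs * #Running / #Waiting, as written on the slide.

    Assumption: the waiting count is clamped to at least 1 so that an
    empty queue does not cause a division by zero.
    """
    return site.cpus * site.running / max(site.waiting, 1)


def pick_express_site(sites: list[SiteLoad]) -> SiteLoad:
    """Assumption: send the copy to the single site with the largest weight."""
    return max(sites, key=brokerage_weight)


if __name__ == "__main__":
    candidates = [
        SiteLoad("SITE_A", cpus=1000, running=500, waiting=100),
        SiteLoad("SITE_B", cpus=2000, running=100, waiting=400),
    ]
    for s in candidates:
        print(s.name, brokerage_weight(s))
    print("chosen:", pick_express_site(candidates).name)
```

A site with many CPUs but a long waiting queue scores low here, which matches the slide's point that the weight favours places where a job can start soon rather than places with spare disk.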

  3. New PD2P algorithm
     • PD2P will use two different algorithms, for
       • Express replica (for quick reuse)
       • Long Term replica (aimed at balanced data distribution)
     • Express replica
       • Same algorithm as today
       • Quick data delivery (use ClosedSites only)
       • Possibility to promptly run a job on the new replica
     • Long Term replica(s)
       • Based on the size (disk) of the site
       • Based on performance in Analysis Functional Tests (last month)
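The slide names the two inputs for Long Term replicas (disk size and last month's Analysis Functional Test performance) but not how they are combined. The sketch below assumes, purely for illustration, that the weight is their product and that sites are drawn at random in proportion to it, without replacement. `SiteProfile`, `aft_efficiency`, and both function names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class SiteProfile:
    """Hypothetical per-site inputs for long-term placement."""
    name: str
    disk_tb: float         # disk size of the site
    aft_efficiency: float  # AFT success fraction over the last month, 0..1


def long_term_weight(site: SiteProfile) -> float:
    # Assumption: combine the two criteria by simple multiplication.
    return site.disk_tb * site.aft_efficiency


def pick_long_term_sites(sites, n_replicas, rng=None):
    """Draw up to n_replicas distinct sites, each draw weighted by
    long_term_weight. Sites with zero weight are never chosen."""
    rng = rng or random.Random()
    candidates = [s for s in sites if long_term_weight(s) > 0]
    chosen = []
    while candidates and len(chosen) < n_replicas:
        weights = [long_term_weight(s) for s in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # without replacement: one replica per site
    return chosen


if __name__ == "__main__":
    sites = [
        SiteProfile("SITE_A", disk_tb=100.0, aft_efficiency=0.9),
        SiteProfile("SITE_B", disk_tb=50.0, aft_efficiency=0.8),
        SiteProfile("SITE_C", disk_tb=500.0, aft_efficiency=0.0),
    ]
    print([s.name for s in pick_long_term_sites(sites, n_replicas=2)])
```

A weighted random draw, rather than always taking the top-ranked sites, is one way to spread data across sites over many datasets, which is the stated aim of the Long Term replica.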

  4. PD2P and Pre-Placement
     • PD2P and Pre-Placement of data are (almost) orthogonal
       • PD2P replicates only what has been used, with a minimal delay
       • Pre-Placement replicates everything
         • The delay can be very short but also very long (congestion after reprocessing, for example)
     • ADC would prefer to have only one mechanism for data replication: PD2P

  5. Proposal
     • We leave the situation unchanged for T1s and CERN
     • We keep going with no pre-placement for T2s
     • We increase the number of replicas that PD2P creates on first use (and subsequent uses):
       • One Express replica at T2s
       • Two Long Term replicas at T2s
     • PD2P is applied to AODs, (D)ESDs, NTUP
       • Both data and MC
       • Including what is produced by Group Production
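The proposal amounts to a small policy table: which data types PD2P acts on, and how many replicas of each kind it creates at T2s. A minimal sketch, assuming the data-type strings `AOD`, `ESD`, `DESD` and `NTUP` and the tier label `T2` as illustrative spellings; the names `ELIGIBLE_TYPES`, `REPLICAS_ON_USE`, `is_eligible` and `plan_replicas` are hypothetical and not part of PanDA.

```python
# Assumption: data types are identified by these short strings.
ELIGIBLE_TYPES = {"AOD", "ESD", "DESD", "NTUP"}

# Replica counts from the proposal: one express, two long-term, at T2s.
REPLICAS_ON_USE = {"express": 1, "long_term": 2}


def is_eligible(data_type: str) -> bool:
    """True if PD2P would act on this data type (data and MC alike)."""
    return data_type.upper() in ELIGIBLE_TYPES


def plan_replicas(data_type: str, tier: str) -> dict:
    """Return the replicas to create when a dataset is used.

    The proposal only changes T2 behaviour, so any other tier (T1s,
    CERN) gets an empty plan here, meaning "left unchanged".
    """
    if tier != "T2" or not is_eligible(data_type):
        return {}
    return dict(REPLICAS_ON_USE)


if __name__ == "__main__":
    print(plan_replicas("AOD", "T2"))
    print(plan_replicas("RAW", "T2"))
```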

  6. Monitoring and Docs
     • The PD2P twiki
       • https://twiki.cern.ch/twiki/bin/viewauth/Atlas/PandaDynamicDataPlacement
     • PD2P replication (tables) from Mikhail
       • http://panda.cern.ch/server/pandamon/query?mode=pd2p
     • PD2P logs from Tadashi
       • http://panda.cern.ch/server/pandamon/query?mode=mon&hours=240&name=panda.mon.prod&type=pd2p
     • Plots from Jarka (will be moved soon, check the twiki above)
       • http://hpv2.farm.particle.cz/~schovan/pd2p/pd2p_index.html
