
SC3 experiences


  1. SC3 experiences
     Ron Trompert, SARA

  2. SC3 Infrastructure
     • Starting point
     • DMF-based HSM
     • DMF has no SRM implementation
     • DMF does not support functionality promised by the SRM standard, like file pinning.

  3. SC3 Infrastructure: dCache
     • dCache provides an SRM interface
     • dCache provides flexibility with respect to HSM backends
     • This matters if we need to switch to another HSM setup for some reason

  4. SC3 Infrastructure: throughput phase

  5. SC3 Throughput phase
     • Disk2disk: 100-110 MB/s
     • Problems with stability of the nodes: solved by limiting the number of I/O movers
     • Disk2tape: 50 MB/s
     • Not enough bandwidth, SAN not dedicated
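To put the rates on this slide in perspective, here is a minimal sketch that converts a sustained rate in MB/s to a daily volume. It assumes decimal units (1 TB = 10^6 MB) and a rate held for a full 24 hours, neither of which the slide states; the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(rate_mb_s: float) -> float:
    """Convert a sustained rate in MB/s to TB/day (assumes 1 TB = 10**6 MB)."""
    return rate_mb_s * SECONDS_PER_DAY / 1_000_000


def days_to_move(volume_tb: float, rate_mb_s: float) -> float:
    """Days needed to move volume_tb at a constant rate_mb_s."""
    return volume_tb / tb_per_day(rate_mb_s)


if __name__ == "__main__":
    # Rates quoted on the slide; the labels are ours.
    for label, rate in [("disk2disk (low)", 100),
                        ("disk2disk (high)", 110),
                        ("disk2tape", 50)]:
        print(f"{label}: {rate} MB/s is about {tb_per_day(rate):.2f} TB/day")
```

Under these assumptions the disk2tape rate is half the lower disk2disk rate, so the same volume takes twice as long to reach tape.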

  6. SC3 Infrastructure: service phase

  7. SC3 service phase statistics
     Percentage of computational resources used (October-December)

  8. SC3 service phase statistics

  9. SC3 service phase statistics
     • Setting up the infrastructure took longer than we had hoped, so unfortunately we missed ALICE.
     • Sizes and number of files transferred to the SRM SE

  10. SC3 service phase observations
     • Networking problems
     • Hardware problems
     • The 10GE link to CERN was dedicated but the 10G switch was not. Traffic switched back and forth between the dedicated 10GE link and Geant.
     • Routing problems
     • Considerably less data stored for ATLAS than expected.
     • The plans on the Wiki stated 20 TB

  11. SC3 service phase observations
     • Communication problems
     • Network changes not reported
     • We were not informed of changes in subnets.
     • Problems are not always reported
     • Failed transfers are not always reported
     • Network outage CERN-SARA between Xmas and New Year; nobody informed us
     • Monitoring: experiment monitoring websites are listed in the Wiki, but we also found other monitoring website URLs in emails.
     • Not clear what the experiments' exact plans are
     • When there are no transfers and no problems are reported, it is not clear whether something is wrong or things are going just as planned.

  12. SC3 service phase observations
     • Transfers failed when attempting to overwrite files
     • Overwriting is not allowed by PNFS
     • At dCache sites running a gridftp door on their SRM node, files can be removed immediately using edg-gridftp-rm or glite-gridftp-rm
     • At dCache sites that don't run a gridftp door on the SRM node, an advisory delete can be done, but then files are not deleted immediately.
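The choice between the two cleanup routes on this slide can be sketched as a small helper that only builds the command line and does not run it. Everything beyond the tool name edg-gridftp-rm is an assumption: the host name is hypothetical, `srm-advisory-delete` is assumed to be the advisory-delete client in use, port 8443 is assumed for the SRM endpoint, and the single-URL argument form should be checked against the installed clients.

```python
def cleanup_command(path: str, host: str, gridftp_door_on_srm_node: bool) -> list:
    """Build (but do not run) the command that clears a file before a retry.

    Assumptions, not taken from the slides: URL schemes, the SRM port 8443,
    and the name 'srm-advisory-delete' for the advisory-delete client.
    """
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /pnfs/...")
    if gridftp_door_on_srm_node:
        # Immediate removal through the gridftp door on the SRM node.
        return ["edg-gridftp-rm", f"gsiftp://{host}{path}"]
    # No gridftp door: fall back to an advisory delete (not immediate).
    return ["srm-advisory-delete", f"srm://{host}:8443{path}"]


if __name__ == "__main__":
    # 'srm.example.org' and the file name are hypothetical placeholders.
    target = "/pnfs/example.org/data/myvo/file.dat"
    print(cleanup_command(target, "srm.example.org", True))
    print(cleanup_command(target, "srm.example.org", False))
```

Because an advisory delete is not immediate, a retry issued right after the second form may still hit the overwrite refusal.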

  13. SC3 service phase observations
     • dCache security: (gsi)dcap
     • Using dccp, anyone can fetch anything under /pnfs/grid.sara.nl/data/<vo>
     • Unix permissions on directories are not honoured
     • Files in a directory with -rwxr-x--- are world readable.
     • File permissions are honoured, but when data is copied into /pnfs it gets -rw-r--r--.
     • Using gsidcap you are authenticated but the behaviour above stays the same.
     • Write permissions are OK.
     • Maybe this is OK for HEP VOs but for some VOs this is too liberal.
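To make the two mode strings on this slide concrete, the following sketch decodes an `ls`-style permission string into mode bits and checks the "others" read bit. It only illustrates standard Unix semantics using Python's `stat` module; it does not model how dCache or PNFS evaluates permissions, and the helper names are illustrative.

```python
import stat

# Permission flags in the order they appear after the file-type character.
_FLAGS = [
    stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR,
    stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP,
    stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH,
]


def mode_bits(mode_string: str) -> int:
    """Turn a 10-character string such as '-rw-r--r--' into permission bits."""
    if len(mode_string) != 10:
        raise ValueError("expected 10 characters, e.g. '-rwxr-x---'")
    bits = 0
    for char, flag in zip(mode_string[1:], _FLAGS):
        if char != "-":
            bits |= flag
    return bits


def others_can_read(mode_string: str) -> bool:
    """True if the 'others' read bit is set in the mode string."""
    return bool(mode_bits(mode_string) & stat.S_IROTH)


if __name__ == "__main__":
    for mode in ("-rwxr-x---", "-rw-r--r--"):
        print(mode, oct(mode_bits(mode)), "others can read:", others_can_read(mode))
```

Under plain Unix rules the first mode grants nothing to others, which is why world-readable files beneath such a directory are unexpected, while the second mode, applied to data copied into /pnfs, is readable by everyone by design.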

  14. SC3 service phase observations
     • Oracle database
     • Every now and then it just hangs and needs to be restarted.
     • Backups didn't work but FTS and LFC did.

  15. SC3 service phase observations
     • A user wanted to run a job using ROOT I/O, which is rfio/dcap based.
     • Rfio and dcap are unauthenticated protocols for accessing data
     • Rfio is installed automatically when installing a classic SE with yaim.
     • We don't really like it, but what do the other T1s think about this?

  16. SC4 Outlook
     • Current plans (being updated)
       - Set up T2 tests
       - Separate T1 tape storage from general storage
       - Replace old SE by SRM SE
       - Set up DB node for FTS/LFC
