
Grid Computing for High Energy Physics in Japan

Hiroyuki Matsunaga

International Center for Elementary Particle Physics (ICEPP),

The University of Tokyo

International Workshop on e-Science for Physics 2008


Major High Energy Physics Program in Japan

  • KEK-B (Tsukuba)

    • Belle

  • J-PARC (Tokai)

    • Japan Proton Accelerator Research Complex

    • Operation will start later this year (2008)

    • T2K (Tokai to Kamioka)

      • Long-baseline neutrino experiment

  • Kamioka

    • Super-Kamiokande

    • KamLAND

  • International collaboration

    • CERN LHC (ATLAS, ALICE)

    • Fermilab Tevatron (CDF)

    • BNL RHIC (PHENIX)


Grid Related Activities

  • ICEPP, University of Tokyo

    • WLCG Tier2 site for ATLAS

      • Regional Center for ATLAS-Japan group

  • Hiroshima University

    • WLCG Tier2 site for ALICE

  • KEK

    • Two EGEE production sites

      • Belle experiment, J-PARC, ILC…

    • University support

    • NAREGI

  • Grid deployment at universities

    • Nagoya U. (Belle), Tsukuba U. (CDF)…

  • Network


Grid Deployment at University of Tokyo

  • ICEPP, University of Tokyo

    • Involved in international HEP experiments since 1974

  • Pilot system in operation since 2002

  • The current computer system started operation last year (2007)

    • TOKYO-LCG2, running gLite 3

  • CC-IN2P3 (Lyon, France) is the associated Tier 1 site within the ATLAS computing model

    • Detector data from CERN go through CC-IN2P3

    • Exceptionally long distance for a T1-T2 pair

      • RTT ~280 ms, ~10 hops

      • A challenge for efficient data transfer

      • The data catalog for the files stored in Tokyo is located at Lyon

    • ASGC (Taiwan) could become an additional associated Tier 1

      • Geographically the nearest Tier 1 (RTT ~32 ms)

      • Operations have been supported by ASGC

        • Neighboring time zone


Hardware resources

  • Tier-2 site plus (non-grid) regional center facility

    • Supports local user analysis by the ATLAS Japan group

  • Blade servers

    • 650 nodes (2600 cores)

  • Disk arrays

    • 140 boxes (~6 TB/box)

    • 4 Gb Fibre Channel

  • File servers

    • 5 disk arrays attached to each server

    • 10 GbE NIC

  • Tape robot (LTO3)

    • 8000 tapes, 32 drives
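
The counts above translate into rough capacity totals. The back-of-the-envelope sketch below assumes the native (uncompressed) LTO-3 cartridge capacity of 400 GB, which the slide does not state:

    # Rough storage totals implied by the hardware list above.
    # Box, server and tape counts come from the slide; 400 GB per cartridge
    # is the assumed native LTO-3 capacity (not stated on the slide).
    disk_boxes, tb_per_box = 140, 6
    tapes, tb_per_tape = 8000, 0.4
    file_servers = disk_boxes // 5          # 5 disk arrays per file server

    print(f"Disk: ~{disk_boxes * tb_per_box} TB across {file_servers} file servers")
    print(f"Tape: ~{tapes * tb_per_tape / 1000:.1f} PB")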

[Photos: tape robot, blade servers, and disk arrays]


SINET3

  • SINET3 (Japanese NREN)

    • Third generation of SINET, since Apr. 2007

    • Provided by NII (National Institute of Informatics)

  • Backbone: up to 40 Gbps

  • Major universities connect with 1-10 Gbps

    • 10 Gbps to the Tokyo Regional Center

  • International links

    • 2 x 10 Gbps to US

    • 2 x 622 Mbps to Asia


International Link

  • 10 Gbps between Tokyo and CC-IN2P3

    • SINET3 + GEANT + RENATER (French NREN)

    • Public network (shared with other traffic)

  • 1 Gbps link to ASGC (to be upgraded to 2.4 Gbps)

[Map: Tokyo - New York - Lyon route via SINET3 (10 Gbps), GEANT (10 Gbps), and RENATER (10 Gbps); separate link between Tokyo and Taipei]


Network test with Iperf

  • Memory-to-memory tests performed with the iperf program

  • Dedicated Linux boxes used for the iperf tests at both ends

    • 1 Gbps limit set by the NICs

    • Linux kernel 2.6.9 (BIC TCP)

    • Window size 8 MB, 8 parallel streams

  • For Lyon-Tokyo: long recovery time after packet loss due to the long RTT (see the throughput estimate below)

[Plots: iperf throughput vs. time for Taipei <-> Tokyo (RTT: 32 ms) and Lyon <-> Tokyo (RTT: 280 ms)]
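
The effect of the RTT follows from the TCP bandwidth-delay product: a single stream can carry at most roughly window size / RTT, which is why parallel streams matter on the long Lyon-Tokyo path. A minimal sketch using the window size, stream count and RTTs quoted above; everything else is illustrative:

    # Per-stream TCP throughput is bounded by roughly window_size / RTT.
    # Window size, stream count and RTTs are the values quoted on the slide.
    WINDOW_BYTES = 8 * 1024 * 1024      # 8 MB TCP window used for iperf
    STREAMS = 8                         # parallel iperf streams

    for link, rtt_ms in [("Taipei <-> Tokyo", 32), ("Lyon <-> Tokyo", 280)]:
        per_stream = WINDOW_BYTES / (rtt_ms / 1000.0)      # bytes/s per stream
        aggregate_gbps = per_stream * STREAMS * 8 / 1e9    # all streams, in Gbps
        print(f"{link}: ~{per_stream / 1e6:.0f} MB/s per stream, "
              f"~{aggregate_gbps:.1f} Gbps aggregate (before the 1 Gbps NIC limit)")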


Data Transfer from Lyon Tier1 center

  • Data transferred from Lyon to Tokyo

    • Used Storage Elements in production

    • ATLAS MC simulation data

  • Storage Elements

    • Lyon: dCache (>30 gridFTP servers, Solaris, ZFS)

    • Tokyo: DPM (6 gridFTP servers, Linux, XFS)

  • FTS (File Transfer Service)

    • Main tool for bulk data transfer

    • Executes multiple file transfers concurrently (using gridFTP)

      • Sets the number of parallel streams for gridFTP

    • Used by the ATLAS Distributed Data Management system
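
FTS itself schedules, throttles and retries the transfers; the sketch below only illustrates the concurrency model it implements, namely a fixed number of files in flight at once, each copied over gridFTP with several parallel streams. The storage endpoints, paths and file names are hypothetical, and globus-url-copy is used merely as a stand-in gridFTP client, not as the tool FTS invokes:

    # Illustration of the FTS concurrency model, NOT the real FTS code:
    # up to CONCURRENT_FILES gridFTP copies run at once, each with
    # STREAMS_PER_FILE parallel TCP streams (globus-url-copy's -p option).
    # Source/destination URLs and file names are hypothetical.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    CONCURRENT_FILES = 20    # files in flight at once (as in the Lyon-Tokyo tests)
    STREAMS_PER_FILE = 10    # gridFTP streams per file

    def copy_one(name: str) -> int:
        cmd = ["globus-url-copy", "-p", str(STREAMS_PER_FILE),
               f"gsiftp://dcache.example-lyon.fr/atlas/{name}",   # hypothetical source SE
               f"gsiftp://dpm.example-tokyo.jp/atlas/{name}"]     # hypothetical destination SE
        return subprocess.call(cmd)

    files = [f"mc_sample_{i:05d}.root" for i in range(100)]       # placeholder file names
    with ThreadPoolExecutor(max_workers=CONCURRENT_FILES) as pool:
        failures = sum(rc != 0 for rc in pool.map(copy_one, files))
    print(f"{failures} transfers failed")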


Performance of data transfer

Throughput per file transfer

  • Aggregate throughput of >500 MB/s observed in May 2008

    • File size: 3.5 GB

    • 20 files in parallel, 10 streams each

    • ~40 MB/s for each file transfer

  • Little other activity at CC-IN2P3 during that period (see the consistency check below)

[Plots: per-file transfer throughput (MB/s, log scale) and aggregate transfer rate, peaking at ~500 MB/s]
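
The per-file and aggregate figures above are consistent with each other, as the simple check below shows (all numbers taken from the slide):

    # Consistency check of the quoted rates (numbers from the slide).
    concurrent_files = 20        # files transferred in parallel
    per_file_mb_s = 40           # ~40 MB/s per file transfer

    upper_bound = concurrent_files * per_file_mb_s          # 800 MB/s ideal aggregate
    print(f"Ideal aggregate: {upper_bound} MB/s; observed peak >500 MB/s, "
          f"about {500 / upper_bound:.0%} of the bound")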


Data transfer between ASGC and Tokyo

  • Transferred 1000 files per test (1 GB file size)

  • Tried various numbers of concurrent files / streams

    • From 4 files / 1 stream up to 25 files / 15 streams

  • Saturates the 1 Gbps WAN bandwidth (see the timing estimate below)

[Plots: transfer throughput for Tokyo -> ASGC and ASGC -> Tokyo with concurrent-files/streams settings from 4/1 up to 25/15]
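
At a saturated 1 Gbps WAN the duration of one such test follows directly; a back-of-the-envelope estimate with protocol overhead ignored:

    # Rough duration of one 1000-file test at a saturated 1 Gbps link
    # (payload of ~125 MB/s at best; protocol overhead ignored).
    files = 1000
    file_size_gb = 1.0
    link_mb_s = 1000 / 8

    hours = files * file_size_gb * 1000 / link_mb_s / 3600
    print(f"~{hours:.1f} h to move {files} x {file_size_gb:.0f} GB at 1 Gbps")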


CPU Usage in the last year (Sep 2007 – Aug 2008)

  • 3,253,321 kSI2k*hours of CPU time delivered in the last year (see the average-capacity estimate below)

    • Most jobs are ATLAS MC simulation

      • Job submission is coordinated by CC-IN2P3 (the associated Tier1)

      • Outputs are uploaded to the data storage at CC-IN2P3

    • Large contribution to the ATLAS MC production

[Charts: TOKYO-LCG2 CPU time per month; CPU time at large Tier 2 sites]
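
Dividing the yearly total by the hours in a year gives the average delivered capacity; a trivial calculation using the figure quoted above:

    # Average delivered capacity implied by the yearly CPU-time total.
    total_ksi2k_hours = 3_253_321       # Sep 2007 - Aug 2008, from the slide
    hours_per_year = 365 * 24

    print(f"Average: ~{total_ksi2k_hours / hours_per_year:.0f} kSI2k sustained over the year")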


ALICE Tier2 center at Hiroshima University

  • WLCG/EGEE site

    • “JP-HIROSHIMA-WLCG”

  • Possible Tier 2 site for ALICE


Status at Hiroshima

    • Just became an EGEE production site

    • Aug. 2008

  • Associated Tier1 site will likely be CC-IN2P3

    • No ALICE Tier1 in Asia-Pacific region

  • Resources

    • 568 CPU cores

      • Dual-core Xeon (3 GHz) x 2 CPUs x 38 boxes

      • Quad-core Xeon (2.6 GHz) x 2 CPUs x 32 boxes

      • Quad-core Xeon (3 GHz) x 2 CPUs x 20 blades

    • Storage: ~200 TB next year

  • Network: 1Gbps

    • On SINET3
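
The quoted 568 cores is simply the sum over the three node types listed above; a quick check:

    # Core-count check for the listed worker nodes (all figures from the slide).
    nodes = [
        ("Dual-core Xeon 3 GHz",   2, 2, 38),   # cores/CPU, CPUs/node, nodes
        ("Quad-core Xeon 2.6 GHz", 4, 2, 32),
        ("Quad-core Xeon 3 GHz",   4, 2, 20),
    ]
    total = sum(cores * cpus * count for _, cores, cpus, count in nodes)
    print(f"Total CPU cores: {total}")          # 568, matching the quoted figure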


KEK

  • Belle experiment has been running

    • Needs access to the existing petabytes of data

  • Site operations

    • KEK does not support any LHC experiment

    • Aims to gain experience by operating sites, in preparation for a future Tier 1 level Grid center

  • University support

  • NAREGI

[Aerial photo: KEK Tsukuba campus, with Mt. Tsukuba, the Belle experiment, KEKB, and the linac]


Grid Deployment at KEK

  • Two EGEE sites

    • JP-KEK-CRC-1

      • Mainly for experimental use and R&D

    • JP-KEK-CRC-2

      • More stable services

  • NAREGI

    • Used beta version for testing and evaluation

  • Supported VOs

    • belle (main target at present), ilc, calice, …

    • LCG VOs are not supported

  • VOMS operation

    • belle (registered in CIC)

    • ppj (accelerator science in Japan), naokek

    • g4med, apdg, atlasj, ail


Belle VO

  • Federation established

    • 5 countries, 7 institutes, 10 sites

      • Nagoya Univ., Univ. of Melbourne, ASGC, NCU, CYFRONET, Korea Univ., KEK

  • VOMS is provided by KEK

  • Activities

    • Submit MC production jobs

    • Functional and performance tests

    • Interface to the existing petabytes of data


Takashi Sasaki (KEK)


ppj VO

  • Federated among major universities and KEK

    • Tohoku U. (ILC, KamLAND)

    • U. Tsukuba (CDF)

    • Nagoya U. (Belle, ATLAS)

    • Kobe U. (ILC, ATLAS)

    • Hiroshima IT (ATLAS, Computing Science)

  • Common VO for accelerator science in Japan

    • NOT tied to specific projects; resources are shared

  • KEK acts as the GOC (Grid Operations Center)

    • Remote installation

    • Monitoring

      • Based on Nagios and Wiki

    • Software update


KEK Grid CA

  • In operation since Jan. 2006

  • Accredited as an IGTF (International Grid Trust Federation) compliant CA

[Chart: number of issued certificates]


NAREGI

  • NAREGI: NAtional REsearch Grid Initiative

    • Host institute: National Institute of Informatics (NII)

    • R&D of the Grid middleware for research and industrial applications

    • Main targets are nanotechnology and biotechnology

      • Initially more focused on the computing grid

      • The data grid part was integrated later

  • Ver. 1.0 of the middleware released in May 2008

    • Software maintenance and user support services will be continued


NAREGI at KEK

  • NAREGI-β (beta) versions installed on the testbed

    • 1.0.1: Jun. 2006 – Nov. 2006

      • Manual installation for all the steps

    • 1.0.2: Feb 2007

    • 2.0.0: Oct. 2007

      • apt-rpm installation

    • 2.0.1: Dec. 2007

  • Site federation test

    • KEK-NAREGI/NII: Oct. 2007

    • KEK-National Astronomical Observatory (NAO): Mar. 2008

      • Evaluation of the NAREGI application environment

        • Job submission/retrieval, remote data stage-in/out


Takashi Sasaki (KEK)


Data Storage: Gfarm

  • Gfarm: a distributed file system

    • The data grid part of NAREGI

    • Data are stored on multiple disk servers

  • Tests performed:

    • Stage-in and stage-out to the Gfarm storage

    • GridFTP interface

      • Between gLite site and NAREGI site

    • File access from application

      • Accessed via FUSE (Filesystem in Userspace)

        • No need to change the application program

        • I/O speed is several times slower than a local disk
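
Because the Gfarm storage is exposed through a FUSE mount, an application reads files with ordinary POSIX I/O and needs no Gfarm-specific code; a minimal sketch, assuming a hypothetical mount point /gfarm and file path (both depend on the actual site configuration):

    # With Gfarm mounted through FUSE, plain POSIX file I/O works unchanged;
    # no Gfarm-specific API is needed. Mount point and file path are hypothetical.
    GFARM_MOUNT = "/gfarm"

    with open(f"{GFARM_MOUNT}/belle/mc/sample_run.dat", "rb") as f:
        header = f.read(1024)               # ordinary read going through FUSE
    print(f"Read {len(header)} bytes from the FUSE-mounted Gfarm storage")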


Future Plan on NAREGI at KEK

  • Migration to the production version

  • Test of interoperability with gLite

  • Improve the middleware in the application domain

      • Development of a new API for applications

      • Virtualization of the middleware

      • For scripting languages (to be used at the web portal as well)

    • Monitoring

      • Jobs, sites,…


Summary

  • WLCG

    • ATLAS Tier2 at Tokyo

      • Stable operation

    • ALICE Tier2 at Hiroshima

      • Just started operation in production

  • Coordinated effort led by KEK

    • Site operations with gLite and NAREGI middleware

      • Belle VO: SRB

        • Will be replaced with iRODS

      • ppj VO: deployment at universities

        • Supported and monitored by KEK

    • NAREGI

      • R&D, interoperability

