


Grid Computing for High Energy Physics in Japan

Hiroyuki Matsunaga

International Center for Elementary Particle Physics (ICEPP),

The University of Tokyo

International Workshop on e-Science for Physics 2008



Major High Energy Physics Program in Japan

  • KEK-B (Tsukuba)

    • Belle

  • J-PARC (Tokai)

    • Japan Proton Accelerator Research Complex

    • Operation will start later this year

    • T2K (Tokai to Kamioka)

      • long baseline neutrino experiment

  • Kamioka

    • SuperKamiokande

    • KamLAND

  • International collaboration

    • CERN LHC (ATLAS, ALICE)

    • Fermilab Tevatron (CDF)

    • BNL RHIC (PHENIX)



Grid Related Activities

  • ICEPP, University of Tokyo

    • WLCG Tier2 site for ATLAS

      • Regional Center for ATLAS-Japan group

  • Hiroshima University

    • WLCG Tier2 site for ALICE

  • KEK

    • Two EGEE production sites

      • BELLE experiment, J-PARC, ILC…

    • University support

    • NAREGI

  • Grid deployment at universities

    • Nagoya U. (Belle), Tsukuba U. (CDF)…

  • Network



Grid Deployment at University of Tokyo

  • ICEPP, University of Tokyo

    • Involved in international HEP experiments since 1974

  • Operated pilot system since 2002

  • Current computer system started operation last year (2007)

    • TOKYO-LCG2, with gLite 3 installed

  • CC-IN2P3 (Lyon, France) is the associated Tier 1 site within ATLAS computing model

    • Detector data from CERN go through CC-IN2P3

    • Exceptionally long distance for a T1-T2 pair

      • RTT ~280msec, ~10 hops

      • Challenge for efficient data transfer (see the bandwidth-delay sketch below)

      • Data catalog for the files stored in Tokyo is located at Lyon

    • ASGC (Taiwan) could become an additional associated Tier1

      • Geographically nearest Tier 1 (RTT ~32msec)

      • Operations have been supported by ASGC

        • Neighboring timezone
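
The challenge comes from the bandwidth-delay product: at an RTT of ~280 ms, a large amount of data must be kept in flight to fill the path. The minimal Python sketch below only illustrates the arithmetic; the 1 Gbps figure is the end-host NIC limit quoted later in this talk.

# Bandwidth-delay product (BDP) estimate; illustrative numbers only.
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the pipe full."""
    return bandwidth_bps / 8.0 * rtt_s

for name, rtt in [("Lyon-Tokyo", 0.280), ("Taipei-Tokyo", 0.032)]:
    mib = bdp_bytes(1e9, rtt) / 2**20
    print(f"{name}: ~{mib:.0f} MiB in flight needed to fill 1 Gbps")
# Lyon-Tokyo needs ~33 MiB in flight, far beyond a default TCP window,
# hence the large windows and parallel streams used in the tests below.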



Hardware resources

  • Tier-2 site plus (non-grid) regional center facility

    • Support local user analysis by the ATLAS Japan group

  • Blade servers

    • 650 nodes (2600 cores)

  • Disk arrays

    • 140 boxes (~6 TB/box)

    • 4Gb Fibre-Channel

  • File servers

    • 5 disk arrays attached to each

    • 10 GbE NIC

  • Tape robot (LTO3)

    • 8000 tapes, 32 drives (rough capacity estimate in the sketch below)

[Photos: tape robot, blade servers, and disk arrays]
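
A rough capacity check of the listed hardware. The 400 GB per tape figure is the LTO3 native capacity, which is an assumption not stated on the slide.

# Back-of-the-envelope totals for the Tokyo regional center hardware.
blade_nodes, cores_per_node = 650, 4          # 650 nodes, 2600 cores quoted
disk_boxes, tb_per_box = 140, 6               # ~6 TB per disk array box
tapes, tb_per_tape = 8000, 0.4                # assumes LTO3 native 400 GB/tape

print("cores:", blade_nodes * cores_per_node)         # 2600
print("disk :", disk_boxes * tb_per_box, "TB")        # ~840 TB
print("tape :", tapes * tb_per_tape / 1000, "PB")     # ~3.2 PB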



SINET3

  • SINET3 (Japanese NREN)

    • Third generation of SINET, since Apr. 2007

    • Provided by NII (National Institute of Informatics)

  • Backbone: up to 40 Gbps

  • Major universities connect at 1-10 Gbps

    • 10 Gbps to Tokyo RC

  • International links

    • 2 x 10 Gbps to US

    • 2 x 622 Mbps to Asia



International Link

  • 10Gbps between Tokyo and CC-IN2P3

    • SINET3 + GEANT + RENATER (French NREN)

    • public network (shared with other traffic)

  • 1Gbps link to ASGC (to be upgraded to 2.4 Gbps)

[Network diagram: Tokyo to Lyon via New York over SINET3 (10 Gbps), GEANT (10 Gbps), and RENATER (10 Gbps); separate link from Tokyo to Taipei]



Network test with Iperf

  • Memory-to-memory tests performed with the iperf program

  • Dedicated Linux boxes used at both ends

    • 1Gbps limited by NIC

    • Linux kernel 2.6.9 (BIC TCP)

    • Window size 8 Mbytes, 8 parallel streams (see the sketch below)

  • For Lyon-Tokyo: recovery after packet loss is slow due to the long RTT

[Plots: iperf throughput for Taipei <-> Tokyo (RTT 32 ms) and Lyon <-> Tokyo (RTT 280 ms)]
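
A sketch of the client side of such a memory-to-memory test, using the standard iperf (v2) options for window size and parallel streams; the host name is a placeholder.

# Client side of the memory-to-memory test (server runs "iperf -s -w 8M").
import subprocess

REMOTE = "iperf-box.example.org"   # placeholder for the dedicated box at the far end

subprocess.run([
    "iperf",
    "-c", REMOTE,    # client mode, connect to the remote test box
    "-w", "8M",      # 8 Mbyte TCP window (as on the slide)
    "-P", "8",       # 8 parallel streams
    "-t", "60",      # run for 60 seconds
    "-i", "10",      # report every 10 seconds
], check=True)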



Data Transfer from Lyon Tier1 center

  • Data transferred from Lyon to Tokyo

    • Used Storage Elements in production

    • ATLAS MC simulation data

  • Storage Elements

    • Lyon: dCache (>30 gridFTP servers, Solaris, ZFS)

    • Tokyo: DPM (6 gridFTP servers, Linux, XFS)

  • FTS (File Transfer Service)

    • Main tool for bulk data transfer

    • Executes multiple file transfers concurrently (each using gridFTP); a single-copy sketch is shown below

      • Number of gridFTP streams per file is configurable

    • Used by the ATLAS Distributed Data Management system
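
What FTS ultimately runs is a set of concurrent gridFTP copies. Below is a minimal sketch of one such third-party copy with globus-url-copy; the source and destination URLs are placeholders, and the stream count matches the settings quoted on the next slide.

# One gridFTP copy of the kind FTS schedules in bulk (placeholder URLs).
import subprocess

SRC = "gsiftp://gridftp.lyon.example.fr/dcache/atlas/mc/file.root"   # hypothetical path
DST = "gsiftp://gridftp.tokyo.example.jp/dpm/atlas/mc/file.root"     # hypothetical path

subprocess.run([
    "globus-url-copy",
    "-p", "10",     # 10 parallel TCP streams per file
    "-vb",          # print throughput while transferring
    SRC, DST,
], check=True)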



Performance of data transfer

  • >500 Mbytes/s aggregate observed in May 2008 (arithmetic check below)

    • File size: 3.5 Gbytes

    • 20 files in parallel, 10 streams each

    • ~40 Mbytes/s per file transfer

  • Little activity other than ours at CC-IN2P3 during that period

[Plots: total throughput (peaking above 500 Mbytes/s) and throughput per file transfer (Mbytes/s)]
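
A quick consistency check of the quoted numbers (illustrative only):

# 20 concurrent files at ~40 Mbytes/s each vs. the observed aggregate.
files_in_parallel = 20
per_file_rate = 40                          # Mbytes/s per file transfer
print(files_in_parallel * per_file_rate)    # 800 Mbytes/s if every slot ran at full speed
# The observed aggregate was >500 Mbytes/s: the slots are not all busy at
# peak rate at the same time (per-file setup and tail effects leave gaps).
# At 40 Mbytes/s, one 3.5 Gbyte file takes roughly 3500 / 40 ~ 90 seconds.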



Data transfer between ASGC and Tokyo

  • Transferred 1000 files per test (1 Gbyte file size); a rough timing estimate is sketched below

  • Tried various combinations of concurrent files / streams per file

    • From 4/1 to 25/15

  • Saturated the 1 Gbps WAN bandwidth

[Plots: transfer throughput for Tokyo -> ASGC and ASGC -> Tokyo at various concurrent-files/streams settings (4/1 up to 25/15)]
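
A rough lower bound on how long one such test run takes if the 1 Gbps WAN link stays saturated (sketch only):

# 1000 files of 1 Gbyte each over a saturated 1 Gbps link.
n_files, file_gb, link_gbps = 1000, 1.0, 1.0

total_bytes = n_files * file_gb * 1e9
rate_bytes_s = link_gbps * 1e9 / 8        # ~125 Mbytes/s on the wire
print(f"~{total_bytes / rate_bytes_s / 3600:.1f} hours at best")   # ~2.2 hours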



CPU Usage in the last year (Sep 2007 – Aug 2008)

  • 3,253,321 kSI2k*hours of CPU time in the last year

    • Most jobs are ATLAS MC simulation

      • Job submission is coordinated by CC-IN2P3 (the associated Tier1)

      • Outputs are uploaded to the data storage at CC-IN2P3

    • Large contribution to the ATLAS MC production

[Charts: TOKYO-LCG2 CPU time per month; CPU time at large Tier2 sites]



ALICE Tier2 center at Hiroshima University

  • WLCG/EGEE site

    • “JP-HIROSHIMA-WLCG”

  • Possible Tier 2 site for ALICE



Status at Hiroshima

  • Just became an EGEE production site

    • Aug. 2008

  • Associated Tier1 site will likely be CC-IN2P3

    • No ALICE Tier1 in Asia-Pacific region

  • Resources

    • 568 CPU cores (totals checked in the sketch below)

      • Dual-core Xeon (3 GHz) x 2 CPUs x 38 boxes

      • Quad-core Xeon (2.6 GHz) x 2 CPUs x 32 boxes

      • Quad-core Xeon (3 GHz) x 2 CPUs x 20 blades

    • Storage: ~200 TB next year

  • Network: 1Gbps

    • On SINET3
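
The quoted 568 cores follow directly from the three node types (simple check):

# Cores per node type: cores/CPU x CPUs/box x number of boxes.
nodes = [
    (2, 2, 38),   # dual-core Xeon, 2 CPUs, 38 boxes
    (4, 2, 32),   # quad-core Xeon, 2 CPUs, 32 boxes
    (4, 2, 20),   # quad-core Xeon, 2 CPUs, 20 blades
]
print(sum(cores * cpus * count for cores, cpus, count in nodes))   # 568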



KEK

  • Belle experiment has been running

    • Need access to the existing petabytes of data

  • Site operations

    • KEK does not support any LHC experiment

    • Aim to gain experience by operating sites, in preparation for a future Tier1-level Grid center

  • University support

  • NAREGI

[Aerial photo: KEK Tsukuba campus, with Mt. Tsukuba, the Belle experiment, KEKB, and the linac labeled]



Grid Deployment at KEK

  • Two EGEE sites

    • JP-KEK-CRC-1

      • Mainly for experimental use and R&D

    • JP-KEK-CRC-2

      • More stable services

  • NAREGI

    • Used beta version for testing and evaluation

  • Supported VOs

    • belle (main target at present), ilc, calice, …

    • LCG VOs are not supported

  • VOMS operation (proxy-initialization sketch after this list)

    • belle (registered in CIC)

    • ppj (accelerator science in Japan), naokek

    • g4med, apdg, atlasj, ail
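
For illustration, a user of one of these VOs authenticates against the KEK VOMS with the standard gLite client command. The sketch assumes the user's grid certificate is already installed and that the belle VO is configured on the client.

# Obtain a VOMS proxy for the belle VO (standard gLite client command).
import subprocess

subprocess.run(["voms-proxy-init", "-voms", "belle"], check=True)
# The resulting proxy carries the VO attributes that sites use for
# authorization when jobs are submitted or data are accessed.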



Belle VO

  • Federation established

    • 5 countries, 7 institutes, 10 sites

      • Nagoya Univ., Univ. of Melbourne, ASGC, NCU, CYFRONET, Korea Univ., KEK

  • VOMS is provided by KEK

  • Activities

    • Submit MC production jobs

    • Functional and performance tests

    • Interface to the existing petabytes of data


[Slide: material from Takashi Sasaki (KEK)]



ppj VO

  • Federated among major universities and KEK

    • Tohoku U. (ILC, KamLAND)

    • U. Tsukuba (CDF)

    • Nagoya U. (Belle, ATLAS)

    • Kobe U. (ILC, ATLAS)

    • Hiroshima IT (ATLAS, Computing Science)

  • Common VO for accelerator science in Japan

    • Does NOT depend on specific projects; resources are shared

  • KEK acts as the GOC (Grid Operations Centre)

    • Remote installation

    • Monitoring

      • Based on Nagios and Wiki

    • Software update



KEK Grid CA

  • In operation since Jan. 2006

  • Accredited as an IGTF (International Grid Trust Federation) compliant CA

[Chart: number of issued certificates]



NAREGI

  • NAREGI: NAtional REsearch Grid Initiative

    • Host institute: National Institute of Informatics (NII)

    • R&D of the Grid middleware for research and industrial applications

    • Main targets are nanotechnology and biotechnology

      • More focused on computing grid

      • Data grid part integrated later

  • Ver. 1.0 of the middleware released in May 2008

    • Software maintenance and user support services will be continued



NAREGI at KEK

    • NAREGI-β (beta) versions installed on the testbed

    • 1.0.1: Jun. 2006 – Nov. 2006

      • Manual installation for all the steps

    • 1.0.2: Feb 2007

    • 2.0.0: Oct. 2007

      • apt-rpm installation

    • 2.0.1: Dec. 2007

  • Site federation test

    • KEK-NAREGI/NII: Oct. 2007

    • KEK-National Astronomical Observatory (NAO): Mar. 2008

      • Evaluation of application environment of NAREGI

        • job submission/retrieval, remote data stage-in/out


[Slide: material from Takashi Sasaki (KEK)]



Data Storage: Gfarm

  • Gfarm: distributed file system

    • Data grid component of NAREGI

    • Data are stored across multiple disk servers

  • Tests performed:

    • Stage-in and stage-out to the Gfarm storage

    • GridFTP interface

      • Between gLite site and NAREGI site

    • File access from applications (read-timing sketch below)

      • Accessed via FUSE (Filesystem in Userspace)

        • No need to modify the application program

        • I/O is several times slower than on local disk
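
A sketch of the kind of comparison behind the last point: sequentially reading a file through the FUSE mount of Gfarm and from local disk. The mount point and file paths are placeholders.

# Sequential-read throughput through the Gfarm FUSE mount vs. local disk.
import time

def read_mb_per_s(path, block=4 * 1024 * 1024):
    """One sequential pass over the file; returns Mbytes/s."""
    start, nbytes = time.time(), 0
    with open(path, "rb") as f:
        while chunk := f.read(block):
            nbytes += len(chunk)
    return nbytes / (time.time() - start) / 1e6

print("gfarm:", read_mb_per_s("/gfarm/belle/sample.dat"))   # FUSE mount (placeholder path)
print("local:", read_mb_per_s("/data/sample.dat"))          # local disk (placeholder path)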



Future Plan on NAREGI at KEK

  • Migration to the production version

  • Test of interoperability with gLite

  • Improve the middleware in the application domain

      • Development of a new API for applications

      • Virtualization of the middleware

      • For scripting languages (also to be used from the web portal)

    • Monitoring

      • Jobs, sites,…



Summary

  • WLCG

    • ATLAS Tier2 at Tokyo

      • Stable operation

    • ALICE Tier2 at Hiroshima

      • Just started production operation

  • Coordinated effort led by KEK

    • Site operations with gLite and NAREGI middleware

      • Belle VO: SRB (Storage Resource Broker)

        • Will be replaced with iRODS

      • ppj VO: deployment at universities

        • Supported and monitored by KEK

    • NAREGI

      • R&D, interoperability

