
Gfarm Grid File System for Distributed and Parallel Data Computing

Presentation Transcript


  1. APAN Workshop on Exploring eScience, Aug 26, 2005, Taipei, Taiwan
  Gfarm Grid File System for Distributed and Parallel Data Computing
  Osamu Tatebe (o.tatebe@aist.go.jp), Grid Technology Research Center, AIST

  2. [Background] Petascale Data-Intensive Computing
  • High Energy Physics
    • CERN LHC, KEK-B Belle
    • ~MB/collision, 100 collisions/sec
    • ~PB/year
    • 2000 physicists, 35 countries
    (Figures: detector for the LHCb experiment, detector for the ALICE experiment)
  • Astronomical Data Analysis
    • Data analysis of the whole archive
    • TB~PB/year/telescope
    • Subaru telescope: 10 GB/night, 3 TB/year

  3. Petascale Data-Intensive Computing: Requirements
  • Peta/Exabyte-scale files, millions of millions of files
  • Scalable computational power: > 1 TFLOPS, hopefully > 10 TFLOPS
  • Scalable parallel I/O throughput: > 100 GB/s, hopefully > 1 TB/s, within a system and between systems
  • Efficient global sharing with group-oriented authentication and access control
  • Fault tolerance / dynamic re-configuration
  • Resource management and scheduling
  • System monitoring and administration
  • Global computing environment

  4. Goal and Features of Grid Datafarm
  • Goal
    • Dependable data sharing among multiple organizations
    • High-speed data access, high-performance data computing
  • Grid Datafarm
    • Gfarm Grid File System – a global, dependable virtual file system
    • Federates scratch disks in PCs
    • Parallel and distributed data computing
    • Associates the Computational Grid with the Data Grid
  • Features
    • Secured, based on the Grid Security Infrastructure (GSI)
    • Scalable with data size and usage scenarios
    • Location-transparent data access
    • Automatic and transparent replica selection for fault tolerance
    • High-performance data access and computing by accessing multiple dispersed storages in parallel (file-affinity scheduling)

  5. Gfarm File System (1)
  • Virtual file system that federates the local disks of cluster nodes or Grid nodes
  • Enables transparent access to dispersed file data in a Grid through a global namespace
  • Supports fault tolerance and avoids access concentration by automatic, transparent replica selection
  • Can be shared among all cluster nodes and clients
  (Figure: a global namespace tree rooted at /gfarm, e.g. /gfarm/ggf/jp and /gfarm/aist/gtrc holding file1–file4, with global namespace mapping and file replica creation onto the Gfarm file system)
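To make the namespace-mapping and replica-selection idea on this slide concrete, here is a minimal C sketch. It is purely illustrative: the struct layout, host names, and liveness check are invented and are not the real Gfarm metadata schema or API. The point is only that a logical /gfarm path resolves to a list of replica hosts, and a usable replica is picked transparently.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration only: a path in the global /gfarm namespace
 * maps to the file system nodes holding a physical replica of that file. */
struct file_entry {
    const char *logical_path;
    const char *replica_hosts[4];   /* NULL-terminated list of nodes */
};

static const struct file_entry metadata[] = {
    { "/gfarm/ggf/jp/file1",    { "node01", "node07", NULL } },
    { "/gfarm/aist/gtrc/file3", { "node02", "node05", NULL } },
};

/* Stand-in for a liveness check; a real system would probe the node. */
static int host_is_up(const char *host)
{
    return strcmp(host, "node01") != 0;   /* pretend node01 is down */
}

/* Transparent replica selection: skip dead nodes, return a usable copy. */
static const char *select_replica(const struct file_entry *e)
{
    for (int i = 0; e->replica_hosts[i] != NULL; i++)
        if (host_is_up(e->replica_hosts[i]))
            return e->replica_hosts[i];
    return NULL;    /* all replicas unreachable */
}

int main(void)
{
    for (size_t i = 0; i < sizeof(metadata) / sizeof(metadata[0]); i++) {
        const char *host = select_replica(&metadata[i]);
        printf("%s -> %s\n", metadata[i].logical_path,
               host ? host : "no live replica");
    }
    return 0;
}
```

Because the selection happens behind the namespace lookup, an application sees the same /gfarm path whether a node has failed or not, which is what the slide means by fault tolerance and avoidance of access concentration.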

  6. Gfarm File System (2)
  • A file can be shared among all nodes and clients
  • Physically, it may be replicated and stored on any file system node
  • Applications can access it regardless of its location
  • In a cluster environment, a shared secret key is used for authentication
  (Figure: a client PC and a notebook PC mount /gfarm; the Gfarm file system metadata maps files A, B, and C to replicas spread across the file system nodes)

  7. Grid-Wide Configuration
  • Grid-wide file system built by integrating local disks at several sites
  • GSI authentication
  • Can be shared among all cluster nodes and clients
  • GridFTP and Samba servers at each site
  (Figure: a single Gfarm Grid file system mounted as /gfarm at sites in Japan, Singapore, and the US)

  8. Features of the Gfarm File System
  • A file can be stored on any file system (compute) node (distributed file system)
  • A file can be replicated and stored on different nodes (fault tolerant, access-concentration tolerant)
  • When a file replica exists on the compute node itself, it can be accessed without network overhead (high-performance, scalable I/O)

  9. More Scalable I/O Performance
  • User's view vs. physical execution view in Gfarm (file-affinity scheduling):
    • User A submits Job A, which accesses File A; Job A is executed on a node that holds File A
    • User B submits Job B, which accesses File B; Job B is executed on a node that holds File B
  • File system nodes = compute nodes
  • Do not separate storage and CPU (no SAN necessary)
  • Move and execute the program instead of moving large-scale data
  • Scalable file I/O by exploiting local I/O (a sketch of the scheduling idea follows below)
  (Figure: a cluster/Grid where the Gfarm file system spans the compute nodes, contrasted with a shared network file system)
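The following is a small, hypothetical C sketch of the file-affinity idea described on this slide. It is not Gfarm's actual scheduler; the catalog, job list, and node names are invented. It only shows the decision rule: dispatch each job to a compute node that already stores its input file, so the job does local I/O.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of file-affinity scheduling (illustrative only). */
struct replica { const char *file; const char *host; };
struct job     { const char *name; const char *input_file; };

/* Fake replica catalog: which node holds which file. */
static const struct replica catalog[] = {
    { "FileA", "node03" },
    { "FileB", "node11" },
};

static const char *node_holding(const char *file)
{
    for (size_t i = 0; i < sizeof(catalog) / sizeof(catalog[0]); i++)
        if (strcmp(catalog[i].file, file) == 0)
            return catalog[i].host;
    return NULL;   /* no known replica: fall back to any idle node */
}

int main(void)
{
    const struct job jobs[] = {
        { "JobA", "FileA" },   /* submitted by user A */
        { "JobB", "FileB" },   /* submitted by user B */
    };
    for (size_t i = 0; i < sizeof(jobs) / sizeof(jobs[0]); i++) {
        const char *host = node_holding(jobs[i].input_file);
        printf("%s (reads %s) -> run on %s\n", jobs[i].name,
               jobs[i].input_file, host ? host : "any-idle-node");
    }
    return 0;
}
```

Moving the program to the data in this way is what lets total I/O bandwidth grow with the number of file system (= compute) nodes instead of being capped by a shared file server or SAN.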

  10. Gfarm™ Data Grid Middleware
  • Open source development
    • Gfarm version 1.1.1 released on May 17, 2005 (http://datafarm.apgrid.org/)
    • Read-write mode support, better support for existing binary applications, metadata cache server
  • A shared file system in a cluster or a Grid
    • Accessible from legacy applications without any modification
    • Standard protocol support via scp, a GridFTP server, a Samba server, . . .
  • Existing applications can access the Gfarm file system without modification by using LD_PRELOAD with the system call hooking library, or via GfarmFS-FUSE
  (Figure: an application linked with the Gfarm client library talks to the metadata server (gfmd, slapd) and to the gfsd daemons on the compute and file system nodes)
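To illustrate "accessible from legacy applications without modification": the program below is plain POSIX C with no Gfarm-specific code. Per this slide and the next, running the same unmodified binary with the hooking library preloaded (e.g. LD_PRELOAD=libgfs_hook.so, the library named on slide 12) or on a GfarmFS-FUSE mount makes the /gfarm path reach the Gfarm file system. The program and the file path are illustrative.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Ordinary POSIX code: open(2)/read(2) on a /gfarm path.
 * The path below is an illustrative example, not a real file. */
int main(void)
{
    char buf[4096];
    ssize_t n;

    int fd = open("/gfarm/aist/gtrc/file1", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        (void)write(STDOUT_FILENO, buf, (size_t)n);  /* copy to stdout */
    close(fd);
    return 0;
}
```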

  11. Gfarm™ Data Grid Middleware (2)
  • libgfarm – Gfarm client library
    • Gfarm API
  • gfmd, slapd – metadata server
    • Namespace, replica catalog, host information, process information
  • gfsd – I/O server
    • Remote file access
  (Figure: the application calls the Gfarm client library, which obtains file and host information from the metadata server (gfmd, slapd) and performs remote file access against the gfsd daemons on the compute and file system nodes; a sketch of this flow follows below)
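A minimal sketch of the two-step access path this slide describes. The function and type names (gfmd_lookup, gfsd_read, file_location) are invented for illustration and are not the libgfarm API; the real client library performs the equivalent steps on behalf of the application.

```c
#include <stdio.h>

/* Hypothetical sketch: metadata lookup at gfmd, then data access at gfsd. */
struct file_location {
    const char *host;       /* gfsd node holding a replica */
    const char *local_path; /* path of the replica on that node */
};

/* Step 1: ask the metadata server (gfmd/slapd) where the file lives. */
static int gfmd_lookup(const char *gfarm_path, struct file_location *loc)
{
    loc->host = "node03";                         /* faked answer */
    loc->local_path = "/var/gfarm/spool/file1";   /* faked spool path */
    printf("gfmd: %s -> %s:%s\n", gfarm_path, loc->host, loc->local_path);
    return 0;
}

/* Step 2: contact gfsd on that node and read the data directly. */
static long gfsd_read(const struct file_location *loc, char *buf, long len)
{
    printf("gfsd@%s: reading %s\n", loc->host, loc->local_path);
    (void)buf; (void)len;
    return 0;   /* no real I/O in this sketch */
}

int main(void)
{
    struct file_location loc;
    char buf[4096];

    if (gfmd_lookup("/gfarm/aist/gtrc/file1", &loc) == 0)
        gfsd_read(&loc, buf, sizeof(buf));
    return 0;
}
```

The split matters for scalability: the metadata server only answers "where is the file", while the bulk data moves directly between the client and the gfsd node that stores it.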

  12. Access from Legacy Applications
  • libgfs_hook.so – system call hooking library
    • Emulates mounting the Gfarm file system at /gfarm by hooking open(2), read(2), write(2), …
    • Accesses under /gfarm are redirected to the appropriate Gfarm API; everything else falls through to the ordinary system call
    • No re-linking necessary: just specify LD_PRELOAD
    • Works on Linux, FreeBSD, NetBSD, …
    • More portable than developing a kernel module
    (A simplified sketch of this interception is shown below.)
  • Mounting the Gfarm file system
    • GfarmFS-FUSE mounts the Gfarm file system using the FUSE mechanism on Linux (released on Jul 12, 2005)
    • Other OSs would need a kernel module – volunteers needed
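A simplified sketch of the LD_PRELOAD interception idea described above. This is not libgfs_hook.so itself: it hooks only open(2), and where the real library would dispatch into the Gfarm API it merely prints a message before falling through to the C library. The dlsym(RTLD_NEXT, ...) pattern is the standard way to reach the next definition of the symbol.

```c
/* Build (Linux/glibc): gcc -shared -fPIC -o hook.so hook.c -ldl
 * Use:                 LD_PRELOAD=./hook.so some_legacy_program      */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

int open(const char *path, int flags, ...)
{
    mode_t mode = 0;
    if (flags & O_CREAT) {              /* open(2) takes a mode with O_CREAT */
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    if (strncmp(path, "/gfarm/", 7) == 0) {
        /* Here the real hooking library would call the Gfarm API;
         * this sketch only reports the interception. */
        fprintf(stderr, "hook: would route %s to Gfarm\n", path);
    }

    /* Fall through to the C library's open() for the actual call. */
    int (*real_open)(const char *, int, ...) =
        (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");
    return real_open(path, flags, mode);
}
```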

  13. Gfarm – Applications and Performance Results (http://datafarm.apgrid.org/)

  14. Scientific Applications (1)
  • ATLAS data production (collaboration with ICEPP, KEK)
    • Distribution kit (binary)
    • Atlfast – fast simulation; input data stored in the Gfarm file system, not NFS
    • G4sim – full simulation
  • Belle Monte-Carlo / data production (collaboration with KEK, U-Tokyo)
    • Online data processing
    • Distributed data processing
    • Real-time histogram display
    • 10 M events generated in a few days using a 50-node PC cluster

  15. Scientific Applications (2)
  • Astronomical object survey
    • Data analysis on the whole archive
    • 652 GB of data observed by the Subaru telescope
  • Large configuration data from lattice QCD
    • Three sets of hundreds of gluon field configurations on a 24^3 x 48 4-D space-time lattice (3 sets x 800 configurations x 364.5 MB = 854.3 GB)
    • Generated by the CP-PACS parallel computer at the Center for Computational Physics, Univ. of Tsukuba (300 Gflops x years of CPU time)

  16. Performance Result of Parallel grep
  • 25 GB text file
  • Xeon 2.8 GHz / 512 KB cache, 2 GB memory per node
  • NFS: 340 sec (sequential grep)
  • Gfarm: 15 sec (16 file system nodes, 16 parallel processes)
  • 22.6x superlinear speedup on 16 nodes: each process greps a locally stored portion of the file instead of contending for the single NFS server
  (Figure: compute nodes accessing NFS vs. the Gfarm file system; the Gfarm file system consists of the local disks of the compute nodes)

  17. GridFTP Data Transfer Performance
  • Multiple clients transferring through ftpd: local disk vs. Gfarm (1–2 nodes)
  • Two GridFTP servers can provide almost peak performance (1 Gbps)
  (Figure: many clients transferring through the GridFTP server(s))

  18. Gaussian 03 on Gfarm
  • Ab initio quantum chemistry package
  • Install once and run everywhere; no modification required to access Gfarm
  • Test415 (I/O-intensive test input)
    • NFS: 1 h 54 min 33 s
    • Gfarm: 1 h 0 min 51 s
  • Parallel analysis of all 666 test inputs using 47 nodes
    • NFS: write error due to heavy I/O load
    • Gfarm: 17 h 31 min 02 s – quite good scalability of I/O performance
    • Elapsed time can be reduced further by re-ordering the test inputs
  (Figure: compute nodes accessing NFS vs. Gfarm; Gfarm consists of the local disks of the compute nodes)

  19. Bioinformatics on Gfarm
  • iGAP (Integrative Genome Annotation Pipeline)
    • A suite of bioinformatics software for protein structural and functional annotation
    • More than 140 complete or partial proteomes analyzed
  • iGAP on Gfarm
    • Install once and run everywhere, using Gfarm's high-performance file replication and transfer
    • No modifications required to use distributed compute and storage resources
  • Burkholderia mallei (bacterium): Gfarm makes it possible to use iGAP to analyze the complete proteome (available 9/28/04) of the bacterium Burkholderia mallei, a known biothreat agent, on distributed resources. This is a collaboration under PRAGMA, and the data is available through http://eol.sdsc.edu.
  • Participating sites: SDSC/UCSD (US), BII (Singapore), Osaka Univ., AIST (Japan), Konkuk Univ., Kookmin Univ., KISTI (Korea)

  20. (Figure: the iGAP pipeline, from protein sequences to a data warehouse)
  • Inputs: protein sequences; structure info (SCOP, PDB); sequence info (NR, PFAM)
  • Prediction of: signal peptides (SignalP, PSORT), transmembrane regions (TMHMM, PSORT), coiled coils (COILS), low-complexity regions (SEG)
  • Building FOLDLIB: PDB chains, SCOP domains, PDP domains, CE matches PDB vs. SCOP; 90% sequence non-identical, minimum size 25 aa, coverage 90% (gaps < 30, ends < 30)
  • Step 1: structural assignment of domains by WU-BLAST
  • Step 2: structural assignment of domains by PSI-BLAST profiles on FOLDLIB
  • Step 3: structural assignment of domains by 123D on FOLDLIB
  • Step 4: functional assignment by PFAM, NR assignments
  • Step 5: domain location prediction by sequence
  • Step 6: data warehouse

  21. Cluster configuration of Worldwide iGAP/Gfarm data analysis

  22. Preliminary Performance Result
  • Multiple-cluster data analysis:
    • NFS, 4-node cluster A: 30.07 min
    • Gfarm, 4-node cluster A + 4-node cluster B: 17.39 min

  23. Development Status and Future Plan
  • Gfarm – Grid file system
    • Global virtual file system
    • A dependable network shared file system in a cluster or a Grid
    • High-performance data computing support
    • Associates the Computational Grid with the Data Grid
  • Gfarm Grid software
    • Version 1.1.1 released on May 17, 2005 (http://datafarm.apgrid.org/)
    • Version 1.2 available real soon now
    • Existing programs can access the Gfarm file system using the syscall hooking library or GfarmFS-FUSE
  • Distributed analysis shows scalable I/O performance
    • iGAP/Gfarm – bioinformatics package
    • Gaussian 03 – ab initio quantum chemistry package
  • Standardization effort with the GGF Grid File System WG (GFS-WG)
  https://datafarm.apgrid.org/
