
A Proposal of Capacity and Performance Assured Storage in The PRAGMA Grid Testbed




Presentation Transcript


  1. A Proposal of Capacity and Performance Assured Storage in The PRAGMA Grid Testbed
  Yusuke Tanimura 1), Hidetaka Koie 1,2), Tomohiro Kudoh 1), Isao Kojima 1), and Yoshio Tanaka 1)
  1) National Institute of Advanced Industrial Science and Technology (AIST), Japan
  2) SURIGIKEN Co., Ltd.

  2. Background
  • Support of data-intensive scientific applications is one of the challenges of the PRAGMA grid testbed work.
    • Avian Flu Grid applications
    • Geosciences applications
    • GLEON/CLEON applications
  • Gfarm provides ...
    • Global file access using a single namespace
      • POSIX-like
    • Efficient file replication among sites
      • Competitive with GridFTP
    • Excellent performance with data access locality
      • Each process simply accesses a local disk drive.
  • However, storage resource management is still missing from the viewpoint of resource sharing.

  3. Problem 1
  • Most PRAGMA sites are multi-tenant, and a shared storage tends to become a performance bottleneck.
    • The required aggregate performance is not achieved.
    • Access conflicts occur at some of the storage servers.
  [Figure: at each PRAGMA site, Application A (a parallel job) and Application B (striping I/O) run on the compute servers of a cluster and contend for a shared storage system (NFS, PVFS, Lustre, etc.).]
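The contention above can be made concrete with a toy max-min fair-sharing model (illustrative only; the function and the numbers are not from Papio): when two applications' combined demand exceeds a shared server's bandwidth, neither application gets the throughput it needs.

```python
def max_min_share(capacity_mb_s, demands_mb_s):
    """Split a server's bandwidth among contending streams by max-min
    fairness: process demands smallest-first, each stream receives
    min(its demand, an equal share of what remains)."""
    order = sorted(range(len(demands_mb_s)), key=lambda i: demands_mb_s[i])
    alloc = [0.0] * len(demands_mb_s)
    remaining = float(capacity_mb_s)
    left = len(demands_mb_s)
    for i in order:
        share = remaining / left
        alloc[i] = min(demands_mb_s[i], share)
        remaining -= alloc[i]
        left -= 1
    return alloc

# A 200 MB/s server shared by a parallel job wanting 120 MB/s and a
# striping application wanting 150 MB/s: both fall short of their demand.
print(max_min_share(200, [120, 150]))  # → [100.0, 100.0]
```

With a reservation-based storage such as the proposed Papio, one application's reserved throughput would instead be protected from the other's load.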

  4. Problem 2
  • Using a remote storage over the Internet is required in some use cases, but ...
    • The remote storage is not fast enough, its performance is unpredictable, or its disk space is insufficient.
  • On the other hand, high-bandwidth or bandwidth-guaranteed dynamic networks (e.g., lambda paths) are available.
  [Figure: a client at Site A reaches storage at Sites B and C over a lambda-path network; the performance and disk space of the remote storage are uncertain.]

  5. Our proposed storage (Papio)
  • Allows users to reserve performance in advance
    • Specify date, time, and read/write throughput.
    • For write access, disk space can be reserved, too.
    • During the reserved time, the storage servers are dedicated to the user, or the user's accesses are prioritized (SLA).
  • Uses existing I/O control techniques for prioritization
    • Disk I/O scheduling
      • Expect stable disk throughput, or performance prediction, when using flash disks (e.g., SSDs).
    • Flow control of the I/O path
    • Reserved buffer cache on storage servers
  • Reservation interface
    • Provides a special command and a Web-services based interface.
    • Collocation with network resources is also supported.
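A rough sketch of the information such a reservation would carry follows; the field names and the `StorageReservation` class are hypothetical illustrations, not Papio's actual command syntax or Web-services schema.

```python
from dataclasses import dataclass, asdict

@dataclass
class StorageReservation:
    """Hypothetical reservation record: date/time window, access type,
    throughput, and (for writes) reserved disk space."""
    user: str
    start: str            # ISO 8601, e.g. "2010-06-01T09:00"
    end: str
    access: str           # "read" or "write"
    throughput_mb_s: int  # reserved sequential throughput
    capacity_gb: int = 0  # disk space; meaningful only for write access

    def to_request(self):
        # Per the slide, a write reservation also reserves disk space.
        if self.access == "write" and self.capacity_gb <= 0:
            raise ValueError("write reservations must also reserve disk space")
        return asdict(self)

r = StorageReservation("alice", "2010-06-01T09:00", "2010-06-01T12:00",
                       "write", throughput_mb_s=420, capacity_gb=500)
request = r.to_request()
```

The request dictionary stands in for whatever the special command or Web-services call would actually transmit.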

  6. Our proposed storage (Papio)
  [Architecture diagram: Papio is deployed in a single site. An application on a client node sends a reserve request, either by command or by a Web-services based protocol (GNS-WSI3), to the Storage Resource Manager (SRM). The SRM's management server runs a reservation management service and a file metadata service, and administrates I/O controls on the storage servers according to the reservations: disk I/O control by dm-ioband and flow control by PSPacer. For collocation with network resources, a Global Resource Coordinator coordinates the SRM and the Network Resource Manager over the Web-services based protocol.]
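The flow control applied on the I/O path amounts to pacing traffic to a reserved rate. A minimal token-bucket model conveys the idea (a generic sketch only; PSPacer's actual mechanism, precise software pacing via gap packets, is not reproduced here):

```python
class TokenBucket:
    """Toy rate limiter: data may be sent only as fast as tokens
    accumulate at `rate` bytes/second, up to a `burst` allowance."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0            # timestamp of the last check

    def allow(self, now, nbytes):
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

tb = TokenBucket(rate_bytes_per_s=100, burst_bytes=100)
tb.allow(0.0, 100)   # the initial burst goes through
tb.allow(0.0, 1)     # immediately after, the bucket is empty: denied
tb.allow(1.0, 100)   # one second later, tokens have refilled: allowed
```

In Papio's setting, the reserved read/write throughput would play the role of the bucket's refill rate on each storage server's I/O path.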

  7. Resource allocation
  • Papio allocates the storage resources (disk space, I/O paths, etc.) to each application according to its reservation.
  [Figure: example allocation for applications such as MPI-IO jobs and virtual clusters requiring high throughput, across seven storage servers (four with 200 MB/s, three with 100 MB/s). Application A reserves 60 MB/s for each of four processes; Application B reserves 150 MB/s; Application C reserves 140 MB/s on each of three servers, 420 MB/s in total.]
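Placing reserved streams onto servers with limited bandwidth is essentially a bin-packing problem. A minimal first-fit-decreasing sketch, using the figure's numbers, shows one way an allocator could check feasibility (this is a hypothetical illustration, not Papio's actual allocation algorithm):

```python
def allocate(server_bw, stream_bw):
    """Place each reserved stream on the first server with enough
    unreserved bandwidth, largest streams first (first-fit decreasing).
    Returns the residual bandwidth per server, or None if infeasible."""
    residual = list(server_bw)
    for need in sorted(stream_bw, reverse=True):
        for i, free in enumerate(residual):
            if free >= need:
                residual[i] -= need
                break
        else:
            return None  # no server can host this stream
    return residual

# Seven servers (four at 200 MB/s, three at 100 MB/s) vs. the figure's
# reservations: 4 x 60 MB/s (App A), 150 MB/s (App B), 3 x 140 MB/s (App C).
servers = [200, 200, 200, 200, 100, 100, 100]
streams = [60, 60, 60, 60, 150, 140, 140, 140]
print(allocate(servers, streams))  # a residual list: the plan is feasible
```

If the streams cannot all be placed, the reservation request would have to be rejected or negotiated, which is exactly what advance reservation makes possible.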

  8. Current status and future work
  • A prototype implementation is planned to be completed by this summer.
    • Not for production, but for tests and demonstration
  • Performance guarantee is challenging.
    • We first support:
      • Dedicated use (and will then try to support prioritized use)
      • Sequential read throughput (MB/sec) reservation
        • Write access is more complicated.
    • We need to study how fine a performance granularity can be guaranteed.
  • If someone is interested in this, we can deploy the software on the AIST cluster for experimental use around PRAGMA 19.

  9. Acknowledgement
  • Part of this research is supported by the Special Coordination Funds for Promoting Science and Technology of the Ministry of Education, Culture, Sports, Science and Technology (MEXT) in Japan.
