Presentation Transcript


  1. The NERSC Global File System NERSC June 12th, 2006

  2. Overview • NGF: What/Why/How • NGF Today • Architecture • Who’s Using It • Problems/Solutions • NGF Tomorrow • Performance Improvements • Reliability Enhancements • New Filesystems (/home)

  3. What is NGF?

  4. NERSC Global File System - what • What do we mean by a global file system? • Available via standard APIs for file system access on all NERSC systems • POSIX • MPI-IO • We plan to extend that access to remote sites via future enhancements • High performance • NGF is seen as a replacement for our current file systems and is expected to meet the same high performance standards
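To make the two access paths concrete, here is a minimal sketch (the path /project/myproj/demo.dat is hypothetical) of writing the same NGF file first through the POSIX interface and then through MPI-IO; on NGF both routes reach the same globally mounted file system.

```c
/* Minimal sketch: the same NGF file reached through POSIX and MPI-IO.
 * The path /project/myproj/demo.dat is hypothetical.
 * Build with an MPI compiler wrapper, e.g. mpicc demo.c -o demo */
#include <fcntl.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    char buf[] = "hello NGF\n";
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* POSIX: plain open/write/close, done by rank 0 only. */
    if (rank == 0) {
        int fd = open("/project/myproj/demo.dat",
                      O_CREAT | O_WRONLY | O_TRUNC, 0644);
        write(fd, buf, sizeof buf - 1);
        close(fd);
    }
    MPI_Barrier(MPI_COMM_WORLD);

    /* MPI-IO: all ranks append to the same file collectively, in rank order. */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "/project/myproj/demo.dat",
                  MPI_MODE_WRONLY | MPI_MODE_APPEND, MPI_INFO_NULL, &fh);
    MPI_File_write_ordered(fh, buf, (int)(sizeof buf - 1), MPI_CHAR,
                           MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}
```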

  5. NERSC Global File System - why • Increase user productivity • Reduce users’ data management burden • Enable/simplify workflows involving multiple NERSC computational systems • Accelerate the adoption of new NERSC systems • Users have access to all of their data, source code, scripts, etc. the first time they log into the new machine • Enable more flexible/responsive management of storage • Increase capacity/bandwidth on demand

  6. NERSC Global File System - how • Parallel • Network/SAN heterogeneous access model • Multi-platform (AIX/Linux for now)

  7. NGF Today

  8. NGF current architecture • NGF is a GPFS file system using GPFS multi-cluster capabilities • Mounted on all NERSC systems as /project • External to all NERSC computational clusters • Small Linux server cluster managed separately from the computational systems • 70 TB user-visible storage, 50+ million inodes • 3 GB/s aggregate bandwidth
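For a rough look at those file-system-wide figures from any client, the sketch below calls the standard POSIX statvfs() on the /project mount point; note that it reports whole-file-system capacity and inode counts, not per-project quota usage.

```c
/* Sketch: report size and inode counts of the mounted /project file system
 * using POSIX statvfs().  These are file-system-wide figures, not quotas. */
#include <stdio.h>
#include <sys/statvfs.h>

int main(void)
{
    struct statvfs vfs;
    if (statvfs("/project", &vfs) != 0) {
        perror("statvfs /project");
        return 1;
    }

    double total_tb = (double)vfs.f_blocks * vfs.f_frsize / 1e12;
    double avail_tb = (double)vfs.f_bavail * vfs.f_frsize / 1e12;

    printf("capacity: %.1f TB, available: %.1f TB\n", total_tb, avail_tb);
    printf("inodes:   %llu total, %llu free\n",
           (unsigned long long)vfs.f_files,
           (unsigned long long)vfs.f_ffree);
    return 0;
}
```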

  9. NGF Current Configuration

  10. /project • Limited initial deployment - no homes, no /scratch • Projects can include many users, potentially using multiple systems (mpp, vis, …), and seemed to be prime candidates to benefit from the NGF shared data access model • Backed up to HPSS bi-weekly • Will eventually receive nightly incremental backups • Default project quota: • 1 TB • 250,000 inodes

  11. /project – 2 • Current usage • 19.5 TB used (28% of capacity) • 2.2 M inodes used (5% of capacity) • NGF /project is currently mounted on all major NERSC systems (1240+ clients): • Jacquard, LNXI Opteron System running SLES 9 • Da Vinci, SGI Altix running SLES 9 Service Pack 3 with direct storage access • PDSF, IA32 Linux cluster running Scientific Linux • Bassi, IBM Power5 running AIX 5.3 • Seaborg, IBM SP running AIX 5.2

  12. /project – Problems & Solutions • /project has not been without its problems • Software bugs • 2/14/06 outage due to Seaborg gateway crash – problem reported to IBM; new PTF with fix installed • GPFS on AIX 5.3 ftruncate() error on compiles – problem reported to IBM; efix now installed on Bassi • Firmware bugs • Fibre Channel switch bug – firmware upgraded • DDN firmware bug (triggered on rebuild) – firmware upgraded • Hardware failures • Dual disk failure in a RAID array – more exhaustive monitoring of disk health, including soft errors, now in place
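For context on the compile-time issue, the sketch below shows the kind of open/write/ftruncate() sequence that compilers and linkers issue on their output files, which is where the AIX 5.3 error surfaced; the path is hypothetical and this is not the actual reproducer sent to IBM.

```c
/* Illustrative only: a truncate-on-write pattern typical of compiler and
 * linker output files.  The path is hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/project/myproj/a.out", O_CREAT | O_RDWR, 0755);
    if (fd < 0) { perror("open"); return 1; }

    const char stub[] = "placeholder object file contents";
    if (write(fd, stub, sizeof stub - 1) < 0)
        perror("write");

    /* Shrink the file back; this is the call that was returning an error. */
    if (ftruncate(fd, 16) != 0)
        perror("ftruncate");

    close(fd);
    return 0;
}
```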

  13. NGF – Solutions • General actions taken to improve reliability • Pro-active monitoring – catch problems before they become problems • Procedural development – decrease time to problem resolution; perform maintenance without outages • Operations staff activities – decrease time to problem resolution • PMRs filed and fixes applied – prevent problem recurrence • Replacing old servers – remove hardware with demonstrated low MTBF • NGF availability since 12/1/05: ~99% (total downtime: 2,439 minutes)

  14. Current Project Information • Projects using the /project file system (46 projects to date): • narccap: North American Regional Climate Change Assessment Program – Phil Duffy, LLNL • Currently using 4.1 TB • Global model with fine resolution in 3D and time; will be used to drive regional models • Currently using only Seaborg • mp107: CMB Data Analysis – Julian Borrill, LBNL • Currently using 2.9 TB • Concerns about quota management and performance • 16 different file groups

  15. Current Project Information • Projects using /project file system (cont.): • incite6: Molecular Dynameomics – Valerie Daggett, UW • Currently using 2.1 TB • snaz: Supernova Science Center – Stan Woosley, UCSC • Currently using 1.6 TB

  16. Other Large Projects

  17. NGF Performance • Many users have reported good performance for their applications (little difference from /scratch) • Some applications show variability in read performance (MADCAP/MADbench) – we are actively investigating this

  18. MADbench Results

  19. Bassi Read Performance

  20. Bassi Write Performance

  21. Current Architecture Limitations • NGF performance is limited by the architecture of current NERSC systems • Most NGF I/O uses the GPFS TCP/IP storage access protocol • Only Da Vinci can access NGF storage directly via FC • Most NERSC systems have limited IP bandwidth outside of the cluster interconnect • 1 Gig-E per I/O node on Jacquard; each compute node uses only 1 I/O node for NGF traffic; 20 I/O nodes feed into 1 10 Gb Ethernet link • Seaborg has 2 gateways with 4x Gig-E bonds; again, each compute node uses only 1 gateway • Bassi nodes each have 1 Gig-E interfaces, all feeding into a single 10 Gb Ethernet link

  22. NGF Tomorrow (and beyond …)

  23. Performance Improvements • NGF client system performance upgrades • Increase client bandwidth to NGF via hardware and routing improvements • NGF storage fabric upgrades • Increase the bandwidth and port count of the NGF storage fabric to support future systems • Replace old NGF servers • New servers will be more reliable • 10 Gig Ethernet capable • New systems will be designed to support high performance to NGF

  24. NGF /home • We will deploy a shared /home file system in 2007 • Initially home for only 1 system; it may be mounted on others • New systems thereafter will all have home directories on NGF /home • Will be a new file system with tuning parameters configured for small-file accesses

  25. /home layout – decision slide • Two options: • A user’s login directory is the same for all systems • /home/matt/ • A user’s login directory is a different subdirectory of the user’s directory for each system • /home/matt/seaborg • /home/matt/jacquard • /home/matt/common • /home/matt/seaborg/common -> ../common
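To make the second option concrete, here is a minimal sketch of how the per-system layout for the hypothetical user matt from the slide could be created at account-setup time, with one subdirectory per system and a shared common directory symlinked into each of them; the system list is illustrative.

```c
/* Sketch: create the per-system /home layout for a hypothetical user.
 * The system names are illustrative. */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *systems[] = { "seaborg", "jacquard", "bassi", "davinci" };
    char path[256];

    mkdir("/home/matt", 0755);
    mkdir("/home/matt/common", 0755);

    for (size_t i = 0; i < sizeof systems / sizeof systems[0]; i++) {
        snprintf(path, sizeof path, "/home/matt/%s", systems[i]);
        mkdir(path, 0755);

        /* e.g. /home/matt/seaborg/common -> ../common */
        snprintf(path, sizeof path, "/home/matt/%s/common", systems[i]);
        symlink("../common", path);
    }
    return 0;
}
```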

  26. One directory for all • Users see exactly the same thing in their home directory every time they log in, no matter what machine they’re on • Problems • Programs sometimes change the format of their configuration files (dotfiles) from one release to another without changing the file’s name • Setting $HOME affects all applications, not just the one that needs different config files • Programs have been known to use getpwnam() to determine the user’s home directory and look there for config files rather than in $HOME • Setting $HOME essentially emulates the effect of having separate home directories for each system
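The last two points hinge on the fact that the two lookup paths can disagree; the sketch below prints both the $HOME environment variable and the password-database entry (here via getpwuid(), the uid-based sibling of getpwnam()), and an application that uses the latter will ignore any per-system $HOME setting.

```c
/* Sketch: the two ways an application can locate a "home" directory. */
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* 1. The environment variable, which a user or site can redefine. */
    const char *env_home = getenv("HOME");

    /* 2. The password-database entry, which ignores $HOME entirely. */
    struct passwd *pw = getpwuid(getuid());   /* or getpwnam("matt") */
    const char *pw_home = pw ? pw->pw_dir : NULL;

    printf("$HOME          : %s\n", env_home ? env_home : "(unset)");
    printf("password entry : %s\n", pw_home ? pw_home : "(unknown)");

    /* A program using the second result looks for its dotfiles in the
     * password-database directory even if $HOME points elsewhere. */
    return 0;
}
```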

  27. One directory per system • By default, users start in a different directory on each system • Dotfiles are different on each system unless the user uses symbolic links to make them the same • All of a user’s files are accessible from all systems, but a user may need to “cd ../seaborg” to get at files created on Seaborg when logged into a different system

  28. NGF /home conclusion • We currently believe that the multiple-directories option will result in fewer problems for users, but we are actively evaluating both options • We would welcome user input on the matter

  29. NGF /scratch • We plan to deploy a shared /scratch for NERSC-5 sometime in 2008
