COLA's Information Systems (2011)
Computing at COLA: Maximizing Productivity and Collaboration
• IT culture of service to scientists
• Focus is on in-house data analysis
• Large spinning disk volumes, ~400 TB
• Shared access to data
• Servers for data analysis
• GrADS
  • Primary tool for data analysis and visualization (see the sketch below)
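To illustrate the kind of shared-server workflow GrADS supports, the sketch below drives GrADS in batch mode from Python to produce a plot from data on the shared disk. This is a minimal sketch, not COLA's actual workflow: the script name, output file, and paths are hypothetical, and it assumes the standard grads executable and its -b (batch), -l (landscape), and -c (startup command) options are available on the analysis servers.

```python
"""Minimal sketch: run a GrADS script in batch mode on a shared analysis server.

Assumptions (not from the slides): the `grads` binary is on PATH, and
plot_sst.gs is a GrADS script that opens a dataset under /shared and
writes a PNG. All names here are hypothetical.
"""
import subprocess
from pathlib import Path

GRADS_SCRIPT = Path("plot_sst.gs")   # hypothetical GrADS script
OUTPUT_PNG = Path("sst_map.png")     # image the script is expected to write

def run_grads_batch(script: Path) -> None:
    # -b: batch (no interactive window), -l: landscape, -c: execute a command on startup
    cmd = ["grads", "-blc", f"run {script}"]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_grads_batch(GRADS_SCRIPT)
    if OUTPUT_PNG.exists():
        print(f"Wrote {OUTPUT_PNG}")
```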
COLA Data Center
• Network: Internet2 at 1 Gb/s; Internet 1 at 20 Mb/s; 1 Gb/s Ethernet within the data center
• Compute servers:
  • cpuX: 6 nodes @ 8 cores, 24-32 GB each
  • atlas1: 1 node @ 24 cores, 256 GB
  • colaXX: 2 nodes @ 8 cores, 32 GB each
• Storage, Infiniband (DDR) attached: /homes 8 TB, /data 250 TB, /shared 100 TB, /project 33 TB (see the usage-report sketch below)
• Backups: Quantum i6000 library, nightly backups, ~50 TB tape archive
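To make the storage layout concrete, here is a minimal sketch (not an actual COLA tool) that reports capacity and usage for the four filesystems named above using only the Python standard library. It assumes it runs on a host where those mount points are visible and simply skips any that are not.

```python
"""Minimal sketch: report capacity and usage for the COLA filesystems named above.

Assumes the script runs on a host that mounts these paths; any missing path
is skipped rather than treated as an error.
"""
import shutil
from pathlib import Path

# Mount points from the data-center slide
FILESYSTEMS = ["/homes", "/data", "/shared", "/project"]

def report(paths):
    tb = 1024 ** 4
    print(f"{'filesystem':<12} {'size (TB)':>10} {'used (TB)':>10} {'free (TB)':>10}")
    for fs in paths:
        if not Path(fs).exists():
            print(f"{fs:<12} (not mounted on this host)")
            continue
        usage = shutil.disk_usage(fs)
        print(f"{fs:<12} {usage.total / tb:>10.1f} {usage.used / tb:>10.1f} {usage.free / tb:>10.1f}")

if __name__ == "__main__":
    report(FILESYSTEMS)
```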
Offsite Resources
• NCAR (CISL)
  • Bluefire, Mirage
  • 20 TB project space
  • 448 TB tape archive
• Oak Ridge, NICS (XSEDE)
  • Kraken, Nautilus
  • Scratch-only disk space
  • 1.2 PB tape archive
• NASA Ames (NAS)
  • Pleiades
  • 1 TB project disk space
  • 325 TB tape archive
• Total: ~10 million CPU hours (see the inventory sketch below)
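For quick comparisons, the offsite inventory above can be kept as a small data structure. The sketch below is only an illustration, not an actual COLA inventory tool; the numbers are taken from this slide, and the tape-archive total is simply their sum.

```python
"""Minimal sketch: the offsite resource inventory from this slide as data.

Numbers come from the slide; the structure itself is an illustration only.
"""

OFFSITE = [
    {"site": "NCAR (CISL)", "systems": ["Bluefire", "Mirage"],
     "project_disk_tb": 20, "tape_archive_tb": 448},
    {"site": "Oak Ridge, NICS (XSEDE)", "systems": ["Kraken", "Nautilus"],
     "project_disk_tb": 0,  # scratch-only disk space
     "tape_archive_tb": 1200},
    {"site": "NASA Ames (NAS)", "systems": ["Pleiades"],
     "project_disk_tb": 1, "tape_archive_tb": 325},
]

TOTAL_CPU_HOURS = 10_000_000  # "~10 million CPU hours" across all three sites

if __name__ == "__main__":
    for s in OFFSITE:
        print(f"{s['site']}: {', '.join(s['systems'])} "
              f"({s['project_disk_tb']} TB project disk, {s['tape_archive_tb']} TB tape)")
    total_tape = sum(s["tape_archive_tb"] for s in OFFSITE)
    print(f"Total tape archive across sites: {total_tape} TB")
    print(f"Total allocation: ~{TOTAL_CPU_HOURS:,} CPU hours")
```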
Offsite Computing Trade-offs
• Resources are shared, not dedicated (multi-project / multi-purpose systems)
• Relatively small amounts of online disk space
• Disk and tape throughput issues
• Islands of disk
• Lack of cross-site data analysis capabilities
Onsite Computing Benefits
• Complete autonomy and flexibility
• Scientists have easy access to IS staff
• IS staff can quickly respond to science needs
• Fast, cost-effective, large spinning disk volume
• Subsets from all remote projects stored and analyzed locally (see the transfer sketch below)
• Shared data and analysis methods support broad COLA-wide collaborations
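The slide notes that subsets of remote project output are pulled back to COLA's local disk for analysis. The sketch below shows one way such a pull might be scripted with rsync over SSH, driven from the Python standard library; the remote host, username, paths, and file pattern are hypothetical placeholders, not the actual site configuration.

```python
"""Minimal sketch: pull a subset of remote project output to local shared disk.

The remote host, user, and directory names are hypothetical placeholders;
only the general pattern (rsync a subset into local /data) reflects the slide.
"""
import subprocess

REMOTE = "user@remote-hpc.example.org:/scratch/project/exp01/monthly/"  # hypothetical
LOCAL_DEST = "/data/exp01/monthly/"                                     # local spinning disk

def pull_subset(remote: str, dest: str, pattern: str = "*.nc") -> None:
    # -a: archive mode (recursive, preserves attributes), -v: verbose.
    # Include directories and files matching the pattern; exclude everything else.
    cmd = [
        "rsync", "-av",
        "--include", "*/",
        "--include", pattern,
        "--exclude", "*",
        remote, dest,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    pull_subset(REMOTE, LOCAL_DEST)
```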
Data Management
• New, more scalable design
  • Builds on collaboration with NCAR-CISL
• Isolate and curate frequently used static data (/shared, 100 TB)
• Active oversight by the Data Management Committee
  • Scientists make data management decisions
• Organize remaining data files and scripts (/project, 33 TB)
• Tape archive and retrieval
• Catalog shared data (see the catalog sketch below)
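As a concrete, hypothetical illustration of what cataloging the shared data might involve, the sketch below walks /shared and writes a flat CSV manifest of file paths, sizes, and modification times using only the standard library. The output location and the choice of fields are assumptions for illustration, not COLA's actual catalog format.

```python
"""Minimal sketch: build a flat CSV catalog of the /shared collection.

The output file name and the fields recorded are assumptions for
illustration; they do not describe COLA's actual catalog.
"""
import csv
import os
from datetime import datetime, timezone
from pathlib import Path

SHARED_ROOT = Path("/shared")
CATALOG_CSV = Path("shared_catalog.csv")  # hypothetical output file

def build_catalog(root: Path, out_csv: Path) -> None:
    with out_csv.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["path", "size_bytes", "modified_utc"])
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                p = Path(dirpath) / name
                try:
                    st = p.stat()
                except OSError:
                    continue  # skip files that vanish or are unreadable
                mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
                writer.writerow([str(p), st.st_size, mtime.isoformat()])

if __name__ == "__main__":
    build_catalog(SHARED_ROOT, CATALOG_CSV)
```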
Future of COLA Computing
• We are anticipating challenges from the next generation of data sets:
  • From the community: CFSRR, CMIP5, etc.
  • COLA-generated: ISI, Decadal, …
• The Athena project foreshadowed many obstacles
• A 3-year plan is now in place to meet these challenges
  • We expect to learn valuable lessons
• GrADS is an essential component of the strategy