
Chiba City: An Open Source Computer Science Testbed


Presentation Transcript


  1. Chiba City: An Open Source Computer Science Testbed
     http://www.mcs.anl.gov/chiba/
     Mathematics & Computer Science Division, Argonne National Laboratory

  2. The Chiba City Project
     • Chiba City is a Linux cluster built of 314 computers. It was installed at MCS in October of 1999.
     • The primary purpose of Chiba City is to be a scalability testbed, built from open source components, for the High Performance Computing and Computer Science communities.
     • Chiba City is a first step towards a many-thousand node system.

  3. Cluster Projects at MCS
     • Windows NT SuperCluster (1996-1998) (8 heterogeneous nodes) (r.i.p.)
     • Windows NT MPI porting project (8 nodes)
       • Microsoft supported (PIIs, SMP, 256 MB, 8 GB disk, Giganet)
     • Collage and ActiveMural (22 nodes)
       • Linux cluster to drive the displays (PIIs, 256 MB, 4 GB disk, display adapters)
     • Chiba City Test Clusters (4-16 nodes)
       • Various development environments
     • Chiba City Main System (300+ nodes)
       • 256 core nodes (dual PIIIs, 512 MB RAM, 9 GB disk, Myrinet, Gig-E)
       • 32 visualization nodes (PIIIs, 512 MB RAM, 9 GB disk, Myrinet, Matrox G200s)
       • 8 storage nodes (Xeon, 300 GB disk/node)
       • 18 management nodes (PIIs, multiple networks, 18 GB disk)
     • The DSL Data Grid Node (20 nodes, 4 TB aggregate)
       • PIII, 512 MB RAM, 200 GB disk, Fast and Gig-E
     • Alpha Cluster for Computational Biology
       • 18 XP1000s (633 MHz, EV6), 512 MB RAM, 10 GB disk, ServerNetII, management node

  4. Cluster-Related Research at MCS
     • Scalable Systems Management
       • Chiba City management model (w/ LANL, LBNL)
       • Msys and City software for Linux clusters and large Unix environments
     • MPI and Communications Software
       • MPICH
       • GigaNet, Myrinet, ServerNetII
     • Data Management and Grid Services
       • Globus services on Linux (w/ LBNL, ISI)
     • Visualization and Collaboration Tools
       • Parallel OpenGL server (w/ Princeton, UIUC)
       • VTK and CAVE software for Linux clusters
       • Scalable media server (FL Voyager Server on a Linux cluster)
     • Scalable Display Environment and Tools
       • Virtual frame buffer software (w/ Princeton)
       • VNC (AT&T) modifications for the ActiveMural
     • Parallel I/O
       • MPI-IO and parallel filesystem development (w/ Clemson, PVFS)
     • Plus many other MCS research projects that focus on parallel computing in general, with clusters as a specific case: PETSc, ALICE, ADIFOR, NEOS, ...
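
For orientation, the kind of message-passing program that MPICH runs across the cluster's compute nodes looks roughly like the sketch below. This is a generic MPI example for illustration only, not code drawn from the Chiba City or Msys software.

    /* Minimal MPI program of the kind MPICH runs on the compute nodes.
     * Generic illustration only; not part of the Chiba City software. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;

        MPI_Init(&argc, &argv);                /* start the MPI runtime     */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

        printf("Hello from rank %d of %d\n", rank, size);

        MPI_Finalize();                        /* shut down cleanly */
        return 0;
    }

A program like this is typically compiled with mpicc and launched across the nodes with mpirun; each rank then exchanges messages over the cluster's Myrinet or Ethernet fabric.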

  5. Chiba City User Community
     • Computer Scientists
     • Computational Scientists
     • Industry and Educational Partners
     • Open Source development groups

  6. Chiba City Design Goals
     • Support Computer Scientists and Open Source developers
       • Dedicated visualization nodes
       • Dedicated storage nodes
       • Extremely flexible node configuration and recovery
     • Support Computational Science
       • Computation nodes for message-passing parallel applications
       • High performance network
       • Production environment: reliable system, scheduled and allocated projects
     • Prototype Scalable Systems Software
       • Management fabric
       • Hierarchical, central management system
       • Database-driven configuration

  7. Chiba City: The Argonne Scalable Cluster (27 Sep 1999)
     • 8 computing towns: 256 dual Pentium III systems
     • 1 storage town: 8 Xeon systems with 300 GB disk each
     • 1 visualization town: 32 Pentium III systems with Matrox G400 cards
     • Cluster management: 12 PIII mayor systems, 4 PIII front end systems, 2 Xeon file servers with 3.4 TB disk
     • Management net: Gigabit and Fast Ethernet, with a Gigabit external link
     • High performance net: 64-bit Myrinet

  8. Chiba Computing Systems
     • A “town” is the basic cluster building unit:
       • 8-32 systems, for the actual work
       • 1 mayor, for management: OS loading, monitoring, file service
       • Network and management gear
     • 8 compute towns: 32 dual PIII 500 compute nodes each, which run user jobs
     • 1 storage town: 8 Xeon systems with 300 GB disk, for storage-related research and eventually for production global storage
     • 1 visualization town: 32 nodes for dedicated visualization experiments
     [Diagram: a single town, with one mayor managing 32 nodes]
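
To make the mayor/node hierarchy concrete, the sketch below models a town as a plain data structure. The struct names, fields, and example values are hypothetical illustrations of the relationship described above (and of the database-driven configuration mentioned in the design goals), not the actual City/Msys data model.

    /* Hypothetical sketch of the town hierarchy: one mayor manages up to
     * 32 nodes. Names, fields, and values are illustrative, not the real schema. */
    #include <stdio.h>

    #define MAX_NODES_PER_TOWN 32

    struct node {
        char hostname[64];   /* a system that does the actual work    */
        char os_image[64];   /* image the mayor loads onto the node   */
        int  up;             /* last known status from monitoring     */
    };

    struct town {
        char        mayor[64];                  /* management host for this town */
        struct node nodes[MAX_NODES_PER_TOWN];  /* 8-32 worker systems           */
        int         node_count;
    };

    /* A mayor-side tool might walk its town's records like this. */
    static void report(const struct town *t)
    {
        for (int i = 0; i < t->node_count; i++)
            printf("%s: %s (image %s)\n", t->nodes[i].hostname,
                   t->nodes[i].up ? "up" : "down", t->nodes[i].os_image);
    }

    int main(void)
    {
        struct town t = { .mayor = "mayor1", .node_count = 1,
                          .nodes = { { "node001", "compute-image", 1 } } };
        report(&t);
        return 0;
    }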

  9. Chiba Networks
     • High Performance Network (not shown in diagram): 64-bit Myrinet. All systems connected, flat topology.
     • Management Network: switched Fast and Gigabit Ethernet, primarily for builds and monitoring.
       • Fast Ethernet: each individual node
       • Bonded Gigabit Ethernet: mayors, servers, & login nodes; town interconnects; external links
     • IP topology: 1 flat IP subnet
     [Diagram: mayors and nodes on per-town Fast Ethernet switches, tied together by a Gigabit Ethernet backbone that also reaches the control systems, front ends, test clusters, and the ANL network]

  10. Chiba Deployment Schedule
     • The Chiba City Barnraising - October 1999
     • SC99 - November 1999
     • System shakedown, Myrinet installation - December 1999
     • Early users - January 2000
     • Production mode - Spring 2000

  11. MPI and Parallel I/O on Chiba
     • Using MPICH, ROMIO, and PVFS (Clemson)
     • Redesigning MPICH internals for better scalability, for faster networking technologies (VIA, Myrinet), and for MPI-2
     • PVFS servers will run on the storage town
     • Collaborating with Clemson on PVFS:
       • scalability and reliability
       • eliminated the dependency on NFS for metadata storage
       • redesigning PVFS to support faster communication mechanisms
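
As an illustration of the I/O path described above, a minimal MPI-IO write of the kind ROMIO implements (with PVFS able to serve as the filesystem underneath) might look like the following sketch. The file path is a placeholder, and the code is generic MPI-2 I/O usage rather than Chiba-specific software.

    /* Each rank writes its own block of a shared file through MPI-IO.
     * ROMIO provides the MPI-IO implementation; PVFS can sit underneath.
     * The file path below is a placeholder. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, buf[1024];
        MPI_File fh;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        for (int i = 0; i < 1024; i++)
            buf[i] = rank;                      /* fill this rank's block */

        MPI_File_open(MPI_COMM_WORLD, "/pvfs/scratch/testfile",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Each rank writes at its own offset, so the writes never overlap. */
        MPI_File_write_at(fh, (MPI_Offset)rank * sizeof(buf),
                          buf, 1024, MPI_INT, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }

Because each rank writes at a disjoint offset, the writes need no coordination; collective calls such as MPI_File_write_at_all could be substituted when the access pattern benefits from collective buffering.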

  12. Chiba City: An Open Source Computer Science Testbed
      http://www.mcs.anl.gov/chiba/
      Mathematics & Computer Science Division, Argonne National Laboratory
