Experiences with the FutureGrid Testbed • Shava Smallen, ssmallen@sdsc.edu • UC Cloud Summit, UCLA, April 19, 2011
FutureGrid • FutureGrid is an international testbed modeled on Grid'5000 • Track 2D award (4 years) - started in October 2009 • Supporting international Computer Science and Computational Science research in cloud, grid, and parallel computing (HPC) • Industry and academia • The FutureGrid testbed provides its users with: • A flexible development and testing platform for middleware and application users looking at interoperability, functionality, performance, or evaluation • Each use of FutureGrid is an experiment that is reproducible • A rich education and teaching platform for advanced cyberinfrastructure (computer science) classes
FutureGrid Partners (red institutions have FutureGrid hardware) • Indiana University (Architecture, core software, Support) • Purdue University (HTC Hardware) • San Diego Supercomputer Center at University of California San Diego (Inca, Monitoring) • University of Chicago/Argonne National Labs (Nimbus) • University of Florida (ViNe, Education and Outreach) • University of Southern California Information Sciences Institute (Pegasus) • University of Tennessee Knoxville (Benchmarking) • University of Texas at Austin/Texas Advanced Computing Center (Portal) • University of Virginia (OGF, Advisory Board and allocation) • Center for Information Services and GWT-TUD from Technische Universität Dresden (VAMPIR)
FutureGrid: a Grid/Cloud/HPC Testbed [Network diagram: FutureGrid sites connected by the FG network over private and public links; NID = Network Impairment Device]
5 Use Types for FutureGrid • ~100 approved projects over the last 6 months • Training, Education and Outreach • Semester-long and short events; promising for non-research-intensive universities • Interoperability testbeds • Grids and clouds; the Open Grid Forum (OGF) really needed this • Domain Science applications • Life science highlighted • Computer Science • Largest current category (> 50%) • Computer Systems Evaluation • TeraGrid (TIS, TAS, XSEDE), OSG, EGI
Fine-grained Application Energy Modeling – Catherine Olschanowsky (UCSD/SDSC) • PhD student in the CSE department at UCSD • Research: estimate the energy requirements for specific application-resource pairings • Method to collect fine-grained DC power measurements on HPC resources • Energy-centric benchmark infrastructure • Models • FutureGrid experiment: • Required bare-metal access to 1 node of Sierra for 2 weeks • Custom-made power monitoring harness attached to CPU and memory • WattsUp device connected to power [Photos: power monitoring harness attached to Sierra node; close-up of harness attachments]
TeraGrid QA Testing and Debugging – Shava Smallen (UCSD/SDSC) • Co-lead of TeraGrid Quality Assurance Working Group • GRAM 5 scalability testing • Emulated Science Gateway use • Created virtual cluster via Nimbus on Foxtrot for ~1 month • Discovered bug where a large log file was created in user's home dir • GridFTP 5 testing • Verified data synchronization and server offline mode • Created VM via Nimbus on Sierra and Foxtrot • Discovered small bug in synchronization [Chart: GRAM 5 scalability testing results run on a 4-node Nimbus cluster on Foxtrot]
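A minimal sketch of the kind of GRAM 5 submission loop such a scalability test might use, assuming a Globus GRAM5 gatekeeper on the head node of the Nimbus virtual cluster; the contact string and job count are illustrative assumptions:

    # Illustrative GRAM5 contact string for the virtual cluster's head node
    CONTACT=vm-head.foxtrot.futuregrid.org/jobmanager-pbs
    grid-proxy-init                                # obtain a grid proxy credential
    for i in $(seq 1 100); do                      # submit many short batch jobs
      globus-job-submit "$CONTACT" /bin/hostname
    done
    globus-job-status <job-contact>                # poll one job via its contact string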
Architecture Goals • Provide management capabilities for reproducible experiments • Conveniently define, execute, and repeat application or distributed/grid/cloud middleware experiments • Leverages dedicated network and a Spirent XGEM network fault and delay generator • Support diverse user community • Application developers, Middleware developers, System administrators, Educators, Application users • Support shifting technology base • Support diverse access models • Implemented using Open Source tools
Phase I – Static Partitions • HPC partition • Torque/Moab • Intel compilers, OpenMPI, IMPI • Persistent endpoints for Unicore and Genesis II • Eucalyptus and Nimbus deployments with the Xen hypervisor • One machine deployed with KVM (Alamo) – plan to migrate others based on performance analysis work* • Also plan to enable advanced instruction sets based on Magellan work * Andrew J. Younge et al., "Analysis of Virtualization Technologies for High Performance Computing Environments," The 4th International Conference on Cloud Computing (IEEE CLOUD), 2011
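A minimal sketch of using the HPC partition above: an OpenMPI job submitted through Torque/Moab; node counts, walltime, and the program name are illustrative assumptions:

    #!/bin/bash
    #PBS -N hello_mpi              # job name
    #PBS -l nodes=2:ppn=8          # 2 nodes x 8 cores (illustrative sizes)
    #PBS -l walltime=00:10:00      # 10-minute limit
    cd $PBS_O_WORKDIR              # run from the submission directory
    mpirun ./hello_mpi             # OpenMPI launches one rank per allocated core

Submit with qsub hello_mpi.pbs and watch the queue with qstat -u $USER (or Moab's showq).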
Phase I – Inca Monitoring [Screenshots: status of basic cloud tests; statistics displayed from HPCC performance measurements; VM instance creation times for Nimbus; history of HPCC performance]
Phase II – Image Management • Goal: support a growing image library for MPI, OpenMP, Hadoop, Dryad, gLite, Unicore, Globus, CTSS, etc. • For different hypervisors (Xen, KVM) and cloud tools (Eucalyptus, Nimbus) • Currently have prototypes for an image generator (fg-image-generate) and an image repository (fg-image-deploy) • Currently separate repositories for Nimbus and Eucalyptus deployments • CentOS, Fedora, Debian images • Grid appliances (Nimbus) for Hadoop and MPI
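A rough sketch of how a generated image might be registered with a Eucalyptus deployment using the standard euca2ools workflow; the bucket and file names are illustrative assumptions:

    # Bundle a generated image, upload it to a bucket, and register it with Eucalyptus
    euca-bundle-image -i centos-hadoop.img
    euca-upload-bundle -b fg-images -m /tmp/centos-hadoop.img.manifest.xml
    euca-register fg-images/centos-hadoop.img.manifest.xml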
FutureGrid Tutorials • Tutorial topic 1: Cloud Provisioning Platforms • Tutorial NM1: Using Nimbus on FutureGrid • Tutorial NM2: Nimbus One-click Cluster Guide • Tutorial GA6: Using the Grid Appliances to run FutureGrid Cloud Clients • Tutorial EU1: Using Eucalyptus on FutureGrid • Tutorial topic 2: Cloud Run-time Platforms • Tutorial HA1: Introduction to Hadoop using the Grid Appliance • Tutorial HA2: Running Hadoop on Eucalyptus • Tutorial TW1: Running Twister on Eucalyptus • Tutorial topic 3: Educational Virtual Appliances • Tutorial GA1: Introduction to the Grid Appliance • Tutorial GA2: Creating Grid Appliance Clusters • Tutorial GA3: Building an educational appliance from Ubuntu 10.04 • Tutorial GA4: Deploying Grid Appliances using Nimbus • Tutorial GA5: Deploying Grid Appliances using Eucalyptus • Tutorial GA7: Customizing and registering Grid Appliance images using Eucalyptus • Tutorial MP1: MPI Virtual Clusters with the Grid Appliances and MPICH2 • Tutorial topic 4: High Performance Computing • Tutorial VA1: Performance Analysis with Vampir • Tutorial VT1: Instrumentation and tracing with VampirTrace
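As a flavor of what Tutorial NM1 covers, launching and tearing down a single VM with the Nimbus cloud client looks roughly like this; the cloud configuration file, image name, and instance handle are illustrative assumptions:

    # Start a VM for one hour from an image in the Nimbus repository (names illustrative)
    ./bin/cloud-client.sh --conf conf/sierra.conf --run --name debian-tutorial.gz --hours 1
    # List running instances, then terminate one using the handle the client printed
    ./bin/cloud-client.sh --conf conf/sierra.conf --status
    ./bin/cloud-client.sh --conf conf/sierra.conf --terminate --handle vm-001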
More Information • FutureGrid website: http://portal.futuregrid.org • FutureGrid help: help@futuregrid.org • Feel free to also send any questions to me at ssmallen@sdsc.edu
FutureGrid modeled on Grid'5000 • Experimental testbed • Configurable, controllable, monitorable • Established in 2003 • 10 sites • 9 in France • Porto Alegre in Brazil • 5000+ cores http://futuregrid.org
Storage Hardware • Will add substantially more disk on-node, and more at IU and UF as shared storage
Network Impairment Device • Spirent XGEM Network Impairment Simulator for jitter, errors, delay, etc. • Full bidirectional 10G with 64-byte packets • Up to 15 seconds of introduced delay (in 16 ns increments) • 0-100% introduced packet loss in 0.0001% increments • Packet manipulation in first 2000 bytes • Up to 16k frame size • Tcl for scripting, HTML for human configuration • More easily replicable than keeping teenagers around the house…
FG RAIN Command • fg-rain -h hostfile -iaas nimbus -image img • fg-rain -h hostfile -paas hadoop … • fg-rain -h hostfile -paas dryad … • fg-rain -h hostfile -gaas gLite … • fg-rain -h hostfile -image img • Authorization is required to use fg-rain without virtualization.