FutureGrid Design and Implementation of a National Grid Test-Bed

David Hancock – [email protected]

HPC Manager - Indiana University

Hardware & Network Lead - FutureGrid

IU in a nutshell
  • $1.7B Annual Budget, >$100M annual IT budget
  • Recent credit upgrade to AAA
  • One university with
    • 8 campuses
    • 107,000 students
    • 3,900 faculty
  • Nation’s 2nd largest school of medicine
  • Serious HPC since the 1990s
  • Research staff increased from 30 to 120 since 1995
  • 50/50 Split in base and grant funding
  • Large scale projects: TeraGrid, Open Science Grid (ATLAS Tier2 center), PolarGrid, Data Capacitor
  • New Data Center opened in 2009

Research Technologies

NSF Track Overview
  • Track 1 – NCSA Blue Waters
  • Track 2a – TACC Ranger
  • Track 2b – NICS Kraken
  • Track 2d
    • Data Intensive High Performance System (SDSC)
    • Experimental High Performance System (GaTech)
    • Experimental High Performance Test-Bed (IU)
  • The goal of FutureGrid is to support research on the future of distributed, grid, and cloud computing.
  • FutureGrid will build a robustly managed simulation environment and test-bed to support the development and early use in science of new technologies at all levels of the software stack: from networking to middleware to scientific applications.
  • The environment will mimic TeraGrid and/or general parallel and distributed systems – FutureGrid is part of TeraGrid and one of two experimental TeraGrid systems (other is GPU)
  • This test-bed will succeed if it enables major advances in science and engineering through collaborative development of science applications and related software.
  • FutureGrid is a (small, 5,400-core) Science/Computer Science Cloud, but it is more accurately a virtual machine-based simulation environment
FutureGrid Partners
  • Indiana University (Architecture, core software, Support)
  • Purdue University (HTC Hardware)
  • San Diego Supercomputer Center at University of California San Diego (INCA, Performance Monitoring)
  • University of Chicago/Argonne National Labs (Nimbus)
  • University of Florida (ViNe, Education and Outreach)
  • University of Southern California Information Sciences Institute (Pegasus to manage experiments)
  • University of Tennessee Knoxville (Benchmarking)
  • University of Texas at Austin/Texas Advanced Computing Center (Portal)
  • University of Virginia (OGF, User Advisory Board)
  • Center for Information Services and GWT-TUD from Technische Universität Dresden (VAMPIR)
  • Blue institutions host FutureGrid hardware
Other Important Collaborators
  • Early users from an application and computer science perspective and from both research and education
  • Grid5000 and D-Grid in Europe
  • Commercial partners such as
    • Eucalyptus ….
    • Microsoft (Dryad + Azure)
    • Application partners
  • NSF
  • TeraGrid – Tutorial at TG10
  • Open Grid Forum – Possible BoF
  • Possibly OpenNebula, Open Cirrus Testbed, Open Cloud Consortium, Cloud Computing Interoperability Forum, IBM-Google-NSF Cloud, and other DoE/NSF/… clouds
FutureGrid Timeline
  • October 2009 – Project Starts
  • November 2009 – SC09 Demo
  • January 2010 – Significant Hardware installed
  • April 2010 – First Early Users
  • May 2010 – FutureGrid network complete
  • August 2010 – FutureGrid Annual Meeting
  • September 2010 – All hardware, except shared memory system, available
  • October 2011 – FutureGrid allocatable via TeraGrid process – first two years by user/science board
FutureGrid Usage Scenarios
  • Developers of end-user applications who want to create new applications in cloud or grid environments, including analogs of commercial cloud environments such as Amazon or Google.
    • Is a Science Cloud for me? Is my application secure?
  • Developers of end-user applications who want to experiment with multiple hardware environments.
  • Grid/Cloud middleware developers who want to evaluate new versions of middleware or new systems.
  • Networking researchers who want to test and compare different networking solutions in support of grid and cloud applications and middleware.
  • Education as well as research
  • Interest in performance testing means that bare-metal images are important
Storage Hardware
  • FutureGrid has a dedicated network (except to TACC) and a network fault and delay generator
  • Experiments can be isolated by request
  • Additional partner machines may run FutureGrid software and be supported (but allocated in specialized ways)
System Milestones
  • New Cray System (xray)
    • Delivery: January 2010
    • Acceptance: February 2010
    • Available for Use: April 2010
  • New IBM Systems (india)
    • Delivery: January 2010
    • Acceptance: March 2010
    • Available for Use: May 2010
  • Dell System (tango)
    • Delivery: April 2010
    • Acceptance: June 2010
    • Available for Use: July 2010
  • Existing IU iDataPlex (sierra)
    • Move to SDSC: January 2010
    • Available for Use: April 2010
  • Storage Systems (Sun & DDN)
    • Delivery: December 2009
    • Acceptance: January 2010
Network Impairments Device
  • Spirent XGEM Network Impairments Simulator for jitter, errors, delay, etc.
  • Full bidirectional 10G with 64-byte packets
  • Up to 15 seconds of introduced delay (in 16 ns increments)
  • 0-100% introduced packet loss in 0.0001% increments
  • Packet manipulation in the first 2000 bytes
  • Up to 16k frame size
  • TCL for scripting, HTML for manual configuration
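Taken together, these settings define a very large impairment configuration space; a quick back-of-the-envelope check of the granularity (figures taken straight from the bullets above, Python used only for the arithmetic):

```python
# Granularity of the XGEM impairment settings listed above
# (slide figures; this is illustrative arithmetic only).
MAX_DELAY_NS = 15 * 10**9   # up to 15 s of introduced delay
DELAY_STEP_NS = 16          # configurable in 16 ns increments
delay_settings = MAX_DELAY_NS // DELAY_STEP_NS   # 937,500,000 distinct values

STEPS_PER_PERCENT = 10_000  # 0.0001% increments => 10,000 steps per percent
loss_settings = 100 * STEPS_PER_PERCENT          # 1,000,000 distinct loss levels

print(delay_settings, loss_settings)
```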
Network Milestones
  • December 2009
    • Setup and configuration of core equipment at IU
    • Juniper EX 8208
    • Spirent XGEM
  • January 2010
    • Core equipment relocated to Chicago
    • IP addressing & AS #
  • February 2010
    • Coordination with local networks
    • First Circuits to Chicago Active
  • March 2010
    • Peering with TeraGrid & Internet2
  • April 2010
    • NLR Circuit to UFL (via FLR) Active
  • May 2010
    • NLR Circuit to SDSC (via CENIC) Active
Global NOC Background
  • ~65 total staff
  • Service Desk: proactive & reactive monitoring 24x7x365, coordination of support
  • Engineering: All operational troubleshooting
  • Planning/Senior Engineering: Senior Network engineers dedicated to single projects
  • Tool Developers: Developers of GlobalNOC tool suite
Supported Projects



FutureGrid Architecture
  • The open architecture allows resources to be configured based on images
  • Managed images allow the creation of similar experiment environments
  • Experiment management allows reproducible activities
  • Through our modular design we allow different clouds and images to be “rained” upon hardware.
  • Will support deployment of preconfigured middleware including TeraGrid stack, Condor, BOINC, gLite, Unicore, Genesis II
Software Goals
  • Open-source, integrated suite of software to
    • instantiate and execute grid and cloud experiments
    • perform an experiment
    • collect the results
  • Tools for instantiating a test environment
    • TORQUE, Moab, xCAT, bcfg, Pegasus, Inca, ViNe, and a number of other tools from our partners and the open source community
    • A portal to interact
  • Benchmarking


Command line
  • fg-deploy-image – deploys an image on a host
    • host name
    • image name
    • start time
    • end time
    • label name
  • fg-add – adds a feature to a deployed image
    • label name
    • framework hadoop
    • version 1.0
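The slide lists only the options, not the invocation syntax, so the following argparse sketch of an fg-deploy-image-style interface is a hypothetical illustration (the flag names and the host/image/time values are assumptions, not the real FutureGrid CLI):

```python
import argparse

# Hypothetical sketch of the fg-deploy-image option set described on the
# slide; flag names and values are assumptions, not the actual CLI syntax.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="fg-deploy-image")
    parser.add_argument("--host", required=True, help="host to deploy on")
    parser.add_argument("--image", required=True, help="image name")
    parser.add_argument("--start", required=True, help="start time of the reservation")
    parser.add_argument("--end", required=True, help="end time of the reservation")
    parser.add_argument("--label", required=True, help="label used by later fg-add calls")
    return parser

args = build_parser().parse_args([
    "--host", "india",               # india is one of the FutureGrid systems
    "--image", "ubuntu-hadoop",      # hypothetical image name
    "--start", "2010-05-01T08:00",
    "--end", "2010-05-01T17:00",
    "--label", "exp1",
])
print(args.label)  # exp1
```

fg-add would follow the same pattern, taking the label plus a framework name (e.g. hadoop) and a version.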


FG Stratosphere
  • Objective
    • Operates at a level above any particular cloud
    • Provides all mechanisms to provision a cloud on a given FG hardware
    • Allows the management of reproducible experiments
    • Allows monitoring of the environment and the results
  • Risks
    • Lots of software
    • Possibly multiple paths to do the same thing
  • Good news
    • We work as a team, know about different solutions, and have identified a very good plan
    • We can componentize Stratosphere


Dynamic Provisioning
  • Change underlying system to support current user demands
  • Linux, Windows, Xen/KVM, Nimbus, Eucalyptus
  • Stateless images
    • Shorter boot times
    • Easier to maintain
  • Stateful installs
    • Windows
  • Use Moab to trigger changes and xCAT to manage installs


xCAT and Moab
  • xCAT
    • uses installation infrastructure to perform installs
    • creates stateless Linux images
    • changes the boot configuration of the nodes
    • remote power control and console
  • Moab
    • meta-schedules over resource managers
      • TORQUE and Windows HPC
    • control nodes through xCAT
      • changing the OS
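The flow above (Moab decides an OS change is needed, then drives xCAT) can be sketched as a pair of composed commands. `nodeset` (changes a node's boot configuration) and `rpower` (remote power control) are real xCAT commands, but this particular composition and the node/image names are illustrative assumptions, not FutureGrid's actual trigger script:

```python
def reprovision_commands(node: str, osimage: str) -> list[str]:
    """Commands a Moab trigger might issue to switch a node's OS via xCAT."""
    return [
        f"nodeset {node} osimage={osimage}",  # point the node at the new image
        f"rpower {node} boot",                # reboot so the change takes effect
    ]

# Hypothetical node and image names:
for cmd in reprovision_commands("india001", "centos5-x86_64-stateless"):
    print(cmd)
```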


Experiment Manager
  • Objective
    • Manage the provisioning for reproducible experiments
    • Coordinate workflow of experiments
    • Share workflow and experiment images
    • Minimize space through reuse
  • Risk
    • Images are large
    • Users have different requirements and need different images
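One common way to address the "images are large" risk while still minimizing space through reuse is content-addressed storage of image layers, so experiments that share a base layer share its storage. This is an illustrative sketch of that idea, not the actual Experiment Manager design:

```python
import hashlib

# Sketch: store image layers keyed by content digest so shared layers
# (e.g. a common base OS) are kept once. Illustrative design only.
class ImageStore:
    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}   # digest -> layer contents

    def add_image(self, layers: list[bytes]) -> list[str]:
        """Store each layer once, keyed by SHA-256 digest; return the digests."""
        digests = []
        for layer in layers:
            d = hashlib.sha256(layer).hexdigest()
            self._blobs.setdefault(d, layer)  # reuse if already present
            digests.append(d)
        return digests

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self._blobs.values())

store = ImageStore()
base = b"base-os-layer" * 1000               # stand-in for a shared base layer
store.add_image([base, b"hadoop"])
store.add_image([base, b"mpi"])              # base layer reused, not duplicated
print(store.stored_bytes())                  # 13009 bytes vs 26009 if duplicated
```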


  • FutureGrid - http://www.futuregrid.org/
  • NSF Award OCI-0910812
  • NSF Solicitation 08-573
    • http://www.nsf.gov/pubs/2008/nsf08573/nsf08573.htm
  • ViNe - http://vine.acis.ufl.edu/
  • Nimbus - http://www.nimbusproject.org/
  • Eucalyptus - http://www.eucalyptus.com/
  • VAMPIR - http://www.vampir.eu/
  • Pegasus - http://pegasus.isi.edu/