FutureGrid Training, Education and Outreach. Presented by Renato Figueiredo renato@acis.ufl.edu Associate Professor University of Florida. Bloomington Indiana January 17 2010. Overview.

FutureGrid Training, Education and Outreach

Presented by Renato Figueiredo


Associate Professor

University of Florida

Bloomington, Indiana

January 17, 2010


Traditional ways of delivering hands-on training and education in parallel/distributed computing have non-trivial dependences on the environment

Difficult to replicate same environment on different resources (e.g. HPC clusters, desktops)

Difficult to cope with changes in the environment (e.g. software upgrades)

Virtualization technologies remove key software dependences through a layer of indirection


FutureGrid enables new approaches to education and training and opportunities to engage in outreach

Cloud, virtualization and dynamic provisioning – environment can adapt to the user, rather than expect user to adapt to the environment

Focus of FutureGrid TEO is on leveraging the unique capabilities of the infrastructure and its software to:

Reduce barriers to entry and engage new users

Use of encapsulated environments (“appliances”) as a primary delivery mechanism of education/training modules – promoting reuse, replication, and sharing

Summary of activities (1)

Focus activities in the first year

Infrastructure supporting TEO activities

Documentation, integration of educational materials, input/recommendations for portal and computing infrastructure

Development of hands-on tutorials tailored to FutureGrid technologies and resources

Development, integration, testing of educational virtual appliances

Summary of activities (2)

Focus activities in the first year

Education activities

Working with early adopters in class environments

Understand requirements, opportunities, challenges

Outreach activities

Demonstrations and presentations highlighting FutureGrid’s unique capabilities in conferences, workshops

Engaging with minority serving institutions

TEO Infrastructure - guiding principles

Fidelity: TEO activities should use full-fledged, executable software: education/training modules

Learn using the proper tools

Reproducibility: Creators of content should be able to install, configure, and test their modules once, and be assured of the same functional behavior regardless of where the module is deployed

Incentive to invest effort in developing, testing and documenting new modules

TEO Infrastructure - guiding principles

Deployability: Students and users should be able to deploy modules in a simple manner, and in a variety of resources

Reduce barriers to entry; avoid dependences upon a particular infrastructure

Community-oriented: Modules should be simple to share, discover, reuse, and expand

Create conditions for “viral” growth

Towards this vision in FutureGrid

Executable modules – virtual appliances

Deployable on FutureGrid resources

Deployable on other cloud platforms, as well as virtualized desktops

Community sharing – Web 2.0 portal, appliance image repositories

An aggregation hub for executable modules and documentation

Educational appliances

  • A flexible, extensible platform for hands-on, lab-oriented education on FutureGrid

  • Need to support clustering of resources

  • Virtual machines + social/virtual networking to create sandboxed modules

    • Virtual “Grid” appliances: self-contained, pre-packaged execution environments

    • Group VPNs: simple management of virtual clusters by students and educators

Virtual appliance example

Figure (diagram not preserved): an appliance image containing Linux, Java, Hadoop and configuration scripts, shown running as a Hadoop worker and as another Hadoop worker.






Virtual Networking

A single appliance encapsulates software and configuration

Cluster/Grid/Cloud computing

Middleware expects a collection of machines, typically on a LAN (Local Area Network)

Appliances need to communicate and coordinate with each other

Each worker needs an IP address, uses TCP/IP sockets
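As a minimal sketch of what "each worker needs an IP address, uses TCP/IP sockets" implies, the Python example below has one worker connect to a master over TCP and exchange a single message. The loopback address, the OS-chosen port, and the "ack:" reply format are assumptions made for illustration; in a virtual cluster the address would be the appliance's virtual IP.

```python
import socket
import threading


def recv_all(sock):
    """Read from the socket until the peer closes its sending side."""
    chunks = []
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def run_master(server_sock, received):
    """Accept one worker connection, record its message, and acknowledge it."""
    conn, _addr = server_sock.accept()
    with conn:
        message = recv_all(conn).decode("utf-8")
        received.append(message)
        conn.sendall(("ack:" + message).encode("utf-8"))


def worker_hello(master_ip, master_port, name):
    """Connect to the master by IP address, send a name, return the reply."""
    with socket.create_connection((master_ip, master_port), timeout=5) as sock:
        sock.sendall(name.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal end of the request
        return recv_all(sock).decode("utf-8")


def demo():
    # Loopback stands in for a virtual IP; port 0 asks the OS for a free port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        ip, port = server_sock.getsockname()
        received = []
        master = threading.Thread(target=run_master, args=(server_sock, received))
        master.start()
        reply = worker_hello(ip, port, "worker-1")
        master.join()
    return received, reply
```

Calling demo() runs both ends in one process. The point is only that the worker must know a reachable address for the master, which is what the virtual network provides.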

Virtual cluster appliances

Virtual appliance + virtual network







Figure (diagram not preserved): a Hadoop worker and another Hadoop worker.






Support for clustering

Network virtualization software on FutureGrid includes ViNe and GroupVPN

Nimbus has support for contextualization of one-click virtual clusters

Within a LAN, or coupled with ViNe

Grid appliances use peer-to-peer overlay for discovery and configuration of virtual addresses (DHCP) and cluster middleware
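The slides do not describe how the overlay assigns addresses. As a rough sketch of the general idea only, the Python example below hands out virtual IPs from a private address space, with a plain dictionary standing in for state shared across the peer-to-peer overlay. The class name, the 10.10.0.0/24 subnet, and the first-free policy are all assumptions, not the Grid appliance implementation.

```python
import ipaddress


class VirtualAddressPool:
    """Toy DHCP-like allocator; a dict stands in for overlay-wide shared state."""

    def __init__(self, cidr):
        self.network = ipaddress.ip_network(cidr)
        self.leases = {}  # virtual IP (str) -> node identifier

    def request(self, node_id):
        """Return the node's existing lease, or claim the first free host address."""
        for ip, owner in self.leases.items():
            if owner == node_id:
                return ip
        for host in self.network.hosts():
            ip = str(host)
            if ip not in self.leases:
                self.leases[ip] = node_id
                return ip
        raise RuntimeError("no free addresses in " + str(self.network))

    def release(self, node_id):
        """Drop any lease held by the node so the address can be reused."""
        self.leases = {ip: o for ip, o in self.leases.items() if o != node_id}
```

For example, with pool = VirtualAddressPool("10.10.0.0/24"), two calls pool.request("worker-a") and pool.request("worker-b") return distinct addresses, and a repeated request from the same node returns its existing lease.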

GroupVPN Overview


Network API

Alice’s public keys

Bob’s public keys

Carol’s public key

Messaging layer/information system

Social network (e.g. XMPP, group site)

Web interface

Bootstrapping private links through Web 2.0 interfaces and IP-over-P2P overlay tunneling

Private IP address spaces, DHCP

Appliances perceive virtual LAN

Virtual network




Deploying virtual clusters

Same image, different VPNs







Figure (diagram not preserved): a Hadoop worker and another Hadoop worker, each labeled "Virtual IP - DHCP"; one further label survives only as "Web site)".

FutureGrid example

Deploying a Condor virtual appliance cluster on FutureGrid or desktop resources

Nimbus: cloud-client.sh --run --name grid-appliance-amd64.tar.gz

Eucalyptus: euca-run-instances ami-fd4aa494 --instance-type m1.large -k keypair

VMware Player: double-click Grid-appliance.vmx

Upload GroupVPN configuration file to appliances
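The per-platform launch step can be sketched as a small lookup, shown here in Python. The two command lines are copied verbatim from the slide and are not validated against the tools; the function and dictionary names are hypothetical. VMware Player is left out because the slide describes it as a GUI action rather than a command.

```python
import shlex

# Command lines copied from the slide; they are not validated against the tools here.
LAUNCH_COMMANDS = {
    "nimbus": "cloud-client.sh --run --name grid-appliance-amd64.tar.gz",
    "eucalyptus": "euca-run-instances ami-fd4aa494 --instance-type m1.large -k keypair",
}


def launch_argv(platform):
    """Return the slide's launch command for a platform as an argument list."""
    try:
        return shlex.split(LAUNCH_COMMANDS[platform.lower()])
    except KeyError:
        raise ValueError("no launch command known for " + repr(platform)) from None
```

On a machine where the corresponding client is installed and configured, the returned list could be passed to subprocess.run; uploading the GroupVPN configuration file remains a separate step.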

FG appliances - Status





FutureGrid resources, appliance images (Condor, Hadoop), tutorials

GroupVPN portal, image downloads, bootstrap


Use of FutureGrid in classes

First-year ramp-up of hardware and software

Training and education emphasis has been use in classes, tutorials with early adopters


Cloud computing class at Indiana University

Distributed Scientific Computing class at Louisiana State University (LSU)

Big data summer school at IU

Nimbus tutorial at CloudCom conference


Big Data for Science










Participating sites (map labels, partly lost in extraction): "... of Florida", "... San Diego", "... at Chicago", University of Texas at El Paso, University of California at Los Angeles, IBM Almaden Research Center

300+ Students (200 on sites from 10 institutes; 100 online)

IU MapReduce and UF Virtual Appliance technologies are supported by FutureGrid.

July 26-30, 2010 NCSA Summer School Workshop

(Slide courtesy of Judy Qiu)

Cloud computing class at IU

Graduate-level “Cloud computing for Data-Intensive Sciences” (Judy Qiu, Fall 2010)

Virtualization technologies and tools

Infrastructure as a service

Parallel programming (MPI, Hadoop)

FutureGrid provided a set of software options that made it possible for students to work on different projects along the system stack.


Term Projects


#1 Matrix Multiplication (Swapnil, Amit, Pradnay)

#2 PhyloD (Ratul, Adrija, Chengming)

Higher Level Languages

Iterative MapReduce

#3 LDA (Changsi, Yang)

#4 MemCache (Saliya, Yiming, Jerome)

#5 Avro (Yuduo, Yuan, Patanachai)

#6 PageRank (Shuo-Huan, Parag)

Cloud Platform

Cloud Infrastructure

#7 Nimbus, Eucalyptus (Stephen, Sonali, Shakeela)

Cloud Storage

#8 Cloud Storage Survey (Xiaoming, Nixiaogang)

#9 Hypervisor Performance Analysis Project (James, Andrew)

(Slide courtesy of Judy Qiu)

Distributed Scientific Computing class at LSU

FutureGrid supported activities in a new semester-long class offered Fall 2010 at LSU (Gabrielle Allen, Shantenu Jha)

A practical and comprehensive graduate course preparing students for research involving scientific computing

Module E (Distributed Scientific Computing) taught by Shantenu Jha

Topics where FutureGrid was used:

Introduction to the practice of distributed computing

Cloud computing and master-worker pattern

Distributed application case studies

Approximately half of a lecture provided an overview of FutureGrid and the process to get accounts and get started

As part of the homework assignment associated with lecture E0, each student had to confirm access and successful login to FG-Sierra and FG-India

Distributed Scientific Computing class at LSU

FutureGrid (FG) was used by students to

(i) compile, deploy and execute basic SAGA commands

(ii) learn the basics of remote job submission and elementary Master-Worker based distributed applications (such as MapReduce and computing the Mandelbrot Set) using FG-India and FG-Sierra nodes

(iii) get hands-on training with IaaS Clouds, namely standing up virtual machines using Eucalyptus and deploying software and/or applications from (i) and (ii)
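The Master-Worker pattern mentioned in (ii) can be sketched for the Mandelbrot Set as follows. This is a minimal Python illustration, not the SAGA-based code used in the class: a local thread pool stands in for remote FG-India and FG-Sierra workers, and the image size, plotted region, and iteration limit are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def mandelbrot_row(y, width, height, max_iter):
    """Worker task: iteration counts for one row of the image."""
    row = []
    for x in range(width):
        # Map the pixel to a point c in the region [-2, 1] x [-1.5, 1.5].
        c = complex(-2.0 + 3.0 * x / (width - 1), -1.5 + 3.0 * y / (height - 1))
        z = 0j
        count = 0
        while abs(z) <= 2.0 and count < max_iter:
            z = z * z + c
            count += 1
        row.append(count)
    return row


def mandelbrot_master(width=40, height=20, max_iter=50, workers=4):
    """Master: submit one task per row to the pool and collect rows in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(mandelbrot_row, y, width, height, max_iter)
                   for y in range(height)]
        return [f.result() for f in futures]
```

Each row is independent of the others, which is why the master can hand rows to workers in any order and still reassemble the image by index.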

Students also used Eucalyptus on FG-India and FG-Sierra to do their Module E projects, which included:

(a) using Clouds as accelerators for Cactus-based applications,

(b) calculating pi using distributed tasks,

(c) extending the calculation of the Mandelbrot Set to “new” backends on FutureGrid (in addition to the “default” remote/ssh backends), and

(d) executing workers on bare metal and on Clouds concurrently (i.e., hybrid Grid-Cloud infrastructure) for master-worker applications.
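As an illustration of computing pi with distributed tasks, the Python sketch below splits a midpoint-rule integration of 4 / (1 + x^2) over [0, 1] into independent tasks and sums their partial results. The choice of integration method, the task count, and the use of a local thread pool in place of remote workers are assumptions; the slides do not say how the student projects did it.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_pi(start, stop, steps):
    """Task: midpoint-rule sum of 4 / (1 + x^2) over slices start..stop-1."""
    width = 1.0 / steps
    total = 0.0
    for i in range(start, stop):
        x = (i + 0.5) * width
        total += 4.0 / (1.0 + x * x)
    return total * width


def distributed_pi(steps=100_000, tasks=8):
    """Split the integration range into independent tasks and add the results."""
    bounds = [(t * steps // tasks, (t + 1) * steps // tasks) for t in range(tasks)]
    with ThreadPoolExecutor(max_workers=tasks) as pool:
        parts = pool.map(lambda b: partial_pi(b[0], b[1], steps), bounds)
        return sum(parts)
```

Because each task only needs its slice bounds and returns a single number, the same decomposition carries over to workers on separate machines.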


IMAGE emi-8D2A13F7 smaddi2-saga-bucket/saga153-ubuntu.manifest.xml smaddi2 available public x86_64 machine eri-5BB61255 eki-78EF12D2

IMAGE emi-DBD61078 ubuntu-0904-saga-1.5.2/image.manifest.xml luckow available public x86_64 machine eri-5BB61255 eki-78EF12D2

IMAGE emi-0E0E165E ajyounge/ubuntu-twister-memcached.img.manifest.xml ajyounge available public x86_64 machine eri-5BB61255 eki-78EF12D2
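The IMAGE lines above are whitespace-separated records, so they are easy to process in a script. The Python sketch below splits one line into named fields. The field names are guesses based on the values shown (for example, the "eri-" and "eki-" prefixes are taken to be ramdisk and kernel image identifiers) and are not taken from Eucalyptus documentation.

```python
from collections import namedtuple

# Field names are guesses based on the values in the listing above.
ImageRecord = namedtuple(
    "ImageRecord",
    "image_id manifest owner state visibility arch image_type ramdisk_id kernel_id",
)


def parse_image_line(line):
    """Split one whitespace-separated IMAGE line into a named record."""
    fields = line.split()
    if len(fields) != 10 or fields[0] != "IMAGE":
        raise ValueError("unexpected IMAGE line: " + repr(line))
    return ImageRecord(*fields[1:])
```

A record parsed this way can then be filtered by owner or architecture, for instance to find the SAGA images registered for the class.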

Nimbus tutorial at CloudCom

Half-day (3-hour) presentation + hands-on activities

30 attendees used their own computers to instantiate virtual machines on FutureGrid resources

Template for a self-learning tutorial for new users and prospective users

FutureGrid tutorials

Tutorial topic 1: Cloud Provisioning Platforms

Using Nimbus on FutureGrid

Nimbus One-click Cluster Guide

Using the Grid Appliances to run FutureGrid Cloud Clients

Using Eucalyptus on FutureGrid

Tutorial topic 2: Cloud Run-time Platforms

Introduction to Hadoop using the Grid Appliance

Running Hadoop on FG using Eucalyptus (.ppt)

Running Hadoop on Eucalyptus

Tutorial topic 3: Educational Virtual Appliances

Introduction to the Grid Appliance

Creating Grid Appliance Clusters

Building an educational appliance from Ubuntu 10.04

Deploying Grid Appliances using Nimbus

Deploying Grid Appliances using Eucalyptus

Customizing and registering Grid Appliance images using Eucalyptus

MPI Virtual Clusters with the Grid Appliances and MPICH2

Tutorial topic 4: High Performance Computing

Performance Analysis with Vampir

Instrumentation and tracing with VampirTrace

Year-1 Outreach activities

Demonstrations, presentations, booths at major events

SuperComputing, TeraGrid Conference, OGF (Open Grid Forum), CloudCom, CCGrid, Grid’5000 meeting, Vampir workshop

1114 CPU cores (457 VMs) distributed over 3 sites in FutureGrid and 3 sites in Grid’5000 (P. Riteau et al, OGF-29 demo, Chicago, IL, June 2010).

Outreach activities

At IU, working with dean for diversity and education to organize outreach and pursue REU funding to bring MSI students to IU for summer internships and to coordinate education and training workshops

Involvement of students from Historically Black Colleges and Universities (HBCUs)

REU supplement for FutureGrid this year funded 2 HBCU students in summer 2010; will apply each year

Planned TEO activities

Plan to engage MSIs with which IU has already established formal collaborative agreements

MSI Cyberinfrastructure Empowerment Coalition (MSI-CIEC). Primary theme: “teach the teachers” at MSIs so that they can incorporate cyberinfrastructure into their research and involve students and staff at their home institutions.

MSI-CIEC’s principal activity: Cyberinfrastructure Days - daylong workshops feature prominent speakers who discuss the application of cyberinfrastructure to research and education

Planned TEO activities

With Elizabeth City State University

Planning summer school on cloud computing for ADMI (Association of Computer/Information Sciences and Engineering Departments at Minority Institutions) faculty and students

Leverage Indiana University’s STEM Initiative

Provides travel, housing, and support for HBCU students to intern at Indiana University during the summer

Planned TEO activities

Coordinate Web tutorials and documentation; emphasis to support short tutorials that can be given by partners at conferences, and self-guided learning by new or prospective users

Continuously provide recommendations and guidance, Web portal, user accounts

Engage with potential early adopters in computer science and engineering classes

Leverage existing MSI contacts, and use of FutureGrid in workshops, summer schools, and internships