Clouds, Interoperation and PRAGMA

Philip M. Papadopoulos, Ph.D

University of California, San Diego

San Diego Supercomputer Center

Calit2


Remember the Grid Promise?

"The Grid is an emerging infrastructure that will fundamentally change the way we think about - and use - computing. The word Grid is used by analogy with the electric power grid, which provides pervasive access to electricity and has had a dramatic impact on human capabilities and society."

Foster and Kesselman, The Grid: Blueprint for a New Computing Infrastructure, preface to the first edition, August 1998


Some Things that Happened on the Way to Cloud Computing

  • Web Version 1.0 (1995)

  • 1 Cluster on Top 500 (June 1998)

  • Dot Com Bust (2000)

  • Clusters > 50% of Top 500 (June 2004)

  • Web Version 2.0 (2004)

  • Cloud Computing (EC2 Beta - 2006)

  • Clusters > 80% of Top 500 (Nov. 2008)


Gartner Emerging Tech 2005


Gartner Emerging Tech - 2008


Gartner Emerging Tech 2010


What Is Fundamentally Different about Cloud Computing vs. Grid Computing?

  • Cloud computing – you adapt the infrastructure to your application

    • Should be less time-consuming

  • Grid computing – you adapt your application to the infrastructure

    • Generally more time-consuming

  • Cloud computing has a financial model that seems to work – the Grid never had a financial model

    • The Grid "barter" economy was valid only for provider-to-provider trades; pure consumers had no bargaining power


IaaS – Of Most Interest to PRAGMA

Example providers: Amazon EC2, Rackspace, Sun, GoGrid, 3Tera, IBM

Run (virtual) computers to solve your problem, using your software


Cloud Hype

  • “Others do all the hard work for you”

  • “You never have to manage hardware again”

  • “It’s always more efficient to outsource”

  • “You can have a cluster in 8 clicks of the mouse”

  • “It’s infinitely scalable”


Amazon Web Services

  • Amazon EC2 – the catalytic event in 2006 that really started cloud computing

  • Web Services access for

    • Compute (EC2)

    • Storage (S3, EBS)

    • Messaging (SQS)

    • Monitoring (CloudWatch)

    • + 20 (!) More services

      • “I thought this was supposed to be simple”


Basic EC2

[Diagram: Amazon Machine Images (AMIs) are copied from Amazon cloud storage (S3 – Simple Storage Service; EBS – Elastic Block Store) and booted in the Elastic Compute Cloud (EC2)]

  • AMIs are copied from S3 and booted in EC2 to create a “running instance”

  • When an instance is shut down, all changes are lost

    • Can save as a new AMI


Basic EC2

  • AMI (Amazon Machine Image) is copied from S3 to EC2 for booting

    • Can boot multiple copies of an AMI as a “group”

    • Not a cluster, all running instances are independent

    • Cluster instances are about $2/hour (8 cores), roughly $17K/year if left running

  • If you make changes to your AMI while running and want them saved

    • Must repack to make a new AMI

      • Or use Elastic Block Store (EBS) on a per-instance basis
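
That lifecycle can be driven with the classic EC2 API tools. A minimal sketch, assuming the tools and credentials are already set up; the AMI ID, key pair, instance type, and instance ID are placeholders:

# launch an instance from an existing AMI (instance-store; changes are lost at termination)
$ ec2-run-instances ami-xxxxxxxx -k my-keypair -t m1.large
# list running instances to find the instance ID and public DNS name
$ ec2-describe-instances
# shut the instance down when finished (and stop paying for it)
$ ec2-terminate-instances i-xxxxxxxx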


Some Challenges in EC2

  • Defining the contents of your Virtual Machine (Software Stack)

  • Preparing, packing, uploading image

  • Understanding limitations and execution model

  • Debugging when something goes wrong

  • Remembering to turn off your VM

    • Smallest 64-bit VM is ~$250/month running 7x24
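
(As a sanity check on that figure: at an on-demand rate of roughly $0.35/hour, about what the smallest 64-bit instance type cost at the time, $0.35 × 24 hours × 30 days ≈ $252/month.)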


One Problem: Too Many Choices


Reality for Scientific Applications

  • The complete software stack is critical to proper operation

    • Libraries

    • Compiler/interpreter versions

    • File system location

    • Kernel

  • This is the fundamental reason that the Grid is hard: my cluster is not the same environment as your cluster

    • Electrons are universal, software packages are not


People and Science are Distributed

  • PRAGMA – Pacific Rim Applications and Grid Middleware Assembly

    • Scientists are from different countries

    • Data is distributed

  • Cyberinfrastructure to enable collaboration

  • When scientists are using the same software on the same data

    • Infrastructure is no longer in the way

    • It needs to be their software (not my software)


PRAGMA’s Distributed Infrastructure Grid/Clouds

[Map: UZH (Switzerland); JLU, CNIC, LZU (China); AIST, OsakaU, UTsukuba (Japan); KISTI, KMU (Korea); IndianaU, SDSC (USA); ASGC, NCHC (Taiwan); HKU (Hong Kong); UoHyd (India); ASTI (Philippines); NECTEC, KU (Thailand); CeNAT-ITCR (Costa Rica); HCMUT, HUT, IOIT-Hanoi, IOIT-HCM (Vietnam); UValle (Colombia); MIMOS, USM (Malaysia); UChile (Chile); MU (Australia); BESTGrid (New Zealand)]

26 institutions in 17 countries/regions, 23 compute sites, 10 VM sites


Our Goals

  • Enable Specialized Applications to run easily on distributed resources

  • Investigate Virtualization as a practical mechanism

    • Multiple VM Infrastructures (Xen, KVM, OpenNebula, Rocks, WebOS, EC2)

  • Use GeoGrid applications as a driver of the process


GeoGrid Applications as Driver

I am not part of GeoGrid, but PRAGMA members are!


Deploy Three Different Software Stacks on the PRAGMA Cloud

  • QuiQuake

    • Simulates a ground-motion map when an earthquake occurs

    • Invoked when a big earthquake occurs

  • HotSpot

    • Finds high-temperature areas from satellite data

    • Runs on a daily basis (when ASTER data arrives from NASA)

  • WMS server

    • Provides satellite images via the WMS protocol

    • Runs on a daily basis, but the number of requests is not steady

Source: Dr. Yoshio Tanaka, AIST, Japan


Example of current configuration

[Diagram: nodes statically assigned to the WMS server, QuiQuake, and HotSpot]

  • Currently, nodes are fixed to each application

  • Allocation should be more adaptive and elastic according to the requirements

Source: Dr. Yoshio Tanaka, AIST, Japan


1st step: Adaptive resource allocation in a single system

Change nodes for each application according to the situation and requirements.

[Diagram: the node split among the WMS server, QuiQuake, and HotSpot shifts toward QuiQuake when a big earthquake occurs, and toward the WMS server when WMS requests increase]

Source: Dr. Yoshio Tanaka, AIST, Japan


2nd Step: Extend to Distributed Environments

[Diagram: satellite data sources (Terra/ASTER, ALOS/PALSAR, TDRS) and distributed sites NASA, JAXA, ERSDAC, OCC (AIST), UCSD, and NCHC]

Source: Dr. Yoshio Tanaka, AIST, Japan


What Are the Essential Steps?

  • AIST/GeoGrid creates its VM image

  • Image is made available in “centralized” storage

  • PRAGMA sites copy GeoGrid images to local clouds

    • Assign IP addresses

    • What happens if the image is in KVM format and the site runs Xen?

  • Modified images are booted

  • GeoGrid infrastructure is now ready to use


Basic Operation

  • VM image authored locally, uploaded into VM-image repository (Gfarm from U. Tsukuba)

  • At local sites:

    • Image copied from repository

    • Local copy modified (automatically) to run on the specific infrastructure

    • Local copy booted

  • For running in EC2, we adapted methods automated in Rocks to modify, bundle, and upload the image after the local copy to UCSD.
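
A minimal sketch of the copy-and-boot flow just described, using Gfarm's native command-line tools (the repository path and image name are illustrative; the site-specific modification step is elided):

# register a locally authored image into the shared Gfarm repository
$ gfreg nyouga.img.gz /vmimages/nyouga.img.gz
# at a participating site: copy the image out of Gfarm to local disk
$ gfexport /vmimages/nyouga.img.gz > /state/partition1/xen/disks/nyouga.img.gz
$ gunzip /state/partition1/xen/disks/nyouga.img.gz
# adapt network settings for the local cloud, then boot with the site's VM tooling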


VM Deployment Phase I - Manual

http://goc.pragma-grid.net/mediawiki-1.16.2/index.php/Bloss%2BGeoGrid

# rocks add host vm container=…

# rocks set host interface subnet …

# rocks set host interface ip …

# rocks list host interface …

# rocks list host vm … showdisks=yes

# cd /state/partition1/xen/disks

# wget http://www.apgrid.org/frontend...

# gunzip geobloss.hda.gz

# lomount -diskimage geobloss.hda -partition 1 /media

# vi /media/boot/grub/grub.conf

# vi /media/etc/sysconfig/network-scripts/ifc…

# vi /media/etc/sysconfig/network

# vi /media/etc/resolv.conf

# vi /etc/hosts

# vi /etc/auto.home

# vi /media/root/.ssh/authorized_keys

# umount /media

# rocks set host boot action=os …

# rocks start host vm geobloss…

[Diagram: the GeoGrid + Bloss image is downloaded from a website onto the VM development server, copied via the frontend, and started on one of the vm-containers (vm-container-0-0, -0-1, -0-2, …) of the VM hosting server]


What we learned in manual approach

AIST, UCSD, and NCHC met in Taiwan for 1.5 days in Feb 2011 to test this approach

  • Much faster than Grid deployment of the same infrastructure

  • It is not too difficult to modify a Xen image and run under KVM

  • Nearly all of the steps could be automated

  • Need a better method than “put image on a website” for sharing


Centralized VM Image Repository

[Diagram: VM image depository and sharing. A Gfarm metadata server and several Gfarm file servers hold the images (GeoGrid + Bloss, Fmotif, Nyouga, QuickQuake) and the vmdb.txt catalog; sites access them through Gfarm clients]

Gfarm Using Native Tools


VM Deployment Phase II - Automated

http://goc.pragma-grid.net/mediawiki-1.16.2/index.php/VM_deployment_script

$ vm-deploy quiquake vm-container-0-2

[Diagram: the vm-deploy script on the frontend consults the vmdb.txt catalog (entries such as "quiquake, xen-kvm, AIST/quiquake.img.gz, …" and "fmotif, kvm, NCHC/fmotif.hda.gz, …"), copies the requested image (Quiquake, Fmotif, Nyouga, GeoGrid + Bloss) from the Gfarm cloud through a Gfarm client, and starts it on a vm-container of the VM hosting server]
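
A hypothetical sketch of what such a deployment wrapper does, pieced together from the diagram above (the real vm-deploy script is at the URL above; the vmdb.txt field layout and local paths here are assumptions):

# look up the requested image in the vmdb.txt catalog (name, VM type, Gfarm path, ...)
$ IMG=$(grep '^quiquake,' vmdb.txt | cut -d, -f3)
# copy the image out of the Gfarm cloud to the target vm-container's disk area
$ gfexport "$IMG" > /state/partition1/xen/disks/quiquake.img.gz
$ gunzip /state/partition1/xen/disks/quiquake.img.gz
# adjust network configuration inside the image for the local site, then start the VM
$ rocks start host vm quiquake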


Putting It All Together

Store VM images in Gfarm systems

Run vm-deploy scripts at PRAGMA sites

Copy VM images on demand from Gfarm

Modify/start VM instances at PRAGMA sites

Manage jobs with Condor

[Diagram: VM images are copied on demand from the Gfarm grid file system (Japan) to PRAGMA sites running different VM infrastructures: SDSC (USA, Rocks Xen), AIST (Japan, OpenNebula Xen), NCHC (Taiwan, OpenNebula KVM), IU (USA, Rocks Xen), LZU (China, Rocks KVM), and Osaka (Japan, Rocks Xen). The deployed images include AIST QuickQuake + Condor, AIST Web Map Service + Condor, AIST HotSpot + Condor, AIST GeoGrid + Bloss, NCHC Fmotif, and UCSD Autodock + Condor; a Condor master coordinates the slave instances across sites. Legend: S = vm-deploy script, gFC = Gfarm client, gFS = Gfarm file server]


Moving more quickly with PRAGMA Cloud

  • PRAGMA 21 – Oct 2011

    • 4 sites: AIST, NCHC, UCSD, and EC2 (North America)

  • SC’11 – Nov 2011

    • New Sites:

      • Osaka University

      • Lanzhou University

      • Indiana University

      • CNIC

      • EC2 – Asia Pacific


Condor Pool + EC2 Web Interface

  • 4 different private clusters

  • 1 EC2 Data Center

  • Controlled from Condor Manager in AIST, Japan


Cloud Sites Integrated in GeoGrid Execution Pool

PRAGMA Compute Cloud

[Map: JLU, CNIC, LZU (China); AIST, OsakaU (Japan); IndianaU, SDSC (USA); NCHC (Taiwan); UoHyd (India); ASTI (Philippines); MIMOS (Malaysia)]


Roles of Each Site: PRAGMA + GeoGrid

  • AIST – Application driver with natural distributed computing/people setup

  • NCHC – Authoring of VMs in a familiar web environment; significant diversity of VM infrastructure

  • UCSD – Lower-level details of automating VM “fixup” and rebundling for EC2

    We are all founding members of PRAGMA


NCHC WebOS/Cloud Authoring Portal

Users start with a well-defined base image, then add their software


Getting things working in EC2

  • Short Background on Rocks Clusters

  • Mechanisms for using Rocks to create an EC2-compatible image

  • Adapting methodology to support non-Rocks defined images


Rocks – http://www.rocksclusters.org

  • Technology transfer of commodity clustering to application scientists

  • Rocks is a cluster/System Configuration on a CD

    • Clustering software (PBS, SGE, Ganglia, Condor, … )

    • Highly programmatic software configuration management

    • Put CDs in Raw Hardware, Drink Coffee, Have Cluster.

  • Extensible using “Rolls”

  • Large user community

    • Over 1PFlop of known clusters

    • Active user / support list of 2000+ users

  • Active Development

    • 2 software releases per year

    • Code Development at SDSC

    • Other developers (UCSD, Univ. of Tromsø, external Rolls)

  • Supports Red Hat Linux, Scientific Linux, CentOS, and Solaris

  • Can build Real, Virtual, and Hybrid Combinations (2 – 1000s)

Rocks core development is supported by NSF award #OCI-0721623


Key Rocks Concepts

  • Define components of clusters as Logical Appliances (Compute, Web, Mgmt, Login DB, PFS Metadata, PFS Data, … )

    • Share common configuration among appliances

    • Graph decomposition of the full cluster SW and Config

    • Rolls are the building blocks: reusable components (Package + Config + Subgraph)

  • Use the installer's (Red Hat Anaconda, Solaris Jumpstart) text format to describe an appliance configuration

    • Walk the Rocks graph to compile this definition

  • Heterogeneous Hardware (Real and Virtual HW) with no additional effort
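
As a concrete illustration of "walking the graph": Rocks can emit the fully compiled installer profile for any host from the command line. A minimal sketch (the host name is a placeholder):

# compile the graph and node XML into the complete installer profile for one host
$ rocks list host profile compute-0-0 > /tmp/compute-0-0.xml
# list the appliance types defined in the distribution
$ rocks list appliance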


A Mid-Sized Cluster Resource

Includes: computing, database, storage, virtual cluster, login, and management appliances

Triton Resource (http://tritonresource.sdsc.edu)

[Diagram, figures as labeled:]

  • Large-memory PSDAF: 28 nodes with 256 GB or 512 GB each (32 cores), 8 TB total memory, 128 GB/sec, ~9 TF

  • Shared resource cluster: 256 nodes, 16 GB/node, 4-8 TB total memory, 256 GB/sec, ~20 TF

  • Large-scale storage (delivery by mid May): 2 PB (384 TB today), ~60 GB/sec (7 GB/sec today), ~2,600 disks (384 disks now)

  • Connected to UCSD research labs over the Campus Research Network


What’s in YOUR cluster?


How Rocks Treats Virtual Hardware

  • It’s just another piece of HW.

    • If RedHat supports it, so does Rocks

  • Allows mixture of real and virtual hardware in the same cluster

    • Because Rocks supports heterogeneous HW clusters

  • Re-use of all of the software configuration mechanics

    • E.g., a compute appliance is a compute appliance, regardless of "hardware"

Virtual HW must meet minimum HW Specs

  • 1GB memory

  • 36GB Disk space*

  • Private-network Ethernet

  • + Public Network on Frontend

    * Not strict – EC2 images are 10GB


Extended Condor Pool (Very Similar to AIST GeoGrid)

[Diagram: the Rocks frontend runs the Condor collector and scheduler and accepts job submissions; local nodes (Node 0, Node 1, …, Node n) on the cluster private network (e.g. 10.1.x.n) and cloud instances (Cloud 0, Cloud 1) run identical system images, forming a Condor pool with both local and cloud resources]
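
One way to realize "identical system images" in a single pool is for every worker, local node or cloud instance alike, to carry the same Condor execute-node configuration pointing back at the frontend. A minimal sketch under that assumption (the host name is a placeholder; cloud instances additionally need network reachability to the collector, e.g. through a VPN or opened firewall ports):

# condor_config.local on each worker node (local or cloud)
CONDOR_HOST = frontend.mycluster.example.org
# run only the master and the execute daemon on workers
DAEMON_LIST = MASTER, STARTD
# let the frontend's collector/scheduler manage this node
ALLOW_WRITE = $(CONDOR_HOST), $(IP_ADDRESS)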


Complete Recipe

[Diagram: local hardware (Rocks frontend, VM container, disk storage) on one side, the Amazon EC2 cloud on the other]

  1. Kickstart the guest VM on the local VM container with ec2_enable=true

  2. Bundle the "compiled" VM image as an S3 image (optional: test and rebuild the image locally first)

  3. Upload the image to Amazon S3

  4. Register the image as an EC2 AMI

  5. Boot the AMI as an Amazon instance


At the Command Line: provided by the Rocks EC2 Roll/Xen Rolls

  • rocks set host boot action=install compute-0-0

  • rocks set host attr compute-0-0 ec2_enable true

  • rocks start host vm compute-0-0

    • After reboot inspect, then shut down

  • rocks create ec2 bundle compute-0-0

  • rocks upload ec2 bundle compute-0-0 <s3bucket>

  • ec2-register <s3bucket>/image.manifest.xml

  • ec2-run-instances <ami>


Modify to Support Non-Rocks Images for PRAGMA Experiment

[Diagram: same layout as the previous recipe, with the image pulled from Gfarm instead of being kickstarted locally]

  1. vm-deploy nyouga2 vm-container-0-20 (copy the image from Gfarm onto the local VM container)

  2. Makeec2.sh <image file> (produce the "modified" VM image on local disk storage)

  3. Bundle the image as an S3 image

  4. Upload the image to Amazon S3

  5. Register the image as an EC2 AMI

  6. Boot the AMI as an Amazon instance
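
For the non-Rocks path, the bundle/upload/register steps (3 to 5) can be done directly with Amazon's AMI and API tools rather than the Rocks EC2 Roll. A minimal sketch, with the account ID, key files, and bucket names as placeholders:

# step 3: bundle the modified image for S3-backed (instance-store) booting
$ ec2-bundle-image -i nyouga2.img -k pk.pem -c cert.pem -u 111122223333 -r x86_64
# step 4: upload the bundle to an S3 bucket
$ ec2-upload-bundle -b <s3bucket> -m /tmp/nyouga2.img.manifest.xml -a <access-key> -s <secret-key>
# step 5: register the manifest as a bootable AMI (step 6 then boots it with ec2-run-instances)
$ ec2-register <s3bucket>/nyouga2.img.manifest.xml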


Observations

  • This is much faster than our Grid deployments

  • Integration of private and commercial cloud is at proof-of-principle state

  • Haven’t scratched the surface of what is possible when one expands into an external cloud

  • Networking among instances in different clouds has pitfalls (firewalls, addressing, etc.)

  • Users can focus on the creation of their software stack


Heterogeneous Clouds


More Information Online


Revisit Cloud Hype

  • “Others do all of the hard work for you” → some of the hard work

  • “You never have to manage hardware again” → you still have to manage it

  • “It’s always more efficient to outsource” → sometimes more efficient

  • “You can have a cluster in 8 clicks of the mouse” → but it may not have your software

  • “It’s infinitely scalable”

  • Location of data is important

  • Interoperability across cloud infrastructures is possible


Thank You!

[email protected]

