Grid computing 3 special topics in computer engineering

Grid Computing (3) (Special Topics in Computer Engineering)

Veera Muangsin

13 February 2004


Outline

  • High-Performance Computing

  • Grid Computing

  • Grid Applications

  • Grid Architecture

  • Grid Middleware

  • Grid Services


Before the grid

Network

Before the Grid

  • independent sites

  • independent hardware and software

  • independent user ids

  • security policy requiring local connection to the machine.

User

Application

The User is responsible for resolving the complexities of the environment

Site A

Site B


First step to the grid

First Step to the Grid

  • Metacenter

  • Two or more resources connected in a controlled user environment

  • Constraints

  • common architecture

  • single name space

  • common scheduler

User

Application

A layer of abstraction is added that hides some of the complexities associated with running jobs in a distributed computing environment; however, limitations exist.

Network

Centralized Scheduler and file staging

Site A

Site B


The grid today

The Grid Today

1. Request info from the grid

2. Get response

3. Make selection and submit job

  • Common Middleware

  • abstracts independent hardware, software, and user ids into a service layer with defined APIs

  • comprehensive security

  • allows for site autonomy

  • provides a common infrastructure based on middleware

User

Application

The underlying infrastructure is abstracted into defined APIs, simplifying developer and user access to resources; however, this layer is not intelligent.

Grid Middleware

Infrastructure

Network

Site A

Site B


The near future grid

The Near Future Grid

  • Customizable Grid Services built on defined Infrastructure APIs

  • automatic selection of resources

  • information products tailored to users

  • accountless processing

  • flexible interface: web based, command line, APIs

User

Application

Resources are accessed via various intelligent services that access infrastructure APIs

The result: The Scientist and Application Developer can focus on science and not on systems management

Intelligent, Customized Middleware

Grid Middleware - Infrastructure APIs

(service oriented)

Infrastructure

Network

Site A

Site B


Layered grid architecture by analogy to internet architecture

Layered Grid Architecture (By Analogy to Internet Architecture)

Grid layers, top to bottom:

  • Application

  • Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services

  • Resource: “Sharing single resources”: negotiating access, controlling use

  • Connectivity: “Talking to things”: communication (Internet protocols) & security

  • Fabric: “Controlling things locally”: access to, & control of, resources

Internet Protocol Architecture layers, top to bottom:

  • Application

  • Transport

  • Internet

  • Link


Grid components

Grid Components

  • Grid Apps.: Applications and Portals

    • Scientific

    • Engineering

    • Collaboration

    • Prob. Solving Env.

    • Web enabled Apps

  • Grid Tools: Development Environments and Tools

    • Languages

    • Libraries

    • Debuggers

    • Monitoring

    • Resource Brokers

    • Web tools

  • Grid Middleware: Distributed Resources Coupling Services

    • Comm.

    • Sign on & Security

    • Information

    • Process

    • Data Access

    • QoS

  • Grid Fabric: Local Resource Managers

    • Operating Systems

    • Queuing Systems

    • Libraries & App Kernels

    • TCP/IP & UDP

  • Grid Fabric: Networked Resources across Organisations

    • Computers

    • Clusters

    • Storage Systems

    • Data Sources

    • Scientific Instruments


Example high throughput computing system

Example: High-Throughput Computing System

  • App: High Throughput Computing System

  • Collective (App): Dynamic checkpoint, job management, failover, staging

  • Collective (Generic): Brokering, certificate authorities

  • Resource: Access to data, access to computers, access to network performance data

  • Connect: Communication, service discovery (DNS), authentication, authorization, delegation

  • Fabric: Storage systems, schedulers

Diagram labels: Checkpoint Repository (API, SDK, C-point Protocol); Compute Resource (API, SDK, Access Protocol)


Example data grid architecture

Example: Data Grid Architecture

  • App: Discipline-Specific Data Grid Application

  • Collective (App): Coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …

  • Collective (Generic): Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs

  • Resource: Access to data, access to computers, access to network performance data, …

  • Connect: Communication, service discovery (DNS), authentication, authorization, delegation

  • Fabric: Storage systems, clusters, networks, network caches, …


Globus toolkit

Globus Toolkit

  • Grid computing middleware

    • Software between the hardware and high-level services

    • Basic libraries, services, command-line programs

  • Most common middleware used in grids

  • Integrated with Web Service


Globus software architecture

Globus Software Architecture

  • Grid FTP

    • get and put files

    • 3rd party copy

    • interactive file management

    • parallel transfers

  • Grid SSH

    • login

    • execute commands

    • copy files

  • Globus Resource Allocation Manager (GRAM)

    • execute remote applications

    • stage executable, stdin, stdout, stderr

    • Underlying job management systems: PBS, LSF, fork/exec

  • Monitoring and Discovery Service (MDS)

    • information about resources and services

    • Underlying distributed directory service: LDAP

  • Grid Security Infrastructure (GSI)

    • authentication

    • secure communication

    • single sign on

    • delegation of credentials

    • authorization

    • Underlying technology: SSL/TLS and X.509 Certificates (credentials for users, services, hosts)


Globus deployment architecture

Globus Deployment Architecture

  • User

    • User application/tool

    • Web portal

  • Globus client system (clients are programs and libraries)

    • Grid FTP Client

    • GRAM Client

    • Grid SSH Client

    • MDS Client

  • MDS server system

    • MDS GIIS

  • Globus server system (two shown, one with PBS and one with LSF)

    • Grid FTP Server

    • GRAM Server

    • Grid SSH Server

    • MDS GRIS


Globus toolkit1

Globus Toolkit™

  • A software toolkit addressing key technical problems in the development of Grid enabled tools, services, and applications

    • Offer a modular “bag of technologies”

    • Enable incremental development of grid-enabled tools and applications

    • Implement standard Grid protocols and APIs

    • Make available under liberal open source license


General approach

General Approach

  • Define Grid protocols & APIs

    • Protocol-mediated access to remote resources

    • Integrate and extend existing standards

    • “On the Grid” = speak “Intergrid” protocols

  • Develop a reference implementation

    • Open source Globus Toolkit

    • Client and server SDKs, services, tools, etc.

  • Grid-enable wide variety of tools

    • Globus Toolkit, FTP, SSH, Condor, SRB, MPI, …


Four key protocols

Four Key Protocols

  • The Globus Toolkit™ centers around four key protocols

    • Connectivity layer:

      • Security: Grid Security Infrastructure (GSI)

    • Resource layer:

      • Resource Management: Grid Resource Allocation Management (GRAM)

      • Information Services: Grid Resource Information Protocol (GRIP)

      • Data Transfer: Grid File Transfer Protocol (GridFTP)


The globus toolkit security services

The Globus Toolkit™: Security Services

The Globus Project™

Argonne National Laboratory

USC Information Sciences Institute

http://www.globus.org


Why grid security is hard

Why Grid Security is Hard

  • Resources are often located in distinct administrative domains

    • Each resource has own policies & procedures

  • Set of resources used by a single computation may be large, dynamic, and unpredictable

    • Not just client/server, requires delegation

  • It must be broadly available & applicable

    • Standard, well-tested, well-understood protocols; integrated with wide variety of tools


Grid security infrastructure gsi

Grid Security Infrastructure (GSI)

  • Extensions to standard protocols & APIs

    • Standards: SSL/TLS, X.509 & CA, GSS-API

    • Extensions for single sign-on and delegation

  • Globus Toolkit reference implementation of GSI

    • SSLeay/OpenSSL + GSS-API + SSO/delegation

    • Tools and services to interface to local security

      • Simple ACLs; SSLK5/PKINIT for access to K5, AFS; …

    • Tools for credential management

      • Login, logout, etc.

      • Smartcards

      • MyProxy: Web portal login and delegation

      • K5cert: Automatic X.509 certificate creation


Gsi in action create processes at a and b that communicate access files at c

GSI in Action: “Create Processes at A and B that Communicate & Access Files at C”

1. User: single sign-on via “grid-id” & generation of proxy cred. (or: retrieval of proxy cred. from online repository). The User Proxy now holds a proxy credential.

2. The User Proxy sends remote process creation requests* to the GSI-enabled GRAM server at Site A (Kerberos). The server will: authorize, map to local id, create process, generate credentials. Ditto for the GSI-enabled GRAM server at Site B (Unix).

3. Each process runs on its site's computer under a local id and holds a restricted proxy. A Kerberos ticket is also issued where the site uses Kerberos.

4. Communication* takes place between the two processes.

5. A remote file access request* goes to the GSI-enabled FTP server in front of the storage system at Site C (Kerberos). The server will: authorize, map to local id, access file.

* With mutual authentication


Review of public key cryptography

Review of Public Key Cryptography

  • Asymmetric keys

    • A private key, kept secret by its owner, is used to encrypt (sign) data.

    • The matching public key, which can be shared freely, decrypts (verifies) data encrypted with the private key.

  • An X.509 certificate includes…

    • Someone’s subject name (user ID)

    • Their public key

    • A “signature” from a Certificate Authority (CA) that:

      • Proves that the certificate came from the CA.

      • Vouches for the subject name

      • Vouches for the binding of the public key to the subject


Public key based authentication

Public Key Based Authentication

  • User sends certificate over the wire.

  • Other end sends user a challenge string.

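  • User encodes (signs) the challenge string with the private key

    • Possession of private key means you can authenticate as subject in certificate

  • Other end uses the public key from the certificate to decode the response.

    • If the decoded response matches the challenge, the user holds the private key of the subject named in the certificate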

  • Treat your private key carefully!!

    • Private key is stored only in well-guarded places, and only in encrypted form
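
The challenge-response exchange described on this slide can be sketched with textbook RSA. This is only a toy illustration: the tiny primes, the function names, and the lack of padding or hashing are assumptions made for readability, and an actual GSI deployment relies on SSL/TLS with X.509 certificates instead.

```python
import secrets

# Toy RSA key pair built from small primes (illustration only; real keys
# are thousands of bits long and are never used without padding).
P, Q = 61, 53
N = P * Q                               # public modulus
E = 17                                  # public exponent (in the certificate)
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (Python 3.8+)


def make_challenge() -> int:
    """Verifier picks a random challenge smaller than the modulus."""
    return secrets.randbelow(N - 2) + 2


def respond(challenge: int) -> int:
    """User encodes the challenge with the private key."""
    return pow(challenge, D, N)


def verify(challenge: int, response: int) -> bool:
    """Verifier decodes the response with the public key and compares."""
    return pow(response, E, N) == challenge
```

A verifier would call `make_challenge()`, send the value to the user, and accept the user only if `verify(challenge, response)` returns `True`.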


User proxies

User Proxies

  • Minimize exposure of user’s private key

  • A temporary, X.509 proxy credential for use by our computations

    • We call this a user proxy certificate

    • Allows process to act on behalf of user

    • User-signed user proxy cert stored in local file

    • Created via “grid-proxy-init” command

  • Proxy’s private key is not encrypted

    • Rely on file system security, proxy certificate file must be readable only by the owner
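
Because the proxy's private key is stored unencrypted, a tool might check the file permissions before using it. The following is a minimal sketch for POSIX systems; the function name is hypothetical and this is not the check the Globus Toolkit itself performs.

```python
import os
import stat


def is_private_to_owner(path: str) -> bool:
    """Return True if `path` is a regular file with no group/other access bits."""
    mode = os.stat(path).st_mode
    group_or_other = stat.S_IRWXG | stat.S_IRWXO
    return stat.S_ISREG(mode) and (mode & group_or_other) == 0
```

A file with mode 0600 passes this check, while one with mode 0644 does not.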


Delegation

  • Remote creation of a user proxy

  • Results in a new private key and X.509 proxy certificate, signed by the original key

  • Allows remote process to act on behalf of the user

  • Avoids sending passwords or private keys across the network


Gsi applications

GSI Applications

  • Globus Toolkit™ uses GSI for authentication

  • Many Grid tools, directly or indirectly, e.g.

    • Condor-G, SRB, MPICH-G2, Cactus, GDMP, …

  • Commercial and open source tools, e.g.

    • ssh, ftp, cvs, OpenLDAP, OpenAFS

    • SecureCRT (Win32 ssh client)

  • And since we use standard X.509 certificates, they can also be used for

    • Web access, LDAP server access, etc.


The globus toolkit resource management services

The Globus Toolkit™: Resource Management Services

The Globus Project™

Argonne National Laboratory

USC Information Sciences Institute

http://www.globus.org


The challenge

The Challenge

  • Enabling secure, controlled remote access to heterogeneous computational resources and management of remote computation

    • Authentication and authorization

    • Resource discovery & characterization

    • Reservation and allocation

    • Computation monitoring and control

  • Addressed by new protocols & services

    • GRAM protocol as a basic building block

    • Resource brokering & co-allocation services

    • GSI for security, MDS for discovery


Resource management

Resource Management

  • The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started on remote resources, despite local heterogeneity

  • Resource Specification Language (RSL) is used to communicate requirements

  • A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services

    • Integrated with Condor, PBS, MPICH-G2, …
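
As a rough illustration of how requirements are communicated in RSL, the sketch below renders a Python dict as a conjunction of `(attribute=value)` relations. The helper name is hypothetical and the quoting rule is deliberately simplified; the full RSL grammar has more operators and escaping rules than are shown here.

```python
def to_rsl(requirements: dict) -> str:
    """Render a dict as an RSL conjunction such as &(executable=/bin/date)(count=2)."""
    parts = []
    for name, value in requirements.items():
        text = str(value)
        # Simplified assumption: wrap values that contain spaces in double quotes.
        if " " in text:
            text = f'"{text}"'
        parts.append(f"({name}={text})")
    return "&" + "".join(parts)
```

For example, `to_rsl({"executable": "/bin/hostname", "count": 4})` yields a string that a GRAM client could pass along with a job request.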


Resource management architecture

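Resource Management Architecture

  • Application: describes its requirements in RSL

  • Broker: performs RSL specialization, exchanging queries & info with the Information Service, and produces ground RSL

  • Co-allocator: turns ground RSL into simple ground RSL requests

  • GRAM (one per resource): passes each simple ground RSL request to the local resource managers

  • Local resource managers: LSF, Condor, NQE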


Globus toolkit implementation

Globus Toolkit Implementation

  • Gatekeeper

    • Single point of entry

    • Authenticates user, maps to local security environment, runs service

    • In essence, a “secure inetd”

  • Job manager

    • A gatekeeper service

    • Layers on top of local resource management system (e.g., PBS, LSF, etc.)

    • Handles remote interaction with the job
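
The gatekeeper's "map to local security environment" step is commonly driven by a grid-mapfile that pairs certificate subject names with local accounts. Below is a minimal parsing sketch, assuming one quoted subject followed by one local user name per line; the sample entries in the usage note are invented for illustration.

```python
import shlex


def parse_gridmap(text: str) -> dict:
    """Map certificate subject names to local user names."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = shlex.split(line)  # honours the quotes around the subject
        if len(fields) >= 2:
            mapping[fields[0]] = fields[1]
    return mapping
```

Given a line such as `"/O=Grid/OU=Example/CN=Jane Doe" jdoe`, the returned dict maps the full subject string to `jdoe`.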


Gram components

GRAM Components

  • Client

    • MDS client API calls to locate resources (answered by MDS: Grid Index Info Server)

    • MDS client API calls to get resource info (answered by MDS: Grid Resource Info Server)

    • GRAM client API calls to request resource allocation and process creation

    • GRAM client API state change callbacks

  • Site boundary, crossed through the Grid Security Infrastructure

  • Gatekeeper: receives the request and creates the Job Manager

  • Job Manager: parses the request with the RSL Library, asks the Local Resource Manager to allocate & create processes, then monitors & controls them

  • MDS: Grid Resource Info Server: queries current status of resource


Job submission interfaces

Job Submission Interfaces

  • Globus Toolkit includes several command line programs for job submission

    • globus-job-run: Interactive jobs

    • globus-job-submit: Batch/offline jobs

    • globusrun: Flexible scripting infrastructure

  • Others are building better interfaces

    • General purpose

      • Condor-G, PBS, GRD, Hotpage, etc

    • Application specific

      • ECCE’, Cactus, Web portals


The globus toolkit information services

The Globus Toolkit™: Information Services

The Globus Project™

Argonne National Laboratory

USC Information Sciences Institute

http://www.globus.org


Grid information services

Grid Information Services

  • System information is critical to operation of the grid and construction of applications

    • What resources are available?

      • Resource discovery

    • What is the “state” of the grid?

      • Resource selection

    • How to optimize resource use?

      • Application configuration and adaptation

  • We need a general information infrastructure to answer these questions


Examples of useful information

Examples of Useful Information

  • Characteristics of a compute resource

    • IP address, software available, system administrator, networks connected to, OS version, load

  • Characteristics of a network

    • Bandwidth and latency, protocols, logical topology

  • Characteristics of the Globus infrastructure

    • Hosts, resource managers


Grid information facts of life

Grid Information: Facts of Life

  • Information is always old

    • Time of flight, changing system state

    • Need to provide quality metrics

  • Distributed state hard to obtain

    • Complexity of global snapshot

  • Components will fail

  • Scalability and overhead

  • Many different usage scenarios

    • Heterogeneous policy, different information organizations, etc.


Grid information service

Grid Information Service

  • Provide access to static and dynamic information regarding system components

  • A basis for configuration and adaptation in heterogeneous, dynamic environments

  • Requirements and characteristics

    • Uniform, flexible access to information

    • Scalable, efficient access to dynamic data

    • Access to multiple information sources

    • Decentralized maintenance


The gis problem many information sources many views

The GIS Problem: Many Information Sources, Many Views

[Figure: many resources (R), three virtual organizations (VO A, VO B, VO C), and question marks]


Information protocols

Information Protocols

  • Grid Resource Registration Protocol

    • Support information/resource discovery

    • Designed to support machine/network failure

  • Grid Resource Inquiry Protocol

    • Query resource description server for information

    • Query aggregate server for information

    • LDAP V3.0 in Globus 1.1.3


Gis architecture

GIS Architecture

  • Users query Customized Aggregate Directories (A) using the Enquiry Protocol

  • Standard Resource Description Services (R) register with the aggregate directories using the Registration Protocol


Metacomputing directory service

Metacomputing Directory Service

  • Uses LDAP as the inquiry protocol

  • Access information in a distributed directory

    • Directory represented by collection of LDAP servers

    • Each server optimized for particular function

  • Directory can be updated by:

    • Information providers and tools

    • Applications (i.e., users)

    • Backend tools which generate info on demand

  • Information dynamically available to tools and applications


Two classes of mds servers

Two Classes Of MDS Servers

  • Grid Resource Information Service (GRIS)

    • Supplies information about a specific resource

    • Configurable to support multiple information providers

    • LDAP as inquiry protocol

  • Grid Index Information Service (GIIS)

    • Supplies collection of information which was gathered from multiple GRIS servers

    • Supports efficient queries against information which is spread across multiple GRIS servers

    • LDAP as inquiry protocol


Grid resource information service

Grid Resource Information Service

  • Server which runs on each resource

    • Given the resource DNS name, you can find the GRIS server (well known port = 2135)

  • Provides resource specific information

    • Much of this information may be dynamic

      • Load, process information, storage information, etc.

      • GRIS gathers this information on demand

  • “White pages” lookup of resource information

    • Ex: How much memory does machine have?

  • “Yellow pages” lookup of resource options

    • Ex: Which queues on the machine allow large jobs?


Grid index information service

Grid Index Information Service

  • GIIS describes a class of servers

    • Gathers information from multiple GRIS servers

    • Each GIIS is optimized for particular queries

      • Ex1: Which Alliance machines are SGIs with more than 16 processors?

      • Ex2: Which Alliance storage servers have >100Mbps bandwidth to host X?

    • Akin to web search engines

  • Organization GIIS

    • The Globus Toolkit ships with one GIIS

    • Caches GRIS info with a long update interval

      • Useful for queries across an organization that rely on relatively static information (Ex1 above)

  • Can be merged into GRIS
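
The idea of an index that gathers records from several per-resource servers and caches them can be sketched as follows. This is an in-memory illustration only, not the LDAP-based MDS implementation; the class name, the `max_age` parameter, and the provider callables are all assumptions.

```python
import time
from typing import Callable, Dict, List, Optional


class IndexService:
    """Caches records fetched from per-resource providers (GRIS stand-ins)."""

    def __init__(self, providers: Dict[str, Callable[[], dict]], max_age: float):
        self.providers = providers
        self.max_age = max_age          # seconds before the cache is refreshed
        self._cache: Dict[str, dict] = {}
        self._fetched_at: Optional[float] = None

    def records(self, now: Optional[float] = None) -> Dict[str, dict]:
        """Return cached records, refetching from all providers when stale."""
        now = time.time() if now is None else now
        if self._fetched_at is None or now - self._fetched_at > self.max_age:
            self._cache = {name: fetch() for name, fetch in self.providers.items()}
            self._fetched_at = now
        return self._cache

    def search(self, predicate: Callable[[dict], bool],
               now: Optional[float] = None) -> List[str]:
        """Return the sorted names of resources whose record matches."""
        return sorted(n for n, rec in self.records(now).items() if predicate(rec))
```

A query in the spirit of Ex1 would be `index.search(lambda r: r["cpus"] > 16)`, which is answered from the cache as long as the cached records are younger than `max_age`.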


Logical mds deployment

Logical MDS Deployment

Grads

Gusto

GIIS

ISI

GRISes


Example discovering cpu load

Example: Discovering CPU Load

  • Retrieve CPU load fields of compute resources

    % grid-info-search -L "(objectclass=GlobusComputeResource)" \
        dn cpuload1 cpuload5 cpuload15

dn: hn=lemon.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory, o=Globus, c=US

cpuload1: 0.48

cpuload5: 0.20

cpuload15: 0.03

dn: hn=tuva.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory, o=Globus, c=US

cpuload1: 3.11

cpuload5: 2.64

cpuload15: 2.57
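
Output in this `attribute: value` form is easy to post-process. The sketch below parses the two entries shown on the slide and picks the host with the lowest load; the function names are hypothetical, and the parser assumes every attribute other than `dn` is numeric, which holds for this sample only.

```python
SAMPLE = """\
dn: hn=lemon.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory, o=Globus, c=US
cpuload1: 0.48
cpuload5: 0.20
cpuload15: 0.03

dn: hn=tuva.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory, o=Globus, c=US
cpuload1: 3.11
cpuload5: 2.64
cpuload15: 2.57
"""


def parse_entries(output: str) -> list:
    """Turn 'attribute: value' lines into one dict per dn."""
    entries, current = [], None
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "dn":
            current = {"dn": value}
            entries.append(current)
        elif current is not None:
            current[key] = float(value)  # assumes numeric attributes
    return entries


def least_loaded(entries: list, field: str = "cpuload5") -> str:
    """Return the dn of the entry with the smallest value of `field`."""
    return min(entries, key=lambda e: e[field])["dn"]
```

Calling `least_loaded(parse_entries(SAMPLE))` selects the lemon.mcs.anl.gov entry, since its 5-minute load is the lower of the two.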


The globus toolkit data management services

The Globus Toolkit™: Data Management Services

The Globus Project™

Argonne National Laboratory

USC Information Sciences Institute

http://www.globus.org


Data intensive issues include

Data Intensive Issues Include …

  • Harness [potentially large numbers of] data, storage, network resources located in distinct administrative domains

  • Respect local and global policies governing what can be used for what

  • Schedule resources efficiently, again subject to local and global constraints

  • Achieve high performance, with respect to both speed and reliability

  • Catalog software and virtual data


Desired data grid functionality

Desired Data Grid Functionality

  • High-speed, reliable access to remote data

  • Automated discovery of “best” copy of data

  • Manage replication to improve performance

  • Co-schedule compute, storage, network

  • “Transparency” with respect to delivered performance

  • Enforce access control on data

  • Allow representation of “global” resource allocation policies


A model architecture for data grids

A Model Architecture for Data Grids

1. The Application sends an Attribute Specification to the Metadata Catalog and receives a Logical Collection and Logical File Name

2. The Replica Catalog maps the logical name to Multiple Locations

3. Replica Selection picks the Selected Replica using Performance Information & Predictions from MDS and NWS

4. Data moves over a GridFTP Control Channel and a GridFTP Data Channel

Storage shown at Replica Location 1, Replica Location 2, and Replica Location 3: Disk Cache, Tape Library, Disk Array, Disk Cache


Globus toolkit components

Globus Toolkit Components

Two major Data Grid components:

1. Data Transport and Access

  • Common protocol

    • Secure, efficient, flexible, extensible data movement

  • Family of tools supporting this protocol

2. Replica Management Architecture

  • Simple scheme for managing:

    • multiple copies of files

    • collections of files


Access transport protocol requirements

Access/Transport Protocol Requirements

  • Suite of communication libraries and related tools that support

    • GSI, Kerberos security

    • Third-party transfers

    • Parameter set/negotiate

    • Partial file access

    • Reliability/restart

    • Large file support

    • Data channel reuse

  • All based on a standard, widely deployed protocol

  • Integrated instrumentation

  • Logging/audit trail

  • Parallel transfers

  • Striping (cf DPSS)

  • Policy-based access control

  • Server-side computation

  • Proxies (firewall, load bal)


And the protocol is gridftp

And The Protocol Is … GridFTP

  • Why FTP?

    • Ubiquity enables interoperation with many commodity tools

    • Already supports many desired features, easily extended to support others

    • Well understood and supported

  • We use the term GridFTP to refer to

    • Transfer protocol which meets requirements

    • Family of tools which implement the protocol

  • Note GridFTP > FTP

  • Note that despite name, GridFTP is not restricted to file transfer!


Gridftp basic approach

GridFTP: Basic Approach

  • FTP protocol is defined by several IETF RFCs

  • Start with most commonly used subset

    • Standard FTP: get/put etc., 3rd-party transfer

  • Implement standard but often unused features

    • GSS binding, extended directory listing, simple restart

  • Extend in various ways, while preserving interoperability with existing servers

    • Striped/parallel data channels, partial file, automatic & manual TCP buffer setting, progress monitoring, extended restart
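
Parallel data channels and partial file access both rest on describing a transfer as byte ranges. The sketch below shows one plausible way to divide a file among several streams; the function name and the even-split policy are assumptions for illustration, not the GridFTP wire protocol.

```python
def split_ranges(size: int, streams: int) -> list:
    """Split [0, size) into at most `streams` contiguous (offset, length) pairs."""
    if size < 0 or streams < 1:
        raise ValueError("size must be >= 0 and streams must be >= 1")
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            break  # more streams than bytes: stop early
        ranges.append((offset, length))
        offset += length
    return ranges
```

Each `(offset, length)` pair could then be fetched over its own data channel and written at the same offset on the receiving side.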


Replica management

Replica Management

  • Maintain a mapping between logical names for files and collections and one or more physical locations

  • Important for many applications

    • Example: CERN HLT data

      • Multiple petabytes of data per year

      • Copy of everything at CERN (Tier 0)

      • Subsets at national centers (Tier 1)

      • Smaller regional centers (Tier 2)

      • Individual researchers will have copies


Replica catalog structure a climate modeling example

Replica Catalog Structure: A Climate Modeling Example

  • Replica Catalog

    • Logical Collection: CO2 measurements 1998

      • Filename: Jan 1998

      • Filename: Feb 1998

      • Location: jupiter.isi.edu

        • Filename: Mar 1998

        • Filename: Jun 1998

        • Filename: Oct 1998

        • Protocol: gsiftp

        • UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate

      • Location: sprite.llnl.gov

        • Filename: Jan 1998

        • Filename: Dec 1998

        • Protocol: ftp

        • UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi

      • Logical File Parent

        • Logical File Jan 1998

        • Logical File Feb 1998

        • Size: 1468762

    • Logical Collection: CO2 measurements 1999
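
The mapping from a logical file name to physical locations can be sketched from this slide's example. The data below copies the two locations and their listed filenames; the rule of appending the URL-encoded filename to the UrlConstructor is an assumption made for illustration.

```python
from urllib.parse import quote

# Locations of the "CO2 measurements 1998" collection, as listed on the slide.
LOCATIONS = [
    {
        "url_constructor": "gsiftp://jupiter.isi.edu/nfs/v6/climate",
        "filenames": {"Mar 1998", "Jun 1998", "Oct 1998"},
    },
    {
        "url_constructor": "ftp://sprite.llnl.gov/pub/pcmdi",
        "filenames": {"Jan 1998", "Dec 1998"},
    },
]


def physical_urls(logical_name: str, locations=LOCATIONS) -> list:
    """Return one URL per location that holds a copy of the logical file."""
    return [
        f'{loc["url_constructor"]}/{quote(logical_name)}'
        for loc in locations
        if logical_name in loc["filenames"]
    ]
```

A lookup for a file that no location lists returns an empty list, which a caller would treat as "no replica known".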


Replica catalog services as building blocks examples

Replica Catalog Services as Building Blocks: Examples

  • Combine with information service to build replica selection services

    • E.g. “find best replica” using performance info from NWS and MDS

    • Use of LDAP as common protocol for info and replica services makes this easier

  • Combine with application managers to build data distribution services

    • E.g., build new replicas in response to frequent accesses
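
A "find best replica" service could combine replica URLs with per-host performance predictions. The following is a minimal sketch, assuming the predictions arrive as a dict of host name to predicted bandwidth; the function name and that data shape are hypothetical and do not reflect the NWS or MDS interfaces.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse


def best_replica(replicas: List[str],
                 bandwidth_mbps: Dict[str, float]) -> Optional[str]:
    """Pick the replica URL whose host has the highest predicted bandwidth."""
    known = [r for r in replicas if urlparse(r).hostname in bandwidth_mbps]
    if not known:
        return None  # no performance information for any replica
    return max(known, key=lambda r: bandwidth_mbps[urlparse(r).hostname])
```

Replicas on hosts without a prediction are ignored here; a real selector might instead fall back to a default estimate.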


Replica catalog directions

Replica Catalog Directions

  • Many data grid applications do not require tight consistency semantics

    • At any given time, you may not be able to discover all copies

    • When a new copy is made, it may not be immediately recognized as available

  • Allows for much more scalable design

    • Distributed catalogs: local catalogs which maintain their own LFN -> PFN mapping

    • Soft-state updates as basis for building various configurations of global catalogs
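
Soft-state updates mean that a registration simply expires unless it is refreshed, so stale copies disappear without explicit deletion. The sketch below illustrates that idea under assumed names (`SoftStateCatalog`, `ttl`); it is not the design of any particular replica catalog.

```python
class SoftStateCatalog:
    """LFN -> PFN mappings that vanish unless re-registered within `ttl` seconds."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._expires = {}  # (lfn, pfn) -> expiry time

    def register(self, lfn: str, pfn: str, now: float) -> None:
        """Add or refresh a mapping; local catalogs would call this periodically."""
        self._expires[(lfn, pfn)] = now + self.ttl

    def lookup(self, lfn: str, now: float) -> list:
        """Return the unexpired physical names for a logical name."""
        return sorted(
            pfn for (name, pfn), expiry in self._expires.items()
            if name == lfn and expiry > now
        )
```

This matches the relaxed consistency described above: a lookup may miss a copy that has not yet been registered, and a copy whose site stopped sending updates drops out after `ttl`.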


Virtual data in action

Virtual Data in Action

  • Data request may

    • Access local data

    • Compute locally

    • Compute remotely

    • Access remote data

  • Scheduling subject to local & global policies

  • Local autonomy


Evolution of grid technologies

Evolution of Grid Technologies

  • Initial exploration (1996-1999; Globus 1.0)

    • Extensive application experiments; core protocols

  • Data Grids (1999-??; Globus 2.0+)

    • Large-scale data management and analysis

  • Open Grid Services Architecture (2001-??, Globus 3.0)

    • Integration w/ Web services, hosting environments, resource virtualization

    • Databases, higher-level services

  • Radically scalable systems (2003-??)

    • Sensors, wireless, ubiquitous computing


Grids and open standards

Grids and Open Standards

[Figure: increased functionality and standardization plotted against time, in four stages]

  • Custom solutions

  • Globus Toolkit: de facto standards; GGF: GridFTP, GSI; builds on X.509, LDAP, FTP, …

  • Open Grid Services Arch: Web services; GGF: OGSI, … (+ OASIS, W3C); multiple implementations, including Globus Toolkit

  • App-specific Services


Web services

“Web Services”

  • Increasingly popular standards-based framework for accessing network applications

    • W3C standardization; Microsoft, IBM, Sun, others

  • WSDL: Web Services Description Language

    • Interface Definition Language for Web services

  • SOAP: Simple Object Access Protocol

    • XML-based RPC protocol; common WSDL target

  • WS-Inspection

    • Conventions for locating service descriptions

  • UDDI: Universal Desc., Discovery, & Integration

    • Directory for Web services
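
To make the "XML-based RPC" description concrete, the sketch below builds a SOAP 1.1 request envelope with the Python standard library. The operation name, parameter, and `urn:example:jobs` namespace in the usage note are invented for illustration; a real service would take them from its WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(operation: str, namespace: str, params: dict) -> str:
    """Return a SOAP envelope whose body holds one RPC-style call element."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)  # one child per parameter
    return ET.tostring(envelope, encoding="unicode")
```

For instance, `build_request("getStatus", "urn:example:jobs", {"jobId": 42})` produces the XML text that a client would send in the body of an HTTP POST.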


The need to support transient service instances

The Need to Support Transient Service Instances

  • “Web services” address discovery & invocation of persistent services

    • Interface to persistent state of entire enterprise

  • In Grids, must also support transient service instances, created/destroyed dynamically

    • Interfaces to the states of distributed activities

    • E.g. workflow, video conf., dist. data analysis

  • Significant implications for how services are managed, named, discovered, and used

    • In fact, much of our work is concerned with the management of service instances


Open grid services architecture

Open Grid Services Architecture

  • Service orientation to virtualize resources

  • From Web services:

    • Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency

  • Building on Globus Toolkit:

    • Grid service: semantics for service interactions

    • Management of transient instances (& state)

    • Factory, Registry, Discovery, other services

    • Reliable and secure transport

  • Multiple hosting targets: J2EE, .NET, “C”, …


Open grid services architecture1

Open Grid Services Architecture

Layers, top to bottom:

  • More specialized & domain-specific services (other schemas)

  • OGSA services: registry, authorization, monitoring, data access, management, etc. (OGSA schemas)

  • Open Grid Services Infrastructure

  • Web Services

  • Host. Env. & Protocol Bindings (Hosting Environment, Transport, Protocol)

Priorities:

  • Data access and integration

  • Security

  • SLA negotiation

  • Manageability

  • Monitoring


Ogsa service model

OGSA Service Model

  • System comprises (typically a few) persistent services & (potentially many) transient services

  • All services adhere to specified Grid service interfaces and behaviors

    • Reliable invocation, lifetime management, discovery, authorization, notification, upgradeability, concurrency, manageability

  • Interfaces for managing Grid service instances

    • Factory, registry, discovery, lifetime, etc.

      => Reliable, secure mgmt of distributed state


The grid service

The Grid Service

  • A (potentially transient) Web service with specified interfaces & behaviors, including

    • Creation (Factory)

    • Global naming (GSH) & references (GSR)

    • Lifetime management

    • Registration & Discovery

    • Authorization

    • Notification

    • Concurrency

    • Manageability
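
Creation through a factory and lifetime management of transient instances can be sketched as below. The class and method names are hypothetical, handles are plain strings rather than real GSHs, and the soft-lifetime policy (an instance lives until its termination time unless extended) is an assumption used for illustration rather than the OGSI specification.

```python
import itertools


class Factory:
    """Creates transient service instances and tracks their termination times."""

    def __init__(self, base_handle: str):
        self.base_handle = base_handle
        self._ids = itertools.count(1)
        self.instances = {}  # handle -> termination time

    def create(self, now: float, lifetime: float) -> str:
        """Create an instance and return its handle."""
        handle = f"{self.base_handle}/{next(self._ids)}"
        self.instances[handle] = now + lifetime
        return handle

    def extend(self, handle: str, now: float, lifetime: float) -> None:
        """Keep an instance alive by pushing its termination time forward."""
        if handle not in self.instances:
            raise KeyError(handle)
        self.instances[handle] = now + lifetime

    def sweep(self, now: float) -> list:
        """Destroy instances whose termination time has passed."""
        expired = sorted(h for h, t in self.instances.items() if t <= now)
        for handle in expired:
            del self.instances[handle]
        return expired
```

A client that stops extending its instance lets the hosting environment reclaim it on the next sweep, without any explicit destroy call.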


Use of web services 1

Use of Web Services (1)

  • A Grid service interface is a WSDL portType

  • A Grid service definition is a WSDL extension (serviceType) containing:

    • A set of one or more portTypes supported by the service

    • portType & serviceType compatibility statements, to support upgradability

      • For discovery of compatible services when interfaces are upgraded

    • Implementation version information


Use of web services 2

Use of Web Services (2)

  • A GSR is a WSDL document with extensions:

    • Extension to service element to reference serviceType

    • Service element extensions to carry the GSH, and the expiration time of the GSR

  • A GSH is a URL, with the following properties:

    • Globally unique for all time

    • http get on GSH + “.wsdl” returns GSR

    • Can derive GSH to Mapper from it

  • Registry returns WS-Inspection documents


Grid computing 3 special topics in computer engineering

Grids: An Emerging, Common Computing and Data Infrastructure for Science and Engineering

scientific instruments

tertiary storage

Condor pools of workstations

Web Portal Access to Application and Grid Services

Specialized Portal Access (high performance displays, PDAs, etc.)

. . .

Portals

Data Management: replication and metadata

Resource Brokering

Accounting

Applications

Fault Management

Workflow Management

Services

Services Building Blocks

Encapsulation as Web Services

Encapsulation for Script Based Services

Encapsulation as Java Based Services

Resource Discovery

Scheduling and Access to Computing

Uniform Data Access

Monitoring and Events

Basic Grid Functions

Grid Communication Functions

transport services

security services

Communications

Operational Support

space-based networks

...

optical networks

Internet

Distributed Resources

national supercomputer facilities

clusters


Grid computing 3 special topics in computer engineering

Grids: A Common Computing and Data Infrastructure for Science and Engineering

Portals: Services Presented to the Users to Accomplish Tasks

STS/SLI Mission

Analysis

ISS Training

ES Modeling

MER/CIP

Aviation Capacity

User Environment Portals

Collaboration Portals

Application Domain Specific Portals

Application Domain Independent Portals

Grid Web Services: Grid Functions and Application Functions Packaged for Building Portals

Instrument & Sensor Gateways

Computational Simulation

Workflow

Management

Experiment

Management

Flight Simulation

Programming Services

Data Processing & Analysis

Zooming

System Models

Archive Gateways

Coupling

Collaboration Services

Visualization

Monitoring

Events

Data Management

Domain Specific Web Services: Encapsulated Applications

Domain Independent Grid Web Services

Grid Common Services: Uniform Access, Security, and Management of Compute, Data, and Instrument Resources

Multi-Site Compute, Data, and Instrument Resources


Combining grid and web services

Combining Grid and Web Services

  • Clients

    • Web Browser

    • PDA

  • Application Portals

    • Discipline / Application Specific Portals (e.g. SDSC TeleScience)

    • Problem Solving Environments (AVS, SciRun, Cactus)

    • Environment Management (LaunchPad, HotPage)

  • Web Services

    • Job Submission / Control

    • File Transfer

    • Data Management

    • Monitoring

    • Events

    • Credential Management

    • Workflow Management

    • other services: visualization, interface builders, collaboration tools, numerical grid generators, etc.

    • composition frameworks (e.g. XCAT)

    • CoG Kits implementing Web Services in servlets, servers, etc.

    • Grid Web Service Description (WSDL) & Discovery (UDDI)

    • Python, Java, etc., JSPs

    • Apache Tomcat & WebSphere & Cold Fusion = JVM + servlet instantiation + routing

    • Apache SOAP, .NET, etc.

  • Grid Services: Collective and Resource Access

    • Grid ssh

    • GRAM

    • Condor-G

    • SRB/Metadata Catalogue

    • GridFTP

    • Data Replica and Metadata Catalog

    • Grid Monitoring Architecture

    • Grid X.509 Certification Authority

    • MPI

    • Secure, Reliable Group Comm.

    • Grid Information Service

    • ……

  • Resources

    • Compute (many)

    • Storage (many)

    • Communication

    • Instruments (various)

  • Connections between the columns

    • http, https, etc.

    • X Windows

    • CORBA

    • XML / SOAP over Grid Security Infrastructure

    • Grid Protocols and Grid Security Infrastructure


For more information

For More Information

  • Globus Project™

    • www.globus.org

  • Grid Forum

    • www.gridforum.org

  • Book (Morgan Kaufmann)

    • www.mkp.com/grids

