Globus

Presented by: Yayati Kasralikar for CPA 5937

Motivational Example

[Figure: cancer images flow between a very large database of cancer images, a high-performance machine, cancer image data mining software, and data pre-processing software; the resources are marked R.]

What is Grid?
  • Coordinates resources that are not subject to centralized control.
  • Uses standard, open, general-purpose protocols and interfaces.
  • Delivers nontrivial qualities of service.
  • Let’s examine some technologies against this checklist:
    • Clusters: subject to centralized control.
    • P2P systems (e.g. Gnutella): do not use open, standard protocols.
    • Web: no coordinated use of resources.

Why use Grid?
  • A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour.
  • 1,000 physicists worldwide pool resources for peta-op analyses of petabytes of data.
  • An insurance company mines data from partner hospitals for fraud detection.
  • An application service provider offloads excess load to a compute cycle provider.
Virtual Organization (VO)
  • A dynamic set of individuals or institutions sharing resources for problem solving.

[Figure: virtual organizations VO A, VO B, and VO C, each grouping a subset of a shared pool of resources (R).]

Grid Characteristics
  • Scale and Resource Selection
    • Particular applications select resources from a very large collection according to criteria such as connectivity, cost, security, and reliability.
  • Heterogeneity at multiple levels
    • Heterogeneity ranges from physical devices and system software to scheduling and usage policies.
  • Dynamic and unpredictable behavior
    • Behavior and performance of shared resources vary over time.
  • Multiple administrative domains
    • Challenging security problem.
Globus Initiative
  • Provides basic infrastructure, protocols, services, APIs, and SDKs for Grid computing.
    • Protocols: focus on externals (interactions) rather than internals (resource characteristics) (e.g. GRIP, IP).
    • Services: protocol + behavior (e.g. information services).
    • APIs and SDKs: help application developers build complex applications (e.g. GSS API, JDBC API, JNDI SDK); they affect application robustness, correctness, and development and maintenance cost.
  • Globus Toolkit: a community-based, open-architecture, open-source set of services and software libraries that supports Grids and Grid applications.
Layered Grid Architecture

  • Grid Protocol Architecture (top to bottom): Application, Collective, Resource, Connectivity, Fabric.
  • Internet Protocol Architecture (top to bottom): Application, Transport, Internet, Link.

Connectivity Layer

[Figure: the Grid Protocol Architecture stack (Application, Collective, Resource, Connectivity, Fabric), with the Connectivity layer annotated.]
  • Grid Security Infrastructure (GSI)
  • Nexus interface

Resource Layer

[Figure: the Grid Protocol Architecture stack (Application, Collective, Resource, Connectivity, Fabric), with the Resource layer annotated.]
  • Resource management: Grid Resource Access Management (GRAM)
  • Grid information services: Grid Resource Information Protocol (GRIP) and Grid Resource Registration Protocol (GRRP)
  • Data transfer: GridFTP

Collective Layer

[Figure: the Grid Protocol Architecture stack (Application, Collective, Resource, Connectivity, Fabric), with the Collective layer annotated.]
  • Directory services
  • Data replication services
  • Monitoring services
  • Scheduling and brokering services

Application Layer

[Figure: what each layer of the Grid Protocol Architecture offers to applications.]
  • Application: languages and frameworks
  • Collective: collective APIs and SDKs, collective service protocols
  • Resource: resource APIs and SDKs, resource service protocols
  • Connectivity: connectivity APIs, connectivity protocols
  • Fabric

Communication Services

  • Diverse communication needs.
  • IP does not meet these needs, while MPI does not provide a rich range of communication abstractions.
  • Two abstractions: the communication link and the remote service request (RSR).
    • An RSR is a one-sided, asynchronous remote procedure call that transfers data from a startpoint (SP) to one or more endpoints (EPs) and integrates it into the process containing each endpoint.

[Figure: Nexus communication mechanism, in which communication links connect startpoints (SP) to endpoints (EP).]
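
To make the startpoint/endpoint idea concrete, here is a minimal in-process Python sketch. It is not the Nexus API: the `Endpoint`, `Startpoint`, `rsr`, and `poll` names are assumptions chosen for illustration, and a plain queue stands in for the network. It shows only the two properties named above: the sender does not wait (one-sided, asynchronous), and the data is handed to a handler inside the process that owns the endpoint.

```python
from collections import deque


class Endpoint:
    """Receiving end of a communication link, owned by one process."""

    def __init__(self, handlers):
        self.handlers = handlers      # handler name -> callable(data)
        self.inbox = deque()          # RSRs delivered but not yet run

    def poll(self):
        """Run the handlers for all pending RSRs; return how many ran."""
        ran = 0
        while self.inbox:
            name, data = self.inbox.popleft()
            self.handlers[name](data)
            ran += 1
        return ran


class Startpoint:
    """Sending end of a link; may be bound to one or more endpoints."""

    def __init__(self, *endpoints):
        self.endpoints = endpoints

    def rsr(self, handler_name, data):
        """One-sided and asynchronous: enqueue and return without waiting."""
        for endpoint in self.endpoints:
            endpoint.inbox.append((handler_name, data))
```

A startpoint bound to two endpoints delivers the same request to both; nothing runs on the receiving side until that side calls `poll()`.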

Resource Management

Challenging resource management problems:

  • site autonomy
    • resources are typically owned and operated by different organizations, in different administrative domains
  • heterogeneous substrate
    • different sites may use different local resource management systems
  • policy extensibility
    • A resource management solution must support the frequent development of new domain-specific management structures
  • co-allocation
    • using resources simultaneously at several sites
  • online control
    • substantial negotiation can be required to adapt application requirements to resource availability
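Resource Management Architecture
  • The application passes an RSL request to a broker, which exchanges queries and information with the information service and specializes the request into ground RSL.
  • A co-allocator splits the ground RSL into simple ground RSL requests.
  • Each simple ground RSL request goes to a GRAM, which sits in front of a local resource manager such as LSF, Condor, or NQE.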
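
The co-allocator's role can be sketched in a few lines of Python. This is an illustration only: the `Gram` class, the dictionary request format, and the all-or-nothing policy are assumptions made for the sketch, not the Globus interfaces, and it assumes at most one request part per site.

```python
class Gram:
    """Stand-in for a GRAM in front of one local resource manager."""

    def __init__(self, name, free_nodes):
        self.name = name
        self.free_nodes = free_nodes

    def can_run(self, request):
        return request["count"] <= self.free_nodes

    def submit(self, request):
        self.free_nodes -= request["count"]
        return f"{self.name}:{request['executable']}x{request['count']}"


def co_allocate(multi_request, grams):
    """Pair each simple request with its site's GRAM and start the parts
    only if every site can satisfy its part (assumed all-or-nothing policy)."""
    pairs = [(grams[part["manager"]], part) for part in multi_request]
    if not all(gram.can_run(part) for gram, part in pairs):
        return []
    return [gram.submit(part) for gram, part in pairs]
```

With 8 free LSF nodes and 4 free Condor nodes, a request for 5 + 4 nodes starts on both sites; repeating it afterwards returns an empty list and leaves both sites untouched.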

Resource Specification Language
  • Based on the syntax of LDAP filter specifications.
  • An RSL expression is built by combining simple parameter specifications and conditions with the following operators:
    • & : conjunction
    • | : disjunction
    • + : combination of two or more requests
  • Resource brokers, co-allocators, and resource managers can each define a set of parameters.
  • Example: I want “5 nodes with at least 256MB memory, or 10 nodes with 64MB, for myprog”
  • RSL: &(executable=myprog)(|(&(count=5)(memory>=256))(&(count=10)(memory>=64)))
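
To show how the & and | operators nest, here is a minimal Python sketch that encodes the myprog example as nested tuples, checks a candidate offer against it, and renders it as RSL-style text. The tuple encoding and the `satisfies` and `to_rsl` helpers are assumptions made for illustration; this is not the Globus RSL parser.

```python
import operator

# Hypothetical in-memory encoding of an RSL expression (not the Globus parser):
#   ("&", [clauses])  conjunction, ("|", [clauses])  disjunction,
#   (attribute, op, value)  a simple relation such as (memory >= 256).
OPS = {"=": operator.eq, ">=": operator.ge, "<=": operator.le}

REQUEST = ("&", [
    ("executable", "=", "myprog"),
    ("|", [
        ("&", [("count", "=", 5), ("memory", ">=", 256)]),
        ("&", [("count", "=", 10), ("memory", ">=", 64)]),
    ]),
])


def satisfies(expr, offer):
    """Return True if the offer (a dict of attributes) meets the expression."""
    if expr[0] == "&":
        return all(satisfies(e, offer) for e in expr[1])
    if expr[0] == "|":
        return any(satisfies(e, offer) for e in expr[1])
    attribute, op, value = expr
    return attribute in offer and OPS[op](offer[attribute], value)


def to_rsl(expr):
    """Render the nested structure as RSL-style text."""
    if expr[0] in ("&", "|"):
        return expr[0] + "".join("(" + to_rsl(e) + ")" for e in expr[1])
    attribute, op, value = expr
    return f"{attribute}{op}{value}"
```

An offer of 5 nodes with 512 MB satisfies the request through the first branch of the disjunction, while 5 nodes with 128 MB satisfies neither branch.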
Local Resource Management
  • The Globus Resource Allocation Manager (GRAM) provides the local component for resource management.
  • GRAM is responsible for:
    • Processing RSL specifications
    • Enabling remote monitoring and management of jobs
    • Periodically updating the information service
  • Two major software components of GRAM:
    • Gatekeeper: creates the Grid service
    • Job Manager Instance (JMI): resource management and job control
The Hour-Glass Principle
  • A simple, well-defined interface forms the neck.
  • It gives uniform access to diverse local implementations below and supports higher-level global services above.
Grid Security Characteristics
  • Single sign-on
    • Users must be able to authenticate just once to access multiple grid resources.
  • Delegation
    • A user must be able to endow a program with the ability to run on the user's behalf.
  • Integration with local security solutions
    • Interoperate with various local solutions.
  • User-based trust relationships
    • Resource providers must not be required to interact with each other to configure the security environment.
Security Policies
  • Grid Environment consists of multiple trust domains.
  • Operations confined to a single trust domain are subject to local security policy only.
  • Both local and global subjects exist. For each trust domain, there exists a partial mapping from global to local subjects.
  • Operations between entities located in different trust domains require mutual authentication.
  • An authenticated global subject mapped into a local subject is assumed to be equivalent to being locally authenticated as that local subject.
  • All access control decisions are made locally on the basis of the local subject.
  • A program or process is allowed to act on behalf of a user and be delegated a subset of the user's rights.
  • Processes running on behalf of the same subject within the same trust domain may share a single set of credentials.
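
The mapping and local-decision rules above can be pictured with a minimal Python sketch. Everything in it is invented for illustration: the subject names, the `SITE_MAP` and `LOCAL_ACL` tables, and the `authorize` function are assumptions, and authentication itself is taken as already done.

```python
# Hypothetical per-domain mapping table from global subject names to local
# account names; the names below are invented for illustration.
SITE_MAP = {
    "/O=Grid/CN=Alice Example": "alice",
    "/O=Grid/CN=Bob Example": "bexample",
}

# Hypothetical local policy: rights are attached to local subjects only.
LOCAL_ACL = {"alice": {"submit_job", "read_file"}, "bexample": {"read_file"}}


def authorize(global_subject, operation, subject_map, local_acl):
    """Map an already-authenticated global subject to a local subject,
    then make the access decision purely on the local subject."""
    local_subject = subject_map.get(global_subject)
    if local_subject is None:
        # The mapping is partial: unmapped global subjects get no local rights.
        return None, False
    allowed = operation in local_acl.get(local_subject, set())
    return local_subject, allowed
```

The function returns both the local subject and the decision, so a caller can see that a refusal came either from a missing mapping or from local policy.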
Globus Security Infrastructure

[Figure: the user and the user proxy hold Globus credentials; at each site a GRAM creates user processes through GSI, and GSI is layered over the site's local security mechanism (Kerberos at one site, public-key certificates at the other).]

Globus Security Scenario
  • The user signs on once via “grid-id”, which generates a proxy credential held by the user proxy (or the proxy credential is retrieved from an online repository).
  • The user proxy sends remote process creation requests to GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix).
  • Each GRAM server authorizes the request, maps it to a local id, creates the process, and generates credentials for it (a restricted proxy, plus a Kerberos ticket where the site uses Kerberos).
  • The processes at the two sites communicate with each other.
  • A process sends a remote file access request, using its restricted proxy, to the GSI-enabled FTP server at Site C (Kerberos), which authorizes the request, maps it to a local id, and accesses the file on the storage system.

Information Services
  • Initial Discovery and ongoing monitoring of Resources
  • Existing services such as LDAP and UDDI do not address the dynamic addition and deletion of resources.
  • Two Fundamental entities in Grid Information Service:
    • Highly distributed information providers.
    • Specialized aggregate directory services.
  • Both these entities speak two fundamental protocols.

[Figure: information provider services (P) register with VO-specific aggregate directories (D) using GRRP; discovery against the directories and lookup against the providers both use GRIP.]
Information Services - Protocols

Grid Information Protocol (GRIP)

  • Used to access information about entities
  • GRIP supports both discovery and enquiry
  • GRIP adopts the Lightweight Directory Access Protocol (LDAP).
  • LDAP defines a data model, a query language, and a wire protocol.

Grid Registration Protocol (GRRP)

  • Defines a notification mechanism that pushes simple information from one ‘element’ to another.
  • It is a soft-state protocol that is resilient to failures.
  • A GRRP message contains the name of the service, the type of notification, and a timestamp.
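
Soft state means a registration survives only while it keeps being refreshed, so a crashed provider simply ages out. The following Python sketch illustrates that idea; the `SoftStateRegistry` class, its method names, and the 30-second lifetime used in the usage note are assumptions, not part of GRRP. As a simplification, the receiver stamps each message on arrival rather than reading a timestamp from the message.

```python
import time


class SoftStateRegistry:
    """Toy aggregate directory that keeps registrations only while refreshed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock            # injectable so the example is testable
        self.entries = {}             # service name -> (notification type, last seen)

    def notify(self, service_name, notification_type):
        """Record a registration message; a repeat refreshes the entry."""
        self.entries[service_name] = (notification_type, self.clock())

    def live_services(self):
        """Drop entries not refreshed within the TTL and return the rest."""
        now = self.clock()
        self.entries = {
            name: (kind, seen)
            for name, (kind, seen) in self.entries.items()
            if now - seen <= self.ttl
        }
        return sorted(self.entries)
```

With a 30-second lifetime, a provider that re-registers at 20 seconds is still listed at 40 seconds, while one that registered once at time 0 has been dropped without any explicit deregistration.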
Hierarchical Discovery

  • Each directory uses GRIP and acts as an information provider.
  • The directories form a network of aggregate directories.
  • Center 1 Directory (O1) lists hosts hn=R1, hn=R2, hn=R3.
  • Center 2 Directory (O2) lists hosts hn=R1, hn=R2.
  • The VO Directory lists hn=R1,O=O1; hn=R2,O=O1; hn=R3,O=O1; hn=R1,O=O2; hn=R2,O=O2; and a directly registered host hn=R1.
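
The hierarchy can be sketched in Python by giving directories and hosts the same `search` method, so a directory looks like an information provider to whatever sits above it. The `Provider` and `Directory` classes and the `qualifier` argument are assumptions made for this sketch; a real deployment would speak GRIP (LDAP) between the levels.

```python
class Provider:
    """Leaf information provider describing one host."""

    def __init__(self, hostname):
        self.hostname = hostname

    def search(self):
        return [f"hn={self.hostname}"]


class Directory:
    """Aggregate directory: answers a search by asking its registered
    children, so it can itself be registered in a higher-level directory."""

    def __init__(self):
        self.children = []            # list of (child, qualifier or None)

    def register(self, child, qualifier=None):
        self.children.append((child, qualifier))

    def search(self):
        names = []
        for child, qualifier in self.children:
            for name in child.search():
                names.append(f"{name},{qualifier}" if qualifier else name)
        return names
```

Registering two center directories under qualifiers `O=O1` and `O=O2`, plus one host directly, makes the top-level search return organization-qualified names for the center hosts and a bare name for the direct host.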

Data Transfer - GridFTP
  • A high-speed transport protocol that extends the popular FTP protocol.
  • GridFTP functionality:
    • GSI support
    • Third-party control of data transfer
    • Parallel data transfer
    • Striped data transfer
    • Partial file transfer
    • Support for reliable and restartable data transfer
  • The implementation consists of two principal libraries: globus_ftp_control_library and globus_ftp_client_library.
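
Parallel, partial, and restartable transfer all rest on addressing a file as byte ranges. The Python sketch below shows only that arithmetic; it does not use the Globus libraries or the GridFTP wire protocol, and the `split_ranges` and `remaining_ranges` helpers are hypothetical names introduced here.

```python
def split_ranges(total_bytes, streams):
    """Divide [0, total_bytes) into contiguous (offset, length) blocks,
    one per parallel stream, differing in size by at most one byte."""
    if streams < 1:
        raise ValueError("streams must be >= 1")
    base, extra = divmod(total_bytes, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


def remaining_ranges(ranges, completed):
    """Blocks still to be sent after a restart, given the completed ones."""
    done = set(completed)
    return [r for r in ranges if r not in done]
```

Each block can be sent on its own stream (parallel transfer), requested in isolation (partial transfer), or re-sent after a failure without repeating the blocks that already arrived (restart).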
Replica Management Service

[Figure: data flow among the Application, Metadata Service, Replica Management Service, Replica Selection Service, and Information Services. Labeled steps: (1) attributes of desired data, (2) logical file names, (4) location of one or more replicas, (6) sources and destination, (7) performance measurements and predictions, (8) location of selected replicas.]

  • Creating new copies of a complete or partial collection of files
  • Registering them in a Replica Catalog
  • Allowing applications to query the catalog
  • Data are organized into files.
    • Logical file name vs. physical file name.
  • Key Architecture Decisions:
    • Separation of Replication and Metadata Information
    • Does not enforce Replication Semantics
    • Provide Rollback to keep the state consistent in case of failures
    • No distributed locking mechanism
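
The logical-versus-physical naming split can be illustrated with a small Python sketch. The `ReplicaCatalog` and `Location` classes, the collection name, and the example.org URLs in the usage note are all hypothetical; the sketch shows only registration and lookup, with no rollback or consistency handling.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    """One physical storage location holding some files of a collection."""
    url_prefix: str
    filenames: set = field(default_factory=set)


@dataclass
class ReplicaCatalog:
    # logical collection name -> list of Location entries
    collections: dict = field(default_factory=dict)

    def register(self, collection, url_prefix, filenames):
        """Record that a location holds replicas of the given logical files."""
        self.collections.setdefault(collection, []).append(
            Location(url_prefix, set(filenames))
        )

    def lookup(self, collection, logical_name):
        """Return the physical URL of every registered replica."""
        return [
            loc.url_prefix.rstrip("/") + "/" + logical_name
            for loc in self.collections.get(collection, [])
            if logical_name in loc.filenames
        ]
```

For instance, after registering `jan1998` at both `gsiftp://storage-a.example.org/climate/` and `ftp://storage-b.example.org/pub`, a lookup returns two physical URLs for the one logical name, and the application (or a replica selection service) chooses between them.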
Relationships to other technologies
  • World Wide Web
    • Web technologies mainly support a client-server architecture. They lack features (at least for now) for rich interaction and single sign-on security.
  • ASP and SSP
    • Provide outsourced solutions that depend on the specific customer. They lack dynamic configuration.
  • Enterprise computing
    • Static arrangements for sharing resources.
  • P2P computing
    • Getting closer to Grid technology, but provides specific solutions rather than common protocols.
Other Grid Perspectives
  • Grid as a next-generation Internet
  • Grid is a source of free cycles
  • Grid requires new programming models
  • Grid makes high-performance computers superfluous
Closing Remarks

"We will probably see the spread of 'computer utilities', which, like present electric and telephone utilities, will service individual homes and offices across the country."

- 1969, Len Kleinrock

We are a little late, but we are ready now!

Extra-1: A Model Architecture for Data Grids

[Figure: the application sends an attribute specification to the Metadata Catalog and receives a logical collection and logical file name; the Replica Catalog returns multiple locations; Replica Selection uses performance information and predictions from MDS and NWS to choose the selected replica; data moves over the GridFTP control and data channels from Replica Locations 1 to 3, which hold a tape library, a disk array, and disk caches.]

Extra-2: Replica Catalog Structure:

  • Replica Catalog
    • Logical Collection: CO2 measurements 1998
      • Filenames: Jan 1998, Feb 1998
      • Location: jupiter.isi.edu
        • Filenames: Mar 1998, Jun 1998, Oct 1998
        • Protocol: gsiftp
        • UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate
      • Location: sprite.llnl.gov
        • Filenames: Jan 1998, Dec 1998
        • Protocol: ftp
        • UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi
      • Logical File Parent
        • Logical File: Jan 1998
        • Logical File: Feb 1998
        • Logical file entries carry attributes such as Size: 1468762
    • Logical Collection: CO2 measurements 1999