Resource Management of Large-Scale Applications on a Grid

Laukik Chitnis and Sanjay Ranka

(with Paul Avery, Jang-uk In and Rick Cavanaugh)

Department of CISE

University of Florida, Gainesville

[email protected]

352 392 6838

(http://www.cise.ufl.edu/~ranka/)

Overview
  • High End Grid Applications and Infrastructure at University of Florida
  • Resource Management for Grids
  • Sphinx Middleware for Resource Provisioning
  • Grid Monitoring for better meta-scheduling
  • Provisioning Algorithm Research for multi-core and grid environments
The Evolution of High-End Applications (and their system characteristics)

Timeline labels (1980, 1990, 2000): Mainframe Applications, Compute Intensive Applications, Data Intensive Applications

  • Geographically distributed datasets
  • High speed storage
  • Gigabit networks
  • Large clusters
  • Supercomputers
  • Central mainframes

Some Representative Applications

HEP, Medicine, Astronomy, Distributed Data Mining

Representative Application: High Energy Physics

1000+

20+ countries

1-10 petabytes

1-

Representative Application: Tele-Radiation Therapy

RCET Center for Radiation Oncology

Representative Application: Distributed Intrusion Detection

Diagram labels: Application, Data Mining and Scheduling Services, Data Transport Services, Data Management Services

NSF ITR Project: Middleware for Distributed Data Mining (PI: Ranka, joint with Kumar and Grossman)


Grid Infrastructure

Florida Lambda Rail and UF

Campus Grid (University of Florida)

NSF Major Research Instrumentation Project

(PI: Ranka, Avery et al.)

20 Gigabit/sec Network

20+ Terabytes

2-3 Teraflops

10 Scientific and Engineering Applications

Gigabit Ethernet Based Cluster

Infiniband based Cluster


Grid Services

The software part of the infrastructure!

Services offered in a Grid

Security Services

Resource Management Services

Monitoring and Information Services

Data Management Services

Note that all the other services use security services

Resource Management Services
  • Provide a uniform, standard interface to remote resources including CPU, Storage and Bandwidth
  • Main component is the remote job manager
  • Ex: GRAM (Globus Resource Allocation Manager)
Resource Management on a Grid

User

GRAM

LSF

Site 2

Condor

Site 1

PBS

fork

Site 3

Site n

The Grid

Narration: note the different local schedulers

Scheduling your Application
  • An application can be run on a grid site as a job
  • The modules in grid architecture (such as GRAM) allow uniform access to the grid sites for your job
  • But…
    • Most applications can be “parallelized”
    • And these separate parts of it can be scheduled to run simultaneously on different sites
      • Thus utilizing the power of the grid
Modeling an Application Workflow

Many workflows can be modeled as a Directed Acyclic Graph (DAG)

The amount of resource required (in units of time) is known to a degree of certainty

There is a small probability of failure in execution (in a grid environment this could happen because resources are no longer available)
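
As a minimal sketch of this model, assuming a hypothetical four-task workflow with made-up duration estimates (none of these names or numbers come from the slides), a DAG can be stored as a mapping from each task to its parents. The longest path then gives a lower bound on completion time when no task fails and resources are unlimited:

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: task -> set of parent tasks that must finish first.
PARENTS = {
    "fetch": set(),
    "simulate": {"fetch"},
    "reconstruct": {"fetch"},
    "analyze": {"simulate", "reconstruct"},
}

# Assumed resource requirement of each task, in units of time.
DURATION = {"fetch": 2.0, "simulate": 5.0, "reconstruct": 3.0, "analyze": 1.0}


def earliest_finish_times(parents, duration):
    """Return the earliest finish time of every task, ignoring failures."""
    finish = {}
    # static_order() yields each task only after all of its parents.
    for task in TopologicalSorter(parents).static_order():
        start = max((finish[p] for p in parents[task]), default=0.0)
        finish[task] = start + duration[task]
    return finish


finish = earliest_finish_times(PARENTS, DURATION)
critical_path_length = max(finish.values())
print(critical_path_length)
```

Tasks with no path between them ("simulate" and "reconstruct" here) are the ones that could run simultaneously on different sites.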
Workflow Resource Provisioning

Executing multiple workflows over distributed and adaptive (faulty) resources while managing policies

Diagram labels: Applications, Policies, Resources; Large, Precedence, Time Constraints, Data Intensive, Access Control, Priority, Multi-core, Heterogeneous, Multiple Ownership, Quota, Faulty, Distributed

A Real Life Example from High Energy Physics

Figure labels (sites): UW, MIT, UI, FNAL, Caltech, UCSD, Rice, UF, BU, UM, UC, BNL, ANL, IU, LBL, OU, UTA, SMU

  • Merge two grids into a single multi-VO “Inter-Grid”

  • How to ensure that
    • neither VO is harmed?
    • both VOs actually benefit?
    • there are answers to questions like:
      • “With what probability will my job be scheduled and complete before my conference deadline?”
  • Clear need for a scheduling middleware!
Typical scenario

VDT Client

?

?

?

VDT Server

VDT Server

VDT Server

Some Requirements for Effective Grid Scheduling
  • Information requirements
    • Past & future dependencies of the application
      • Persistent storage of workflows
    • Resource usage estimation
    • Policies
      • Expected to vary slowly over time
    • Global views of job descriptions
    • Request Tracking and Usage Statistics
      • State information important
    • Resource Properties and Status
      • Expected to vary slowly with time
    • Grid weather
      • Latency of measurement important
    • Replica management
  • System requirements
    • Distributed, fault-tolerant scheduling
    • Customisability
    • Interoperability with other scheduling systems
    • Quality of Service
Incorporate Requirements into a Framework

VDT Client

?

?

?

  • Assume the GriPhyN Virtual Data Toolkit:
    • Client (request/job submission)
      • Globus clients
      • Condor-G/DAGMan
      • Chimera Virtual Data System
    • Server (resource gatekeeper)
      • MonALISA Monitoring Service
      • Globus services
      • RLS (Replica Location Service)

VDT Server

VDT Server

VDT Server

Incorporate Requirements into a Framework

?

  • Framework design principles:
    • Information driven
    • Flexible client-server model
    • General, but pragmatic and simple
    • Avoid adding middleware requirements on grid resources

VDT Client

Recommendation

Engine

VDT Server

  • Assume the Virtual Data Toolkit:
    • Client (request/job submission)
      • Clarens Web Service
      • Globus clients
      • Condor-G/DAGMan
      • Chimera Virtual Data System
    • Server (resource gatekeeper)
      • MonALISA Monitoring Service
      • Globus services
      • RLS (Replica Location Service)

VDT Server

VDT Server

Innovative Workflow Scheduling Middleware
  • Modular system
    • Automated scheduling procedure based on modular services
  • Robust and recoverable system
    • Database infrastructure
    • Fault-tolerant and recoverable from internal failure
  • Platform independent interoperable system
    • XML-based communication protocols
      • SOAP, XML-RPC
    • Supports heterogeneous service environment
  • 60 Java Classes
    • 24,000 lines of Java code
    • 50 test scripts, 1500 lines of script code
The Sphinx Workflow Execution Framework

Diagram labels: VDT Client, Sphinx Server, Sphinx Client, Chimera Virtual Data System, Clarens WS Backbone, Request Processing, Condor-G/DAGMan, Data Warehouse, Data Management, VDT Server Site, Globus Resource, Information Gathering, Replica Location Service, MonALISA Monitoring Service

Sphinx Workflow Scheduling Server

Sphinx Server

Message Interface

  • Functions as the Nerve Centre
  • Data Warehouse
    • Policies, Account Information, Grid Weather, Resource Properties and Status, Request Tracking, Workflows, etc.
  • Control Process
    • Finite State Machine
      • Different modules modify jobs, graphs, workflows, etc. and change their state
    • Flexible
    • Extensible

Graph Reducer

Control Process

Job Predictor

Graph Predictor

Job Admission Control

Graph Admission Control

Graph Data Planner

Data Warehouse

Job Execution Planner

Graph Tracker

Data Management

Information Gatherer


SPHINX

Scheduling in Parallel for Heterogeneous Independent NetworXs

Policy Based Scheduling

  • Sphinx provides “soft” QoS through time dependent, global views of
    • Submissions (workflows, jobs, allocation, etc.)
    • Policies
    • Resources
  • Uses Linear Programming Methods
    • Satisfy Constraints
      • Policies, User-requirements, etc.
    • Optimize an “objective” function
  • Estimate probabilities to meet deadlines within policy constraints

J. In, P. Avery, R. Cavanaugh, and S. Ranka, "Policy Based Scheduling for Simple Quality of Service in Grid Computing", in Proceedings of the 18th IEEE IPDPS, Santa Fe, New Mexico, April, 2004
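
To make the linear-programming idea concrete, here is one illustrative form. It is an assumption for exposition, not the formulation from the cited paper. Let $x_{ij}$ be the number of jobs of submission $i$ assigned to resource $j$, $n_i$ the number of jobs in submission $i$, $S_u$ the set of submissions owned by user $u$, $q_{uj}$ the quota that policy grants user $u$ on resource $j$, and $t_{ij}$ an estimated completion time (all of these symbols are hypothetical):

```latex
\begin{aligned}
\min_{x}\quad & \sum_{i}\sum_{j} t_{ij}\, x_{ij} \\
\text{subject to}\quad & \sum_{j} x_{ij} = n_i && \text{for every submission } i, \\
& \sum_{i \in S_u} x_{ij} \le q_{uj} && \text{for every user } u \text{ and resource } j, \\
& x_{ij} \ge 0 .
\end{aligned}
```

The first constraint places every job, the second enforces the quota policy, and the objective is where estimated completion times enter.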

Figure labels: Policy Space; Submissions, Resources, Time

Ability to tolerate task failures

Jang-uk In, Sanjay Ranka et al., "SPHINX: A fault-tolerant system for scheduling in dynamic grid environments", in Proceedings of the 19th IEEE IPDPS, Denver, Colorado, April, 2005

  • Significant Impact of using feedback information
Distributed Services for Grid Enabled Data Analysis

Diagram labels:
  • Sites: Fermilab, Caltech, Florida, Iowa
  • Service and client labels: File Service, VDT Resource Service, Monitoring Service, Execution Service, Replica Location Service, Virtual Data Service, Scheduling Service, Data Analysis Client
  • Components: Sphinx, RLS, MonALISA, ROOT, Chimera, Sphinx/VDT, Clarens, Globus, GridFTP

Limitation of Existing Monitoring Systems for the Grid
  • Information aggregated across multiple users is not very useful in effective resource allocation.
  • An end-to-end parameter such as Average Job Delay - the average queuing delay experienced by a job of a given user at an execution site - is a better estimate for comparing the resource availability and response time for a given user.
  • It is also not very susceptible to monitoring latencies.
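
A minimal sketch of how Average Job Delay could be computed per user and site, assuming hypothetical job records of the form (user, site, submit time, start time); the record layout and the numbers are illustrative, not taken from the monitoring system:

```python
from collections import defaultdict

# Hypothetical job records: (user, site, submit_time, start_time) in seconds.
JOBS = [
    ("alice", "site1", 0, 40),
    ("alice", "site1", 10, 70),
    ("alice", "site2", 0, 15),
    ("bob", "site1", 5, 10),
    ("bob", "site2", 0, 90),
]


def average_job_delay(jobs):
    """Average queuing delay (start - submit) per (user, site) pair."""
    delays = defaultdict(list)
    for user, site, submitted, started in jobs:
        delays[(user, site)].append(started - submitted)
    return {key: sum(vals) / len(vals) for key, vals in delays.items()}


delay = average_job_delay(JOBS)
print(delay[("alice", "site1")], delay[("bob", "site1")])
```

In this made-up data the same site shows very different delays for the two users, which is the information an aggregate across users would hide.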
Effective DAG Scheduling
  • The completion time based algorithm here uses the Average Job Delay parameter for scheduling
  • As seen in the adjoining figure, it outperforms the algorithms tested with other monitored parameters.
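
The slides do not spell out how completion time is estimated, so the following is only a sketch under one assumption: the estimate for a site is the user's Average Job Delay there plus an assumed run time at that site. The function name and all numbers are hypothetical:

```python
def pick_site(user, runtime_estimate, avg_delay):
    """Choose the site with the smallest estimated completion time.

    runtime_estimate: site -> assumed execution time of the job there.
    avg_delay: (user, site) -> Average Job Delay observed for that user.
    """
    def completion(site):
        return avg_delay[(user, site)] + runtime_estimate[site]

    return min(runtime_estimate, key=completion)


# Hypothetical numbers: site1 is faster but has a longer queue for alice.
AVG_DELAY = {("alice", "site1"): 50.0, ("alice", "site2"): 15.0}
RUNTIME = {"site1": 100.0, "site2": 120.0}
chosen = pick_site("alice", RUNTIME, AVG_DELAY)
print(chosen)
```

Here the slower site wins because its shorter queue more than offsets the longer run time.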
Work in Progress: Modeling Workflow Cost and Developing Efficient Provisioning Algorithms

1. Developing an objective measure of completion time that integrates performance and reliability of workflow execution: P(time to complete >= T) <= epsilon

2. Relating this measure to the properties of the longest path of the DAG, based on the mean and uncertainty of the time required for the underlying tasks due to

1) variable time requirements caused by different parameter values

2) failures caused by changes in the underlying resources, etc.

3. Developing novel scheduling and replication techniques to optimize allocation based on these metrics.
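
One way to get a feel for the measure P(time to complete >= T) is to estimate it by simulation. The sketch below is not the authors' method; it assumes a hypothetical four-task DAG, normally distributed task times, a fixed per-attempt failure probability, and that a failed task is simply re-run from scratch:

```python
import random
from graphlib import TopologicalSorter

# Hypothetical workflow: task -> parents, plus assumed (mean, spread) of run time.
PARENTS = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
RUNTIME = {"a": (2.0, 0.5), "b": (5.0, 1.0), "c": (3.0, 1.0), "d": (1.0, 0.2)}
FAIL_PROB = 0.05  # assumed chance that one attempt of a task fails


def sample_task_time(mean, spread, rng):
    """Time for one task, re-running it from scratch after each failure."""
    total = 0.0
    while True:
        total += max(0.0, rng.gauss(mean, spread))
        if rng.random() >= FAIL_PROB:
            return total


def sample_completion_time(rng):
    """One random draw of the workflow's completion time (longest path)."""
    finish = {}
    for task in TopologicalSorter(PARENTS).static_order():
        start = max((finish[p] for p in PARENTS[task]), default=0.0)
        finish[task] = start + sample_task_time(*RUNTIME[task], rng)
    return max(finish.values())


def prob_exceeds(deadline, trials=20000, seed=0):
    """Monte Carlo estimate of P(time to complete >= deadline)."""
    rng = random.Random(seed)
    late = sum(sample_completion_time(rng) >= deadline for _ in range(trials))
    return late / trials


print(prob_exceeds(10.0))
```

A provisioning algorithm could compare such an estimate against epsilon for a candidate allocation; replication of tasks on the longest path would be one way to lower it.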

Work in Progress: Provisioning algorithms for multiple workflows (Yield Management)

Diagram labels: Multiple Workflows; Dag 1 to Dag 5; Level 1 to Level 4

  • Quality of Service guarantees for each workflow
  • Controlled (a cluster of multi-core processors) versus uncontrolled (grid of multiple clusters owned by multiple units) environment
CHEPREO - Grid Education and Networking
  • E/O Center in Miami area
  • Tutorial for Large Scale Application Development
Grid Education
  • Developing a Grid tutorial as part of CHEPREO
    • Grid basics
    • Components of a Grid
    • Grid Services OGSA …
  • OSG summer workshop
    • South Padre Island, Texas, July 11-15, 2005
    • http://osg.ivdgl.org/twiki/bin/view/SummerGridWorkshop/
    • Lectures and Hands-on sessions
    • Building and Maintaining a Grid
Acknowledgements
  • CHEPREO project, NSF
  • GriPhyN/iVDgL, NSF
  • Data Mining Middleware, NSF
  • Intel Corporation

Thank You

May the Force be with you!

Effect of latency on Average Job Delay
  • Latency is simulated in the system by purposely retrieving old values for the parameter while making scheduling decisions
  • With added latency, the correlation indices are comparable to, though as expected lower than, those of the ‘un-delayed’ Average Job Delay parameter. The amount of correlation is still quite high.
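
Retrieving deliberately old values can be sketched as a lookup into a timestamped history. The class below is a hypothetical illustration of that idea, not the code used in the experiments; the readings are made up:

```python
import bisect


class DelayedMetric:
    """Stores timestamped readings and serves deliberately stale values."""

    def __init__(self):
        self._times = []   # timestamps, appended in increasing order
        self._values = []

    def record(self, timestamp, value):
        self._times.append(timestamp)
        self._values.append(value)

    def read(self, now, latency=0.0):
        """Return the latest reading taken at or before now - latency."""
        index = bisect.bisect_right(self._times, now - latency) - 1
        if index < 0:
            raise LookupError("no reading that old")
        return self._values[index]


metric = DelayedMetric()
for t, v in [(0, 30.0), (60, 45.0), (120, 20.0)]:
    metric.record(t, v)

fresh = metric.read(now=130)
stale = metric.read(now=130, latency=60)
print(fresh, stale)
```

A scheduler under test would call `read` with a non-zero `latency` so that its decisions are based on the older reading.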
SPHINX Scheduling Latency

Average scheduling latency for various numbers of DAGs (20, 40, 80 and 100) with different arrival rates per minute.

Demonstration at Supercomputing Conference: Distributed Data Analysis in a Grid Environment

  • Virtual data service: Chimera
  • Graphical user interface for data analysis: ROOT
  • Grid enabled Web service: Clarens
  • Grid resource management service: VDT server
  • Grid enabled execution service: VDT client
  • Grid resource monitoring system: MonALISA
  • Grid scheduling service: Sphinx
  • Replica location service: RLS

The architecture has been implemented and demonstrated at SC03 (Arizona, USA, 2003) and SC04.

Scheduling DAGs: Dynamic Critical Path Algorithm

The DCP algorithm executes the following steps iteratively:

  • Compute the earliest possible start time (AEST) and the latest possible start time (ALST) for all tasks on each processor.
  • Select a task which has the smallest difference between its ALST and AEST and has no unscheduled parent task. If there are tasks with the same differences, select the one with a smaller AEST.
  • Select a processor which gives the earliest start time for the selected task
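
The first two steps can be sketched as follows. This is a simplified illustration, assuming a hypothetical four-task DAG with made-up costs; it ignores communication costs and leaves out the processor-selection step and the recomputation after each placement that the full algorithm performs:

```python
from graphlib import TopologicalSorter

# Hypothetical DAG: task -> parents, with assumed computation costs.
PARENTS = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
COST = {"a": 2, "b": 5, "c": 3, "d": 1}


def aest_alst(parents, cost):
    """Earliest and latest start times, ignoring communication costs."""
    order = list(TopologicalSorter(parents).static_order())
    children = {t: [] for t in parents}
    for task, ps in parents.items():
        for p in ps:
            children[p].append(task)

    aest = {}
    for t in order:  # forward pass: start after the slowest parent
        aest[t] = max((aest[p] + cost[p] for p in parents[t]), default=0)
    critical_length = max(aest[t] + cost[t] for t in order)

    alst = {}
    for t in reversed(order):  # backward pass: do not delay any child
        latest_finish = min((alst[c] for c in children[t]),
                            default=critical_length)
        alst[t] = latest_finish - cost[t]
    return aest, alst


def select_task(parents, cost, scheduled):
    """Pick the ready task with the smallest ALST - AEST, then smallest AEST."""
    aest, alst = aest_alst(parents, cost)
    ready = [t for t in parents
             if t not in scheduled and parents[t] <= scheduled]
    return min(ready, key=lambda t: (alst[t] - aest[t], aest[t]))


print(select_task(PARENTS, COST, scheduled={"a"}))
```

Tasks whose ALST equals their AEST lie on the critical path, so choosing the smallest difference schedules critical-path tasks first.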
Scheduling DAGs: ILP - Novel algorithm to support heterogeneity (work supported by Intel Corporation)

There are two novel features:

  • Assign multiple independent tasks simultaneously – cost of task assigned depends on the processor available, many tasks commence with a small difference in start time.
  • Iteratively refine the scheduling - refines the scheduling by using the cost of the critical path based on the assignment in the previous iteration.
Comparison of different algorithms

Number of processors = 30.

Number of Tasks = 2000.
