Resource Management of Large-Scale Applications on a Grid

Laukik Chitnis and Sanjay Ranka

(with Paul Avery, Jang-uk In and Rick Cavanaugh)

Department of CISE

University of Florida, Gainesville

[email protected]

352 392 6838

(http://www.cise.ufl.edu/~ranka/)


Overview

  • High End Grid Applications and Infrastructure at University of Florida

  • Resource Management for Grids

  • Sphinx Middleware for Resource Provisioning

  • Grid Monitoring for better meta-scheduling

  • Provisioning Algorithm Research for multi-core and grid environments


The Evolution of High-End Applications (and their system characteristics)

  • 1980: Mainframe applications, running on central mainframes
  • 1990: Compute-intensive applications, running on large clusters and supercomputers
  • 2000: Data-intensive applications, using geographically distributed datasets, high-speed storage, and gigabit networks


Some Representative Applications

HEP, Medicine, Astronomy, Distributed Data Mining


Representative Application: High Energy Physics

[Slide figure: 1000+, 20+ countries, 1-10 petabytes]

Representative Application: Tele-Radiation Therapy

RCET Center for Radiation Oncology


Representative Application: Distributed Intrusion Detection

NSF ITR Project: Middleware for Distributed Data Mining (PI: Ranka, joint with Kumar and Grossman)

[Slide figure: applications layered over data mining and scheduling services, data transport services, and data management services]


Grid Infrastructure

Florida Lambda Rail and UF


Campus Grid (University of Florida)

NSF Major Research Instrumentation Project (PI: Ranka, Avery, et al.)

  • 20 Gigabit/sec network
  • 20+ Terabytes
  • 2-3 Teraflops
  • 10 scientific and engineering applications
  • Gigabit Ethernet based cluster
  • InfiniBand based cluster


Grid Services

The software part of the infrastructure!


Services Offered in a Grid

  • Security Services
  • Resource Management Services
  • Monitoring and Information Services
  • Data Management Services

Note that all the other services use the security services.


Resource Management Services

  • Provide a uniform, standard interface to remote resources, including CPU, storage, and bandwidth
  • Main component is the remote job manager
  • Example: GRAM (Globus Resource Allocation Manager); see the sketch below
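To make this concrete, here is a minimal sketch of submitting a job through GRAM from Python, assuming the classic Globus Toolkit command-line client (globus-job-run) is installed; the gatekeeper contact string is hypothetical.

```python
import subprocess

# Minimal sketch: run a job on a remote grid site through GRAM using the
# classic Globus Toolkit CLI (globus-job-run). The contact string below is
# hypothetical; the "/jobmanager-pbs" suffix asks GRAM to hand the job to
# the site's local PBS scheduler.
contact = "gatekeeper.example.edu/jobmanager-pbs"

result = subprocess.run(
    ["globus-job-run", contact, "/bin/hostname"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # hostname of whichever worker node ran the job
```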


Resource Management on a Grid

[Slide figure: a user submits jobs through GRAM to Sites 1 through n on the Grid; note the different local schedulers at each site, such as Condor, LSF, PBS, and plain fork]



Scheduling your Application

  • An application can be run on a grid site as a job
  • The modules in the grid architecture (such as GRAM) give your job uniform access to the grid sites
  • But…
    • Most applications can be “parallelized”
    • These separate parts can then be scheduled to run simultaneously on different sites
      • Thus utilizing the power of the grid


Modeling an Application Workflow

  • Many workflows can be modeled as a directed acyclic graph (DAG); see the sketch below
  • The amount of resource required (in units of time) is known to a degree of certainty
  • There is a small probability of failure in execution (in a grid environment this can happen because resources are no longer available)

[Slide figure: a directed acyclic graph]
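As a minimal sketch of this model (task names, runtimes, and failure probabilities are invented), each task carries an expected runtime and a small failure probability; the expected critical-path length gives a first-order completion estimate:

```python
# Minimal sketch of the workflow model: each task has an expected runtime
# (in abstract time units) and a small independent failure probability.
# All names and numbers are illustrative, not from the talk.
tasks = {
    "extract":   {"time": 10.0, "p_fail": 0.01},
    "transform": {"time": 25.0, "p_fail": 0.02},
    "analyze":   {"time": 40.0, "p_fail": 0.05},
    "publish":   {"time":  5.0, "p_fail": 0.01},
}
edges = [("extract", "transform"), ("transform", "analyze"),
         ("transform", "publish")]  # DAG: parent -> child

def critical_path_time(tasks, edges):
    """Longest expected path through the DAG (tasks in topological order)."""
    finish = {}
    for name in tasks:  # dict insertion order is already topological here
        preds = [p for p, c in edges if c == name]
        start = max((finish[p] for p in preds), default=0.0)
        finish[name] = start + tasks[name]["time"]
    return max(finish.values())

def p_all_succeed(tasks):
    """Probability every task succeeds on its first try."""
    p = 1.0
    for t in tasks.values():
        p *= 1.0 - t["p_fail"]
    return p

print(critical_path_time(tasks, edges))  # 75.0
print(round(p_all_succeed(tasks), 3))    # 0.912
```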


Workflow Resource Provisioning

Executing multiple workflows over distributed and adaptive (faulty) resources while managing policies.

[Slide figure: applications (large, precedence constrained, time constrained, data intensive), policies (access control, priority, quota, multiple ownership), and resources (multi-core, heterogeneous, faulty, distributed)]


A Real Life Example from High Energy Physics

[Slide figure: map of participating sites, including UW, MIT, UI, FNAL, Caltech, UCSD, Rice, UF, BU, UM, UC, BNL, ANL, IU, LBL, OU, UTA, and SMU]

  • Merge two grids into a single multi-VO “Inter-Grid”
  • How to ensure that
    • neither VO is harmed?
    • both VOs actually benefit?
    • there are answers to questions like:
      • “With what probability will my job be scheduled and complete before my conference deadline?”
  • Clear need for a scheduling middleware!


Typical scenario

[Slide figure: a VDT Client facing three VDT Servers, each marked with a question mark]


Typical scenario (continued)

[Slide figure: the same VDT Client, now exasperated (“@#^%#%$@#”), still facing three VDT Servers with no basis for choosing among them]


Some Requirements for Effective Grid Scheduling

  • Information requirements

    • Past & future dependencies of the application

      • Persistent storage of workflows

    • Resource usage estimation

    • Policies

      • Expected to vary slowly over time

    • Global views of job descriptions

    • Request Tracking and Usage Statistics

      • State information important

  • Resource Properties and Status

    • Expected to vary slowly with time

  • Grid weather

    • Latency of measurement important

  • Replica management

  • System requirements

    • Distributed, fault-tolerant scheduling

    • Customisability

    • Interoperability with other scheduling systems

    • Quality of Service


Incorporate Requirements into a Framework

[Slide figure: the VDT Client and three VDT Servers from the typical scenario]

  • Assume the GriPhyN Virtual Data Toolkit:
    • Client (request/job submission)
      • Globus clients
      • Condor-G/DAGMan
      • Chimera Virtual Data System
    • Server (resource gatekeeper)
      • MonALISA Monitoring Service
      • Globus services
      • RLS (Replica Location Service)


Incorporate Requirements into a Framework (continued)

  • Framework design principles:
    • Information driven
    • Flexible client-server model
    • General, but pragmatic and simple
    • Avoid adding middleware requirements on grid resources

[Slide figure: the VDT Client now submits requests to a Recommendation Engine, which steers them to the VDT Servers]

  • Assume the Virtual Data Toolkit:
    • Client (request/job submission)
      • Clarens Web Service
      • Globus clients
      • Condor-G/DAGMan
      • Chimera Virtual Data System
    • Server (resource gatekeeper)
      • MonALISA Monitoring Service
      • Globus services
      • RLS (Replica Location Service)



SPHINX: Innovative Workflow Scheduling Middleware

  • Modular system
    • Automated scheduling procedure based on modular services
  • Robust and recoverable system
    • Database infrastructure
    • Fault-tolerant and recoverable from internal failure
  • Platform independent, interoperable system
    • XML-based communication protocols (SOAP, XML-RPC)
    • Supports heterogeneous service environments
  • 60 Java classes
    • 24,000 lines of Java code
    • 50 test scripts, 1,500 lines of script code


The Sphinx Workflow Execution Framework

[Slide figure: the Sphinx Client (Chimera Virtual Data System, Condor-G/DAGMan, request processing) and the Sphinx Server (data warehouse, data management, information gathering) communicate over the Clarens WS backbone; VDT server sites provide Globus resources, the Replica Location Service, and the MonALISA Monitoring Service]


Sphinx Workflow Scheduling Server

  • Functions as the nerve centre
  • Data Warehouse
    • Policies, account information, grid weather, resource properties and status, request tracking, workflows, etc.
  • Control Process
    • Finite state machine (a toy sketch follows below)
      • Different modules modify jobs, graphs, and workflows, and change their state
    • Flexible
    • Extensible

[Slide figure: a message interface wraps the server; the control process connects the data warehouse with modules for graph reduction, job and graph prediction, job and graph admission control, graph data planning, job execution planning, graph tracking, data management, and information gathering]
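As a toy illustration of the control-process idea (the states and module names below are invented, not Sphinx's actual ones), each module consumes work items in one state and emits them in the next:

```python
# Toy finite-state-machine sketch of the control-process idea: each module
# advances a work item from one state to the next. States and module names
# are illustrative, not Sphinx's actual ones.
from enum import Enum, auto

class JobState(Enum):
    SUBMITTED = auto()
    ADMITTED = auto()
    PLANNED = auto()
    DISPATCHED = auto()

def admit(job):    print(f"admission control: {job['name']}")
def plan(job):     print(f"execution planner: {job['name']}")
def dispatch(job): print(f"dispatcher: {job['name']}")

# Each "module" maps an input state to (handler, output state).
MODULES = {
    JobState.SUBMITTED: (admit,    JobState.ADMITTED),
    JobState.ADMITTED:  (plan,     JobState.PLANNED),
    JobState.PLANNED:   (dispatch, JobState.DISPATCHED),
}

def control_process(job):
    # Drive the job through the state machine until no module applies.
    while job["state"] in MODULES:
        handler, next_state = MODULES[job["state"]]
        handler(job)
        job["state"] = next_state

control_process({"name": "dag-42/job-7", "state": JobState.SUBMITTED})
```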


SPHINX

Scheduling in Parallel for Heterogeneous Independent NetworXs


Policy Based Scheduling

  • Sphinx provides “soft” QoS through time dependent, global views of
    • Submissions (workflows, jobs, allocation, etc.)
    • Policies
    • Resources
  • Uses linear programming methods (see the sketch below)
    • Satisfy constraints
      • Policies, user requirements, etc.
    • Optimize an “objective” function
    • Estimate probabilities to meet deadlines within policy constraints

J. In, P. Avery, R. Cavanaugh, and S. Ranka, "Policy Based Scheduling for Simple Quality of Service in Grid Computing", in Proceedings of the 18th IEEE IPDPS, Santa Fe, New Mexico, April 2004.

[Slide figure: the policy space as a cube with axes for submissions, resources, and time]
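As a minimal sketch of the linear-programming idea (the sites, runtimes, and quotas below are invented; the actual Sphinx formulation is described in the IPDPS paper), one can minimize total expected runtime subject to per-site policy quotas:

```python
# Minimal LP sketch of policy-based allocation (illustrative numbers only):
# split 100 jobs across three sites to minimize total expected runtime,
# subject to per-site policy quotas for this user/VO.
from scipy.optimize import linprog

runtime_per_job = [2.0, 3.0, 5.0]   # expected hours per job at each site
quota = [50, 60, 40]                # policy: max jobs this VO may run per site
total_jobs = 100

# Decision variables x_i = number of jobs sent to site i.
res = linprog(
    c=runtime_per_job,                      # minimize total runtime
    A_eq=[[1, 1, 1]], b_eq=[total_jobs],    # every job gets placed
    bounds=[(0, q) for q in quota],         # respect policy quotas
)
print(res.x)  # [50., 50., 0.]: fill the fastest sites up to their quotas
```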


Ability to Tolerate Task Failures

  • Significant impact of using feedback information

Jang-uk In, Sanjay Ranka, et al., "SPHINX: A fault-tolerant system for scheduling in dynamic grid environments", in Proceedings of the 19th IEEE IPDPS, Denver, Colorado, April 2005.


Distributed Services for Grid Enabled Data Analysis

[Slide figure: a data analysis client uses Clarens web services to reach the scheduling service (Sphinx), the virtual data service (Chimera), the replica location service (RLS), the monitoring service (MonALISA), ROOT, and Globus/GridFTP execution and file services at VDT resource sites including Fermilab, Caltech, Florida, and Iowa]



Limitations of Existing Monitoring Systems for the Grid

  • Information aggregated across multiple users is not very useful for effective resource allocation.
  • An end-to-end parameter such as Average Job Delay, the average queuing delay experienced by a given user's jobs at an execution site, is a better estimate for comparing resource availability and response time for that user (see the sketch below).
  • It is also not very susceptible to monitoring latencies.
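As a minimal sketch (the record fields and numbers are invented for illustration), Average Job Delay can be computed per user and per site from monitoring records of submit and start times:

```python
# Minimal sketch (record fields are invented for illustration): compute
# Average Job Delay -- the mean queuing delay a given user's jobs see at
# each execution site -- from monitoring records.
from collections import defaultdict

records = [
    # (user, site, submit_time, start_time) in seconds since epoch
    ("alice", "uf",      0.0,   40.0),
    ("alice", "uf",     10.0,  130.0),
    ("alice", "caltech", 5.0,   15.0),
    ("bob",   "uf",      0.0,  600.0),
]

def average_job_delay(records):
    sums = defaultdict(lambda: [0.0, 0])
    for user, site, submit, start in records:
        acc = sums[(user, site)]
        acc[0] += start - submit   # queuing delay for this job
        acc[1] += 1
    return {key: total / n for key, (total, n) in sums.items()}

for (user, site), delay in average_job_delay(records).items():
    print(f"{user}@{site}: {delay:.0f}s")
# alice@uf: 80s, alice@caltech: 10s, bob@uf: 600s -- per-user, per-site
# delays differ in ways that aggregate metrics would hide.
```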


Effective DAG Scheduling

  • The completion-time-based algorithm here uses the Average Job Delay parameter for scheduling
  • As seen in the adjoining figure, it outperforms the algorithms tested with other monitored parameters


Work in Progress: Modeling Workflow Cost and Developing Efficient Provisioning Algorithms

[Slide figure: a directed acyclic graph]

1. Developing an objective measure of completion time: integrating the performance and reliability of workflow execution, P(time to complete >= T) <= epsilon (a Monte Carlo sketch follows after this list)
2. Relating this measure to the properties of the longest path of the DAG, based on the mean and uncertainty of the time required by the underlying tasks due to
   1) variable time requirements arising from different parameter values
   2) failures caused by changes in the underlying resources, etc.
3. Developing novel scheduling and replication techniques to optimize allocation based on these metrics.
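To make the measure concrete, here is a minimal Monte Carlo sketch (the chain-shaped DAG, distributions, and numbers are invented for illustration) that estimates P(completion time >= T) when task durations are uncertain and failed tasks are retried:

```python
# Monte Carlo sketch of the proposed measure P(completion time >= T):
# sample task durations (mean + uncertainty) with retry-on-failure, take
# the longest path, and estimate the tail probability. All numbers are
# illustrative.
import random

tasks = {"a": (10, 2, 0.02), "b": (25, 5, 0.05), "c": (40, 8, 0.05)}
edges = [("a", "b"), ("b", "c")]   # a -> b -> c (a chain, for brevity)

def sample_duration(mean, sigma, p_fail):
    t = 0.0
    while True:                    # failed attempts cost time, then retry
        t += max(0.0, random.gauss(mean, sigma))
        if random.random() >= p_fail:
            return t

def sample_completion():
    finish = {}
    for name, (mean, sigma, p_fail) in tasks.items():  # topological order
        start = max((finish[p] for p, c in edges if c == name), default=0.0)
        finish[name] = start + sample_duration(mean, sigma, p_fail)
    return max(finish.values())

def p_exceeds(T, trials=20_000):
    return sum(sample_completion() >= T for _ in range(trials)) / trials

print(p_exceeds(T=90.0))  # estimate of P(completion time >= 90)
```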


Work in Progress: Provisioning Algorithms for Multiple Workflows (Yield Management)

[Slide figure: five DAGs, each divided into levels 1 through 4, provisioned together]

  • Quality of Service guarantees for each workflow
  • Controlled (a cluster of multi-core processors) versus uncontrolled (a grid of multiple clusters owned by multiple units) environments


CHEPREO - Grid Education and Networking

  • E/O Center in Miami area
  • Tutorial for Large Scale Application Development


Grid Education

  • Developing a Grid tutorial as part of CHEPREO
    • Grid basics
    • Components of a Grid
    • Grid Services, OGSA, …
  • OSG summer workshop
    • South Padre Island, Texas, July 11-15, 2005
    • http://osg.ivdgl.org/twiki/bin/view/SummerGridWorkshop/
    • Lectures and hands-on sessions
    • Building and maintaining a Grid


Acknowledgements

  • CHEPREO project, NSF
  • GriPhyN/iVDgL, NSF
  • Data Mining Middleware, NSF
  • Intel Corporation


Thank You

May the Force be with you!


Additional Slides


Effect of Latency on Average Job Delay

  • Latency is simulated in the system by deliberately retrieving old values of the parameter while making scheduling decisions (see the sketch below)
  • The correlation indices with added latencies are, as expected, somewhat lower than those of the 'un-delayed' Average Job Delay parameter, but comparable; the amount of correlation is still quite high
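A minimal sketch of that simulation idea (class and field names are invented): keep a time-stamped history of the monitored parameter and serve the scheduler the value from `latency` seconds in the past:

```python
import bisect

# Minimal sketch of simulating monitoring latency: the scheduler is served
# the parameter value from `latency` seconds in the past. Names and data
# are invented for illustration.
class DelayedMonitor:
    def __init__(self, latency):
        self.latency = latency
        self.times, self.values = [], []

    def record(self, t, value):           # monitoring feed
        self.times.append(t)
        self.values.append(value)

    def read(self, now):                  # what the scheduler sees
        i = bisect.bisect_right(self.times, now - self.latency) - 1
        return self.values[i] if i >= 0 else None

m = DelayedMonitor(latency=300)           # 5-minute artificial delay
for t, v in [(0, 80.0), (120, 95.0), (240, 60.0), (360, 55.0)]:
    m.record(t, v)                        # Average Job Delay samples
print(m.read(now=400))                    # -> 80.0, the value as of t = 100
```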


SPHINX Scheduling Latency

Average scheduling latency for various numbers of DAGs (20, 40, 80, and 100) with different arrival rates per minute.


Demonstration at the Supercomputing Conference: Distributed Data Analysis in a Grid Environment

[Slide figure: a graphical user interface for data analysis (ROOT) connects through Clarens grid-enabled web services to the virtual data service (Chimera), the grid scheduling service (Sphinx), the replica location service (RLS), the grid resource monitoring system (MonALISA), a grid resource management service (VDT server), and a grid-enabled execution service (VDT client)]

The architecture has been implemented and demonstrated at SC03 (Phoenix, Arizona, 2003) and SC04.


Scheduling DAGs: Dynamic Critical Path Algorithm

The DCP algorithm executes the following steps iteratively (a simplified sketch follows):

  • Compute the earliest possible start time (AEST) and the latest possible start time (ALST) for all tasks on each processor.
  • Select a task that has the smallest difference between its ALST and AEST and has no unscheduled parent task. If several tasks have the same difference, select the one with the smaller AEST.
  • Select the processor that gives the earliest start time for the selected task.
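Here is a simplified sketch of that loop (communication costs and several refinements of the full DCP algorithm are omitted; the four-task DAG and two processors are invented for illustration). Tasks with the least mobility (ALST minus AEST) lie on the dynamic critical path and are scheduled first:

```python
# Simplified sketch of the Dynamic Critical Path idea: repeatedly schedule
# the ready task with the least mobility (ALST - AEST) onto the processor
# that can start it earliest. Communication costs and several refinements
# of the full DCP algorithm are omitted; the DAG below is illustrative.
cost = {"a": 4, "b": 3, "c": 2, "d": 5}
children = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
parents = {t: [p for p in cost if t in children[p]] for t in cost}
topo = ["a", "b", "c", "d"]                   # a topological order

def aest(finish):
    """Earliest start times, honouring tasks already pinned by the schedule."""
    est = {}
    for t in topo:
        est[t] = max((est[p] + cost[p] for p in parents[t]), default=0.0)
        if t in finish:                       # already scheduled: pin it
            est[t] = finish[t] - cost[t]
    return est

def alst(est):
    """Latest start times that do not stretch the current critical path."""
    makespan = max(est[t] + cost[t] for t in topo)
    lst = {}
    for t in reversed(topo):
        lst[t] = min((lst[c] for c in children[t]), default=makespan) - cost[t]
    return lst

procs = {0: 0.0, 1: 0.0}                      # processor -> time it frees up
finish, order, unscheduled = {}, [], set(cost)
while unscheduled:
    est = aest(finish)
    lst = alst(est)
    ready = [t for t in unscheduled
             if not any(p in unscheduled for p in parents[t])]
    task = min(ready, key=lambda t: (lst[t] - est[t], est[t]))  # least mobility
    proc = min(procs, key=lambda p: max(procs[p], est[task]))   # earliest start
    start = max(procs[proc], est[task])
    procs[proc] = finish[task] = start + cost[task]
    unscheduled.remove(task)
    order.append((task, proc, start))
print(order)  # [('a', 0, 0.0), ('b', 0, 4.0), ('c', 1, 4.0), ('d', 0, 7.0)]
```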


Scheduling DAGs: ILP, a Novel Algorithm to Support Heterogeneity (work supported by Intel Corporation)

[Slide figure: a directed acyclic graph]

There are two novel features (a toy sketch of the second follows):

  • Assign multiple independent tasks simultaneously: the cost of an assigned task depends on the processor available, and many tasks commence with a small difference in start time.
  • Iteratively refine the schedule: each iteration refines the schedule using the cost of the critical path computed from the previous iteration's assignment.
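As a toy sketch of the second feature only (this is a greedy list scheduler with rank refinement standing in for the actual ILP formulation, which the slide does not spell out; the task costs are invented), each pass recomputes critical-path ranks from the processor costs realized in the previous pass:

```python
# Toy sketch of iterative refinement (not the actual ILP formulation):
# list-schedule by critical-path rank, then recompute ranks from the costs
# realized on the chosen processors and repeat until the makespan stops
# improving. Heterogeneity: each task has a different cost per processor.
cost = {"a": [4, 8], "b": [3, 3], "c": [6, 2], "d": [5, 9]}  # task x processor
children = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
topo = ["a", "b", "c", "d"]

def rank(task_cost):
    """Upward rank: longest remaining path using the given per-task costs."""
    r = {}
    for t in reversed(topo):
        r[t] = task_cost[t] + max((r[c] for c in children[t]), default=0.0)
    return r

def schedule(task_cost):
    """Greedy list schedule; returns the makespan and realized task costs."""
    r = rank(task_cost)
    procs = [0.0, 0.0]
    finish, realized = {}, {}
    for t in sorted(topo, key=lambda t: -r[t]):  # parents rank above children
        ready = max((finish[p] for p in topo if t in children[p]), default=0.0)
        p = min(range(2), key=lambda i: max(procs[i], ready) + cost[t][i])
        procs[p] = finish[t] = max(procs[p], ready) + cost[t][p]
        realized[t] = cost[t][p]
    return max(finish.values()), realized

estimates = {t: sum(c) / len(c) for t, c in cost.items()}  # pass 0: averages
best = float("inf")
while True:
    makespan, estimates = schedule(estimates)
    if makespan >= best:
        break
    best = makespan
print(best)
```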


Comparison of Different Algorithms

[Slide figure: performance comparison with number of processors = 30 and number of tasks = 2000]


Time for Scheduling

