MATE: Monitoring, Analysis and Tuning Environment


Paradyn/Condor Week 2004, April 2004. MATE: Monitoring, Analysis and Tuning Environment. Anna Morajko, Tomàs Margalef and Emilio Luque, Universitat Autònoma de Barcelona.

Presentation Transcript

Paradyn/Condor Week 2004

April 2004

MATE: Monitoring, Analysis and Tuning Environment

Anna Morajko, Tomàs Margalef and Emilio Luque

Universitat Autònoma de Barcelona


Content

  • Introduction

  • Dynamic Performance Tuning

  • MATE

  • Tuning Techniques

  • Conclusions and future work


Introduction

Application performance

  • Demand for high-performance computation

  • The main goal of parallel/distributed applications: solve a given problem as fast as possible

  • Performance is one of the most important issues

  • Developers must optimize application performance to provide efficient and useful applications


Introduction

[Diagram: the application development cycle. Labels: Source, Instrumentation, Application, Monitored execution, Measurements, Performance data, Monitoring, Performance analysis, Bottlenecks, Source code relation, Solutions, Tuning, Modifications, Changes.]

Application performance optimization

Steps:

  • monitoring,

  • analysis,

  • tuning


Introduction

Application performance optimization

  • Difficulties in finding bottlenecks and determining their solutions for parallel/distributed applications

    • Many tasks that cooperate with each other

  • High degree of expertise

  • Application behavior may change depending on the input data or the execution environment

  • Difficult task especially for non-expert users


Introduction

Our goals

  • Investigate if it is possible to optimize performance of parallel/distributed applications dynamically without user intervention

  • Investigate the applicability of dynamic tuning

  • Create a tool that is able to dynamically optimize applications:

    • automatically improve application performance

    • improve the application execution during run time

    • tune without recompiling and rerunning

    • adapt application to existing conditions

  • Evaluate in practice whether dynamic tuning pays off


Introduction

[Diagram: dynamic tuning loop driven by a tool instead of the user. Labels: Application development, Source, User, Application, Execution, Instrumentation, Events, Monitoring, Performance data, Performance analysis, Problem / Solution, Tuning, Modifications, Tool.]

Dynamic automatic tuning


Dynamic Performance Tuning

Requirements

  • No user intervention

  • No source recompilation

  • Performance analysis on the fly

    • Global analysis

    • Decisions taken in a short time

    • Simple analysis and modifications

  • Run time monitoring

  • Run time tuning

    • Modifications performed carefully

  • Parallel/distributed application control

  • Low intrusion


Dynamic Performance Tuning

Key question

What can be tuned in an application?

Application knowledge

Limited information about the application

Tuning layers

Approaches to tuning


Dynamic Performance Tuning

[Diagram: layer stack, listed top to bottom: application code; library API and library code; OS API and operating system kernel; hardware.]

Tuning layers

  • Application specific code

  • Standard and custom libraries (API+code)

  • Operating system libraries (API+code)

  • Hardware


Dynamic Performance Tuning

Application

  • Application code changes

    • Different bottlenecks that depend on the application implementation

Libraries

  • Library code changes

  • API usage

    • Standard

      • C/C++ library -> memory management, dynamic containers

    • Custom

      • PVM, MPI -> communication

OS

  • Kernel code changes

  • API usage

    • Adjustment of options (e.g. TCP/IP socket), I/O request grouping

More bottlenecks common to a wider group of applications


Dynamic Performance Tuning

Approaches to tuning

  • Cooperative

    • Application must be prepared for tuning

    • Application-specific knowledge is provided

  • Automatic - black-box

    • Tuning of any application

    • No application-specific knowledge is required

    • Knowledge about bottleneck is required

    • No changes are introduced into the application source code

More cooperative, more application-specific

More automatic, more generic information available


Dynamic Performance Tuning

[Diagram: performance model as formulas and conditions for optimal behavior, with measurements as input and optimal values as output.]

Knowledge representation

  • Measure points

    • Where the instrumentation must be inserted to provide measurements

  • Performance model

    • Determines minimal execution time of the entire application

  • Tuning points/actions/synchronization

    • What and when can be changed in the application

      • point – element that may be changed

      • action – what to invoke on a point

      • synchronization – when a tuning action can be invoked to ensure application correctness
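As a rough illustration of how these three kinds of knowledge might be grouped, here is a minimal C++ sketch. All type and field names (MeasurePoint, TuningPoint, TuningKnowledge, suggestValue) are hypothetical and do not come from MATE.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical types, invented for illustration; not MATE's own definitions.

// Where instrumentation must be inserted to obtain a measurement.
struct MeasurePoint {
    std::string function;  // function to instrument
    bool atEntry;          // true: function entry, false: function exit
};

// What may be changed, what to do to it, and when it is safe to do so.
struct TuningPoint {
    std::string element;   // point: variable or function that may be changed
    std::string action;    // action: e.g. "set variable", "replace function"
    std::string syncAt;    // synchronization: function at whose entry to apply it
};

// Performance model: maps measurements to the value the tuning point should take.
using PerformanceModel = std::function<double(const std::vector<double>&)>;

struct TuningKnowledge {
    std::vector<MeasurePoint> measurePoints;
    PerformanceModel model;
    TuningPoint tuningPoint;
};

// Evaluates the model and returns the suggested value for the tuning point.
inline double suggestValue(const TuningKnowledge& k,
                           const std::vector<double>& measurements) {
    assert(k.model && "a performance model must be provided");
    return k.model(measurements);
}
```

Keeping the model as a plain callable is only one possible design; it separates what is measured from how the optimal value is derived.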


Dynamic Performance Tuning

[Diagram: the layer stack (application code, library API and code, OS API and kernel, hardware) annotated with the knowledge elements (measure points, performance model, tuning point/action/sync) and with two labels: "Provided by the user" and "Provided automatically by a tuning system".]

Application knowledge


Dynamic Performance Tuning

Manipulation of a running application

  • monitoring – collect information about the behavior of a running application

  • tuning – insert tuning code into a running application that improves its performance

Dynamic instrumentation – DynInst


Dynamic Performance Tuning

Dynamic modifications of a running application with DynInst

  • Function replacement

  • Function invocation

  • One-time function invocation

  • Function call elimination

  • Function parameter changes

  • Variable changes



MATE

MATE – Monitoring, Analysis and Tuning Environment

  • prototype implementation in C++

  • for PVM based applications

  • Sun Solaris 2.x / SPARC


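MATE

[Diagram: MATE architecture. Machine 1 and Machine 2 each run a pvmd daemon and an AC; the application tasks (Task1, Task2, Task3) run on these machines with DMLib loaded. The ACs instrument (instr.) and modify (modif.) the tasks, and the tasks send events to the Analyzer on Machine 3.]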

  • Application Controller - AC

  • Dynamic Monitoring Library - DMLib

  • Analyzer


MATE: Application Controller

Services

  • Distributed application control

    • Startup/exit of tasks (Tasker)

    • Startup/exit of PVM daemons, slave ACs (Hoster)

    • Clock synchronization

  • Application model management (Task Manager)

  • Performance monitoring (Monitors)

    • Manage monitoring instrumentation

    • Provide monitoring API for Analyzer

  • Performance tuning (Tuners)

    • Manage tuning instrumentation

    • Provide tuning API for Analyzer


MATE: Application Controller

[Diagram: on Machine 1, the Monitor inside the AC instruments Task1 and Task2 via DynInst, and DMLib is loaded into both tasks. The Analyzer on Machine 2 sends add event / remove event requests to the Monitor.]

Monitors

  • Instrumentation management via DynInst

    • Dynamically load DMLib

    • Generate monitoring snippets that call appropriate library functions

    • Insert/remove snippets in/from requested points

  • API

    • AddEventTrace(tid, eventId, funcName, instrPlace, attrs)

    • RemoveEventTrace(tid,eventId)
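The bookkeeping behind such an add/remove interface could look roughly like the sketch below. Only the two method names and their parameter names come from the slide; the parameter types, the InstrPlace enum, the boolean return values and the MonitorSketch class itself are assumptions made for illustration, and no DynInst call is made.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Monitor's bookkeeping; types are assumptions.
enum class InstrPlace { Entry, Exit };

struct EventTrace {
    std::string funcName;            // function to instrument
    InstrPlace place;                // where in the function the snippet goes
    std::vector<std::string> attrs;  // attributes to record, e.g. parameters
};

class MonitorSketch {
public:
    // Registers a trace for (tid, eventId); returns false if already present.
    bool AddEventTrace(int tid, int eventId, const std::string& funcName,
                       InstrPlace place, const std::vector<std::string>& attrs) {
        assert(!funcName.empty());
        // A real Monitor would generate and insert a snippet at this point.
        return traces_.emplace(std::make_pair(tid, eventId),
                               EventTrace{funcName, place, attrs}).second;
    }

    // Removes the trace; returns false if it was never added.
    bool RemoveEventTrace(int tid, int eventId) {
        return traces_.erase(std::make_pair(tid, eventId)) == 1;
    }

    std::size_t activeTraces() const { return traces_.size(); }

private:
    std::map<std::pair<int, int>, EventTrace> traces_;
};
```

Keying the table by (tid, eventId) mirrors the parameters of RemoveEventTrace, so removal needs no extra lookup information.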


MATE: Application Controller

[Diagram: on Machine 1, the Tuner inside the AC tunes Task1 and Task2 via DynInst. The Analyzer on Machine 2 sends "apply tuning" requests to the Tuner.]

Tuners

  • Tuning via DynInst

    • Generate tuning snippet according to the request

    • Insert tuning snippet

  • API

    • LoadLibrary(tid,path)

    • SetVariableValue(tid,params,brkpt)

    • ReplaceFunction(…)

    • InsertFunctionCall(…)

    • OneTimeFunctionCall(…)

    • RemoveFunctionCall(…)

    • FunctionParamChange(…)
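To make the role of the breakpoint argument concrete, the sketch below simulates a deferred variable change that is applied only when a task reaches a given function. It borrows the name SetVariableValue and its three parameter names from the slide; the types, the VarChange struct, the reach method and the in-memory variable map are all hypothetical, and nothing here touches a real process.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical "params" payload: which variable to set and to what value.
struct VarChange { std::string name; double value; };

class TunerSketch {
    struct Pending { VarChange change; std::string brkpt; };
    std::map<int, std::vector<Pending>> pending_;

public:
    // Defers the change until task `tid` reaches the function named `brkpt`.
    void SetVariableValue(int tid, const VarChange& params, const std::string& brkpt) {
        assert(!params.name.empty() && !brkpt.empty());
        pending_[tid].push_back(Pending{params, brkpt});
    }

    // Simulates task `tid` reaching `function`: applies the matching pending
    // changes to `vars` and returns how many were applied.
    int reach(int tid, const std::string& function,
              std::map<std::string, double>& vars) {
        int applied = 0;
        std::vector<Pending>& queue = pending_[tid];
        for (auto it = queue.begin(); it != queue.end();) {
            if (it->brkpt == function) {
                vars[it->change.name] = it->change.value;
                it = queue.erase(it);
                ++applied;
            } else {
                ++it;
            }
        }
        return applied;
    }
};
```

Deferring the write to a known point is one way to honor the synchronization requirement: the variable changes only where the application is prepared to see a new value.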


MATE: Dynamic Monitoring Library

[Diagram: on Machine 1, a snippet inserted at the entry of pvm_send(p1, p2) in Task1 calls DMLib_OpenEvent(), DMLib_AddIntAttr() twice, and DMLib_CloseEvent(). DMLib sends the resulting event record (example shown: 1 0 64884 524247262149 1) over TCP/IP to the Analyzer.]

Services

  • Register event

    • What – event type (id, place)

    • When – global timestamp

    • Where – task identifier

    • Requested attributes – e.g. function call parameters, return value

  • Deliver event to the Analyzer

  • API

    • DMLib_InitLogger(tid, analyzerHost,port,clockDiff)

    • DMLib_OpenEvent(id, nAttrs)

    • DMLib_AddIntAttr(value)

    • DMLib_AddFloatAttr(value)

    • DMLib_AddCharAttr(value)

    • DMLib_AddStringAttr(value)

    • DMLib_CloseEvent()

    • DMLib_DoneLogger()
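The open/add/close call pattern can be sketched as a small record builder. The class below is hypothetical: the record layout ("id tid timestamp nAttrs attr...") is an assumption and is not claimed to match DMLib's wire format, the timestamp is passed in explicitly to keep the example deterministic, and the record is returned as a string rather than sent over TCP/IP.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical sketch of the open/add/close pattern; not DMLib's implementation.
class EventLoggerSketch {
public:
    explicit EventLoggerSketch(int tid) : tid_(tid) {}

    // Starts a new event and announces how many attributes will follow.
    void OpenEvent(int id, int nAttrs, long long timestamp) {
        assert(!open_ && "previous event must be closed first");
        record_.str("");
        record_.clear();
        record_ << id << ' ' << tid_ << ' ' << timestamp << ' ' << nAttrs;
        expected_ = nAttrs;
        added_ = 0;
        open_ = true;
    }

    // Appends one integer attribute, e.g. a function call parameter.
    void AddIntAttr(int value) {
        assert(open_ && added_ < expected_);
        record_ << ' ' << value;
        ++added_;
    }

    // Finishes the event and returns the serialized record; a real library
    // would deliver it to the Analyzer instead.
    std::string CloseEvent() {
        assert(open_ && added_ == expected_);
        open_ = false;
        return record_.str();
    }

private:
    int tid_;
    std::ostringstream record_;
    int expected_ = 0;
    int added_ = 0;
    bool open_ = false;
};
```

Declaring the attribute count up front, as DMLib_OpenEvent(id, nAttrs) does, lets the receiver know how many fields to read before the next record starts.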


MATE: Analyzer

Services

  • Automatic performance analysis on the fly

    • Request for events

    • Collect incoming events

    • Find bottlenecks among events applying performance model

    • Find solutions that overcome bottlenecks

    • Send tuning request

  • The Analyzer is provided with application knowledge about performance problems

  • The information related to one problem is called a tuning technique

  • A tuning technique describes a complete performance optimization scenario


MATE: Analyzer

[Diagram: the Analyzer hosts a tunlet, which bundles a performance model, measure points, and tuning point/action/sync.]

Tunlets

  • Each technique is implemented in MATE as a tunlet

  • A tunlet contains specific code (analysis logic) related to one concrete performance problem

    • measure points – what events are needed

    • performance model – how to determine bottlenecks and solutions

    • tuning actions/points/synchronization - what to change, where, when

  • A tunlet is a C/C++ library dynamically loaded to the Analyzer process
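One way to picture a tunlet is as an interface with one method per kind of knowledge. The sketch below is a guess at such a shape, not MATE's tunlet API: TunletSketch, Event, TuningRequest and ThresholdTunlet are hypothetical names, and the threshold rule is a toy stand-in for a real performance model.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical data types exchanged between the Analyzer and a tunlet.
struct Event { int id; int tid; double value; };
struct TuningRequest { int tid; std::string variable; double newValue; };

// Hypothetical tunlet interface.
class TunletSketch {
public:
    virtual ~TunletSketch() = default;
    // Measure points: which event ids this tunlet needs.
    virtual std::vector<int> requiredEvents() const = 0;
    // Performance model plus tuning action: inspect an event and return
    // tuning requests, or nothing if no bottleneck is detected.
    virtual std::vector<TuningRequest> onEvent(const Event& e) = 0;
};

// Toy tunlet: when a measured value exceeds a limit, request that the
// (hypothetical) variable "factor" of that task be set to 0.5.
class ThresholdTunlet : public TunletSketch {
public:
    explicit ThresholdTunlet(double limit) : limit_(limit) {
        assert(limit > 0.0);
    }

    std::vector<int> requiredEvents() const override { return {1}; }

    std::vector<TuningRequest> onEvent(const Event& e) override {
        std::vector<TuningRequest> requests;
        if (e.id == 1 && e.value > limit_) {
            requests.push_back(TuningRequest{e.tid, "factor", 0.5});
        }
        return requests;
    }

private:
    double limit_;
};
```

Because every tunlet sits behind the same interface, the Analyzer can load several of them and feed each only the events it asked for.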


MATE: Analyzer

[Diagram: Analyzer internals. Components: Event Collector, Event Repository, Controller, AC Proxy, DTAPI, Application model, and several tunlets; a legend entry marks threads. Over TCP/IP, the Analyzer receives events from the DMLibs and metadata from the ACs, and sends instrumentation requests to the monitor and tuning requests to the tuner.]


Tuning Example

Workload balancing (App layer)

  • Imbalance problem:

    • Heterogeneous computing and communication powers

    • Varying amount of distributed work

  • Goal:

    • minimize the idle time by balancing the work among the processes considering efficiency of machines

  • Balancing -> faster machines process more work than slower ones

  • It cannot be statically balanced before program execution (different input data, network load, machine power and load)


Tuning Example

Workload balancing (App layer)

  • Many scheduling methods -> Factoring Scheduling method

    • Work is divided into different-size tuples according to the factor

  • Application must be tunable:

    • a well-known variable that represents the factor

    • the factor must be checked before each iteration of the work distribution

    • the work tuples are calculated using the factoring scheduling method and according to the current factor value
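The tunable distribution loop described above can be sketched as follows. This is a simplified reading of factoring scheduling, assuming that each batch hands out a fraction `factor` of the remaining work, split evenly among the workers; the function name factoringChunks and the `currentFactor` callback (standing in for the well-known variable that a tuner may change) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Sketch of factoring-style chunking; not the code used in the experiments.
std::vector<long> factoringChunks(long totalWork, int workers,
                                  const std::function<double()>& currentFactor) {
    assert(totalWork >= 0 && workers > 0);
    std::vector<long> chunks;
    long remaining = totalWork;
    while (remaining > 0) {
        // Re-read the factor before each distribution iteration, so that a
        // change made at run time takes effect on the next batch.
        const double factor = currentFactor();
        assert(factor > 0.0 && factor <= 1.0);
        // Each batch hands out `factor` of the remaining work, split evenly.
        long chunk = static_cast<long>(std::ceil(remaining * factor / workers));
        chunk = std::max(chunk, 1L);
        for (int w = 0; w < workers && remaining > 0; ++w) {
            const long size = std::min(chunk, remaining);
            chunks.push_back(size);
            remaining -= size;
        }
    }
    return chunks;
}
```

With 100 work units, 2 workers and a constant factor of 0.5, the chunks come out as 25, 25, 13, 13, 6, 6, 3, 3, 2, 2, 1, 1. Large early chunks keep overhead low, while the small final chunks let faster machines absorb the leftover work.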


Tuning Example

Example application

  • Forest Fire propagation – Xfire

  • High computation cost

Benefits by scenario:

  • 1) homogeneous and dedicated: up to 2%

  • 2) heterogeneous and dedicated: up to 49%

  • 3) heterogeneous and non-dedicated: up to 48%


Conclusions

  • The principal conclusion: dynamic tuning works; it is applicable, effective and useful under certain conditions

  • Limits of such tuning -> incomplete application information

  • Classification of layers where tuning can be performed (OS, libraries, apps)

  • Approaches to tuning: automatic and cooperative

  • Application knowledge representation:

    • measure points, performance model, tuning point/action/sync


Conclusions

  • Working prototype environment – MATE – that automatically monitors, analyses and tunes running applications

  • Practical experiments conducted with MATE on parallel/distributed applications show that it automatically adapts application behavior to existing conditions during run time


Future work

  • Global and local analysis

    • Scalability (problems with global analysis)

    • Some problems can be treated locally

  • Performance analysis

    • How tuning techniques influence other techniques

    • Approaches other than a performance model

  • Metrics

    • Complementary information provided by metrics

  • Provision of the application knowledge

    • Tunlet provided externally in a declarative manner

  • Instrumentation evaluation

    • Prediction of monitoring and tuning instrumentation cost


Future work

  • Tuning techniques

    • OS layer

      • TCP/IP options (e.g. sending without delay – Nagle’s algorithm)

      • I/O operations (e.g. read/write operations, I/O buffer size)

    • Library layer

      • Investigation of problems in MPI, numerical libraries

    • Application layer

      • Automatic selection of algorithm (e.g. sorting algorithm)

  • Recommendations

    • Provision of good explanation to the user

  • Towards grid


Thesis

March, 2004

Thank you very much

