Development of the distributed monitoring system for the NICA cluster
Development of the distributed monitoring system for the NICA cluster. Ivan Slepov (LHEP, JINR). Mathematical Modeling and Computational Physics, Dubna, Russia, July 8, 2013. The MultiPurpose Detector (MPD) to study Heavy Ion Collisions at NICA.


Development of the distributed monitoring system for the NICA cluster

Ivan Slepov

(LHEP, JINR)

Mathematical Modeling and Computational Physics Dubna, Russia, July 8, 2013


The MultiPurpose Detector (MPD) to study Heavy Ion Collisions at NICA


Software for the MultiPurpose Detector

ROOT + FairRoot (FairBase + FairSoft software packages) = MpdRoot Framework

MpdRoot Framework components:

  • Detector simulation

  • Data reconstruction

  • Event analysis


Computing resources for MPD data processing

CPU: 128 Xeon cores (in future ~10000 Xeon cores)

GPU: ~1500 Tesla cores


Motivation to develop a monitoring system

MPD users need more information about all of their own cluster nodes and public computers:

  • Computing resources information (free space, memory, CPU, etc.)

  • System load (load average, processes)

  • MPD software information (FairSoft version)

  • Cluster software information (SGE, xrootd, PROOF)

  • User tasks monitoring (batch processing and interactive jobs)
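
As an illustration of the first two items in the list above, the following minimal sketch gathers free disk space, CPU count and load average on a single node. It is written in Python with the standard library only; the slides name BASH scripts for this role, so this is a stand-in and not the presented implementation. The function name `collect_node_info` and the dictionary keys are hypothetical.

```python
import os
import shutil
import socket


def collect_node_info(path="/"):
    """Gather a few per-node values (hypothetical helper, not from the slides)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    try:
        load1, load5, load15 = os.getloadavg()  # available on Unix-like systems
    except (AttributeError, OSError):
        # Load average is not obtainable on this platform.
        load1 = load5 = load15 = None
    return {
        "host": socket.gethostname(),
        "cpu_count": os.cpu_count(),
        "disk_free_gb": round(usage.free / 1024**3, 2),
        "disk_total_gb": round(usage.total / 1024**3, 2),
        "load_1min": load1,
        "load_5min": load5,
        "load_15min": load15,
    }


if __name__ == "__main__":
    # Print one "key: value" line per collected item.
    for key, value in collect_node_info().items():
        print(f"{key}: {value}")
```

Run on every node, a script of this kind yields one record per host; software versions and user tasks would need additional, site-specific queries that are not sketched here.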


Monitoring system schemes

Scheme 1: collects general information

  • Cron (runs the job)

  • DSH software

  • BASH scripts

  • MySQL DB

  • PHP scripts

  • Web interface

Scheme 2: collects information about user tasks and provides data management

  • Web interface

  • PHP scripts

  • DSH software

  • BASH scripts

  • MySQL DB
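
The storage step of such a scheme can be sketched as follows, assuming that collected values end up as rows in a database that the web interface later reads. The slides name MySQL; Python's built-in `sqlite3` is swapped in here only so the example runs without a server. The table `node_status`, its columns, and all function names are hypothetical, and the cron and DSH steps are not shown.

```python
import sqlite3
import time


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) status table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS node_status (
               host TEXT NOT NULL,
               collected_at REAL NOT NULL,
               load_1min REAL,
               disk_free_gb REAL
           )"""
    )
    return conn


def store_sample(conn, host, load_1min, disk_free_gb, collected_at=None):
    """Insert one measurement, as a periodic collection job might do."""
    if collected_at is None:
        collected_at = time.time()
    conn.execute(
        "INSERT INTO node_status VALUES (?, ?, ?, ?)",
        (host, collected_at, load_1min, disk_free_gb),
    )
    conn.commit()


def latest_per_host(conn):
    """Return the newest row for each host, as a status page might query it."""
    rows = conn.execute(
        """SELECT s.host, s.load_1min, s.disk_free_gb
           FROM node_status AS s
           JOIN (SELECT host, MAX(collected_at) AS newest
                 FROM node_status GROUP BY host) AS m
             ON s.host = m.host AND s.collected_at = m.newest
           ORDER BY s.host"""
    )
    return rows.fetchall()
```

Keeping every sample and selecting the newest one per host at query time preserves history for later inspection; a design that overwrites one row per host would be smaller but would lose that history.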


Web interface for the monitoring system

  • MPD software information

  • Computing resources information

  • System load

  • User tasks monitoring






Motivation to develop the monitoring system

MPD users need more information about all of their own cluster nodes and public computers.

Why, if the grid concept, for example, hides the resources behind a layer of abstraction?

Because the MPD software is still under development and needs testing and debugging.

