System Management for distributed DCS


Presentation Transcript


  1. System Management for distributed DCS

  2. Contents
     • Introduction:
       • Planned Controls Architecture
       • Extended Controls Architecture
     • The SysMES Project
     • System Properties
     • System Functionality
     • Architecture
     • Current State
     • Outlook

  3. Planned Architecture
     [Diagram. Labels: Detector; DCS Board and DAQ Component, each running a Controls Server on Linux; controls communication protocol; Network; Controls Clients with GUI]

  4. Planned Architecture
     [Diagram. Labels: Detector; DCS Board and DAQ Component, each running a Controls Server on Linux; controls communication protocol; Network; Controls Clients with GUI]
     • Control servers obtain and save sensor data from the detector
     • Control servers send information to the network using a controls communication protocol
     • Control clients get and display information from the network

  5. Planned Architecture
     [Diagram. Labels: Detector; DCS Board and DAQ Component, each running a Controls Server on Linux; Network; Controls Clients with GUI]
     • Commands can be sent from clients to servers
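
The data flow on slides 3 to 5 can be sketched in a few lines. This is a minimal in-process illustration, not the protocol used by any real controls system: the `Network` class stands in for the controls communication protocol, and every class, method and channel name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Network:
    """Hypothetical stand-in for the controls communication protocol."""
    subscribers: List[Callable[[str, float], None]] = field(default_factory=list)
    servers: Dict[str, "ControlsServer"] = field(default_factory=dict)

    def publish(self, channel: str, value: float) -> None:
        # Deliver a measured value to every subscribed client.
        for callback in self.subscribers:
            callback(channel, value)

    def send_command(self, server_name: str, command: str) -> str:
        # Route a command from a client to the named server.
        return self.servers[server_name].handle_command(command)


class ControlsServer:
    """Obtains and saves sensor data, then sends it to the network."""

    def __init__(self, name: str, network: Network) -> None:
        self.name = name
        self.network = network
        self.history = []  # saved (channel, value) pairs
        network.servers[name] = self

    def read_sensor(self, channel: str, value: float) -> None:
        self.history.append((channel, value))
        self.network.publish(f"{self.name}/{channel}", value)

    def handle_command(self, command: str) -> str:
        if command == "clear":
            self.history.clear()
            return "cleared"
        return "unknown command"


class ControlsClient:
    """Gets values from the network and keeps the latest one per channel."""

    def __init__(self, network: Network) -> None:
        self.display = {}
        network.subscribers.append(self.on_value)

    def on_value(self, channel: str, value: float) -> None:
        self.display[channel] = value
```

A server created as `ControlsServer("dcs01", net)` that reads `("temperature", 21.5)` makes the value appear in every client's `display` under `"dcs01/temperature"`, and `net.send_command("dcs01", "clear")` covers the client-to-server direction.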

  6. Controls Systems Characteristics
     Controls system (e.g. EPICS)
     Strong Points
     • Designed for the measurement and visualization of system information
     • Very good scalability
     • Very high data measurement rate
     • Measured values build a real-time database
     Weak Points
     • Normally static configuration
     • Difficult to implement a high-availability infrastructure
     • Limited interactions with the control servers on the front end
     • Limited possibilities for information correlation for detecting undesirable states
     • Limited facility for automatic or manual reaction in case of failure

  7. Extended Architecture
     [Diagram. Labels: Detector; DCS Board and DAQ Component, each running a Controls Server with a SysMES Client; Job; Message; Network; SysMES Framework; Controls Clients with GUI]
     • Data obtained from detectors as before
     • SysMES client gets and stores information from controls server
     • SysMES client can react automatically without framework interaction
     • Communication between SysMES Clients and framework if necessary
     • SysMES framework sends a reaction to the clients

  8. Distributed System Management
     • Highly available architecture
       • Using clustering and redundancy (dynamic)
     • Possibility to interact with the DCS Board using Jobs
       • Execution of binaries by the client (e.g. restart of the Controls Server)
     • Possibility to reconfigure SysMES Clients/Servers on the fly
       • Changing the management capabilities without restart
     • Possibility to recover a SysMES Client configuration or state on the fly
       • After a crash, the client can recover a previous configuration and data state
     • Complex rule system for triggering on conditions
       • Complex rule triggering and reaction on the client or externally on the servers
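
The "execution of binaries by the client" item might look roughly like the sketch below. It is an assumption-laden illustration rather than SysMES code: the `ExecJob` structure, the allow-list check and the shape of the result dictionary are all hypothetical.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Set


@dataclass
class ExecJob:
    """Hypothetical job asking a client to run one binary with arguments."""
    job_id: str
    argv: List[str]


def run_exec_job(job: ExecJob, allowed_binaries: Set[str], timeout_s: float = 10.0) -> dict:
    """Run the job's binary if it is on the allow-list and report the outcome."""
    if job.argv[0] not in allowed_binaries:
        # Refuse anything the client was not configured to execute.
        return {"job_id": job.job_id, "status": "rejected", "returncode": None, "output": ""}
    completed = subprocess.run(
        job.argv, capture_output=True, text=True, timeout=timeout_s
    )
    status = "ok" if completed.returncode == 0 else "failed"
    return {
        "job_id": job.job_id,
        "status": status,
        "returncode": completed.returncode,
        "output": completed.stdout.strip(),
    }


if __name__ == "__main__":
    # The Python interpreter stands in for a real restart script here.
    job = ExecJob("job-1", [sys.executable, "-c", "print('restarted')"])
    print(run_exec_job(job, {sys.executable}))
```

Restricting execution to an allow-list is a design choice of this sketch; the slides do not say how the client decides which binaries a job may run.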

  9. The SysMES Project System Management for Embedded Systems

  10. System properties
     • Based on available, established standards
       • Interoperability and manufacturer independence
       • XML (Extensible Markup Language)
       • CIM (Common Information Model)
     • Object-oriented modelling of the complete system
       • Simple modelling of their relationships
       • Reusability
     • Decentralization
       • Decentralized modelling
       • Decentralized storage of information
       • Decentralized and dynamic configuration management
       • Decentralized management

  11. System properties
     • High availability and scalability
       • No single point of failure
       • Redundant storage of information
       • Clustering of management resources and DB
       • Load balancing
     • Flexibility
       • Platform independent
     • Self management
       • Automatic reaction to triggering conditions
     • Reliability
       • Transaction-based communication to avoid information loss

  12. System functionality
     • Modelling
       • Object-oriented modelling of resources in UML
       • Creation of objects from this model
       • Transfer of objects to the management framework
     • Monitoring
       • Monitoring occurs directly in the Client
       • Interface to other monitoring systems
     • Message generation
       • Evaluation of the measured values in the Client
       • Dynamic decision of which values have to be processed
       • Prevention of management environment system overload
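
One way to read the message-generation points is that the client evaluates each measured value locally and forwards a message only when something changes. The sketch below assumes a single upper threshold per monitor; the `Monitor` class, its field names and the message layout are hypothetical, not the SysMES design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Monitor:
    """Hypothetical client-side monitor with a single upper threshold."""
    name: str
    warn_above: float
    last_state: str = "ok"

    def evaluate(self, value: float) -> Optional[dict]:
        """Return a message only when the state changes, otherwise None."""
        state = "warning" if value > self.warn_above else "ok"
        if state == self.last_state:
            # Unchanged state: nothing is sent, which keeps load off the framework.
            return None
        self.last_state = state
        return {"monitor": self.name, "state": state, "value": value}


if __name__ == "__main__":
    monitor = Monitor("cpu_temperature", warn_above=70.0)
    for reading in [50.0, 60.0, 75.0, 80.0, 65.0]:
        message = monitor.evaluate(reading)
        if message is not None:
            print(message)
```

With the five readings above, only the crossing into the warning state and the return to normal produce messages; the other three readings are absorbed on the client.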

  13. System functionality
     • Message Handling
       • Decentralized storage of messages in DB and on Client
     • Job Management
       • Communication with Clients through Jobs
       • Different Job types
         • Configuration Jobs: e.g. changeMonitor
         • Update Jobs: e.g. addMonitor
         • Management Jobs: e.g. deleteMessages
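
A client could dispatch the three job examples named on this slide as sketched below. Only the job names `changeMonitor`, `addMonitor` and `deleteMessages` come from the slide; the job dictionary layout, the argument names and the return values are assumptions made for illustration.

```python
class Client:
    """Hypothetical client that keeps monitors and locally stored messages."""

    def __init__(self) -> None:
        self.monitors = {}   # monitor name -> settings dict
        self.messages = []   # messages stored on the client

    def handle_job(self, job: dict) -> dict:
        # Map each job type to the method that carries it out.
        handlers = {
            "addMonitor": self._add_monitor,          # Update Job
            "changeMonitor": self._change_monitor,    # Configuration Job
            "deleteMessages": self._delete_messages,  # Management Job
        }
        handler = handlers.get(job["type"])
        if handler is None:
            return {"status": "error", "reason": f"unknown job type {job['type']!r}"}
        return handler(job.get("args", {}))

    def _add_monitor(self, args: dict) -> dict:
        self.monitors[args["name"]] = dict(args.get("settings", {}))
        return {"status": "ok"}

    def _change_monitor(self, args: dict) -> dict:
        if args["name"] not in self.monitors:
            return {"status": "error", "reason": "no such monitor"}
        self.monitors[args["name"]].update(args.get("settings", {}))
        return {"status": "ok"}

    def _delete_messages(self, args: dict) -> dict:
        deleted = len(self.messages)
        self.messages.clear()
        return {"status": "ok", "deleted": deleted}
```

For example, `client.handle_job({"type": "addMonitor", "args": {"name": "fan", "settings": {"interval_s": 5}}})` registers a monitor, and a later `changeMonitor` job with new settings updates it in place.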

  14. System functionality
     • Configuration Management
       • Client knows its current configuration state
       • Client stores its current configuration for recovery
       • All possible configurations are stored on the server
     • Complex Rule Handling
       • 3-tier rule management system
         • Tier 1: Rule management on the client (reaction < 10 ms)
         • Tier 2: Simple rule management on the server (reaction < 300 ms)
         • Tier 3: Complex rule management on the server using an expert system (reaction < 1 s)
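
The three rule tiers can be pictured as an escalation chain: a message is checked against the client rules first and is passed on only if no rule reacts. The slide gives the reaction-time targets but not the hand-over logic, so the escalation order, the `Tier` structure and the two example rules below are assumptions; the targets are recorded but not enforced.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A rule inspects a message and returns a reaction, or None if it does not apply.
Rule = Callable[[dict], Optional[str]]


@dataclass
class Tier:
    name: str
    target_s: float  # reaction-time target quoted on the slide (informational only)
    rules: List[Rule]


def react(message: dict, tiers: List[Tier]) -> Tuple[Optional[str], Optional[str]]:
    """Return (tier name, reaction) from the first tier with a matching rule."""
    for tier in tiers:
        for rule in tier.rules:
            reaction = rule(message)
            if reaction is not None:
                return tier.name, reaction
    return None, None  # no tier reacted


def restart_on_crash(message: dict) -> Optional[str]:
    # Hypothetical client rule.
    return "restart controls server" if message.get("state") == "crashed" else None


def throttle_on_heat(message: dict) -> Optional[str]:
    # Hypothetical simple server rule.
    return "reduce load" if message.get("temperature", 0) > 80 else None


TIERS = [
    Tier("client", 0.010, [restart_on_crash]),
    Tier("server-simple", 0.300, [throttle_on_heat]),
    Tier("server-expert", 1.0, []),  # an expert system would plug in here
]
```

Calling `react({"state": "crashed"}, TIERS)` is answered by the client tier, while `react({"temperature": 90}, TIERS)` falls through to the simple server tier.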

  15. Architecture – Physical View
     [Diagram. Only recoverable label: AccessPoint]

  16. Architecture – Logic View
     [Diagram. Highlighted component: CIM Server, with labels ArgoUML / Poseidon / Rational Rose; Java CIM Object Manager; CIM Navigator; XMI2MOF]
     [Shared labels: Class Management Model (xmi); XMI2MOF (mof); Modelling; Object Management Model (object); Deploy Module (xml); CIMConnector; Rule Module; Message Module; Job Module; Database; ManagementFramework; Connection Module; Message Handling; Rule Handling; Job Handling; Monitoring; Clients; Third party Interface; Operating System]

  17. Architecture – Logic View
     [Diagram. Highlighted component: WAM / LAM Server, with labels MySQL Database Cluster; Java AccessPoint; JBOSS 4.0; Tomcat 5.0.28; Enterprise JavaBeans]
     [Shared labels: Class Management Model (xmi); XMI2MOF (mof); Modelling; Object Management Model (object); Deploy Module (xml); CIMConnector; Rule Module; Message Module; Job Module; Database; ManagementFramework; Connection Module (xml); Message Handling; Rule Handling; Job Handling; Monitoring; Clients; Third party Interface; Operating System]

  18. Architecture – Logic View
     [Diagram. Highlighted component: Thin / Full Client, with labels Third Party Interface; Linux; μcLinux; Java Interpreter / C Compiler; http]
     [Shared labels: Class Management Model (xmi); XMI2MOF (mof); Modelling; Object Management Model (object); Deploy Module (xml); CIMConnector; Rule Module; Message Module; Job Module; Database; ManagementFramework; Connection Module (xml); Message Handling; Rule Handling; Job Handling; Monitoring; Clients; Third party Interface; Operating System]

  19. Current State
     • Prototype implementation at Kirchhoff Institute for Physics has been completed
       • Management of HLT experimental cluster
         • 32 Linux PCs
         • Monitoring with EPICS/SNMP, Lemon and Ganglia
         • Rules, Jobs and Configuration Management used
     • Next test at University of Paderborn, Arminius Cluster, 3rd to 6th March
       • 200 PCs x 2 Intel Xeon 3.2 GHz processors
       • Online analysis of simulated ALICE TPC (Time Projection Chamber) data
       • Management of cluster analysing 1000 simulated proton-proton events at 1-3 MByte per event
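
The figures quoted for the Paderborn test imply the following totals. This is plain arithmetic on the slide's numbers, with $V$ for the total volume of simulated event data and $N_{\text{CPU}}$ for the processor count (both symbols introduced here):

```latex
\begin{align*}
V_{\min} &= 1000 \times 1\ \text{MByte} = 1000\ \text{MByte}, \\
V_{\max} &= 1000 \times 3\ \text{MByte} = 3000\ \text{MByte}, \\
N_{\text{CPU}} &= 200 \times 2 = 400 .
\end{align*}
```

So the test data set amounts to roughly 1 to 3 GByte, handled by a cluster with 400 processors.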

  20. Outlook
     • Future implementation on 500-node ALICE HLT cluster
     • Dynamisation of HLT cluster:
       • On-the-fly rerouting of data through the analysis chain
       • On-the-fly shutdown of idle nodes to save resources
     • Extension of the CBM project detector controls system (if wanted?!)

  21. Summary
     • The SysMES Framework includes the advantages of current controls systems and extends their functionality to dynamic management
     • It is suited to complete controls systems and cluster management systems

  22. Thank you for your attention Any questions?
