
SAM Architecture

15.05.2013. team. Contents: architecture overview (basic architecture, the whole picture), components overview, summary.

pancho


  1. SAM Architecture 15.05.2013 team

  2. Contents
     • Architecture overview
       • basic architecture
       • the whole picture
     • Components overview
     • Summary

  3. Basic architecture

  4. Architecture: the whole picture

  5. Components overview: ATP (Aggregated Topology Provider)
     • polls information sources to gather topology
       • services, flavours, sites, downtimes, vo-mappings, capacity, federations, tiers…
     • Web API
     • local and central deployments
     • updated twice per hour
     • Python + (MySQL, PL/SQL)

  6. Components overview: POEM (Profile Management)
     • stores profile definitions
     • synchronizes instances via poem_sync daemon
     • namespace support
     • web admin interface
     • web API
     • local and central deployments
     • Python + Django
     [Figure: definition for ATLAS_CRITICAL]
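To make "profile definition" and "namespace support" concrete, here is a minimal sketch of how a profile could be modelled. The slides do not show POEM's actual schema, so the class, its fields, the namespace value and the metric names below are all hypothetical; only the profile name ATLAS_CRITICAL comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical model of a profile: a named set of metrics per service flavour."""
    namespace: str   # assumption: a namespace distinguishes profiles with the same name
    name: str
    # service flavour -> metric names expected for that flavour
    metrics: dict = field(default_factory=dict)

    def qualified_name(self):
        """Return the profile name prefixed with its namespace."""
        return f"{self.namespace}.{self.name}"

    def metrics_for(self, flavour):
        """Return the sorted metric names defined for one service flavour."""
        return sorted(self.metrics.get(flavour, ()))


# Illustrative data only: the namespace and metric names are invented.
atlas_critical = Profile(
    namespace="example",
    name="ATLAS_CRITICAL",
    metrics={
        "CE": ["example.CE-JobSubmit"],
        "SRMv2": ["example.SRM-Put", "example.SRM-Get"],
    },
)
```

A consumer such as NCG or MRS would then ask the profile which metrics to expect for a given flavour, for example `atlas_critical.metrics_for("SRMv2")`.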

  7. Components overview: NCG (Nagios Configuration Generator)
     • reads from ATP and POEM via API
     • generates Nagios configuration to
       • set up which metrics to run
       • in which services for which sites
     • configures metric attributes
       • test parameters (SE path, CE queue…)
       • Nagios execution flags (Passive check, obsess…)
     • specifies metrics to import from other nodes
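The generation step can be sketched as a join between topology (from ATP) and profile metrics (from POEM) that emits one Nagios service object per host and metric. This is an illustration under stated assumptions, not NCG's real code: the function names, the input shapes, the hosts, the metric names and the check commands are hypothetical. The directives used (`host_name`, `service_description`, `check_command`, `active_checks_enabled`, `passive_checks_enabled`, `obsess_over_service`) are standard Nagios service object directives.

```python
def render_service(host, metric, attrs):
    """Render one Nagios service object definition as text."""
    return "\n".join([
        "define service {",
        f"    host_name               {host}",
        f"    service_description     {metric}",
        f"    check_command           {attrs['check_command']}",
        # A passive metric is not actively scheduled by this Nagios instance.
        f"    active_checks_enabled   {0 if attrs.get('passive') else 1}",
        "    passive_checks_enabled  1",
        f"    obsess_over_service     {1 if attrs.get('obsess') else 0}",
        "}",
    ])


def generate_config(topology, profile_metrics, metric_attrs):
    """Emit a service definition for every (host, metric) pair the profile requires.

    topology:        list of (site, host, flavour) tuples   (assumed shape)
    profile_metrics: flavour -> list of metric names         (assumed shape)
    metric_attrs:    metric name -> dict of attributes       (assumed shape)
    """
    blocks = []
    for site, host, flavour in topology:
        for metric in profile_metrics.get(flavour, []):
            blocks.append(f"# site: {site}, flavour: {flavour}")
            blocks.append(render_service(host, metric, metric_attrs[metric]))
    return "\n".join(blocks) + "\n"


# Invented example data.
topology = [
    ("EXAMPLE-SITE", "ce01.example.org", "CE"),
    ("EXAMPLE-SITE", "se01.example.org", "SRMv2"),
]
profile_metrics = {
    "CE": ["example.CE-JobSubmit"],
    "SRMv2": ["example.SRM-Put", "example.SRM-Get"],
}
metric_attrs = {
    "example.CE-JobSubmit": {"check_command": "check_example_ce!example-queue", "obsess": True},
    "example.SRM-Put": {"check_command": "check_example_srm!/example/path", "obsess": True},
    "example.SRM-Get": {"check_command": "check_example_srm!/example/path", "passive": True},
}

config = generate_config(topology, profile_metrics, metric_attrs)
```

Printing `config` yields three `define service { ... }` blocks, one per host and metric pair, each preceded by a comment naming its site and flavour.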

  8. Components overview: Nagios and probes
     • patched and packaged
     • probes encapsulate tests, which are run periodically
     • probes are provided by different parties
       • SAM supports only the SAM probe
       • Product Teams provide their own probes
     • imports test results from other Nagios instances
     • special probe distributes metric results
       • send_to_db
       • send_to_msg

  9. Components overview: MRS (Metric Results Store)
     • aggregates Nagios results
     • stores all metric results
     • summarizes service status from metric results
     • keeps track of status changes
       • per metric and service
       • per service and profile
     • keeps track of missing and removed metrics
     • bootstraps from POEM every hour
       • which metrics are to be expected for each service and profile?
     • local and central deployments
     • MySQL and Oracle
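Two of the MRS behaviours above, summarizing a service status from metric results and recording only status changes, can be sketched as follows. The slides do not state the actual rules, so this assumes that the service status is the worst status among the metrics the profile expects, that an expected metric with no result counts as MISSING, and that the severity ordering is as listed; all names are hypothetical.

```python
# Assumed ordering from least to most severe; not taken from the slides.
SEVERITY = ["OK", "WARNING", "UNKNOWN", "MISSING", "CRITICAL"]


def service_status(expected_metrics, latest_results):
    """Summarize one service: the worst status among the metrics the profile expects.

    expected_metrics: metric names expected for this service and profile
    latest_results:   metric name -> latest status string
    """
    statuses = [latest_results.get(metric, "MISSING") for metric in expected_metrics]
    if not statuses:
        return "UNKNOWN"
    return max(statuses, key=SEVERITY.index)


def record_status(history, timestamp, status):
    """Append (timestamp, status) only when the status differs from the last entry."""
    if not history or history[-1][1] != status:
        history.append((timestamp, status))
        return True
    return False


# Invented example: hourly summaries for one service and profile.
expected = ["example.SRM-Put", "example.SRM-Get"]
hourly_results = [
    {"example.SRM-Put": "OK", "example.SRM-Get": "OK"},
    {"example.SRM-Put": "OK", "example.SRM-Get": "OK"},
    {"example.SRM-Put": "CRITICAL", "example.SRM-Get": "OK"},
    {"example.SRM-Put": "OK"},  # one expected metric has no result
]
history = []
for hour, results in enumerate(hourly_results):
    record_status(history, hour, service_status(expected, results))
```

Storing changes rather than every sample keeps the history compact: the four hourly samples above produce three entries, because the second hour repeats the status of the first.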

  10. Components overview: ACE (Availability Computation Engine)
     • summarizes MRS statuses
     • translates status changes into status evolution
       • hourly, daily, weekly and monthly granularities
       • service, flavour and site level aggregations
     • generates availability values using a profile algorithm
       • uses logic operations on status values
       • e.g.: (ARC-CE + CE) * SRMv2 * BDII
     • takes downtime into account to generate reliability values
     • runs every hour
     • Python + Oracle SQL
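The profile algorithm example, (ARC-CE + CE) * SRMv2 * BDII, can be read as a boolean-style expression over flavour statuses. The sketch below assumes that `+` means OR (take the best status) and `*` means AND (take the worst status), and that availability and reliability follow the commonly used convention of excluding unknown time from both and scheduled downtime from reliability only. None of this is confirmed by the slides; the status ranking, the tuple encoding of the expression and the function names are hypothetical.

```python
# Assumed ranking from best to worst; not taken from the slides.
RANK = {"OK": 0, "WARNING": 1, "UNKNOWN": 2, "CRITICAL": 3}

# (ARC-CE + CE) * SRMv2 * BDII, encoded as nested tuples instead of parsed text.
PROFILE_ALGORITHM = ("and", ("or", "ARC-CE", "CE"), "SRMv2", "BDII")


def evaluate(expr, flavour_status):
    """Evaluate the expression: 'or' keeps the best status, 'and' the worst."""
    if isinstance(expr, str):
        return flavour_status.get(expr, "UNKNOWN")
    op, *operands = expr
    values = [evaluate(operand, flavour_status) for operand in operands]
    pick = min if op == "or" else max
    return pick(values, key=RANK.__getitem__)


def availability_and_reliability(hourly_statuses):
    """Compute (availability, reliability) from hourly site statuses.

    Hours in scheduled downtime are represented here by the status "DOWNTIME"
    (an assumption). Returns None for a value whose denominator is zero.
    """
    total = len(hourly_statuses)
    up = hourly_statuses.count("OK")
    unknown = hourly_statuses.count("UNKNOWN")
    downtime = hourly_statuses.count("DOWNTIME")
    known = total - unknown
    availability = up / known if known else None
    reliability = up / (known - downtime) if known - downtime else None
    return availability, reliability


# Invented example: the site stays OK because one of the two CE flavours is OK.
site_status = evaluate(
    PROFILE_ALGORITHM,
    {"ARC-CE": "CRITICAL", "CE": "OK", "SRMv2": "OK", "BDII": "OK"},
)
hours = ["OK"] * 6 + ["CRITICAL"] * 2 + ["DOWNTIME"] * 2
availability, reliability = availability_and_reliability(hours)
```

With these ten hours, two of which fall in scheduled downtime, reliability comes out higher than availability because the downtime hours are removed from its denominator.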

  11. Components overview: MyWLCG
     • Visualization tool for SAM data
       • metric results
       • service, flavour and site status, availability and reliability
     • Reads from ATP, POEM, MRS and ACE via database
     • Other applications
       • availability trends, experiment usage, topology view…
     • Exposes SAM results via web API
     • Report generation
     • Python + Django

  12. Components overview: Messaging clients
     • multiple, heterogeneous clients
       • send_to_msg, consume_to_db
       • msg_to_handler, recv_from_queue
       • wnjob
       • atp_synchro
     • transports metric data from one instance to another
     • integrates third party monitoring systems
     • MEG (Message Groove)
       • common messaging client framework
       • Python + stompclt

  13. Summary
     • ATP provides topology
     • POEM defines profiles
     • NCG configures Nagios
     • Nagios runs the probes
     • Messaging transports results
     • MRS aggregates metric results into status
     • ACE aggregates status into availability
     • MyWLCG displays and exposes data
