
SAM Aggregated Topology Provider

SAM Aggregated Topology Provider. pedro.andrade@cern.ch, 5 June 2013, IT/SDC/MI section meeting. History: development started in 2009 during the EGEE project by Steve Traylen, James Casey, David Collados, and others. Many features/improvements were added by the BARC team during 2011-12.



Presentation Transcript

  1. SAM Aggregated Topology Provider • pedro.andrade@cern.ch • 5 June 2013 • IT/SDC/MI section meeting

  2. History
     • Development started in 2009 during the EGEE project by Steve Traylen, James Casey, David Collados, and others
     • Many features/improvements added by the BARC team during 2011-12
     • Maintained by me in recent months

  3. Overview
     • ATP scope is:
       • Aggregate grid topology info from different sources
       • Provide a single authoritative source of grid topology info
       • Manage groups of resources from the VO perspective

  4. Architecture (diagram; labels shown: GOCDB, BDII, OSG, Central Operations Portal, ATP Sync, VO Feeds, MyWLCG ATP API, MyWLCG ATP WEB, ORACLE, MYSQL)

  5. Architecture
     • ATP is composed of 3 main packages:
       • ATP Sync: A Python-based package that periodically synchronizes data from various topology providers. It also includes the PL/SQL for Oracle/MySQL and the Django model.
       • MyWLCG ATP Web: A front-end for ATP developed in Django. It provides a web interface to display/find the topologies of grid resources.
       • MyWLCG ATP API: A front-end for ATP developed in Django. It provides programmatic feeds to expose ATP data through JSON/XML interfaces.
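As an illustration of what "programmatic feeds through JSON/XML interfaces" can look like, here is a minimal sketch using only the Python standard library. It is not the MyWLCG ATP API code: the record fields (`hostname`, `flavour`, `site`), the tag names, and the helper names `to_json` and `to_xml` are all hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical topology records; the field names are illustrative only.
RECORDS = [
    {"hostname": "ce01.example.org", "flavour": "CE", "site": "EXAMPLE-SITE"},
]


def to_json(records):
    """Serialize the records as a JSON array with stable key order."""
    return json.dumps(records, sort_keys=True)


def to_xml(records, root_tag="services", item_tag="service"):
    """Serialize the same records as XML, one child element per field."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for key in sorted(record):
            ET.SubElement(item, key).text = str(record[key])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_json(RECORDS))
    print(to_xml(RECORDS))
```

Both functions serialize the same in-memory records, which is the point of a feed layer: clients pick the representation, while the data source stays the same.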

  6. Input
     • CIC Portal
       • VOs
       • VOMS
       • VO contacts
     • GOCDB (EGI)
       • Sites, Services
       • Flavours, Downtimes
       • Site and region contacts
     • RSV (OSG)
       • Sites, Services
       • Flavours, Downtimes
       • Capacity
     • GStat
       • Capacity
     • REBUS
       • WLCG federations
       • WLCG tiers
     • BDII
       • Service endpoints
       • Services/VOs mapping
       • MPI info
     • VO Feeds
       • VO groups of services

  7. Clients
     • ATP WEB: POEM, NCG
     • ATP DB: MRS, ACE, MyWLCG

  8. Source Code
     • Repo: http://svnweb.cern.ch/world/wsvn/sam/trunk/atp/
     • Doc: http://sam-doc.web.cern.ch/sam-doc/atp/doc/build/html/

  9. Configuration
     • Default configuration structure distributed in ATP package
       • atp_synchro.conf: main configuration file
       • atp_db.conf: database connection configuration
       • atp_logging_files.conf: location of log configuration file
       • atp_logging_parameters_config.conf: log configuration
       • roc.conf: list of enabled regions
       • vo_feeds.conf: list of enabled VO feeds

  10. Execution
     • Cronjob running ATP daemon:
         [root@samnag031 ~]# cat /etc/cron.d/atp-sync
         50 * * * * edguser [ -f /var/lock/subsys/atp_synchro ] && ( /usr/bin/atp_synchro -d /etc/atp/atp_db.conf -c /etc/atp/atp_synchro.conf -l /etc/atp/atp_logging_files.conf ) > /dev/null 2>&1
     • ATP sync execution is structured in synchronizers:
         [root@samnag031 ~]# cat /etc/atp/atp_synchro.conf
         cic_portal = Yes
         gocdb_topology = Yes
         gocdb_downtime = Yes
         osg = Yes
         osg_downtime = Yes
         gstat = Yes
         bdii = Yes
         vo_feeds = Yes
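To check which synchronizers a given atp_synchro.conf switches on, the flat `name = Yes` lines shown on this slide can be read with a few lines of Python. This is a minimal sketch, not ATP's own parser: the function name `enabled_synchronizers`, the handling of `#` comments, and the `No` value in the sample are assumptions (the slide only shows `Yes` entries).

```python
def enabled_synchronizers(conf_text):
    """Return the names whose value is 'yes' (case-insensitive), in file order."""
    enabled = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if "=" not in line:
            continue  # skip blank lines and anything that is not key = value
        name, value = (part.strip() for part in line.split("=", 1))
        if name and value.lower() == "yes":
            enabled.append(name)
    return enabled


# Hypothetical sample in the same shape as the slide's excerpt.
SAMPLE = """\
cic_portal = Yes
gocdb_topology = Yes
gocdb_downtime = No
vo_feeds = Yes
"""

if __name__ == "__main__":
    print(enabled_synchronizers(SAMPLE))
```

On a real host the text would come from reading /etc/atp/atp_synchro.conf; the sample string just keeps the sketch self-contained.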

  11. Logs
     • Log of last execution: /var/log/atp/atp.log
     • Log of all executions: /var/log/atp/atp_full.log (logrotate)
     • Errors are also sent to system logging
     • Six logging levels:
       • CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET
       • Default configuration is on INFO (20)
     • Standard log file line:
       • “2012-03-22 15:24:02,308 - ATP - INFO - CIC - Execution - Starting”
       • CIC: synchronizer name (e.g. CIC, GOCDB Topology, VOFeeds, etc.)
       • Execution: task type (e.g. configuration, validation, execution)
       • Starting: action description
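The level names and the value 20 for INFO match Python's standard `logging` module, and the sample line has the shape produced by a format string such as the one below. The sketch assumes that format and that the synchronizer, task, and action are joined with " - " inside the message; the names `LOG_FORMAT`, `LINE_RE`, `parse_line`, and `demo` are illustrative, not taken from ATP.

```python
import io
import logging
import re

# Assumed format; it reproduces the shape of the sample line on the slide.
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

# Splits a line into timestamp, logger, level, synchronizer, task, action.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<logger>\S+) - (?P<level>[A-Z]+) - "
    r"(?P<synchronizer>.+?) - (?P<task>\S+) - (?P<action>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def demo():
    """Emit one INFO and one DEBUG record at INFO level; return the captured text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger = logging.getLogger("ATP")
    logger.setLevel(logging.INFO)  # INFO == 20, the default mentioned above
    logger.propagate = False
    logger.addHandler(handler)
    logger.info("%s - %s - %s", "CIC", "Execution", "Starting")
    logger.debug("suppressed because DEBUG (10) is below INFO (20)")
    logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == "__main__":
    for line in demo().splitlines():
        print(parse_line(line))
```

The synchronizer group is matched non-greedily so that names containing spaces, such as "GOCDB Topology", still split correctly from the task type.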

  12. Debug Tips
     • The atp.log is quite useful to understand problems:
       • It will at least help to locate the affected synchronizer
     • However, ATP is based on many PL/SQL procedures/functions:
       • SQL Developer will help ;)
     • ATP synchronizes from distinct external data sources. ATP execution can fail due to “invalid” or “not available” input data:
       • Check the “validation” tag in atp.log to understand which data source was not reachable or was providing invalid data
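The validation check can be scripted. The sketch below assumes the line layout from the Logs slide (timestamp, logger, level, synchronizer, task type, action, separated by " - ") and that validation problems are logged at WARNING or above; both are assumptions, and the sample messages are invented for illustration.

```python
import re

# Assumed layout: timestamp - logger - LEVEL - synchronizer - task - action
LINE_RE = re.compile(
    r"^\S+ \S+ - \S+ - (?P<level>[A-Z]+) - "
    r"(?P<synchronizer>.+?) - (?P<task>\S+) - (?P<action>.*)$"
)

PROBLEM_LEVELS = {"WARNING", "ERROR", "CRITICAL"}


def validation_problems(log_text):
    """Return (synchronizer, level, action) for validation lines at WARNING or above."""
    problems = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        if match["task"].lower() == "validation" and match["level"] in PROBLEM_LEVELS:
            problems.append((match["synchronizer"], match["level"], match["action"]))
    return problems


# Invented sample lines; real atp.log messages may be worded differently.
SAMPLE_LOG = """\
2012-03-22 15:24:02,308 - ATP - INFO - CIC - Execution - Starting
2012-03-22 15:24:05,120 - ATP - ERROR - GOCDB Topology - Validation - Source not reachable
2012-03-22 15:24:06,001 - ATP - INFO - VOFeeds - Validation - OK
"""

if __name__ == "__main__":
    for synchronizer, level, action in validation_problems(SAMPLE_LOG):
        print(f"{synchronizer}: {level}: {action}")
```

Pointing the function at the contents of /var/log/atp/atp.log narrows a failed run down to the synchronizer whose input was unreachable or invalid, which is the first step suggested above.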

  13. Problems
     • No support for other topology entities
       • Designed to monitor only services
     • Services check
       • Strict dependency on services declared in GOCDB, OIM
     • Duplication of PL/SQL code
       • Difficult to manage two versions for Oracle and MySQL
     • Complex relational database model
       • e.g. isdeleted flags

  14. Suggestions
     • ATP was started in Sep 2009… 4 years ago
       • Perhaps it is ready for retirement
     • The grid topology is always evolving
       • Perhaps less focus on state, and more on history
     • Support for two RDBMS is hard
       • Perhaps no RDBMS at all would be even better
