
Extend/alter Condor via developer APIs/plugins CERN Feb 14 2011

Todd Tannenbaum. Extend/alter Condor via developer APIs/plugins CERN Feb 14 2011. Some classifications. Application Program Interfaces (APIs) Job Control Operational Monitoring Extensions. Job Control APIs. The biggies: Command Line Tools DRMAA Condor DBQ Web Service Interface (SOAP)


Presentation Transcript


  1. Todd Tannenbaum Extend/alter Condor via developer APIs/plugins CERN Feb 14 2011

  2. Some classifications Application Program Interfaces (APIs) • Job Control • Operational Monitoring Extensions

  3. Job Control APIs The biggies: • Command Line Tools • DRMAA • Condor DBQ • Web Service Interface (SOAP) http://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=SoapWisdom

  4. Command Line Tools • Don’t underestimate them! • Your program can create a submit file on disk and simply invoke condor_submit:

       system("echo universe=VANILLA > /tmp/condor.sub");
       system("echo executable=myprog >> /tmp/condor.sub");
       . . .
       system("echo queue >> /tmp/condor.sub");
       system("condor_submit /tmp/condor.sub");
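The write-then-submit pattern on this slide can be sketched in Python as well. This is a minimal sketch, not part of the original talk: the helper names `render_submit_description` and `submit_from_file` are hypothetical, and it assumes `condor_submit` is on the PATH when `submit_from_file` is called.

```python
import subprocess
from pathlib import Path


def render_submit_description(settings):
    """Return submit-file text: one 'key = value' line per setting, then 'queue'."""
    lines = [f"{key} = {value}" for key, value in settings.items()]
    lines.append("queue")
    return "\n".join(lines) + "\n"


def submit_from_file(path, settings):
    """Write the submit file to disk and hand it to condor_submit.

    Hypothetical helper: requires a working Condor installation.
    """
    Path(path).write_text(render_submit_description(settings))
    # check=True raises CalledProcessError if condor_submit exits non-zero.
    return subprocess.run(["condor_submit", str(path)], check=True,
                          capture_output=True, text=True)
```

Building the text in one function and running the tool in another keeps the rendering step checkable on machines without Condor installed.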

  5. Command Line Tools • Your program can create a submit file and give it to condor_submit through stdin:

       PERL:
         open(SUBMIT, "|condor_submit");
         print SUBMIT "universe=VANILLA\n";
         . . .

       C/C++:
         FILE *s = popen("condor_submit", "w");
         fputs("universe=VANILLA\n", s);
         . . .
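For comparison, here is a hedged Python sketch of the same stdin approach. The function name `submit_via_stdin` and its `command` parameter are assumptions made for illustration; the parameter exists only so the pipe logic can be exercised with a stand-in program where Condor is not installed.

```python
import subprocess


def submit_via_stdin(description, command=("condor_submit",)):
    """Feed submit-description text to the command's standard input.

    Returns whatever the command printed on standard output.
    """
    result = subprocess.run(list(command), input=description,
                            capture_output=True, text=True, check=True)
    return result.stdout
```

No temporary file is left behind, which avoids name clashes when several submitters run at once.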

  6. Command Line Tools • Using the +Attribute with condor_submit:

       universe = VANILLA
       executable = /bin/hostname
       output = job.out
       log = job.log
       +webuser = "zmiller"
       queue

  7. Command Line Tools • Use -constraint and -format with condor_q:

       % condor_q -constraint 'webuser=="zmiller"'
       -- Submitter: bio.cs.wisc.edu : <128.105.147.96:37866> : bio.cs.wisc.edu
        ID       OWNER    SUBMITTED    RUN_TIME   ST PRI SIZE CMD
       213503.0  zmiller  10/11 06:00  0+00:00:00 I  0   0.0  hostname

       % condor_q -constraint 'webuser=="zmiller"' -format "%i\t" ClusterId -format "%s\n" Cmd
       213503 /bin/hostname
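A program that wraps condor_q usually needs two pieces: building the argument list and parsing the tab-separated result. The sketch below shows one way to do both; `condor_q_args` and `parse_rows` are hypothetical helper names, and the code passes literal tab and newline characters in the format strings rather than backslash escapes.

```python
def condor_q_args(constraint, fields):
    """Build a condor_q argument list with one -format pair per attribute.

    fields is a list of (printf_spec, attribute) tuples, e.g. ("%i", "ClusterId").
    """
    args = ["condor_q", "-constraint", constraint]
    for index, (printf_spec, attribute) in enumerate(fields):
        last = index == len(fields) - 1
        # Tab between columns, newline after the last column of each job.
        args += ["-format", printf_spec + ("\n" if last else "\t"), attribute]
    return args


def parse_rows(output, names):
    """Turn tab-separated condor_q output into a list of dicts keyed by names."""
    return [dict(zip(names, line.split("\t")))
            for line in output.splitlines() if line]
```

The argument list can be passed directly to `subprocess.run`, which avoids shell quoting problems with the constraint expression.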

  8. Command Line Tools • condor_wait will watch a job log file and wait for a specific job (or all jobs) to complete: system("condor_wait job.log"); • can specify a timeout
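The timeout mentioned above can be passed on the command line. The following sketch assumes the `-wait seconds` option and the optional trailing job ID as described in the condor_wait manual page; check both against your installed version. The helper names are hypothetical.

```python
import subprocess


def condor_wait_args(log_path, timeout_seconds=None, job_id=None):
    """Build the condor_wait command line (assumes the -wait option exists)."""
    args = ["condor_wait"]
    if timeout_seconds is not None:
        args += ["-wait", str(timeout_seconds)]
    args.append(log_path)
    if job_id is not None:
        args.append(job_id)
    return args


def wait_for_jobs(log_path, timeout_seconds=None, job_id=None):
    """Return True if condor_wait exits with status 0, False otherwise.

    Requires a working Condor installation.
    """
    completed = subprocess.run(condor_wait_args(log_path, timeout_seconds, job_id))
    return completed.returncode == 0
```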

  9. Command Line Tools • condor_q and condor_status both offer an -xml option • So it is relatively simple to build on top of Condor’s command line tools alone, and they can be driven from many different languages (C, PERL, python, PHP, etc). • However…

  10. DRMAA • DRMAA is an OGF standardized job-submission API • Has C (and now Java) bindings • Is not Condor-specific • SourceForge Project http://sourceforge.net/projects/condor-ext

  11. DRMAA • Unfortunately, the DRMAA 1.x API does not support some very important features, such as: • Fault tolerance • Transactions

  12. Condor Database Queue (Condor DBQ) • Layer on top of Condor • Relational database interface to • Submit work to Condor • Monitor status of submission • Monitor status of individual jobs • Perfect for applications that • Submit jobs to Condor • Already use a database

  13. Web App Before Condor DBQ [Diagram: the web application reads and writes app data in app tables in the DBMS, submits jobs to the schedd in the Condor pool (SOAP or cmd line interface), and checks status (job log file, SOAP, or cmd line interface, fed by the user log). Callout on a crash: "You did implement two phase commit and recovery, to get run once semantics, right?" The glue between application and Condor is labeled "Non-Trivial Code".]

  14. Web App After Condor DBQ [Diagram: the web application reads and writes app data, submits jobs, and checks status using single, transactional SQL statements against the app tables, work table, and job table in the DBMS. condor_dbq checks for new work, submits jobs to the schedd in the Condor pool (cmd line), gets job updates from the user log, and updates status in the database.]

  15. Benefits of Condor DBQ • Natural simple SQL API • Submit work insert into work values(condor-submit-file) • Check status select * from jobs where work_id = id • Transactions/Consistency comes for free • DBMS performs crash recovery
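To illustrate why transactions "come for free", here is a small sketch of the submit and status queries. It is not Condor DBQ's real schema: it uses SQLite from the Python standard library instead of PostgreSQL so it runs anywhere, the table names `work` and `jobs` follow the slide, and every column name is an assumption.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Create hypothetical work/jobs tables (column names are assumptions)."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS work "
               "(id INTEGER PRIMARY KEY, submit_file TEXT NOT NULL)")
    db.execute("CREATE TABLE IF NOT EXISTS jobs "
               "(work_id INTEGER REFERENCES work(id), cluster_id INTEGER, status TEXT)")
    return db


def submit_work(db, submit_file, app_update=None):
    """Insert the work row and any application data in one transaction."""
    with db:  # commits on success, rolls back if an exception escapes
        cursor = db.execute("INSERT INTO work (submit_file) VALUES (?)",
                            (submit_file,))
        if app_update is not None:
            app_update(db, cursor.lastrowid)
        return cursor.lastrowid


def job_status(db, work_id):
    """Return (cluster_id, status) rows for one piece of submitted work."""
    return db.execute("SELECT cluster_id, status FROM jobs WHERE work_id = ?",
                      (work_id,)).fetchall()
```

If the application's own update fails, the work row is rolled back with it, so the application never has to reconcile a submitted job with missing app data.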

  16. Condor DBQ Limitations • Overrides log file location • All jobs submitted as same user • DAGMan not supported • Only Vanilla and Standard universe jobs supported (others are unknown) • Currently only supports PostgreSQL

  17. Web Service Interface • Simple Object Access Protocol • Mechanism for doing RPC using XML (typically over HTTP or HTTPS) • A World Wide Web Consortium (W3C) standard • SOAP Toolkit: Transform a WSDL to a client library

  18. Benefits of a Condor SOAP API • Can be accessed with standard web service tools • Condor accessible from platforms where its command-line tools are not supported • Talk to Condor with your favorite language and SOAP toolkit

  19. Condor SOAP API functionality • Get basic daemon info (version, platform) • Submit jobs • Retrieve job output • Remove/hold/release jobs • Query machine status • Advertise resources • Query job status

  20. Getting machine status via SOAP [Diagram: your program, through a SOAP library, calls queryStartdAds() on the condor_collector using SOAP over HTTP and receives a machine list in return.]

  21. Some classifications Application Program Interfaces (APIs) • Job Control • Operational Monitoring Extensions

  22. Operational Monitoring APIs • Via Web Services (SOAP) • Via Relational Database: Quill • Job, Machine, and Matchmaking data echoed into PostgreSQL RDBMS • Via a file: the Event Log • Just like the job log, but has events for all jobs submitted to a schedd. • Structured journal of job events • Sample code in C++ to read/parse these events • Via Enterprise messaging: Condor AMQP • Event Log events echoed into Qpid, an AMQP message broker in a highly reliable manner • https://condor-wiki.cs.wisc.edu/index.cgi/wiki?p=CondorPigeon

  23. Some classifications Application Program Interfaces (APIs) • Job Control • Operational Monitoring Extensions

  24. Extending Condor • APIs: How to interface w/ Condor • Extensions: Changing Condor’s behavior • Hooks • Plugins

  25. Job Wrapper Hook • Allows an administrator to specify a “wrapper” script to handle the execution of all user jobs • Set via condor_config “USER_JOB_WRAPPER” • Wrapper runs as the user, command-line args are passed, machine & job ad is available. • Errors can be propagated to the user. • Example: condor_limits_wrapper.sh

  26. Job Fetch & Prepare Hooks • Job Fetch hooks • Call outs from the condor_startd • Extend claiming • Normally jobs are pushed from schedd to startd – now jobs can be “pulled” from anywhere • Job Running Hooks • Call outs from the condor_starter • Transform the job classad • Perform any other pre/post logic

  27. Sidebar: “Toppings” • If work arrived via fetch hook “foo”, then prepare hooks “foo” will be used. • What if an individual job could specify a job prepare hook to use??? • Prepare hook to use can be alternatively specified in job classad via attribute “HookKeyword” • How cool is that???

  28. Toppings: Simple Example • In condor_config:

        ANSYS_HOOK_PREPARE_JOB = \
            $(LIBEXEC)/ansys_prepare_hook.sh

      • Contents of ansys_prepare_hook.sh:

        #!/bin/sh
        # Read and discard the job classad
        cat >/dev/null
        echo 'Cmd="/usr/local/bin/ansys"'
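A prepare hook does not have to be a shell script. The sketch below expresses the same logic in Python: consume the job classad from standard input, then print the attribute to override. The function name `prepare_job` and its parameters are hypothetical; only the read-then-print behavior comes from the slide.

```python
import io


def prepare_job(job_ad_stream, out_stream, command="/usr/local/bin/ansys"):
    """Discard the incoming job classad and emit a replacement Cmd attribute."""
    job_ad_stream.read()  # read and discard the job classad
    out_stream.write(f'Cmd = "{command}"\n')


# Example with in-memory streams standing in for stdin and stdout.
example_output = io.StringIO()
prepare_job(io.StringIO('Cmd = "whatever"\n'), example_output)
```

In a real hook script the call would be `prepare_job(sys.stdin, sys.stdout)`.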

  29. Topping Example, cont. • In job submit file:

        universe=vanilla
        executable=whatever
        arguments=…
        +HookKeyword="ANSYS"
        queue

  30. Configuration Hook • Instead of reading from a file, run a program to generate Condor config settings • Append “|” to CONDOR_CONFIG or LOCAL_CONFIG_FILE. Example:

        LOCAL_CONFIG_FILE = \
            /opt/condor/sbin/make_config|
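A configuration program only has to print ordinary config lines on standard output. The sketch below is a hypothetical `make_config` written in Python; the macro name `SITE_HOSTNAME_LABEL` is invented for illustration and has no meaning to Condor.

```python
import socket


def generate_config(hostname):
    """Return config text in 'NAME = value' form, one setting per line."""
    lines = [
        "# Generated by make_config (illustrative only)",
        # SITE_HOSTNAME_LABEL is a made-up macro, not a Condor setting.
        f"SITE_HOSTNAME_LABEL = {hostname}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Condor would read whatever this program prints.
    print(generate_config(socket.gethostname()), end="")
```

Because the output is computed at read time, the same program can emit different settings per host, per time of day, or from a central database.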

  31. File Transfer Hooks • Allows the administrator to configure hooks for handling URLs during Condor's file transfer • Enables transfer from third party directly to execute machine, which can offload traffic from the submit point • Can be used in a number of clever ways

  32. File Transfer Hooks • API is extremely simple • Must support being invoked with the “-classad” option to advertise its abilities:

        #!/bin/env perl
        if ($ARGV[0] eq "-classad") {
            print "PluginType = \"FileTransfer\"\n";
            print "SupportedMethods = \"http,ftp,file\"\n";
            exit 0;
        }

  33. File Transfer Hooks • When invoked normally, a plugin simply transfers the URL (first argument) into filename (second argument)

        # list-form system() bypasses the shell, so quoting is not an issue
        system("curl", $ARGV[0], "-o", $ARGV[1]);
        # $? holds the raw wait status; map any failure to exit code 1
        exit($? == 0 ? 0 : 1);
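Both halves of the plugin contract, the "-classad" advertisement and the URL-to-file transfer, fit in one short program. The sketch below is a Python analogue of the Perl fragments, using `urllib` from the standard library in place of curl. The function names are hypothetical, and returning an exit status instead of calling `sys.exit` is a choice made so the logic can be exercised directly.

```python
import sys
import urllib.request


def plugin_classad():
    """Text the plugin prints when asked to advertise its abilities."""
    return 'PluginType = "FileTransfer"\nSupportedMethods = "http,ftp,file"\n'


def run_plugin(argv, out=None):
    """Mimic the plugin's command line; argv mirrors sys.argv.

    Returns the exit status: 0 on success, 1 on bad usage or transfer failure.
    """
    if out is None:
        out = sys.stdout
    if len(argv) == 2 and argv[1] == "-classad":
        out.write(plugin_classad())
        return 0
    if len(argv) != 3:
        return 1
    url, destination = argv[1], argv[2]
    try:
        # urlretrieve handles http, ftp, and file URLs.
        urllib.request.urlretrieve(url, destination)
    except (OSError, ValueError):
        return 1
    return 0
```

A real plugin script would end with `sys.exit(run_plugin(sys.argv))`.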

  34. File Transfer Hooks • In the condor_config file, the administrator lists the transfer hooks that can be used • Condor invokes each one to find out its abilities • If something that looks like a URL is added to the list of input files, the plugin is invoked on the execute machine

  35. File Transfer Hooks • condor_config: • FILETRANSFER_PLUGINS = curl_plugin, hdfs_plugin, gdotorg_plugin, rand_plugin • Submit file: • transfer_input_files = normal_file, http://cs.wisc.edu/~zkm/data_file, rand://1024/random_kilobyte

  36. File Transfer Hooks • As you can see, the format of the URL is relatively arbitrary and is interpreted by the hook • This allows for tricks like rand://, blastdb://, data://, etc.
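As an illustration of how freely a hook can interpret a URL, here is a sketch of what a rand:// handler might do. The talk does not show the real rand_plugin, so the interpretation below (the host part is a byte count, and that many random bytes are written to the destination) is an assumption based only on the example URL rand://1024/random_kilobyte.

```python
import os
from urllib.parse import urlparse


def fetch_rand(url, destination):
    """Write N random bytes to destination for a URL like rand://N/name.

    Assumed semantics: the authority part of the URL is the byte count.
    """
    parsed = urlparse(url)
    if parsed.scheme != "rand":
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    size = int(parsed.netloc)
    with open(destination, "wb") as handle:
        handle.write(os.urandom(size))
    return size
```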

  37. Plugins

  38. Plugins • Shared Library Plugins • Gets mapped right into the process space of the Condor Services! May not block! Must be thread safe! • General and ClassAd Functions • Condor ClassAd Function Plugin • Add custom built-in functions to the ClassAd Language. • Via condor_config “CLASSAD_LIB_PATH” • Cleverly used by SAMGrid

  39. General Plugins • In condor_config, use “PLUGINS” or “PLUGIN_DIR”. • Very good idea to do: • SUBSYSTEM.PLUGIN or • SUBSYSTEM.PLUGIN_DIR • Implement C++ child class, and Condor will call methods at the appropriate times. • Some general methods (initialize, shutdown), and then callbacks based on plugin type • What’s available? Plugin Discovery…

  40. Plugin Discovery

        cd src/
        dir /s Example*Plugin.cpp

      You will find: • ExampleCollectorPlugin.cpp • ExampleMasterPlugin.cpp • ExampleNegotiatorPlugin.cpp • ExampleClassAdLogPlugin.cpp • ExampleScheddPlugin.cpp • ExampleStartdPlugin.cpp • And a ClassAdLogPluginManager.cpp

  41. Collector Plugin

        struct ExampleCollectorPlugin : public CollectorPlugin
        {
            void initialize();
            void shutdown();
            void update(int command, const ClassAd &ad);
            void invalidate(int command, const ClassAd &ad);
        };

  42. ClassAdLog Plugin Methods

        virtual void newClassAd(const char *key) = 0;
        virtual void destroyClassAd(const char *key) = 0;
        virtual void setAttribute(const char *key, const char *name, const char *value) = 0;
        virtual void deleteAttribute(const char *key, const char *name) = 0;

  43. Other Extending Ideas…

  44. Custom ClassAd Attributes • Job ClassAd • +Name = Value in submit file • SUBMIT_EXPRS in condor_config • Machine ClassAd • STARTD_EXPRS in condor_config for static attributes • STARTD_CRON_* settings in condor_config for dynamic attributes

  45. Thinking out of the box… • MAIL in condor_config • Green Computing Settings • HIBERNATION_PLUGIN (called by the startd) • ROOSTER_WAKEUP_CMD

  46. All else fails? Grab Source! Condor is open source ya know… Thank you! Questions?

  47. Extra Slides

  48. Let’s get some SOAP details…

  49. The API • Core API, described with WSDL, is designed to be as flexible as possible • File transfer is done in chunks • Transactions are explicit • Wrapper libraries aim to make common tasks as simple as possible • Currently in Java and C# • Expose an object-oriented interface

  50. Things we will cover • Condor setup • Necessary tools • Job Submission • Job Querying • Job Retrieval • Authentication with SSL and X.509
