
Quill Tutorial Condor Week 2006




Presentation Transcript


  1. Quill Tutorial, Condor Week 2006

  2. What is Quill?
     A non-invasive method of storing a read-only version of the job queue and historical job data in a relational database.

  3. Why Do We Need It?
     • Presents the job queue information as a set of tables in a relational database (Big Win!)
     • Fault tolerance
     • Provides performance enhancements in very large and busy pools

  4. Job Queue Management
     [Diagram, two panels. Without Quill: schedd and its Job Queue. With Quill: schedd, Job Queue, quilld, and Database.]

  5. Deployment
     • One Quill daemon per schedd
     • Quill daemons must be uniquely named
     • Each Quill daemon uses a unique DB name
     • Multiple Quill daemons may utilize one database server
     • Currently uses PostgreSQL
     • Recommend PostgreSQL 8.1 or later for automatic vacuuming of tables
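The naming rules on this slide can be checked mechanically before a pool is configured. The following is a minimal sketch, not part of Condor or Quill: the function names and the list-of-pairs input format are assumptions made for illustration, and only the name and database values reused from the later slides come from the tutorial.

```python
from collections import Counter


def find_duplicates(values):
    """Return, in sorted order, the values that occur more than once."""
    counts = Counter(values)
    return sorted(value for value, count in counts.items() if count > 1)


def check_deployment(daemons):
    """Check a planned deployment given as (quill_name, db_name) pairs.

    Hypothetical helper: one pair per Quill daemon. Returns a list of
    problems; an empty list means names and DB names are all unique.
    """
    problems = []
    for name in find_duplicates(name for name, _ in daemons):
        problems.append(f"duplicate Quill name: {name}")
    for db_name in find_duplicates(db for _, db in daemons):
        problems.append(f"duplicate database name: {db_name}")
    return problems


if __name__ == "__main__":
    plan = [
        ("psilord_quilld@merlin.cs", "psilord_db"),
        ("other_quilld@merlin.cs", "psilord_db"),  # hypothetical second daemon
    ]
    for problem in check_deployment(plan):
        print(problem)
```

Several daemons sharing one database server is allowed by the slide, so the sketch deliberately does not compare server addresses.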

  6. Condor’s Interface to Quill
     • Modified two tools to utilize the DB
        • condor_q
        • condor_history
     • Very minor modifications to schedd
     • Multiple sources for Job Queue & History pose an interesting problem

  7. Job Queue Discovery Sequence (Local Query)
     [Diagram: condor_q with numbered steps 1, 2, 3 drawn against the Database, quilld, and schedd with its Job Queue.]

  8. Job Queue Discovery Sequence (Remote Query)
     [Diagram: condor_q with numbered steps 0, 1, 2, 3 drawn against the collector, Database, quilld, and schedd with its Job Queue.]

  9. A User Perspective: condor_q
     • condor_q changes
        • -name takes a ScheddName or QuillName
        • -avgqueuetime details average time in queue for all jobs

  10. A User Perspective: condor_q
      Example: condor_q -name

      Linux merlin > condor_q -name psilord_quilld@merlin.cs

      -- DB: psilord_quilld@merlin.cs : <merlin.cs.wisc.edu:42999> : psilord_db
       ID      OWNER     SUBMITTED     RUN_TIME    ST  PRI  SIZE  CMD
       92.0    psilord   4/21 09:21    0+00:00:00  I   0    9.8   foo

      1 jobs; 1 idle, 0 running, 0 held

  11. A User Perspective: condor_q
      Example: condor_q -avgqueuetime

      Linux merlin > condor_q -avgqueuetime

      -- DB: psilord_quilld@merlin.cs : <merlin.cs.wisc.edu:42999> : psilord_db
      Average time in queue for uncompleted jobs (in hh:mm:ss)
      00:40:47.011993

  12. Job History Discovery Sequence (Local Query)
      [Diagram: condor_history with numbered steps 1, 2 drawn against the Database and the History File; the quilld and Job Queue are also shown.]
      The quilld is never queried directly!

  13. Job History Discovery (Remote Query): NEW!
      [Diagram: condor_history with numbered steps 0, 1 drawn against the collector and the Database; the quilld, Job Queue, and History File are also shown.]
      The quilld is never queried directly!

  14. A User Perspective: condor_history
      • condor_history changes
         • -name takes a Quill Name to retrieve job histories from a remote quill’s database
         • -completedsince returns all jobs completed since a PostgreSQL formatted date

  15. A User Perspective: condor_history
      Example: condor_history -name

      Linux merlin > condor_history -name psilord_quilld@merlin.cs

      -- DB: psilord_quilld@merlin.cs : <merlin.cs.wisc.edu:42999> : psilord_db
       ID      OWNER     SUBMITTED     RUN_TIME    ST  COMPLETED    CMD
       91.0    psilord   4/20 14:23    0+00:00:00  X   ???          /scratch/psilor
       92.0    psilord   4/21 09:21    0+00:00:00  X   ???          /scratch/psilor
       93.0    psilord   4/21 10:12    0+00:00:01  C   4/21 10:12   /scratch/psilor

  16. A User Perspective: condor_history
      Example: condor_history -completedsince

      Linux merlin > condor_history -completedsince "2006-01-01 00:00:01"

      -- DB: psilord_quilld@merlin.cs : <merlin.cs.wisc.edu:42999> : psilord_db
       ID      OWNER     SUBMITTED     RUN_TIME    ST  COMPLETED    CMD
       93.0    psilord   4/21 10:12    0+00:00:01  C   4/21 10:12   /scratch/psilor
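The -completedsince argument in the example uses the "YYYY-MM-DD HH:MM:SS" timestamp form. A small sketch of producing such a string for "jobs completed in the last N days" follows; the helper names are hypothetical, and the sketch only builds the argument list rather than running condor_history.

```python
from datetime import datetime, timedelta


def completed_since_arg(now, days_back):
    """Format the cutoff 'days_back' days before 'now' as YYYY-MM-DD HH:MM:SS."""
    cutoff = now - timedelta(days=days_back)
    return cutoff.strftime("%Y-%m-%d %H:%M:%S")


def history_command(now, days_back):
    """Build (but do not execute) a condor_history argument list."""
    return ["condor_history", "-completedsince", completed_since_arg(now, days_back)]


if __name__ == "__main__":
    # The date is fixed here so the output is reproducible.
    print(history_command(datetime(2006, 4, 21, 10, 12, 0), 1))
```

Passing the command as a list avoids shell quoting problems with the space inside the timestamp.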

  17. Short Circuiting the Discovery Sequence
      • Use the -direct option!
      • Examples
         • condor_q -direct rdbms
         • condor_q -direct quilld
         • condor_q -direct schedd
      • “rdbms”, “quilld”, and “schedd” are the actual parameters.
      • Invaluable for debugging!
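The idea of a discovery sequence with a short circuit can be modeled as a fallback chain: try each source in turn, or only the one named by -direct. The sketch below is a toy model, not Condor's implementation. The order rdbms, quilld, schedd is an assumption taken from the order of the examples on this slide, and the function and the fake sources are hypothetical.

```python
def query_with_fallback(sources, direct=None):
    """Try each (name, fetch) source in order and return (name, result).

    If 'direct' is given, only the source with that name is tried,
    which mirrors the idea of short circuiting the sequence.
    """
    if direct is not None:
        candidates = [(name, fetch) for name, fetch in sources if name == direct]
        if not candidates:
            raise ValueError(f"unknown source: {direct}")
    else:
        candidates = list(sources)

    errors = []
    for name, fetch in candidates:
        try:
            return name, fetch()
        except ConnectionError as exc:
            # Remember why this source failed, then fall through to the next.
            errors.append(f"{name}: {exc}")
    raise ConnectionError("; ".join(errors))


if __name__ == "__main__":
    def unreachable():
        raise ConnectionError("unreachable")

    def job_list():
        return ["92.0"]

    # Assumed order; the first two sources are simulated as down.
    sources = [("rdbms", unreachable), ("quilld", unreachable), ("schedd", job_list)]
    print(query_with_fallback(sources))
    print(query_with_fallback(sources, direct="schedd"))
```

Forcing a single source makes it obvious which component is failing, which is why the slide calls the option invaluable for debugging.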

  18. PostgreSQL 8.1 Installation
      • ./configure
      • gmake && gmake install
      • mkdir /path/to/pgsql/data
      • initdb -D /path/to/pgsql/data
      • postmaster -D /path/to/pgsql/data
      • Note: Default port binding is 5432.

  19. PostgreSQL Configuration
      • Add two special user accounts: quillreader and quillwriter
         • createuser quillreader --no-createdb --no-adduser --pwprompt
         • createuser quillwriter --createdb --no-adduser --pwprompt

  20. PostgreSQL Configuration (cont)
      • Allow TCP/IP connections
         • Edit file postgresql.conf
         • Add listen_addresses = '*'
      • Allow connections from specific hosts
         • Edit file pg_hba.conf
         • host all quillreader 128.105.0.0 255.255.0.0 password
         • host all quillwriter 128.105.0.0 255.255.0.0 password
      • Note: only use ‘password’ authentication at this time.
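Each pg_hba.conf line above pairs an address with a netmask, so 128.105.0.0 with 255.255.0.0 admits any client whose first two octets are 128.105. The sketch below uses Python's standard ipaddress module to test whether a given client would fall inside such an entry; the function name and the sample client addresses are hypothetical.

```python
import ipaddress


def allowed_by_entry(client_ip, address, netmask):
    """Return True if client_ip lies in the network given by address and netmask."""
    network = ipaddress.ip_network(f"{address}/{netmask}")
    return ipaddress.ip_address(client_ip) in network


if __name__ == "__main__":
    # Address and netmask are taken from the pg_hba.conf lines on the slide.
    for client in ("128.105.14.3", "128.104.1.1"):
        print(client, allowed_by_entry(client, "128.105.0.0", "255.255.0.0"))
```

This only mirrors the address match; PostgreSQL additionally matches the database and user columns before applying the authentication method.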

  21. Quill Configuration
      • User quillwriter needs a write password.
      • Store it in a file called .quillwritepassword in the $(SPOOL) directory.
      • Ensure only the condor uid can read it if Condor is running as root
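Creating the password file with restrictive permissions from the start avoids a window in which other users could read it. The following is a sketch under assumptions: it targets POSIX systems, the helper names are hypothetical, whether Quill expects a trailing newline is not stated on the slide (none is written here), and changing the file's owner to the condor uid is left out because that requires root.

```python
import os
import stat


def write_password_file(spool_dir, password, filename=".quillwritepassword"):
    """Write the password into spool_dir with owner-only read/write permission."""
    path = os.path.join(spool_dir, filename)
    # Create the file with mode 0600 so it is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(password)
    # Tighten the mode explicitly in case the file already existed.
    os.chmod(path, 0o600)
    return path


def is_owner_only(path):
    """Return True if neither group nor others have any access to path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


if __name__ == "__main__":
    import tempfile

    # A temporary directory stands in for the real $(SPOOL) directory.
    with tempfile.TemporaryDirectory() as spool:
        path = write_password_file(spool, "example-password")
        print(path, is_owner_only(path))
```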

  22. Quill Configuration (cont)
      • Condor system specific attributes in file condor_config.local
         • QUILL = $(SBIN)/condor_quill
         • QUILL_LOG = $(LOG)/QuillLog
         • QUILL_ADDRESS_FILE = $(LOG)/.quill_address
         • DAEMON_LIST = …, QUILL
         • VALID_SPOOL_FILES = …, .quillwritepassword
         • DC_DAEMON_LIST = …, QUILL

  23. Quill Configuration (cont)
      • Quill specific attributes
         • QUILL_ENABLED = TRUE
         • # The quill name must be unique across all quill daemons AND schedds
         • QUILL_NAME = psilord_quilld@merlin.cs
         • QUILL_DB_NAME = psilord_db
         • QUILL_DB_IP_ADDR = merlin.cs.wisc.edu:42999
         • QUILL_POLLING_PERIOD = 10 (seconds)

  24. Quill Configuration (cont)
      • QUILL_HISTORY_CLEANING_INTERVAL = 24 (hours)
      • QUILL_HISTORY_DURATION = 30 (days)
      • QUILL_MANAGE_VACUUM = FALSE
      • QUILL_IS_REMOTELY_QUERYABLE = TRUE
      • QUILL_DB_QUERY_PASSWD = xxx
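A quick sanity check of the Quill attributes can catch a missing setting before the daemon is started. The sketch below parses plain KEY = value lines and reports absent keys. It is a simplification, not Condor's configuration parser: it does not expand $(MACRO) references, and the choice of which keys count as required is an assumption made for illustration.

```python
# Assumed set of keys to insist on; the slides do not say which are mandatory.
REQUIRED = ("QUILL_ENABLED", "QUILL_NAME", "QUILL_DB_NAME", "QUILL_DB_IP_ADDR")


def parse_config(text):
    """Parse simple 'KEY = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not an assignment; ignored in this sketch
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(settings, required=REQUIRED):
    """Return the required keys that are absent or empty, in the given order."""
    return [key for key in required if not settings.get(key)]


if __name__ == "__main__":
    sample = """
    QUILL_ENABLED = TRUE
    # The quill name must be unique across all quill daemons AND schedds
    QUILL_NAME = psilord_quilld@merlin.cs
    QUILL_DB_NAME = psilord_db
    QUILL_POLLING_PERIOD = 10
    """
    print(missing_settings(parse_config(sample)))
```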

  25. DB Storage Method
      • Schema designed to store and query classads
      • 4 tables to represent the job queue classads
      • 2 for history data
      • 1 for metadata
      • Some queries are easier than others
      • Ask more questions at the BOF!

  26. Thank you!
      • Want more information?
      • BOF “Databases in Condor: Now and in the Future”
