Greg Thain Computer Sciences Department University of Wisconsin-Madison Configuring Quill Condor Week 2007
Typical Condor Pool [Diagram: Central Manager (master, collector, negotiator); Submit-Only machine (master, schedd); two Execute-Only machines (master, startd each). Legend: ClassAd communication pathway; process spawned.]
What is Quill? A technology to store a read-only version of the job queue and historical job data in a relational database.
Why Quill? › Offloads query overhead from the schedd Performance boost! › Easier to build a web portal RDBMS access is easier than SOAP/CLI
Job Queue Management [Diagram, two panels. Without Quill: Job Queue, schedd. With Quill: Job Queue, schedd, quilld, Database.]
Quill downsides › Additional latency › More complicated setup › Handful of attributes not in DBMS
Quill and Quill++ › Quill in Condor since › Quill++ (quillpp) coming soon. Support for all daemons Multiple schedds in one database Support for Oracle on some platforms Replaces quill › We’ll talk about both
Typical Quill’d Condor Pool [Diagram: Central Manager (master, collector, negotiator); Submit-Only machine (master, schedd); two Execute-Only machines (master, startd each). Added labels: Database, postgres, query, quill, condor_q. Legend: ClassAd communication pathway; process spawned.]
Typical Quillpp’d Condor Pool [Diagram: Central Manager (master, collector, negotiator); Submit-Only machine (master, schedd); two Execute-Only machines (master, startd each). Added labels: Database, postgres, query, quill, quillpp (several), condor_q. Legend: ClassAd communication pathway; process spawned.]
How to use the schema? › Covered in another session: Quill Front End and Schema BoF, Thursday 11am
Quill (not Quill++) Deployment › One Quill daemon per schedd › Quill daemons must be uniquely named › Each Quill daemon uses a unique DB name › Currently uses PostgreSQL; 8.2 or later is recommended for its better disk management
Quill++ deployment › One condor_quillpp per machine › One condor_dbmsd per database › Manual installation of schema › One DB per pool › Uses Postgres or Oracle
Condor’s Interface to Quill › Two tools were modified to use the DB: condor_q and condor_history
A User Perspective: condor_q › condor_q changes: when QUILL_ENABLED is TRUE, queries go to the RDBMS; -name takes a ScheddName or QuillName; -avgqueuetime reports the average time in queue for all jobs
condor_q -direct › -direct rdbms (default when QUILL_ENABLED = TRUE) › -direct quilld (useful for firewall traversal) › -direct schedd (100% up-to-date view)
A User Perspective: condor_history › condor_history changes -name takes a Quill Name to retrieve job histories from a remote quill’s database
condor_history -direct › There isn’t any (yet) › Read the history file directly instead: condor_history -f `condor_config_val HISTORY` › No -direct quilld equivalent
PostgreSQL Configuration
› Add two special user accounts: quillreader and quillwriter
  createuser quillreader --no-createdb --no-adduser --pwprompt
  createuser quillwriter --createdb --no-adduser --pwprompt
PostgreSQL Configuration (cont)
› Allow TCP/IP connections
  Edit file postgresql.conf
  Add listen_addresses = '*'
› Allow connections from specific hosts
  Edit file pg_hba.conf
  host all quillreader <client-address> password
  host all quillwriter <client-address> password
› Note: only use 'password' authentication at this time.
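A `host` record in pg_hba.conf carries a client address between the user and the authentication method. The following is a minimal sketch of what the two entries might look like, assuming a placeholder subnet (192.0.2.0/24 is a documentation range, not a value from the talk) and writing to a scratch file rather than the live pg_hba.conf. The variable names are hypothetical.

```shell
#!/bin/sh
# Sketch only: 192.0.2.0/24 is a placeholder subnet (an assumption, not from the talk).
# Writes to a scratch file so nothing live is touched; review and merge by hand.
HBA_SNIPPET="${TMPDIR:-/tmp}/quill_pg_hba.snippet"
QUILL_SUBNET="192.0.2.0/24"

# Unquoted heredoc delimiter: $QUILL_SUBNET is expanded by the shell.
cat > "$HBA_SNIPPET" <<EOF
# TYPE  DATABASE  USER         ADDRESS        METHOD
host    all       quillreader  $QUILL_SUBNET  password
host    all       quillwriter  $QUILL_SUBNET  password
EOF

cat "$HBA_SNIPPET"
```

PostgreSQL picks up pg_hba.conf changes on a configuration reload, while a change to listen_addresses needs a server restart.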
Quill Configuration
› User quillwriter needs a password.
› Store it in
  $(SPOOL)/.quillwritepassword (quill)
  $(SPOOL)/.pgpass (quill++)
  .pgpass format: host:port:db:user:pass
› Ensure only the condor uid can read it if Condor is running as root
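As an illustration of the .pgpass layout and the read restriction, here is a minimal sketch that writes one host:port:db:user:pass line with owner-only permissions. Every value is a placeholder, and SPOOL_DIR is a hypothetical stand-in for Condor's $(SPOOL) that defaults to a scratch directory.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder (assumption), not from the talk.
# SPOOL_DIR stands in for Condor's $(SPOOL); the real path comes from: condor_config_val SPOOL
SPOOL_DIR="${SPOOL_DIR:-$(mktemp -d)}"
PGPASS_FILE="$SPOOL_DIR/.pgpass"

DB_HOST="db.example.org"
DB_PORT="5432"
DB_NAME="quill_db"
DB_USER="quillwriter"
DB_PASS="change-me"

# Restrict permissions before any secret is written.
umask 077
printf '%s:%s:%s:%s:%s\n' "$DB_HOST" "$DB_PORT" "$DB_NAME" "$DB_USER" "$DB_PASS" > "$PGPASS_FILE"
chmod 600 "$PGPASS_FILE"
```

The sketch leaves out ownership: when Condor runs as root, the file would also need to belong to the condor user, for example via chown.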
Quill Configuration (cont)
› Condor system-specific attributes in file condor_config.local
  QUILL = $(SBIN)/condor_quill
  QUILL_LOG = $(LOG)/QuillLog
  QUILL_ADDRESS_FILE = $(LOG)/.quill_address
  DAEMON_LIST = …, QUILL
  VALID_SPOOL_FILES = …,.quillwritepassword
  DC_DAEMON_LIST = …, QUILL
Quill Configuration (cont)
› Quill-specific attributes
  QUILL_ENABLED = TRUE
  # The quill name must be unique across all
  # quill daemons AND schedds
  QUILL_NAME =
  QUILL_DB_NAME = psilord_db
  QUILL_DB_IP_ADDR = merlin.cs.wisc.edu:42999
  QUILL_POLLING_PERIOD = 10 (seconds)
Quill Configuration (cont) › QUILL_HISTORY_CLEANING_INTERVAL = 24 (hours) › QUILL_HISTORY_DURATION = 30 (days) › QUILL_MANAGE_VACUUM = FALSE › QUILL_IS_REMOTELY_QUERYABLE = TRUE › QUILL_DB_QUERY_PASSWD = xxx
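The Quill-specific attributes can be collected into one fragment. This is a minimal sketch that appends such a fragment to a scratch file. The Quill name, database name, and host are placeholders rather than values from the talk, and LOCAL_CONFIG is a hypothetical variable that would point at condor_config.local in a real pool. The query password is left out on purpose.

```shell
#!/bin/sh
# Sketch only: the name, database, and host values are placeholders (assumptions).
# LOCAL_CONFIG points at a scratch file here, not at a live condor_config.local.
LOCAL_CONFIG="${LOCAL_CONFIG:-$(mktemp)}"

# Quoted heredoc delimiter: nothing inside is expanded by the shell.
cat >> "$LOCAL_CONFIG" <<'EOF'
QUILL_ENABLED = TRUE
# Must be unique across all quill daemons AND schedds
QUILL_NAME = quill@submit.example.org
QUILL_DB_NAME = quill_db
QUILL_DB_IP_ADDR = db.example.org:5432
# Polling period, in seconds
QUILL_POLLING_PERIOD = 10
# History cleaning interval, in hours
QUILL_HISTORY_CLEANING_INTERVAL = 24
# History duration, in days
QUILL_HISTORY_DURATION = 30
QUILL_MANAGE_VACUUM = FALSE
QUILL_IS_REMOTELY_QUERYABLE = TRUE
EOF
```

Once the fragment is in the real local configuration file, `condor_config_val QUILL_NAME` prints the value Condor actually sees.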
Schema management
› Quill automatically loads schema
  Upgrades itself automatically
› Quill++ requires manual loading:
  psql -U quillwriter < common_createddl.sql
  psql -U quillwriter < pgsql_createddl.sql
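The manual Quill++ load can be wrapped in a small function that stops at the first failure. This is a sketch under stated assumptions: the function name, the QUILL_DB database name, and the PSQL override are hypothetical, and the explicit `-d` is an addition (the commands on the slide do not name a database). `-v ON_ERROR_STOP=1` is a standard psql setting that makes SQL errors produce a non-zero exit status.

```shell
#!/bin/sh
# Sketch only: QUILL_DB and the PSQL override are assumptions for illustration.
# Point PSQL at a stub command to see the invocations without touching a database.
PSQL="${PSQL:-psql}"
QUILL_DB="${QUILL_DB:-quill_db}"

load_schema() {
    # Load the common DDL first, then the PostgreSQL-specific DDL,
    # in the order the two commands appear on the slide.
    for ddl in common_createddl.sql pgsql_createddl.sql; do
        "$PSQL" -U quillwriter -d "$QUILL_DB" -v ON_ERROR_STOP=1 -f "$ddl" || return 1
    done
}
```

Calling `load_schema` from the directory that holds the two .sql files runs both loads; psql asks for the quillwriter password unless a .pgpass entry supplies it.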
Conversion to Quill++
› Conversion only matters for history
› Conversion is one-way only!
› Two steps:
  Dump quill history tables to file with condor_dump_history
  Load quill++ history tables from file with condor_load_history
Data Management
› Constrain database size
  History truncation; in Quill++, other tables, too
  Postgres: index management
  Oracle cleans itself
› Be careful with long queries, especially with Quill
Data Management: Quill
› QUILL_HISTORY_CLEANING_INTERVAL
  In hours (24 hours)
› QUILL_HISTORY_DURATION
  How long, in days (7 days)
› QUILL_SHOULD_REINDEX
  Boolean (false)
› QUILL_MANAGE_VACUUM
  Boolean (false)
Data Management: Quill++
› condor_dbmsd does all the work
  QUILL_DBSIZE_LIMIT (20 GB): warns when 75% is hit
  DATABASE_PURGE_INTERVAL (in seconds; 24 hours)
  DATABASE_REINDEX_INTERVAL (in seconds; 24 hours)
  QUILL_DB_TYPE (oracle, pgsql)
  QUILL_RESOURCE_HISTORY_DURATION (7 days)
  QUILL_JOB_HISTORY_DURATION (10 years!)
  QUILL_RUN_HISTORY_DURATION (7 days)
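The units are easy to mix up, so a short worked calculation may help. This sketch assumes the two intervals are given in seconds and the size limit in gigabytes, as the values on the slide suggest. The shell variable names are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes intervals are in seconds and the size limit is in gigabytes.
HOURS=24
INTERVAL_SECONDS=$((HOURS * 60 * 60))        # 24 hours expressed in seconds

DBSIZE_LIMIT_GB=20
WARN_AT_GB=$((DBSIZE_LIMIT_GB * 75 / 100))   # 75% of the size limit

echo "DATABASE_PURGE_INTERVAL = $INTERVAL_SECONDS"
echo "DATABASE_REINDEX_INTERVAL = $INTERVAL_SECONDS"
echo "# warning expected at about ${WARN_AT_GB} GB of a ${DBSIZE_LIMIT_GB} GB limit"
```

With these inputs, a 24-hour interval comes out as 86400 seconds and the 75% mark of a 20 GB limit as 15 GB.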
Thank you! › Want more information? › BOF “Databases in Condor”