Peter Keller Computer Sciences Department University of Wisconsin-Madison Quill Tutorial Condor Week.


1 Peter Keller
Computer Sciences Department
University of Wisconsin-Madison
psilord@cs.wisc.edu
http://www.cs.wisc.edu/condor
Quill Tutorial
Condor Week 2006

2 What is Quill?
A non-invasive method of storing a read-only version of the job queue and job history data in a relational database.

3 Why Do We Need It?
› Presents the job queue information as a set of tables in a relational database (Big Win!)
› Fault tolerance
› Provides performance enhancements in very large and busy pools

4 Job Queue Management
[Diagram, two panels. Without Quill: schedd, Job Queue. With Quill: schedd, Job Queue, quilld, Database.]

5 Deployment
› One Quill daemon per schedd
› Quill daemons must be uniquely named
› Each Quill daemon uses a unique DB name
› Multiple Quill daemons may utilize one database server
› Currently uses PostgreSQL
  › Recommend PostgreSQL 8.1 or later for automatic vacuuming of tables

6 Condor’s Interface to Quill
› Modified two tools to utilize the DB
  › condor_q
  › condor_history
› Very minor modifications to schedd
› Multiple sources for Job Queue & History pose an interesting problem

7 Job Queue Discovery Sequence (Local Query)
[Diagram: condor_q, Database, quilld, schedd, Job Queue; query steps numbered 1, 2, 3.]

8 Job Queue Discovery Sequence (Remote Query)
[Diagram: condor_q, collector, Database, quilld, schedd, Job Queue; query steps numbered 0, 1, 2, 3.]

9 A User Perspective: condor_q
› condor_q changes
  › -name takes a ScheddName or QuillName
  › -avgqueuetime details average time in queue for all jobs

10 A User Perspective: condor_q
Example: condor_q -name
    Linux merlin > condor_q -name psilord_quilld@merlin.cs
    -- DB: psilord_quilld@merlin.cs : : psilord_db
     ID      OWNER     SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
     92.0    psilord   4/21 09:21   0+00:00:00  I   0    9.8   foo
    1 jobs; 1 idle, 0 running, 0 held

11 A User Perspective
Example: condor_q -avgqueuetime
    Linux merlin > condor_q -avgqueuetime
    -- DB: psilord_quilld@merlin.cs : : psilord_db
    Average time in queue for uncompleted jobs (in hh:mm:ss)
    00:40:47.011993

12 Job History Discovery Sequence (Local Query)
[Diagram: condor_history, Database, quilld, Job Queue, History File; query steps numbered 1, 2.]
The quilld is never queried directly!

13 Job History Discovery (Remote Query) NEW!
[Diagram: condor_history, collector, Database, quilld, Job Queue, History File; query steps numbered 0, 1.]
The quilld is never queried directly!

14 A User Perspective: condor_history
› condor_history changes
  › -name takes a Quill name and retrieves job histories from that remote Quill’s database
  › -completedsince returns all jobs completed since a PostgreSQL-formatted date

15 A User Perspective: condor_history
Example: condor_history -name
    Linux merlin > condor_history -name psilord_quilld@merlin.cs
    -- DB: psilord_quilld@merlin.cs : : psilord_db
     ID      OWNER     SUBMITTED    RUN_TIME    ST  COMPLETED    CMD
     91.0    psilord   4/20 14:23   0+00:00:00  X   ???          /scratch/psilor
     92.0    psilord   4/21 09:21   0+00:00:00  X   ???          /scratch/psilor
     93.0    psilord   4/21 10:12   0+00:00:01  C   4/21 10:12   /scratch/psilor

16 A User Perspective: condor_history
Example: condor_history -completedsince
    Linux merlin > condor_history -completedsince "2006-01-01 00:00:01"
    -- DB: psilord_quilld@merlin.cs : : psilord_db
     ID      OWNER     SUBMITTED    RUN_TIME    ST  COMPLETED    CMD
     93.0    psilord   4/21 10:12   0+00:00:01  C   4/21 10:12   /scratch/psilor
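The -completedsince argument is a PostgreSQL-formatted timestamp. A minimal Python sketch of building one, assuming the "YYYY-MM-DD HH:MM:SS" layout from the example above is the form wanted. The helper name completed_since_args is hypothetical, and the sketch only builds the argument list; it does not run condor_history.

```python
from datetime import datetime, timedelta


def completed_since_args(now, days_back):
    """Build an argv list for condor_history -completedsince (hypothetical helper)."""
    cutoff = now - timedelta(days=days_back)
    # Same timestamp layout as the slide's example: "2006-01-01 00:00:01".
    stamp = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    return ["condor_history", "-completedsince", stamp]


# Ask for everything completed in the week before a fixed reference time.
args = completed_since_args(datetime(2006, 4, 21, 10, 12, 0), days_back=7)
print(args)
```

Passing the result as a list to something like subprocess.run avoids shell quoting problems with the space inside the timestamp.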

17 Short Circuiting the Discovery Sequence
› Use the -direct option!
› Examples
  › condor_q -direct rdbms
  › condor_q -direct quilld
  › condor_q -direct schedd
› “rdbms”, “quilld”, and “schedd” are the literal argument values.
› Invaluable for debugging!
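The idea of a discovery sequence that can be short-circuited can be sketched in Python as follows. This is an illustration of the failover pattern, not Condor's implementation: the function query_queue, the stub sources, and the order rdbms, quilld, schedd (taken from the order of the examples on the slide) are all assumptions.

```python
def query_queue(sources, direct=None):
    """Try each (name, fetch) pair in order; with direct set, try only that one.

    Hypothetical sketch: each fetch callable returns rows or raises ConnectionError.
    """
    candidates = [(name, fetch) for name, fetch in sources
                  if direct is None or name == direct]
    if not candidates:
        raise ValueError(f"unknown source: {direct}")
    errors = []
    for name, fetch in candidates:
        try:
            return name, fetch()
        except ConnectionError as exc:
            # Remember the failure and fall through to the next source.
            errors.append(f"{name}: {exc}")
    raise ConnectionError("; ".join(errors))


def down():
    raise ConnectionError("unreachable")


def up():
    return ["92.0 psilord"]


# Assumed order; the database and quilld stubs are down, the schedd stub answers.
sources = [("rdbms", down), ("quilld", down), ("schedd", up)]
print(query_queue(sources))
```

Calling query_queue(sources, direct="rdbms") raises immediately instead of falling through, which is what makes a direct query useful for finding out which source is broken.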

18 PostgreSQL 8.1 Installation
› ./configure
› gmake && gmake install
› mkdir /path/to/pgsql/data
› initdb -D /path/to/pgsql/data
› postmaster -D /path/to/pgsql/data
› Note: Default port binding is 5432.

19 PostgreSQL Configuration
› Add two special user accounts: quillreader and quillwriter
    createuser quillreader --no-createdb --no-adduser --pwprompt
    createuser quillwriter --createdb --no-adduser --pwprompt

20 PostgreSQL Configuration (cont)
› Allow TCP/IP connections
  › Edit file postgresql.conf
  › Add listen_addresses = '*'
› Allow connections from specific hosts
  › Edit file pg_hba.conf
    host all quillreader 128.105.0.0 255.255.0.0 password
    host all quillwriter 128.105.0.0 255.255.0.0 password
› Note: only use ‘password’ authentication at this time.
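The address and mask columns in the pg_hba.conf lines define which client hosts may connect. A small Python sketch for checking whether a given client falls inside that range, using only the standard ipaddress module. The helper host_allowed and the two sample client addresses are illustrative assumptions.

```python
import ipaddress

# Address and netmask copied from the pg_hba.conf lines on the slide.
allowed = ipaddress.ip_network("128.105.0.0/255.255.0.0")


def host_allowed(addr):
    """Return True if addr lies inside the network the pg_hba.conf entry covers."""
    return ipaddress.ip_address(addr) in allowed


print(allowed)                       # the same range in CIDR notation
print(host_allowed("128.105.14.3"))  # sample client inside the range
print(host_allowed("128.104.1.1"))   # sample client outside the range
```

A host outside the range is rejected by PostgreSQL before any password is checked, so this is a quick first thing to verify when a remote query fails.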

21 Quill Configuration
› User quillwriter needs a write password.
› Store it in a file called .quillwritepassword in the $(SPOOL) directory.
› If Condor is running as root, ensure only the condor uid can read it.
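One way to create the password file with owner-only permissions is sketched below in Python for a POSIX system. The function name write_password_file is hypothetical, a temporary directory stands in for $(SPOOL), and the sketch does not change file ownership, so it would need to run as the condor user (or be followed by a chown). Whether Quill expects a trailing newline is not stated on the slides; this sketch writes none.

```python
import os
import stat
import tempfile


def write_password_file(spool_dir, password):
    """Create .quillwritepassword readable and writable by its owner only."""
    path = os.path.join(spool_dir, ".quillwritepassword")
    # O_EXCL refuses to overwrite an existing file; 0o600 is owner read/write.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(password)
    # Set the mode explicitly so the result does not depend on the umask.
    os.chmod(path, 0o600)
    return path


spool = tempfile.mkdtemp()  # stand-in for the real $(SPOOL) directory
path = write_password_file(spool, "example-password")
mode = stat.S_IMODE(os.stat(path).st_mode)
print(path, oct(mode))
```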

22 Quill Configuration (cont)
› Condor system-specific attributes in file condor_config.local
    QUILL = $(SBIN)/condor_quill
    QUILL_LOG = $(LOG)/QuillLog
    QUILL_ADDRESS_FILE = $(LOG)/.quill_address
    DAEMON_LIST = …, QUILL
    VALID_SPOOL_FILES = …, .quillwritepassword
    DC_DAEMON_LIST = …, QUILL

23 Quill Configuration (cont)
› Quill-specific attributes
    QUILL_ENABLED = TRUE
    # The quill name must be unique across all
    # quill daemons AND schedds
    QUILL_NAME = psilord_quilld@merlin.cs
    QUILL_DB_NAME = psilord_db
    QUILL_DB_IP_ADDR = merlin.cs.wisc.edu:42999
    QUILL_POLLING_PERIOD = 10 (seconds)
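The uniqueness rule in the comment above (a Quill name must not collide with any other Quill daemon or schedd name) is easy to check before deploying. A minimal Python sketch, assuming the names in a pool have already been collected into two lists; the function duplicate_names and every name except psilord_quilld@merlin.cs are made up for illustration.

```python
from collections import Counter


def duplicate_names(quill_names, schedd_names):
    """Return names that appear more than once across both lists (hypothetical check)."""
    counts = Counter(list(quill_names) + list(schedd_names))
    return sorted(name for name, seen in counts.items() if seen > 1)


# Illustrative pool: the second Quill name collides with a schedd name.
quills = ["psilord_quilld@merlin.cs", "other_quilld@merlin.cs"]
schedds = ["merlin.cs.wisc.edu", "other_quilld@merlin.cs"]
print(duplicate_names(quills, schedds))
```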

24 Quill Configuration (cont)
› QUILL_HISTORY_CLEANING_INTERVAL = 24 (hours)
› QUILL_HISTORY_DURATION = 30 (days)
› QUILL_MANAGE_VACUUM = FALSE
› QUILL_IS_REMOTELY_QUERYABLE = TRUE
› QUILL_DB_QUERY_PASSWD = xxx
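To make the two history settings concrete, here is a Python sketch of how a 30-day retention window selects rows at cleaning time. It assumes QUILL_HISTORY_DURATION means "drop history older than this many days", which the slide implies but does not spell out. The function expired_jobs, the job IDs, and the dates are illustrative only.

```python
from datetime import datetime, timedelta


def expired_jobs(completed_at, now, duration_days=30):
    """Return job IDs whose completion time is older than the retention window."""
    cutoff = now - timedelta(days=duration_days)
    return sorted(job for job, done in completed_at.items() if done < cutoff)


# Made-up history rows: one job finished long ago, one finished recently.
history = {
    "1.0": datetime(2006, 3, 1, 14, 23),
    "2.0": datetime(2006, 4, 21, 10, 12),
}
print(expired_jobs(history, now=datetime(2006, 4, 25)))
```

Under this reading, the cleaning interval only controls how often such a pass runs (every 24 hours in the example), while the duration controls what each pass removes.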

25 DB Storage Method
› Schema designed to store and query classads
  › 4 tables to represent the job queue classads
  › 2 for history data
  › 1 for metadata
› Some queries are easier than others
› Ask more questions at the BOF!

26 Thank you!
› Want more information?
› BOF “Databases in Condor: Now and in the Future”

