Download presentation
Presentation is loading. Please wait.
Published byΜυρίνα Βασιλικός Modified over 5 years ago
1
Submitting Jobs (and how to find them) John (TJ) Knoeller Center for High Throughput Computing
2
Overview Development of an advanced submit file
Using as many techniques and tricks as possible. Custom print formats A few more random tricks (time permitting)
3
The Story I have a lot of media files that I have collected over the years. I want to convert them all to .mp4 (Sounds like a high-throughput problem…)
4
Basic submit file for conversion
Executable = ffmpeg Transfer_executable = false Should_transfer_files = YES file = S1E2 The Train Job.wmv Transfer_input_files = $(file) Args = "-i '$(file)' '$(file).mp4'" Queue From now on, assume that Executable and should transfer are always present in the submit file.
5
Converting a set of files
Transfer_input_files = $(file) Args = "-i '$(file)' '$(file).mp4' " Queue FILE from ( S1E1 Serenity.wmv S1E2 The Train Job.wmv S1E3 Bushwhacked.wmv S1E4 Shindig.wmv … )
6
Output filename problems
Output is $(file).mp4. So output files are named S1E1 Serenity.wmv.mp4 S1E2 The Train Job.wmv.mp4 S1E3 Bushwhacked.wmv.mp4 S1E4 Shindig.wmv.mp4
7
$F() to the rescue $Fqpdnxba() expands to parts of a filename
file = "./Video/Firefly/S1E4 Shindig.wmv" $Fp(file) -> ./Video/Firefly/ $Fqp(file) -> "./Video/Firefly" $Fqpa(file)-> './Video/Firefly' $Fd(file) -> Firefly/ $Fdb(file) -> Firefly $Fn(file) -> S1E4 Shindig $Fx(file) -> .wmv $Fnx(file) -> S1E4 Shindig.wmv
8
$Fn() is name without extension
Transfer_Input_Files = $(file) Args = "-i '$Fnx(file)' '$Fn(file).mp4'" Resulting files are now S1E1 Serenity.mp4 S1E2 The Train Job.mp4 S1E3 Bushwhacked.mp4 S1E4 Shindig.mp4
9
$Fq() and Arguments $Fq(file) expands to quoted "filename"
Gives "parse error" with Arguments statement For Args use '$F(file)' instead. Becomes 'filename' on LINUX Becomes "filename" on Windows In 8.6 you can use $Fqa(file) instead
10
"new" Args preserves spaces
FILE = The Train Job.wmv Args = "-i '$Fnx(file)' -w640 '$Fn(file).mp4' " # Tool Tip* see it before you submit it. condor_submit test.sub -dump test.ads condor_status -ads test.ads -af Arguments -i The' 'Train' 'Job.wmv -w640 The' 'Train' 'Job.mp4 # On *nix the job sees -i 'The Train Job.wmv' –w640 'The Train Job.mp4‘ # on Windows the job sees -i "The Train Job.wmv" –w640 "The Train Job.mp4"
11
Sometimes you can't use Args
Argument quoting not portable across operating systems LINUX needs space and ' escaped Windows needs double quotes around filenames that have space or ^ What the job sees can be hard to predict Transfer_input_files will not transfer a file with a comma in the name.
12
Alternative to Args Use a script as your executable
Use custom job attributes to pass information to the script # these both set CustomAttr in the job ad +CustomAttr = "value" MY.CustomAttr = "value" You can refer to custom attributes in $() expansion in your submit file transfer_input_files = $F(My.CustomAttr) Use $F() here because it strips the quotes
13
Add custom attributes to the job
Executable = xcode.pl Args = -s 640x360 Transfer_executable = true Should_transfer_files = true # +WantIOProxy = true MY.SourceDir = $Fqp(FILE) MY.SourceFile = $Fqnx(FILE) +OutFile = "$Fn(FILE).mp4" Batch_name = $Fdb(FILE) Queue FILE matching files Firefly/*.wmv
14
Use a script to query the .job.ad
#!/usr/bin/env perl # xcode.pl # Pull filenames from job ad my $src = `condor_status -ads .job.ad -af SourceFile`; my $out = `condor_status -ads .job.ad -af OutFile`; # find condor_chirp (also need +WantIOProxy in job) my $lib = `condor_config_val libexec`; chomp $src; chomp $out; chomp $lib; # fetch the input file system("$lib/condor_chirp fetch '$src' '$src'") # do the conversion system("ffmpeg -i '$out'");
15
Use python to query the .job.ad
#!/usr/bin/env python # xcode.py # load the job ad and get the source filename job = classad.parseOne(open('.job.ad').read()) src = job['SourceFile'] srcpath = job['SourceDir'] + src # fetch the input file chirp.fetch(srcpath, src) # do the conversion and return the exit code exit(os.system("ffmpeg -i {0} {2} {1}".format( src, job['OutFile'], job['Arguments']));
16
See how it's going.. condor_q OWNER BATCH_NAME … DONE RUN IDLE TOTAL JOB_IDS Tj Firefly _ 2 2 _ condor_q -af:jh JobStatus SourceFile SourceDir ID JobStatus SourceFile SourceDir S1E1 Serenity Firefly/ S1E2 The Train Job Firefly/ S1E3 Bushwhacked Firefly/ S134 Shindig Firefly/
17
The ClassAd "Pretty Printer"
A c++ class within HTCondor that can print information from a ClassAd Used by condor_status / q / history for most of the standard output -format and -autoformat control it directly -print-format <format-file> gives the most control
18
Use a custom print format
-print-format <format-file> control attributes, headings, format, constraint like -autoformat on steroids condor_status, condor_q, and condor_history Config to make it your default output An "experimental" feature, but stable htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=ExperimentalFeatures
19
Custom print format syntax
SELECT [LABEL [SEPARATOR <string>]] \ [FIELDPREFIX <string>] \ [FIELDSUFFIX <string>] \ [RECORDPREFIX] <string>] \ [RECORDSUFFIX] <string>] <expr> [AS <label>][ PRINTF <fmt> \ | PRINTAS <function> \ | WIDTH [AUTO | <int>]] \ [LEFT | RIGHT] [NOPREFIX] [NOSUFFIX] .. repeat, as needed [GROUP BY <sort-expr> [ASCENDING | DECENDING]
20
Custom print format xcode.cpf
SELECT ClusterId AS " ID" PRINTAS JOB_ID JobStatus AS ST PRINTAS JOB_STATUS (time()-EnteredCurrentStatus)/60.0 AS MIN PRINTF %7.2f JobBatchName AS BATCH SourceFile AS SOURCE RemoteHost ?: "_" AS SLOT # ignore jobs without the custom SourceFile attribute WHERE SourceFile isnt undefined SUMMARY STANDARD ID ST MIN BATCH SOURCE SLOT R 5.02 Firefly S1E2 The Train Job
21
Make it your default output
In your personal config ~/.condor/user_config %USERPROFILE%\.condor\user_config Save the xcode.cpf file and add this knob # uncomment one of these depending on Linux/Windows # PERSONAL = $ENV(HOME)/.condor # PERSONAL = $ENV(USERPROFILE)\.condor Q_DEFAULT_PRINT_FORMAT_FILE=$(PERSONAL)/xcode.cpf
22
And now a few random tips...
23
Put Queue on command line
condor_submit xcod.sub -q FILE matching *.wmv (xcod.sub must NOT have a queue line) pick.sh | condor_submit x.sub -q FILE from -
24
Test using a subset of jobs
Queue FILE from ( S1E1 Serenity.wmv # S1E2 The Train Job.wmv # S1E3 Bushwhacked.wmv … ) use a python-style slice to define a subset Queue FILE matching files [:1] *.wmv
25
Even easier if you prepare
Put $(slice) in your submit file Queue FILE matching files $(slice) *.wmv Then control the slice from the command line condor_submit 'slice=[:1]' firefly.sub
26
Variable Tricks # transfer files starting with file001 sequence = $(ProcId)+1 transfer_input_files = file$INT(sequence,%03d) # Use the submit dir and cluster as the batch name batch_name = $Ffdb(SUBMIT_FILE)_$(ClusterId) # use the same random value for all jobs in this submit include command : /bin/echo rval=$RANDOM_INTEGER(1,100) Arguments = $(rval)
27
Submit from python # HTCondor or later for this code sub = htcondor.Submit(open('xcode.sub').read()) # override some things sub['MY.SourceDir'] = '"/media/Firefly/"' sub['FILE'] = '$(Item)' files = ['S1E1 Serenity.wmv', 'S1E2 The Train Job.wmv', 'S1E3 Bushwhacked.wmv', 'S1E4 Shindig.wmv' ] # submit using the files list with schedd.transaction() as txn: cluster = sub.queue_with_iter(txn, 1, iter(files)) # cluster object has ClusterId, range of ProcId's # and "common" job ClassAd
28
Questions?
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.