WMS - Scripting techniques


1 WMS - Scripting techniques
Fabio Scibilia, INFN – Catania, Italy. Dipartimento di Ingegneria Informatica e delle Telecomunicazioni (DIIT), Catania, March 2007

2 Preliminaries
LCG middleware
- The workload is managed by the Resource Broker
- Supports neither parametric jobs nor DAGs
- Works fine
gLite
- Supports both parametric jobs and DAG jobs
- Under development
- Uses WMProxy to manage the workload
- Will be available in a few months
Tips and tricks
- Some ideas for using the LCG middleware to support parametric jobs and DAGs while waiting for the stable release of WMProxy

3 Exercise 1: Parametric jobs

4 Exercise 1: The bash script (1/2)
A set of jobs that differ only in their input files. The bash script looks like this:

    #!/bin/bash
    if [ "$2" = "" ]; then
        echo "Usage: $0 begin end [step]"
        echo "  begin  The first value of the sequence"
        echo "  end    The last value of the sequence"
        echo "  step   The step between two submissions"
        exit 1
    fi
    joblist="jobs.list"
    begin_index=$1   # the first parameter of the script
    end_index=$2     # the second parameter of the script
    if [ "$3" = "" ]; then
        step=1
    else
        step=$3      # the third parameter of the script
    fi
    . . .

5 Exercise 1: The bash script (2/2)
    # starts iterations
    for ((index=$begin_index; index<=$end_index; index=$index+$step))
    do
        # we generate the input file automatically; it could also be written by hand
        inputfile="input$index.txt"
        echo "creating input file $inputfile"
        echo "The name of this input file is $inputfile" > $inputfile

        # create the corresponding JDL file, depending on the index
        jdlfile="job$index.jdl"   # name of the JDL file
        echo "creating JDL file $jdlfile"
        (
            echo 'Type="Job";'
            echo 'JobType="Normal";'
            echo 'Executable="/bin/cat";'
            echo "Arguments=\"$inputfile\";"
            echo "StdOutput=\"stdout$index.txt\";"
            echo "StdError=\"stderr$index.txt\";"
            echo "InputSandbox={\"$inputfile\"};"
            echo "OutputSandbox={\"stdout$index.txt\", \"stderr$index.txt\"};"
        ) > $jdlfile

        edg-job-submit -o jobs.id $jdlfile   # actual job submission
    done   # end of iterations
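The JDL-generation step of the loop can also be written with a here-document, which avoids escaping every double quote. The following is only a minimal sketch, not the author's script: the function name `write_jdl` is hypothetical, and it writes the file without creating the input file or submitting anything.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original slides): writes job<index>.jdl
# with the same attributes that the loop in the slides generates via echo.
write_jdl() {
    local index=$1
    local inputfile="input$index.txt"
    # Unquoted EOF: variables are expanded, double quotes are kept literally.
    cat > "job$index.jdl" <<EOF
Type="Job";
JobType="Normal";
Executable="/bin/cat";
Arguments="$inputfile";
StdOutput="stdout$index.txt";
StdError="stderr$index.txt";
InputSandbox={"$inputfile"};
OutputSandbox={"stdout$index.txt", "stderr$index.txt"};
EOF
}
```

Inside the loop, `write_jdl "$index"` would take the place of the parenthesised group of `echo` commands; the `edg-job-submit` call stays unchanged.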

6 Exercise 2: DAGs

7 Exercise 2: DAG modelling
DAGs can be emulated with a simplified Petri net
- A job is submitted only after the jobs that activate it have terminated
- Each transition bar corresponds to a bash script that
  - waits for the termination of the activating job(s), polling every minute
  - collects the output
  - submits the next job(s)
[Figure: example DAG with jobs 1 to 6]
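The behaviour of one transition bar can be sketched as two small functions. This is an illustration under stated assumptions, not code from the slides: `transition_enabled`, `fire_when_enabled` and `POLL_INTERVAL` are hypothetical names, and the status command is passed in as an argument so that the sketch does not depend on a grid installation. On a real user interface that command would wrap `edg-job-status`.

```shell
#!/bin/bash
# Returns 0 when every activating job reports "Done", 1 otherwise.
# $1 is the name of a command that prints the status of the job given as its
# argument (hypothetical; on the grid it would wrap edg-job-status).
transition_enabled() {
    local status_cmd=$1
    shift
    local job
    for job in "$@"; do
        [ "$("$status_cmd" "$job")" = "Done" ] || return 1
    done
    return 0
}

# Blocks until the transition is enabled, then fires it by running next_cmd.
# POLL_INTERVAL is in seconds and defaults to 60 (polling every minute).
fire_when_enabled() {
    local status_cmd=$1 next_cmd=$2
    shift 2
    until transition_enabled "$status_cmd" "$@"; do
        sleep "${POLL_INTERVAL:-60}"
    done
    "$next_cmd"
}
```

Note that this sketch waits forever if an activating job ends in a state other than Done (for example Aborted); a production script would need to treat such states as failures.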

8 Exercise 2: An example
We emulate a simple split-and-merge DAG
- Two-state machine
- The example can nevertheless be extended to any possible DAG
- ./submitter.sh: generates input[1..n].txt and submits the jobs
- ./polling.sh && ./last_job.sh: implement the transition bar
  - ./polling.sh: waits for jobs [1..n] to complete, collects the output and creates the final input file
  - ./last_job.sh: submits the last job and waits for its completion, then downloads the output
[Figure: input1.txt ... input(n).txt are processed by jobs 1..n; the stdout of each job is merged into final_input, from which the last job produces final_output]

9 Exercise 2: ./submitter.sh
    #!/bin/bash
    if [ "$1" = "" ]; then
        echo "Usage: $0 num-splits"
        exit 1
    fi
    for ((index=1; index<=$1; index++)); do   # for each job
        echo "this is the content of input$index.txt" >> input$index.txt
        (   ## creates the JDL for this job
            echo "Type=\"Job\";"
            echo "JobType=\"Normal\";"
            echo "Executable=\"/bin/cat\";"
            echo "Arguments=\"input$index.txt\";"
            echo "InputSandbox={\"input$index.txt\"};"
            echo "StdOutput=\"stdout.txt\";"
            echo "StdError=\"stderr.txt\";"
            echo "OutputSandbox={\"stdout.txt\", \"stderr.txt\"};"
        ) > job$index.jdl
        edg-job-submit -o jobs.id job$index.jdl
    done

10 Exercise 2: ./submitter.sh output
dag]$ ./submitter.sh 2
The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status.
Your job identifier is:
-
The job identifier has been saved in the following file:
/home/fscibi/tips_and_tricks/dag/jobs.id
-

11 Exercise 2: ./polling.sh (1/4)
    #!/bin/sh
    while read line; do
        if [ "$line" != "###Submitted Job Ids###" ]; then
            joblist="$joblist $line"
        fi
    done < jobs.id

    for job in $joblist; do
        status="unknown"
        finished="false"
        while [ "$finished" = "false" ]; do   # loops waiting for job completion
            ## gets the status of the job
            echo
            echo "getting status of job $job"
            output=`edg-job-status $job`
            status=`echo "$output" | grep "Current Status" | awk '{print $3 }'`
            echo "status = $status"

12 Exercise 2: ./polling.sh (2/4)
            ## depending on the status, decides what to do
            case $status in
                "Aborted")
                    echo "The job has been aborted on the CE"
                    finished="true"
                    ;;
                "Cleared")
                    echo "The job output sandbox has already been retrieved. I don't know where!"
                    finished="true"
                    ;;
                "Done")
                    echo "Job $job Done!!! Downloading the output"
                    ## executes edg-job-get-output and parses its output
                    ## to find out where the output has been stored

13 Exercise 2: ./polling.sh (3/4)
                    edg-job-get-output --dir . $job | (
                        ## reads the output of edg-job-get-output through the pipe
                        ## to look for the directory where the sandbox was stored
                        found="false"
                        while read line; do
                            if [ "$found" = "true" ]; then
                                ## this line contains the dir path
                                dirpath=$line
                                echo "output sandbox stored at $dirpath"
                                break
                            fi
                            if echo "$line" | grep -q "have been successfully retrieved and stored"; then
                                found="true"   ## next line contains the dir path
                            fi
                        done
                        if [ "$found" = "true" ]; then
                            filename=$dirpath/stdout.txt
                            echo "appending $filename to final_input"
                            cat "$filename" >> final_input
                        fi
                    )
                    finished="true"
                    ;;

14 Exercise 2: ./polling.sh (4/4)
                *)
                    echo "sleeping 1 minute"
                    sleep 60
                    ;;
            esac
        done   # while
    done   # for

dag]$ ./polling.sh
. . . (after a while)
getting status of job Od68j9IBOuJHGlUq-EfWTg
status = Done
Job Done!!! Downloading the output
output sandbox stored at /dag/fscibi_Od68j9IBOuJHGlUq-EfWTg
appending dag/fscibi_Od68j9IBOuJHGlUq-EfWTg/stdout.txt to final_input
getting status of job _-suh1wmmo1VvYJd_4AiLhA
output sandbox stored at /dag/fscibi_-suh1wmmo1VvYJd_4AiLhA
appending dag/fscibi_-suh1wmmo1VvYJd_4AiLhA/stdout.txt to final_input
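The three parsing steps that the polling script performs inline (reading jobs.id, extracting the status, locating the output directory) can be isolated into small functions and tried on saved command output without submitting anything. This is a minimal sketch: the function names are hypothetical, and it assumes only what the slides show, namely that the status line contains "Current Status" with the state as third word, and that the directory path follows the "have been successfully retrieved and stored" message.

```shell
#!/bin/sh
# Prints the job identifiers stored in a file written by "edg-job-submit -o",
# skipping the header line and any blank lines.
read_job_ids() {
    grep -v -e '^###Submitted Job Ids###$' -e '^[[:space:]]*$' "$1"
}

# Reads the text printed by edg-job-status on stdin and prints the third word
# of the first "Current Status" line (for example "Done").
parse_status() {
    awk '/Current Status/ { print $3; exit }'
}

# Reads the text printed by edg-job-get-output on stdin and prints the first
# non-empty line after the "successfully retrieved" message, with surrounding
# blanks trimmed. Prints nothing if the message never appears.
extract_output_dir() {
    awk '
        found && NF { gsub(/^[ \t]+|[ \t]+$/, ""); print; exit }
        /have been successfully retrieved and stored/ { found = 1 }
    '
}
```

With these helpers the inner loop could read, for instance, `status=$(edg-job-status "$job" | parse_status)`. Unlike the loop in the slides, `extract_output_dir` skips blank lines before the path; that is a design choice of this sketch, not documented behaviour of the command.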

15 Exercise 2: Submitting last job
. . ] cat last_job.sh
    #!/bin/sh
    ## submits the last job
    edg-job-submit -o last_job.id last_job.jdl
    status=unknown
    while [ "$status" != "Done" ]; do
        echo "sleeping 30 seconds"
        sleep 30
        output=`edg-job-status -i last_job.id`
        status=`echo "$output" | grep "Current Status" | awk '{print $3 }'`
        echo "status = $status"
    done
    edg-job-get-output -i last_job.id --dir .
    echo "Everything is Done !!! "

. . ] cat last_job.jdl
    Type="Job";
    JobType="Normal";
    Executable="/bin/cat";
    Arguments="-n final_input";
    StdOutput="final_output";
    StdError="stderr.txt";
    InputSandbox={"final_input"};
    OutputSandbox={"stderr.txt", "final_output"};
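The loop in last_job.sh exits only when the status becomes Done, so it would keep polling if the job were aborted. One possible variation, sketched here under assumptions, stops on the other terminal states handled by polling.sh and gives up after a fixed number of polls. The name `wait_for_job` and its parameters are hypothetical, and the status command is injected so that the sketch runs without grid tools.

```shell
#!/bin/sh
# Polls until the job reaches a terminal state or max_attempts is exhausted.
# $1: command that prints the current status (hypothetical wrapper around
#     edg-job-status plus the grep/awk pipeline used in the slides)
# $2: maximum number of polls
# $3: seconds to sleep between polls
# Returns 0 on Done, 1 on Aborted or Cleared, 2 when the attempts run out.
wait_for_job() {
    status_cmd=$1
    max_attempts=$2
    interval=$3
    attempt=0
    while [ "$attempt" -lt "$max_attempts" ]; do
        status=$("$status_cmd")
        echo "status = $status"
        case $status in
            Done) return 0 ;;
            Aborted|Cleared) return 1 ;;
        esac
        attempt=$((attempt + 1))
        sleep "$interval"
    done
    return 2
}
```

In last_job.sh this could be called as `wait_for_job last_job_status 120 30`, where `last_job_status` is a hypothetical function that runs `edg-job-status -i last_job.id` and extracts the status word; the output would then be downloaded only when the function returns 0.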

16 Exercise 2: ./last_job.sh output
dag]$ ./last_job.sh
. . .
The job has been successfully submitted to the Network Server.
-
sleeping 30 seconds
(many times)
status = Scheduled
(waiting for status Done)
sleeping 30 seconds
status = Done
Retrieving files from host:
Output sandbox files for the job:
have been successfully retrieved and stored in the directory:
/home/fscibi/tips_and_tricks/dag/fscibi_q_rqVQpNFt0GshDn5MZHEw
Everything is Done !!!

dag]$ cat fscibi_q_rqVQpNFt0GshDn5MZHEw/final_output
1 this is the content of input1.txt
2 this is the content of input2.txt

17 References
- JDL (WMS Network Server)
- JDL (WMS WMProxy)
- Advanced BASH scripting
- Gilda twiki pages

18 Questions . . .

