J. Hanton - P. Herquet - F. Lequeux - A. Romeyer

CONDOR-G Installation

- July 2004: one independent PC for GridFTP, as a client to UCL
- August 2004: complete installation of Globus 3.2 on the farm
    - GridFTP server -> another picture to download
    - Interface with CONDOR for batch submission
- Local test with the independent PC:

    globus-url-copy -v gsiftp://cms01.umh.ac.be/tmp/newlab_we_are_ready.jpg file://${HOME}/newlab_we_are_ready.jpg
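The local GridFTP test could also be scripted. What follows is only a minimal sketch, assuming Python is available on the independent PC. The helper name `build_copy_command` and the home directory `/home/cmsuser` are hypothetical; the command name, the `-v` flag and the `gsiftp://` URL are the ones shown on the slide. The function only builds the argument list, and running it is left to the caller.

```python
import posixpath
from urllib.parse import quote


def build_copy_command(host, remote_path, local_path):
    """Build the argv list for a globus-url-copy download (hypothetical helper)."""
    if not posixpath.isabs(local_path):
        # A file:// URL needs an absolute path on the local machine.
        raise ValueError("local_path must be absolute")
    source = f"gsiftp://{host}{remote_path}"
    destination = "file://" + quote(local_path)
    return ["globus-url-copy", "-v", source, destination]


if __name__ == "__main__":
    # /home/cmsuser is an assumed home directory, used for illustration only.
    command = build_copy_command(
        "cms01.umh.ac.be",
        "/tmp/newlab_we_are_ready.jpg",
        "/home/cmsuser/newlab_we_are_ready.jpg",
    )
    print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the Globus client tools and a valid proxy are present.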

Cluster architecture

[Diagram. Labels: Server (cms01.umh.ac.be, static IP XXX), Router, Computer …, RAID disk (2.4 TB), Independent PC, Outer world; link speeds 1 Gb/s and 100 Mb/s; software labels CMS, CONDOR, Globus.]

- OS: Red Hat CERN

Globus setup

On both the indep. PC and the CMS01 public machine:

- Configure the GridFTP server on port 2811
- Configure the gatekeeper service on port 2119
- Setting up the grid-mapfile:

    "/C=BE/O=BELGRID/OU=TESTBED/OU=umh.ac.be/CN=cmsuser" cmsuser
    "/C=BE/O=BELGRID/OU=TESTBED/OU=localdomain/CN=cmsuser" cmsuser
    "/C=BE/O=BELGRID/OU=TESTBED/OU=fynu.ucl.ac.be/CN=Alain NINANE" cmsuser

- Setting up the GRIM port types:

    d_job/ManagedJobPortType
    ed_job/ManagedJobPortType
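Each grid-mapfile line pairs a quoted certificate subject with a local account. To check such a file before deploying it, a small parser can help. The sketch below is illustrative only: the function name `parse_grid_mapfile` is hypothetical, and it assumes that lines starting with `#` are comments and that several local accounts may be given as a comma-separated list.

```python
import shlex


def parse_grid_mapfile(text):
    """Map each certificate subject (DN) to its list of local accounts."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        parts = shlex.split(line)  # honours the double quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"unexpected grid-mapfile line: {raw!r}")
        subject, accounts = parts
        mapping[subject] = accounts.split(",")
    return mapping


if __name__ == "__main__":
    sample = '"/C=BE/O=BELGRID/OU=TESTBED/OU=umh.ac.be/CN=cmsuser" cmsuser\n'
    for subject, accounts in parse_grid_mapfile(sample).items():
        print(subject, "->", ", ".join(accounts))
```

Using `shlex.split` keeps a subject containing a space, such as `CN=Alain NINANE`, in one piece.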

Globus setup…

Let's test it (local test, remote test):

    globus-job-run cms01.umh.ac.be /bin/date
    GRAM Job submission failed because data transfer to the server failed (error code 10)

- Nothing strange in globus-gatekeeper.log
- Solution (only a trick…): remove gsi-authz.conf from /etc/grid-security

    globus-job-run cms01.umh.ac.be /bin/date
    Thu Aug 19 10:12:34 CEST 2004

    globus-job-run cms01.umh.ac.be /bin/date
    Thu Aug 19 10:12:16 CEST 2004

Build the CONDOR scheduler in Globus:

    gpt-build scheduler-condor-3.2-src_bundle.tar.gz gcc32dbg
    gpt-postinstall
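When the `globus-job-run` test is repeated from a script, it helps to separate a GRAM failure from a normal result. The sketch below is a minimal illustration: the function name `check_job_output` is hypothetical, and the pattern is based only on the single error message quoted on the slide, so other GRAM messages may be worded differently.

```python
import re

# Pattern derived from the one failure message shown on the slide.
GRAM_FAILURE = re.compile(
    r"GRAM Job submission failed because (?P<reason>.+?) "
    r"\(error code (?P<code>\d+)\)",
    re.IGNORECASE,
)


def check_job_output(output):
    """Return (True, text) on success or (False, (code, reason)) on a GRAM failure."""
    match = GRAM_FAILURE.search(output)
    if match:
        return False, (int(match.group("code")), match.group("reason"))
    return True, output.strip()


if __name__ == "__main__":
    failed = ("Gram Job submission failed because data transfer "
              "to the server failed (error code 10)")
    print(check_job_output(failed))
    print(check_job_output("Thu Aug 19 10:12:34 CEST 2004\n"))
```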

Test of CONDOR-G

Start with a CONDOR example: io.c

CONDOR .cmd file:

    ####################
    ##
    ## Test Condor command file
    ##
    ####################
    universe        = globus
    globusscheduler = cms01.umh.ac.be/jobmanager-condor
    executable      = io.remote
    output          = io.out
    error           = io.err
    log             = io.log
    requirements    = CMSFARM =?= True
    arguments       = 200
    queue

[Diagram. Indep. PC: CONDOR + Globus; CMS01: CONDOR + Globus; CMS02, CMS03: CONDOR. Interface labels: eth0, eth1, eth0. Link labels: Globus, CONDOR.]
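If several test jobs of this kind are needed, the command file can be generated rather than edited by hand. The following is a rough sketch under that assumption: `render_submit_file` and `IO_TEST` are hypothetical names, while the keys and values are copied from the command file on the slide.

```python
def render_submit_file(settings, queue_count=1):
    """Render 'key = value' lines followed by a queue statement."""
    lines = [f"{key} = {value}" for key, value in settings.items()]
    lines.append("queue" if queue_count == 1 else f"queue {queue_count}")
    return "\n".join(lines) + "\n"


# Values taken from the slide; the dict itself is only an illustration.
IO_TEST = {
    "universe": "globus",
    "globusscheduler": "cms01.umh.ac.be/jobmanager-condor",
    "executable": "io.remote",
    "output": "io.out",
    "error": "io.err",
    "log": "io.log",
    "requirements": "CMSFARM =?= True",
    "arguments": "200",
}

if __name__ == "__main__":
    print(render_submit_file(IO_TEST), end="")
```

Because dictionaries preserve insertion order, the generated file lists the settings in the same order as the slide.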

Test of CONDOR-G…

Launch the job from the indep. PC:

    condor_submit io.cmd

    condor_q -globus
    -- Submitter: cms-test.umh.ac.be : : cms-test.umh.ac.be
     ID  OWNER    STATUS       MANAGER  HOST             EXECUTABLE
         cmsuser  UNSUBMITTED  condor   cms01.umh.ac.be  /tmp/Scratch/examp

    condor_q -globus
    -- Submitter: cms-test.umh.ac.be : : cms-test.umh.ac.be
     ID  OWNER    STATUS       MANAGER  HOST             EXECUTABLE
         cmsuser  PENDING      condor   cms01.umh.ac.be  /tmp/Scratch/examp

    condor_q -globus
    -- Submitter: cms-test.umh.ac.be : : cms-test.umh.ac.be
     ID  OWNER    STATUS       MANAGER  HOST             EXECUTABLE
         cmsuser  ACTIVE       condor   cms01.umh.ac.be  /tmp/Scratch/examp
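The three `condor_q -globus` snapshots show the job moving from UNSUBMITTED to PENDING to ACTIVE. A script watching the queue could verify that progression. The sketch below is only an illustration: the function names are hypothetical, and `KNOWN_STATES` lists just the three states that appear on the slide, so any other state is ignored.

```python
# Only the states seen on the slide, in the order they were observed.
KNOWN_STATES = ("UNSUBMITTED", "PENDING", "ACTIVE")


def extract_states(condor_q_output):
    """Return the Globus status found on each job line, in order of appearance."""
    states = []
    for line in condor_q_output.splitlines():
        for token in line.split():
            if token in KNOWN_STATES:
                states.append(token)
                break  # at most one status per line
    return states


def is_forward_progress(states):
    """True if the states never move backwards in KNOWN_STATES order."""
    ranks = [KNOWN_STATES.index(state) for state in states]
    return ranks == sorted(ranks)


if __name__ == "__main__":
    snapshot = (
        "ID OWNER STATUS MANAGER HOST EXECUTABLE\n"
        "cmsuser PENDING condor cms01.umh.ac.be /tmp/Scratch/examp\n"
    )
    print(extract_states(snapshot))
```

Matching whole tokens means the `STATUS` column header is not mistaken for a job state.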

Test of CONDOR-G…

On cms01:

    condor_q
    -- Submitter: cms01.umh.ac.be : : cms01.umh.ac.be
     ID  OWNER    SUBMITTED  RUN_TIME  ST  PRI  SIZE  CMD
         cmsuser  8/19 10:   :00:51    R              data Lumi2
    1 jobs; 0 idle, 1 running, 0 held

    condor_q -r
    -- Submitter: cms01.umh.ac.be : : cms01.umh.ac.be
     ID  OWNER    SUBMITTED  RUN_TIME  HOST(S)
         cmsuser  8/19 10:   :01:00    cms02

The job is running on cms02.

Remark: the test has also been done with a CMS reconstruction job.
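The summary line printed by `condor_q` on cms01 is convenient for a quick automated check that the forwarded job has actually started. This is a minimal sketch, assuming the summary keeps the exact wording shown on the slide; the function name `parse_summary` is hypothetical.

```python
import re

# Wording copied from the summary line on the slide.
SUMMARY = re.compile(r"(\d+) jobs; (\d+) idle, (\d+) running, (\d+) held")


def parse_summary(text):
    """Return the job counts from a condor_q summary line, or None if absent."""
    match = SUMMARY.search(text)
    if match is None:
        return None
    total, idle, running, held = (int(value) for value in match.groups())
    return {"jobs": total, "idle": idle, "running": running, "held": held}


if __name__ == "__main__":
    counts = parse_summary("1 jobs; 0 idle, 1 running, 0 held")
    print(counts)
```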