APST Internals Sathish Vadhiyar
apstd daemon
The apstd daemon should be started on the local resource.
It opens a port to listen for apst client requests.
It runs on the host where the input files are located. Input files can also be specified through an element in the XML file.
apstd automatically copies output files from the working directory to the directory where apstd was started.
apst and apstd are started by the same user, since apstd writes files on behalf of the apst user.
An APST run is associated with an XML file. Task dependencies can be enforced by APST.
XML: <apst> </apst>
Sometimes tasks have no file dependency between them but still have a task dependency.
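To illustrate what enforcing a task dependency means, here is a minimal Python sketch that orders tasks so each one runs only after the tasks it depends on. It is not how APST implements dependencies; the task names and the `dependencies` mapping are hypothetical, and the ordering uses the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks that must finish first.
# No files are shared here; the ordering is a pure task dependency.
dependencies = {
    "preprocess": set(),
    "simulate": {"preprocess"},
    "analyze": {"simulate"},
    "report": {"analyze", "preprocess"},
}


def execution_order(deps):
    """Return one valid order in which the tasks can be run."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A scheduler that honours such dependencies would only hand a task to a machine once every task before it in this order that it depends on has completed.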
XML example
Security
Some kind of security governs which commands apstd will accept over the socket.
Given a description of the tasks to do and the resources (disks and machines) available, APST will assign individual tasks to available machines, copy the input files, run the tasks, and return the output files. APST also tries to assign tasks to machines intelligently, using information such as the load and speed of individual machines.
The main APST program, apstd, handles all of the task assignment, application execution, and file copying. The apst client is a separate program that sends requests to apstd. Splitting the control and user interface portions of APST like this allows you, for example, to run apstd on your main system but control it from your laptop.
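As a rough illustration of assigning tasks using machine load and speed, the following Python sketch places each task on the machine with the earliest estimated finish time. This is a generic greedy heuristic, not APST's actual scheduler; the `Machine` class, the machine names, and the work figures are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    speed: float             # assumed unit: work units per second when idle
    load: float              # fraction of the CPU already in use, 0.0 <= load < 1.0
    busy_until: float = 0.0  # time at which the already assigned work finishes

    def effective_speed(self):
        # A loaded machine delivers only part of its nominal speed.
        return self.speed * (1.0 - self.load)


def assign(tasks, machines):
    """Greedily map each task to the machine that would finish it first.

    tasks: dict of task name -> amount of work (hypothetical units).
    Returns a dict of task name -> machine name.
    """
    placement = {}
    # Consider the largest tasks first; ties keep their original order.
    for name, work in sorted(tasks.items(), key=lambda kv: -kv[1]):
        best = min(machines,
                   key=lambda m: m.busy_until + work / m.effective_speed())
        best.busy_until += work / best.effective_speed()
        placement[name] = best.name
    return placement


machines = [Machine("nodeA", speed=2.0, load=0.5),
            Machine("nodeB", speed=0.8, load=0.0)]
placement = assign({"t1": 8.0, "t2": 8.0, "t3": 8.0}, machines)
print(placement)
```

In this example nodeA is nominally faster but half loaded, so the second task goes to nodeB rather than queueing behind the first.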
Using local resources
<apst>
  <task executable='perl' arguments='/home/${USER}/apst/Examples/charcount.pl /home/${USER}/apst/Examples/charcount0.dat' stdout='charcount0.out' />
</apst>
/home/${USER}/apst/bin/apstd -d --port 7890 first.xml
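To make the structure of this task description concrete, the sketch below reads the same XML with Python's standard `xml.etree.ElementTree` module and lists the attributes of each task. It is only a way to inspect the file, not how apstd parses it; the `SPEC` string and the `list_tasks` helper are names introduced for the example.

```python
import xml.etree.ElementTree as ET

# The task description from the slide, reproduced as a string.
SPEC = """
<apst>
  <task executable='perl' arguments='/home/${USER}/apst/Examples/charcount.pl /home/${USER}/apst/Examples/charcount0.dat' stdout='charcount0.out' />
</apst>
"""


def list_tasks(xml_text):
    """Return the attributes of every <task> element as a list of dicts."""
    root = ET.fromstring(xml_text.strip())
    return [dict(task.attrib) for task in root.iter("task")]


tasks = list_tasks(SPEC)
for t in tasks:
    print(t["executable"], t["arguments"], "->", t.get("stdout"))
```

Note that `${USER}` is left unexpanded here; the parser treats it as ordinary text.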
APST can use remote machines accessed through either a Globus GRAM or ssh, remote storage accessed through a Globus GASS server, scp, ftp, sftp, or an SRB server, and queueing systems controlled by Condor, DQS, LoadLeveler, LSF, PBS, or SGE.
Accessing remote resources: walk through
<apst> </apst>
Launches the task on blueHost through ssh, but assumes that files on the local disk can be accessed directly.
This tells apstd that blueHost can see files available on blueDisk, rather than those on the local disk.
The problem with this XML is that it requires that APST be installed on the remote machine in /home/${USER}/apst, since the arguments task attribute refers to files in this directory.
Equivalent to:
scp /home/${USER}/apst/Examples/charcount.pl blue.ufo.edu:/tmp/charcount.pl
scp /home/${USER}/apst/Examples/charcount0.dat blue.ufo.edu:/tmp/charcount0.dat
ssh blue.ufo.edu 'cd /tmp; perl ./charcount.pl ./charcount0.dat > charcount0.out'
scp blue.ufo.edu:/tmp/charcount0.out /home/${USER}/apst/Examples/charcount0.out
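The stage-in, run, stage-out pattern behind these commands can be sketched in Python as a function that builds the command lines for any host and file set. This is an illustration of the pattern only, not APST code; `remote_run_commands` and its parameters are hypothetical, and the function returns the commands as strings rather than executing them.

```python
import posixpath
import shlex


def remote_run_commands(host, remote_dir, inputs, command, output, local_dir):
    """Build the shell commands for one remote task: copy inputs, run, fetch output.

    Paths are left unquoted so that the shell can still expand ${USER}.
    """
    cmds = []
    # 1. Stage every input file into the remote working directory.
    for path in inputs:
        name = posixpath.basename(path)
        cmds.append(f"scp {path} {host}:{remote_dir}/{name}")
    # 2. Run the task there, redirecting stdout to the output file.
    remote = f"cd {remote_dir}; {command} > {output}"
    cmds.append(f"ssh {host} {shlex.quote(remote)}")
    # 3. Copy the output file back to the local directory.
    cmds.append(f"scp {host}:{remote_dir}/{output} {local_dir}/{output}")
    return cmds


examples = "/home/${USER}/apst/Examples"
cmds = remote_run_commands(
    host="blue.ufo.edu",
    remote_dir="/tmp",
    inputs=[f"{examples}/charcount.pl", f"{examples}/charcount0.dat"],
    command="perl ./charcount.pl ./charcount0.dat",
    output="charcount0.out",
    local_dir=examples,
)
for c in cmds:
    print(c)
```

Called with the values from the slide, the function reproduces the four commands listed above.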
Run the above example:
/home/${USER}/apst/bin/apstd -d --port 7890 second.xml
For Globus: scp -> gass and ssh -> globus, i.e. the machine and port where the gatekeeper is running.
Run grid-proxy-init before starting apstd.
apst client program
You can use apst to examine your application's state; add, stop, or restart tasks; and add or disable resources.
/home/${USER}/apst/bin/apst --host localhost:7890 command
Accessing batch systems
pbs can be replaced with lsf, condor, or loadleveler.
Gridinfo tag
apstd daemon
Can be started with the --heuristic= option. The default is wq.
The XML file has sections describing disks, hosts, tasks, and information sources.
<disk>
Attributes: unique id, datadir.
Contains an access method element.
The access method is one of the storage mechanisms APST supports, such as Globus GASS, scp, ftp, sftp, or SRB.
<host>
Attributes: unique ID, cpus, disk, dnsname, memory, wd.
Access method: how the host is reached, for example ssh or Globus.
Batch queuing system: Condor, DQS, LoadLeveler, LSF, PBS, or SGE.
Batch queuing system attributes: account, memory, node, nodetype, queue, stdin, stdout, stderr, time, option.
<files>
Specifies input, output, and executable files.
Contains one or more file entries.
<file>
Input files may have a transfer attribute (yes or no), which states whether the file has to be transferred from the submitting machine.
Output files analogously have a download attribute. They may also have a size attribute giving the size of the output file, which is useful for scheduling decisions.
A file entry for an input file may contain a nested element that indicates the placement of copies of the file that you have pre-staged to remote disks. It has a disk attribute and a copy attribute.
<task>
Attributes: executable, id, groups, wd, arguments, input, stdin, stdout, stderr, priority, host, memory, cost.
<infosource>
Has an access method element.