Ganga Core: Status Jakub T. Moscicki ARDA/LHCb LHCb Software Week, September, 2005
2 Ganga Overview
[Diagram: Ganga4 connects applications to backends]
Applications: scripts, Gaudi, Athena
Backends: AtlasPROD, DIAL, DIRAC, LCG2, gLite, localhost, LSF
Job operations:
- prepare, configure
- store & retrieve job definition
- submit, kill
- get output, update status
- + split, merge, monitor
3 Release Schedule
Ganga 3: Mar, Apr
Ganga 4 alpha series: Apr alpha1, May alpha2, May alpha3, May alpha4, Jun alpha5, Jun alpha6, Jun alpha7, Jul alpha8
Ganga 4 beta series: Jul beta1, Aug beta2, Sep beta3, Sep beta4
beta series: fully operational, public pre-release
- bugfixes, testing, missing features
- stability: config files, repository backwards compatibility
audience:
- tested/used by ~10 users in LHCb, Atlas and outside
- everybody is encouraged to try it, no setup needed
alpha series: prototype with frequent and incompatible changes
audience: internal developers
discontinued
4 Testing/stability
Core testing:
- automatic test suite (61 test cases)
- unit tests / invariant tests / integration tests
- bugfix tests: "a bug report has a test-case" policy
- subsystem stubs (test submitters, transient repository)
Extensions testing:
- use-case tests in preparation (published as LHCb note)
Release compatibility:
- automatic repository regression testing
- GPI/config compatibility policies
5 Project Structure
Framework
- Submission logic
- Monitoring
- JobRepository
- FileWorkspace
- Utilities (config, logging)
Interfaces:
- interactive shell
- command line / scripts
- embedding / library
Plugins
- Applications
- Backends
- Datasets
6 Project Structure
Release area: /afs/cern.ch/sw/ganga/install/slc3_gcc323/4.0.0-beta4
  bin/ganga
  python/Ganga: core framework; Local, LSF, LCG, gLite backends; Executable application
  python/GangaLHCb: Gaudi, DIRAC plugins
  python/GangaAtlas: Athena, ADA plugins
[Configuration]
RUNTIME_PATH = GangaLHCb:/myarea/GangaAtlas
7 Configuration
Config file: ~/.ganga4
The default template is well documented.
Configurable features:
- plugin location
- hierarchical logger levels
- polling rate (15 seconds)
- repository configuration (local/remote)
- file workspace (job input/output location)
- VO software versions
- plugin specific parameters
Relevant command line options:
  -c cfgfile
  -o[Repository]type=Remote
  -o[Logging]GangaLHCb.Lib.Dirac=DEBUG
8 Command line
ganga -h
*** Welcome to Ganga ***
Version: Ganga beta4
Documentation and support:
Type help() or help('index') for online help.
usage: ganga [options] [script] [args]...
options:
  --version               show program's version number and exit
  -h, --help              show this help message and exit
  -i                      enter interactive mode after running script
  -cFILE                  read user configuration from FILE (default ~/.ganga4)
  -g, --generate-config   generate a default config file, backup the existing one
  -oEXPR, --option=EXPR   set configuration options, may be repeated multiple times, for example:
                            -o[Logging]Ganga.Lib=DEBUG -oGangaLHCb=INFO -o[Configuration]TextShell = IPython
                          FIXME: PATH-like variables are reset and not appended to
                          (this behaviour is different from config file behaviour)
  --quiet                 only ERROR messages are printed
  --very-quiet            only CRITICAL messages are printed
  --debug                 all messages including DEBUG are printed
  --no-prompt             never prompt interactively for anything except IPython (FIXME:)
  --no-rexec              rely on existing environment and do not re-exec ganga process
                          to setup runtime plugin modules (affects LD_LIBRARY_PATH)
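The -o expressions in the help text follow a "[Section]option=value" pattern. A minimal sketch of how such expressions could be split is given here in present-day Python 3. The helper names parse_option and parse_options are hypothetical and not part of Ganga, and the rule that an expression without a [Section] prefix reuses the previous section is an assumption read off the example "-o[Logging]Ganga.Lib=DEBUG -oGangaLHCb=INFO".

```python
import re

# Hypothetical helper, not Ganga code: "[Section]option=value", section optional.
_OPTION_RE = re.compile(
    r"^(?:\[(?P<section>[^\]]+)\])?(?P<option>[^=]+?)\s*=\s*(?P<value>.*)$"
)


def parse_option(expr, default_section=None):
    """Split one -o expression into (section, option, value)."""
    m = _OPTION_RE.match(expr.strip())
    if m is None:
        raise ValueError("cannot parse option expression: %r" % expr)
    section = m.group("section") or default_section
    return section, m.group("option").strip(), m.group("value").strip()


def parse_options(exprs):
    """Collect repeated -o expressions into {section: {option: value}}.

    Assumption: an expression without [Section] reuses the previous section.
    """
    config = {}
    current = None
    for expr in exprs:
        section, option, value = parse_option(expr, current)
        if section is None:
            raise ValueError("no section given for %r" % expr)
        current = section
        config.setdefault(section, {})[option] = value
    return config


if __name__ == "__main__":
    print(parse_options(["[Logging]Ganga.Lib=DEBUG", "GangaLHCb=INFO"]))
```

Keeping the section sticky makes repeated logger settings short on the command line, at the cost of making the order of the -o options significant.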
9 Interfaces
Interactive Shell
- IPython: coloring, history, editing, direct shell access
- Automatically generated GPI help index
Scripting
- ganga script.py
- interpreter:
    #!/bin/env ganga
    print jobs
Embedding
    from Ganga.Runtime import GangaProgram
    prog = GangaProgram()
    prog.bootstrap()
    from Ganga.GPI import *
10 GPI
Ganga Public Interface: GPI
- high-level, user-friendly Python API for job manipulation
- combines:
  - consistency and flexibility of a programming language interface
  - clarity and ease of use
[Diagram: Ganga.Core, GPI, and the clients GUI, CLIP, SCRIPT]
11 GPI Hello World
>>> job = Job()
>>> job.application.exe = '/bin/echo'
>>> job.application.parameters = ['hello world']
>>> job.submit()
submitting job
>>> outfile = file(job.directory + '/output/std.out')
>>> print outfile.read()
Job started at: Fri Feb 18 14:05:
Processing input files...
/bin/echo
Done
hello world
Application executed with the status code 0
Processing output files...
Exiting...
Job finished at: Fri Feb 18 14:05:
>>> job2 = job.copy()
>>> job2.backend = "LSF"
>>> job2.submit()
12 GPI Inspecting the jobs
>>> print job.id
5
>>> print jobs
Statistics: 5 jobs
jobs
  ID   status      name
# 1    completed
# 2    new         Job
# 3    completed   Job
# 4    submitted   Job
# 5    completed   Job
>>> for j in jobs[1:3]:
...     print j.id
1
2
13 GPI Complex scenarios
>>> j = Job()
>>> j.application = DaVinci()
>>> j.application.options = 'my.opts'
>>> j.backend = Glite()
>>> j.backend.requirements = 'other.GlueCEUniqueID == "grid-ce.desy.de:2119/jobmanager-lcgpbs-short"'
>>> for i in range(100):
...     j = Job()
14 Ganga Tool vs Framework
Ganga is a lightweight user tool
- easy to install (pure-python)
- "designed and optimized" for users
- GPI has a syntax (users have to judge): j.application, j.backend, j.id, j.submit(), etc.
But also: Ganga is a developer framework
- Plugin model: independent and rapid development of handlers (backends, applications)
- Promote but do not force common GPI abstractions
- We neither require nor invent abstract base classes which are least common denominators between systems. Example:
  - you may implement a very complex application (e.g. ADA) and enable submission to DIAL only, if that is your main case
  - the design of the framework does not attempt to match all possible applications with all possible backends
- But: it enables building common tools on top of GPI: GUI, scripts, ...
15 Some Design Principles
Avoid shared environment
- example: in the LCG environment, LD_LIBRARY_PATH is incompatible with some application environments
- solution: the LCG backend handler uses a private, cached environment
Don't force common abstractions upfront
- applications and backends are connected via adapters (runtime handlers)
- in most cases adapters are shared (thus their number is reduced)
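16 Adapters: 7 vs 11 vs 20
[Table: backends (rows) against applications (columns); cell values as extracted]
Columns: ADA | Athena | Gaudi (DaVinci, Gauss, Boole, …) | executable (any script)
LCG/glite: X 6 3
DIAL: 7 X X X
DIRAC: X X 4 X
LSF: X
Localhost: X 5 2 1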
17 Summary
Factsheet (4-0-0-beta4)
- size:
  - Ganga base: ~400KB, pure-python (no install)
  - Atlas and LHCb extensions: ~100KB
- existing functionality:
  - basic job manipulation via GPI
  - easy configuration / extension
  - local and remote registry, local workspace
  - Lib:
    - local host, LSF, LCG2, DIRAC, glite
    - Gaudi (DaVinci, Gauss, ...), Athena, DIAL, Ada
- future functionality:
  - GUI
  - splitting/merging
  - asynchronous job submission (remote job manager)
19 Backup Slides
20 Ganga Architecture
[Diagram: Ganga.Core with GPI, GUI and CLIP]
GPI example: j = Job(backend='LSF'); j.submit()
Components: Job Repository, File Workspace (IN/OUT SANDBOX), Monitoring
Plugin Modules:
- backends: AtlasPROD, DIAL, DIRAC, LCG2, gLite, localhost, LSF
- applications: Athena, Gaudi
21 Ganga Object Model
22 Gaudi Application Object
class Gaudi(GangaObject):
    _schema = Schema(Version(1,0), {
        'optsfile': FileItem(),
        'version': SimpleItem(None),
        'platform': SimpleItem(None),
        'package': SimpleItem(None),
        'appname': SimpleItem(None),
        'cmt_release_area': SimpleItem(None),
        'cmt_user_path': SimpleItem(None),
        'masterpackage': SimpleItem(None),
        'extraopts': SimpleItem(None)})
    _category = 'applications'
    _name = 'Gaudi'
    def _auto__init__(self):
        ...
    def configure(self):
        ...
        extra_cfg = GaudiExtras()
        extra_cfg.flatopts = FileParser.writeString(gaudiopts, "expand")
        return (modified, extra_cfg)
    def list_choices(self, property):
        ...
23 Job Submit
24
class GaudiLFSRunTimeHandler:
    def prepare(self,app,extra):
        (algpack,alg,algver)=app.masterpackage.split('/',3)
        # job script template; ###NAME### markers are filled in below
        script="""#!/usr/bin/env bash
export CMTPATH=###CMTUSERPATH###
export ###THEAPP###_release_area=###CMTRELEASEAREA###
if [ -f ${LHCBHOME}/scripts/ProjectEnv.sh ]; then
  . ${LHCBHOME}/scripts/ProjectEnv.sh ###THEAPP### ###VERSION###
else
  echo "Could not find the ProjectEnv.sh script. Your job will probably fail"
fi
mkdir -p cmttemp/v1/cmt
cat >cmttemp/v1/cmt/requirements <<EOF
use ###ALG### ###ALGVER### ###ALGPACK###
EOF
cmt setup -sh -quiet -pack=cmttemp -version=v1 -path=$PWD >cmttemp/v1/cmt/setup.sh
. cmttemp/v1/cmt/setup.sh
$###THEAPP###_release_area/###APPUPPER###/###APPUPPER###_###VERSION###/###PACKAGE###/###THEAPP###/###VERSION###/###PLATFORM###/###THEAPP###.exe myopts.opts
"""
        script=script.replace('###CMTUSERPATH###',app.cmt_user_path)
        script=script.replace('###THEAPP###',app.appname)
        script=script.replace('###CMTRELEASEAREA###',app.cmt_release_area)
        script=script.replace('###VERSION###',app.version)
        script=script.replace('###ALG###',alg)
        script=script.replace('###ALGVER###',algver)
        script=script.replace('###ALGPACK###',algpack)
        # Python strings provide upper(), not toupper()
        script=script.replace('###APPUPPER###',app.appname.upper())
        script=script.replace('###PACKAGE###',app.package)
        script=script.replace('###PLATFORM###',app.platform)
        return {'jobscript': ('myscript',script),
                'inputbox':[('myopts.opts',extra.flatopts)]}
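The handler above builds its job script by replacing ###NAME### markers one at a time. A compact variant of the same technique is sketched here in present-day Python 3. The function fill_template is a hypothetical helper, not part of Ganga, and the check for leftover markers is an addition that the slide code does not perform.

```python
import re


def fill_template(template, values):
    """Replace every ###KEY### marker with values[KEY].

    Hypothetical helper: unlike the slide code, it raises KeyError if any
    marker is left unfilled, so a forgotten key cannot reach the job script.
    """
    text = template
    for key, value in values.items():
        text = text.replace("###%s###" % key, str(value))
    leftover = re.findall(r"###([A-Z_]+)###", text)
    if leftover:
        raise KeyError("unfilled markers: %s" % ", ".join(sorted(set(leftover))))
    return text


if __name__ == "__main__":
    template = "export CMTPATH=###CMTUSERPATH###\n###THEAPP###.exe myopts.opts\n"
    print(fill_template(template, {"CMTUSERPATH": "/home/user/cmt", "THEAPP": "DaVinci"}))
```

Plain string replacement keeps shell syntax such as ${LHCBHOME} and $PWD untouched, which is presumably why a distinctive marker like ###NAME### is used instead of Python's own formatting syntax.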
25 LSF Submit (1)
def submit(self,jobid, jobconfig):
    inw = FileWorkspace.InputWorkspace()
    outw = FileWorkspace.OutputWorkspace()
    logger.info('LSF: submitting job %d',jobid)
    inw.create(jobid)
    outw.create(jobid)
    scriptpath = self.preparejob(jobid,jobconfig,inw,outw)
    # FIXME: grabbing stdout is done by shell magic and probably should be implemented in python directly
    rc,soutfile = shell_cmd('cd %s; bsub %s' % (inw.getPath(),scriptpath))
    if rc == 0:
        sout = file(soutfile).read()
        import re
        # extract the LSF job identifier and the queue name from the bsub output
        m = re.compile(r"^Job <(?P<id>\d*)> is submitted to (\S*) queue <(?P<queue>\S*)>.", re.M).search(sout)
        if m is None:
            logger.warning('could not match the output and extract the LSF job identifier!')
            logger.warning('command output \n %s ',sout)
        else:
            self.id = m.group('id')
            queue = m.group('queue')
            if self.queue != queue:
                # warn before overwriting, so that the requested queue is still reported
                logger.warning('you requested queue "%s" but the job was submitted to queue "%s"',self.queue,queue)
                logger.warning('command output \n %s ',sout)
                self.queue = queue
        logger.info('job %d submission OK',jobid)
    return rc == 0
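The submit method reads the job identifier and queue back from the bsub output through the named groups 'id' and 'queue'. The following Python 3 sketch isolates that step so it can be tried without LSF. The function name parse_bsub_output and the sample output line are assumptions made for illustration; the real bsub message may differ between LSF versions.

```python
import re

# Named groups 'id' and 'queue', as used by m.group('id') and m.group('queue')
# in the submit method. The trailing '.' matches the full stop ending the message.
SUBMIT_RE = re.compile(
    r"^Job <(?P<id>\d*)> is submitted to (\S*) queue <(?P<queue>\S*)>.", re.M
)


def parse_bsub_output(sout):
    """Return (job_id, queue) from bsub output, or None if nothing matches."""
    m = SUBMIT_RE.search(sout)
    if m is None:
        return None
    return m.group("id"), m.group("queue")


if __name__ == "__main__":
    # Assumed sample line, not captured from a real LSF installation.
    sample = "Job <12345> is submitted to default queue <8nm>.\n"
    print(parse_bsub_output(sample))
```

Returning None instead of raising matches the slide code, which only logs a warning when the output cannot be understood.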
26 LSF Submit (2)
def preparejob(self,jobid,jobconfig,inw,outw):
    appscriptpath = inw.writefile(jobconfig['jobscript'],executable=1)
    # put files into job workdir (also to protect the originals while the job is running)
    sharedinputbox = map(lambda f: inw.writefile(f), jobconfig['inputbox'])
    sharedoutputbox=outw.getPath()
    print sharedoutputbox
    # wrapper script executed on the worker node; it needs os and sys as well as shutil
    text = """#!/usr/bin/env python
import os, sys, shutil
sharedinputbox = ###SHAREDINPUTBOX###
sharedoutputbox= ###SHAREDOUTPUTBOX###
for fn in sharedinputbox:
    shutil.copy(fn,'.')
s = os.system('###APPSCRIPTNAME###')
print 'DEBUG: Job finished with exit code: ',s
if s == 0:
    for fn in os.listdir('.'):
        if not os.path.isdir(fn):
            shutil.copy(fn,sharedoutputbox) # FIXME: needs recursive copy
sys.exit(s)
"""
    text = text.replace('###SHAREDINPUTBOX###',repr(sharedinputbox))
    text = text.replace('###APPSCRIPTNAME###',appscriptpath)
    text = text.replace('###SHAREDOUTPUTBOX###',repr(sharedoutputbox))
    return inw.writefile(('__jobscript__',text),executable=1)
27 Job Submit Sequence
28 Files/Job Repository
File Workspace
  ~/__Ganga4__/workspace/input/*
  ~/__Ganga4__/workspace/output/*
Job Repository
  ~/__Ganga4__/repository/ganga_user
29 LSF backend object
class LSF(GangaObject):
    _schema = Schema(Version(1,0), {
        'queue'  : SimpleItem(defvalue='8nm'),
        'id'     : SimpleItem(defvalue=None,protected=1,copyable=0),
        'status' : SimpleItem(defvalue=None,protected=1,copyable=0) })
    _category = 'backends'
    _name = 'LSF'
    def __init__(self):
        super(LSF,self).__init__()
30 LSF Monitoring
def updateMonitoringInformation(jobs):
    rc,soutfile = shell_cmd('bjobs -a',allowed_exit=[0,255])
    sout = file(soutfile).read()
    if rc == 0:
        import re
        # sanity check: the header line of the bjobs listing must be present
        m1 = re.compile(r"JOBID\s+USER\s+STAT\s+QUEUE").search(sout)
        if not m1:
            logger.warning('problem with understanding the bjobs output:\n%s',sout)
        else:
            # one tuple per listing row: (id, space, user, space, status)
            items = re.compile(r"^(?P<id>\d+)(\s*)(\S*)(\s*)(\S*)", re.M).findall(sout)
            ids = map(lambda x: x[0], items)
            for j in jobs:
                try:
                    idx = ids.index(j.backend.id)
                    new_status = items[idx][4]
                    if j.backend.status != new_status:
                        logger.info('%d: LSF job status changed to %s',j.id,new_status)
                        j.backend.status = new_status
                        if j.backend.status == 'DONE' or j.backend.status == 'ERROR':
                            j.status = "completed"
                except ValueError:
                    pass
updateMonitoringInformation = staticmethod(updateMonitoringInformation)
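The monitoring function checks for the bjobs header line and then pulls the job id and status from each row. A standalone Python 3 sketch of that parsing step follows. The function name parse_bjobs and the sample listing are made up for illustration, and a dictionary is used in place of the list lookup on the slide.

```python
import re

HEADER_RE = re.compile(r"JOBID\s+USER\s+STAT\s+QUEUE")
# One row per job: numeric id, user name, status; the remaining columns are ignored.
ROW_RE = re.compile(r"^(?P<id>\d+)\s+(?P<user>\S+)\s+(?P<stat>\S+)", re.M)


def parse_bjobs(sout):
    """Map LSF job id (string) to status for every row of a bjobs listing."""
    if not HEADER_RE.search(sout):
        raise ValueError("problem with understanding the bjobs output")
    return {m.group("id"): m.group("stat") for m in ROW_RE.finditer(sout)}


if __name__ == "__main__":
    # Made-up listing in the column layout that the header check expects.
    sample = (
        "JOBID   USER    STAT  QUEUE  FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME\n"
        "1001    alice   DONE  8nm    host1      host2      hello     Sep 12 10:00\n"
        "1002    alice   RUN   8nm    host1      host3      analysis  Sep 12 10:05\n"
    )
    print(parse_bjobs(sample))
```

A dictionary keeps each status lookup independent of row order; jobs that are missing from the listing are simply absent from the result, which corresponds to the ValueError branch that the slide code silently ignores.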
31 Hello CLI
Hello World:
  # execute hello script locally
  from Ganga.CLI import *
  Job(exe='hello').submit()
Hello DaVinci:
  # execute DaVinci on the LSF, GRID,...
  # analysis will start at a worker node somewhere far far away ;-)
  j = Job(name='serious analysis',backend='LSF')
  j.application = DaVinciApplication(version='v12r3')
  j.application.optsfile = "DV-demo.opts"
  j.outputfiles = ["DVNtuples.hbook"]
  j.submit()
32 Jobs
# registry of persistent jobs
jobs()
Statistics: 2 jobs
registry
  ID   status      name
# 1    new         serious analysis
# 2    submitted   hello
# looping and selecting jobs
j = jobs()[1]
for j in jobs():
    print j
for j in jobs()[2:9]:
    j.name = 'important!'
important = jobs()['important!']
33 Plugin Components
Applications & Backends
# list plugin components
backends()
['TestSubmitter', 'Local', 'Glite']
applications()
['DaVinciApplication', 'TestApplication', 'Executable']
# creating objects
app = DaVinciApplication(optionsfile='some.opts')
bk = Local()
j.application, j.backend = app, bk
# creating objects by a string name
j.application = 'DaVinciApplication'
j.application.optionsfile = 'some.opts'
j.backend = 'Local'
34 Templates and Copying
Copy jobs
# reuse existing jobs configuration to create new jobs
j = other_job.copy()
j = Job(template = other_job)
Job templates
# job templates are just like any other jobs
# except that their sole purpose is to store job configuration
t = JobTemplate(backend=LSF(queue='8nm'))
j = Job(template = t)
# templates are stored in a separate container
templates()
Statistics: 1 jobs
templates
  ID   status     name
# 1    TEMPLATE   None
35 Design Principles
CLI Design Principles
Be predictable and follow the Python way of thinking
Increase complexity of interface with complexity of task:
- Simple tasks: simple!
- Complicated tasks: also simple ;) !
Try to prevent users from making silent mistakes:
  job.id = 5 # FAILS: id is a read-only property
  finished_job.name = 'newname' # FAILS: job is finished so can't modify
Hide implementation:
  job._impl.attrs['id'] = 5
Be convenient and guide users:
  j.application.exe
  j.exe # ALIASES of properties
  TAB completion shows properties and hides internals
Be flexible: good for writing complex macros/scripts...
36 Ganga Architecture
[Diagram: Client, Ganga.Core with GPI, GUI and CLIP]
GPI example: j = Job(backend='LSF'); j.submit()
Components: Job Repository, File Workspace (IN/OUT SANDBOX), Monitoring
Plugin Modules:
- backends: AtlasPROD, DIAL, DIRAC, LCG2, gLite, localhost, LSF
- applications: Athena, Gaudi