AliEn Job Merging
Pablo Saiz
CAF and Grid User Forum
Job Merging
Next logical step after job splitting
  See http://indico.cern.ch/conferenceDisplay.py?confId=31167
Concatenates the results of all subjobs of a given masterjob
New status if a masterjob needs merging:
  INSERTED → SPLITTING → SPLIT → MERGING → DONE
The merging is itself another job
  It waits in the queue like any other job
13 September 2018 pablo.saiz@cern.ch
1 image is better than 1000 words
[Figure: the user JDL is split into subjobs 1 to n, each producing histo.root and analysis.log. A "Merge Histo" job combines the histo.root files into AllHisto.root. A "Merge Logs" job targeting Alllogs.txt is marked ERROR!!. Time axis: INSERTED, SPLIT, MERGING, DONE.]
How to specify Merging
In the JDL of the masterJob:
  Merge={"<input>:<jdl>:<output>" (,"<input2>:<jdl2>:<output2>")* }
  MergeOutputDir="/path/where/you/want/the/output";
    Default: /proc/<user>/<masterid>/merge
AliEn will do:
  submit <jdl> <masterJobId> <input> <output> <user> <procdir> <outputdir>
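The mapping from one Merge entry to the submit call above can be sketched in Python. This is only an illustration of the argument order shown on the slide, not AliEn code: the names MergeSpec, parse_merge_entry, default_output_dir and merge_submit_args are hypothetical, and the user, masterjob id and procdir values in the usage lines are made up.

```python
from typing import List, NamedTuple, Optional


class MergeSpec(NamedTuple):
    """One entry of the Merge field: "<input>:<jdl>:<output>"."""
    input_file: str   # file produced by every subjob, e.g. histo.root
    jdl: str          # merging JDL, e.g. /alice/jdl/mergerootfile.jdl
    output_file: str  # name of the merged file, e.g. AllHisto.root


def parse_merge_entry(entry: str) -> MergeSpec:
    """Split "<input>:<jdl>:<output>" into its three parts."""
    parts = entry.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected '<input>:<jdl>:<output>', got {entry!r}")
    return MergeSpec(*parts)


def default_output_dir(user: str, master_id: int) -> str:
    """Default from the slide: /proc/<user>/<masterid>/merge."""
    return f"/proc/{user}/{master_id}/merge"


def merge_submit_args(entry: str, master_id: int, user: str, procdir: str,
                      outputdir: Optional[str] = None) -> List[str]:
    """Build: submit <jdl> <masterJobId> <input> <output> <user> <procdir> <outputdir>."""
    spec = parse_merge_entry(entry)
    if outputdir is None:
        # MergeOutputDir was not given in the JDL, so fall back to the default.
        outputdir = default_output_dir(user, master_id)
    return ["submit", spec.jdl, str(master_id), spec.input_file,
            spec.output_file, user, procdir, outputdir]


# Hypothetical values, for illustration only.
args = merge_submit_args(
    "histo.root:/alice/jdl/mergerootfile.jdl:AllHisto.root",
    master_id=1234, user="psaiz", procdir="/proc/psaiz/1234")
print(" ".join(args))
```

A Merge field with several entries would yield one such command per entry.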
How to start the merging
Automatically:
  When all the subjobs are in a final state, AliEn submits the merging job
masterJob <id> merge
  Forces the merging of the subjobs that have finished
By hand:
  submit <jdl> <masterJobId> <input> <output> <user> <procdir> <outputdir>
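The automatic trigger condition can be sketched as a small Python check. It takes the final states to be ERROR and DONE, as listed in the conclusions of this talk. The function names are hypothetical, the RUNNING status and the subjob ids are only illustrative, and reading "subjobs that have finished" as "subjobs in DONE" is an assumption.

```python
from typing import Dict, Iterable, List

# Final states as named in the conclusions: ERROR or DONE.
FINAL_STATES = frozenset({"DONE", "ERROR"})


def ready_to_merge(subjob_states: Iterable[str]) -> bool:
    """True when there is at least one subjob and all are in a final state."""
    states = list(subjob_states)
    return bool(states) and all(state in FINAL_STATES for state in states)


def finished_subjobs(subjobs: Dict[int, str]) -> List[int]:
    """Subjob ids a forced merge would use (assumption: those in DONE)."""
    return sorted(job_id for job_id, state in subjobs.items() if state == "DONE")


# Illustrative subjob ids and states.
subjobs = {101: "DONE", 102: "ERROR", 103: "RUNNING"}
print(ready_to_merge(subjobs.values()))  # subjob 103 has not reached a final state
print(finished_subjobs(subjobs))
```

The same helper for finished subjobs would apply when the merge is forced with masterJob <id> merge before every subjob has reached a final state.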
Existing merging JDLs
/alice/jdl/mergerootfile.jdl
/alice/jdl/mergerootfile-sequential.jdl
  No requirements. The merging can be executed anywhere!
User defined
  Variations of the previous JDLs
There is no merging for text files: needed?
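If merging of text files turns out to be needed, a user-defined merging job could be as simple as a concatenation. The Python sketch below is a hypothetical illustration, not an existing AliEn merger: the function name merge_text_files, the header line written before each input, and the file names in the usage lines are all assumptions.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def merge_text_files(inputs: Iterable[Path], output: Path) -> int:
    """Concatenate text files into one, with a header line per input.

    Returns the number of input files written.
    """
    count = 0
    with output.open("w", encoding="utf-8") as out:
        for path in inputs:
            out.write(f"==== {path} ====\n")
            text = path.read_text(encoding="utf-8")
            out.write(text)
            if text and not text.endswith("\n"):
                out.write("\n")  # keep the next header on its own line
            count += 1
    return count


# Usage with made-up log files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    logs = []
    for i in (1, 2):
        log = base / f"analysis_{i}.log"
        log.write_text(f"subjob {i} finished\n", encoding="utf-8")
        logs.append(log)
    merged = base / "Alllogs.txt"
    print(merge_text_files(logs, merged), "files merged")
```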
ToDo
Given a masterjobid, follow up the status of the merging
Automatically put requirements on the execution site for the merging
More documentation in the bible?
Conclusions
Merging collects the output of subjobs into a single file
Performed when all the subjobs are in a final state: ERROR or DONE
Can also be triggered manually
Documentation will be added to the bible