Analysis Trains - Reloaded
Andrei Gheata, Costin Grigoras, Jan Fiete Grosse-Oetringhaus
Idea
- Presented in the offline meeting in June and the offline week that followed
- Manage trains using MonALISA
- Users register wagons
- Train operators compose trains
- Automatic testing per wagon
- Train file generation
- Submission managed by ML (existing LPM infrastructure)
Configuration & Testing
- Train configuration
  - New class AliAnalysisTaskCfg
  - Contains the description of wagons (AddTask macro, libraries, dependencies)
  - Reads/writes a text file format (used to read the train configuration from ML)
- Testing
  - Uses the alientest04 machine
  - Downloads AliEn packages (ROOT, AliRoot)
  - Copies part of the input data set to local disk
  - Runs tests per wagon
  - Uses syswatch to extract memory/CPU information
  - Also tests an empty "baseline" task
[Diagram: train with wagons Phys Sel, Centr Sel, User A, User B, User C]
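To make the wagon description concrete, here is a minimal sketch of a record that holds the three items listed above and round-trips through a text format. It is not AliAnalysisTaskCfg: the class name `WagonConfig`, the `key: value` layout, and the example macro and library names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WagonConfig:
    """Hypothetical stand-in for one wagon description (not AliAnalysisTaskCfg)."""
    name: str
    macro: str                                        # location of the AddTask macro
    libraries: list = field(default_factory=list)     # libraries the wagon needs
    dependencies: list = field(default_factory=list)  # wagons that must run first

    def to_text(self) -> str:
        """Serialize to 'key: value' lines (this layout is an assumption)."""
        return "\n".join([
            f"name: {self.name}",
            f"macro: {self.macro}",
            f"libraries: {','.join(self.libraries)}",
            f"dependencies: {','.join(self.dependencies)}",
        ])

    @classmethod
    def from_text(cls, text: str) -> "WagonConfig":
        """Parse the text produced by to_text()."""
        fields = {}
        for line in text.splitlines():
            if not line.strip():
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()

        def split(value):
            return [item for item in value.split(",") if item]

        return cls(
            name=fields["name"],
            macro=fields["macro"],
            libraries=split(fields.get("libraries", "")),
            dependencies=split(fields.get("dependencies", "")),
        )


# Hypothetical wagon, named after the diagram labels.
wagon = WagonConfig("UserA", "AddTaskUserA.C", ["libUserA"], ["PhysSel", "CentrSel"])
restored = WagonConfig.from_text(wagon.to_text())
```

A plain text format keeps the configuration easy to store and display on the monitoring side and easy to diff between train runs.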
Workflow
[Diagram: User, Train operator, LPM / MonALISA, AliEn, Test machine; arrows labelled "config", "test results", "train files"]
1. User adds wagons
2. Train operator composes train
3. Generates test files + executes test
4. Recompose after test
5. Generates train JDL + scripts
6. Runs train
Screenshot
- Handler configuration
- Wagon configuration
- Data configuration
- Testing and running status
Handler
Wagon
Dataset
Run
Syswatch
Demo…
Enough theory, let's do some clicking…
http://alimonitor.cern.ch/trains
Some More Details
- Train runs with an analysis tag
  - All code + "AddTask" macro has to be in the tag (no par file!)
- Output stored in the input data directory (like AOD, QA trains), e.g.:
  /alice/data/2010/LHC10h/000137366/ESDs/pass2/PWG4/CorrelationTrain/7_20111117_1350
- Current infrastructure only allows per-run merging
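The example path can be read as input directory + PWG + train name + run label. The sketch below rebuilds it under that reading. Treating the leading "7" as a train run number and "20111117_1350" as a start timestamp is an assumption, and `train_output_dir` is a hypothetical helper, not part of the train infrastructure.

```python
from datetime import datetime


def train_output_dir(input_dir, pwg, train_name, run_id, started):
    """Compose an output directory below the input data directory.

    Assumed layout: <input_dir>/<pwg>/<train_name>/<run_id>_<YYYYMMDD_HHMM>
    """
    stamp = started.strftime("%Y%m%d_%H%M")
    return f"{input_dir.rstrip('/')}/{pwg}/{train_name}/{run_id}_{stamp}"


path = train_output_dir(
    "/alice/data/2010/LHC10h/000137366/ESDs/pass2",
    "PWG4",
    "CorrelationTrain",
    7,
    datetime(2011, 11, 17, 13, 50),
)
print(path)
```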
Open Issues
- ROOT
  - A fix in TGridJDL was required. It is in v5-30-00-patches but not yet deployed on the Grid. Needed for train operation
- AOD analysis
  - Found a huge leak even in an empty analysis (20 kB/event)
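A per-event leak figure like the one quoted above can be estimated as the slope of memory versus processed events. The sketch below fits that slope by least squares. The input format (pairs of event count and memory in kB) and the sample values are synthetic and assumed for illustration; this does not parse real syswatch output.

```python
def leak_per_event(samples):
    """Least-squares slope of memory (kB) versus number of processed events.

    samples: list of (events_processed, memory_kb) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("all samples were taken at the same event count")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx


# Synthetic samples: 500 MB at start, growing by 20 kB per event.
samples = [(events, 500_000 + 20 * events) for events in (0, 1000, 2000, 3000)]
slope = leak_per_event(samples)
```

A fitted slope is less sensitive to a single noisy reading than the difference between the first and last sample.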
Old Slides
Idea
- Setting up and operating analysis trains is a lot of work
  - Specific settings for each wagon
  - Wagons have bugs, leaks, etc.
- Automatic configuration needed
- Automatic testing needed (on a subset of the same data the train will run on)
- We have collected some ideas that we want to try out, starting with the PWG4 train
High Level Description
- Train runs on an analysis tag (no modifications allowed)
- User registers task
- Train operator triggers train test
- Test results are fed back to MonALISA, where the user & operator can see them
- Operator starts train with tasks that succeeded and have no (significant) leaks
- These steps are operated from MonALISA
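The operator's choice of wagons can be expressed as a simple filter over the test results. The sketch below keeps wagons that succeeded and whose leak, relative to the baseline, stays under a threshold. The record fields, the threshold, and all numbers are hypothetical; the slides do not define what counts as a "significant" leak.

```python
from dataclasses import dataclass


@dataclass
class WagonTestResult:
    """Hypothetical summary of one per-wagon test."""
    wagon: str
    succeeded: bool
    leak_kb_per_event: float


def select_wagons(results, baseline_leak, max_excess_leak):
    """Return names of wagons that passed and leak at most
    max_excess_leak kB/event more than the baseline test."""
    return [
        r.wagon
        for r in results
        if r.succeeded and r.leak_kb_per_event - baseline_leak <= max_excess_leak
    ]


# Made-up test results for illustration only.
results = [
    WagonTestResult("UserA", True, 21.0),
    WagonTestResult("UserB", False, 20.0),   # test failed
    WagonTestResult("UserC", True, 120.0),   # leaks well above baseline
]
selected = select_wagons(results, baseline_leak=20.0, max_excess_leak=5.0)
```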
Some Technical Details
- Container that contains task configuration (already shown by Andrei)
  - Currently identified configuration items
    - Location of AddTask macro + parameters
    - Required libraries
    - Tasks that have to run before
- Train testing
  - Tasks are tested one by one
  - On subset of data on which the train will run
  - CPU/real time, memory extracted w.r.t. baseline
  - Baseline from test with just PhysSel + Centrality
- Train macro generation
  - By analysis framework using the wagons selected by the operator
  - Macros for testing (wagon by wagon)
  - Macros for full train (all wagons)
- Overall train submission
  - Using the already existing ML submission framework (including merging jobs)
[Diagram: train with wagons Phys Sel, Centr Sel, User A, User B, User C]
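The "tasks that have to run before" item implies an ordering constraint when the train macro is generated. One way to satisfy it is a topological sort, sketched below with Python's standard `graphlib` module (Python 3.9+). This illustrates the idea and is not the analysis framework's actual method; the wagon names come from the diagram and the dependency map is assumed.

```python
from graphlib import TopologicalSorter


def order_wagons(dependencies):
    """Order wagons so that each one comes after the wagons it depends on.

    dependencies: dict mapping wagon name -> list of wagons that must run before it.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Assumed dependency map for illustration.
dependencies = {
    "PhysSel": [],
    "CentrSel": ["PhysSel"],
    "UserA": ["PhysSel", "CentrSel"],
    "UserB": ["CentrSel"],
}
order = order_wagons(dependencies)
```

The relative order of independent wagons (here UserA and UserB) is not fixed by the constraints, so only the "before" relations should be relied on.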