LHCb Computing: Manpower requirements

Disclaimer
- In the absence of a manpower planning officer, all FTE figures in the following slides are approximate
- In particular, there may be omissions (at the level of FTEs) in the tables describing currently available effort

Long-term computing project responsibilities and needs
  - Project management (2 FTE)
    - Coordination, planning (resources, activities, development), liaison with outside bodies (WLCG, RRB, LHC experiments)
  - Software engineering support (4 FTE)
    - Code and release management, nightly builds, software performance infrastructure, user environment, tutorials, documentation
  - Central infrastructure support (1 FTE)
    - VO management, CERN-IT liaison, Web services, Vidyo
  - Applications coordination, maintenance, integration (6 FTE)
    - Framework maintenance (Gaudi, Persistency, Event model etc.)
    - Conditions database development, coordination, deployment
    - Physics applications release planning, integration, performance and regression testing, validation
      - Gauss, Boole, Brunel, DaVinci, Moore, Event display etc.
  - Computing operations (8 FTE)
    - Production planning, production management, data management, grid operations, user support
  - Distributed computing software maintenance (8 FTE)
    - Dirac+Ganga coordination and integration, book-keeping, databases, production tools, monitoring, accounting

Manpower currently committed to core activities

Country            FTE
Brazil             0.4
France             0.5
Germany            0.6
Italy              3.5
Russia             1.1
Spain              1.5
CERN               8
Switzerland        0.5
Netherlands        1.0
United Kingdom     5.0
United States      0.5
TOTAL             22.6  (cf. 29 needed)

Current manpower
- Current manpower insufficient to cover core activities
  - Estimate 29 FTE needed, 22.6 FTE available
    - Some activities not covered (see next slide)
- Very little manpower available for non-core activities
  - ~4 FTE at CERN in principle working on Gaudi and Dirac software development
    - In practice making up some of the missing manpower above
  - Small pockets of effort in various countries, for example:
    - Spain (DIRAC development)
    - Italy, UK, CERN (Data Preservation and Outreach)
    - Italy, Netherlands (Multicore R&D)
- Barely sufficient to keep our software and computing abreast of evolving technology

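The 29 FTE and 22.6 FTE totals quoted above can be cross-checked against the per-activity and per-country figures on the preceding slides. The following is a minimal sketch of that arithmetic only; the dictionary and variable names are illustrative and not part of any LHCb tool, and the figures are the approximate ones from the slides.

```python
# Cross-check of the FTE totals quoted on the slides.
# Names (needed, committed, ...) are illustrative assumptions.

# FTE needed per long-term responsibility (responsibilities slide).
needed = {
    "Project management": 2,
    "Software engineering support": 4,
    "Central infrastructure support": 1,
    "Applications coordination, maintenance, integration": 6,
    "Computing operations": 8,
    "Distributed computing software maintenance": 8,
}

# FTE currently committed to core activities, per country (table slide).
committed = {
    "Brazil": 0.4,
    "France": 0.5,
    "Germany": 0.6,
    "Italy": 3.5,
    "Russia": 1.1,
    "Spain": 1.5,
    "CERN": 8,
    "Switzerland": 0.5,
    "Netherlands": 1.0,
    "United Kingdom": 5.0,
    "United States": 0.5,
}

total_needed = sum(needed.values())
# Round to one decimal place to hide floating-point noise in the sum.
total_committed = round(sum(committed.values()), 1)
shortfall = round(total_needed - total_committed, 1)

print(f"Needed:    {total_needed} FTE")
print(f"Committed: {total_committed} FTE")
print(f"Shortfall: {shortfall} FTE")
```

Under these figures the core-activity shortfall is 6.4 FTE, before counting any new activities.
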
New activities
- Core activities not covered by existing manpower
  - e.g. documentation, tutorials, event display, software validation, performance and regression testing
- Software improvement activities for upgrade conditions
  - Application software development
    - e.g. coordination of GPU activities, frameworks for multicore, adoption of Root6
  - Software optimisation
    - e.g. vectorisation, architecture-dependent compilation, C++11
  - Data Management
    - e.g. use of data federations, data popularity, remote access to data, event indices, optimisation of Root I/O
  - Distributed Computing
    - e.g. virtualisation, interfaces to clouds, multicore queues, DIRAC scalability
- Data preservation and open access
- Preliminary estimate: a further 10 FTEs needed

What do other experiments do?
- Atlas, CMS, Alice all have some core activities covered by M&O A (either cash or in-kind manpower contribution)
    - Software engineering support
    - Central productions and operation
    - Central infrastructure support
  - Atlas + CMS: ~2 MCHF (or ~20 FTE, largely "in kind")
  - Alice: 0.5 MCHF
  - (LHCb: 170 kCHF for subsistence)
- In addition:
  - Atlas itemises all computing contributions under M&O B
    - 171 FTEs in 2013
  - CMS finances additional core computing manpower at CERN through M&O B
    - 8 FTEs
- All have formal agreements on where manpower comes from.

Observations
- Manpower currently devoted to operations is incompressible
  - Compares very favourably with situation in GPDs
    - BUT many tasks do not scale with collaboration size
    - AND data handling for LHCb in upgrade comparable to GPDs in Run 1
- Funding for computing resources is (at best) following a constant budget
  - Growth per CHF follows Moore's law only if the software is optimised for new architectures
  - Growth of LHCb requirements is steeper than Moore's law
- Major evolution of computing model and software required
  - Requires significant injection of new manpower
    - Initially for coordination and R&D
    - Subsequently for deployment and operations

Possible scenario
- Divide computing project into a number of work packages
  - Each including organisational, development and support components
  - Work in progress
- Ask individual groups (or countries) to volunteer responsibility for one or more work packages
  - Each contribution: team of several people
    - Size will depend on work package, but 1-2 FTEs will be a minimum viable contribution
  - Similar to sub-detector organisation
    - Resulting in a document describing sharing of responsibilities
  - Precise sharing of responsibilities is essential
    - Best effort, as now, is not enough for the upgrade
- Add up contributions. If there is a large shortfall, may need to introduce charging for computing
  - e.g. funding of core software services through M&O A contributions, either in-kind or through a 'tax'
- Principle to be discussed in CB this week

Other news
- Ricardo Graciani's mandate as computing resources coordinator is over
  - Concezio Bozzi (Ferrara) has agreed to take on this responsibility