1
SFT Group Review: Additional projects, future directions and overall planning
- SPI project
- Multi-core R&D (WP8)
- Virtualization R&D (WP9)
- Other projects
- Vision
- Planning
September 30th, 2009
2
SPI (Software Process and Infrastructure)
3
Software Process and Infrastructure
- External libraries service: the LHC experiments use about 100 open-source and public-domain libraries – see http://lcgsoft.cern.ch/
- Automated building and distribution of all packages from sources for all AA-supported platforms – fast
- Recently introduced new compilers and operating systems (SLC5, gcc 4.3, VS9, icc 11)
- Release management of the AA software stack for the LHC experiments
  - Coordination via the "Librarians and Integrators meeting" (low level) and the "Architects Forum" (high level)
  - Over the last year two major release series were deployed (LCG 55 and 56), each with three bug-fix releases on top (a-c)
  - Moving to new compilers (e.g. gcc 4.3 for SLC5)
  - Optionally releasing parts of ROOT separately
- The release infrastructure is also used by experiments outside the LHC (Daya Bay, Memphis, DUSEL)
4
The LHC Software Stack (diagram)
- Common software
  - External software (~100 packages): Python, Boost, Qt, Xerces, GSL, valgrind, Grid middleware, Java, …
  - AA projects: ROOT, POOL, COOL, CORAL, RELAX
- Compilers: gcc 3.4, gcc 4.0, gcc 4.3, icc 11, llvm 2.4, VC 7.1, VC 9 (32-bit and 64-bit)
- Platforms: Linux (slc4, slc5), Mac OS X (10.5), Windows (XP)
- On top: the LHC experiment software – AliRoot, CMSSW, LHCb/Gaudi, ATLAS/Athena
5
Software Process and Infrastructure (2)
- Nightly build, testing and integration service
  - The nightly builds build and test all software on all AA-provided platforms (Scientific Linux, Windows, Mac OS)
  - Currently extending to new compiler suites (icc) to improve software robustness, and moving forward to Mac OS X 10.6
  - The nightlies are used in "chains", with the LHC experiments building on top – a fast feedback loop about changes in AA software
  - Integration of the nightly builds with CernVM is almost finished
- Collaborative tools (HyperNews, Savannah)
  - Savannah is heavily used in the LHC and beyond (CERN/IT, Grid, etc.)
  - The HyperNews service has been migrated to e-groups/SharePoint for all LHC experiments except CMS
  - New effort for an AA-wide web infrastructure based on Drupal (uniform look and feel, better integration between the sites)
- General infrastructure support for the rest of the group
6
Savannah Usage (charts): postings per month, registered users, bugs per experiment/project type, registered projects
7
MULTI-CORE R&D (WP8)
8
Main Goals
- Investigate software solutions to efficiently exploit the multi-core architectures of modern computers, addressing the problems the experiments will need to solve:
  - Memory, which will get worse as we go to higher luminosities
  - CPU efficiency, to keep up with computational needs
- Ongoing investigations cover four areas:
  - Parallelization at event level using multiple processes
  - Parallelization at event level using multiple threads
  - High-granularity parallelization of algorithms
  - Optimization of memory access to adapt to the new memory hierarchy
- Collaboration established with the LHC experiments, Geant4, ROOT and openlab (CERN/IT)
9
Current Activities
- Close interaction with the experiments (bi-weekly meetings, reports in the AF)
- Workshops every six months (latest in June, with IT, on "deployment")
- Training and working sessions in collaboration with openlab and Intel
- ATLAS has developed a prototype using fork and copy-on-write (COW)
- CMS is investigating the same route and the use of OpenMP in algorithms
- The Gaudi team is investigating parallelization at the Python level
- Geant4 has developed a multi-threaded prototype
- ROOT is developing PROOF-Lite and the use of multiple threads in the I/O
- A parallel version of Minuit (using OpenMP and MPI) was already released in ROOT 5.24
- A library for using shared memory has been developed within the project
10
Parallelization of the Gaudi Framework
- No change needed in user code or configuration
- Equivalent output (un-ordered events) – illustrated in the sketch below
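The Gaudi prototype drives this from the Python level (as noted on the "Current Activities" slide). Purely as a language-neutral sketch of the idea, not the Gaudi code, the example below forks a few workers that process events and send results back through a pipe, so the parent collects equivalent output in completion order rather than event order. The worker count, the trivial processEvent() stand-in and the fixed-size result records are illustrative assumptions.

```cpp
// Minimal sketch of event-level parallelism with unordered output collection.
// Assumptions (not the real Gaudi prototype): a trivial processEvent(), one
// pipe shared by all workers, and fixed-size result records.
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

struct Result { int event; double value; };

static Result processEvent(int evt) {          // stand-in for the algorithm chain
    usleep((evt % 5) * 1000);                  // simulate uneven per-event workload
    return Result{evt, evt * 2.0};
}

int main() {
    const int nWorkers = 4, nEvents = 20;
    int fd[2];
    if (pipe(fd) != 0) { std::perror("pipe"); return 1; }

    for (int w = 0; w < nWorkers; ++w) {
        if (fork() == 0) {                     // child: process every nWorkers-th event
            close(fd[0]);
            for (int evt = w; evt < nEvents; evt += nWorkers) {
                Result r = processEvent(evt);
                write(fd[1], &r, sizeof r);    // small records: writes are atomic
            }
            close(fd[1]);
            _exit(0);
        }
    }
    close(fd[1]);                              // parent keeps only the read end

    Result r;
    while (read(fd[0], &r, (ssize_t)sizeof r) == (ssize_t)sizeof r)
        std::printf("event %d -> %.1f\n", r.event, r.value); // completion order, not event order
    close(fd[0]);
    while (wait(nullptr) > 0) {}               // reap the workers
    return 0;
}
```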
11
Exploit Copy-on-Write (COW)
- Modern operating systems share read-only pages among processes dynamically
- A memory page is copied and made private to a process only when it is modified (see the sketch below)
- Prototypes in ATLAS and LHCb
- Encouraging results as far as memory sharing is concerned (about 50% shared)
- Concerns about I/O (the output from multiple processes needs to be merged)
- Memory (ATLAS): one process uses 700 MB VMem and 420 MB RSS. With COW:
  (before) evt 0:  private: 004 MB | shared: 310 MB
  (before) evt 1:  private: 235 MB | shared: 265 MB
  ...
  (before) evt 50: private: 250 MB | shared: 263 MB
- See Sebastien Binet's talk @ CHEP09
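A minimal sketch of why fork-based multi-processing saves memory, assuming a 100 MB "conditions" buffer and a trivial worker loop in place of the real ATLAS/LHCb prototypes: everything allocated before fork() stays physically shared between the workers, and the kernel copies a page only when a worker writes to it. Per-worker sharing can be inspected in /proc/<pid>/smaps.

```cpp
// Sketch of fork-based multiprocessing relying on copy-on-write: large
// read-mostly data (geometry, conditions, field maps) is built once in the
// parent; after fork() the workers see the same physical pages, and the
// kernel only copies a page when a worker writes to it.
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <vector>

int main() {
    // "Detector conditions": initialised once, before forking the workers.
    std::vector<double> conditions(100 * 1024 * 1024 / sizeof(double), 1.0);

    const int nWorkers = 4;
    for (int w = 0; w < nWorkers; ++w) {
        if (fork() == 0) {                         // child = event-processing worker
            double checksum = 0;
            for (double x : conditions)            // reading keeps the pages shared
                checksum += x;
            // conditions[0] = 42.0;               // a write here would copy that page
            std::printf("worker %d done, checksum %.0f\n", w, checksum);
            _exit(0);
        }
    }
    while (wait(nullptr) > 0) {}                   // parent waits for all workers
    return 0;
}
```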
12
Exploit "Kernel Shared Memory" (KSM)
- KSM is a Linux kernel driver that dynamically shares identical memory pages between one or more processes
- It was developed as a backend of KVM to help share memory between virtual machines running on the same host
- KSM scans only memory that has been registered with it: essentially, each memory allocation that is a sensible candidate for sharing needs to be followed by a call to a registration function (see the sketch below)
- CMS: reconstruction of real data (cosmics with the full detector), no code change – 400 MB private data, 250 MB shared data, 130 MB shared code
- ATLAS: no code change – in a reconstruction job of 1.6 GB VM, up to 1 GB can be shared with KSM
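The registration function mentioned above is the madvise() system call with the MADV_MERGEABLE flag (mainlined in Linux 2.6.32, with the KSM scanner enabled via /sys/kernel/mm/ksm/run). The page-aligned allocation helper and the 64 MB size below are illustrative assumptions, and the "no code change" reported by CMS and ATLAS presumably means the registration was hooked in at the allocator level rather than in the application code.

```cpp
// Sketch of registering a buffer with KSM: madvise(addr, len, MADV_MERGEABLE)
// is the real Linux interface; the allocation helper and the buffer size are
// illustrative. Once registered, the KSM daemon merges identical pages across
// processes (e.g. several reconstruction jobs on the same host).
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Allocate page-aligned memory and mark it as mergeable by the KSM scanner.
static void* allocShareable(std::size_t bytes) {
    void* p = nullptr;
    if (posix_memalign(&p, sysconf(_SC_PAGESIZE), bytes) != 0) return nullptr;
#ifdef MADV_MERGEABLE
    if (madvise(p, bytes, MADV_MERGEABLE) != 0)
        std::perror("madvise(MADV_MERGEABLE)");   // KSM missing or disabled
#endif
    return p;
}

int main() {
    const std::size_t size = 64 * 1024 * 1024;    // e.g. conditions data, field maps
    void* buf = allocShareable(size);
    if (!buf) return 1;
    std::memset(buf, 0, size);                    // identical content across processes
    // ... run the job; KSM merges pages that are identical between processes ...
    std::printf("registered %zu MB with KSM\n", size / (1024 * 1024));
    std::free(buf);
    return 0;
}
```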
13
Parallel MINUIT
- Minimization of a maximum likelihood or χ² fit requires iterative computation of the gradient of the NLL function
- The execution time scales with the number of free parameters θ and the number N of input events in the fit
- Two strategies for parallelizing the gradient and NLL calculation:
  - Gradient or NLL calculation on the same multi-core node (OpenMP) – see the sketch below
  - Distribute the gradient over different nodes (MPI) and parallelize the NLL calculation on each multi-core node (pthreads): a hybrid solution
- Alfio Lazzaro and Lorenzo Moneta
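Since NLL(θ) = -Σ_{i=1..N} ln f(x_i; θ) is a sum of independent per-event terms, the first strategy amounts to a parallel reduction over the event loop on one node. The sketch below shows this with OpenMP; the Gaussian model and the toy data set are illustrative stand-ins, not the parallel MINUIT code released in ROOT 5.24.

```cpp
// Sketch of the OpenMP strategy: the NLL is a sum of independent per-event
// terms, so it parallelizes with a reduction over the event loop.
// Compile with: g++ -fopenmp nll.cpp
#include <cmath>
#include <cstdio>
#include <vector>

// -ln f(x; mu, sigma) for a Gaussian model (illustrative stand-in for the fit model).
static double negLogPdf(double x, double mu, double sigma) {
    const double z = (x - mu) / sigma;
    return 0.5 * z * z + std::log(sigma * 2.5066282746310002); // sigma * sqrt(2*pi)
}

// NLL(theta) = -sum_i ln f(x_i; theta), parallelized over the events.
static double nll(const std::vector<double>& data, double mu, double sigma) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)data.size(); ++i)
        sum += negLogPdf(data[i], mu, sigma);
    return sum;
}

int main() {
    std::vector<double> data(1000000, 0.5);       // toy data set
    // MINUIT calls the NLL (and its gradient) repeatedly while varying the
    // parameters; distributing the gradient components over MPI ranks, as in
    // the hybrid strategy, is orthogonal to this per-node reduction.
    std::printf("NLL = %.3f\n", nll(data, 0.0, 1.0));
    return 0;
}
```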
14
Multi-Core R&D: Outlook
- Recent progress shows that we shall be able to exploit next-generation multi-core with "small" changes to HEP code
  - Exploit copy-on-write (COW) in multi-processing (MP)
  - Develop an affordable solution for sharing the output file
  - Leverage the Geant4 experience to explore multi-threaded (MT) solutions
- Continue optimization of memory-hierarchy usage
  - Study data and code "locality", including "core affinity"
- Expand the Minuit experience to other areas of "final" data analysis, such as machine-learning techniques
  - Investigating the possibility of using GPUs and custom FPGAs
- "Learn" how to run MT/MP jobs on the Grid
  - Workshop at CERN, June 25th-26th: http://indico.cern.ch/conferenceDisplay.py?confId=56353
  - Collaboration established with CERN/IT and LCG
15
Explore the New Frontier of Parallel Computing
- Scaling to many-core processors (96-core processors are foreseen for next year) will require innovative solutions
  - MP and MT beyond the event level
  - Fine-grained parallelism (OpenCL, custom solutions?)
  - Parallel I/O
- WP8 is continuously transferring technologies and artifacts to the experiments: this will allow the LHC experiments to make the best use of current computing resources
- Computing technology will continue its route toward more and more parallelism: a continuous investment in following the technology and providing solutions adequate to the most modern architectures is instrumental to best exploit the computing infrastructure needed for the future of CERN
16
VIRTUALIZATION R&D (WP9)
17
Problem
- Software @ LHC: millions of lines of code
- Different packaging and software-distribution models
- Complicated software installation/update/configuration procedures
- Long and slow validation and certification process
- Very difficult to roll out a major OS upgrade (SLC4 -> SLC5)
- Additional constraints imposed by grid middleware development
  - Effectively locked to one Linux flavour
  - The whole process is focused on the middleware and not on the applications
- How can we effectively harvest the multi- and many-core CPU power of user laptops and desktops if LHC applications cannot run in such environments?
18
Virtualization R&D Project
- Aims to provide a complete, portable and easy-to-configure user environment for developing and running LHC data analysis, locally and on the Grid
  - Code check-out, editing, compilation, small local tests, debugging, …
  - Grid submission, data access, …
  - Event displays, interactive data analysis
- No user installation of software required
- Suspend/resume capability
- Independent of the physical software and hardware platform (Linux, Windows, Mac OS)
19
Virtualizing LHC Applications (diagram)
- Starting from the experiment software…
- …ending with a custom Linux system specialised for a given task
20
Developing CernVM
- Quick development cycles, in close collaboration with the experiments
- Very good feedback from enthusiastic users
- Planning presented at the kickoff workshop; first release ahead of plan (one year ago)
- 2008 timeline: preparation, kickoff workshop, releases 0.5, 0.6, 0.7, 0.8, 0.91 (RC1) and 0.92 (RC2) – on time
- 2009 timeline: releases 1.2, 1.3.3, 1.3.4, 1.4.0, 1.6.0 (final) and 2.0 (SL5); 2nd workshop; CSC 2009
- Semi-production operation in 2009: stable and development branches
- Dissemination: 2nd workshop (organized with IT); ACAT, HEPiX, CHEP09; CERN School of Computing 2009; tutorials and presentations to the experiments
21
Where are CernVM users?
- ~1200 different IP addresses (map)
22
Download & Usage Statistics (charts)
23
Transition from R&D to Service
24
CernVM Infrastructure
- We have developed, deployed and are operating a highly available service infrastructure to support the four LHC experiments and LCD
- If CernVM is to be turned into a proper 24x7 service, this will have to be moved to IT premises
- An opportunity to transfer know-how and perhaps start collaborating on these issues
25
Continuing R&D
- Working on scalability and performance improvements of CVMFS
  - P2P on the LAN, CDN on the WAN
  - SLC5 compatibility
  - Will be addressed in CernVM 2.0
- CernVM as a job-hosting environment
  - Ideally, users would like to run their applications on the grid (or cloud) infrastructure in exactly the same conditions in which they were originally developed
  - CernVM already provides the development environment and can be deployed on a cloud (EC2)
- One image supports all four LHC experiments
- Easily extensible to other communities
26
Multi-core and Virtualization Workshop
- Last June we held a workshop on adapting applications and computing services to multi-core and virtualization
  - Organized in conjunction with IT
  - Participation from vendors, the experiments, CERN-IT and Grid service providers
- The goals were to:
  - Become familiar with the latest technologies and current industry trends
  - Understand the experiments' applications and the new requirements introduced by virtualization and multi-core
  - Make an initial exploration of solutions
- A number of follow-up actions were identified
  - Discussed at the IT Physics Services meeting
  - Being followed up
27
FUTURE DIRECTIONS
28
Projects Summary
- The SFT activities can be summarized as:
  - Development, testing and validation of common software packages
  - Software services for the experiments
  - User support and consultancy
  - Providing people for certain roles in the experiments
  - Technology tracking and development
- Listening and responding to requests from the experiments
  - The AF and other communication channels (e.g. LIM and AA meetings, collaboration meetings, etc.)
  - Direct participation in the experiments
- Following technology trends and testing new ideas
  - Anticipating future needs of the experiments
  - We must keep up to date with this rapidly changing field
29
Main Challenges
- Coping with the reduction in manpower
  - We are [have been] forced to do more with fewer people
- Increasing convergence between the projects
  - To be more efficient and give a more coherent view to our clients
  - Some successes so far: nightlies, testing, Savannah, web tools, etc.
  - Many more things can be done, but they require temporary extra effort
- Incorporating new experiments/projects
  - A minimal critical mass is needed for each new activity
- Keeping motivation during the maintenance phase
  - More time is spent on user support and maintenance than on new developments
30
SFT evolution (not revolution)
- Support new experiments/projects (e.g. LCD, NA62)
  - Take into account possible new requirements to evolve some of the software packages
  - Provide people for certain roles (e.g. software architects, librarians, coordinators, etc.) to the new projects, so they can leverage the expertise of the group
- The [AA] LCG structure has served us well until now and can continue with minor modifications
  - Incorporate the new activities under the same structure
  - The monitoring and review process should also include the new activities
- Regroup the AA activities in a single group
- Organize, manage and distribute all the software packages as the "new CERN Library"
- Host specialists in the MC and data-analysis domains
31
Regrouping the AA Activities
- Something we could follow up is the idea of regrouping the AA activities into the SFT group
- Currently the development and maintenance of the Persistency Framework packages (POOL, CORAL, COOL) is hosted in the IT-DM group (~3 FTE)
  - The rationale has been to be closer to the physics database services
  - The level of integration of these developments with the rest of the packages (e.g. ROOT) has suffered in general
- Benefits:
  - The activity can leverage the expertise in the group
  - Better integration with the rest of the AA projects
32
The New CERN Library
- Provide a coherent and fairly complete set of software packages ready to be used by any existing or new HEP experiment
  - Utility libraries, analysis tools, MC generators, full and fast simulation, reconstruction algorithms, etc.
  - A reference place to look for needed functionality
- Made from internally and externally developed products
  - [Fairly] independent packages, including some integrating elements such as 'dictionaries', 'standard interfaces', etc.
  - Support for I/O, interactivity, scripting, etc.
- Easy to configure, deploy and use by end users
  - Good packaging and deployment tools will be essential
33
Hosting Specialists
- SFT [CERN] does not have experts in many scientific or technical domains who could contribute to the 'content' of the new CERN Library
  - It is probably out of the question to hire people like Fred James
- We can overcome this by inviting/offering short visits or PJAS/PDAS contracts
  - E.g. Geant4 model development is mainly carried out by PJAS
  - E.g. Torbjorn Sjostrand was hosted by SFT during the development of Pythia 8
- We just need good contacts and some additional exploitation budget
34
Manpower Plans
- An in-depth planning exercise was prepared in 2005
  - Still valid for the baseline programme
  - Does not take into account the R&D and possible new commitments
- Ideal level of resources needed by project (baseline):
  - Simulation: 5 STAF, 1 FELL, 1 TECH, 2 PJAS, 1 PDAS
  - ROOT: 6 STAF, 1 FELL, 1 TECH, 1 PJAS
  - SPI: 2 STAF, 1 FELL
35
Manpower Summary
- We are at a level that is just sufficient for what we need to do (as planned in 2005)
- Apart from the R&D activities, all the others are long-term activities
  - People ending their contracts or retiring should be replaced
  - The applied-fellow level should be maintained
- In principle, any new commitment should be accompanied by the corresponding additional resources
36
Concluding Remarks
- The "raison d'être" of the group, as well as its mandate, is still valid
  - It is a very effective and efficient way of providing key scientific software components to the CERN experimental programme
  - Instrumental to the leadership of CERN in HEP software
- The main focus continues to be the LHC for many years to come, but we should slightly modify the scope to include new activities
- We should continue in the direction of increasing the synergy and coherence between the different projects hosted in the group
- The group is ready to face the challenges posed by the physics analysis of the LHC experiments
- We should start planning how and what to incorporate from the R&D activities into the baseline
- The staffing level is just sufficient for what we need to do