Published by Janis Douglas. Modified over 9 years ago.
1
16.07.2015 HSM Meeting - HPC - FS
High Performance Computing (HPC) Support from IT
Historical Overview of IT Computing Support & Present State
Present Situation for the LIU SC Studies
Proposal for HPC Support from IT (Primary Goal)
Main Secondary Goals
Conclusion
2
Historical Overview of IT Computing Support (1/3)
The IT support for the systematic dynamic aperture studies for the LHC can be considered ideal!
Both hardware & software support have been state of the art at every moment in time.
This support, however, is in large part due to personal initiative between members of IT & ABP (and their predecessors).
IT has at best tolerated this successful collaboration. They did provide help with installation & maintenance of the hardware.
3
Historical Overview of IT Computing Support (2/3)
Hardware:
CRAY Supercomputer
DEC Workstation
PC Cluster
Screen Saver
Distributed Computing (BOINC, LHC@Home)
(Play console chips, GPU)
Financing:
LHC Project leader (Lyn Evans) paid a total of about 800’000 CHF for several options.
BE had ~150’000 CHF for the last PC farm upgrade.
4
Historical Overview of IT Computing Support (3/3)
Software: decade-long collaboration with Eric McIntosh and, to a lesser but still significant degree, with Harry Renshall.
Making SixTrack a reliable high-speed tracker, now with a worldwide reputation.
Speed and other optimization of the code and continuous maintenance, e.g. code quality, vectorization, speed optimization of critical loops, etc.
For BB studies: approximate but very fast complex error function in collaboration with George Erskine.
Guaranteed hardware-independent bit-by-bit precision of the results ➔ BOINC
Check-point/restart ➔ BOINC
5
Present Support State 1/2
IT no longer supports general package libraries like CERNLIB (done by PH until 2006) ➔ no more mathematicians like Erskine.
IT also no longer provides any program support ➔ experts like McIntosh or Renshall can no longer be found in IT.
➔ In principle they simply cannot help us with our code development!
They now have a mandate to support CERN-critical computation efforts, e.g. for quite some time they have maintained & updated our ~400 PC cluster for LHC simulations without charging us.
As I understand it, they will agree to a “reasonable” enlargement of our cluster following a justified request (this may take several months).
They offer a fair-share procedure to avoid an idling system, and one may boost the number of available boxes by a factor of 2 when urgent simulation campaigns require more computing facilities.
6
Present Support State 2/2
Owing to the severe lack of IT support, and following discussions with Oliver and Paul, I was asked to present our BE/ABP situation at the IT Service Review Meeting (ITSRM) at the end of 2009, resulting in:
Keep AFS Tools
No Software Support possible!
Keep NAG tools
Get back BOINC
BE-IT Forum ➔ Accelerator Sector treated like an LHC experiment
7
Present Situation for the LIU SC Studies 1/3
During the last couple of years we have worked hard to prepare for the systematic studies on the LHC pre-accelerators PSB, PS & SPS:
Preparing several SC PIC and frozen SC codes
Benchmarking various codes
Benchmarking the codes with experiments
Optimization of the non-linear models of the machines
Even advancements in our theoretical understanding
In the fall we will summarize our results and prepare for systematic SC studies for LIU.
8
Present Situation for the LIU SC Studies 2/3
For our studies IT has provided 40 boxes with 48 cores each, i.e. about 2’000 cores in total.
These boxes are reasonably powerful but no longer top notch!
These boxes are old and no longer produced, so there cannot be any upgrades, though we might get more of the same!
This system was just enough to satisfy the base need of the PSB, while too slow for long-term studies both for the PSB & PS.
On the other hand, a long-term simulation with the frozen SC in MAD-X over 800’000 turns takes 10 days of sequential simulation, while for the PIC simulations a few 1’000 turns take weeks on the 48-core machines.
Since the SPS hasn’t yet fully started SC simulations, we have to wait for the requirements for that machine.
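The figures quoted on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes that the 10 days refer to the wall-clock time of one sequential frozen-SC run.

```python
# Back-of-the-envelope check of the numbers quoted on this slide.
# Only the input figures come from the slide; all names are illustrative.

BOXES = 40             # boxes provided by IT
CORES_PER_BOX = 48     # cores in each box
TURNS = 800_000        # length of the frozen-SC MAD-X long-term run
DAYS = 10              # assumed wall-clock duration of that sequential run

SECONDS_PER_DAY = 24 * 60 * 60


def total_cores(boxes: int, cores_per_box: int) -> int:
    """Total number of cores in the existing installation."""
    return boxes * cores_per_box


def seconds_per_turn(turns: int, days: float) -> float:
    """Average wall-clock seconds spent on one turn."""
    return days * SECONDS_PER_DAY / turns


if __name__ == "__main__":
    print(f"total cores: {total_cores(BOXES, CORES_PER_BOX)}")
    print(f"seconds per turn: {seconds_per_turn(TURNS, DAYS):.2f}")
```

Under these assumptions the installation has 1’920 cores, which the slide rounds to 2’000, and the frozen-SC run averages roughly one second per turn.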
9
Present Situation for the LIU SC Studies 3/3
In essence, to reach about 10’000 turns and so cover the initial phase, where the self-consistent effects are most crucial, we would need a system with:
Better scalar speed
More cores per box to take advantage of the scaling with the number of cores.
This can be improved by making use of clusters that are geared for HPC, like CNAF or EPFL. We are in contact with them.
However, we would be required to pay for their services.
If we could convince IT to provide at least the hardware of a sufficiently large system, this would of course be advantageous!
10
Plot legend: CERN, red; CNAF-Bologna, blue. (Please ignore the green curve!)
11
Proposal for HPC Support from IT (Primary Goal) 1/2
Bernd Panzer from IT has been in charge of providing computing resources at CERN for quite some time, e.g. he has provided us with the 48-core systems, including the maintenance, without charging BE (again, close to zero help with the code issue!).
During the last few months I have been discussing with him a potential system of 16-core machines linked with InfiniBand networking to create roughly 200 cores per system, which fits our scaling tests.
12
Proposal for HPC Support from IT (Primary Goal) 2/2
The idea is to start with 10 of those systems, i.e. another 2’000 cores, but 4 times faster (conservative estimate) than our 48-core system.
The hope was that IT would decide to go for HPC, but due to financial considerations they have put it on ice.
But there is hope that a request from our BE department head will be sufficient to convince IT to look into this more seriously.
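A rough sizing of the proposed installation can be sketched as follows. The input figures (16-core machines, roughly 200 cores per system, 10 systems) come from the proposal; rounding up to whole machines is an assumption of this sketch, and all names are hypothetical.

```python
import math

CORES_PER_MACHINE = 16          # from the proposal
TARGET_CORES_PER_SYSTEM = 200   # "roughly 200 cores per system"
SYSTEMS = 10                    # size of the initial installation


def machines_per_system(target_cores: int, cores_per_machine: int) -> int:
    """Whole machines needed to reach at least target_cores.

    Rounding up is an assumption; the slides only say "roughly 200".
    """
    return math.ceil(target_cores / cores_per_machine)


def installation_size(systems: int, target_cores: int,
                      cores_per_machine: int) -> tuple[int, int]:
    """Return (total machines, total cores) for the whole installation."""
    per_system = machines_per_system(target_cores, cores_per_machine)
    machines = systems * per_system
    return machines, machines * cores_per_machine


if __name__ == "__main__":
    machines, cores = installation_size(
        SYSTEMS, TARGET_CORES_PER_SYSTEM, CORES_PER_MACHINE
    )
    print(f"machines: {machines}, cores: {cores}")
```

With rounding up, each system would consist of 13 machines (208 cores), giving 130 machines and 2’080 cores overall, consistent with the "another 2’000 cores" quoted above.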
13
Main Secondary Goals
Once we have convinced IT to agree to provide an HPC system, there will be a long list of secondary requests:
A. How long will it take ➔ Might be 9 months, but that is still okay for the LIU studies.
B. Fairshare ➔ It depends on whether there will be other users that need multi-core systems. In fact, it appears that there is another request from theory. NO COSTS!
C. Upgrade Hardware ➔ This remains to be seen since, owing to the parallel structure, all machines must run at equal speed!
D. Progression of the system ➔ Depending on our usage we might ask for 50% growth per year of active work.
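To give a feeling for item D, the sketch below projects the core count for a 50% yearly growth. It assumes the growth compounds year on year and starts from the 2’000 cores of the initial proposal; neither assumption is stated in the slides, and the function name is hypothetical.

```python
def projected_cores(initial_cores: int, yearly_growth: float,
                    years: int) -> list[int]:
    """Core count at year 0, 1, ..., years, assuming compounded growth."""
    cores = float(initial_cores)
    projection = [initial_cores]
    for _ in range(years):
        cores *= 1 + yearly_growth      # e.g. 0.5 means +50% per year
        projection.append(round(cores))
    return projection


if __name__ == "__main__":
    for year, cores in enumerate(projected_cores(2000, 0.5, 3)):
        print(f"year {year}: {cores} cores")
```

Under these assumptions the system would grow from 2’000 to 3’000, 4’500 and 6’750 cores over three years of active work.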
14
Conclusions
1. For the LIU systematic studies a substantial speed-up would be highly desirable ➔ This can only be achieved with HPC facilities.
2. IT is on the verge of adopting HPC into its mandate, but presently it is on ice.
3. IT’s mandate includes support of CERN’s critical computing needs ➔ Our proposal would be covered.
4. The primary goal is to get HPC facilities from IT ➔ This will require an inter-departmental request from the BE head.
5. There are several secondary goals that will have to be addressed once IT agrees to cover HPC.