1
Why it might be interesting to look at ARM
Ben Couturier, Vijay Kartik
Niko Neufeld, PH-LBC
SFT Technical Group Meeting, 08/10/2012
2
The challenge for LHCb
Major upgrade during LS2
Read out the detector at the bunch-crossing rate of 40 MHz
No more hardware-based trigger: need to filter 40 million events/s (32 Tbit/s) in software
Why look at ARM? N. Neufeld
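As a quick cross-check of the figures above, dividing the quoted bandwidth by the event rate gives the implied mean event size. This is a minimal sketch; the variable and function names are illustrative and do not come from the slides.

```python
# Figures quoted on the slide.
EVENT_RATE_HZ = 40e6       # bunch-crossing rate: 40 MHz
BANDWIDTH_BIT_S = 32e12    # total readout bandwidth: 32 Tbit/s


def mean_event_size_bytes(bandwidth_bit_s, event_rate_hz):
    """Mean event size implied by a total bandwidth and an event rate."""
    bits_per_event = bandwidth_bit_s / event_rate_hz
    return bits_per_event / 8


size_bytes = mean_event_size_bytes(BANDWIDTH_BIT_S, EVENT_RATE_HZ)
print(f"Implied mean event size: {size_bytes / 1e3:.0f} kB")
```

The two numbers are consistent with an average event of about 100 kB (800 kbit).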
3
Dataflow
GBT: custom radiation-hard link over MMF, 3.2 Gbit/s (about 10000 links)
Input into DAQ network: 10/40 Gigabit Ethernet or FDR IB (1000 to 4000 links)
Output from DAQ network into compute unit clusters: 100 Gbit Ethernet / EDR IB (200 to 400 links)
Diagram labels: Detector, DAQ network, 100 m rock, Readout Units, Compute Units
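The link counts on this slide can be checked against the 32 Tbit/s target with simple arithmetic. The sketch below uses nominal line rates only and ignores protocol overhead and any data reduction, which the slide does not specify; the helper names are illustrative.

```python
import math


def aggregate_tbit_s(n_links, gbit_s_per_link):
    """Nominal aggregate bandwidth of n_links identical links, in Tbit/s."""
    return n_links * gbit_s_per_link / 1000.0


def links_needed(total_tbit_s, gbit_s_per_link):
    """Smallest number of links whose nominal aggregate reaches total_tbit_s."""
    return math.ceil(total_tbit_s * 1000.0 / gbit_s_per_link)


# Detector side: about 10000 GBT links at 3.2 Gbit/s each.
gbt_total = aggregate_tbit_s(10_000, 3.2)

# Compute-unit side: 200 to 400 links at 100 Gbit/s each.
out_low = aggregate_tbit_s(200, 100)
out_high = aggregate_tbit_s(400, 100)

# Links of 100 Gbit/s needed to carry the full 32 Tbit/s at nominal rate.
min_out_links = links_needed(32, 100)

print(f"GBT aggregate:       {gbt_total:.0f} Tbit/s")
print(f"Output aggregate:    {out_low:.0f} to {out_high:.0f} Tbit/s")
print(f"100G links for 32 T: {min_out_links}")
```

The roughly 10000 GBT links match the 32 Tbit/s figure, and carrying that rate on 100 Gbit/s links takes at least 320 of them, which falls inside the quoted range of 200 to 400.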
4
What will be the Compute Unit?
A compute unit is a destination for the event-data fragments from the readout units
It assembles the fragments into a complete “event” and runs various selection algorithms on this event
About 0.1 % of events are retained
Baseline option: a high-density server platform (mainboard with standard CPUs)
– Using Moore’s law and some estimates on the algorithms: need 4000 to 5000 servers of the 2018 type!
– Baseline could possibly be augmented with a co-processor card (like Intel MIC or a GPU); lots of interest from various groups
Alternative 1: use lower-power, cheaper x86 processors such as Intel Atom, AMD
– Optimize HEPSpec/CHF/W
Alternative 2: use non-Intel processors. Try to profit from the highly competitive and innovative market for processors for portable devices: ARM
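The retention fraction and the server estimate translate into an output rate and a per-server processing budget. The sketch below assumes that events are spread evenly over all servers, which is a simplification not stated on the slide; all names are illustrative.

```python
EVENT_RATE_HZ = 40e6         # input rate from the readout: 40 MHz
RETAINED_FRACTION = 0.001    # about 0.1 % of events are kept
SERVER_COUNTS = (4000, 5000) # baseline estimate from the slide


def output_rate_hz(input_rate_hz, retained_fraction):
    """Rate of events that survive the software selection."""
    return input_rate_hz * retained_fraction


def per_server_budget(input_rate_hz, n_servers):
    """Input rate per server (Hz) and the wall-clock time per event (s)
    that one whole server can spend, assuming an even spread of events."""
    rate = input_rate_hz / n_servers
    return rate, 1.0 / rate


accepted_hz = output_rate_hz(EVENT_RATE_HZ, RETAINED_FRACTION)
print(f"Accepted events: {accepted_hz / 1e3:.0f} kHz")

for n in SERVER_COUNTS:
    rate, budget = per_server_budget(EVENT_RATE_HZ, n)
    print(f"{n} servers: {rate / 1e3:.0f} kHz each, "
          f"{budget * 1e6:.0f} us per event per server")
```

Under these assumptions each server has to take in 8 to 10 kHz of events, so all of its cores together have 100 to 125 microseconds of wall-clock time per event.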
5
ARM
A “pure” RISC architecture (with some enhancements)
A long tradition in the embedded market
Billions of cores sold
– in many variants
– # cores / power vs performance
Produced by various licensees
Has a reputation for the best power-efficiency in the market
Figure labels: “We are here”: 32-bit, IEEE floats, SIMD, native Java offload; “Announced”: 64-bit, SIMD with DP floats
6
So what would a compute unit look like?
7
Operational constraints
The Online farms are very big
– O(2000) servers of different generations and vendors
Like a traditional data-centre with all the problems, and very few administrators
And some simplifications:
– A single client
– In Online operation, at least mostly a single work-load
But want rack-mountable, remote-manageable, good mechanics, decent powering, vendor support etc. and of course low cost!
– Don’t want to build this ourselves: needs to fit in a traditional data-centre structure
8
Embedded in the data-centre
Boston Viridis (projects also from DELL and HP)
Consists of 48 SoCs, each with 4 cores and 4 GB RAM
ARM Cortex-A9, 1.4 GHz
80 Gb Ethernet switch
Total: 192 cores / 192 GB RAM / 300 Watt
9
How fast is a core?
So we’ll need many
10
Is it worth it?
ARM v7: 192 cores need 300 W and 2 U for about 520 HEPSpecs
X5650: 96 hyperthreads need about 1400 W and 2 U for 900 HEPSpecs
If this ratio continues to hold into 2018, LHCb could do the upgrade with a 600 kW data-centre instead of a new (!) 2 MW one
And maybe at some point we need to pay for the power
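The comparison above comes down to HEPSpec per watt and HEPSpec per core. The sketch below works both out from the quoted figures; the dictionary layout and function names are illustrative, and the power scaling at the end is a naive proportional estimate, not the authors' calculation.

```python
# Figures quoted on the slide for one 2 U chassis of each type.
ARM = {"hepspec": 520, "watts": 300, "threads": 192}    # 192 ARM v7 cores
X86 = {"hepspec": 900, "watts": 1400, "threads": 96}    # 96 X5650 hyperthreads


def hepspec_per_watt(box):
    """Benchmark score delivered per watt of chassis power."""
    return box["hepspec"] / box["watts"]


def hepspec_per_thread(box):
    """Benchmark score per core (ARM) or per hyperthread (x86)."""
    return box["hepspec"] / box["threads"]


ratio = hepspec_per_watt(ARM) / hepspec_per_watt(X86)

# Naive scaling: power needed on ARM to match a 2 MW (2000 kW) x86 farm.
scaled_kw = 2000 / ratio

print(f"ARM: {hepspec_per_watt(ARM):.2f} HS/W, {hepspec_per_thread(ARM):.2f} HS/core")
print(f"x86: {hepspec_per_watt(X86):.2f} HS/W, {hepspec_per_thread(X86):.2f} HS/thread")
print(f"Efficiency ratio ARM/x86: {ratio:.2f}")
print(f"Scaled farm power: {scaled_kw:.0f} kW")
```

With these inputs the ARM chassis delivers roughly 2.7 times more HEPSpec per watt, while each ARM core is about 3.5 times slower than an X5650 hyperthread. Scaling 2 MW by the efficiency ratio alone gives roughly 740 kW, the same order as the 600 kW quoted on the slide, which presumably rests on further assumptions not shown here.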
11
The acid test
HEPSpec is not necessarily a good test for Online usage
– Online we (currently) run n instances of the same application in parallel, where n is the number of cores/hyperthreads
– No “mixed” work-load: hyperthreading typically adds more in the Online “mono-culture”
Need to benchmark using the High Level Trigger code
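The Online running mode described above, n identical instances with n equal to the number of cores or hyperthreads, can be imitated with a small driver script. This is only a sketch: the command being launched is a trivial placeholder, not the High Level Trigger application, and `run_instances` is a hypothetical helper.

```python
import os
import subprocess
import sys
import time


def run_instances(cmd, n):
    """Start n copies of cmd at once, wait for all of them, and return
    (elapsed wall-clock seconds, list of exit codes)."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(n)]
    codes = [p.wait() for p in procs]
    return time.perf_counter() - start, codes


# Placeholder workload: a short CPU-bound Python one-liner.
# A real test would launch the trigger application here instead.
placeholder_cmd = [sys.executable, "-c", "sum(range(10**6))"]

# One instance per logical CPU, as in the Online farm.
n_instances = os.cpu_count() or 1
elapsed, exit_codes = run_instances(placeholder_cmd, n_instances)

print(f"{n_instances} instances finished in {elapsed:.2f} s, "
      f"{n_instances / elapsed:.1f} instances/s")
```

Comparing the instances-per-second figure between machines measures throughput of the full node under the single work-load, which is closer to the Online case than a per-core benchmark score.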
12
Project: “Moore on ARM”
Need to compile the LHCb software stack (beginning from Root)
Can compare with natively compiled code: everything works fine on the FC17 test node, but compilation is slow
– Root 5.34.02: ./configure linuxarm --enable-c++11; make -j 4 takes 30m43s
Team (part-time only): Ben Couturier, Vijay Kartik, Niko Neufeld
13
Future plans
X-compiler chain ready
Will now go on to compile the stack
Verification and benchmarking
Then: full-scale test on a fully loaded 192-core system (with a faster ARM: currently use A8, will have A9 or A15), possibly including real network input (for fun)