Slide 1
Niko Neufeld, CERN
Slide 2
- Trigger-free read-out – every bunch-crossing! 40 MHz of events to be acquired, built and processed in software
- 40 Tbit/s aggregated throughput, from about 500 data sources with 100 Gigabit/s each (a quick consistency check follows below)
- More than 10,000 optical fibres from the detector
- At least 2,000 servers
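These headline numbers hang together; a minimal back-of-the-envelope sketch using only the figures quoted above (the average event size and the capacity headroom are derived here for illustration, they are not quoted on the slide):

```python
# Quick consistency check of the headline numbers (a sketch, not from the slides):
# 40 Tbit/s aggregated throughput at a 40 MHz event rate.

event_rate_hz = 40e6          # 40 MHz of events
throughput_bit_s = 40e12      # 40 Tbit/s aggregated
n_sources = 500               # ~500 data sources
link_bit_s = 100e9            # 100 Gigabit/s each

event_size_bytes = throughput_bit_s / event_rate_hz / 8
print(f"Average event size ~ {event_size_bytes / 1e3:.0f} kB")   # ~125 kB

# ~500 x 100 Gbit/s links give ~50 Tbit/s of raw input capacity,
# i.e. some headroom above the 40 Tbit/s average load.
print(f"Raw input capacity ~ {n_sources * link_bit_s / 1e12:.0f} Tbit/s")
```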
Slide 3
[System overview diagram] Detector front-end electronics in UX85B send data over ~9000 Versatile Links for DAQ (~300 m to the Point 8 surface) into ~500 event-builder PCs equipped with PCIe40 cards. The event-builder network feeds, via 6 x 100 Gbit/s links, an event-filter farm of ~80 subfarms (each behind a subfarm switch) and the online storage. TFC distributes the clock & fast commands and receives throttle signals back from the PCIe40; ECS provides experiment control.
Slide 4
The PCIe40 read-out card:
- Arria 10 FPGA
- PCIe Gen3 x16 == 100 Gbit/s (see the sketch below)
- Up to 48 optical input links
- Will have > 500 in the experiment
- Used also by ALICE, and … maybe … who knows…
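Why Gen3 x16 is quoted as roughly 100 Gbit/s follows from the link parameters; a small sketch, where the ~20% protocol-overhead figure is an illustrative assumption and not from the slides:

```python
# PCIe Gen3 runs at 8 GT/s per lane with 128b/130b line encoding;
# TLP/DLLP protocol overhead removes a further chunk that depends on payload size.

lanes = 16
gen3_gt_s = 8.0                      # 8 GT/s per lane
encoding = 128 / 130                 # 128b/130b line code
raw_gbit_s = lanes * gen3_gt_s * encoding
print(f"Raw Gen3 x16 bandwidth ~ {raw_gbit_s:.0f} Gbit/s")        # ~126 Gbit/s

# Assuming ~20% protocol overhead (typical ballpark, an assumption here):
usable_gbit_s = raw_gbit_s * 0.8
print(f"Usable payload bandwidth ~ {usable_gbit_s:.0f} Gbit/s")   # ~100 Gbit/s
```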
Slide 5
6.4 PB net storage / 12.8 PB raw
Slide 7
- 480 – 960 optical fibres (on 40 – 80 MPO-12 trunks; see the sketch below)
- 10 x 2U I/O servers with 2 x 100 Gbit/s interfaces
- 36 compute servers taking between 20 and 40 Gbit/s each
- 1 – 2 PB of storage
- ~ 40 Tbit/s network I/O (full duplex)
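A small arithmetic check, assuming the standard 12 fibres per MPO-12 trunk and reading this slide as one installation slice; the 2 Tbit/s figure is derived from the I/O-server numbers above, not quoted:

```python
# An MPO-12 trunk carries 12 fibres, so 40-80 trunks map onto 480-960 fibres.
FIBRES_PER_MPO12 = 12

for trunks in (40, 80):
    print(f"{trunks} MPO-12 trunks -> {trunks * FIBRES_PER_MPO12} fibres")

# The two 100 Gbit/s ports on each of the 10 I/O servers add up to
# 2 Tbit/s of external connectivity for this slice (derived, not quoted).
print(f"I/O server bandwidth: {10 * 2 * 100 / 1000:.0f} Tbit/s")
```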
Slide 8
- Vendor neutral: public tender every time
- Long-lived facility (> 10 years): has to grow “adiabatically” – unlike a super-computer, we can’t throw things away after 4 years
- Upgradeable
- Cost, cost, cost: tight, cost-efficient integration of compute, storage and network
- Should be flexible enough to also accommodate accelerators (Xeon Phi, FPGA) if they prove effective
- Power: electricity at CERN is cheap, but we want to be green and reduce running costs
Slide 10
- Need temporary storage to wait for calibration and alignment – and to profit from no-beam time
- Current model: completely local storage as a software RAID 1 of 4 TB on each node
- File management by scripts and control software; no common name-space
- 100% overhead, capacity oriented
- Streaming I/O only, single reader / single writer, typically max. 4 streams per RAID set; aggregated I/O is low, 10 – 20 MB/s (see the sketch below)
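A minimal sketch of what these numbers imply for buffering, assuming the full 4 TB per node is usable and taking the quoted 10 – 20 MB/s as the sustained ingest rate (both are assumptions for illustration, not slide statements):

```python
# How long a node's local RAID 1 buffer can absorb data while waiting
# for calibration/alignment (back-of-the-envelope sketch).

TB = 1e12                       # bytes (decimal terabyte)

usable_bytes = 4 * TB           # net capacity per node (slide figure)
raw_bytes = 2 * usable_bytes    # RAID 1 mirroring -> the quoted 100% overhead

for ingest_mb_s in (10, 20):    # quoted aggregated I/O range
    seconds = usable_bytes / (ingest_mb_s * 1e6)
    print(f"{ingest_mb_s} MB/s -> buffer fills in {seconds / 86400:.1f} days")

# ~4.6 days at 10 MB/s, ~2.3 days at 20 MB/s: margin for the calibration and
# alignment latency and for draining the backlog during no-beam time.
```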
Slide 11
Operational issues:
- No common name-space
- A disk failure during data-taking can cause several problems:
  - Controller or both disks failed → the node needs to be excluded from data-taking
  - A disk does not actually fail but becomes “slow” because of errors → the node accumulates a backlog of unprocessed data
  - A rebuild can affect performance
- Inaccessible data (even temporarily) blocks all data from further processing (because offline data-sets are treated as a “whole”)
Slide 12
- Basically, disk and I/O requirements per node go up by 10x – need a cost-efficient solution
- Having disks in each node still looks attractive vs. NAS per rack or disaggregated shelves (see challenge 1)
- Can we get better efficiency with RAID 5/6/7? (compared in the sketch below)
- Would love to have a common name-space – POSIX or not?
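A rough, illustrative comparison of usable-capacity fractions for mirroring versus parity RAID; the disk counts are hypothetical, and RAID 7 (which has no single standard definition) is left out:

```python
# Fraction of raw disk space that remains usable for n equal-sized disks.

def raid_efficiency(level: int, n_disks: int) -> float:
    """Usable fraction of raw capacity for common RAID levels."""
    if level == 1:      # mirroring: half the raw space (the current setup)
        return 0.5
    if level == 5:      # one disk's worth of parity
        return (n_disks - 1) / n_disks
    if level == 6:      # two disks' worth of parity
        return (n_disks - 2) / n_disks
    raise ValueError("unsupported RAID level")

for n in (4, 8, 12):
    print(f"{n:2d} disks:  RAID1 {raid_efficiency(1, n):.0%}  "
          f"RAID5 {raid_efficiency(5, n):.0%}  RAID6 {raid_efficiency(6, n):.0%}")

# With 8+ disks per set, parity RAID keeps ~75-90% of the raw capacity usable,
# versus the fixed 50% (100% overhead) of the current RAID 1 mirrors.
```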