High Speed Detectors at Diamond
Nick Rees
A few words about HDF5
PSI and Dectris held a workshop in May 2012 which identified issues with HDF5:
–HDF5 was (is) single threaded
  Makes it difficult to benefit from parallel systems.
–pHDF5 could not store compressed data
  Compression is essential to reduce the data sizes.
–HDF5 files could not be open for read and write simultaneously.
  Analysis software could not run during data taking.
What did we do?
Direct chunk write
–Funded by Dectris and PSI
–Released in HDF5 version in May 2013
Dynamically loaded filters
–Funded by DESY
–Also released in HDF5 version
Single Writer/Multiple Reader
–Funded by Diamond, ESRF and Dectris
–Available in HDF scheduled for March 2016
Virtual data set
–Funded by Diamond, DESY and the Percival project.
–Also available in HDF
Diamond History
Early 2007:
–Diamond first user.
–No detector faster than ~10 MB/sec.
Early 2009:
–first Lustre system (DDN S2A9900)
–first Pilatus 6M 60 MB/sec.
Early 2011:
–second Lustre system (DDN SFA10K)
–first 25 Hz Pilatus 6M MB/s.
Early 2013:
–first GPFS system (DDN SFA12K)
–first 100 Hz Pilatus 6M 600 MB/sec
–~10 beamlines with 10 GbE detectors (mainly Pilatus and PCO Edge).
2016:
–Percival and parallel Eiger 16M (5000 MB/sec).
Doubling time = 7.5 months
Diamond Parallel Detectors
Excalibur
–Medipix sensor array developed with STFC
  2048x1532, 3M, 100 Hz, 5 Gb/s
  Uses 6x1 Gbps servers.
–Commissioned in
Percival
–Low energy MAPS array with 3 proposed configurations:
  1484x1408 2M, 300 Hz, 20 Gb/s
  3710x M, 120 Hz, 50 Gb/s
–First system in 2016
Eiger (proposal)
–Talking to Dectris about combining 4x4M sensors
  ~200 Gb/s uncompressed data rate.
  Compression works better at high data rates because there are fewer photons/frame.
  Scientific benefit is to be able to monitor sample degradation.
–Delivery 2016
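The quoted data rates can be sanity-checked from the sensor size and frame rate. The sketch below is only illustrative: the bit depths (16 bits/pixel for Excalibur, 32 bits/pixel for the 2M Percival configuration) are assumptions, not figures from the slides, chosen because they land close to the quoted 5 Gb/s and 20 Gb/s.

```python
def raw_rate_gbps(width, height, frame_rate_hz, bits_per_pixel):
    """Uncompressed detector data rate in Gb/s (1 Gb = 1e9 bits)."""
    return width * height * frame_rate_hz * bits_per_pixel / 1e9

# Bit depths below are assumptions for illustration, not taken from the slides.
excalibur = raw_rate_gbps(2048, 1532, 100, bits_per_pixel=16)
percival_2m = raw_rate_gbps(1484, 1408, 300, bits_per_pixel=32)

print(f"Excalibur:   {excalibur:.2f} Gb/s")    # quoted on the slide as 5 Gb/s
print(f"Percival 2M: {percival_2m:.2f} Gb/s")  # quoted on the slide as 20 Gb/s
```

Under these assumed bit depths the arithmetic gives roughly 5.0 and 20.1 Gb/s.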
Diamond Data Flow Model
General Parallel Detector Design
Transfer between FPGA and processing nodes is UDP, so packets need careful handling.
Switch has deep buffers so links can have momentary (factor 10) overloads
–tested HP 5920 Deep-Buffer Switch.
Processing nodes all write in parallel to parallel file system.
Use VDS to map parallel data streams into one dataset.
[Diagram labels: Sensor; FPGA readout boards (typically LVDS); Front End 1, Front End 2; 10 Gbit/s links; Switch; Processing nodes (PC servers); File storage; parallel writing of HDF5 files]
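Because UDP gives no ordering or delivery guarantee, the receiver has to reassemble frames itself and notice missing packets. A minimal sketch of that idea follows. The header layout (frame number, packet index, packets per frame) and the names `make_packet` and `FrameAssembler` are hypothetical, invented for this example; they are not the protocol used by any of the detectors above.

```python
import struct

# Hypothetical header (not from the slides): frame number, packet index,
# packets per frame, all big-endian.
HEADER = struct.Struct("!IHH")

def make_packet(frame, index, count, payload):
    """Build one datagram: header followed by the raw payload bytes."""
    return HEADER.pack(frame, index, count) + payload

class FrameAssembler:
    def __init__(self):
        self.partial = {}  # frame number -> {packet index: payload}

    def add(self, packet):
        """Store one packet; return (frame, data) once the frame is complete, else None."""
        frame, index, count = HEADER.unpack_from(packet)
        parts = self.partial.setdefault(frame, {})
        parts[index] = packet[HEADER.size:]  # a duplicate packet simply overwrites
        if len(parts) < count:
            return None
        del self.partial[frame]
        return frame, b"".join(parts[i] for i in range(count))

    def incomplete(self):
        """Frame numbers still missing packets (candidates for drop handling)."""
        return sorted(self.partial)

asm = FrameAssembler()
packets = [
    make_packet(1, 1, 2, b"CD"),  # arrives out of order
    make_packet(2, 0, 2, b"EF"),  # frame 2 never receives its second packet
    make_packet(1, 0, 2, b"AB"),
]
done = [result for result in map(asm.add, packets) if result is not None]
print(done)              # [(1, b'ABCD')]
print(asm.incomplete())  # [2]
```

A real receiver would also need a timeout policy for incomplete frames; that is left out here.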
Temporal and spatial processing distribution
[Diagram: Frame 1, Frame 2, Frame 3, Frame 4 distributed across Computer 1, Computer 2, Computer 3, Computer 4]
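One way to read the two schemes: in temporal distribution whole frames are dealt out to the computers in turn, and in spatial distribution every frame is cut into strips, one per computer. The sketch below illustrates that reading with 0-based frame and node numbers. The function names are hypothetical, and the round-robin and equal-strip choices are assumptions rather than the scheme any particular detector uses.

```python
def temporal_node(frame, n_nodes):
    """Temporal distribution: whole frames are dealt out round-robin."""
    return frame % n_nodes

def temporal_location(frame, n_nodes):
    """Inverse view: (node, index within that node's file) for a global frame number."""
    return frame % n_nodes, frame // n_nodes

def spatial_rows(height, n_nodes):
    """Spatial distribution: split each frame into horizontal strips,
    returning one (start_row, stop_row) pair per node."""
    base, extra = divmod(height, n_nodes)
    strips, start = [], 0
    for node in range(n_nodes):
        stop = start + base + (1 if node < extra else 0)
        strips.append((start, stop))
        start = stop
    return strips

print([temporal_node(f, 4) for f in range(8)])  # [0, 1, 2, 3, 0, 1, 2, 3]
print(temporal_location(5, 4))                  # (1, 1)
print(spatial_rows(1532, 4))
# [(0, 383), (383, 766), (766, 1149), (1149, 1532)]
```

The inverse mapping in `temporal_location` is the kind of relationship a virtual dataset has to express when it presents the per-node files as one dataset.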
[Block diagram; recoverable labels:]
Readout Hardware (n copies)
Detector Data Stream (n copies): Data Receiver; Data Processing; Data Compression; HDF5 File Writer; HDF5 file
Data Processing: 2 bit gain handling, DCS subtraction, pixel re-arrangement, rate correction(?), flat field, dark subtraction, efficiency correction
Detector Control Software: Detector API; Control and Data Protocol; Control Hardware; Configuration; Control Server; Control Driver; Cmd; Status
Controlled Interfaces: Python/Matlab; Tango/Lima; EPICS/Area Detector
Beamline Control Software
Detector Engineer Software
Key: actual/potential network or CPU socket boundaries
Use of existing features - chunking
areaDetector is well suited to high speed detectors
–multi-threading to take advantage of modern CPUs
–support for high performance libraries such as HDF5.
HDF5 file support provides
–complex data structures
–ability to store multiple frames in one file, reducing file system overhead.
–parallel writing
–chunking, to optimise read back speeds in, for example, tomography
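The tomography case shows why chunk shape matters: a sinogram reads one detector row from every frame, and HDF5 has to read every chunk that row touches in full. The sketch below estimates that cost for a dataset of shape (frames, height, width). The dataset size, the chunk shapes and the function name `sinogram_read_cost` are hypothetical, and the estimate ignores compression and chunk caching.

```python
from math import ceil

def sinogram_read_cost(shape, chunk, itemsize=2):
    """Cost of reading one detector row across all frames.

    Returns (chunks touched, bytes read from disk / bytes actually wanted).
    Whole chunks are read, so small selections can drag in a lot of extra data.
    """
    n_frames, height, width = shape
    cf, ch, cw = chunk
    # A single row lies inside exactly one chunk along the height axis.
    chunks = ceil(n_frames / cf) * ceil(width / cw)
    bytes_read = chunks * cf * ch * cw * itemsize
    useful = n_frames * width * itemsize
    return chunks, bytes_read / useful

shape = (1024, 2048, 2048)  # hypothetical scan: 1024 frames of 2048x2048
print(sinogram_read_cost(shape, (1, 2048, 2048)))  # one chunk per frame: (1024, 2048.0)
print(sinogram_read_cost(shape, (1, 16, 2048)))    # 16-row strips: (1024, 16.0)
```

In this example, chunking each frame into 16-row strips cuts the data read for one sinogram by a factor of 128, while each frame can still be written one at a time.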
Why the new features are important
Direct chunk write
–In-library compression is limited to ~500 MB/sec/process of raw data.
–Direct chunk write can write at 2 GB/sec/process of compressed data.
Dynamically loaded filters
–Dectris has been developing LZ4 based filters that are very well suited to zero noise diffraction detectors.
–Compression ratios typically a factor of 10.
Single Writer/Multiple Reader
–Particularly useful for slow data taking
  Spectroscopic mapping experiments can take several hours and need immediate feedback.
–Now can use off-line data analysis software on-line
Virtual data set
–Main use is for high speed parallel data writing
  Datasets can be compressed as well as written in parallel.
–Supports different write patterns for different detectors.
–Can be used for combining sparse scans in larger datasets.
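The link between zero noise detectors and high compression ratios can be illustrated with synthetic frames: when most pixels are exactly zero, a lossless compressor has very little to encode. The sketch below uses `zlib` from the Python standard library purely as a stand-in for the LZ4 based filters mentioned above, so the ratios it prints are not representative of those filters. The frame size, hit fraction and count ranges are made-up values.

```python
import random
import struct
import zlib

def frame_bytes(n_pixels, hit_fraction, max_count, rng):
    """Pack a synthetic 16-bit frame: a fraction of pixels hold counts, the rest are 0."""
    pixels = [rng.randint(1, max_count) if rng.random() < hit_fraction else 0
              for _ in range(n_pixels)]
    return struct.pack(f"<{n_pixels}H", *pixels)

def ratio(raw):
    """Compression ratio (raw size / compressed size) using zlib as a stand-in codec."""
    return len(raw) / len(zlib.compress(raw))

rng = random.Random(0)  # fixed seed so the synthetic frames are reproducible
sparse = frame_bytes(256 * 256, 0.01, 3, rng)    # few photons per frame, no noise
dense = frame_bytes(256 * 256, 1.0, 1000, rng)   # every pixel holds counts

print(f"sparse frame ratio: {ratio(sparse):.1f}")
print(f"dense frame ratio:  {ratio(dense):.1f}")
```

The sparse frame compresses far better than the dense one, which is the same effect that makes compression more effective when there are fewer photons per frame.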
What next?
Most Exascale developments are looking at object storage systems rather than traditional file systems
–Intel is working with LLNS and The HDF Group on Extreme-Scale Computing Research and Development (Fast Forward) Storage and I/O.
–Seagate is working on ProjectSAGE in a consortium of 10 companies.
Both aim to match processing and storage so that processing can happen closer to storage in data dominated applications.
Both are looking at special interfaces to HDF5
Summary
Detectors are challenging our ways of taking and storing data.
We continue to work with The HDF Group to enhance HDF5 for our applications.
Open Source Software is not free.
–We need to sort out the HDF5 support model