1
LCLS2 Data Reduction Pipeline Preparations for Serial Femtosecond Crystallography
Chuck Yoon HDRMX, Mar 16, 2017
2
Serial Femtosecond Crystallography
Toward one million hertz: The Linac Coherent Light Source (LCLS) is the world's first hard X-ray free-electron laser, operating at 120 Hz. The European XFEL is coming online soon, operating at 27 kHz. LCLS-II and LCLS-II-HE will utilize more of the linac to deliver unprecedented data rates, up to 1 MHz.
3
Serial Femtosecond Crystallography (SFX) Setup
Diffraction before destruction. Pulses per second: 120. Millions of diffraction patterns are collected from crystals on the detector. Classification problem: hit or miss?
4
SFX processing pipeline
[Pipeline diagram] .XTC raw data, then detector corrections (pedestal correction, common-mode correction, gain correction, bad-pixel mask), then hit/miss classification of each diffraction pattern, then .CXI hits, then orientation determination, then .STREAM, then 3D merge, then phase retrieval, then protein structure. Event-level parallelism is used for data reduction.
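The per-event correction chain named on this slide can be sketched as follows. This is a minimal illustration, assuming NumPy arrays and a simple per-row median as the common-mode estimator; the function name and correction order are illustrative, not the production psana implementation.

```python
import numpy as np

def correct_event(raw, pedestal, gain, bad_pixel_mask):
    """Per-event detector corrections, in the order named on the slide:
    pedestal -> common mode -> gain -> bad-pixel mask (illustrative)."""
    img = raw.astype(np.float64) - pedestal          # pedestal correction
    img -= np.median(img, axis=1, keepdims=True)     # common mode: per-row median
    img *= gain                                      # gain correction
    img[bad_pixel_mask] = 0.0                        # bad-pixel mask
    return img

# Toy 4x4 event: flat background 10 ADU above pedestal, one Bragg-like
# spot of +50 at (2, 2), and one bad pixel at (0, 0).
raw = np.full((4, 4), 110.0)
raw[2, 2] += 50.0
ped = np.full((4, 4), 100.0)
gain = np.ones((4, 4))
mask = np.zeros((4, 4), dtype=bool)
mask[0, 0] = True
img = correct_event(raw, ped, gain, mask)
```

After correction the flat background is removed and only the Bragg-like spot survives, which is what makes the downstream hit/miss step meaningful.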
5
Diffraction pattern recorded on the CSPAD detector (2.3 Mpixel): Bragg spots from a crystal.
SFX at the LCLS
6
SFX processing pipeline
7
Peak finding with the droplet algorithm. An event is classified as a hit if it contains at least 15 peaks (peaks >= 15).
SFX data analysis at LCLS
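The droplet-style hit/miss classification above can be sketched as follows. This is a simplified stand-in, assuming `scipy.ndimage.label` for connected-component grouping; the production droplet algorithm is more elaborate, and the threshold here is illustrative.

```python
import numpy as np
from scipy import ndimage

def count_droplets(img, threshold):
    """Droplet-style peak count (sketch): each connected group of pixels
    above threshold counts as one peak ('droplet')."""
    _, n_peaks = ndimage.label(img > threshold)
    return n_peaks

def is_hit(img, threshold, min_peaks=15):
    """Hit/miss classification: a hit has at least min_peaks peaks."""
    return count_droplets(img, threshold) >= min_peaks

# Toy frame with 16 isolated bright pixels -> classified as a hit.
img = np.zeros((20, 20))
for y in range(1, 20, 5):
    for x in range(1, 20, 5):
        img[y, x] = 100.0
```

Counting droplets rather than raw bright pixels makes the classifier robust to a single spot spanning several pixels, which is the point of grouping connected pixels first.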
8
SFX processing pipeline
9
Orientation Determination
Space group: P21. a = 51.20 Å, b = 98.27 Å, c = 53.39 Å; α = 90.51°, β = °, γ = 90.14°. SFX data analysis at LCLS.
10
SFX processing pipeline
11
LCLS-II Crystallography Data Rate
By 2020, the LCLS-II data rate will rise from 120 Hz (LCLS) to 20,000 Hz (167x), and the detector from 2 Mpix to 4 Mpix, enabling “one structure per second” experiments.
12
Why We Need a Handle on Throughput
Lossless compression will help, but it is not the long-term answer: there are hard limits on what lossless compression can achieve, intrinsically related to the nature of the data (see next slide). Without on-the-fly data reduction we could face hardware costs as high as $0.25B by 2026. Besides cost, not adopting on-the-fly data reduction carries significant risks: we may be unable to move the data to NERSC while keeping up with the throughput from the front-end electronics; we may need a bigger data center than we can handle; and the overall offline system may become too complex for us to manage (robustness issues, intermittent failures).
13
Why Lossless is Not The Solution (by itself)
Information theory limits the best compression factor that can theoretically be attained, based on the entropy of the data. Example: a calibrated CSPAD image (by Mikhail Dubrovin) has array entropy H = 5.844 bits out of 16, so the theoretical lossless limit is 16/5.844 ≈ 2.7x. The generic compressor zlib (standard in HDF5) achieves only about a factor of 2, at both level 1 and level 9 (measured sizes and timings lost in transcription).
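The entropy argument on this slide can be reproduced with a short script. The image here is a synthetic Poisson-noise stand-in for a calibrated detector frame (the 5.844-bit figure on the slide came from a real CSPAD image), so only the method, not the exact numbers, carries over.

```python
import zlib
import numpy as np

def entropy_bits(arr):
    """Shannon entropy in bits per pixel: the floor below which no lossless
    compressor can go for i.i.d. pixel values."""
    _, counts = np.unique(arr, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def zlib_factor(arr, level=1):
    """Actual compression factor achieved by zlib on the raw bytes."""
    raw = arr.tobytes()
    return len(raw) / len(zlib.compress(raw, level))

# Synthetic 16-bit frame with Poisson photon-counting noise.
img = np.random.default_rng(0).poisson(30, size=(512, 512)).astype(np.uint16)
H = entropy_bits(img)
theoretical_limit = 16.0 / H     # best possible lossless factor
achieved = zlib_factor(img, level=1)
```

Comparing `achieved` against `theoretical_limit` is exactly the slide's point: a generic byte-oriented compressor lands well short of the entropy bound for noisy detector data.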
14
What We Are Proposing: develop a “toolbox” of different techniques:
- Veto (software trigger)
- Lossless compression
- Feature extraction: sparsification (peak finders, ROI); calculation (ROI integration, angular integration, time-tool values, binning, “cube”); data transformation (into a space where the data are sparse, e.g. wavelet or JPEG-style compression)
- Multi-event reduction: MPEG-style correlation over events (perhaps, although it breaks easy parallelization over shots; Filipe Maia found it yielded a 3x reduction for crystallography)
- Save programmable “pre-scaled” unreduced data, used to verify that reduction does not compromise the physics
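The sparsification item in the toolbox can be illustrated in a few lines: store only the coordinates and intensities of above-threshold pixels instead of the full frame. Function names and the frame size are illustrative, not from the LCLS software.

```python
import numpy as np

def sparsify(img, threshold):
    """Feature-extraction sparsification (sketch): keep only coordinates
    and intensities of pixels above threshold; discard the rest."""
    ys, xs = np.nonzero(img > threshold)
    return ys.astype(np.uint16), xs.astype(np.uint16), img[ys, xs]

# A 2-Mpixel-scale frame containing two bright pixels.
img = np.zeros((1024, 2048), dtype=np.uint16)
img[100, 200] = 500
img[300, 400] = 800
ys, xs, vals = sparsify(img, 50)

full_bytes = img.nbytes                             # whole frame
sparse_bytes = ys.nbytes + xs.nbytes + vals.nbytes  # peak list only
reduction = full_bytes / sparse_bytes
```

For crystallography frames, where only Bragg peaks matter downstream, the reduction factor scales with frame size over peak count, which is why sparsification is attractive for the 1 MHz regime.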
15
LCLS-II Data System
[System diagram] DAQ readout nodes (one DAQ per hutch) receive timing input and stream data over PGP links at 1 MHz acquisition, 250 GB/s, into the DRP nodes (NVMe storage, GPUs), where data compression and event-level veto give a factor of 10 reduction. The reduced stream (25 GB/s) passes through event builder nodes over IB EDR into the Fast Feedback layer (shared, one for NEH, partitionable): fast-feedback SSD storage (1 PB), a fast feedback analysis farm, control, and online monitoring nodes. DTNs forward data over 10 Gb/s Ethernet to the Offline layer (shared by all): Lustre storage (100 PB) and the offline analysis farm. Data are written in HDF5 format.
16
Data Reduction Pipeline (DRP)
Data reduction nodes: receive raw data and calibration constants from the DAQ; reduce data volume with lossless compression or feature extraction. Event builder nodes: veto storage of uninteresting events (less common); apply additional compression where cross-readout-node information is important. [Diagram: DAQ readout nodes connect over PGP to the data reduction nodes (GPU), which connect through an IB switch to the event builder nodes and a monitoring node.]
17
Hardware Technology Choices
GPU: harder to program, but better performance for many image-processing algorithms. Intel Phi: perhaps easier to program for the embarrassingly parallel LCLS case; 68 slow-but-standard cores on a chip; potential advantage: used by …. FPGA: likely in front-end electronics for algorithms that change rarely (e.g. TimeTool, waveform digitizer).
18
Compression Benchmark
SZ compression achieves a factor of 3.8x (with a guaranteed pixel error of less than 1%). Original data: indexed patterns 38203, overall CC = . After SZ compression: indexed patterns 38066, overall CC = .
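The error-bound guarantee quoted above can be illustrated with the core idea behind SZ-style compressors: uniform quantization with a hard per-pixel bound. This sketch shows only the bound, not the real SZ algorithm, which adds value prediction and entropy coding to reach factors like the 3.8x on the slide; the frame and the 1% criterion here are illustrative.

```python
import numpy as np

def quantize(img, abs_err):
    """Error-bounded uniform quantization: the reconstruction is guaranteed
    to lie within +/- abs_err of every original pixel."""
    step = 2.0 * abs_err                 # bin width chosen so max error <= abs_err
    return np.round(img / step).astype(np.int64), step

def dequantize(codes, step):
    """Map quantization codes back to pixel values (bin centers)."""
    return codes * step

# Synthetic frame; bound the error at 1% of the maximum pixel value.
img = np.random.default_rng(1).uniform(0.0, 1000.0, size=(256, 256))
abs_err = 0.01 * img.max()
codes, step = quantize(img, abs_err)
recon = dequantize(codes, step)
max_err = float(np.abs(recon - img).max())
```

Because the worst-case rounding error is half a bin, every reconstructed pixel stays within the stated bound, which is what lets the indexing results before and after compression be compared honestly.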
19
Acknowledgements and open questions
Can we veto missed events reliably? How well can we compress the data? Real-time feature extraction would be ideal. Can we save only a subset of the data? What metric can tell us when to stop collecting data?
LCLS: Christopher O’Grady, Mikhail Dubrovin, Silke Nelson, Clemens Weninger, Sioan Zohar. ANL: Franck Cappello, Dingwen Tao, Sheng Di.