LCLS2 Data Reduction Pipeline: Preparations for Serial Femtosecond Crystallography. Chuck Yoon, HDRMX, Mar 16, 2017.

Serial Femtosecond Crystallography at One Million Hertz The Linac Coherent Light Source (LCLS) is the world's first hard X-ray free-electron laser, operating at 120 Hz. The European XFEL is coming online soon, operating at 27 kHz. LCLS-II and LCLS-II-HE will utilize more of the linac to deliver unprecedented data rates.

Serial Femtosecond Crystallography (SFX) Setup: X-ray pulses hit a stream of crystals in front of a detector ("diffraction before destruction"). At 120 pulses per second, LCLS produces millions of diffraction patterns from crystals. The first classification problem: is each pattern a hit or a miss?

SFX Processing Pipeline Raw diffraction patterns (.XTC) pass through detector calibration (pedestal correction, common-mode correction, gain correction, bad-pixel mask), then hit/miss classification (.CXI), orientation determination (.STREAM), 3D merging, and phase retrieval to yield the protein structure. Event-level parallelism is used for data reduction.
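The "event-level parallelism" noted in the pipeline exploits the fact that each XFEL shot is independent. A minimal sketch, assuming a hypothetical `reduce_event` stand-in for the real DRP correction and hit-finding code (the thresholds here are illustrative, not production values):

```python
# Sketch of event-level parallelism for data reduction: events are
# independent, so a worker pool can process them without communication.
from multiprocessing import Pool

import numpy as np

def reduce_event(event):
    """Correct one detector frame and keep it only if it looks like a hit."""
    corrected = event - event.mean()            # stand-in for pedestal correction
    n_bright = int((corrected > 3 * event.std()).sum())
    return corrected if n_bright >= 15 else None  # veto apparent misses

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    events = [rng.poisson(10, size=(64, 64)).astype(float) for _ in range(8)]
    with Pool(4) as pool:
        kept = [r for r in pool.map(reduce_event, events) if r is not None]
    print(f"kept {len(kept)} of {len(events)} events")
```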

Diffraction Pattern (figure): a CSPAD detector image (2.3 Mpixels) showing Bragg spots from a crystal, recorded during SFX at the LCLS.

Peak Finding: the droplet algorithm is used in SFX data analysis at LCLS; an event is classified as a hit if it contains at least 15 peaks.
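A minimal, NumPy-only sketch of the hit/miss idea: count local maxima above a threshold and call the event a hit if there are at least 15. The real droplet peak finder (in psana/psocake) additionally merges connected pixels into droplets and applies geometry-aware cuts; the threshold here is an illustrative assumption:

```python
import numpy as np

def count_peaks(img, threshold):
    """Count local maxima above `threshold` (3x3 neighborhood)."""
    # Pad with -inf so border pixels can be compared against "neighbors".
    p = np.pad(img, 1, mode="constant", constant_values=-np.inf)
    center = p[1:-1, 1:-1]
    is_max = np.ones(img.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            # A pixel survives only if >= every shifted copy of the image.
            is_max &= center >= p[1 + dy : p.shape[0] - 1 + dy,
                                  1 + dx : p.shape[1] - 1 + dx]
    return int((is_max & (center > threshold)).sum())

def is_hit(img, threshold=100.0, min_peaks=15):
    """Hit/miss classification as on the slide: peaks >= 15."""
    return count_peaks(img, threshold) >= min_peaks
```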

Orientation Determination (SFX data analysis at LCLS). Space group: P21. Unit cell: a = 51.20 Å, b = 98.27 Å, c = 53.39 Å; α = 90.51°, β = 112.79°, γ = 90.14°.

LCLS-II Crystallography Data Rate By 2020, the LCLS-II repetition rate rises from 120 Hz to 20,000 Hz (167x), and detectors grow from 2 Mpix to 4 Mpix, enabling "one structure per second" experiments.

Why We Need a Handle on Throughput Lossless compression will help, but it is not the long-term answer: there are hard limits on what lossless compression can achieve, intrinsically related to the nature of the data (see next slide). Without on-the-fly data reduction we could face hardware costs as high as $0.25B by 2026. Besides cost, there are significant risks in not adopting on-the-fly data reduction: we may be unable to move the data to NERSC while keeping up with the throughput from the front-end electronics; we may need a bigger data center than we can handle; and the overall offline system may become too complex to manage (robustness issues, intermittent failures).

Why Lossless Is Not the Solution (by Itself) Information theory limits the best compression factor that can theoretically be attained, based on the entropy of the data. Example: a calibrated CSPAD image (by Mikhail), with 16-bit pixel entropy H = 5.844 bits out of 16. zlib level 1: data size in/out = 4593957/2261808 bytes = 2.031x, t(comp) = 0.086 s, t(decomp) = 0.020 s. zlib level 9: in/out = 4593957/2193889 bytes = 2.094x, t(comp) = 1.802 s, t(decomp) = 0.021 s. The generic compressor zlib (standard in HDF5) achieves a factor of ~2 compression versus the theoretical limit of 2.7 (16/5.844).
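The entropy bound above can be reproduced in a few lines. A sketch, using a synthetic 16-bit frame as a stand-in for the calibrated CSPAD image (so the exact numbers differ from the slide):

```python
# Compare the information-theoretic compression limit (16 / H bits)
# with what zlib actually achieves on a 16-bit image.
import zlib

import numpy as np

def entropy_bits(data):
    """Shannon entropy H in bits per element of a discrete-valued array."""
    _, counts = np.unique(data, return_counts=True)
    p = counts / data.size
    return float(-(p * np.log2(p)).sum())

# Synthetic stand-in for a calibrated 16-bit detector frame.
rng = np.random.default_rng(0)
img = rng.normal(100, 8, size=(512, 512)).astype(np.int16)

h = entropy_bits(img)
bound = 16.0 / h                      # best factor a symbol-wise lossless coder can reach
raw = img.tobytes()
achieved = len(raw) / len(zlib.compress(raw, 9))
print(f"H = {h:.3f} bits -> theoretical limit {bound:.2f}x, zlib-9 achieved {achieved:.2f}x")
```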

What We Are Proposing Develop a "toolbox" of different techniques: veto (software trigger); lossless compression; feature extraction, including sparsification (peak finders, ROI), calculation (ROI integration, angular integration, time-tool values, binning, "cube"), and data transformation (into a space where the data are sparse, e.g. wavelet/JPEG-style compression); multi-event reduction, i.e. MPEG-style correlation over events (perhaps; it breaks easy parallelization over shots, but Filipe Maia found it yielded a 3x reduction for crystallography); and saving a programmable "pre-scaled" fraction of unreduced data, used to verify that reduction does not compromise the physics.
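The sparsification item above can be sketched very simply: instead of storing a full frame, store only the coordinates and values of pixels above a cut. The threshold and dtypes here are illustrative assumptions, not the production DRP choices:

```python
import numpy as np

def sparsify(img, threshold):
    """Return (rows, cols, values) for pixels above `threshold`."""
    rows, cols = np.nonzero(img > threshold)
    return rows.astype(np.uint16), cols.astype(np.uint16), img[rows, cols]

def densify(shape, rows, cols, values):
    """Rebuild a full frame from its sparse representation."""
    img = np.zeros(shape, dtype=values.dtype)
    img[rows, cols] = values
    return img
```

For a crystallography frame where only the Bragg peaks exceed the cut, this stores a few kilobytes of (row, col, value) triples in place of megapixels, at the price of discarding the sub-threshold background.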

LCLS-II Data System (architecture diagram). DAQ (one per hutch): readout nodes acquiring at 1 MHz, producing 250 GB/s over PGP links. Data Reduction Pipeline (shared, one for the NEH, partitionable): DRP nodes with NVMe storage and GPUs, plus event builder nodes, connected over InfiniBand EDR; data compression and event-level veto give a factor of 10 reduction, to 25 GB/s. Fast Feedback (shared, one for the NEH): SSD-based Lustre (~1 PB), a fast-feedback analysis farm, control, and online monitoring nodes. Offline (shared by all): ~100 PB of storage and an offline analysis farm, reached via DTNs over 10 Gb/s Ethernet; data are written in HDF5 format.

Data Reduction Pipeline (DRP) Data reduction nodes: receive raw data and calibration constants from the DAQ, and reduce data volume with lossless compression or feature extraction. Event builder nodes: veto storage of uninteresting events and, less commonly, apply additional compression where cross-readout-node information is important. (Diagram: GPU-equipped data reduction and event builder nodes linked by an InfiniBand switch, with a DAQ monitoring node.)

Hardware Technology Choices GPU: harder to program, but better performance for many image-processing algorithms. Intel Phi: perhaps easier to program for the embarrassingly parallel LCLS case, with 68 slow-but-standard cores on a chip; a potential advantage is that it is used by Cori at NERSC. FPGA: likely in the front-end electronics for algorithms that change rarely (e.g. TimeTool, waveform digitizer).

Compression Benchmark SZ compression achieves a factor of 3.8x, with a guaranteed pixel error below 1%. Original data: 38,203 indexed patterns, overall CC = 0.9823. After SZ compression: 38,066 indexed patterns, overall CC = 0.9814.
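The key property of SZ is its pointwise error guarantee. This is not the SZ algorithm itself, but a toy sketch of the same guarantee: a uniform quantizer with step 2*bound enforces |x - x'| <= bound, and the resulting integers compress far better losslessly (here the bound is assumed to mean 1% of the peak pixel value):

```python
import zlib

import numpy as np

def lossy_compress(img, bound):
    """Quantize so that reconstruction error never exceeds `bound`, then deflate."""
    q = np.round(img / (2.0 * bound)).astype(np.int32)
    return zlib.compress(q.tobytes(), 9), q.shape

def lossy_decompress(blob, shape, bound):
    q = np.frombuffer(zlib.decompress(blob), dtype=np.int32).reshape(shape)
    return q * (2.0 * bound)

rng = np.random.default_rng(1)
img = rng.normal(100.0, 5.0, size=(256, 256))   # stand-in detector frame
bound = 0.01 * np.abs(img).max()                # "1% pixel error" interpretation
blob, shape = lossy_compress(img, bound)
out = lossy_decompress(blob, shape, bound)
print(f"ratio = {img.nbytes / len(blob):.1f}x, "
      f"max err = {np.abs(img - out).max():.4f} (bound {bound:.4f})")
```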

Open questions: Can we veto missed events reliably? How well can we compress the data? Real-time feature extraction would be ideal. Can we save only a subset of the data? What metric can tell us to stop collecting data? Acknowledgements. LCLS: Christopher O'Grady, Mikhail Dubrovin, Silke Nelson, Clemens Weninger, Sioan Zohar. ANL: Franck Cappello, Dingwen Tao, Di Sheng.