Graeme Winter STFC Computational Science & Engineering


1 Data Analysis Systems
Graeme Winter, STFC Computational Science & Engineering

2 Caveat
Methods developer for macromolecular crystallography (MX)
Synchrotron as a user facility
Broad view

3 Context
Online Proposal System; User Office System incl.: User Database; Scheduling; Health and Safety; Proposal Management; Single Sign On; Account Creation and Management; Diagnostics; Metadata Catalogue; Data Acquisition System; Data Analysis & Feedback; DataPortal; Storage Management System; Data Pre-Reduction

4 Overview
Data rates
Users & expectations
From the outside: automation in MX
Illustrative experiments
Analysis methods & hardware
Conclusions

5 Data Rates: XFEL guess
4TB = ~ 7 minutes (proto 1k)
4TB = ~ 30 seconds (full 4k)
@ 5120 frames / second
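The figures on this slide can be roughly reproduced with a few lines of arithmetic. This is only a sketch under assumptions the slide does not state: "1k" and "4k" are read as square detectors of 1024 and 4096 pixels per side, pixels are taken to be 2 bytes each, and "4TB" is read as 4 TiB.

```python
# Assumptions (not stated on the slide): square detectors of 1024 and
# 4096 pixels per side, 2 bytes per pixel, and "4TB" read as 4 TiB.
FRAMES_PER_SECOND = 5120      # from the slide
BYTES_PER_PIXEL = 2           # assumed 16-bit pixels
STORAGE_BYTES = 4 * 2**40     # "4TB", taken as 4 TiB


def seconds_to_fill(pixels_per_side: int) -> float:
    """Time to write STORAGE_BYTES at the full frame rate."""
    frame_bytes = pixels_per_side**2 * BYTES_PER_PIXEL
    bytes_per_second = frame_bytes * FRAMES_PER_SECOND
    return STORAGE_BYTES / bytes_per_second


for label, side in (("proto 1k", 1024), ("full 4k", 4096)):
    print(f"{label}: {seconds_to_fill(side):.1f} s")
```

Under these assumptions the prototype fills 4 TiB in about 410 s (just under 7 minutes) and the full detector in about 26 s, the same order as the slide's estimates.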

6 Aside: Problem in MX
DAQ rates ~ 20% sustained @ 70MB/s
Typical data set ~ GB
Data reduction memory / disk bandwidth limited
Computer architecture (e.g. clusters / NFS) for data reduction... is no good!

7 User Questions
Who are the users?
What do they expect?

8 Expectations
XFEL = Tool = Measurements, or
XFEL = Experiment = Discoveries
The latter is fine (somebody else's problem); the former presents us with problems

9 Why compare to MX?
Area detectors
Users awkward!
Real time analysis quite advanced
Frequently non-technical users

10 Beamline End-station GUI not script / CLI

11 Data Handling
Detector systems:
Distortion correction
Background subtraction
Compression
Add metadata

12 Automation in MX
Measurements not experiment, however:
Measure carefully
Sample lifetime unknown
Sample generally uncharacterised

13 Automated Data Collection
Experimental systems exist:
Characterise sample
Measure data
Perform initial data reduction

14 Automated Data Reduction
Expert systems exist to:
Thoroughly reduce data
Provide information for downstream analysis – spacegroup, resolution limits

15 MX: Summary
Easy cases – automatic sample to structure
Hard cases – still hard but automation really helps
Hardware & software are mature – suitable for biologists

16 Assumptions for discussion
Area detector will be tiled array
Computing budget will be limited
One objective is to image biological structures

17 Illustrative XFEL Experiments
Consider two kinds of “experiment”:
Single molecule imaging
Single-shot crystallography* / nano-MX
*One shot per crystal, many tiny crystals

18 Assumptions: Single Molecule
Flow of molecules past the beam – by “magic” (SEP)
Some pulses will hit molecule, some will not
Will need ~ 10^9 images for useful reconstruction (Shneerson, 2008)
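To get a feel for what this image count means in beam time, here is a back-of-envelope sketch. The slide's figure is read as 10^9 images; the 10% hit fraction is purely hypothetical, and the 5120 pulses per second simply reuses the frame rate quoted on the data-rates slide.

```python
def collection_time_seconds(images_needed: float,
                            hit_fraction: float,
                            pulses_per_second: float) -> float:
    """Expected beam time to record `images_needed` hits."""
    if not 0.0 < hit_fraction <= 1.0:
        raise ValueError("hit_fraction must be in (0, 1]")
    pulses_needed = images_needed / hit_fraction
    return pulses_needed / pulses_per_second


# 1e9 images is from the slide; the 10% hit fraction is hypothetical and
# 5120 pulses/s reuses the frame rate quoted on the data-rates slide.
hours = collection_time_seconds(1e9, 0.1, 5120) / 3600
print(f"{hours:.0f} hours")
```

Even with every pulse hitting a molecule this is more than two days of continuous running, which is why filtering out the misses early matters.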

19 Outcome: Single Molecule
Emphasis will be on images – no time-dependence
Filtering will be critical
Reforming the images at the earliest possible time will be important
Analysis will build on existing reconstruction techniques

20 Analysis
Filter / correct tiles
Reconstruct images
Compute orientations
Accumulate
Assume perhaps 1 sample / few pulses
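The "reconstruct images" step, in its simplest form, amounts to stitching the detector tiles back into one frame. The sketch below assumes a regular grid of equally sized tiles with no gaps, rotation or distortion, which a real tiled detector would not satisfy; the `assemble` helper and the 2 x 2 layout are hypothetical.

```python
from typing import List

Tile = List[List[int]]  # one tile as rows of pixel values


def assemble(tiles: List[List[Tile]]) -> List[List[int]]:
    """Stitch a grid of equally sized tiles into one image (row-major)."""
    image = []
    for tile_row in tiles:
        tile_height = len(tile_row[0])
        for y in range(tile_height):
            row = []
            for tile in tile_row:
                # Append row y of each tile, left to right.
                row.extend(tile[y])
            image.append(row)
    return image


# Hypothetical 2 x 2 layout of 2 x 2 pixel tiles.
a = [[1, 1], [1, 1]]
b = [[2, 2], [2, 2]]
c = [[3, 3], [3, 3]]
d = [[4, 4], [4, 4]]
image = assemble([[a, b], [c, d]])
```

Because each tile can be filtered and corrected independently before this step, the per-tile work parallelises naturally; only the final stitch needs all tiles in one place.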

21 Assumptions: Nano-MX
1 sample / pulse train so 10 Hz rate so we have time series
Sample will survive > 1 pulse
~ 10^4 samples per data set

22 Outcome: nMX
Emphasis on pixel time series
Extrapolate to dose = 0
Later reconstruct and analyse data
Will build on existing MX methods – including expert systems & CCP4
Not what XFEL was designed for but…
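One simple way to extrapolate a pixel time series to dose = 0 is a straight-line least-squares fit, taking the intercept as the zero-dose value. The linear decay model is an assumption made here for illustration (the slides do not specify one), and `zero_dose_intensity` is a hypothetical helper. The standard error of the intercept comes out of the same fit.

```python
import math
from typing import Sequence, Tuple


def zero_dose_intensity(doses: Sequence[float],
                        counts: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, sigma) of a least-squares line counts = a + b * dose."""
    n = len(doses)
    if n != len(counts) or n < 3:
        raise ValueError("need at least three matched (dose, count) pairs")
    mean_x = sum(doses) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    if sxx == 0:
        raise ValueError("doses must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(doses, counts))
    variance = residual_ss / (n - 2)
    # Standard error of the intercept of a simple linear regression.
    sigma = math.sqrt(variance * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, sigma


# Hypothetical pixel: counts fall by 2 per unit dose from a true value of 100.
value, sigma = zero_dose_intensity([1, 2, 3, 4, 5], [98, 96, 94, 92, 90])
```

Applied to every pixel independently, this is a memory-bandwidth-bound operation over the whole pulse train rather than over single images.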

23 Analysis
Filter / correct
Construct d = 0 image with σ estimate
Index, cluster (different crystal forms)
Integrate, rescale, cluster again
Accumulate h, k, l, I, σ(I)
Estimate remaining measurement time
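The last two steps can be sketched as a running merge of observations per reflection plus a simple time estimate. This assumes observations are already indexed and on a common scale, uses inverse-variance weighting as one common choice of merging scheme, and ignores symmetry. `ReflectionAccumulator` and `remaining_seconds` are hypothetical names; the 10^4 samples and 10 Hz figures are taken from the nano-MX assumptions slide.

```python
from collections import defaultdict
from typing import Dict, Tuple

HKL = Tuple[int, int, int]


class ReflectionAccumulator:
    """Running inverse-variance weighted mean of I for each (h, k, l)."""

    def __init__(self) -> None:
        self._sum_w: Dict[HKL, float] = defaultdict(float)
        self._sum_wi: Dict[HKL, float] = defaultdict(float)

    def add(self, hkl: HKL, intensity: float, sigma: float) -> None:
        if sigma <= 0:
            raise ValueError("sigma must be positive")
        weight = 1.0 / sigma ** 2
        self._sum_w[hkl] += weight
        self._sum_wi[hkl] += weight * intensity

    def merged(self, hkl: HKL) -> Tuple[float, float]:
        """Return (mean I, sigma of the mean) for one reflection."""
        if hkl not in self._sum_w:
            raise KeyError(hkl)
        w = self._sum_w[hkl]
        return self._sum_wi[hkl] / w, w ** -0.5


def remaining_seconds(samples_done: int, samples_target: int,
                      samples_per_second: float) -> float:
    """Time left if samples keep arriving at a constant rate."""
    return max(samples_target - samples_done, 0) / samples_per_second


acc = ReflectionAccumulator()
acc.add((1, 0, 0), 100.0, 10.0)
acc.add((1, 0, 0), 110.0, 10.0)
mean_i, sigma_i = acc.merged((1, 0, 0))

# 10^4 samples per data set at 10 Hz, with 2500 samples already measured.
time_left = remaining_seconds(2500, 10_000, 10.0)
```

A more useful stopping rule would look at completeness or merged signal-to-noise rather than a raw sample count, but the bookkeeping is the same.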

24 Analysis Methods
From simpler to more valuable:
Filtering & correction
Compression
Analysis
Feedback

25 Filtering / correction
Identify “bad pixels”
Remove “blank” images
Deconvolve time structure of pulses / measurement effects
Include corrections for scattering angle
Incorporate metadata
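The first two items can be sketched in a few lines. The criteria used here are assumptions chosen for illustration: a pixel is "bad" if it sits at or above a hot level in every frame, and a frame is "blank" if its summed counts, excluding bad pixels, fall below a threshold. Function names and thresholds are hypothetical.

```python
from typing import List

Frame = List[int]  # one flattened detector image


def bad_pixel_mask(frames: List[Frame], hot_level: int) -> List[bool]:
    """Flag pixels that are at or above hot_level in every frame."""
    if not frames:
        return []
    n_pixels = len(frames[0])
    return [all(f[i] >= hot_level for f in frames) for i in range(n_pixels)]


def drop_blank_frames(frames: List[Frame], mask: List[bool],
                      min_total: int) -> List[Frame]:
    """Keep frames whose summed counts over good pixels reach min_total."""
    kept = []
    for frame in frames:
        total = sum(v for v, bad in zip(frame, mask) if not bad)
        if total >= min_total:
            kept.append(frame)
    return kept


frames = [
    [0, 0, 65535, 0],   # blank apart from a hot pixel
    [5, 9, 65535, 7],
    [4, 8, 65535, 6],
]
mask = bad_pixel_mask(frames, hot_level=65535)
kept = drop_blank_frames(frames, mask, min_total=10)
```

Masking before summing matters: without it the hot pixel alone would make the blank frame look like a hit.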

26 Compression
If SMI, much of the image will have zero / low counts
If nMX, make use of the fact that image j and j + 1 will be similar
MX detectors already do this – CBF
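Both ideas can be illustrated with toy encoders. This is a sketch only and is not the CBF compression scheme: the sparse encoder stores just the non-zero pixels (the SMI case), and the delta encoder stores pixel-by-pixel differences between consecutive frames (the nMX case), which a later byte-packing stage could then store compactly. All function names are hypothetical.

```python
from typing import List, Tuple


def sparse_encode(frame: List[int]) -> List[Tuple[int, int]]:
    """Store only the non-zero pixels as (index, value) pairs."""
    return [(i, v) for i, v in enumerate(frame) if v != 0]


def sparse_decode(pairs: List[Tuple[int, int]], n_pixels: int) -> List[int]:
    """Rebuild the full frame from (index, value) pairs."""
    frame = [0] * n_pixels
    for i, v in pairs:
        frame[i] = v
    return frame


def delta_encode(previous: List[int], current: List[int]) -> List[int]:
    """Differences between frame j and frame j + 1, pixel by pixel."""
    return [c - p for p, c in zip(previous, current)]


def delta_decode(previous: List[int], deltas: List[int]) -> List[int]:
    """Recover frame j + 1 from frame j and the stored differences."""
    return [p + d for p, d in zip(previous, deltas)]


# SMI-like frame: mostly zeros.
smi_frame = [0, 0, 3, 0, 0, 0, 1, 0]
pairs = sparse_encode(smi_frame)

# nMX-like frames: consecutive images differ only slightly.
frame_j = [100, 250, 90, 40]
frame_j1 = [99, 248, 90, 41]
deltas = delta_encode(frame_j, frame_j1)
```

Both schemes are lossless, so the original frames can always be recovered with the matching decoder.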

27 Analysis
SMI: how close to objective are we – how much can we do with remaining time?
nMX: how much data are accumulated? What is the quality like?

28 Feedback
SMI: Change sample rate / pulse structure – improve data
nMX: Improve data collection process / stop when experiment complete

29 Challenges
Synchronisation with beam parameters
Image reconstruction from tiles
Deconvolution of time structure
Getting dynamic feedback
Robustness

30 Computing Hardware
Hardware at least as important as software
Detector is tiled (parallel) so emphasise parallel computing as far as possible

31 Factors
Memory bandwidth – filtering, image transpose
Floating point horsepower – deconvolution, FFT
Thanks to games…

32 Benefits
Custom hardware (e.g. Cell, GPGPU) allows much more FP horsepower than a vanilla CPU
Memory can be managed properly
Libraries available for e.g. FFT
Analysis can be timed to fit

33 Costs
Memory must be managed properly
Novel architecture (need to recompile)
Hardware costs (complicated)
Interfacing (I have no idea)

34 Conclusions
XFEL DAQ with area detectors presents a challenge (evidently)
User facility would require solid, well thought out computing infrastructure
Automation & real-time feedback can help to get maximum value from XFEL

35 Conclusions
Science determines computational architecture (e.g. time series vs. image)
MX as a technique is mature, so good role model

36 Acknowledgements
EU FP7 for supporting pre-XFEL work
EU FP6 BioXHit and UK BBSRC for supporting xia2 and DNA development
Scientists & engineers at DL for coffee-time discussions

