Autoprocessing updates at the MX beamlines

Presentation transcript:

Autoprocessing updates at the MX beamlines Jun Aishima Postdoc – Monash University Advanced Molecular Imaging Centre of Excellence and Australian Synchrotron

Introduction to MX: feedback to users, processing software, analysis software

MX processing system outline
Expt. type: Indexing | Screening | Dataset
Availability: MX1 only | MX2 only | Both
Images collected: 1-2 images | 100 images | 1800+ images
Info obtained: Unit cell parameters | Dataset quality | Dataset
Example output (indexing): I23 78, 78, 78, 90, 90, 90

Software used to process data at the MX beamlines
- Custom pipeline software (Nathan Mudie): https://github.com/AustralianSynchrotron/mx-auto-dataset
- Custom web application software (Nathan Mudie): https://github.com/AustralianSynchrotron/mx-sample-manager
- xdsme, with my upgrades to handle an inverted rotation axis, improve integration speed, etc. (BSD license, following from the original): https://github.com/JunAishima/xdsme
- Original xdsme by Pierre Legrand (Synchrotron Soleil) (BSD license): https://github.com/legrandp/xdsme
- XDS: http://xds.mpimf-heidelberg.mpg.de/
- Pointless, Aimless (CCP4)
- Sadabs and Xprep for CX data

What additional features are there beyond just processing the data?
- Merging datasets
- Dataset plots
- Retriggering (reprocessing)
- Results spreadsheet
The whole goal is to help users evaluate their data, get the best out of it, and then keep track of the results.

New features
- The Eiger 16M detector
- Rastering plots
[Slide note: put in rastering tab pic - show diffraction image, move to]

Rastering plots

Eiger is a pixel array detector (PAD) [Figure panels: CCD, PAD]

How is the Eiger different from previous detectors?
Data size:
- Eiger test crystal dataset, 180 degrees, 0.1 degrees/image: 9.5 GB
- ADSC test crystal dataset, 180 degrees, 1.0 degree/image: 800 MB
Collection speed:
- Eiger test crystal dataset: 18 sec, continuous collection (3 microsecond readout)
- ADSC test crystal dataset: 7.2 minutes, ~1.4 sec readout per 1 second image
Data rate: hundreds of times faster!
HDF5 files: a master file and multiple data files in NXmx format (full metadata necessary for processing)
Typically 200 images per data file; experiments have 1800 images (9 files)
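The "hundreds of times faster" figure can be checked from the numbers on this slide. The sketch below is only back-of-the-envelope arithmetic; it assumes 1 GB = 1000 MB and treats the quoted dataset sizes and collection times as exact, which the slide does not claim.

```python
import math

def data_rate_mb_per_s(size_mb, seconds):
    """Average data rate for one dataset, in MB/s."""
    return size_mb / seconds

# Figures quoted on the slide (assumption: 1 GB = 1000 MB).
eiger_size_mb = 9.5 * 1000   # 9.5 GB
eiger_time_s = 18.0          # continuous collection
adsc_size_mb = 800.0         # 800 MB
adsc_time_s = 7.2 * 60       # 7.2 minutes

eiger_rate = data_rate_mb_per_s(eiger_size_mb, eiger_time_s)
adsc_rate = data_rate_mb_per_s(adsc_size_mb, adsc_time_s)
speedup = eiger_rate / adsc_rate

# Cross-check of the ADSC time: 180 images, each 1 s exposure + ~1.4 s readout.
adsc_time_check_s = 180 * (1.0 + 1.4)

# Number of HDF5 data files for a 180 degree sweep at 0.1 degrees/image,
# with 200 images per data file.
n_images = round(180 / 0.1)
n_data_files = math.ceil(n_images / 200)

print(f"Eiger: {eiger_rate:.0f} MB/s, ADSC: {adsc_rate:.2f} MB/s, "
      f"ratio: {speedup:.0f}x, data files: {n_data_files}")
```

With these inputs the ratio comes out at roughly 285, consistent with the claim on the slide, and the file count matches the 9 data files quoted.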

What is the real effect of the Eiger on data produced on the beamlines? 1 TB typical on a single visit

How much did we need to change the processing software to integrate the Eiger?
The good news:
- When we installed the Eiger 16M on MX2, xdsme had already been upgraded to support Eigers
- XDS also had support for the Eiger
- In terms of processing, not much code had to be rewritten
- Testing infrastructure also helped us rapidly develop and test new code while both beamlines were still running CCD detectors
=> With the new detector we get, for free, all of the features we have developed over the past few years (merging, retriggering, web application, results spreadsheet)!
But the large amount of data makes processing slow!

Current Eiger processing system outline - MX2
Components: Data Acquisition System; Web app (http://processing/retrigger/submit); autodataset
autodataset workers: autodataset-cpu1,2,3 (host: EPU); autodataset (host: EPU02); autodataset (host: EPU03); autodataset (host: ASCI05); autodataset (host: ASCI06)
Some processing (integration in XDS) is split between multiple nodes
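As an illustration of splitting integration between multiple nodes, the sketch below divides a dataset's frames into contiguous, near-equal ranges, one per host. This is a hypothetical helper, not the autodataset implementation: the function name `split_frames` is invented here, the host names are simply those shown on the slide, and the talk does not describe how the real pipeline allocates work.

```python
def split_frames(n_frames, hosts):
    """Divide frames 1..n_frames into contiguous, near-equal ranges per host.

    Returns a dict mapping host -> (first_frame, last_frame), inclusive.
    Hosts that would receive no frames are left out of the result.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    base, extra = divmod(n_frames, len(hosts))
    ranges = {}
    start = 1
    for i, host in enumerate(hosts):
        # The first `extra` hosts take one additional frame each.
        count = base + (1 if i < extra else 0)
        if count == 0:
            continue
        ranges[host] = (start, start + count - 1)
        start += count
    return ranges

# Host names as they appear on the slide; the 1800-frame dataset is the
# typical experiment size quoted in the talk.
hosts = ["EPU", "EPU02", "EPU03", "ASCI05", "ASCI06"]
plan = split_frames(1800, hosts)
for host, (first, last) in plan.items():
    print(f"{host}: frames {first}-{last}")
```

Contiguous ranges are used because each node can then read a consecutive run of images; any scheme that covers every frame exactly once would serve the same purpose.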

In progress: Australian Synchrotron Computing Infrastructure (ASCI) processing cluster
Components: Data Acquisition System; Web app (http://processing/retrigger/submit); autodataset
autodataset hosts: node 1 through node 14

What’s coming up in the next year?
- Collection strategies: compare with ADSC
- Sample services: collection strategies, sample tracking
These mean better automatic processing and combining datasets that should be combined, so that scientists can think about the science, not each individual dataset.

Summary
- Eiger: main dataset processing, retriggering, merging, results spreadsheet
- Rastering plots: plots, show diffraction image, move to image location
- In progress: ASCI processing cluster
- Future: collection strategies, sample tracking, LCP injector

Thanks to…
- The MX beamline: Tom Caradoc-Davies, David Aragao, Daniel Eriksson, Santosh Panjikar, Jason Price, Alan Riboldi-Tunnicliffe, Rachel Williamson
- Scientific Computing group: Andreas Moll, John Marcou, Robbie Clarken, Nathan Mudie, Ron Bosworth
- Australian Cancer Research Foundation, for helping fund the Eiger 16M detector
- Users, for being patient, requesting features, and providing feedback on our systems
- High Data Rate MX meeting: great advice and collaboration opportunities with other beamline scientists and Dectris to help deal with the Eiger detectors