Automatic Data Processing at the ESRF

Slides:



Advertisements
Similar presentations
Beyond the Help Desk Getting ahead of the game Mihaela Damian Dan Sexton 11 th July 2013 CSCS > School of Clinical Medicine > University of Cambridge.
Advertisements

Validata Release Coordinator Accelerated application delivery through automated end-to-end release management.
Biologically Inspired AI (mostly GAs). Some Examples of Biologically Inspired Computation Neural networks Evolutionary computation (e.g., genetic algorithms)
Technology Steering Group January 31, 2007 Academic Affairs Technology Steering Group February 13, 2008.
1998/5/21by Chang I-Ning1 ImageRover: A Content-Based Image Browser for the World Wide Web Introduction Approach Image Collection Subsystem Image Query.
Task 3.5 Tests and Integration ( Wp3 kick-off meeting, Poznan, 29 th -30 th January 2002 Santiago González de la.
Design Patterns for Efficient Graph Algorithms in MapReduce Jimmy Lin and Michael Schatz University of Maryland Tuesday, June 29, 2010 This work is licensed.
Automated Tests in NICOS Nightly Control System Alexander Undrus Brookhaven National Laboratory, Upton, NY Software testing is a difficult, time-consuming.
Performance Testing Design By Omri Lapidot Symantec Corporation Mobile: At SIGiST Israel Meeting November 2007.
Version 4 for Windows NEX T. Welcome to SphinxSurvey Version 4,4, the integrated solution for all your survey needs... Question list Questionnaire Design.
Data Analysis I19 Upgrade Workshop 11 Feb Overview Short history of automated processing for Diamond MX beamlines Effects of adding Pilatus detectors.
1 Management Pain points now Existing tools: Do not map to virtual environments Provisioning Backup Health monitoring Performance monitoring / management.
On-demand Mobile CRM. Key Features You can have your own CRM application instantly Subscription based – no investment on hardware, software or hosting.
CCP4 Study Weekend 3rd January 2003 CCP4i - “Tricks and Tools” Peter Briggs CCP4 Daresbury.
Nightly Releases and Testing Alexander Undrus Atlas SW week, May
T.Jadczyk, Bioinformatics Applications in the Virtual Laboratory Bioinformatics Applications in the Virtual Laboratory Tomasz Jadczyk AGH University of.
My.umich.edu Partial Integration of Dynamic Services with Visual Design.
CH 13 Server and Network Monitoring. Hands-On Microsoft Windows Server Objectives Understand the importance of server monitoring Monitor server.
David Adams ATLAS DIAL: Distributed Interactive Analysis of Large datasets David Adams BNL August 5, 2002 BNL OMEGA talk.
Atomic structure model
The Protein Identifier Cross-Reference (PICR) service.
Welcome to the PRECIS training workshop
17 th October 2005CCP4 Database Meeting (York) CCP4i Database Overview Peter Briggs.
Bayesian Evolutionary Analysis by Sampling Trees (BEAST) LEE KIM-SUNG Environmental Health Institute National Environment Agency.
AUTOMATION OF MACROMOLECULAR DATA COLLECTION - INTEGRATION OF DATA COLLECTION AND DATA PROCESSING Harold R. Powell 1, Graeme Winter 1, Andrew G.W. Leslie.
ISPyB for MX at Diamond Pierre Aller. -Before beamtime Shipping preparation Sample registration -During beamtime Beamline status (remote) Puck allocation.
VIEWS b.ppt-1 Managing Intelligent Decision Support Networks in Biosurveillance PHIN 2008, Session G1, August 27, 2008 Mohammad Hashemian, MS, Zaruhi.
Dr Andrew Peter Hammersley ESRF ESRF MX COMPUTIONAL AND NETWORK CAPABILITIES (Note: This is not my field, so detailed questions will have to be relayed.
07/07/ KDD Services at the Goddard DAAC Robert Mack NASA/Goddard Space Flight Center Distributed Active Archive Center.
Solange Delageniere ESRF/TID/MIS
J2EE Platform Overview (Application Architecture)
VisIt Project Overview
Pierre Aller ISPyB for MX at Diamond.
Solange Delageniere ESRF/TID/MIS
Computer Applications
ISPyB December 4th, 2013 From sample to data analysis: how to track every step of an experiment in the ISPyB database. Marjolaine Bodin, ESRF/EXP/Structural.
Leveraging R and Shiny for Point and Click ADaM Analysis
WP18, High-speed data recording Krzysztof Wrona, European XFEL
Shared Services with Spotfire
Platform Overview Provide your marketing and sales groups with a single, integrated, web based on-demand platform that allows them to easily automate and.
Customer Relationship Management (CRM) An introduction and workshop
ISPyB for BioSAXS.
Introduction to Visual Basic 2008 Programming
Chapter 5: Using System Software
Post Processing Plugins: PPP
CCP4 6.1 and beyond: Tools for Macromolecular Crystallography
Accelerate define.xml using defineReady - Saravanan June 17, 2015.
Database Requirements for CCP4 17th October 2005
Complete automation in CCP4 What do we need and how to achieve it?
Cloud based Open Source Backup/Restore Tool
The Scheduling Strategy and Experience of IHEP HTCondor Cluster
Experimental Definition in SynchWeb for XPDF
Based on work by DoIT Network Services, UW-Madison
Current CRIMS – ISPyB communication status
Alfanar: Building Dynamic Sales and Service Teams with SAP® Hybris® Cloud for Customer Company alfanar Group​ Headquarters ​Riyadh, Saudi Arabia Industry.
SharePoint 2019 Changes Point of View.
Introduction to Informer
UNECE CES Steering group on SdG statistics – Task force on NRPs
Virtual Reality.
UNECE CES Steering group on SdG statistics – Task force on NRPs
Provide quick feedback to data collection experiments.
Direct Manipulation.
Storing and Accessing G-OnRamp’s Assembly Hubs outside of Galaxy
Overview of Workflows: Why Use Them?
HARDWARE OBJECTS - PERPETUUM MOBILE
M. Kezunovic (P.I.) S. S. Luo D. Ristanovic Texas A&M University
The site to download BALBES:
What's New in eCognition 9
Neal Kurande, WinaGodwin Anyanwu Jr., Adam Chau
Presentation transcript:

Automatic Data Processing at the ESRF Max Nanao

Automatic Processing – why? +Rapid feedback to user on data quality +Enables “value added” services +MR phasing +Ligand fitting +Automatic SAD +QA for us

Automatic processing at ESRF, History 2009 GrenADES (XDS wrapper) Coding and MXCuBE integration 2010 First version available –”fastproc” and “parallelproc” 2010 Automatic SAD incorporated 2012-13 EDNA-ification (Thomas Boeglin) 2015 XDSApp fallback added to GrenADES (Max) 2016 autoPROC added (Olof) 2016 DIALS added (via XIA2 and GrenADES) (Olof, Max) 2016 XDSApp added (Olof)

Automatic processing organisation MXCuBE BES OAR EDNAProc autoPROC XDSAPP XIA2 DIALS GrenADES launcher GrenADES fastproc GrenADES Parallelproc GrenADES DIALS OAR=Queuing system BES=Beamline Expert System

Current configuration: Compute Resources Current configuration: OAR queuing/scheduling *with a separate scheduler* 29 hosts 2-6 Gb RAM 3-3.3 Ghz AMD 564 “cores” GPFS Mean run time : 322.4 s Median run time: 160.0 s ~15k processings/month across all packages l ESRF Automatic data processing l January 17, 2017 l Max Nanao

Post processing tasks Processed data Summary Display on beamline Automatic SAD Molecular replacement from user model Ligand fitting from user SMILES Molecular replacement from unit cell (Nearest Cell) DIMPLE

Post processing tasks Processed data Summary Display on beamline Automatic SAD Molecular replacement from user model Molecular replacement from unit cell (Nearest Cell) DIMPLE

Why so many pipelines? While in general they perform similarly, no packages appear to consistently do better than others N.B. “normalized data”: +Statistics calculated with phenix.merging_statistics +Same resolution limits +Same Bravais lattice +Compare to Grenades

Riso calculated for the same dataset, same XDS, different wrapper Why so many pipelines Riso calculated for the same dataset, same XDS, different wrapper Program Mean vs Grenades Parallel (%) Median vs Grenades Parallel (%) EDNAproc (n=24k) 5.0 2.5 GrenADES fastproc (n=30k) 4.3 1.9 l ESRF Automatic data processing l January 17, 2017 l Max Nanao

 Many results, ~40 solved⃰ per month May, June July 2016 Recent work Currently Multiple space groups Multiple solvent contents Both hands  Many results, ~40 solved⃰ per month May, June July 2016 Notification by Local Contact email notification Inspecting the data collection directory tree  Integrate SAD results into web interface (Olof, Alejandro, Max) ⃰ Partial CC >25, Avg fragment size >10 datasets, not projects

Notification that phasing was successful

Complete view of all phasing trials PNG snapshot of PDB Interactive density viewer (UglyMol)

PNG Cartoon of SHELXE model

Interactive electron density

To DO +Molecular Replacement results into database and in EXI +Coordination with Global Phasing for AutoSHARP, PIPEDREAM? +More information from user (MXCubE, EXI) on sample – sequence, anom scatterer +Automatic merging of SSX data +HCA +Genetic Algorithm +User chooses which processing pipelines via MXCuBE?

People Olof Svensson Alejandro de Maria Daniele de Sanctis Matias Guijarro Thomas Boeglin Stephanie Monaco Antonia Beteva Solange Delageniere Darren Spruce Marjolaine Bodin Matthew Bowler Sean McSweeney Elspeth Gordon Gordon Leonard Andrew McCarthy

Mean vs Grenades Parallel (%) Median vs Grenades Parallel (%) Program Mean vs Grenades Parallel (%) Median vs Grenades Parallel (%) EDNAproc (n=24k) 5.0 2.5 GrenADES fastproc (n=30k) 4.3 1.9 XIA2 DIALS (n=1319) 13.5 11.2 autoPROC (n=776) 9.5 6.8 l ESRF Automatic data processing l January 17, 2017 l Max Nanao