Slide: 1 NeXus and Synchrotrons: Challenges and Requirements V.A. Solé – ESRF Software Group NeXus Data Format Workshop, PSI, May. 2010.

Slides:



Advertisements
Similar presentations
Workshop « HDF5 as hyperspectral data analysis format » ESRF January 11-13, 2010 Synchrotron SOLEIL Alain BUTEAU : Software for Controls and Data Acquisition.
Advertisements

NEFIS (WP5) Evaluation Meeting, November 2004 Evaluation Data Rights Aljoscha Requardt, University of Hamburg Response Rate: 91% - 10 of 11 partners.
The Quantum Chromodynamics Grid James Perry, Andrew Jackson, Matthew Egbert, Stephen Booth, Lorna Smith EPCC, The University Of Edinburgh.
Experiment Workflow Pipelines at APS: Message Queuing and HDF5 Claude Saunders, Nicholas Schwarz, John Hammonds Software Services Group Advanced Photon.
1 Projection Indexes in HDF5 Rishi Rakesh Sinha The HDF Group.
areaDetector Developments
Two main requirements: 1. Implementation Inspection policies (scheduling algorithms) that will extand the current AutoSched software : Taking to account.
EventStore Managing Event Versioning and Data Partitioning using Legacy Data Formats Chris Jones Valentin Kuznetsov Dan Riley Greg Sharp CLEO Collaboration.
1 CSI 101 Elements of Computing Fall 2009 Lecture #4 Using Flowcharts Monday February 2nd, 2009.
1.1 CAS CS 460/660 Introduction to Database Systems File Organization Slides from UC Berkeley.
Chapter 10.  Basic Functions  Insert Graphics, Audio/Video  Add Text  Create Links  Capture Brainstormed Ideas  Generate Outline  Organize Graphics,
RISK MANAGEMENT IN SOFTWARE ENGINEERING RISK MANAGEMENT IN SOFTWARE ENGINEERING Prepared by Prepared by Sneha Mudumba Sneha Mudumba.
University of Illinois at Urbana-ChampaignHDF 1McGrath/Yang 2/27/02 Transitioning from HDF4 to HDF5 Robert E. McGrath Kent Yang.
Succeeding as a Systems Analyst, Roles Class 2. First, some definitions Systems Development Specifying in detail how the many components of the information.
1 CF Unleashed: Introduction to Cf/Radial Joe VanAndel National Center for Atmospheric Research 2013/1/8 The National Center for Atmospheric.
A. Homs, BLISS Day Out – 15 Jan 2007 CCD detectors: spying with the Espia D. Fernandez A. Homs M. Perez C. Guilloud M. Papillon V. Rey V. A. Sole.
©2014 Experian Information Solutions, Inc. All rights reserved. Experian Confidential.
The Sardana device pool for SPEC lovers - BLISS Seminar - January 15, 2007 The Sardana device pool for SPEC lovers BLISS Seminar January 15, 2007 Tiago.
Laboratoire Interdisciplinaire sur l’Organisation Nanométrique et Supramoléculaire DIRECTION DES SCIENCES DE LA MATIERE IRAMIS TANGO at LIONS Olivier Taché.
DM_PPT_NP_v01 SESIP_0715_AJ HDF Product Designer Aleksandar Jelenak, H. Joe Lee, Ted Habermann Gerd Heber, John Readey, Joel Plutchak The HDF Group HDF.
University Prep Math Patricia van Donkelaar Course Outcomes By the end of the course,
Configuration Management (CM)
Lesson 3 McManus COP  You have to tell them ◦ what to do ◦ what to use ◦ in what order to do itand ◦ what to do if your user does not do what.
The netCDF-4 data model and format Russ Rew, UCAR Unidata NetCDF Workshop 25 October 2012.
Methodology - Conceptual Database Design. 2 Design Methodology u Structured approach that uses procedures, techniques, tools, and documentation aids to.
25th & 26th August 2009ICAT developer workshop 1.
Usability Issues Facing 21st Century Data Archives Joey Mukherjee and David Winningham
Detectors for Light Sources Contribution to the eXtreme Data Workshop of Nicola Tartoni Diamond Light Source.
Alastair Duncan STFC Pre Coffee talk STFC July 2014 The Trials and Tribulations and ultimate success of parallelisation using Hadoop within the SCAPE project.
2007. Software Engineering Laboratory, School of Computer Science S E Web-Harvest Web-Harvest: Open Source Web Data Extraction tool 이재정 Software Engineering.
Managing the Impacts of Change on Archiving Research Data A Presentation for “International Workshop on Strategies for Preservation of and Open Access.
240-Current Research Easily Extensible Systems, Octave, Input Formats, SOA.
Computer Systems & Architecture Lesson 4 8. Reconstructing Software Architectures.
“Live” Tomographic Reconstructions Alun Ashton Mark Basham.
DATABASE MANAGEMENT SYSTEM ARCHITECTURE
CAC01 – April 2010B11 – Data Format and Data Reduction Synchrotron SOLEIL Alain BUTEAU : Head of Controls and Data Acquisition software group) The Data.
F. Douglas Swesty, DOE Office of Science Data Management Workshop, SLAC March Data Management Needs for Nuclear-Astrophysical Simulation at the Ultrascale.
5-Oct-051 Tango collaboration status ICALEPCS 2005 Geneva (October 2005)
Database Management Systems (DBMS)
High Speed Detectors at Diamond Nick Rees. A few words about HDF5 PSI and Dectris held a workshop in May 2012 which identified issues with HDF5: –HDF5.
July 20, Update on the HDF5 standardization effort Elena Pourmal, Mike Folk The HDF Group July 20, 2006 SPG meeting, Palisades, NY.
Mantid Stakeholder Review Nick Draper 01/11/2007.
Integration of the ATLAS Tag Database with Data Management and Analysis Components Caitriana Nicholson University of Glasgow 3 rd September 2007 CHEP,
Réunion Contrôle Expérience 28/03/ Experiments Controls Vision, ideas, tasks to begin with … Alain Buteau Andy Götz.
Jean-Roch Vlimant, CERN Physics Performance and Dataset Project Physics Data & MC Validation Group McM : The Evolution of PREP. The CMS tool for Monte-Carlo.
PQDIF PQDIF: A Technical Overview Prepared by: Erich Gunther, Bill Dabbs, and Rob Scott Electrotek Concepts, Inc. NEW! IMPROVED!
With TANGO S. Poirier – Data management group.
Hyperion :High Volume Stream Archival Divya Muthukumaran.
Distributed Physics Analysis Past, Present, and Future Kaushik De University of Texas at Arlington (ATLAS & D0 Collaborations) ICHEP’06, Moscow July 29,
Barron’s chapter 5 Software development life cycle.
Copyright 2009 CSEM | PhD Seminar, 4 th June 2009| V. Revol | Page 0 X-ray Phase Contrast Imaging Vincent Revol Zürich,
Maitrayee Mukerji. INPUT MEMORY PROCESS OUTPUT DATA INFO.
1 30 th Tango collaboration meeting, June 2016 SPYC Project : CLI and scripting solution on top of Tango SPYC project : Command Line Interface and.
LSI Business Intelligence Initiative
22/04/2018 SOLEIL PROGRESS IN PROVIDING REMOTE DATA ANALYSIS SERVICES Majid OUNSY: Data Analysis Software Project Leader.
Module 11: File Structure
Moving from HDF4 to HDF5/netCDF-4
HDF5 for Real-Time and/or Embedded Test Data
Status report SOLEIL May 2009
Mark Rivers University of Chicago
Fernando Aguilar, IFCA-CSIC
Technical Expectations from Organizers & Participants B. Fougnie, S
Efficiently serving HDF5 via OPeNDAP
Introduction to Database Systems
Database Systems Instructor Name: Lecture-3.
Tomography at Advanced Photon Source
Simulation Framework Subproject cern
Implementation of ICT-related solutions
Diamond is all about data…
NSLS II High Data Rate Workshop May 2016
Presentation transcript:

Slide: 1 NeXus and Synchrotrons: Challenges and Requirements V.A. Solé – ESRF Software Group NeXus Data Format Workshop, PSI, May. 2010

Slide: 2 This talk Synchrotron needs NeXus challenges

Slide: 3 High Data Rate Experiments: ID15 Experiment: Absorption and phase contrast tomography Detector: 2k x 2k pixel 16 bit CMOS PCO camera Camera supported data rate: > 1000 full frame images/s – > 8 Gbytes/second 10 6 binned images/s Potentially a full tomography in less than 1s Bottleneck: Data transfer and storage (current limitation) Data analysis (if previous issue solved)‏ Data access: Most users come with their own disks to take data with them Data analysis often performed on site

Slide: 4 ID17 & ID19: The Malapa hominid study

Slide: 5 The Malapa hominid study Multiscale analysis performed on ID17 and ID19 in February 2010 Around 10 TB of basic data (radiographs)‏ Around 15 additional terabytes of data after processing 1 full month of calculation for reconstruction and artifact corrections with 100 processors, using at least 7 steps of processing. Probably 1-2 years of analysis before having the first important results on dental development and internal anatomy Acknowledgements : Paul Tafforeau, ESRF

Slide: 6 ID22 – Fluorescence-Diffraction Tomography Fluorescence Detector Diffraction CCD Sample Beam 1cm z x y 

Slide: 7 N y  N  Diffraction Images 1cm y  Sum Sinogram Sum Pattern Azimuthal Integrations N y  N  Diffraction Patterns Fit2d software Phase Sinograms  y PyMca software Reconstruction Capillary Calcite Ferrite Cubic sp3 25  m y x Acknowledgements: Pierre Bleuet CEA - Grenoble

Slide: 8 Currently Diffraction images in EDF or MarCCD format Fluorescence data in EDF or SPEC file format Scan information in SPEC file format Result of azimutal integration on Fit2D.chi format Preparing to move to HDF5 Data format issues Lesson learned: Try to avoid inventing a new data format Lesson NOT learned (yet?): Forget about ASCII just because you want to look at the file

Slide: 9 Synchrotron data format requirements Efficient format to store different data types Keep together motors, counters, images, spectra, … Compression support Widespread support Efficient and easy access to the data for visualization and analysis Versatile enough to be a one-for-all format HDF5

Slide: 10 Why HDF5? It fulfills the previous requirements It is supported on all common platforms and programming languages Long term support seems assured Standard file format for storing data from NASA's Earth Observing System Petabytes of data stored in HDF5 (Global Climate Change Research Program)‏

Slide: 11 Synchrotron needs in two words Efficiency & Versatility

Slide: 12 HDF5, what about NeXus? What we like about NeXus Well defined classes Application definitions A lot of endless discussions avoided What we do not like about NeXus A lot of endless discussions pending Not many analysis applications supporting it - At the ESRF only PyMca, others should follow. No easy way to implement new needs - Many synchrotrons are using NXdata as “data container” without setting signal and/or axis attributes and therefore not specifying a plot - A new need should imply a new group

Slide: 13 A beamline scientist friendly approach Advantages Simple to implement Answers current scientists demands (keep measurement data together, compression, …)‏ Compatible with NeXus if desired (specific NeXus groups can be written at any time with links and the opposite is also true) Can be seen/used as an intermediate step for not- yet-defined instruments or uses NXroot Top level. One per file. NXentry One group per measurement Measurement One group per measurement Positioners One group per Measurement Ex. All motor positions when the command was issued. ScalarData One group per Measurement Ex. Scanned motors and counters. Spectrum Several datasets per Measurement Ex. 1 spectrum dataset per MCA device ImageData Several datasets per measurement Ex. 1 image dataset per CCD device

Slide: 14 Data Archival Challenge The file should be able to provide EVERYTHING to perform the data analysis If NeXus achieves that, then it is a must and the ultimate archival format If it needs a complementary database, SOLEIL’s approach (Database + plugins + file indexing) seems more appropriate (but it can be achieved with HDF5 instead of NeXus)

Slide: 15 Data Analysis Reduced and analyzed data sharing NXdata seems very well suited to the job Independent and dependent variables Handling of units Raw data analysis More NeXus application definitions needed Should the actual applications provide them? Expressions to accommodate measured and requested data needed Is NeXus going to do something in this way? Should we join SOLEIL’s plugin approach?

Slide: 16 NeXus biggest challenge HDF5 NeXus

Slide: 17 Conclusion HDF5 is a good data format choice We expect to share data and analysis tools Conventions are certainly needed If they are not there, the analysis tools will impose them Try to keep things as simple as possible NXdata provides the simplest form of data exchange NeXus application definitions are an excellent idea Consider integration of Measurement group into NeXus Validation tools are critical to validate non NeXus API produced files (MatLab, IDL, …)‏