ARCS Workshop on Software for Data Analysis of Inelastic Scattering Data, 15-16 March, 2002, California Institute of Technology. Ray Osborn, Argonne National Laboratory.


ARCS Workshop on Software for Data Analysis of Inelastic Scattering Data
15-16 March, 2002
California Institute of Technology
Ray Osborn
Argonne National Laboratory

Aims of a Common Format
- Remove need for local expertise
- Reduce number of conversion utilities
- Reduce redundant software development
- Increase cooperation in software development
- Increase sophistication of visualization software
- Increase utility of generic software
- Remove problems of data portability
- Help when all documentation is lost

Criteria of a Modern Data Format
- It must be portable
- It must be self-describing
- It must be extensible
- It must be flexible in data organization
- It must be efficient in data storage
- It must be available in the public domain

History of Format
Three parallel developments have led to NeXus:
1. Jon Tischler (ORNL) proposed an HDF-based format as a standard for data storage at the Advanced Photon Source (Argonne National Laboratory)
2. Mark Koennecke (PSI) made a similar proposal using netCDF for the European neutron scattering community while working at the ISIS pulsed neutron facility
3. Przemek Klosowski (NIST) produced a first draft of the NeXus proposal drawing on ideas from both sources
- This formed the basis for the current design of the NeXus standard, which was developed at two workshops, SoftNeSS'95 (NIST, Sept. 1995) and SoftNeSS'96 (Argonne, Oct. 1996), attended by representatives of a range of neutron and x-ray facilities.
- The NeXus API was released in late 1997.

Choice of HDF
Hierarchical Data Format (HDF), developed at NCSA (UIUC)
- Portable - Macs, VMS, U**x, Windows NT/98
- Extensible - Add anything you like when you like
- Self-describing - you don't need to have a manual
- Binary (with internal compression)
- Hierarchical - i.e. comprehensible and flexible
- Widely used - e.g. astronomers, geophysicists
- Commercially accessible - IDL, IGOR, PVwave

What's the Alternative?
- imgCIF/CBF
  - There is a proposal to extend the Crystallography Information File (CIF) format to include binary images. This has insufficient flexibility for all neutron/x-ray instrumentation.
- FITS
  - The Flexible Image Transport System, the astronomical data format, is not self-describing.
- ISO STEP/Express
  - This is a standard for describing database structures rather than a data format itself.
- netCDF
  - The Unidata netCDF standard is a flat-file format. This means that all data sets must have unique names and cannot be organized into hierarchical groups. Is no longer developed.
- XML
  - The eXtensible Markup Language is gaining widespread acceptance as a means of organizing database information, e.g. for web display. May have a role in NeXus.

What is NeXus?
- a set of subroutines - the NeXus API
  - to make reading and writing NeXus data easy
- a set of design principles
  - to help people to understand what is in the files
- a set of instrument definitions
  - to allow development of more portable analysis software

Myths about NeXus
- HDF is too complicated
  - the NeXus API is conceptually extremely simple
- HDF does not have adequate performance
  - HDF5 is state-of-the-art in performance
  - Automatic data compression can speed I/O
  - NeXus does not appear to impact HDF performance
- NeXus is only for storing raw data
  - it can store any kind of data
  - in principle, both raw and analyzed data can be stored in the same file transparently
  - it would also be ideal for Monte Carlo results

Example NeXus Program in F90

    program NXlrmecs
       use NXUmodule
       ...
    !Open NeXus output file and write global attributes
       if (NXopen ("sys$scratch:lrcs"//run_no//".nxs", NXACC_CREATE, file_id) /= NX_OK) stop
       if (NXUwriteglobals (file_id, user_name, "Argonne National Laboratory", "Argonne, IL 60439, USA", &
          "(630) ", "(630) ", /= NX_OK) stop
       if (NXmakegroup (file_id, entry, "NXentry") /= NX_OK) stop
    !Open NXsource
       if (NXmakegroup (file_id, "source", "NXsource") /= NX_OK) stop
       if (NXUwritedata (file_id, "distance", -L1, "m") /= NX_OK) stop
       if (NXUwritedata (file_id, "moderator", "liquid methane") /= NX_OK) stop
       if (NXclosegroup (file_id) /= NX_OK) stop
    !Open NXdata
       if (NXmakegroup (file_id, "data", "NXdata") /= NX_OK) stop
       if (NXUwritedata (file_id, "title", char_value) /= NX_OK) stop
       if (NXUwritedata (file_id, "data", sgarray%counts, "counts") /= NX_OK) stop
       if (NXputattr (file_id, "signal", 1) /= NX_OK) stop
       if (NXmakelink (file_id, time_id) /= NX_OK) stop
       if (NXmakelink (file_id, phi_id) /= NX_OK) stop
       if (NXclosegroup (file_id) /= NX_OK) stop
       if (NXclose (file_id) /= NX_OK) stop
    end program NXlrmecs

Current Status
- NeXus is in regular use at a number of facilities
- The NeXus design is on the web
  - The core API is available for downloading
  - C, F77, F90, Java, IDL
- The NXdict and utility API simplifies certain tasks
  - C, F90
- NeXus Data Server released in 2000
  - Allows pure Java browsers
- Various browsers now exist
- Some visualization software recognizes NeXus files
  - LAMP, open Genie, ISAW

Facility Support
APS          recommended standard for APS CATs
FRM-II       under development as format for new reactor instruments
ILL          accepted as an interchange format; readable by LAMP
IPNS         will use as the run-file format in future DAS
ISIS         being used in open Genie and under development for new instruments
JKJ project  proposed as run-file format for new facility
KEK          under development as run-file format for new instrument
LANSCE       under development as format for new instruments, e.g. HIPPO
NIST         to be used as format for new instruments, e.g. Disk-Chopper Spectrometer
LLB          in use on several spectrometers
PSI          in use as run-file format for current instruments
µSR          under development as a standard interchange format
SNS          proposed as standard run format as long as based on HDF5

NeXus API Team
The main people involved in developing the NeXus API are:
- Mark Koennecke, PSI, Switzerland
- Przemek Klosowski, NIST, USA
- Freddie Akeroyd, ISIS, UK
- Ray Osborn, ANL, USA

NeXus Design Principles
The design of the NeXus data files should follow two principles as much as possible:
- Files must be completely self-explanatory
  - no need for local information, e.g. zero angles, focusing formulae
- Data must be automatically plottable
  - automatic identification of axes, titles, units, etc.

NeXus Objects
There are only three types of data object in NeXus:
- Data
  - scalar or multidimensional arrays
  - integer (1, 2, or 4 bytes), real (single or double), or character
  - equivalent to HDF SDSs
- Data attributes
  - meta-data attached to a data item, e.g. units
- Groups
  - folders containing sets of data items and/or other groups
  - can have any name, but must have a predefined class, e.g. NXsample
  - equivalent to HDF4 Vgroups
  - group attributes not explicitly used in NeXus (although they are used to define group classes in HDF5)
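The three object types can be pictured with a small in-memory model. This is only an illustrative sketch in plain Python, not the NeXus API: the class names `DataItem` and `Group`, the `add` method, and the sample temperature value are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Union


@dataclass
class DataItem:
    """A scalar or array value plus its data attributes (e.g. units)."""
    value: Any
    attrs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Group:
    """A folder with a predefined class holding data items and other groups."""
    nx_class: str
    children: Dict[str, Union["Group", DataItem]] = field(default_factory=dict)

    def add(self, name, child):
        # Names must be unique within one group.
        if name in self.children:
            raise ValueError(f"duplicate name in group: {name}")
        self.children[name] = child
        return child


# Hypothetical content: an entry holding a sample with one data item.
entry = Group("NXentry")
sample = entry.add("sample", Group("NXsample"))
sample.add("temperature", DataItem(4.2, {"units": "K"}))
```

The group's class is stored separately from its name, which mirrors the rule that a group can have any name but must have a predefined class.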

NeXus Hierarchies
[Diagram. Entries: Run1101 (NXentry), Run1102 (NXentry). Groups shown: sample (NXsample), monitor (NXmonitor), data (NXdata). Data items shown: start_time, counts, time_of_flight, integral]

NeXus Classes
- Every NeXus group is assigned both a name and a class.
- A NeXus class defines the expected contents of the group.
- It is not necessary for every variable defined for a class to be present in every instance of that class.
- NeXus class names begin with NX, followed without a break by a lower-case word, with underscores used to separate words.
- In general, there can be more than one group of the same class, although they must not have the same name.
- The NX class names are a defined part of the NeXus standard and may not be modified by the user, i.e. if users want to define their own classes, they must not use the NX prefix.
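The class-naming rule lends itself to a mechanical check. The sketch below is one possible reading of the rule as a regular expression; the pattern and the function names `is_standard_class` and `is_valid_user_class` are assumptions made for illustration and are not part of the NeXus API.

```python
import re

# Assumed reading of the rule: "NX", then lower-case words joined by underscores.
NX_CLASS = re.compile(r"NX[a-z]+(?:_[a-z]+)*")


def is_standard_class(name):
    """True if the name has the shape of a standard NeXus class name."""
    return NX_CLASS.fullmatch(name) is not None


def is_valid_user_class(name):
    """User-defined classes must not use the NX prefix."""
    return not name.startswith("NX")
```

For example, `is_standard_class("NXsample")` holds, while a user-defined class would need a name without the NX prefix to pass `is_valid_user_class`.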

Structure of NeXus Files

NXentry Class
[Diagram. Entries: Histogram1 (NXentry), Histogram2 (NXentry). Contents shown: sample (NXsample), LRMECS (NXinstrument), monitor1 (NXmonitor), monitor2 (NXmonitor), data (NXdata), start_time]

Simplest NeXus file
Scan (NXentry)
  data (NXdata)
    counts
    two_theta
N.B. The programmers who produce intermediate files for storing analyzed data should agree on simple interchange rules.

NXentry Class
- This is the only group class allowed at the top.
- It should contain at least one plottable dataset.
- It contains data that can sensibly be classified as a single data entry.
- It should contain all the data necessary for the intended analysis.

NXinstrument
[Diagram. Components: NXsource, NXchopper, NXcollimator, NXguide, NXdetector]
The NXinstrument group contains all the beamline components
- defined by their distance from the sample
- positive (negative) distances are after (before) the sample
- the sample is not considered a beamline component
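Because every component carries a signed distance from the sample, the beamline order can be recovered by sorting. The sketch below assumes a simple list of tuples; the component names and distances are invented for illustration and do not describe any real instrument.

```python
# (name, class, distance from sample in metres); all values are hypothetical.
components = [
    ("detector", "NXdetector", 2.5),
    ("source", "NXsource", -20.0),
    ("chopper", "NXchopper", -6.0),
    ("guide", "NXguide", -12.0),
]

# Sorting by signed distance gives the order along the beam.
beamline = sorted(components, key=lambda c: c[2])

# Negative distances are before the sample, positive distances after it.
upstream = [name for name, _, distance in beamline if distance < 0]
downstream = [name for name, _, distance in beamline if distance > 0]
```

The sample itself has no entry in the list, in line with the rule that it is not considered a beamline component.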

Structure of NeXus Data Groups
NXdata groups encapsulate plottable data
- Multi-dimensional data
- Dimension scales
- Axis labels
- Title

Defining NeXus data (C Version of the API)
data (NXdata)
  errors[n,m]
  counts[n,m] (signal=1, units="counts")
  time_of_flight[m] (axis=1, units="microseconds")
  two_theta[n] (axis=2, units="degrees")

Defining NeXus data (Fortran Version of the API)
data (NXdata)
  errors(m,n)
  counts(m,n) (signal=1, units="counts")
  time_of_flight(m) (axis=1, units="microseconds")
  two_theta(n) (axis=2, units="degrees")
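The C and Fortran declarations reverse the dimension order because C stores arrays row-major and Fortran stores them column-major; both describe the same layout, with time_of_flight varying fastest. The following sketch checks this with plain offset arithmetic. The function names and the sizes n = 3, m = 4 are arbitrary choices for illustration.

```python
def c_offset(i, j, n, m):
    """Row-major offset of counts[i][j] in an array declared counts[n][m]."""
    return i * m + j


def fortran_offset(j, i, m, n):
    """Column-major offset of counts(j+1, i+1) in an array declared counts(m, n)."""
    return j + i * m


n, m = 3, 4  # n two_theta values, m time_of_flight values (arbitrary sizes)

# Every element lands at the same offset under both conventions.
same_layout = all(
    c_offset(i, j, n, m) == fortran_offset(j, i, m, n)
    for i in range(n)
    for j in range(m)
)
```

In both cases, stepping the time_of_flight index by one moves one element in memory, while stepping the two_theta index moves m elements.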

Defining NeXus data (Version II: Alternative Axis Labelling Scheme)
data (NXdata)
  errors[n,m]
  counts[n,m] (signal=1, units="counts", axes="[two_theta,time_of_flight]")
  time_of_flight[m] (units="microseconds")
  two_theta[n] (units="degrees")
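A plotting program can locate the signal and its axes under either labelling scheme with a few lines of logic. The sketch below assumes an NXdata group represented as a plain dictionary mapping item names to their attributes; the function name `find_plottable` is hypothetical and this is not how any particular NeXus browser is implemented. The axis order returned follows each scheme as written (by axis number in the first, as listed in the second) and is not reconciled between the two.

```python
def find_plottable(nxdata):
    """Return (signal_name, axis_names) for a dict of item name -> attributes."""
    # The plottable data item is the one marked signal=1.
    signal = next(name for name, attrs in nxdata.items()
                  if attrs.get("signal") == 1)
    axes_attr = nxdata[signal].get("axes")
    if axes_attr is not None:
        # Version II: the signal lists its axes in a bracketed string.
        axes = [a.strip() for a in axes_attr.strip("[]").split(",")]
    else:
        # Version I: each axis item carries its own axis number.
        numbered = [(attrs["axis"], name) for name, attrs in nxdata.items()
                    if "axis" in attrs]
        axes = [name for _, name in sorted(numbered)]
    return signal, axes


version_1 = {
    "errors": {},
    "counts": {"signal": 1, "units": "counts"},
    "time_of_flight": {"axis": 1, "units": "microseconds"},
    "two_theta": {"axis": 2, "units": "degrees"},
}
version_2 = {
    "errors": {},
    "counts": {"signal": 1, "units": "counts",
               "axes": "[two_theta,time_of_flight]"},
    "time_of_flight": {"units": "microseconds"},
    "two_theta": {"units": "degrees"},
}
```

Calling `find_plottable` on either dictionary identifies `counts` as the signal without any instrument-specific knowledge, which is the point of the "automatically plottable" design principle.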

Data Linking
[Diagram. Groups: NXinstrument, NXdata, NXsource, NXdetector, NXmonochromator. Data items: counts[100], two_theta[100], gas_pressure[100], solid_angle[100]]

Naming Conventions
- Lower case letters are used throughout, except for common symbols and abbreviations such as FWHM.
- Names are constructed from full words separated by the underscore character, e.g. time_of_flight.
- For sequentially indexed group names, the sequential number is simply appended to the name, e.g. filter1, filter2. This convention should be used only for data group names.
- The hierarchical structure of NeXus files should be used to simplify data names, e.g. "temperature", not "sample_temperature".
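These conventions can be approximated by a simple pattern check. The sketch below is an assumed interpretation: each word is either all lower case or an all-capitals abbreviation, words are joined by underscores, and an optional sequence number may follow. The pattern and the function name `follows_convention` are illustrative only, and the check cannot tell whether a word is a "full word".

```python
import re

# One word: all lower case, or an all-capitals abbreviation such as FWHM.
_WORD = r"(?:[a-z]+|[A-Z]+)"

# Words joined by underscores, optionally followed by a sequence number.
NAME = re.compile(rf"{_WORD}(?:_{_WORD})*[0-9]*")


def follows_convention(name):
    """True if the name matches the assumed reading of the naming rules."""
    return NAME.fullmatch(name) is not None
```

Names such as `time_of_flight`, `filter2`, and `FWHM` pass the check, while mixed-case names or names containing spaces do not.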

Units
It is very important that, whenever possible, data units are specified.
- Physical Units
  - We recommend the use of S.I. units while recognizing that other units are very common in the neutron and x-ray community, e.g. meV and Angstroms. Whatever units are used, they must be specified as a character string in the format used by the Unidata UDunits utility.
- Dates and Times
  - NeXus dates and times should be stored using the ISO 8601 format e.g :15: This will avoid confusion, e.g. between U.S. and European conventions, and is appropriate for machine sorting.
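Both claims about ISO 8601 are easy to demonstrate with Python's standard `datetime` module. The dates below are invented for illustration, and the exact ISO 8601 variant used in NeXus files is not assumed here; the sketch uses Python's default `isoformat()` output.

```python
from datetime import datetime

# The same slash-separated string means different dates under two conventions.
us = datetime.strptime("03/04/2002", "%m/%d/%Y")   # read as 4 March 2002
eu = datetime.strptime("03/04/2002", "%d/%m/%Y")   # read as 3 April 2002

# ISO 8601 strings are unambiguous, and sorting them as text sorts them in time.
stamps = [datetime(2002, 3, 16, 9, 0, 0), datetime(2002, 3, 15, 14, 30, 0)]
iso = [t.isoformat() for t in stamps]
sorted_as_text = sorted(iso)
sorted_as_time = [t.isoformat() for t in sorted(stamps)]
```

Because the fields run from most to least significant, a plain string sort of the ISO values needs no date parsing at all.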

Recent Developments
- Shift to HDF5
  - Mark Koennecke has produced an HDF5 version of the NeXus API
  - If both HDF4 and HDF5 libraries are available
    - it is possible to read either type transparently
    - it is possible to choose which type to write
- Use of XML
  - Chris Moreton-Smith has written a sample DTD file and XML file compatible with NeXus
  - Allows format validation and higher-level APIs
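One way a reader could choose between the two libraries is to inspect the first bytes of the file, since HDF4 and HDF5 files begin with different signatures. The sketch below illustrates that idea only; it is not taken from the NeXus API, the function name `detect_hdf_version` is hypothetical, and it checks only offset 0, whereas an HDF5 file with a user block carries its signature at a later offset.

```python
HDF4_MAGIC = b"\x0e\x03\x13\x01"       # first four bytes of an HDF4 file
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"      # eight-byte HDF5 format signature


def detect_hdf_version(header):
    """Guess the container format from the leading bytes of a file."""
    if header.startswith(HDF5_MAGIC):
        return "HDF5"
    if header.startswith(HDF4_MAGIC):
        return "HDF4"
    return None


# Usage with the first bytes of a file (synthetic examples here).
hdf5_header = HDF5_MAGIC + b"\x00" * 8
hdf4_header = HDF4_MAGIC + b"\x00" * 12
```

In practice the header would come from something like `open(path, "rb").read(8)`, with the result used to pick which library opens the file.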

Wishlist
- Organization
  - Establish groups to develop and maintain instrument definitions.
  - Make NeXus more self-sustaining.
- NeXus tools
  - Develop validation tools in XML
  - Develop a NeXus editor
  - Develop filters to standard tools (Origin, Excel, Python)
  - Develop high-level tools (GUI browsers, visualization)
  - Develop better installation kits