Data Acquisition at the NSLS II
Leo Dalesio (NSLS II control group), Oct 22, 2014 (not 2010)



Introduction
When we scaled some data collection to the frame sizes and data sets expected at NSLS II, we found that most of the delay in processing the data came from:
- Time to find the data sets that were needed for the analysis.
- Movement of the data into memory.
No X-ray facility has demonstrated an architecture that supports the range of requirements posed by an entire facility.

Data Rates
- Each beam line can produce 8 MB frames at up to 1 kfps, with data sets up to 1 TB per minute.
- There will be over 60 experimental end stations when the facility is completed.
- Detectors buffer all of this data on SSD storage, and this buffering does not pose a challenge.
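The peak rate implied by these figures can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes) and a frame rate sustained at the quoted maximum; the function name is illustrative, not part of any NSLS II software.

```python
MB = 10**6  # decimal megabyte (assumption: the slide does not say MB vs MiB)
GB = 10**9


def sustained_rate_bytes_per_s(frame_bytes: int, frames_per_s: int) -> int:
    """Bytes per second produced by a detector running at a constant frame rate."""
    return frame_bytes * frames_per_s


rate = sustained_rate_bytes_per_s(8 * MB, 1000)  # 8 MB frames at 1 kfps
per_minute = rate * 60

print(f"{rate / GB:.1f} GB/s, {per_minute / GB:.0f} GB per minute")
# prints "8.0 GB/s, 480 GB per minute"
```

Under these assumptions one detector at full rate produces roughly half a terabyte per minute, the same order of magnitude as the 1 TB per minute upper bound quoted for data sets.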

Data Management
-> 1 data set per minute, with 60 beam lines operating 20 hours per day: 72,000 data sets per day, over 2 M data sets in one year.
- 20,000 signals are controlled and monitored on each beam line and recorded as time series.
-> A MetaDataStore service is being provided to support the tracking of data.
-> A ChannelArchiver is provided to store time series data from the beam lines.
-> A Science Data Store is being provided to store large frame rate data sets.
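The daily count follows directly from the figures on this slide. The sketch below reproduces it and leaves the number of operating days per year as an explicit parameter, since the slide does not state one; the value of 200 days used in the example is purely an assumption for illustration.

```python
def data_sets_per_day(beam_lines: int, hours_per_day: int, sets_per_minute: int = 1) -> int:
    """Data sets produced per day across all beam lines."""
    return beam_lines * hours_per_day * 60 * sets_per_minute


def data_sets_per_year(per_day: int, operating_days: int) -> int:
    """Annual count; operating_days is NOT given on the slide and must be supplied."""
    return per_day * operating_days


daily = data_sets_per_day(beam_lines=60, hours_per_day=20)
total_signals = 20_000 * 60  # time series signals across 60 beam lines

print(daily)                           # prints 72000
print(data_sets_per_year(daily, 200))  # prints 14400000 (200 days is an assumed value)
print(total_signals)                   # prints 1200000
```

Any plausible number of operating days keeps the annual total well above the "over 2 M" lower bound on the slide, which is why searching for data sets becomes a design concern in its own right.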

Data Is Stored in Appropriate Repositories
Metadata Store
- The metadata store contains all configuration and environment parameters that are used to characterize a data set.
- Searches for specific data sets must be optimized.
Experiment Log Book
- The Experiment Log Book contains information about the data collection.
- It may be used to log steps in the data analysis, such as export file formats used.
- Log entries can be made manually or programmatically.
Scientific Data
- The Scientific Data store is the repository for large data sets.
- These data sets include matrices and image sets with the metadata required to plot or view the data.
- These are stored by sample and time stamp.
- Data can be written from experiment control or submitted from analysis.
- Other data sets, such as reference or model data, can also be sent here.
Machine Data
- Machine data is all time series data that is taken from the accelerator and is available for use in analysis.
Beamline Data
- Beamline data is all time series data that is taken from the end station, beam line or accelerator and is available for use in analysis.
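The requirement that searches for specific data sets be optimized can be illustrated with a small in-memory index. This is only a sketch of the idea, not the MetaDataStore service itself: the class names, field names, and the inverted-index approach are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSetRecord:
    """Hypothetical metadata record for one data set."""
    uid: str
    sample: str
    timestamp: float
    parameters: tuple  # ((name, value), ...) pairs; values must be hashable


class MetadataIndex:
    """Inverted index from (parameter, value) to data set uids."""

    def __init__(self):
        self._records = {}
        self._by_param = defaultdict(set)

    def add(self, record: DataSetRecord) -> None:
        self._records[record.uid] = record
        for pair in record.parameters:
            self._by_param[pair].add(record.uid)

    def find(self, **criteria):
        """Return records matching every criterion, ordered by time stamp."""
        if not criteria:
            matches = set(self._records)
        else:
            # Intersect the uid sets of each criterion instead of scanning all records.
            matches = set.intersection(
                *(self._by_param.get(pair, set()) for pair in criteria.items())
            )
        return sorted((self._records[u] for u in matches), key=lambda r: r.timestamp)


index = MetadataIndex()
index.add(DataSetRecord("a1", "sample-A", 100.0, (("energy_kev", 12.0), ("detector", "det1"))))
index.add(DataSetRecord("a2", "sample-A", 160.0, (("energy_kev", 8.0), ("detector", "det1"))))
index.add(DataSetRecord("b1", "sample-B", 130.0, (("energy_kev", 12.0), ("detector", "det2"))))

print([r.uid for r in index.find(energy_kev=12.0)])                   # prints ['a1', 'b1']
print([r.uid for r in index.find(energy_kev=12.0, detector="det1")])  # prints ['a1']
```

The point of the design is that lookup cost depends on the number of matching data sets rather than on the millions of data sets stored, which is the kind of property a production service would get from database indexes.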

Data Processing
- Scientists may be at the beam line and interact with the experiment control. In this case, we want to minimize the latency required to process the data and provide information.
- Scientists may have submitted samples and interact with a scheduler and analysis report. In this case, we want to batch process the data and provide remote analysis results and reanalysis requests.
-> There are many architectural elements being modified or developed to address fast sample feedback to the scientist:
   - Zero-copy transport of large arrays in front-end computers
   - Modified EPICS locking to use multicore front ends for array processing
   - Requests to vendors to provide parallel SFP ports to pipe data into our FPGAs
   - Multicast network protocol to send data to multiple clients
- We are developing a sample management application.

Data Processing Pipeline for fast results
[Diagram. Components shown: areaDetector Driver (queue, 1 to n deep); FPGA Processing Blocks; two Processing Blocks (each with a queue, 1 to n deep); PVAccess Server (queue, 1 to n deep); two groups of PVAccess clients, each with an Image Viewer, an Image Archive, and an Image Process client.]
- PVAccess protocol supports multicast and optimizes structured and large data set transfers.
- Develop optimized processing blocks on multi-core processors: zero copy, intelligent locking.
- PVAccess protocol has C++, Java and Python bindings.
- Normative Types implement a narrow interface.
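The "queue, 1 to n deep" elements in the pipeline can be illustrated with bounded queues between processing stages. The sketch below uses only the Python standard library and is not the areaDetector or PVAccess implementation; it shows neither zero-copy transfer nor multicast, and all function names are hypothetical.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage to shut down


def stage(func, inbox, outbox):
    """Run one processing block: take from inbox, apply func, put on outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # pass the shutdown signal downstream
            return
        outbox.put(func(item))


def run_pipeline(frames, funcs, depth=4):
    """Push frames through funcs in order, with a queue `depth` deep before each stage."""
    queues = [queue.Queue(maxsize=depth) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    results = []

    def drain():
        while True:
            item = queues[-1].get()
            if item is _STOP:
                return
            results.append(item)

    consumer = threading.Thread(target=drain)
    for t in workers + [consumer]:
        t.start()
    for frame in frames:
        queues[0].put(frame)  # blocks when the first queue is full (back-pressure)
    queues[0].put(_STOP)
    for t in workers + [consumer]:
        t.join()
    return results


def subtract_background(frame):
    return [pixel - 1 for pixel in frame]


print(run_pipeline([[1, 2], [3, 4]], [subtract_background, sum]))  # prints [1, 5]
```

A queue depth of 1 keeps latency lowest because a stage can never fall more than one frame behind; deeper queues absorb bursts at the cost of memory and delay, which is presumably why the depth is configurable in the design.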

Sample Management
- Samples are viewed either by their order in the sample changer dewar (DewarView) or by the order in which the user would like to collect them (PriorityView).
- The color coding for now is cyan = done, green = running; checked means requested to run when the "Collect Queue" button below the tree is pressed.

Data Analysis
- There are many techniques used to study various samples at NSLS II.
- Closer to the raw data, many of the algorithms used are common across beam lines.
- There are some very specialized analysis routines required as well.
- Provenance of data must be preserved to validate the scientific process.
- “Scientists would rather use someone else’s toothbrush than their analysis routines.”
- Users will want to use different, competing, evolved, higher performance, or just better codes.
-> To allow access to raw and processed data, a Data Broker API is provided.
-> A standard library of Python processing routines is produced and supported.
-> Processing routines use standard data types.
-> Results from processing can be stored back into the data management system.
-> A visualization library is supported for 1D and 2D data and data sets.
-> A data flow system is being developed to support the development of data analysis. VisTrails is being evaluated as a tool to provide this.

Design Status – Experiment Control (1 OF 4)

Design Status – Experiment Control (2 OF 4)

Design Status – Experiment Control (3 OF 4)

Design Status – Experiment Control (4 OF 4)

Data Storage
-> Data must be kept online for some time and may be stored to tape for some time.
- In 2016 we will need 2 PB to store one year of data, by PB will be needed.
- There is a requirement to track data provenance and store intermediate analysis results.
-> Commercial solutions exist for this scale of data.
-> Tracking provenance is being considered in workflow tools and the MetaDataStore.

Data Export
- Scientists need to put together frame rate, time series, and configuration data in specific formats for existing analysis codes.
- Scientists may want to export the data life cycle, from raw data through all analysis.
-> The DataBrokerAPI can be used in Python to access all data.
- Code must be developed to repackage data from different data stores into a desired export format.
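The repackaging step can be sketched as follows: each frame is paired with the time series values nearest to its time stamp, and the result is bundled with the configuration into one export document. This does not use the DataBrokerAPI, whose calls are not shown in the slides; the plain dictionaries and lists stand in for the separate data stores, and every name here is hypothetical.

```python
import json


def nearest_sample(series, t):
    """Return the value whose time stamp is closest to t; series is [(time, value), ...]."""
    return min(series, key=lambda sample: abs(sample[0] - t))[1]


def build_export(configuration, frames, time_series):
    """Combine configuration, frame references, and time series into one structure.

    frames:      [(time, frame_id), ...] standing in for the science data store
    time_series: {signal_name: [(time, value), ...]} standing in for the archiver
    """
    rows = []
    for t, frame_id in frames:
        row = {"time": t, "frame": frame_id}
        for name, series in time_series.items():
            row[name] = nearest_sample(series, t)
        rows.append(row)
    return {"configuration": configuration, "frames": rows}


export = build_export(
    configuration={"sample": "sample-A", "exposure_s": 0.001},
    frames=[(0.0, "frame-0"), (1.0, "frame-1")],
    time_series={"ring_current_ma": [(0.1, 400.0), (0.9, 399.5)]},
)

print(json.dumps(export["frames"]))
# prints [{"time": 0.0, "frame": "frame-0", "ring_current_ma": 400.0}, {"time": 1.0, "frame": "frame-1", "ring_current_ma": 399.5}]
```

JSON is used only to keep the example self-contained; an existing analysis code would dictate the real target format.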

Experimental Data Architecture
[Architecture diagram; the layout did not survive extraction. Labels recovered from it:]
- Instrumentation side: Instrumentation, Detector, Ethernet, Driver, Area Detector Driver, Device Support, Process Database, N-lanes
- Services and stores: Channel Finder Server, Channel Archiver, Archive Retrieval, Archive Store/Retrieve, File Formatter, Experiment Information (OLOG), Experiment Information (PASS), SQL RDB, Beamline Data, Machine Data, Science Data
- Clients: Control System Studio with Correlation Plots, Channel Archiver View, PVManager, Experiment Control, DataBroker, Web Clients
- Interfaces and protocols: CAC, CAS, PVAC, PVAS, XML/RPC, REST, HTTP, SQL, NFS

Remote data access problem
- There is a 1-10 Gbps link into BNL.
- Our users sit at universities, coffee shops, and small offices (not on a high performance network port).
- Analysis computers are being planned at BNL.
-> A user interface to the analysis codes developed and maintained at BNL is being developed.
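A rough calculation shows why moving data to the user is a problem. The sketch below estimates the time to move one 1 TB data set over the quoted link speeds, assuming decimal units and a link fully dedicated to the transfer with no protocol overhead; both assumptions are optimistic, and the function name is illustrative.

```python
TB = 10**12    # decimal terabyte (assumption)
GBPS = 10**9   # bits per second


def transfer_seconds(size_bytes: int, link_bits_per_s: float, efficiency: float = 1.0) -> float:
    """Ideal transfer time; efficiency < 1.0 models overhead or a shared link."""
    return size_bytes * 8 / (link_bits_per_s * efficiency)


for gbps in (1, 10):
    seconds = transfer_seconds(1 * TB, gbps * GBPS)
    print(f"{gbps} Gbps: {seconds / 60:.0f} min")
# prints "1 Gbps: 133 min" then "10 Gbps: 13 min"
```

Even in the ideal case, a data set that can be collected in about a minute takes far longer than that to leave the laboratory, and a user on an ordinary network connection would see much less than the full link rate. This supports running analysis on computers at BNL and giving users a remote interface to it.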

Conclusions
- Two distinct areas of concern must be covered:
   - Providing immediate feedback to allow scientists to use their beam time effectively.
   - Providing an analysis and data management framework that supports deep analysis after the experiment is run.
- No facility has demonstrated an architecture that supports the range of requirements posed by an entire facility.
- A service-based architecture is being developed to support the DAQ and data analysis needs of NSLS II.
- Infrastructure is being used from an existing collaboration group that provides some strong basic capabilities.
- There is a strong set of software architects and analysis, visualization, and FPGA programmers working on these issues.