Computing Infrastructure for DAQ, DM and SC
Djelloul Boukhelef, Data Processing Workshop, Eu-XFEL, Feb. 2016
The Big Picture: Computing infrastructure for DAQ, DM and SC
Elements of the DAQ-DM-SC chain
- PC layer: cluster of high-performance servers that collects and stores experiment data; a highly tuned software and hardware infrastructure
- Online cache: provides the storage space and bandwidth for storing experiment data; access to its file system is strictly controlled, i.e. optimized for writing by the PC layer
- Online analysis cluster: provides computing and storage resources for the online scientific analysis and calibration pipeline; users' online analysis algorithms run here, and a cluster file system stores temporary data
- Offline storage: stores experiment data for longer periods
- Metadata catalogue: the central point that stores and organizes all information needed to locate and access scientific data
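The metadata catalogue's role, mapping a run to the files that hold its data, can be sketched as a small index. All names here (`Catalogue`, `register`, `locate`) are illustrative, not the actual XFEL catalogue API:

```python
# Minimal sketch of a metadata catalogue: a central index mapping an
# (instrument, run) pair to the file paths holding that run's data.
# Class and method names are illustrative, not the real catalogue API.
from dataclasses import dataclass, field


@dataclass
class Catalogue:
    _index: dict = field(default_factory=dict)

    def register(self, instrument: str, run: int, paths: list) -> None:
        """Record where the files of a given run are stored."""
        self._index[(instrument, run)] = list(paths)

    def locate(self, instrument: str, run: int) -> list:
        """Return the file paths for a run, or an empty list if unknown."""
        return self._index.get((instrument, run), [])


cat = Catalogue()
cat.register("SPB", 42, ["/offline/raw/SPB/r0042/data-00.h5"])
print(cat.locate("SPB", 42))  # → ['/offline/raw/SPB/r0042/data-00.h5']
```

A real catalogue would of course live in a database behind a service, but the lookup contract is the same: analysis tools never guess paths, they ask the catalogue.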
PC layer as data hub
- Cluster of high-performance computers where all experiment data are collected and saved to files
- PC layer operations on data:
  - Reliable reception of data from fast and slow data sources, using TCP/UDP or via a broker; tuning and performance tests; very low packet loss rate (measured per 10⁶ trains)
  - Consolidation: data collection and integration from several data sources; full event building (slow and fast data)
  - Formatting and recording to data files: HDF5 format, file naming schema, file system
  - Validation, monitoring, fast-feedback analysis, rejection and reduction (selected experts' algorithms)
  - Dissemination to the scientific computing pipeline: data are available for user analysis in near real time
- Devices = data sources
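A file naming schema gives every recorded file a predictable, sortable name built from the run and data source. The pattern below (prefix, field widths, extension) is an illustrative sketch, not necessarily the schema the DAQ actually uses:

```python
# Sketch of a run/sequence file-naming schema for recorded HDF5 files.
# The exact pattern (prefix, zero-padding widths) is illustrative only;
# the real DAQ naming schema may differ.
def data_filename(prefix: str, run: int, source: str, seq: int) -> str:
    """Build a name like RAW-R0042-AGIPD00-S00000.h5 from its parts."""
    return f"{prefix}-R{run:04d}-{source}-S{seq:05d}.h5"


print(data_filename("RAW", 42, "AGIPD00", 0))  # → RAW-R0042-AGIPD00-S00000.h5
```

Fixed-width, zero-padded fields keep directory listings in run/sequence order and make names trivially parseable by downstream tools.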
Online storage
- Provides enough storage space and network/disk bandwidth to handle the data from one experiment
- Prototyped using standalone storage boxes; will be implemented using IBM ESS (GPFS)
- The DAQ system will store data in the online storage as HDF5 files
- Direct access to the online cache is restricted and tightly controlled, to give maximum reliability and performance for experiment data recording from the PC layer
- Data stored in the online cache can only be accessed through a dedicated data reading service, i.e. the file system will not be visible to the computing nodes
- Users can store this data locally and use it with other "non-Karabo" tools
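The "access only through a dedicated reading service" rule can be sketched as follows: clients hold a service object and ask for data by run and name, while the cache path itself is never exposed. The class and method names are illustrative, not the real service API:

```python
# Sketch of mediated access to the online cache: callers go through a
# service object and never see the cache's file system layout.
# CacheReadService, its layout (r0001/...), and read() are illustrative.
import os


class CacheReadService:
    def __init__(self, cache_root: str):
        self.__root = cache_root  # private: the path is not exposed to callers

    def read(self, run: int, name: str) -> bytes:
        """Resolve a (run, name) request inside the cache and return the bytes.

        A real service would also authenticate the caller and throttle
        bandwidth so DAQ writes keep priority; this sketch only resolves
        the path and reads the file.
        """
        path = os.path.join(self.__root, f"r{run:04d}", name)
        with open(path, "rb") as f:
            return f.read()
```

Once read through the service, nothing stops a user from saving the bytes locally for use with other, non-Karabo tools, which matches the policy on the slide.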
Online analysis cluster
- Provides computing and storage resources for data calibration and online analysis
- Connected to the PC layer and online storage via an InfiniBand fabric (FDR, 56 Gb/s)
- Dedicated cluster file system connected to the online computing farm via InfiniBand: provides temporary storage space for processed data and user temporary files; implemented using the GPFS cluster file system
- Access to data is restricted to experiment members only (e.g. using system groups)
- Users can store this data locally and use it with other "non-Karabo" tools
Offline storage system and archive
- Access is protected using NFSv4.1 ACLs, according to the definitions provided by the experiment PI
- User analysis spaces:
  - Large scratch space for user analysis: stores processed and temporary data files; data are kept for a limited time period, up to a defined size, with no backup
  - Storage space for additional experiment data (e.g. user-uploaded data, processing results, …): quota (e.g. 5 TB per experiment), standard backup system
- Implemented using IBM ESS
- DESY dCache system to store and archive (raw) data: 10 PB of disk space plus a tape system; (raw) data files are immutable
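A per-experiment quota like the 5 TB figure above boils down to comparing the space a directory tree occupies against a limit. A minimal sketch, with illustrative function names and no connection to how GPFS/ESS actually enforces quotas:

```python
# Sketch of a per-experiment quota check on a user analysis space:
# sum the sizes of all regular files under a directory and compare the
# total against the quota. Function names are illustrative; real
# enforcement would be done by the file system (e.g. GPFS quotas).
import os


def usage_bytes(root: str) -> int:
    """Total size in bytes of all regular files below root."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def over_quota(root: str, quota_bytes: int) -> bool:
    """True if the tree under root exceeds its quota."""
    return usage_bytes(root) > quota_bytes
```

Walking the tree on every check is O(number of files); a production file system keeps running per-fileset counters instead, which is one reason quota enforcement belongs in the storage layer rather than in user tooling.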
Infrastructure
Test results with IBM GSS
Setup
- Several writers (Karabo devices) per host: 1, 2, 3, 4
- Writes go to separate folders: one per device, per host
- Variable file sizes: up to 1 GB
- File rate: 10 Hz, with periodic file deletion and listing
- Test hardware: 4 IBM servers, IBM GSS (9 TB, Petra III), FDR InfiniBand (56 Gb/s)

Results (time and bandwidth plots shown on the slide)
- Formatting in memory, then copying to IBM GSS: very slow
- Formatting HDF5 directly to IBM GSS: very fast
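The two strategies compared in the test can be sketched as follows, with plain byte payloads standing in for real HDF5 formatting; file names, sizes, and function names are illustrative, and actual timings depend entirely on the target file system:

```python
# Sketch of the two write strategies from the GSS test:
# (a) format the payload in an in-memory buffer, then copy it to the
#     target file system; (b) write directly to the target.
# Byte payloads stand in for HDF5 formatting; names/sizes are illustrative.
import io
import os
import tempfile
import time


def write_via_memory(payload: bytes, path: str) -> float:
    """Format into a memory buffer first, then copy the buffer to disk."""
    t0 = time.perf_counter()
    buf = io.BytesIO()
    buf.write(payload)                # format in memory
    with open(path, "wb") as f:
        f.write(buf.getvalue())       # then copy to the file system
    return time.perf_counter() - t0


def write_direct(payload: bytes, path: str) -> float:
    """Format straight to the target file system in one pass."""
    t0 = time.perf_counter()
    with open(path, "wb") as f:
        f.write(payload)
    return time.perf_counter() - t0


target = tempfile.mkdtemp()           # stand-in for the GSS mount point
data = os.urandom(1 << 20)            # 1 MiB stand-in for a data file
t_mem = write_via_memory(data, os.path.join(target, "a.bin"))
t_dir = write_direct(data, os.path.join(target, "b.bin"))
```

On the GSS setup described above, the slide reports the direct-write path as very fast and the format-then-copy path as very slow; on a local temp directory the difference will be far smaller, so the sketch shows the structure of the comparison rather than reproducing the numbers.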