ALFA - a common concurrency framework for ALICE and FAIR experiments
Mohammad Al-Turany, GSI-IT/CERN-PH

ALFA: New ALICE-FAIR framework
Outline:
– Motivation for ALFA
– ALFA and FairRoot
– Transport layer
– Serializing message data
– Deployment system
– Examples
– Summary

How did it come about?
FairRoot (2013):
– Concurrency is an issue
– Online and offline can be joined
ALICE O2 project started (2013):
– DAQ, online and offline together in one framework

The ALICE and FAIR experiments have similar online system requirements.

Online system requirements: PANDA
– Interaction rates of 20 MHz (50 MHz peak)
– Event size of 4-10 kB; data rates after front-end preprocessing: 80-300 GB/s
– No hardware trigger:
  – Autonomous recording front-ends
  – Event selection in compute nodes

Online system requirements: CBM
– At 10 MHz interaction rate, online data reduction by a factor of 1000 is mandatory (1 TB/s).
– Trigger signatures are complex (open charm) and require partial event reconstruction.
– No a-priori association of signals to physical events!
– Extremely fast reconstruction algorithms are needed!

Online system requirements: ALICE
Run 3: sample the full 50 kHz Pb-Pb interaction rate (current limit at ~500 Hz, a factor 100 increase) → ~1.1 TB/s detector readout, while the storage bandwidth available to ALICE data processing will not allow more than ~20 GB/s.

Strategy
– Massive data-volume reduction: data reduction by (partial) online reconstruction and compression.
– Much tighter coupling between online and offline reconstruction software.

FairRoot (2012): a growing family of experiments
– Start of testing the VMC concept for CBM; first release of CbmRoot.
– PANDA decided to join → FairRoot: the same base package for different experiments.
– MPD (NICA) also started using FairRoot.
– R3B joined.
– ASYEOS joined (ASYEOSRoot).
– GEM-TPC separated from the PANDA branch (FOPIRoot).
– EIC (Electron Ion Collider, BNL): EICRoot.
– SOFIA (Studies On Fission with Aladin).
– ENSAR-ROOT: a collection of modules used by structural nuclear physics experiments.
– SHiP (Search for Hidden Particles).

Current status of FairRoot (M. Al-Turany, ACAT 2013, Beijing; F. Uhlig, ROOT Users Workshop, Saas Fee)
[Architecture diagram: a ROOT layer (TEve, ROOT IO, TGeo, TVirtualMC, CINT, TTree, PROOF, …), transport engines and geometry libraries (Geant3, Geant4, Geant4_VMC, VGM, …), the FairRoot core (Run Manager, IO Manager, Runtime DB, DB Interface, Event Display, MC Application, Module, Detector, Task, Magnetic Field, Event Generator, …), and on top the experiment packages: CbmRoot, PandaRoot, AsyEosRoot, R3BRoot, SofiaRoot, MPDRoot, FopiRoot, EICRoot.]

ALICE and FAIR: why?
– A common framework will be beneficial for the FAIR experiments, since it will be tested with real data and existing detectors before the start of the FAIR facility.
  – E.g., concepts for online calibration and alignment can be tested in a real environment, similar to that of the planned FAIR experiments.
– ALICE will benefit from the work already performed by the FairRoot team on already-implemented features (e.g. continuous read-out, the build and test system, etc.).

Who is going to collaborate on ALFA?
– ALICE DAQ
– ALICE HLT
– ALICE Offline
– FairRoot
FAIR and non-FAIR experiments are welcome to join.

ALFA
– Common layer for parallel processing.
– Common algorithms for data processing.
– Common treatment of the conditions database.
– Common deployment and monitoring infrastructure.

ALFA
– Will rely on a data-flow based model (message queues).
– Will contain:
  – a transport layer (based on ZeroMQ and nanomsg)
  – configuration tools
  – management and monitoring tools
– Will provide unified access to configuration parameters and databases.
– Will include support for a heterogeneous and distributed computing system.
– Will incorporate common data processing components.

Correct balance between reliability and performance
Multi-process concept with message queues for data exchange:
– Each "task" is a separate process, which can itself be multithreaded; data exchange between the different tasks is done via messages.
– Different topologies of tasks can be adapted to the problem itself and to the hardware capabilities.

FairRoot is undergoing a re-design process to make it more modular.
[Architecture diagram: the core (ROOT, Geant3, Geant4, Geant4_VMC, VGM, Runtime DB, Module, Detector, Magnetic Field, Event Generator, MC Application) is complemented by modular components: the transport layer (FairMQ, based on ZeroMQ), build configuration and testing (CMake), and FairDB; the experiment packages (CbmRoot, PandaRoot, AsyEosRoot, R3BRoot, SofiaRoot, MPDRoot, FopiRoot, EICRoot) sit on top.]

How do ALFA and FairRoot fit together?
[Architecture diagram: ALFA provides the common base (FairMQ with ZeroMQ, DDS, Boost, Protocol Buffers, build configuration and testing with CMake, FairDB); FairRoot (libraries and tools, Runtime DB, Module, Detector, Magnetic Field, Event Generator, MC Application, Geant3, Geant4, Geant4_VMC, VGM) is built on top of ALFA; the experiment packages (CbmRoot, PandaRoot, AsyEosRoot, R3BRoot, SofiaRoot, MPDRoot, FopiRoot, EICRoot, and possibly AliRoot6 (O2)?) sit on top of FairRoot.]

ZeroMQ:
– BSD sockets API
– Bindings for 30+ languages
– Lockless and fast
– Automatic re-connection
– Multiplexed I/O
ALFA will use ZeroMQ to connect the different pieces together.
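To make this concrete, here is a minimal sketch of the pipeline pattern using the ZeroMQ C API; this is plain ZeroMQ, not ALFA's own classes, and the endpoint address is made up. A sampler process pushes a message that a sink process pulls.

    #include <zmq.h>
    #include <cstdio>
    #include <cstring>

    int main(int argc, char** argv)
    {
        void* ctx = zmq_ctx_new();
        if (argc > 1 && std::strcmp(argv[1], "sink") == 0) {
            void* pull = zmq_socket(ctx, ZMQ_PULL);
            zmq_bind(pull, "tcp://*:5555");              // sink binds, samplers connect
            char buf[256];
            int n = zmq_recv(pull, buf, sizeof(buf) - 1, 0);
            if (n >= 0) { buf[n] = '\0'; std::printf("received: %s\n", buf); }
            zmq_close(pull);
        } else {
            void* push = zmq_socket(ctx, ZMQ_PUSH);
            zmq_connect(push, "tcp://localhost:5555");
            zmq_send(push, "event-data", 10, 0);         // one message, queued and delivered by ZeroMQ
            zmq_close(push);
        }
        zmq_ctx_destroy(ctx);
        return 0;
    }

Run one instance with the argument "sink" and another without; re-connection and I/O multiplexing are handled by the library.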

BUT: nanomsg, under development by the original author of ZeroMQ, is an alternative:
– Pluggable transports: ZeroMQ has no formal API for adding new transports (InfiniBand, WebSockets, etc.). nanomsg defines such an API, which simplifies the implementation of new transports.
– Zero-copy: better zero-copy support with RDMA and shared memory, which will improve transfer rates of larger data for inter-process communication.
– Simpler interface: simplifies some ZeroMQ concepts and the API; for example, it no longer needs a Context class.
– Numerous other improvements.
FairRoot is therefore kept independent of the transport library: modular, pluggable, switchable transport libraries.
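For comparison, the sending side of the same pipeline in nanomsg; a short sketch that mainly illustrates the simpler interface (plain integer sockets, no Context object), with an assumed endpoint address:

    #include <nanomsg/nn.h>
    #include <nanomsg/pipeline.h>

    int main()
    {
        int push = nn_socket(AF_SP, NN_PUSH);        // no Context class needed
        nn_connect(push, "tcp://localhost:5556");
        nn_send(push, "event-data", 10, 0);
        nn_close(push);
        return 0;
    }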

ALFA has an abstract transport layer.
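A hypothetical sketch of what such an abstraction can look like; the real FairMQ interfaces differ, this only illustrates that devices code against an interface while the concrete transport library stays switchable:

    #include <cstddef>

    class MQSocket {                                  // illustrative name, not the real API
    public:
        virtual ~MQSocket() = default;
        virtual bool Bind(const char* address) = 0;
        virtual bool Connect(const char* address) = 0;
        virtual int  Send(const void* data, std::size_t size) = 0;
        virtual int  Receive(void* data, std::size_t size) = 0;
    };

    // One concrete implementation per library, selected at configuration time:
    // class ZmqSocket     : public MQSocket { /* wraps ZeroMQ */ };
    // class NanomsgSocket : public MQSocket { /* wraps nanomsg */ };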

How are the different processes going to communicate with each other? Different languages and hardware platforms are possible, so how should the data exchange be done? And what about schema evolution?

Protocol Buffers
Google Protocol Buffers support is now implemented:
– Example in Tutorial 3 in FairRoot.
– To use protobuf, run CMake as follows: cmake -DUSE_PROTOBUF=1
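A small sketch of how a payload could be packed with the standard protobuf C++ calls; the message type DigiPayload, its fields, and the generated header name are made up for illustration:

    #include <string>
    #include "digi_payload.pb.h"          // hypothetical header generated by protoc from:
                                          //   message DigiPayload { optional int32 detector_id = 1;
                                          //                         optional double time = 2; }

    std::string PackDigi(int id, double t)
    {
        DigiPayload digi;
        digi.set_detector_id(id);
        digi.set_time(t);
        std::string wire;
        digi.SerializeToString(&wire);    // wire bytes can be sent as a message payload
        return wire;
    }

    bool UnpackDigi(const std::string& wire, DigiPayload& digi)
    {
        return digi.ParseFromString(wire); // unknown fields are skipped: schema evolution
    }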

Boost serialization
– Code portability: depends only on ANSI C++ facilities.
– Code economy: exploits C++ features such as RTTI, templates, and multiple inheritance where appropriate, to make code shorter and simpler to use.
– Independent versioning for each class definition: when a class definition changes, older files can still be imported into the new version of the class.
– Deep pointer save and restore: saving and restoring pointers saves and restores the data pointed to.
– …
Boost serialization is already used by the CBM online group in Frankfurt, and we need it to exchange data with them!
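A minimal sketch of the intrusive Boost approach: the data class itself carries a serialize() method (class and member names below are illustrative, not FairRoot classes):

    #include <sstream>
    #include <boost/serialization/access.hpp>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/archive/binary_iarchive.hpp>

    class ExDigi {                        // hypothetical digi class
        friend class boost::serialization::access;
        template <class Archive>
        void serialize(Archive& ar, const unsigned int /*version*/)
        {
            ar & fDetectorId;             // members listed once, used for both save and load
            ar & fTimeStamp;
        }
    public:
        int    fDetectorId = 0;
        double fTimeStamp  = 0.;
    };

    int main()
    {
        ExDigi digi; digi.fDetectorId = 42; digi.fTimeStamp = 1.5;
        std::stringstream buffer;
        boost::archive::binary_oarchive out(buffer);
        out << digi;                      // serialize into the buffer
        ExDigi copy;
        boost::archive::binary_iarchive in(buffer);
        in >> copy;                       // restore from the buffer
        return 0;
    }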

Protobuf, Boost, or manual serialization?
– Boost: we are generic in the tasks but intrusive in the data classes (digi, hit, timestamp).
– Manual and protobuf: we are generic in the data class but intrusive in the tasks (payloads need to be filled/accessed from the class with set/get x, y, z, etc.).
– The manual method is still the fastest; protobuf is 20% slower and Boost 30% slower (preliminary).

How to deploy ALFA on a laptop, a few PCs, or a cluster?
– We need to be able to utilize any RMS (Resource Management System).
– We should also be able to run without an RMS.
– Different topologies and process dependencies must be supported.

To achieve this, we have to develop a Dynamic Deployment System (DDS): an independent set of utilities and interfaces providing dynamic distribution of different user processes, in any given topology, on any RMS.

The Dynamic Deployment System (DDS) should:
– deploy a task or set of tasks,
– use (utilize) any RMS (Slurm, Grid Engine, …),
– secure the execution of nodes (watchdog),
– support different topologies and task dependencies,
– support a central log engine,
– …
See the talk by Anar Manafov at the ALICE offline week (March 2014).

Elements of the topology
Task:
– A task is a single entity of the system.
– A task can be an executable or a script.
– A task is defined by the user with a set of properties and rules.
– Each task will have a dedicated DDS watchdog process.
Collection:
– A set of tasks that have to be executed on the same physical computing node.
Group:
– A container for tasks and collections.
– Only the main group can contain other groups.
– Only groups define a multiplication factor for all their daughter elements.
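As an illustration, a DDS topology file declaring these elements could look roughly like the following; the element names follow the DDS documentation but should be treated as approximate, and the executable path is made up:

    <topology name="example">
        <decltask name="processor">
            <exe>/usr/local/bin/processor.exe</exe>
        </decltask>
        <declcollection name="col1">
            <task>processor</task>
            <task>processor</task>
        </declcollection>
        <main name="main">
            <group name="group1" n="10">  <!-- multiplication factor for all daughters -->
                <collection>col1</collection>
            </group>
        </main>
    </topology>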

[Topology example diagram: a MAIN group containing tasks (TASK_1 … TASK_7), collections (COLLECTION_1, COLLECTION_2), and a group GROUP_1 with multiplication factor N=10.]

Current status
The framework delivers components which can be connected to each other in order to construct processing pipelines. All components share a common base called Device. Devices are grouped into three categories:
– Source: data readers (simulated, raw)
– Message-based processors: Sink, Splitter, Merger, Buffer, Proxy
– Content-based processors: Processor
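Conceptually (a sketch only, not the real FairMQ base class), the categories can be pictured as specializations of one Device interface:

    class Device {                        // common base: the framework wires devices together
    public:
        virtual ~Device() = default;
        virtual void Run() = 0;           // main loop: receive, process, send
    };

    class Sampler   : public Device { void Run() override { /* read simulated/raw data, send */ } };
    class Splitter  : public Device { void Run() override { /* forward each message to one of N outputs */ } };
    class Processor : public Device { void Run() override { /* deserialize, run algorithm, send result */ } };
    class Sink      : public Device { void Run() override { /* receive and store or discard */ } };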

Using multi-part messages
– Multi-part messages allow us to concatenate multiple messages into a single message without copying.
– All parts of the message are treated as a single atomic unit of transfer; however, the boundaries between message parts are strictly preserved.
– A multi-part message is a message that has more than one frame. ZeroMQ sends it as a single on-the-wire message and guarantees to deliver all the message parts (one or more), or none of them.
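In the ZeroMQ C API this looks as follows (a sketch with socket setup omitted): ZMQ_SNDMORE marks every part except the last, and the receiver checks ZMQ_RCVMORE to preserve the frame boundaries.

    #include <zmq.h>
    #include <cstddef>

    void SendHeaderAndPayload(void* socket,
                              const void* hdr,  size_t hdrSize,
                              const void* body, size_t bodySize)
    {
        zmq_send(socket, hdr,  hdrSize,  ZMQ_SNDMORE); // "more parts follow"
        zmq_send(socket, body, bodySize, 0);           // last part closes the message
    }

    void ReceiveAllParts(void* socket)
    {
        int more = 1;
        size_t moreSize = sizeof(more);
        while (more) {
            char buf[4096];
            zmq_recv(socket, buf, sizeof(buf), 0);
            zmq_getsockopt(socket, ZMQ_RCVMORE, &more, &moreSize); // 1 while parts remain
        }
    }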

A prototype for ALICE O2 is under development.

O2Merger
The merger polls all of its inputs and forwards each received message part to its single output, telling ZeroMQ via ZMQ_SNDMORE: "Hold on! There are more data going to be added to this message!"

    // From the O2Merger device (slide code, lightly commented):
    for (int i = 0; i < fNumInputs; i++) {
        if (poller->CheckInput(i)) {                  // data pending on input i?
            received = fPayloadInputs->at(i)->Receive(msg);
        }
        if (received) {
            if (i < NoOfMsgParts) {
                // more parts follow: keep the multi-part message open
                fPayloadOutputs->at(0)->Send(msg, ZMQ_SNDMORE);
            } else {
                fPayloadOutputs->at(0)->Send(msg);    // last part closes the message
            }
            received = false;
        }
    }

[Diagram: several FLPs feeding one Merger.]

[Diagram: FLPs feeding a Merger, which feeds EPNs.]
EPNs can be added on the fly.
New example (flp2epn): oot/tree/NewMQEx/example/flp2epn

[Diagram: multiple Mergers between the FLPs and the EPNs.]
Multiple mergers are also possible.

Summary
– The ALFA project is starting.
– The communication layer is ready to use.
– Different data serialization methods are available.
– DDS and the topology parser are under development.

Backup and Discussion

STORM is a very attractive option, but it has no native support for C++!

The built-in core ØMQ patterns are:
– Request-reply, which connects a set of clients to a set of services (remote procedure call and task distribution pattern).
– Publish-subscribe, which connects a set of publishers to a set of subscribers (data distribution pattern).
– Pipeline, which connects nodes in a fan-out/fan-in pattern that can have multiple steps and loops (parallel task distribution and collection pattern).
– Exclusive pair, which connects two sockets exclusively.
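The corresponding socket types in the ZeroMQ C API (a compact sketch, error handling omitted):

    #include <zmq.h>

    void CreatePatternSockets(void* ctx)
    {
        void* req  = zmq_socket(ctx, ZMQ_REQ);     // request-reply: client side
        void* rep  = zmq_socket(ctx, ZMQ_REP);     // request-reply: service side
        void* pub  = zmq_socket(ctx, ZMQ_PUB);     // publish-subscribe: publisher
        void* sub  = zmq_socket(ctx, ZMQ_SUB);     // publish-subscribe: subscriber
        zmq_setsockopt(sub, ZMQ_SUBSCRIBE, "", 0); // subscribe to all topics
        void* push = zmq_socket(ctx, ZMQ_PUSH);    // pipeline: fan-out side
        void* pull = zmq_socket(ctx, ZMQ_PULL);    // pipeline: fan-in side
        void* pair = zmq_socket(ctx, ZMQ_PAIR);    // exclusive pair
        zmq_close(req);  zmq_close(rep);  zmq_close(pub); zmq_close(sub);
        zmq_close(push); zmq_close(pull); zmq_close(pair);
    }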