FairRoot Status and plans
Mohammad Al-Turany, ALICE Offline Meeting, 6/25/13



What is the FairRoot framework, and why is it needed?
A simulation, reconstruction, and analysis framework.
Started in 2003 as a two-person project for the CBM experiment at FAIR.
Offers a long list of ready-to-use modules and base classes needed by particle physics experiments.

Current hot topics in FairRoot
Database interface
  o Re-design the database interface based on TSQLServer
ZeroMQ integration
  o Use of ZeroMQ as a communication layer
Building, testing and quality assurance systems
  o Coverage tests, quality tests and unit tests
Online monitoring
  o For test beams and detector prototypes
GPU support and integration
Time-based simulation

A long list of people have contributed pieces of code to FairRoot since the project started (end of …).
Core team:
  Mohammad Al-Turany (IT)
  Denis Bertini (IT)
  Florian Uhlig (CBM / IT)
  Radek Karabowicz (PANDA / IT)
  Dmytro Kresan (R3B / IT)
  Tobias Stockmanns (PANDA, FZJ)
People who contributed major features:
  Ilse König (HADES)
  Volker Friese (CBM)
  Olaf Hartman (PANDA)
Students:
  Dennis Klein (finished)
  Alexey Rybalchenko (EE)

FairRoot Group at the GSI
Mohammad Al-Turany (IT)
Denis Bertini (IT)
Radoslaw Karabowicz (IT/PANDA)
Dmytro Kresan (IT/R3B)
Anar Manafov (IT)
Alexey Rybalchenko (Master student)
Yago Gonzalez Rozas (Guest scientist)
Florian Uhlig (IT/CBM)
N.N. (Sep. 2013) (IT)

[Diagram: FairRoot software stack. Source: Florian Uhlig, ROOT Users Workshop, Saas Fee]
External packages: ROOT (TEve, ROOT IO, TGeo, TVirtualMC, Cint, TTree, Proof, …), Geant3, Geant4, Geant4_VMC, VGM, …
FairRoot: Run Manager, IO Manager, Runtime DB, DB Interface, Event Display, MC Application, Module, Detector, Task, Magnetic Field, Event Generator, …
Experiment frameworks: CbmRoot, PandaRoot, AsyEosRoot, R3BRoot, SofiaRoot, MPDRoot, FopiRoot, EICRoot

FairRoot: Timeline
Start testing the VMC concept for CBM
First release of CbmRoot
MPD (NICA) also starts using FairRoot
ASYEOS joined (ASYEOSRoot)
GEM-TPC separated from the PANDA branch (FOPIRoot)
PANDA decided to join -> FairRoot: the same base package for different experiments
R3B joined
EIC (Electron Ion Collider, BNL): EICRoot
2012 SOFIA (Studies On Fission with Aladin)
ENSAR-ROOT: collection of modules used by nuclear structure physics experiments 2013

Database Re-Design

Database in FairRoot
The real database in FairRoot is completely hidden from the user and/or software developer.
The runtime database is not a database in the classical sense, but a parameter manager.
It knows the “I/O”s defined by the user and all parameter containers needed for the actual analysis and/or simulation.
It manages the automatic initialization and saving of the parameter containers.
After all initialization, the complete list of runs and related parameter versions is saved either to a database (Oracle, MySQL, …) or to ROOT files.

FairRoot DB Design (Old)
[Diagram: FairRoot Run Manager; RunTime Database connected to ASCII files (configuration parameters), ROOT files (configuration parameters), and Oracle; IO Manager connected to ROOT files (MC points, digits, etc.)]

FairRoot DB extended
[Diagram: FairRoot Run Manager; RunTime Database connected to ASCII files (configuration parameters), ROOT files (configuration parameters), and a DB Interface based on TSQLServer (Oracle, PostgreSQL, MySQL); IO Manager connected to ROOT files (MC points, digits, etc.)]

Re-design
Database interface based on the ROOT Database Connectivity (RDBC) API, which provides a uniform interface to Oracle, MySQL, and PgSQL
Database interface in FairRoot using TSQLServer (MySQL, Oracle, PostgreSQL, ...)
Allows multiple connections to DBs at runtime
Adds version management
  o Data type: real and/or MC
  o Detector type
  o Date and time range
Reduces SQL coding
  o Simple predefined tables
  o Only simple SQL used
Ultimately a generic container
  o Handles write/read access

Version Management
[Table: validity time range (UTC) per detector, time and version; rows STS CAL, MVD CAL, MVD TEMP. Figure: RunID versus time]
It must be possible to get a consistent set of information for any date (e.g. the start time of a certain run).
It must be possible to answer the question: 'Which parameters were used when analyzing this run X years ago?' (The calibration might have been optimized several times since that date. Maybe some bugs have been detected and corrected in the meantime.)

Version Management: The Query Process
1. The context (timestamp, detector, version) is the primary key.
2. The context is converted to a unique SeqNo.
3. The SeqNo is used as the key to access all rows in the main table.
4. The system gives the user access to all such rows.
[Diagram: the context is matched against validity frames in an auxiliary validity table. Reference: Bigtable: A Distributed Storage System for Structured Data, Google Inc., OSDI 2006. Slide credit: D. Bertini]
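The four query steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names ValidityFrame, VALIDITY_TABLE, MAIN_TABLE, find_seq_no and query are hypothetical, the table contents are made-up placeholder values, and the half-open validity interval is a choice made for this sketch. None of it is the FairRoot database interface.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


def utc(year, month, day):
    """Build a timezone-aware UTC timestamp."""
    return datetime(year, month, day, tzinfo=timezone.utc)


@dataclass(frozen=True)
class ValidityFrame:
    """One row of the (hypothetical) auxiliary validity table."""
    seq_no: int
    detector: str
    version: int
    start: datetime  # inclusive
    end: datetime    # exclusive (assumption of this sketch)


# Hypothetical placeholder data, not taken from the slides.
VALIDITY_TABLE = [
    ValidityFrame(1, "STS", 0, utc(2012, 1, 1), utc(2013, 1, 1)),
    ValidityFrame(2, "STS", 0, utc(2013, 1, 1), utc(2014, 1, 1)),
    ValidityFrame(3, "MVD", 0, utc(2012, 1, 1), utc(2014, 1, 1)),
]

# Main table: SeqNo -> parameter rows (placeholder values).
MAIN_TABLE = {
    1: [{"channel": 0, "gain": 1.00}],
    2: [{"channel": 0, "gain": 1.05}],
    3: [{"channel": 0, "gain": 0.98}],
}


def find_seq_no(timestamp, detector, version):
    """Steps 1 and 2: match the context against the validity frames."""
    for frame in VALIDITY_TABLE:
        if (frame.detector == detector and frame.version == version
                and frame.start <= timestamp < frame.end):
            return frame.seq_no
    return None


def query(timestamp, detector, version):
    """Steps 3 and 4: use the SeqNo as key into the main table."""
    seq_no = find_seq_no(timestamp, detector, version)
    return [] if seq_no is None else MAIN_TABLE.get(seq_no, [])


rows = query(utc(2013, 6, 25), "STS", 0)
```

Because the timestamp is part of the context, asking for the same detector and version at an earlier date selects a different validity frame and therefore the parameters that were valid then.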

New Data transfer layer for FairRoot

The Online Reconstruction and Analysis
[Diagram labels: 300 GB/s, 20M Evt/s; < 1 GB/s, 25K Evt/s; 1 TB/s; 1 GB/s; > CPU-core or equivalent GPU, FPGA, …]
We have the fastest algorithms, but:
How to distribute the processes?
How to manage the data flow?
How to recover processes when they crash?
How to monitor the whole system?
…

Design constraints
Highly flexible:
  o Different data paths should be modeled.
Adaptive:
  o Sub-systems are continuously under development and improvement.
Should work for simulated and real data:
  o Developing and debugging the algorithms.
It should support all possible hardware where the algorithms could run (CPU, GPU, FPGA).
It has to scale to any size, with minimum or ideally no effort.

Data transport
How to handle dynamic components, i.e. pieces that go away temporarily?
How to handle messages that we can't deliver immediately, particularly if we're waiting for a component to come back online?
What if we need to use a different network transport, say multicast instead of TCP unicast, or IPv6?
Do we need to rewrite the applications, or is the transport abstracted in some layer?

Before Re-inventing the Wheel
What is available on the market and in the community?
  o A very promising package: ZeroMQ has been available for two years.
Do we intend to separate online and offline? NO
Multi-threaded concept, or multiple processes based on message queues?
  o Message-based systems allow us to decouple producers from consumers.
  o We can spread the work to be done over several processes and machines.
  o We can manage/upgrade/move around programs (processes) independently of each other.

ØMQ (zeromq)
A socket library that acts as a concurrency framework.
Carries messages across inproc, IPC, TCP, and multicast.
Connect N-to-N via fanout, pubsub, pipeline, request-reply.
Asynchronous I/O for scalable multicore message-passing apps.
30+ languages including C, C++, Java, .NET, Python.
Most OSes including Linux, Windows, OS X, PPC405/PPC440.
Large and active open source community.
LGPL free software with full commercial support from iMatix.

What does it deliver?
It handles I/O asynchronously, in background threads.
  o These communicate with application threads using lock-free data structures.
  o Concurrent ØMQ applications need no locks, semaphores, or other wait states.
Components can come and go dynamically and ØMQ will automatically reconnect.
  o You can start components in any order.
  o You can create "service-oriented architectures" (SOAs) where services can join and leave the network at any time.
When a queue is full, ØMQ
  o automatically blocks senders, or
  o throws away messages,
depending on the kind of messaging you are doing (the so-called "pattern").
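The two full-queue behaviours can be illustrated with a bounded queue from the Python standard library. This is an analogy only, not the ZeroMQ API: send_lossy and send_blocking are hypothetical helper names, and the queue size of 2 is an arbitrary stand-in for a configurable buffer size.

```python
import queue


def send_lossy(q, message):
    """Lossy behaviour: drop the message if the queue is full.

    Returns True if the message was queued, False if it was dropped.
    """
    try:
        q.put_nowait(message)
        return True
    except queue.Full:
        return False


def send_blocking(q, message, timeout=None):
    """Blocking behaviour: wait until there is room in the queue.

    With timeout=None this waits indefinitely; with a timeout it raises
    queue.Full once the wait expires.
    """
    q.put(message, block=True, timeout=timeout)


outbox = queue.Queue(maxsize=2)

# Lossy sender: the first two messages fit, the rest are thrown away.
accepted = [send_lossy(outbox, n) for n in range(4)]

# Blocking sender: nobody drains the queue here, so the sender stalls
# until the (short, demo-only) timeout expires.
try:
    send_blocking(outbox, 99, timeout=0.01)
    sender_was_blocked = False
except queue.Full:
    sender_was_blocked = True
```

Which behaviour is appropriate depends on the data: dropping suits monitoring or log traffic where the newest information matters most, while blocking applies back-pressure so that no event data is lost.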

What does it deliver?
It does not impose any format on messages.
  o They are blobs of any size, from zero to gigabytes.
  o You can use any other product (protocol) on top to represent your data (Google's protocol buffers, etc.).
Applications talk to each other over arbitrary transports: TCP, multicast, in-process, inter-process.
  o You don't need to change your code to use a different transport.

The built-in core ØMQ patterns are:
Request-reply, which connects a set of clients to a set of services (remote procedure call and task distribution pattern).
Publish-subscribe, which connects a set of publishers to a set of subscribers (data distribution pattern).
Pipeline, which connects nodes in a fan-out / fan-in pattern that can have multiple steps and loops (parallel task distribution and collection pattern).
Exclusive pair, which connects two sockets exclusively.

Current Status
The framework delivers components that can be connected to each other in order to optimize the data-flow topology.
All components share a common base class called Device (ZeroMQ class).
Devices are grouped into three categories:
  o Source: Sampler
  o Message-based processor: Sink, BalancedStandaloneSplitter, StandaloneMerger, Buffer
  o Content-based processor: Processor

Panda Example
[Diagram: the FairMQ package, distinguishing experiment/detector-specific code from framework classes that can be used directly]

Example of a Panda online reconstruction hierarchy (scenario)
[Diagram labels: Computing Unit; Detector Simulation; MVD pixel data; MVD strip data; Clusterer; Tracker; Parameter database; Log; Log Aggregate; Log Writer; socket types REQ, REP, PUB, SUB, XPUB, XSUB]

Correct semantics for logging
Pub/Sub sockets
Never block
Lossy! (if needed)
Buffer sizes / locations configurable
Arbitrary message size

Results
A throughput of 940 Mbit/s was measured, which is very close to the theoretical limit of TCP/IPv4 over Gigabit Ethernet.
The throughput of the named pipe transport between two devices on one node was measured at around 1.7 GB/s.
Each message consists of the digits of one Panda event for one detector, with a size of a few kBytes.
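As a rough cross-check of the "theoretical limit" mentioned above, the achievable TCP payload rate on Gigabit Ethernet can be estimated from the per-frame overhead. The sketch assumes a standard 1500-byte MTU (no jumbo frames), full-size segments, and IPv4 and TCP headers without options other than the optional TCP timestamp. These assumptions are not stated on the slide, and goodput_mbit is a hypothetical helper name.

```python
# Per-frame byte counts for standard Ethernet (assumption: MTU 1500, no VLAN tag).
MTU = 1500                      # IP packet size carried in one frame
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, header, FCS, inter-frame gap
IPV4_HEADER = 20                # without IP options
TCP_HEADER = 20                 # without TCP options
TCP_TIMESTAMP_OPTION = 12       # commonly enabled timestamp option, padded
LINE_RATE_MBIT = 1000.0         # Gigabit Ethernet


def goodput_mbit(tcp_option_bytes=0):
    """Payload rate in Mbit/s when every frame carries a full-size segment."""
    payload = MTU - IPV4_HEADER - TCP_HEADER - tcp_option_bytes
    bytes_on_wire = MTU + ETHERNET_OVERHEAD
    return LINE_RATE_MBIT * payload / bytes_on_wire


limit_plain = goodput_mbit()                               # about 949 Mbit/s
limit_with_timestamps = goodput_mbit(TCP_TIMESTAMP_OPTION)  # about 941 Mbit/s
```

Under these assumptions the ceiling is roughly 941 to 949 Mbit/s, so a measured 940 Mbit/s leaves very little room for overhead added by the messaging layer.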

Payload in MByte/s as a function of message size
[Plot]
ZeroMQ works on InfiniBand, but using IP over IB.

Integrating the existing software: ZeroMQ and ROOT (event loop)
[Diagram: in the ROOT event loop, FairRootManager and FairRunAna drive FairTasks (Init(), Re-Init(), Exec(), Finish()) on input from ROOT files, Lmd files, a remote event server, …; on the ZeroMQ side, FairMQProcessorTask offers the same Init(), Re-Init(), Exec(), Finish() interface]

FairBase/examples/Tutorial3

Next to implement
Local and central log processors
Command channels and objects (messages)
Automatic monitoring and configuration (hopefully by the end of this year!)

Summary
The ZeroMQ communication layer is integrated into our offline framework (FairRoot).
In the short term we will keep both options: the ROOT-based event loop, and concurrent processes communicating with each other via ZeroMQ.
In the long term we are moving away from a single event loop to distributed processes.
Thank you!

Native InfiniBand/RDMA is faster than IP over IB
Implementing ZeroMQ over IB verbs will improve the performance.

Device
Each processing stage of a pipeline is occupied by a process which executes an instance of the Device class.

Sampler
Devices with no inputs are categorized as sources.
A sampler loops (optionally: infinitely) over the loaded events and sends them through the output socket.
A variable event rate limiter has been implemented to control the sending speed.
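A sampler loop with a simple rate limiter might look like the sketch below. It is written in Python for brevity and is not the FairMQ Sampler: the function name run_sampler and its parameters are hypothetical, and the output socket is replaced by an arbitrary send callable. The clock and sleep functions are injectable so that the pacing can be checked without waiting in real time.

```python
import time


def run_sampler(events, send, rate_hz=None, loop_forever=False,
                clock=time.monotonic, sleep=time.sleep):
    """Send every loaded event, optionally paced to rate_hz events per second.

    events must be re-iterable (e.g. a list) if loop_forever is True.
    Returns the number of events sent (never returns if loop_forever is True).
    """
    interval = None if rate_hz is None else 1.0 / rate_hz
    next_slot = clock()
    sent = 0
    while True:
        for event in events:
            if interval is not None:
                delay = next_slot - clock()
                if delay > 0:
                    sleep(delay)       # wait for the next sending slot
                next_slot += interval  # schedule the following slot
            send(event)
            sent += 1
        if not loop_forever:
            return sent


# Demo with a fake clock: sleeping simply advances the simulated time.
fake_now = [0.0]
sleeps = []


def fake_clock():
    return fake_now[0]


def fake_sleep(seconds):
    sleeps.append(seconds)
    fake_now[0] += seconds


received = []
count = run_sampler([b"evt0", b"evt1", b"evt2"], received.append,
                    rate_hz=10, clock=fake_clock, sleep=fake_sleep)
```

Scheduling against absolute time slots, rather than sleeping a fixed interval after every send, keeps the average rate stable even when sending itself takes a variable amount of time.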

Message format (Protocol)
Potentially any content-based processor or any source can change the application protocol.
Therefore, the framework provides a generic message class that works with any arbitrary, contiguous chunk of memory (FairMQMessage).
One has to pass a pointer to the memory buffer and the size in bytes, and can optionally pass a function pointer to a destructor, which will be called once the message object is discarded.
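The three ingredients named above (buffer, size in bytes, optional destructor callback) can be modelled with a small Python class. This is a conceptual sketch, not FairMQMessage: the class name Message, the on_discard parameter and the discard method are hypothetical. The same three ingredients also appear in ZeroMQ's C function zmq_msg_init_data, although whether and how FairMQMessage uses that call is not stated in the slides.

```python
class Message:
    """Content-agnostic wrapper around a contiguous chunk of memory."""

    def __init__(self, buffer, size, on_discard=None):
        if size < 0 or size > len(buffer):
            raise ValueError("size must be within the buffer")
        self._buffer = buffer
        self._view = memoryview(buffer)[:size]  # no copy is made
        self._size = size
        self._on_discard = on_discard

    @property
    def size(self):
        """Payload size in bytes."""
        return self._size

    @property
    def data(self):
        """Zero-copy view of the payload."""
        if self._view is None:
            raise ValueError("message was already discarded")
        return self._view

    def discard(self):
        """Release the payload and call the destructor callback once."""
        if self._view is None:
            return  # already discarded; do not call the callback again
        buffer, callback = self._buffer, self._on_discard
        self._view = self._buffer = self._on_discard = None
        if callback is not None:
            callback(buffer)


released = []
raw = bytearray(b"digits-of-one-event")
msg = Message(raw, 6, on_discard=released.append)
payload_copy = bytes(msg.data)
msg.discard()
msg.discard()  # second call is a no-op
```

Handing ownership of the buffer to the message together with a release callback lets the transport decide when the memory is no longer needed, which avoids copying the payload before it is sent.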

New simple classes without ROOT are used in the Sampler. This enables us to use non-ROOT clients and reduces the message size.

Processor design
[Diagram: a Processor with N data input sockets and N data output sockets, plus connections to a log server and a command client]

Content-based Processor
The Processor device has at least one input and one output socket.
A task is meant for accessing and potentially changing the message content.
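The division of labour between a processor and its task can be sketched as follows. The class and method names (Task, UppercaseTask, Processor, init, execute, finish) are hypothetical and only loosely mirror the Init()/Exec()/Finish() life cycle of tasks. Sockets are replaced by a plain iterable of incoming payloads and a send callable.

```python
class Task:
    """Hypothetical task interface: it sees and may change message content."""

    def init(self):
        pass

    def execute(self, payload: bytes) -> bytes:
        raise NotImplementedError

    def finish(self):
        pass


class UppercaseTask(Task):
    """Toy task standing in for e.g. a clusterer or tracker."""

    def execute(self, payload: bytes) -> bytes:
        return payload.upper()


class Processor:
    """Receives payloads, lets the task transform them, sends the results."""

    def __init__(self, task, incoming, send):
        self.task = task
        self.incoming = incoming  # stands in for the input socket
        self.send = send          # stands in for the output socket

    def run(self):
        self.task.init()
        try:
            for payload in self.incoming:
                self.send(self.task.execute(payload))
        finally:
            self.task.finish()


results = []
Processor(UppercaseTask(), [b"hit", b"miss"], results.append).run()
```

Keeping the transport handling in the processor and the physics in the task means the same task code could, in principle, run inside a different event loop without modification.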

Message-based Processor
All message-based processors inherit from Device and operate on messages without interpreting their content.
Four message-based processors have been implemented so far.
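A fan-out/fan-in pair that never looks inside the messages can be sketched like this. The functions split_round_robin and merge are hypothetical, and round-robin distribution is an assumption about what "balanced" means here. A real merger would forward whichever message arrives first, so the output order would generally not be deterministic.

```python
from collections import deque


def split_round_robin(messages, n_outputs):
    """Fan-out: distribute opaque messages evenly over n_outputs queues."""
    outputs = [deque() for _ in range(n_outputs)]
    for index, message in enumerate(messages):
        outputs[index % n_outputs].append(message)
    return outputs


def merge(inputs):
    """Fan-in: collect messages from all input queues into one stream.

    This toy version polls the inputs in turn; it does not modify or
    inspect the messages.
    """
    queues = [deque(q) for q in inputs]
    merged = []
    while any(queues):
        for q in queues:
            if q:
                merged.append(q.popleft())
    return merged


events = [b"e0", b"e1", b"e2", b"e3", b"e4"]
branches = split_round_robin(events, 2)   # e.g. two parallel clusterers
collected = merge(branches)               # e.g. one tracker downstream
```

Because both stages treat the payload as an opaque blob, the same splitter and merger can be reused for any detector data format.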

Example of fanning out and fanning in the data path for load balancing
[Diagram labels: MVD data; FairMQBalancedStandaloneSplitter; Clusterer; FairMQStandaloneMerger; Tracker; MVD Tracker]
