SuperB Computing R&D Workshop, 9-12 March 2010, IUSS, Ferrara, Italy. P. Mato / CERN

• About myself
• Introduction
◦ Software components
◦ Programming languages
• Software interoperability levels
◦ Level-1 to Level-3
• Data Processing Software Frameworks
• Possible evolution of Frameworks
• Summary

• Former LHCb core software coordinator
◦ Architect of the GAUDI framework
• Applications Area manager of the Worldwide LHC Computing Grid (WLCG)
◦ Develops and maintains the part of the physics applications software, and associated infrastructure, that is common among the LHC experiments
• Leader of the Software Development for Experiments (SFT) support group in the Physics (PH) Department at CERN
◦ Projects: Geant4, ROOT, GENSER, SPI, CernVM, etc.
◦ Provides support to the LHC experiments (and LC detector studies)

[Diagram: layered structure of experiment software. Applications sit on top of the Experiment Framework (event model, detector description, calibration), which in turn builds on specialized domains (Simulation, Data Management, Distributed Analysis), Core Libraries, and non-HEP specific software packages.]
• Every experiment has a framework for basic services and various specialized frameworks: event model, detector description, visualization, persistency, interactivity, simulation, calibration
• Many non-HEP libraries are widely used (e.g. Xerces, GSL, Boost)
• Applications are built on top of the frameworks and implement the required algorithms (e.g. simulation, reconstruction, analysis, trigger)
• Core libraries and services are widely used and provide basic functionality (e.g. ROOT, HepMC, ...)
• Specialized domains are common among the experiments (e.g. Geant4, COOL, generators, etc.)

• Foundation Libraries
◦ Basic types
◦ Utility libraries
◦ System isolation libraries
• Mathematical Libraries
◦ Special functions
◦ Minimization, random numbers
• Data Organization
◦ Event data
◦ Event metadata (event collections)
◦ Detector description
◦ Detector conditions data
• Data Management Tools
◦ Object persistency
◦ Data distribution and replication
• Simulation Toolkits
◦ Event generators
◦ Detector simulation
• Statistical Analysis Tools
◦ Histograms, N-tuples
◦ Fitting
• Interactivity and User Interfaces
◦ GUI
◦ Scripting
◦ Interactive analysis
• Data Visualization and Graphics
◦ Event and geometry displays
• Distributed Applications
◦ Parallel processing
◦ Grid computing

One or more implementations of each component exist for the LHC.

• C++ is used almost exclusively by all LHC experiments
◦ The LHC experiments that started with a FORTRAN code base completed the migration to C++ a long time ago
• Large common software projects in C++ have been in production for many years
◦ ROOT, Geant4, ...
• FORTRAN is still in use, mainly by the MC generators
◦ Large development efforts are going into the migration to C++ (Pythia8, Herwig++, Sherpa, ...)
• Java is almost nonexistent at the LHC
◦ The exception is the ATLAS event display ATLANTIS

• Scripting has been an essential component of HEP analysis software for decades
◦ PAW macros (kumac) in the FORTRAN era
◦ The C++ interpreter (CINT) in the C++ era
◦ Python is widely used by 3 out of 4 LHC experiments
• Most of the statistical data analysis and final presentation is done with scripts
◦ Interactive analysis
◦ Rapid prototyping to test new ideas
• Scripts are also used to "configure" the complex C++ programs developed and used by the experiments
◦ "Simulation" and "Reconstruction" programs with hundreds or thousands of options to configure

• The Python language is interesting for two main reasons:
◦ High-level programming language
▪ Simple, elegant, easy-to-learn language
▪ Ideal for rapid prototyping
▪ Widely used for scientific programming
◦ Framework to "glue" different functionalities
▪ Any two pieces of software can be glued at runtime if they offer a "Python interface"
▪ In principle, with PyROOT any C++ class can easily be used from Python
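To make the "glue" idea concrete, here is a minimal, hedged sketch: a small C++ class is declared through the ROOT interpreter and then used directly from Python via PyROOT. It assumes a recent ROOT installation with PyROOT enabled; the class itself is invented for the example.

```python
# Minimal PyROOT "glue" sketch: expose a C++ class to Python at runtime.
# Assumes ROOT with PyROOT bindings; the class is invented for illustration.
import ROOT

# Define a tiny C++ class through the ROOT interpreter (it could equally come
# from a compiled library plus its dictionary).
ROOT.gInterpreter.Declare("""
class EnergySum {
public:
    void   add(double e) { total_ += e; }
    double total() const { return total_; }
private:
    double total_ = 0.0;
};
""")

calo = ROOT.EnergySum()          # a C++ object, used as if it were Python
for e in (1.2, 3.4, 5.6):
    calo.add(e)
print("total energy:", calo.total())
```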

• The software system of an experiment should allow for interoperability, reuse, and independent development
◦ Level 1 - Application Interfaces/Formats
◦ Level 2 - Software Integrating Elements
◦ Level 3 - Common Frameworks

Application Interfaces and Common Formats

• Agreement on 'standard formats' for input and output data is essential for independent development of different detector concepts
◦ Detector description, configuration data, MC generator data, raw data, reconstructed data, statistical data, etc.
• This has been the direction for the ILC software so far (e.g. lcdd, lcio, stdhep, gear, ...)
◦ Very successful
◦ Interoperability between different languages (C++, Java)
• Unfortunately, the LHC has been standardizing on other formats/interfaces (e.g. HepMC, GDML, ROOT files, etc.)
◦ Efforts have started to increase convergence in this area

• It is not sufficient to say that something is in 'ROOT' format or 'ASCII' format
◦ You need to agree on the actual data contents
◦ This implies common or standard 'Event Data Models'
• Common Event Data Models
◦ Obviously, within an experiment the event model is common
▪ Algorithms can easily be re-used between reconstruction, the high-level trigger, etc.
◦ Sharing of event data models at the LHC has not happened
◦ By contrast, LCIO is a very successful event model (and I/O system) for the ILC community
▪ It allows different detector concepts to interoperate

• Event record, written in C++, for HEP MC generators
◦ Many extensions beyond HEPEVT (the Fortran HEP standard common block)
• Agreed interface between MC authors and clients (LHC experiments, Geant4, ...)
• I/O support
◦ Reads and writes ASCII files; handy but not very efficient
◦ HepMC events can also be stored with ROOT I/O

• Geometry Description Markup Language (XML)
• Low level (materials, shapes, volumes and placements)
◦ Quite verbose to edit directly
• Directly understood by Geant4 and ROOT
◦ Long-term commitment
• Has been extended by lcdd with sensitive volumes, regions, visualization attributes, etc.
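As a small illustration (not from the slides): because ROOT's geometry package understands GDML directly, a geometry file can be loaded and inspected from Python through PyROOT. This assumes ROOT was built with GDML support; the file name is hypothetical.

```python
# Sketch: inspect a GDML geometry from Python via PyROOT.
# Assumes ROOT built with GDML support; "detector.gdml" is a hypothetical file.
import ROOT

geo = ROOT.TGeoManager.Import("detector.gdml")   # parse the GDML file
if geo:
    top = geo.GetTopVolume()
    print("top volume:", top.GetName(),
          "with", top.GetNdaughters(), "daughter volumes")
```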

Dictionaries, Scripting Languages and Component Models

• To facilitate the integration of independently developed components into a coherent [single] program
• Dictionaries
◦ Dictionaries provide metadata (reflection) that allows introspection of, and interaction with, objects in a generic manner
◦ The LHC strategy has been a single reflection system (Reflex)
• Scripting languages
◦ Interpreted languages are ideal for rapid prototyping
◦ They allow the integration of independently developed software modules (software bus)
◦ The LHC standardized on the CINT and Python scripting languages
• Component model and plug-in management
◦ Model the application as a set of components with well-defined interfaces
◦ Load the required functionality at runtime
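The sketch below illustrates what a reflection dictionary plus plug-in loading buys you in practice: once a dictionary library has been loaded, classes can be located and instantiated by name at runtime. The library and class names are invented for the example; only the pattern matters.

```python
# Sketch: what a reflection dictionary buys you at runtime.
# "libEventDict" and "MyEvent" are hypothetical; the point is that, once the
# dictionary library is loaded, classes can be found and instantiated by name
# with no compile-time dependency in the calling code.
import ROOT

if ROOT.gSystem.Load("libEventDict") < 0:       # load dictionary library (plug-in style)
    raise RuntimeError("dictionary library not found")

cls_name = "MyEvent"                            # could come from a config file
obj = getattr(ROOT, cls_name)()                 # instantiate via reflection
print("created an instance of", type(obj).__name__)
```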

[Diagram: dictionary generation and use. Class headers (X.h) are processed by rootcint or genreflex into dictionary libraries (XDict.so); the Reflex and ROOT meta reflection data then serve C++ (CINT), Python (PyROOT), object I/O, scripting and plug-in management.]

• Highly optimized (for speed and size), platform-independent I/O system, developed over more than 10 years
◦ Able to write/read any C++ object (event-model independent)
◦ Almost no restrictions (a default constructor is needed)
• Makes use of 'dictionaries'
• Self-describing files
◦ Support for automatic and complex 'schema evolution'
◦ Usable without 'user libraries'
• All the LHC experiments will rely on ROOT I/O for many years to come
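A hedged example of the self-describing I/O from Python (file and object names invented): any dictionary-known object can be written to a ROOT file and read back without user libraries.

```python
# Sketch: ROOT object I/O from Python (PyROOT).  The file and histogram names
# are invented; any dictionary-known C++ object could be used instead.
import ROOT

# Write: objects are streamed into a self-describing file.
f = ROOT.TFile("example.root", "RECREATE")
h = ROOT.TH1F("h_energy", "Energy;E [GeV];entries", 100, 0.0, 50.0)
h.FillRandom("gaus", 1000)
h.Write()
f.Close()

# Read back: the file describes its own contents, no user library needed here.
f = ROOT.TFile("example.root")
h2 = f.Get("h_energy")
print("entries read back:", int(h2.GetEntries()))
f.Close()
```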

Software Frameworks

• A software framework is an abstraction in which common code providing generic functionality can be selectively overridden or specialized by user code providing specific functionality
• A software framework is similar to a software library in that both are reusable abstractions of code wrapped in a well-defined API
◦ Typically the framework "calls" the user-provided adaptations for specific functionality
• A framework is the realization of a software architecture and facilitates software re-use
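The following toy sketch (all names invented; not any real framework's API) shows the defining property: the framework owns the control flow and calls user-supplied specializations, rather than user code calling library functions.

```python
# Toy sketch of the "inversion of control" idea: the framework owns the event
# loop and calls user-supplied algorithms.  All names are invented.
class Algorithm:                       # user code specializes this
    def initialize(self): pass
    def execute(self, event): pass
    def finalize(self): pass

class Framework:                       # generic, reusable part
    def __init__(self, algorithms):
        self.algorithms = algorithms
    def run(self, events):
        for alg in self.algorithms:
            alg.initialize()
        for event in events:
            for alg in self.algorithms:
                alg.execute(event)     # the framework calls user code, not vice versa
        for alg in self.algorithms:
            alg.finalize()

class PrintEnergy(Algorithm):          # user-provided specialization
    def execute(self, event):
        print("event energy:", event["energy"])

Framework([PrintEnergy()]).run([{"energy": 12.3}, {"energy": 45.6}])
```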

• Capture the major interfaces between subsystems and packages early
• Be able to visualize and reason about the design in a common notation
◦ Common vocabulary, running scenarios
• Be able to break the work into smaller pieces that can be developed concurrently by different teams
• Acquire an understanding of non-functional constraints
◦ Programming languages, concurrency, databases, GUI, component re-use

• A well-designed architecture has certain qualities:
◦ layered subsystems
◦ low inter-subsystem coupling
◦ robust, resilient and scalable
◦ a high degree of reusable components
◦ clear interfaces
◦ driven by the most important and risky use cases
◦ easy to understand

• The experiments have developed software frameworks
◦ A general architecture for any event-processing application (simulation, trigger, reconstruction, analysis, etc.)
◦ To achieve coherency and to facilitate software re-use
◦ Hide technical details from the end-user physicists
◦ Help the physicists focus on their physics algorithms
• Applications are developed by customizing the framework
◦ By the "composition" of elemental Algorithms to form complete applications
◦ Using third-party components wherever possible and configuring them
• ALICE: AliROOT; ATLAS+LHCb: Athena/Gaudi; CMS: CMSSW

• Predefined component 'vocabulary'
◦ E.g. 'Algorithm', 'Tool', 'Service', 'Auditor', 'DataObject', 'Property', 'DetectorCondition', etc.
• Separation of interfaces and implementations
◦ Allowing implementations to evolve
• Plug-in based (dynamic loading)
• Homogeneous configuration, logging and error reporting
• Built-in profiling, monitoring, utilities, etc.
• Interoperable with other languages (e.g. Java, Python, etc.)

• GAUDI is a mature software framework for event data processing, used by several HEP experiments
◦ ATLAS, LHCb, HARP, GLAST, Daya Bay, Minerva, BES III, ...
• The same framework is used for all applications
◦ All applications behave the same way (configuration, logging, control, etc.)
◦ Re-use of 'Services' (e.g. detector description)
◦ Re-use of 'Algorithms' (e.g. reconstruction -> HLT)

• Separation between "data" and "algorithms"
• Three basic categories of "data"
◦ Event data, detector data, statistical data
• Separation between "transient" and "persistent" representations of the data
• Data-store-centered ("blackboard") architectural style
• "User code" encapsulated in a few specific places
• Well-defined component "interfaces" with plug-in capabilities

[Diagram: blackboard dataflow. Algorithms A, B and C communicate only through the Transient Event Data Store: each algorithm reads its input data objects (T1, T2, ...) from the store and registers its outputs there. The apparent dataflow runs from algorithm to algorithm, while the real dataflow always goes through the store.]
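The toy sketch below mimics this blackboard style (invented names, not the real GAUDI API): algorithms never call each other, they only read and register data objects in a shared transient store, so the real dataflow always goes through the store.

```python
# Toy sketch of the blackboard ("transient event store") style.
class TransientEventStore:
    def __init__(self):
        self._data = {}
    def put(self, key, obj):
        self._data[key] = obj
    def get(self, key):
        return self._data[key]

class MakeClusters:                         # producer: writes "Clusters"
    def execute(self, store):
        hits = store.get("RawHits")
        store.put("Clusters", [sum(hits[i:i + 2]) for i in range(0, len(hits), 2)])

class MakeTracks:                           # consumer: reads "Clusters"
    def execute(self, store):
        clusters = store.get("Clusters")
        store.put("Tracks", [c for c in clusters if c > 1.0])

store = TransientEventStore()
store.put("RawHits", [0.4, 0.8, 1.5, 0.2, 2.0, 0.1])
for alg in (MakeClusters(), MakeTracks()):
    alg.execute(store)                      # real dataflow always via the store
print("tracks:", store.get("Tracks"))
```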

• Concept of sequences of Algorithms, to allow processing based on physics signatures
◦ Avoid re-calling the same algorithm on the same event
◦ Different instances of the same algorithm are possible
• Event filtering
◦ Avoid passing every event through the whole processing chain
[Diagram: event input/output feeding sequences of single-instance algorithms, with filter decisions steering the flow.]

• Typically the execution of Algorithms is explicitly specified by the initial sequence and sub-sequences
◦ Avoids too-late loading of components (HLT)
◦ Easier to debug
• For some use cases it is necessary to trigger the execution of a given Algorithm by accessing an object in the transient store
◦ The DataOnDemand Service can be configured to provide this functionality
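A toy sketch of the idea (invented names, not the actual DataOnDemand Service API): asking the store for an object that is not yet there triggers the algorithm registered to produce it.

```python
# Toy sketch of "data on demand": a miss in the store runs the registered producer.
class OnDemandStore:
    def __init__(self):
        self._data = {}
        self._producers = {}                 # path -> callable that fills it
    def register_producer(self, path, producer):
        self._producers[path] = producer
    def put(self, path, obj):
        self._data[path] = obj
    def get(self, path):
        if path not in self._data and path in self._producers:
            self._producers[path](self)      # run the producing algorithm now
        return self._data[path]

def make_muons(store):
    store.put("/Event/Rec/Muons", ["mu+", "mu-"])

store = OnDemandStore()
store.register_producer("/Event/Rec/Muons", make_muons)
print(store.get("/Event/Rec/Muons"))         # triggers make_muons on first access
```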

◦ JobOptions Service
◦ Message Service
◦ Particle Properties Service
◦ Event Data Service
◦ Histogram Service
◦ N-tuple Service
◦ Detector Data Service
◦ Magnetic Field Service
◦ Tracking Material Service
◦ Random Number Generator
◦ Chrono Service
◦ (Persistency Services)
◦ (User Interface & Visualization Services)
◦ (Geant4 Services)

• Each framework component can be configured by a set of 'properties' (name/value pairs)
• In total, thousands of parameters need to be specified to fully configure a complex HEP application
• Python is used to facilitate the task
◦ Python "configurables" are generated from the C++ components
◦ Built-in type checking
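The pattern can be sketched as follows (a toy, not the actual Gaudi configurables machinery): each component exposes typed name/value properties, so misconfigurations are caught in Python before the job starts.

```python
# Toy sketch of the "configurable" idea: typed properties checked at configuration time.
class Configurable:
    _properties = {}                      # name -> (type, default)
    def __init__(self, **kwargs):
        for name, (ptype, default) in self._properties.items():
            setattr(self, name, default)
        for name, value in kwargs.items():
            if name not in self._properties:
                raise AttributeError("unknown property: %s" % name)
            ptype = self._properties[name][0]
            if not isinstance(value, ptype):
                raise TypeError("%s expects %s" % (name, ptype.__name__))
            setattr(self, name, value)

class TrackFitter(Configurable):          # would be generated from the C++ component
    _properties = {"MaxIterations": (int, 10), "Tolerance": (float, 1e-3)}

fitter = TrackFitter(MaxIterations=20)    # OK
# TrackFitter(MaxIterations="20")         # would raise TypeError before the job runs
print(fitter.MaxIterations, fitter.Tolerance)
```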

[Diagram: Gaudi parallel processing. A Reader process reads the input event data into its transient event store and puts events on an Event Input Queue; N Worker processes, each with its own transient event store, run the Algorithms and push results onto an Event Output Queue; a Writer process collects them through an OutputStream into the output. The whole job is driven by a single configuration: gaudirun --parallel=N optionfile.py]
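A minimal sketch of the same reader/worker/writer layout with Python's standard multiprocessing module (the event content and the "algorithm" are invented; output ordering and error handling are ignored):

```python
# Sketch of the reader/worker/writer pattern using Python's multiprocessing.
import multiprocessing as mp

def reader(in_q, n_workers):
    for i in range(10):                      # pretend to read 10 events
        in_q.put({"event": i, "hits": list(range(i))})
    for _ in range(n_workers):
        in_q.put(None)                       # poison pills to stop the workers

def worker(in_q, out_q):
    while True:
        event = in_q.get()
        if event is None:
            out_q.put(None)
            break
        event["nhits"] = len(event["hits"])  # the "algorithm" runs here
        out_q.put(event)

def writer(out_q, n_workers):
    done = 0
    while done < n_workers:
        event = out_q.get()
        if event is None:
            done += 1
        else:
            print("writing event", event["event"], "nhits =", event["nhits"])

if __name__ == "__main__":
    n = 4
    in_q, out_q = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=reader, args=(in_q, n))]
    procs += [mp.Process(target=worker, args=(in_q, out_q)) for _ in range(n)]
    procs += [mp.Process(target=writer, args=(out_q, n))]
    for p in procs: p.start()
    for p in procs: p.join()
```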

• The high-level steering of applications could be done in Python
◦ Configuration, plug-in (module) loading, event loop, algorithm scheduling, etc.
◦ Python's low execution performance shouldn't be a problem at this level
• The real number-crunching could be done in a proper compiled language (e.g. C or C++)
• It is essential to define 'interfaces' and 'concepts' for the framework components
◦ As is the case for a C++ framework
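As a small, hedged illustration of this split, the sketch below keeps the event loop in Python and delegates the arithmetic to a compiled library loaded via ctypes; here the C math library stands in for an experiment's compiled algorithm code.

```python
# Sketch: steering in Python, number-crunching in compiled code.
import ctypes
import ctypes.util
import math

libname = ctypes.util.find_library("m")          # locate the C math library
libm = ctypes.CDLL(libname) if libname else None

def csqrt(x):
    if libm is not None:
        libm.sqrt.restype = ctypes.c_double
        libm.sqrt.argtypes = [ctypes.c_double]
        return libm.sqrt(x)
    return math.sqrt(x)                          # fallback if the lookup fails

events = [{"px": 3.0, "py": 4.0}, {"px": 6.0, "py": 8.0}]
for i, ev in enumerate(events):                  # high-level event loop in Python
    pt = csqrt(ev["px"] ** 2 + ev["py"] ** 2)    # "heavy" work done in compiled code
    print("event", i, "pt =", pt)
```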

• This could simplify a number of functionalities
◦ Plug-in declaration, definition and loading
◦ Interactivity/scripting would be more natural and better integrated
• There is already positive experience with Gaudi Parallel
◦ The setup and steering is done from Python using a standard module (e.g. Processing)

• I have been focusing on the reasons why a framework is needed rather than recommending a concrete one
• When deciding on the framework (and languages), not only the technical advantages should be considered
◦ Interoperability with existing code from the community (e.g. ROOT, Geant4, etc.)
◦ The cost of training people should not be underestimated
• Common interfaces/formats are good, but adopting a common framework is even better
◦ It would enable re-use one level up
◦ A single framework covering all applications is desirable
• SuperB could profit from the existing structures and support for the common LHC software