Gran Sasso Lab, Jul-2002 Andreas Pfeiffer, CERN/IT-API, Anaphe - OO Libraries for Data Analysis using C++ and Python AIDA - Abstract Interfaces for Data Analysis

Presentation transcript:

Anaphe - OO Libraries for Data Analysis using C++ and Python
AIDA - Abstract Interfaces for Data Analysis
Andreas Pfeiffer, CERN IT/API

Outline
- Motivation
- Anaphe Components (C++)
- Lizard: Interactive Data Analysis (Python)
- Software quality control
- Summary

LHC Computing Challenge

LHC & The Alps (figure): 27 km circumference, ~100 m deep, interaction points

LHC Computing Challenge
- 4 experiments will create a huge amount of data
  - >1 PetaByte/year for each experiment!
  - 1 PetaByte = 10^15 Bytes = 1,000 TeraBytes = 20,000 Redwood tapes = 100,000 dual-sided DVD-RAM disks = 1,500,000 sets of the Encyclopaedia Britannica (w/o photos)
- Need lots of CPU power to reconstruct/analyse
  - about 1000 PC boxes per experiment (2005 ones!)
  - [...] of today’s boxes (dual P-III 800 MHz)
- complex data models
- reconstruction s/w is also used for online filtering
  - needs high quality s/w in order not to waste beam time
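The storage equivalences on this slide can be checked with a little arithmetic. The sketch below assumes decimal units (1 PetaByte = 10^15 bytes) and derives the per-medium capacity implied by the quoted counts; the variable names are illustrative and not part of the talk.

```python
# Rough check of the storage equivalences, assuming decimal (SI) units.
PETABYTE = 10**15
TERABYTE = 10**12
GIGABYTE = 10**9

# ">1 PetaByte/year" per experiment: use 1 PB as the lower bound.
yearly_volume = 1 * PETABYTE

terabytes = yearly_volume // TERABYTE

# Capacity per medium implied by the counts quoted on the slide.
gb_per_tape = yearly_volume / 20_000 / GIGABYTE
gb_per_dvd = yearly_volume / 100_000 / GIGABYTE

print(terabytes, "TB;", gb_per_tape, "GB per tape;", gb_per_dvd, "GB per disk")
```

Under these assumptions the quoted counts correspond to 50 GB per tape and 10 GB per dual-sided disk.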

Lifetime of LHC software = 25 yrs (timeline figure): SPS 1969; Unix V6, first public version, 1975; K&R C 1978; IBM PC 1981; W and Z 1983; Ethernet standard 1983; LEP 1989; Intel Pentium 1992; Java 1995; LEP ends 2000; WWW, XML and Linux also appear on the timeline

Technology (R)Evolution
- 10 yrs major cycle length (HW, SW, OS)
  - ~12 evolutionary changes in the market
  - 1 revolutionary change towards greater diversity
- don’t forget changes of requirements
- Consequences
  - s/w written today will most probably be rewritten tomorrow
  - we must anticipate changes

Anaphe: what it is
- Analysis for physics experiments
- Modular (OO/C++) replacement of CERNLIB functionality for use in HEP experiments
  - memory management
  - I/O
  - foundation classes
  - histogramming
  - minimizing/fitting
  - visualization
  - interactive data analysis
- Trying to use standards wherever possible
- Trying to re-use existing class libraries

Anaphe Components

AIDA: Abstract Interfaces for Data Analysis (see next talk)

Anaphe components (figure)

‘Layered’ Approach
- Basic functionalities (histograms, fitting, etc.) are available as individual C++ class libraries
  - Easy to replace one part without throwing away everything
  - Objectivity/DB to provide persistence
  - HepODBMS library (“insulating layer”, “tags”)
  - Histogram library (HTL)
  - Fitting libraries (Gemini, HepFitting)
  - Graphics libraries (Qt, Qplotter)
- Insulate components through Abstract Interfaces
  - “wrapper” layer to implement Interfaces in terms of existing libs
- Apply s/w quality control tools
  - code checking, testing
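The idea of insulating components through abstract interfaces, with a wrapper layer over an existing library, can be sketched in a few lines. This is only an illustration in Python rather than C++: IHistogram1D here is a simplified, hypothetical interface (not the AIDA definition), and LegacyHisto is a stand-in for an existing library with its own API.

```python
from abc import ABC, abstractmethod


class IHistogram1D(ABC):
    """Hypothetical abstract interface; client code depends only on this."""

    @abstractmethod
    def fill(self, x, weight=1.0):
        ...

    @abstractmethod
    def entries(self):
        ...


class LegacyHisto:
    """Stand-in for an existing library with its own, different API."""

    def __init__(self, nbins, lo, hi):
        self.nbins, self.lo, self.hi = nbins, lo, hi
        self.contents = [0.0] * nbins
        self.count = 0

    def accumulate(self, value, w):
        if self.lo <= value < self.hi:
            width = (self.hi - self.lo) / self.nbins
            self.contents[int((value - self.lo) / width)] += w
        self.count += 1


class LegacyHistoWrapper(IHistogram1D):
    """Wrapper layer: implements the interface in terms of the existing lib."""

    def __init__(self, nbins, lo, hi):
        self._impl = LegacyHisto(nbins, lo, hi)

    def fill(self, x, weight=1.0):
        self._impl.accumulate(x, weight)

    def entries(self):
        return self._impl.count


def fill_masses(histo, masses):
    """Client code: works with any implementation of IHistogram1D."""
    for m in masses:
        histo.fill(m)
    return histo.entries()


n = fill_masses(LegacyHistoWrapper(10, 0.0, 100.0), [12.5, 47.0, 99.9])
```

Replacing LegacyHisto with another implementation only requires a new wrapper; fill_masses stays untouched.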

ANAPHE Components (layer diagram)
- User Interface, using Abstract Types: Lizard Interactive Commands
- Abstract types, AIDA (Abstract Interfaces for Data Analysis): Histograms, NTuples, Fitting, Plotting, VectorOfPoints, Functions, Analyzer
- Implementations (HEP-specific): HTL, Tags (HepODBMS), Gemini/HepFitting, Qplotter, VectorOfPoints, CLHEP (Class Libraries for HEP)
- non-HEP components: Python / SWIG, Objectivity/DB | HBook, NAG-C | Minuit, Qt (free edition)

Basic 3D Graphic Libraries
- OpenGL (basic graphics)
  - De-facto industry standard for basic 3D graphics
  - Used in CAD/CAE, games, VR, medical imaging
- OpenInventor (scene mgmt.)
  - OO 3D toolkit for graphics
  - Cubes, polygons, text, materials
  - Cameras, lights, picking
  - 3D viewers/editors, animation
  - Based on OpenGL/MesaGL

2D Graphics libraries
- Qt
  - multi-platform C++ GUI toolkit
  - C++ class library, not a wrapper around C libs
  - superset of Motif and MFC
  - available on Unix and MS Windows, with no change for the developer
  - commercial, but with a free edition
- Qplotter
  - “add-on” functionality for HEP (“HIGZ/HPLOT”)

Mathematical Libraries
- NAG (Numerical Algorithms Group) C Library
  - Covers a broad range of functionality: linear algebra, differential equations, quadrature, etc.
  - Special functions of CERNLIB added to Mark-6 release (mostly for theory and accelerator)
- Quality assurance
  - extensive testing done by NAG

CLHEP - foundation classes
- HEP foundation class library
  - Random number generators
  - Physics vectors (3- and 4-vectors)
  - Geometry
  - Linear algebra
  - System of units
  - more packages recently added
- will continue to evolve
- wwwinfo.cern.ch/asd/lhc++/clhep/

Histograms: the HTL package
- Histograms are the basic tool for physics analysis
  - Statistical information of density distributions
- Histogram Template Library (HTL)
  - design based on C++ templates
  - Modular: separation between sampling and display
  - Extensible: open for user-defined binning systems
  - Flexible: supports transient and persistent histograms at the same time
  - Open: large use of abstract interfaces
  - recent addition: 3D histograms
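The two design points "separation between sampling and display" and "user-defined binning systems" can be illustrated with a small sketch. HTL itself achieves this with C++ templates; the Python version below only mimics the structure, and the class and function names (VariableBinning, Histogram, render_text) are hypothetical.

```python
import bisect


class VariableBinning:
    """User-defined binning: arbitrary, increasing bin edges."""

    def __init__(self, edges):
        self.edges = list(edges)

    def nbins(self):
        return len(self.edges) - 1

    def index(self, x):
        """Return the bin index for x, or None if x is out of range."""
        if x < self.edges[0] or x >= self.edges[-1]:
            return None
        return bisect.bisect_right(self.edges, x) - 1


class Histogram:
    """Sampling only: accumulates weights, knows nothing about display."""

    def __init__(self, binning):
        self.binning = binning
        self.contents = [0.0] * binning.nbins()

    def fill(self, x, weight=1.0):
        i = self.binning.index(x)
        if i is not None:
            self.contents[i] += weight


def render_text(histo):
    """Display lives in a separate function that only reads the histogram."""
    return ["#" * int(c) for c in histo.contents]


h = Histogram(VariableBinning([0, 1, 10, 100]))
for value in [0.5, 3, 7, 50, 500]:
    h.fill(value)
lines = render_text(h)
```

Any object offering nbins() and index() can serve as a binning, and a different display only needs to read contents.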

Fitting and Minimization
- Fitting and Minimization Library (FML)
  - common OO interface to NAG-C and MINUIT
  - based on Abstract Interfaces (IVector, IModelFunction, …)
  - fitting as a special case of minimization: minimize the “distance” between data and model
  - replacement for HepFitting (and Gemini)
- Gemini
  - common interface to minimizer engine
  - very thin layer
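"Fitting as a special case of minimization" can be made concrete with a toy example: a generic minimizer that knows nothing about data, plus a fit routine that hands it a distance function. This is a sketch under simplifying assumptions (a one-parameter model y = k*x, a golden-section search, unweighted squared residuals); it is not the FML, NAG-C or MINUIT API.

```python
import math


def golden_section_minimize(f, lo, hi, tol=1e-9):
    """Generic 1D minimizer for a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


def fit_slope(xs, ys, lo=-100.0, hi=100.0):
    """Fit y = k*x by minimizing the squared distance between data and model."""

    def distance(k):
        return sum((y - k * x) ** 2 for x, y in zip(xs, ys))

    return golden_section_minimize(distance, lo, hi)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
k = fit_slope(xs, ys)
```

For this model the result can be cross-checked against the closed form k = sum(x*y) / sum(x*x); swapping in a different minimizer leaves fit_slope unchanged apart from the call.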

Opening bracket: Persistency

Object persistency
- Two concepts: serial and page I/O
- “Sequential access to objects” (streaming)
  - good in networking context or serial writes to file(s)
  - much like “good old Fortran”
  - often perceived to be “simpler” to implement (“<<”, “>>”)
- “Navigational access to objects” (buffered)
  - I/O on demand for complex data models
  - location-transparent (for the user) access to objects, typically by de-referencing a smart pointer
  - optimized for (random) disk access (disks deliver pages)
  - sequential write to file(s) still ok
- Both concepts need to take care of changes to the internal structure of the objects (schema evolution)
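Navigational access through a smart pointer amounts to "fetch the object only when it is de-referenced". The sketch below imitates that in Python with a toy in-memory store; ObjectStore, Ref and the object identifiers are hypothetical and do not correspond to the Objectivity/DB or HepODBMS API.

```python
class ObjectStore:
    """Toy stand-in for a persistent store, keyed by object identifier."""

    def __init__(self, records):
        self._records = records
        self.reads = 0  # counts simulated disk accesses

    def load(self, oid):
        self.reads += 1
        return self._records[oid]


class Ref:
    """Smart-pointer-like handle: fetches the object on first dereference."""

    def __init__(self, store, oid):
        self._store, self._oid = store, oid
        self._obj = None

    def get(self):
        if self._obj is None:
            self._obj = self._store.load(self._oid)
        return self._obj


store = ObjectStore({"track/1": {"pt": 12.3}, "track/2": {"pt": 4.5}})
refs = [Ref(store, "track/1"), Ref(store, "track/2")]

reads_before = store.reads       # creating references reads nothing
pt = refs[0].get()["pt"]         # first dereference triggers one read
pt_again = refs[0].get()["pt"]   # served from the cached object
reads_after = store.reads
```

The user code never states where "track/1" lives, which is the location transparency mentioned on the slide; the second track is never read because it is never de-referenced.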

Architectural Issue: Persistency (“Object-I/O”)
- Brings a completely new quality into the design
- Objects now have a lifetime
  - don’t “delete” until you are really sure you want to
  - persistency is a kind of “intended memory leak”
  - would like to see no difference between memory and disk
- “Layout” of objects may change during their (extended) life
  - “schema evolution”
  - additions/deletions of attributes
  - changes of inheritance relations

Architectural Issue: Persistency (“Object-I/O”) (II)
- Objects can be placed (“clustering”)
  - de-coupling of logical and physical view of data
- Special care needed to ensure consistency in the data set
  - avoid reading a group of objects (tracks, events, ...) for which writing/updating is not (yet) complete
  - clean up if only part of the objects are written
  - typically taken care of by using transactions
- Complications possible in distributed computing
- need to protect disk access now like memory access in the past (“Segmentation violation”)
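How transactions keep readers away from half-written groups of objects can be shown with a toy store. This is a minimal single-writer sketch with hypothetical names (TransactionalStore and its methods); real object databases add locking, logging and recovery on top of the same idea.

```python
class TransactionalStore:
    """Toy store: readers only ever see committed data."""

    def __init__(self):
        self._committed = {}
        self._pending = None  # writes of the open transaction, if any

    def begin(self):
        self._pending = {}

    def write(self, key, value):
        if self._pending is None:
            raise RuntimeError("write outside of a transaction")
        self._pending[key] = value

    def commit(self):
        if self._pending is None:
            raise RuntimeError("no open transaction")
        self._committed.update(self._pending)  # publish all writes together
        self._pending = None

    def abort(self):
        self._pending = None  # clean up partially written objects

    def read(self, key):
        return self._committed.get(key)


store = TransactionalStore()

store.begin()
store.write("event/1/track/1", {"pt": 12.3})
seen_while_writing = store.read("event/1/track/1")  # not visible yet
store.abort()                                       # writer gave up
seen_after_abort = store.read("event/1/track/1")

store.begin()
store.write("event/1/track/1", {"pt": 12.3})
store.write("event/1/track/2", {"pt": 4.5})
store.commit()
seen_after_commit = store.read("event/1/track/2")
```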

Physical Model and Logical Model
- Physical model may be changed to optimise performance
- Existing applications continue to work transparently!

Object Model (figure; thanks to Vincenzo Innocente, CMS)

Physical clustering (figure; thanks to Vincenzo Innocente, CMS)

Closing bracket: Persistency

“Tags”, Ntuples and Events
- Tags: a special kind of Ntuple
  - Always associated with an underlying persistent store
  - Tags may be used to store “ntuple-like” data extracted from all over the event: minPt, maxEmiss, nJets, nMuon, trigger, …
  - Main use: speed up data selection for analysis
  - Tag simplifies selection without losing complexity
- Events are more complex than a tree structure (“CWN”)
  - lots of cross-references between classes, containers
- Association from the Tag to the Event may be used to navigate to any other part of the Event
  - even from an interactive visualization program
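The selection pattern described here (cut on the small tags first, then navigate to the full event only for the survivors) can be sketched with plain dictionaries. The data, the maxPt attribute and the helper names are made up for illustration; nJets is taken from the slide.

```python
# Full events: large and expensive to read (here just small dictionaries).
EVENTS = {
    1: {"tracks": 57, "jets": [40.2, 31.0]},
    2: {"tracks": 112, "jets": [88.5, 75.1, 20.3]},
    3: {"tracks": 23, "jets": []},
}

# Tags: a few summary attributes per event plus an association to the event.
TAGS = [
    {"event_id": 1, "nJets": 2, "maxPt": 40.2},
    {"event_id": 2, "nJets": 3, "maxPt": 88.5},
    {"event_id": 3, "nJets": 0, "maxPt": 0.0},
]


def select(tags, cut):
    """Apply the cut to the tags only; no event is touched here."""
    return [t["event_id"] for t in tags if cut(t)]


def load_event(event_id):
    """Follow the tag-to-event association (a dictionary lookup here)."""
    return EVENTS[event_id]


selected = select(TAGS, lambda t: t["nJets"] >= 2 and t["maxPt"] > 50.0)
jets = [load_event(event_id)["jets"] for event_id in selected]
```

Only the events passing the tag-level cut are loaded, while the full event content stays reachable through the association.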

Anaphe components (figure)

Anaphe Internals: (Abstract) Interfaces

AIDA compliance of Anaphe
- Presently (Anaphe 3.x) only AIDA 1.0 compliant
- Plan to implement AIDA 2.2 Interfaces by end 2001 (Anaphe 4.x)
  - initially as wrappers to existing interfaces/packages
- Will maintain 3.x for some time
  - ensures stability for users
- Development will concentrate on 4.x while AIDA evolves further
- Similar time schedule as JAS (Tony Johnson)
- OpenScientist (Guy Barrand) already there

Lizard: a tool for Interactive Data Analysis

Interactive Data Analysis
- Aim: “OO replacement for PAW” (at least)
  - analysis of “ntuple-like data” (“Tags”, “Ntuples”, …)
  - visualisation of data (Histograms, scatter-plot, “Vectors”)
  - fitting of histograms (and other data)
  - access to experiment specific data/code
- Maximize flexibility and re-use
- Foresee customization/integration
  - allow use from within experiment’s s/w
- Plan for extensions
  - “code for now, design for the future”
- Ensure maintainability
  - use of s/w quality control tools

Scripting - why
- Typical use of scripting is quite different from programming (reconstruction, analysis, ...)
  - history: “go back to where I was before”
  - repetition/looping with “modifiable parameters”
- avoid “one size fits all” or “using power-tool as hammer”
  - rapid prototyping in “scripting language”: quick turn-around times
  - performance critical code in “core language”: exploit richer set of features/functionality (e.g. templates in C++)
- scripting languages are usually less susceptible to changes than “mainstream languages”
  - potentially longer lives

Python - why
- Python: OO (scripting) language
  - no “strange $!%-variables”
  - sensitive to indentation
  - Easier for users than Java
- Lots of user supplied modules available and ready for use
  - scientific, numerics, graphics, GUI, network, OS, games, DBs, …
  - example: Parnassus. Totals: 1173 items in 49 categories.
- Also usable in Java (Jython)
  - used in JAS for scripting
  - minimize changes needed within AIDA compliant environments

Python - how
- SWIG to (semi-)automatically create the connection to the chosen scripting language
  - allows flexibility to choose amongst several scripting languages: Python, Perl, Tcl, Guile, Ruby, (Java), …
- Very easy to use
  - swig -c++ -python -shadow -c myClass.h
  - create shared lib from myClass.cpp and myClass_wrap.c
  - start python and import myClass to use it
- Very easy to extend
  - simply inherit from “swiggified” class in python
  - modifications can later be fed back into C++: performance, type safety, special language features (templates), …

PAW -> Lizard translation
- Ntuple projection in Lizard (the cut can be any valid C++ expression)

    lizard --useHBook
    :-) nt = ntm.findNtuple("higgscand.hbk::cands")
    :-) nplot1D(nt, "mass", "quality==5 && cut > 198")

- Ntuple projection in PAW

    pawX11
    paw> h/file 1 higgscand.hbk
    paw> nt/pl 10.mass quality=5.and.cut>198

- Assuming file higgscand.hbk contains an ntuple with number 10 and title cands

Tutorials and Examples available

Users and Collaborations
- AIDA spoken here!
  - IGUANA (CMS visualization)
  - GAUDI (LHCb/HARP) framework
  - ATHENA (Atlas) framework
  - Analyzer modules in Geant 4
  - JAS
  - Open Scientist
  - …you?

Software quality control

Software quality control
- Using tools for testing/checking has started: Insure++, CodeWizard
- Package dependencies: Ignominy
  - Set of perl and shell scripts by Lassi Tuura (CMS)
- Ignominy scans…
  - Make dependency data produced by the compilers (*.d files)
  - Source code for #includes (resolved against the ones actually seen)
  - Shared library dependencies (“ldd” output)
  - Defined and required symbols (“nm” output)
- And maps…
  - Source code and binaries into packages
  - #include dependencies into package dependencies
  - Unresolved/defined symbols into package dependencies
- ignominy: dishonour, disgrace, shame; infamy; the condition of being in disgrace, etc. (Oxford English Dictionary)
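The "maps" step, turning file-level #include dependencies into package-level dependencies, can be sketched as follows. This is not Ignominy's implementation (which is a set of perl and shell scripts); the file names are invented and the sketch assumes the first path component names the package.

```python
from collections import defaultdict


def package_of(path):
    """Assumed layout: the first path component names the package."""
    return path.split("/", 1)[0]


def package_dependencies(file_includes):
    """Collapse file -> included-files edges into package -> package edges."""
    deps = defaultdict(set)
    for source, includes in file_includes.items():
        src_pkg = package_of(source)
        for header in includes:
            dst_pkg = package_of(header)
            if dst_pkg != src_pkg:  # ignore includes within a package
                deps[src_pkg].add(dst_pkg)
    return {pkg: sorted(targets) for pkg, targets in deps.items()}


# Hypothetical input, e.g. as extracted from compiler-generated *.d files.
file_includes = {
    "Lizard/Commands.cpp": ["Lizard/Commands.h", "AIDA/IHistogram1D.h"],
    "HTL/Histo1D.cpp": ["HTL/Histo1D.h", "AIDA/IHistogram1D.h", "CLHEP/Units.h"],
    "AIDA/IHistogram1D.h": [],
}
pkg_deps = package_dependencies(file_includes)
```

The same collapsing step would apply to symbol-level data from "nm" once each symbol is attributed to a package.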

Ignominy Analysis of Anaphe
- Distribution of tools and utilities for LHC era physics
  - Combination of commercial, free and HEP software
  - Claims to be a toolkit
- Seems to live up to its toolkit claims
  - Good work on modularity
  - Clean design is evident in many places
  - Dependency diagrams often split naturally into functional units
- Thanks to Lassi Tuura (CMS)

Package Metrics
- Size = total amount of source code (not normalised across projects!)
- ACD = average component dependency (~ libraries linked in)
- CCD = sum of single-package component dependencies over whole release
  - Indicates testing/integration cost
- NCCD = measure of CCD compared to a balanced binary tree
  - A good toolkit’s NCCD will be close to 1.0
  - < 1.0: structure is flatter than a binary tree (= independent packages)
  - > 1.0: structure is more strongly coupled (vertical or cyclic)
  - Aim: NCCD ~ 1 for given software/functionality
- Thanks to Lassi Tuura (CMS)
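These metrics can be computed from a package dependency graph in a few lines. The sketch assumes the definitions commonly attributed to Lakos: a package's dependency count is the number of packages reachable from it (itself included), CCD is the sum of these counts, ACD = CCD / N, and the balanced-binary-tree reference for N packages is (N + 1) * log2(N + 1) - N. The slide does not spell out formulas, so treat these as assumptions.

```python
import math


def reachable(graph, start):
    """All packages needed to build/test `start`, including itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def metrics(graph):
    """Return (CCD, ACD, NCCD); every package must be a key of `graph`."""
    n = len(graph)
    ccd = sum(len(reachable(graph, pkg)) for pkg in graph)
    acd = ccd / n
    ccd_binary_tree = (n + 1) * math.log2(n + 1) - n
    return ccd, acd, ccd / ccd_binary_tree


# Seven packages arranged as a balanced binary tree: the reference case.
tree = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f", "g"],
        "d": [], "e": [], "f": [], "g": []}
# Seven fully independent packages: flatter than a binary tree.
flat = {name: [] for name in "abcdefg"}

tree_ccd, tree_acd, tree_nccd = metrics(tree)
flat_ccd, flat_acd, flat_nccd = metrics(flat)
```

The tree gives NCCD = 1.0 and the flat set a value below 1.0, matching the reading of the index on the slide; cyclic dependencies push the value above 1.0.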

Metrics: NCCD vs Cycles (figure; thanks to Lassi Tuura, CMS)
- Toolkits & Frameworks shown: ATLAS, ORCA, IGUANA, COBRA, G4, ROOT, Anaphe (plot annotation: “Includes Fortran”)
- NCCD (“spaghetti index”): ~ 1.0: good toolkit; < 1.0: indep. packages; > 1.0: strongly-coupled

History
- Started after CHEP-2000
- Full version out since June 2001
  - Established functionality exceeding PAW
  - Analyzer component giving direct access to data and libraries of the experiment framework
  - Based on Abstract Interfaces
  - Flexible and extensible
- Established parallel development of “license free” version while re-using existing libraries
  - Direct reading/writing of HBook files as an alternative to Objectivity/DB based persistency
  - Use of Minuit as a replacement for the minimizer of NAG-C

Ongoing activities
- Persistency
  - De-emphasize Objectivity/DB (in coordination with experiments, IT/DB and LCG)
  - Use of HBook ntuples
  - Text files (using AIDA defined XML format)
  - Planning to use LCG persistency (POOL)
  - Investigating direct reading of ROOT files
- Fitting
  - Implementing minimizer from GSL
- Discussing with the IGUANA team (CMS) to integrate their GUI components
- Looking forward to confirmation and/or re-direction of our efforts following the SC2 (RTAGs)

Future enhancements
- Access to other implementations of components
  - HBOOK CWNtuples
  - Communication with Java tools/packages (JAS, Wired) via AIDA
- Reading of ROOT (> V3.0) files
  - similar to Tony Johnson’s (Java) RootIO package
  - depends on “stability” of Root file format
- AIDA Ntuple/Histo store
  - optimized for Ntuples, Histograms as (compressed) XML
- Adding other “scripting” languages
  - Perl, Tcl, cint?

Challenge: Distributed Computing
- Motivation
  - move code to data
  - parallel analysis
- Techniques
  - services via AI (Abstract Interfaces)
  - late binding
  - plug-in architecture
- End-user (Lizard)
  - look-and-feel of local analysis
- R&D started and first prototype available soon
  - CORBA based

Summary
- The architecture of Anaphe shows some important items for flexible and modular data analysis:
  - weak coupling between components through use of Abstract Interfaces
  - basic functionality is covered by individual C++ class libraries
  - emphasis on usability and maintainability
- Major criteria are flexibility, extensibility and interoperability
  - Recent example: GEANT-4 examples (based on AIDA)
- Lizard is an Interactive Data Analysis Tool based on Anaphe components and the Python scripting language (through SWIG)
  - Lizard is young but has a very solid base in mature Anaphe libraries
  - real plug-in structure
- Software quality control is important
  - tools help to optimize dependencies / minimize maintenance effort

More information
- cern.ch/Anaphe
- cern.ch/Anaphe/Lizard
- aida.freehep.org/
- cern.ch/DB
- wwwinfo.cern.ch/asd/lhc++/clhep/

Additional slides

Analysis of Geant4
- Fairly large C++ project
  - Very fine-grained (and multi-level) package structuring
- Seems quite clean from the preliminary analysis
  - Fine package subdivision helps in many ways but makes analysis and code understanding more complicated
  - One subsystem seems strongly coupled and needs attention
  - Need to study the use of the internal command system
- Thanks to Lassi Tuura (CMS)

Analysis of ROOT
- ROOT developers have done a formidable job of breaking binary (shared library) dependencies, but…
- For example: by static analysis, nothing seems to use the postscript package directly (no incoming dependencies), but there is this code:

    void TPad::Print (const char *filename, Option_t *option)
    {
      […]
      TVirtualPS *psave = gVirtualPS;
      if (gROOT->LoadClass("TPostScript","Postscript")) return;
      gROOT->ProcessLineFast("new TPostScript()");
      gVirtualPS->Open(psname,pstype);
      gVirtualPS->SetBit(kPrintingPS);
      […]
    }

- Taking these and global objects into account makes the dependency diagrams very different
  - Sign of fast growth? Need a “next evolutionary step”?
  - So “coherent” that replacing parts could get painful…
- Thanks to Lassi Tuura (CMS)

Analysis of ROOT… (figure): Binary only; Binary + Source + Logical = Real. Thanks to Lassi Tuura (CMS)

Metrics: NCCD vs ACD (figure; Toolkits & Frameworks shown: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT; thanks to Lassi Tuura, CMS)

Metrics: NCCD vs Size (figure; Toolkits & Frameworks shown: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT; thanks to Lassi Tuura, CMS)

Metrics: NCCD vs AID (figure; Toolkits & Frameworks shown: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT; thanks to Lassi Tuura, CMS)

Metrics: Packages vs Size (figure; Toolkits & Frameworks shown: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT; thanks to Lassi Tuura, CMS)

Example script (ntuple)

    # get list of names of all tuples from tuplemanager
    ntm.listTuples()
    nt1 = ntm.findNtuple("Charm1")  # retrieve tuple by name
    # create 1D histos to project into
    h1 = hm.create1D(10, "mass", 100, 0., 5000.)
    h2 = hm.create1D(20, "mass for pt1>10", 100, 0., 5000.)
    # project the attribute "MASS" into histo h1 without cut ("")
    nt1.project1D(h1, "", "MASS")
    # project the attribute "MASS" into histo h2 with cut ("PT1>10")
    nt1.project1D(h2, "PT1>10", "MASS")