Lizard: A Flexible and Modular Data Analysis Tool using Abstract Types. Andreas Pfeiffer, CERN/IT, Aug 2000.


Slide 1: Lizard. A Flexible and Modular Data Analysis Tool using Abstract Types
Andreas Pfeiffer, CERN IT/API, Aug 2000

Slide 2: Outline
- Introduction and Motivation
- Architecture overview and design criteria
- Basic Types and their Functionality
- Present status and planning
- Summary

Slide 3: Introduction and Motivation
- Part of the Anaphe/LHC++ project
  - aim: full, modular replacement of CERNLIB (libraries and tools)
  - concentrated on using (industrial and de facto) standards (STL, OpenGL, ...) and commercial components (e.g., NAG_C, OpenInventor, IRIS Explorer)
  - basic libraries (for Tags, Histos, Fitting, ...) available for a few years
- 1998: first iteration on a physics data analysis tool in the LHC++ context
  - data-driven approach (based on IRIS Explorer)
  - GUI based, not command-line driven

Slide 4: Introduction and Motivation (II)
- Request to create a new physics analysis tool (September 1999)
  - new requirements defined together with the experiments
  - identified categories/components and Abstract Types
- Presentation at the HepVis'99 workshop
  - triggered creation of a working group (AIDA)
  - together with developers of other tools (Iguana, JAS, OpenScientist) and other interested people
  - aiming at interoperability

Slide 5: Interactive Data Analysis
- Aim: "OO replacement for PAW"
  - analysis of "ntuple-like data" ("Tags", "Ntuples", ...)
  - visualisation of data (Histograms, scatter plots, "Vectors")
  - fitting of histograms (and other data)
- Maximize flexibility/interoperability
- Foresee customization/integration
- Plan for extensions
  - "code for now, design for the future"

Slide 6: Abstract Types
- Define Abstract Types for each component
  - Abstract Type: only pure virtual methods, inheritance from other Abstract Types only
  - components use other components only through their Abstract Types
- Use the "AbstractFactory" pattern for creation of Abstract Types
  - concrete implementations of Factory and Types are in a dynamically loaded shared library ("plugin")
  - "Manager" class to load (and control) a specific implementation of a component
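The relation between Abstract Types, factory and manager can be sketched roughly as follows. This is a minimal Python illustration, not Lizard's C++ code: the names IHistogram1D, IHistoFactory, SimpleHistogram1D, SimpleHistoFactory and the plugin name "simple" are hypothetical, and a dictionary stands in for the dynamic loading of a shared library.

```python
from abc import ABC, abstractmethod


class IHistogram1D(ABC):
    """Abstract Type (hypothetical name): declares behaviour only."""

    @abstractmethod
    def fill(self, x, weight=1.0): ...

    @abstractmethod
    def entries(self): ...


class IHistoFactory(ABC):
    """Abstract factory: the only way client code obtains histograms."""

    @abstractmethod
    def create1D(self, title, nbins, lo, hi): ...


class SimpleHistogram1D(IHistogram1D):
    """One concrete implementation; client code never names this class."""

    def __init__(self, title, nbins, lo, hi):
        self.title, self.nbins, self.lo, self.hi = title, nbins, lo, hi
        self.bins = [0.0] * nbins
        self._entries = 0

    def fill(self, x, weight=1.0):
        # Assumption of this sketch: values outside the axis are ignored.
        if self.lo <= x < self.hi:
            index = int((x - self.lo) / (self.hi - self.lo) * self.nbins)
            self.bins[min(index, self.nbins - 1)] += weight
            self._entries += 1

    def entries(self):
        return self._entries


class SimpleHistoFactory(IHistoFactory):
    def create1D(self, title, nbins, lo, hi):
        return SimpleHistogram1D(title, nbins, lo, hi)


class HistoManager:
    """Stand-in for the "Manager": selects one implementation by name.

    A dictionary replaces the loading of a shared library ("plugin").
    """

    _plugins = {"simple": SimpleHistoFactory}

    def __init__(self, plugin_name):
        self._factory = self._plugins[plugin_name]()

    def create1D(self, title, nbins, lo, hi):
        return self._factory.create1D(title, nbins, lo, hi)


hm = HistoManager("simple")
h = hm.create1D("demo", 10, 0.0, 10.0)
h.fill(2.5)
h.fill(42.0)  # outside the axis range, ignored by this implementation
```

Client code only ever sees HistoManager and the abstract IHistogram1D, so a different implementation could be swapped in by registering another factory under a new name.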

Slide 7: Architectural issues
- Abstract Types
  - maximize flexibility and re-use
  - allow each component to develop independently
  - re-use of existing packages to implement components reduces start-up time significantly
    - use the "Adapter" pattern to wrap non-compliant implementations of a component
- Identify and use patterns, avoid anti-patterns
  - learn from other people's experiences/failures
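A small Python sketch of the "Adapter" idea, under stated assumptions: IFitter, LegacyAverager and LegacyFitterAdapter are invented for illustration and are not taken from Lizard or Anaphe. The adapter lets an existing class with a different calling convention satisfy the Abstract Type that the rest of the tool expects.

```python
from abc import ABC, abstractmethod


class IFitter(ABC):
    """Hypothetical Abstract Type expected by the other components."""

    @abstractmethod
    def fit_mean(self, values): ...


class LegacyAverager:
    """Stand-in for an existing package with a non-compliant interface."""

    def compute(self, total, count):
        return total / count


class LegacyFitterAdapter(IFitter):
    """Wraps the legacy object and translates calls to its interface."""

    def __init__(self, legacy):
        self._legacy = legacy

    def fit_mean(self, values):
        values = list(values)
        return self._legacy.compute(sum(values), len(values))


fitter = LegacyFitterAdapter(LegacyAverager())
result = fitter.fit_mean([1.0, 2.0, 3.0])
```

The legacy class is left untouched, which is what makes re-use of existing packages cheap at start-up.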

Slide 8: Architectural issue: Components
- Identify components by functionality
  - not by "historic use"
- Emphasize separation of different aspects for each component
  - example: Histogram
    - statistical entity (density distribution of a physics quantity)
    - view of a "collection of data points" (which can be a density distribution but also a detector efficiency curve)
    - command to manipulate/store/plot/fit/...
  - "User's view" is different from "implementor's view"
    - separate Abstract Types for each aspect

Slide 9: Categories and dependencies
(diagram of the component categories and their dependencies; not reproduced in this transcript)

Slide 10: Architectural issue: Scripting
- Typical use of scripting is quite different from programming (reconstruction, analysis, ...)
  - history: "go back to where I was before"
  - repetition, with "modifiable parameters"
- Scripting language is an interface to the UserInterface component
  - SWIG (Simplified Wrapper and Interface Generator) allows flexibility to choose amongst several scripting languages
    - Python, Perl, (Java), ...
- Python selected for start
  - good "OO compliance" and "look-and-feel"

Slide 11: Basic Types and their Functionality in Lizard (I)
- VectorOfPoints: collection of DataPoints
  - "measured value" at an n-dim space-point (with errors)
    - (x, eX-, eX+ (, y, eY-, eY+, ...), value, eVal-, eVal+)
  - behaves like the content of a PAW-like histogram
  - shifting and scaling
    - full set of arithmetic operations (+ - * /) in preparation
  - used by Fitter and Plotter
  - can be created from a Histogram
  - can be read/written from/to an ASCII file
    - XML format in preparation
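One possible reading of the DataPoint layout above, as a Python sketch. The class and method names (DataPoint, shifted, scaled) are illustrative, and it is an assumption of this sketch that shifting and scaling act on the measured value rather than on the coordinates.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataPoint:
    # One (x, eX-, eX+) triple per dimension of the space-point.
    coords: Tuple[Tuple[float, float, float], ...]
    value: float
    err_minus: float
    err_plus: float


class VectorOfPoints:
    """Hypothetical container; operations return new vectors."""

    def __init__(self, points):
        self.points = list(points)

    def shifted(self, offset):
        # Adding a constant moves the value and leaves the errors alone.
        return VectorOfPoints(
            DataPoint(p.coords, p.value + offset, p.err_minus, p.err_plus)
            for p in self.points
        )

    def scaled(self, factor):
        # Errors scale with |factor|; a negative factor mirrors the point,
        # so the lower and upper errors swap roles.
        size = abs(factor)
        result = []
        for p in self.points:
            low, high = p.err_minus, p.err_plus
            if factor < 0:
                low, high = high, low
            result.append(
                DataPoint(p.coords, p.value * factor, low * size, high * size)
            )
        return VectorOfPoints(result)


v = VectorOfPoints([DataPoint(((1.0, 0.5, 0.5),), 10.0, 1.0, 2.0)])
up = v.shifted(3.0)
flipped = v.scaled(-2.0)
```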

Slide 12: Basic Types and their Functionality in Lizard (II)
- Histogram: purely statistical entity
  - representation of density distribution + summary
  - "operations" taken over by VectorOfPoints
  - with "Annotation" to keep non-statistical information
    - units of axes, ID, cuts used in creation, ...
- NTuple: access to disk-resident data
  - similar in functionality to PAW RWN
  - based on GenericTags
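The split between statistical content and "Annotation" might look like this in a Python sketch. The class below is a hypothetical stand-in, not the Lizard Histogram; in particular, dropping out-of-range entries from the summary is an assumption of this example.

```python
import math


class Histogram1D:
    """Hypothetical stand-in: bin contents plus running summary statistics."""

    def __init__(self, title, nbins, lo, hi):
        self.nbins, self.lo, self.hi = nbins, lo, hi
        self.bins = [0.0] * nbins
        # Non-statistical information is kept apart from the statistics.
        self.annotation = {"title": title}
        self._sum_w = self._sum_wx = self._sum_wx2 = 0.0

    def fill(self, x, weight=1.0):
        if not (self.lo <= x < self.hi):
            return  # assumption: out-of-range entries are simply dropped
        index = int((x - self.lo) / (self.hi - self.lo) * self.nbins)
        self.bins[min(index, self.nbins - 1)] += weight
        self._sum_w += weight
        self._sum_wx += weight * x
        self._sum_wx2 += weight * x * x

    def mean(self):
        return self._sum_wx / self._sum_w

    def rms(self):
        variance = self._sum_wx2 / self._sum_w - self.mean() ** 2
        return math.sqrt(max(variance, 0.0))


h = Histogram1D("mass", 4, 0.0, 4.0)
h.annotation["x-axis unit"] = "GeV"
h.fill(1.0)
h.fill(3.0)
```

Note that mean() and rms() are only defined once at least one in-range entry has been filled.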

Slide 13: Basic Types and their Functionality in Lizard (III)
- Fitter: uses VectorOfPoints
  - presently in re-design phase
- Analyzer: access to experiment-specific data and libraries
  - on-the-fly compilation and dynamic loading
  - customizable makefile to access experiment software
  - simple interface
  - also usable for complex fitting
    - see exercises

Slide 14: Basic Types and their Functionality in Lizard (IV)
- Plotter: 2-D visualisation of VectorOfPoints
  - based on Qt libraries (Troll Tech, Norway)
  - added Qplotter package (HIGZ/HPLOT functionality)
  - 3-D graphics possible by using OpenInventor/OpenGL through Qt extensions
- Controller (Commander): interface to the User Interface
  - its methods define (most of) the commands
  - uses the other categories to implement them
  - methods/commands can be called (and extended) by scripting language and/or GUI
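The Controller role described above can be illustrated with a short Python sketch. HistoStore, TextPlotter and the command names are hypothetical stand-ins for the real categories; the point is only that commands are controller methods, that they delegate to other components, and that a script or GUI can call or extend them by name.

```python
class HistoStore:
    """Stand-in for a histogram category: named lists of values."""

    def __init__(self):
        self.histos = {}

    def create(self, name):
        self.histos[name] = []


class TextPlotter:
    """Stand-in for the plotter category: renders values as text."""

    def render(self, values):
        return " ".join(f"{value:g}" for value in values)


class Controller:
    """Facade: its methods are the commands seen by script and GUI."""

    def __init__(self, store, plotter):
        self._store, self._plotter = store, plotter
        self._commands = {"book": self.book, "fill": self.fill, "plot": self.plot}

    def book(self, name):
        self._store.create(name)

    def fill(self, name, value):
        self._store.histos[name].append(value)

    def plot(self, name):
        return self._plotter.render(self._store.histos[name])

    def register(self, command_name, function):
        # Lets the scripting layer extend the command set.
        self._commands[command_name] = function

    def execute(self, command_name, *args):
        return self._commands[command_name](*args)


controller = Controller(HistoStore(), TextPlotter())
controller.execute("book", "h")
controller.execute("fill", "h", 1.5)
# A user-defined command built from the existing ones.
controller.register("fill_twice", lambda name, value: [controller.fill(name, value) for _ in range(2)])
controller.execute("fill_twice", "h", 2)
output = controller.execute("plot", "h")
```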

Slide 15: Scripting in Lizard
- Using public domain tools for scripting
- SWIG to (semi-)automatically create the connection to the chosen scripting language
- Python: OO scripting, no "strange $!%-variables"
  - other scripting languages possible
- Can be enhanced and/or replaced by a GUI
  - Python-Qt package (from public domain)
  - scripting window within GUI application
- Simple way to hide complex functionality by defining "functions" ("plot", "fit")

Slide 16: Scripting in Lizard (II)
(diagram; labels in order: User, Python, Controller, Shadow classes, C++ interfaces, C++ implementations; annotation: "Automatically generated by SWIG")

Slide 17: Example script (simple fitting)
    # hm (histogram manager) and fit() are provided by the Lizard session
    from math import exp
    import random

    # book and fill a histogram
    h1 = hm.create1D(20, "gauss-fit", 50, 0., 50.)
    for i in range(1, 50):
        h1.fill(i, 100.*exp(-(i-25.)**2/100.) + random.gauss(0, 10))

    # now fit the histogram with a Gaussian (and plot)
    # (this uses a pre-defined "python function" to do the work)
    fit(h1, "G")

Slide 18: Example script (fitting)
    # ok, ok ... if you think that's too simple, here's the "real" fit :-)
    # prepare to fit the histogram h1
    v = vm.from1D(h1)                     # create VoP from histo
    fit = Fitter()                        # create a new fitter
    fit.setModel("G")                     # set the model
    fit.addParameter("amp", 100)          # define (and init) parameters
    fit.addParameter("mean", h1.mean())   # ... for these the order
    fit.addParameter("rms", h1.rms())     # ... is relevant
    fit.chiSquareFit(v)                   # perform fit on VoP
    vFit = fit.fittedVector()             # retrieve fitted curve as VoP
    pl.plot(v, vFit)                      # plot the "histo" and the fit
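For readers unfamiliar with chi-square fitting, the quantity such a fit minimises can be written down in a few lines of plain Python. This is the textbook definition, not Lizard's Fitter; the function names, the (x, value, error) layout and the Gaussian parametrisation with a width called sigma are assumptions of this sketch.

```python
import math


def gaussian(x, amp, mean, sigma):
    """Gaussian model with three parameters, in a fixed order."""
    return amp * math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def chi_square(points, model, params):
    """Sum of squared, error-weighted residuals.

    points: iterable of (x, value, error) triples.
    """
    return sum(
        ((value - model(x, *params)) / error) ** 2
        for x, value, error in points
    )


# Noise-free points generated from known parameters, each with error 10.
points = [(float(x), gaussian(float(x), 100.0, 25.0, 7.0), 10.0) for x in range(1, 50)]

chi2_true = chi_square(points, gaussian, (100.0, 25.0, 7.0))
chi2_off = chi_square(points, gaussian, (100.0, 30.0, 7.0))
```

A fitter varies the parameters until this sum is as small as possible; here the generating parameters reproduce the points exactly, while a shifted mean gives a larger value.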

Slide 19: Example script (ntuple)
    # get list of names of all ntuples from the ntuple manager
    ntm.listNtuples()
    nt1 = ntm.findNtuple("Charm1")   # retrieve tuple by name
    nt1.listAttributes()             # print names and types of attributes

    # create 1D histos to project into
    h1 = hm.create1D(10, "mass", 100, 0., 5000.)
    h2 = hm.create1D(20, "mass for pt1>20", 100, 0., 5000.)

    # project the attribute "MASS" into histo h1 without cut ("")
    nt1.project1D(h1, "MASS", "")

    # project "MASS" with cut ("sqrt(PX1*PX1+PY1*PY1)>20")
    # this will compile the cut (and the "selection function")
    nt1.cproject1D(h2, "MASS", "sqrt(PX1*PX1+PY1*PY1)>20")
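What a projection with a cut does can be shown without Lizard in plain Python. The rows, the project helper and the cut function below are invented for illustration; the real tool compiles the cut string, whereas this sketch simply passes a Python function.

```python
import math

# Three made-up ntuple rows using the attribute names from the script.
rows = [
    {"MASS": 1200.0, "PX1": 30.0, "PY1": 5.0},
    {"MASS": 1900.0, "PX1": 3.0, "PY1": 4.0},
    {"MASS": 2600.0, "PX1": 12.0, "PY1": 16.0},
]


def project(rows, attribute, cut=None):
    """Collect one attribute from every row that passes the cut."""
    return [row[attribute] for row in rows if cut is None or cut(row)]


def pt1_above_20(row):
    """Selection function equivalent to sqrt(PX1*PX1+PY1*PY1)>20."""
    return math.sqrt(row["PX1"] * row["PX1"] + row["PY1"] * row["PY1"]) > 20


all_masses = project(rows, "MASS")
selected = project(rows, "MASS", pt1_above_20)
```

The third row has a transverse momentum of exactly 20 and is therefore rejected by the strict inequality.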

Slide 20: Example Analyzer
    #include <cmath>   // for exp

    void doIt(IHistoManager *hm, INtupleManager *ntm, IVectorManager *vm) {
      // Create a new histogram using the HistoManager instance
      Histogram1D *histo = hm->create1D(200, "from analyzer", 100, 0., 100.);
      // ... and fill it with some double gaussian
      double w;
      for (int i=0; i<histo->nBins(); i++) {
        w = exp(-(i-50.)*(i-50.)/10.) + exp(-(i-20.)*(i-20.)/100.);
        histo->fill(i, w);
      }
    }

Slide 21: Status and Plans
- First prototype (limited functionality) available since CHEP 2000
  - feedback from users on the Python scripting interface
  - not based on Abstract Types
  - wrapping existing LHC++ libraries
- Re-design started in April 2000
  - Abstract interfaces for components
- Presently in beta phase, release foreseen for October 2000
  - restricted functionality
- Fully functional version in April 2001
  - "PAW-like" functionality (+ "Analyzer")
  - reading of HBOOK ntuples

Slide 22: Possible Future Enhancements
- Adding other scripting languages
  - e.g., Perl
- Communication with Java tools/packages
  - JAS
  - WIRED
- Distributed analysis in the Grid
  - "farm-out" ntuple analysis and "harvest" the histogram
  - prepare environment for user analysis ("Analyzer")
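The "farm-out" and "harvest" idea rests on the fact that histograms filled from disjoint parts of an ntuple can be added bin by bin. A minimal Python sketch, with invented helper names and with the workers simulated by a sequential loop rather than real Grid jobs:

```python
def fill_histogram(values, nbins, lo, hi):
    """Fill a fixed-binning histogram; out-of-range values are ignored."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for value in values:
        if lo <= value < hi:
            counts[min(int((value - lo) / width), nbins - 1)] += 1
    return counts


def harvest(partials):
    """Merge partial histograms with identical binning, bin by bin."""
    return [sum(column) for column in zip(*partials)]


# Each chunk stands for the part of the ntuple handled by one worker.
chunks = [[0.5, 1.5, 1.7], [2.2, 3.9], [1.1, 9.0]]
partials = [fill_histogram(chunk, 4, 0.0, 4.0) for chunk in chunks]
merged = harvest(partials)
```

The merged result equals what a single pass over all values would give, which is what makes the split safe.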

Slide 23: Summary
- The architecture of Lizard shows some important items for a flexible and modular data analysis tool:
  - use of Abstract Interfaces for the components
  - weak coupling between components
- The present work is based on Anaphe/LHC++ components and the Python scripting language (through SWIG)
  - maximizes re-use
- Major criteria are flexibility, extensibility and interoperability with other tools/packages

Slide 24: Further information
- Lizard
  - http://wwwinfo.cern.ch/asd/lhc++/Lizard/index.html
- Python
  - http://
- "Scripting ..." by J. Ousterhout (Tcl author)
  - http://
- SWIG
  - http://

Slide 25: Patterns identified so far
- AbstractFactory
  - for object creation
- Strategy
  - Factory is strategy for manager
- Facade (controller)
  - promotes weak coupling of classes
- Visitor
  - extend functionality of base classes
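Of the patterns listed, Visitor is perhaps the least self-explanatory. A generic Python sketch follows; the classes are invented for illustration and do not reflect how Lizard applies the pattern. New functionality (here, counting entries) is added in a visitor class without modifying the visited classes.

```python
class Histogram:
    def __init__(self, counts):
        self.counts = list(counts)

    def accept(self, visitor):
        return visitor.visit_histogram(self)


class Ntuple:
    def __init__(self, rows):
        self.rows = list(rows)

    def accept(self, visitor):
        return visitor.visit_ntuple(self)


class EntryCounter:
    """A visitor: one method per visited class, added from the outside."""

    def visit_histogram(self, histogram):
        return sum(histogram.counts)

    def visit_ntuple(self, ntuple):
        return len(ntuple.rows)


objects = [Histogram([1, 2, 3]), Ntuple([{"MASS": 1.0}, {"MASS": 2.0}])]
totals = [obj.accept(EntryCounter()) for obj in objects]
```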

Slide 26: What does it look like?
    # Find the ntuple from database
    nt1 = ntm.findNtuple("Charm1")
    # Create a histogram
    h = hm.create1D("pt1", 40, 10, 50)
    # Project pT of the first particle on the histogram
    nt1.cproject1D(h, "sqrt(PX1*PX1+PY1*PY1)", "pz1 >0")
    # Fit the projection with an exponential and plot it
    fit(h, "E")