Presentation is loading. Please wait.

Presentation is loading. Please wait.

Aug 2000 Andreas Pfeiffer, CERN/IT, 1 Lizard A Flexible and Modular Data Analysis Tool using Abstract Types Andreas Pfeiffer CERN.

Similar presentations


Presentation on theme: "Aug 2000 Andreas Pfeiffer, CERN/IT, 1 Lizard A Flexible and Modular Data Analysis Tool using Abstract Types Andreas Pfeiffer CERN."— Presentation transcript:

1 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 1 Lizard A Flexible and Modular Data Analysis Tool using Abstract Types Andreas Pfeiffer CERN IT/API andreas.pfeiffer@cern.ch

2 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 2 Outline zIntroduction and Motivation zArchitecture overview and design criteria zBasic Types and their Functionality zPresent status and planning zSummary

3 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 3 Introduction and Motivation zPart of Anaphe/LHC++ project yaim: full, modular replacement of CERNLIB (lib and tools) yconcentrated on using (industrial and de facto) standards (STL, OpenGL,...) and commercial components (e.g., NAG_C, OpenInventor, IrisExplorer) ybasic libraries (for Tags, Histos, Fitting, …) available since few years z1998 First iteration on physics data analysis tool in LHC++ context ydata driven approach (based on IRIS Explorer) yGUI based, not command line driven

4 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 4 Introduction and Motivation (II) zRequest to create new physics analysis tool (September 99) ynew requirements defined together with experiments yidentified categories/components and Abstract Types zPresentation at HepVis’99 workshop ytriggered creation of working group (AIDA) ytogether with developers of other tools (Iguana, JAS, OpenScientist) and other interested people yaiming at interoperability

5 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 5 Interactive Data Analysis zAim: “OO replacement for PAW” yanalysis of “ntuple-like data” (“Tags”, “Ntuples”, …) yvisualisation of data (Histograms, scatter-plot, “Vectors”) yfitting of histograms (and other data) zMaximize flexibility/interoperability zForesee customization/integration zPlan for extensions y“code for now, design for the future”

6 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 6 Abstract Types zDefine Abstract Types for each component yAbstract Type : only pure virtual methods, inheritance from other Abstract Types only ycomponents use other components only through their Abstract Types zUse “AbstractFactory” pattern for creation of Abstract Types yconcrete implementation of Factory and Types are in dynamically loaded shared library (“plugin”) y“Manager” class to load (and control) a specific implementation of a component

7 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 7 Architectural issues zAbstract Types yMaximize flexibility and re-use yallow each component to develop independently yre-use of existing packages to implement components reduces start-up time significantly u use “Adapter” pattern to wrap non-compliant implementations of a component zIdentify and use patterns - avoid anti-patterns ylearn from other people’s experiences/failures

8 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 8 Architectural issue: Components zIdentify components by functionality ynot by “historic use” zEmphasize separation of different aspects for each component yexample: Histogram u statistical entity (density distribution of a physics quantity) u view of a “collection of data points” (which can be a density distribution but also a detector efficiency curve) u command to manipulate/store/plot/fit/... y“User’s view” is different from “implementor’s view” u separate Abstract Types for each aspect

9 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 9 Categories and dependencies

10 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 10 Architectural issue: Scripting zTypical use of scripting is quite different from programming (reconstruction, analysis,...) yhistory “go back to where I was before” yrepetition - with “modifiable parameters” zScripting language is an interface to the UserInterface component ySWIG ( Simplified Wrapper Interface Generator ) allows flexibility to choose amongst several scripting languages u Python, Perl, (Java) … zPython selected for start ygood “OO compliance” and “look-and-feel”

11 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 11 Basic Types and their Functionality in Lizard (I) zVectorOfPoints - collection of DataPoints y“measured value” at n-dim space-point (with errors) u (x,eX-,eX+(,y,eY-,eY+,...), value, eVal-, eVal+) ybehaves like the content of a PAW-like histogram yshifting and scaling u full set of arithmetic operations (+-*/) in preparation yused by Fitter and Plotter ycan be created from Histogram ycan be read/written from/to ASCII file u XML format in preparation

12 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 12 Basic Types and their Functionality in Lizard (II) zHistogram - purely statistical entity yrepresentation of density distribution + summary y“operations” taken over by VectorOfPoints ywith “Annotation” to keep non-statistical information u units of axes, ID, cuts used in creation,... zNTuple - access to disk resident data ysimilar in functionality to PAW RWN ybased on GenericTags

13 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 13 Basic Types and their Functionality in Lizard (III) zFitter - uses VectorOfPoints ypresently in re-design phase zAnalyzer - access to experiment specific data and libraries yon-the-fly compilation and dynamic loading ycustomizable makefile to access experiment s/w ysimple interface yalso useable for complex fitting u see exercises

14 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 14 Basic Types and their Functionality in Lizard (IV) zPlotter - 2-D visualisation of VectorOfPoints ybased on Qt libraries (Troll Tech, Norway, www.troll.no ) yadded Qplotter package (HIGZ/HPLOT functionality) y3-D graphics possible by using OpenInventor/OpenGL through Qt extensions zController (Commander) - interface to the User Interface yits methods define (most of) the commands yuses the other categories to implement them ymethods/commands can be called (and extended) by scripting language and/or GUI

15 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 15 Scripting in Lizard zUsing public domain tools for scripting zSWIG to (semi-) automatically create connection to chosen scripting language zPython - OO scripting, no “strange $!%-variables” yother scripting languages possible zCan be enhanced and/or replaced by a GUI yPython-Qt package (from public domain) yscripting window within GUI application zSimple way to hide complex functionality by defining “functions” (“plot”, “fit”)

16 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 16 Scripting in Lizard (II) User Python Controller Shadow classes C++ interfaces C++ implementations Automatically generated by SWIG

17 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 17 Example script (simple fitting) # book and fill a histogram h1=hm.create1D(20, ”gauss-fit”,50, 0., 50.) for i in range(1.,50.): h1.fill(i,100.*exp(-(i-25.)**2/100.)+random.gauss(0,10)) # now fit the histogram with a Gaussian (and plot) # (this uses a pre-defined “python function” to do the work) fit(h1, ”G” )

18 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 18 Example script (fitting) # ok, ok … if you think that’s too simple, here’s the “real” fit :-) # prepare to fit the histogram h1 v=vm.from1D(h1) # create VoP from histo fit=Fitter()# create a new fitter: fit.setModel(“G”)# set the model fit.addParameter(“amp”,100)# define (and init) parameters fit.addParameter(“mean”,h1.mean())#... for these the order fit.addParameter(“rms”,h1.rms())#... is relevant fit.chiSquareFit(v)# perform fit on VoP vFit = fit.fittedVector()# retrieve fitted curve as VoP pl.plot(v,vFit) # plot the “histo” and the fit

19 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 19 Example script (ntuple) # get list of names of all ntuples from ntuplemanager ntm.listNtuples() nt1=ntm.findNtuple(“Charm1”)# retrieve tuple by name nt1.listAttributes() # print names and types of attributes # create 1D histos to project into h1=hm.create1D(10, “mass”,100,0.,5000.) h2=hm.create1D(20, “mass for pt1>20”,100,0.,5000.) # project the attribute ”MASS” into histo h1 without cut ("") nt1.project1D(h1, “MASS”, “”) # project”MASS” with cut (”sqrt(PX1*PX1+PY1*PY1)>20") # this will compile the cut (and the “selection function”) nt1.cproject1D(h2, “MASS”, “sqrt(PX1*PX1+PY1*PY1)>20”)

20 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 20 Example Analyzer void doIt(IHistoManager *hm, INtupleManager *ntm, IVectorManager *vm) { // Create a new histogram using the HistoManager instance Histogram1D *histo = hm->create1D(200,"from analyzer",100,0.,100.); //... and fill it with some double gaussian double w; for (int i=0; i nBins(); i++) { w = exp(-(i-50.)*(i-50.)/10.) + exp(-(i-20.)*(i-20.)/100.); histo->fill(i,w); }

21 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 21 Status and Plans zFirst prototype (limited functionality) available since CHEP-2000 yfeedback from users on Python scripting IF ynot based on Abstract Types ywrapping existing LHC++ libraries zRe-design started in April 2000 yAbstract interfaces for components zPresently in beta phase, release foreseen for Oct 2000 yrestricted functionality zFully functional version in April 2001 y“PAW-like” functionality (+ “Analyzer”) yreading of HBOOK ntuples

22 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 22 Possible Future Enhancements zAdding other scripting languages ye.g., Perl zCommunication with Java tools/packages yJAS yWIRED zDistributed analysis in the Grid y“farm-out” ntuple analysis and “harvest” the histogram yprepare environment for user-analysis (“Analyzer”)

23 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 23 Summary zThe architecture of Lizard shows some important items for a flexible and modular data analysis tool: yuse of Abstract Interfaces for the components yweak coupling between components zThe present work is based on Anaphe/LHC++ components and Python scripting language (through SWIG) ymaximizes re-use zMajor criteria are flexibility, extensibility and interoperability with other tools/packages

24 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 24 Further information zLizard yhttp://wwwinfo.cern.ch/asd/lhc++/Lizard/index.html zPython yhttp://www.python.org z“Scripting … “ by J. Ousterhout (Tcl author) yhttp://www.scriptics.com/people/john.ousterhout/scripting.html zSWIG yhttp://www.swig.org/

25 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 25 Patterns identified so far zAbstractFactory yfor object creation zStrategy yFactory is strategy for manager zFacade (controller) ypromotes weak coupling of classes zVisitor yextend functionality of base classes

26 Aug 2000 Andreas Pfeiffer, CERN/IT, andreas.pfeiffer@cern.ch 26 How does it look like? z#Find the ntuple from database nt1=ntm.findNtuple("Charm1") z# Create an histogram h=hm.create1D("pt1",40,10,50) z# Project pT of the first particle on the histogram nt1.cproject1D(h,"sqrt(PX1*PX1+PY1*PY1)",”pz1 >0") z# Fit the projection with a exponential and plot it fit(h,”E")


Download ppt "Aug 2000 Andreas Pfeiffer, CERN/IT, 1 Lizard A Flexible and Modular Data Analysis Tool using Abstract Types Andreas Pfeiffer CERN."

Similar presentations


Ads by Google