Acat October, Rene Brun
Future of Analysis Environments
Personal views
Rene Brun, CERN
Data Analysis Packages
  - Type of data? Any type? PAW-like ntuple?
  - Restricted to histogramming & visualisation?
  - Structure? What is modularity? Abstract interfaces?
  - Languages?
  - Parallelism?
  - No restrictions: a coherent framework of cooperating systems (I/O + UI, object bus)
Type of data in the past
  - Event data managed by data structure (bank) managers (zebra, bos, ..); a bank is like an object
  - Final physics data in ntuple format (paw); an ntuple is like a table in an RDBMS
  - Run/File catalog with ad hoc tools (fatmen)
  - Calibrations, geometry, etc. with ad hoc tools (hepdb)
Type of data: trends-1
  - Put everything in an object database like Objectivity
  - Choice of the RD45 project
  - Many experiments initially followed this line
  - Abandoned by most experiments recently
  - Interesting experience with Babar
  - Solution not suited for PAW-like analysis
Type of data: trends-2
  - Put write-once data in an object store like ROOT in Streamer mode
  - Use an RDBMS for:
      Run/Event catalogs
      Geometry, calibrations
    e.g. with the ROOT Oracle interface or with the ROOT Objectivity interface
  - Use ROOT split/no-split mode for physics analysis
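As a rough illustration of why the split/no-split choice matters for physics analysis, the sketch below contrasts keeping whole events together (no-split) with keeping each data member in its own column (split), in the spirit of a split ROOT Tree placing members in separate branches. The `Event` struct, the `NoSplitStore` and `SplitStore` types and the function names are hypothetical; this is a plain C++ model under those assumptions, not ROOT I/O.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event type, for illustration only (not a ROOT class).
struct Event {
    float px;
    float py;
    float pz;
};

// "No-split" model: each entry is a whole object, so reading one member
// still means walking over complete events.
using NoSplitStore = std::vector<Event>;

// "Split" model: one column per data member, so a query on pz touches
// only the pz column.
struct SplitStore {
    std::vector<float> px;
    std::vector<float> py;
    std::vector<float> pz;
};

// Convert row-wise events into column-wise storage.
SplitStore Split(const NoSplitStore& events) {
    SplitStore s;
    s.px.reserve(events.size());
    s.py.reserve(events.size());
    s.pz.reserve(events.size());
    for (const Event& e : events) {
        s.px.push_back(e.px);
        s.py.push_back(e.py);
        s.pz.push_back(e.pz);
    }
    assert(s.pz.size() == events.size());
    return s;
}

// Sum of pz reading only the pz column.
double SumPz(const SplitStore& s) {
    double sum = 0.0;
    for (float v : s.pz) sum += v;
    return sum;
}

// Same quantity from row-wise storage: every Event is visited in full.
double SumPz(const NoSplitStore& events) {
    double sum = 0.0;
    for (const Event& e : events) sum += e.pz;
    return sum;
}
```

Both overloads of `SumPz` return the same value; the difference is only in how much stored data has to be traversed to obtain it.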
Framework basic requirements
  - Dynamic linking AND unlinking of user shared libs
  - User can define new classes interactively
  - Interpreted code can call compiled code
  - Compiled code can call interpreted code
  - Scripts can be dynamically compiled/linked
  - This is the normal operation mode
  - Interesting feature for GUIs & event displays
  - Script compiler: Root > .x file.C++
Fundamental features of an Object-Oriented Framework
  [Diagram labels: Functions, Data, DDL, KUIP, CDF; Data, Functions, RTTI, Persistency services, User Interface; Procedural World, OO World; ROOT, C++, Java]
Automatic code generation
  - Hand-written code: algorithms
  - Automatically generated code (40 per cent in ROOT): meta information, used by I/O, GUI, inspectors, browsers, interpreter, html, etc.
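To make the role of meta information concrete, here is a minimal sketch in which a small table describing a class's data members drives a generic inspector; the same table could in principle serve I/O or a browser. In this sketch the table is written by hand, whereas the slide refers to code that is generated automatically. `Track`, `MemberInfo`, `TrackDictionary` and `Inspect` are hypothetical names, not ROOT APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical user class, for illustration only.
struct Track {
    double px;
    double py;
    int charge;
};

enum class MemberType { Double, Int };

// One record of meta information: name, type and byte offset of a member.
struct MemberInfo {
    std::string name;
    MemberType type;
    std::size_t offset;
};

// Hand-written stand-in for an automatically generated dictionary.
std::vector<MemberInfo> TrackDictionary() {
    return {
        {"px", MemberType::Double, offsetof(Track, px)},
        {"py", MemberType::Double, offsetof(Track, py)},
        {"charge", MemberType::Int, offsetof(Track, charge)},
    };
}

// Generic inspector: it knows nothing about Track, only the dictionary.
std::string Inspect(const void* object, const std::vector<MemberInfo>& dict) {
    assert(object != nullptr);
    const char* base = static_cast<const char*>(object);
    std::ostringstream out;
    for (const MemberInfo& m : dict) {
        out << m.name << '=';
        switch (m.type) {
        case MemberType::Double: {
            double v;
            std::memcpy(&v, base + m.offset, sizeof v);
            out << v;
            break;
        }
        case MemberType::Int: {
            int v;
            std::memcpy(&v, base + m.offset, sizeof v);
            out << v;
            break;
        }
        }
        out << '\n';
    }
    return out.str();
}
```

Calling `Inspect(&t, TrackDictionary())` on a `Track` yields one `name=value` line per member; a writer or a browser could walk the same table instead of being hand-coded per class.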
Java - ROOT interface(s)
  - Read ROOT files from a Java program
      see Tony Johnson
      will be simpler with the new ROOT 2.26 supporting automatic schema evolution
  - Call ROOT classes from a Java program
      work by Subir Sarkar (hand-coded JNI interface)
      could use JACO (see Tony Johnson)
      or better use a variant of rootcint (rootjava)
  - Generate ROOT-Java data classes
      TTree::MakeJava like TTree::MakeClass
Java - ROOT interface(s)
    import root.*;
    TROOT troot = new TROOT("simple", "Simple Java to root interface");
    TApplication app = new TApplication("ROOT Application");
    System.out.println("TApplication.....");
    TBenchmark bench = new TBenchmark();
    bench.Start("Hsum");
    TRandom random = new TRandom();
    TH1F total = new TH1F("total","total distribution",100,-4.0F,4.0F);
    TH1F main = new TH1F("main","Main contributor",100,-4.0F,4.0F);
    TH1F s1 = new TH1F("s1","first signal",100,-4.0F,4.0F);
    TH1F s2 = new TH1F("s2","second signal",100,-4.0F,4.0F);
    total.Sumw2(); // this makes sure that the sum of squares of weights will be stored
    total.SetMarkerStyle(21);
    total.SetMarkerSize(0.7F);
    main.SetFillColor(16);
    s1.SetFillColor(42);
    s2.SetFillColor(46);
    TCanvas canvas = new TCanvas("c1","The HSUM example",200,10,600,400);
    canvas.SetGrid();
  and so on.
Java - ROOT interface(s)
  - It is important to cooperate to facilitate the Java/C++ integration
  - Could be interesting for applications where performance is not an issue (event display)
  - However, I do not believe in a solution where the bulk of data is stored as C++ objects and analyzed with a Java-based system. It must be fun, but it is very inefficient: what do you gain?
Languages for data analysis
  - Data analysis requires efficient access to objects (both data and functions).
  - It requires a powerful programming language:
      in interpreted mode
      in compiled mode
  - The transition from interpreted mode to compiled mode must be smooth and transparent.
  - A scripting language is not the solution
  - Python is not a solution
[Diagram labels: GUI, Commands, Interpreted scripts, Compiled scripts]
A role for commercial components?
  - Databases: Oracle very likely, others NO
  - Graphics/UI: NO, but YES for interfaces to commercial systems
  - Special algorithms like fitting: strong doubts
  - I strongly believe in the advantages of Open Source systems
  - Large news/discussion groups
Our current work
  - Continuous consolidation of the system
  - Automatic schema evolution
  - Common GUI between Unix and Windows
  - Upgrade UI to new style GUI
  - Tree query processor reimplemented using the new TSelector facility
  - PROOF (Parallel ROOT Facility) (see next)
  - Interface with other systems, e.g. G3, G4
  - Support thousands of users
The OODBMS dreams
  [Diagram labels: Selection Parameters, Federation, DB1, DB2, DB3, DB4, DB5, DB6, CPU, Local, Remote, OODB]
ROOT/PROOF and GRIDs
  [Diagram labels: Selection Parameters, Procedure Proc.C, PROOF, CPU, TagDB, RDB, DB1, DB2, DB3, DB4, DB5, DB6, Local, Remote]
What is a modular system?
  Modularity is a nice word. Everybody claims to be modular.
  - a system with many small and independent modules?
      where is the object bus?
      what is the cost of assembling all the pieces in a real application?
  - a hierarchical system with easily replaceable components?
      but with many internal dependencies
What is a modular system?
  - a system with well defined interfaces?
      where is the object bus?
      passing data by reference or value?
      Collections/Folders?
  - a system easy to understand (user view)?
      end users like monolithic systems doing everything
  - a system easy to maintain (developer view)?
  - a system that can easily be integrated into other systems?
  - a theoretical system and no implementation?
  Modularity is difficult to achieve in a growing system.
Modularity and Dependencies in ROOT
  By dependency, we mean binary dependency: one module (shared library) forces the loading of another library. In the past this was a weak point of the system. For example, if you wanted to produce some histograms in a batch program, you were required to link your application with all ROOT graphics libraries up to X11, as with PAW. This problem was rightly pointed out by many users as something to be fixed, and we fixed it. In the current system only a small set of base libraries is needed when creating, e.g., histograms in batch mode.
  Besides the decoupling of the graphics system, many more abstract layers were introduced to decouple other parts of the system: the histogram from its painter, the tree storage system from its query mechanism (treeplayer), fitting from minuit, etc. Following this reorganization, none of the lower level libraries depends on higher level libraries any more. Besides modularity, these changes also improved overall system performance.
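The decoupling of a histogram from its painter can be sketched with an abstract interface: the histogram code refers only to the interface, so a batch program can fill histograms without any concrete painter. `Hist1D`, `HistPainter` and `AsciiPainter` are hypothetical names and far simpler than ROOT's own classes; no ROOT API is implied.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Abstract layer: the only graphics-related type the histogram knows about.
class HistPainter {
public:
    virtual ~HistPainter() = default;
    virtual std::string Paint(const std::vector<double>& bins) const = 0;
};

// Low-level module: filling works with no painter attached (batch mode).
class Hist1D {
public:
    Hist1D(std::size_t nbins, double lo, double hi)
        : bins_(nbins, 0.0), lo_(lo), hi_(hi) {
        assert(nbins > 0 && hi > lo);
    }

    void Fill(double x, double weight = 1.0) {
        if (x < lo_ || x >= hi_) return;  // under/overflow ignored in this sketch
        std::size_t i = static_cast<std::size_t>(
            (x - lo_) / (hi_ - lo_) * bins_.size());
        if (i >= bins_.size()) i = bins_.size() - 1;  // guard against rounding
        bins_[i] += weight;
    }

    void SetPainter(std::shared_ptr<const HistPainter> painter) {
        painter_ = std::move(painter);
    }

    // Returns an empty string when no painter has been attached.
    std::string Draw() const {
        return painter_ ? painter_->Paint(bins_) : std::string();
    }

    const std::vector<double>& Bins() const { return bins_; }

private:
    std::vector<double> bins_;
    double lo_;
    double hi_;
    std::shared_ptr<const HistPainter> painter_;
};

// Higher-level module: one concrete painter, supplied only when drawing.
class AsciiPainter : public HistPainter {
public:
    std::string Paint(const std::vector<double>& bins) const override {
        std::string out;
        for (double content : bins) {
            const std::size_t stars =
                content > 0.0 ? static_cast<std::size_t>(content) : 0;
            out.append(stars, '*');  // one '*' per unit of bin content
            out.push_back('\n');
        }
        return out;
    }
};
```

In a real layout `AsciiPainter` would live in a separate library that depends on the histogram library and not the other way round; here the single file only models that direction of dependency.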
ROOT Quality assurance
A growing user base
Summary
  We are implementing a powerful system designed for large scale data analysis with parallel architectures in a GRID context.
  The ROOT system is a framework providing a coherent object bus in the DAQ, simulation, reconstruction and analysis phases.
  We have learnt a lot in the past 5 years, also building on our 10 years of experience with PAW.
  Developing the system and at the same time supporting a rapidly growing user base is a demanding but also rewarding job.