Overview, Major Developments, Directions

ROOT Project Status
Major developments
Directions

DESY, 5 December 2005
René Brun, CERN

ROOT: a 10-year-old project
- Started in January 1995 in NA49
- First public presentation in November 95
- Cooperation with Masa Goto/CINT in April 96
- First implementation of ROOT with gAlice in December 97 (TGeant3) -> VMC
- FNAL chooses ROOT for I/O and data analysis
  - Announced at CHEP98 in Sept 98 in Chicago
  - RHIC experiments follow immediately
- Hoffmann computing Review in 2000/2001
- LCG project in 2002 (Blueprint RTAG)
- ROOT: LCG project in 2005

Applications Area Organization in LCG Phase I
[Organization chart. Boxes: Applications manager, Architects forum, Applications area meeting, Simulation project, PI project, SEAL project, POOL project, SPI project, ROOT. Link labels: decisions, strategy, consultation, User - provider.]

Applications Area Organization in LCG Phase II
[Organization chart. Boxes: Applications area meeting, Simulation project, ROOT + SEAL project, POOL project, SPI project. Link label: consultation.]
- Same organization as in Phase I.
- ROOT and SEAL projects merge
- ROOT project structured in work-packages

New ROOT team structure
The work-packages:
- GUI: Ilka Antcheva
- SEAL: Lorenzo Moneta
- PROOF: Fons Rademakers
- I/O & Trees: Philippe Canal
- DICT: Philippe Canal
- MATH: Lorenzo Moneta
- 2-D/3D graphics: Olivier Couet
- GEOM/VMC: Andrei Gheata
- BASE: Fons Rademakers

ROOT version 5
- Pro release 4.04/02: 3 May (new Users Guide)
- 1st dev release: June
- 2nd dev release: September
- 3rd dev release: Nov
- Pro release: December (New Guide)
See detailed Release Notes

ROOT 2005 workshop
- September at CERN
- 115 registered participants
- 41 talks
- 9 posters
See talks at ROOT web page

ROOT: large user community
ROOT is LGPL

Base Work Package
- New infrastructure features
- Miscellaneous

TArchiveFile and TZIPFile
TArchiveFile is an abstract class that describes an archive file containing multiple sub-files, like a ZIP or TAR archive.
The TZIPFile class describes a ZIP archive file containing multiple ROOT sub-files. Notice that the ROOT files should not be compressed when being added to the ZIP file, since ROOT files are normally already compressed. To create the file multi.zip do:

    zip -n root multi file1.root file2.root

The ROOT files in an archive can be simply accessed like this:

    TFile *f = TFile::Open("multi.zip#file2.root")

or

    TFile *f = TFile::Open("root://mymachine/multi.zip#2")

A TBrowser and TChain interface will follow shortly.

Replica of a DB subset
[Diagram. Elements: local TZipFile, remote TZipFile, T0, T1; access via http, xrootd, castor, dcache..]

SLAC's New File Server - xrootd
The file server xrootd (eXtended ROOT daemon) has been developed by Andy Hanushevsky of SLAC. The server exploits a multithreaded architecture to provide high-performance file-based access, focusing on scalability and fault tolerance.
The server is being extensively used by BaBar, Star and Alice.
The xrootd file server will in the near future replace the current daemon rootd.
xrootd and its plugin facility are heavily used by PROOF.

The New xrootd Client - TXNetFile
The new client class TXNetFile implements the xrootd protocol and is provided to open a file via the xrootd daemon.
TXNetFile can detect when it talks to an old rootd daemon and return a TNetFile.
To open a file via xrootd, just use the standard static method TFile::Open(), as for opening via rootd.

Class TGrid (abstract interface)

    //--- General GRID
    const char *GridUrl() const
    const char *GetGrid() const
    const char *GetHost() const
    const char *GetUser() const
    const char *GetPw() const
    const char *GetOptions() const
    Int_t       GetPort() const

    //--- Catalogue Interface
    virtual TGridResult *Command(const char *command, Bool_t interactive = kFALSE, UInt_t stream = kFALSE)
    virtual TGridResult *Query(const char *path, const char *pattern, const char *conditions, const char *options)
    virtual TGridResult *LocateSites()
    virtual TGridResult *ls(const char *ldn = "", Option_t *options = "")
    virtual Bool_t cd(const char *ldn = "", Bool_t verbose = kFALSE)
    virtual Bool_t mkdir(const char *ldn = "", Option_t *options = "")
    virtual Bool_t rmdir(const char *ldn = "", Option_t *options = "")
    virtual Bool_t register(const char *lfn, const char *turl, Long_t size, const char *se, const char *guid)
    virtual Bool_t rm(const char *lfn, Option_t *option = "")

    //--- Job Submission Interface
    virtual TGridJob *Submit(const char *jdl)
    virtual TGridJDL *GetJDLGenerator()

    //--- Load desired plugin and set up connection to GRID
    static TGrid *Connect(const char *grid, const char *uid, const char *pw, const char *options)

Access to File Catalogues
- e.g. Alien FC
- Same style interface could be implemented for other GRID File Catalogues

TGrid example with Alien

    // Connect (TGrid::Connect returns a pointer)
    TGrid *alien = TGrid::Connect("alien://");
    // Query
    TGridResult *res = alien->Query("/alice/cern.ch/user/p/peters/analysis/miniesd/", "*.root");
    // List of files
    TList *listf = res->GetFileInfoList();
    // Create chain
    TChain chain("Events", "session");
    chain.AddFileInfoList(listf);
    // Start PROOF
    TProof proof("remote");
    // Process your query
    chain.Process("selector.C");

Auto Loading of Plugins
Support for auto-loading libraries when an unknown class is being referenced. The auto-loading mechanism reads the files $ROOTSYS/etc/system.rootmap, ~/.rootmap and ./.rootmap (via TEnv) to try to map the unknown class to a library. If the library is found, it and the libraries on which it depends are loaded. The rootmap files are created with the rlibmap tool when executing "make map".
Example: in an interactive session, one can do directly

    TLorentzVector v;

without having to do

    gSystem->Load("libPhysics");
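
The lookup described above can be pictured with a small standalone sketch. It is not ROOT code: RootMap and autoLoad are hypothetical names, the map stands in for the contents of the rootmap files, and the dependency of libPhysics on libMatrix is used purely as an illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a rootmap: class name -> libraries to load
// (the library defining the class first, then the libraries it depends on).
using RootMap = std::map<std::string, std::vector<std::string>>;

// Returns the libraries newly "loaded" for cls, dependencies first.
// The result is empty if the class is unknown or everything is already loaded.
std::vector<std::string> autoLoad(const RootMap &rootmap,
                                  std::set<std::string> &loaded,
                                  const std::string &cls) {
    std::vector<std::string> newlyLoaded;
    const auto it = rootmap.find(cls);
    if (it == rootmap.end()) return newlyLoaded;
    // Walk the list backwards so that dependencies are loaded before
    // the library that needs them.
    for (auto lib = it->second.rbegin(); lib != it->second.rend(); ++lib) {
        if (loaded.insert(*lib).second) newlyLoaded.push_back(*lib);
    }
    return newlyLoaded;
}
```

A second request for the same class returns nothing, since the set of loaded libraries already contains everything the class needs.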

TMacro
This class allows for storing a C++ macro in a ROOT file. In addition to being stored in a ROOT file, a TMacro can be executed, edited, etc.

    TMacro m("Peaks.C");   // macro m with name "Peaks" is created from file Peaks.C
    m.Exec();              // macro executed with default arguments
    m.Exec("4");           // macro executed with argument
    m.SaveSource("newPeaks.C");
    TFile f("mymacros.root", "recreate");
    m.Write();             // macro saved to file with name "Peaks"

Using PCRE for Reg Exp's
- New class TPRegexp, which uses the Perl Compatible Regular Expressions library.
- Well-known, rich regular expression syntax.
- It is interfaced to TString and to other classes and methods now using TRegexp.
- TRegexp will of course stay for backward compatibility.

Dict Work Package
- New version of Reflex
- New version of rootcint
  - rootcint -> CINT
  - rootcint -> Reflex -> Cintex -> CINT
  - rootcint -> gccxml -> Reflex -> CINT
- Adapt PyRoot to Reflex
- Adapt CINT to Reflex

Dictionaries: root only
[Diagram. Elements: X.h, rootcint, XDictcint.cxx, CINT DS, CINT API, ROOT, Root meta, C++, CINT.]

Dictionaries: situation today
[Diagram. Elements: X.h, X.xml, gccxml, lcgdict, XDictlcg.cxx, REFLEX DS, REFLEX API, rootcint, XDictcint.cxx, CINT DS, CINT API, cintex, ROOT, Root meta, C++, CINT.]

Dictionaries: situation in the future
[Diagram. Elements: X.h, rootcint -cint, rootcint -reflex, rootcint -gccxml, XDictcint.cxx, Reflex/Cint DS, CINT/Reflex API, ROOT, Root meta, C++, CINT, Python.]

IO work-package
- Consolidation, Consolidation, Consolidation
- Support for STL collections
- More cases in auto schema evolution
- Better support for references
- Bitmap index
- TreeSQL

ROOT I/O: STL Collections
- ROOT now supports I/O of all STL containers
  - std::vector, std::list, std::set, std::deque, std::map, std::multimap
  - And implicitly (through std::deque): std::queue, std::stack
- STL collections are saved in split mode
  - Objects are split (but: NOT if pointers)
  - Quick pre-selections on trees
  - Interactivity: Trees can be browsed
  - Save space (see $ROOTSYS/test/bench):
    - std::vector : compression 5.38
    - std::vector : compression 3.37

Float, double and space... (1)
- Math operations very often require double precision, but on saving single precision is sufficient...
- New data type: Double32_t
  - In memory: double
  - On disk: float or integer

Float, double and space... (2)
Usage (see tutorials/double32.C):

    Double32_t m_data;   // [min,max<,nbits>]

- No nbits, min, max: saved as float
- min, max: saved as int with 32 bits precision
  - explicit values or expressions of values known to Cint (e.g. "pi")
- nbits present: saved as int with nbits precision
  - higher precision than float for the same persistent space
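
To make the range-plus-nbits idea concrete, here is a minimal standalone sketch of packing a double from [min,max] into an nbits-wide integer and back. It illustrates the principle only and is not ROOT's actual on-disk encoding; packRange and unpackRange are hypothetical names.

```cpp
#include <cmath>
#include <cstdint>

// Pack x from [min, max] into an unsigned integer using nbits bits (1..32).
// Values outside the range are clamped to the nearest bound.
std::uint32_t packRange(double x, double min, double max, unsigned nbits) {
    const double levels = std::ldexp(1.0, static_cast<int>(nbits)) - 1.0;  // 2^nbits - 1
    double t = (x - min) / (max - min);
    if (t < 0.0) t = 0.0;
    if (t > 1.0) t = 1.0;
    return static_cast<std::uint32_t>(std::llround(t * levels));
}

// Inverse mapping: recover an approximation of the original value.
double unpackRange(std::uint32_t q, double min, double max, unsigned nbits) {
    const double levels = std::ldexp(1.0, static_cast<int>(nbits)) - 1.0;
    return min + (max - min) * (static_cast<double>(q) / levels);
}
```

With this scheme the worst-case rounding error is (max - min) / (2 * (2^nbits - 1)), so when the range is known and narrow, an integer can resolve values more finely than a float occupying the same number of bits.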

Float, double and space... (3)
- Save space
- Increase precision

File types & Access in 5.06
[Diagram. Elements: user; TFile, TKey/TTree, TStreamerInfo; Local File X.root, Local File X.xml, http, rootd/xrootd, RFIO, Chirp, Castor, Dcache; TSQLServer, TSQLRow, TSQLResult, TTreeSQL; Oracle, SapDb, PgSQL, MySQL.]

Bitmap Indices
- Bitmap indices are efficient data structures for accelerating multi-dimensional queries:
  - E.g. pT > 195 AND nTracks 12.4
- Supported by most commercial database management systems and data warehouses
- Optimized for read-only data
- However, because an efficient index may be as big as the data, we think that it is only appropriate for things like event meta data catalogues
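
As a rough illustration of why such queries are cheap, the following standalone sketch builds one bitmap per value bin and answers a two-variable selection with bitwise operations only. It is a toy under stated assumptions, not the index discussed in the talk: BitmapIndex, atLeast and bitAnd are hypothetical names, cuts are assumed to fall exactly on bin edges, and at least two bin edges are required.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// One bitmap per bin: bit i of bitmaps[b] is set when event i falls in
// the half-open interval [edges[b], edges[b+1]).
struct BitmapIndex {
    std::vector<double> edges;               // bin edges, ascending, size >= 2
    std::vector<std::vector<bool>> bitmaps;  // edges.size() - 1 bitmaps

    BitmapIndex(const std::vector<double> &values, std::vector<double> binEdges)
        : edges(std::move(binEdges)),
          bitmaps(edges.size() - 1, std::vector<bool>(values.size(), false)) {
        for (std::size_t i = 0; i < values.size(); ++i)
            for (std::size_t b = 0; b + 1 < edges.size(); ++b)
                if (values[i] >= edges[b] && values[i] < edges[b + 1])
                    bitmaps[b][i] = true;
    }

    // Events with value >= edges[firstBin]: OR of all bitmaps from firstBin up.
    std::vector<bool> atLeast(std::size_t firstBin) const {
        std::vector<bool> out(bitmaps.empty() ? 0 : bitmaps[0].size(), false);
        for (std::size_t b = firstBin; b < bitmaps.size(); ++b)
            for (std::size_t i = 0; i < out.size(); ++i)
                if (bitmaps[b][i]) out[i] = true;
        return out;
    }
};

// Combine two selections over the same events (inputs must have equal size).
std::vector<bool> bitAnd(const std::vector<bool> &a, const std::vector<bool> &b) {
    std::vector<bool> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] && b[i];
    return out;
}
```

A selection such as "pT >= 195 AND nTracks < 4" then becomes bitAnd(ptIndex.atLeast(k), nTracksIndex.bitmaps[0]) for suitably chosen edges, without touching the event data itself. The price is one bit per event per bin, which is the size concern raised on the slide.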

Query Performance - TTreeFormula vs. Bitmap Indices
Bitmap indices 10X faster than TTreeFormula

Data analysis with bitmap indices
[Diagram. Elements: Event catalogue, Bitmap index query, Event list.]
Direct use by PROOF slaves to select events

Math work-package
- MathCore
- MathMore
- Minuit2
- Smatrix
- Linear & Robust Fitter
- Splot

MATH work-package: News
- MathCore library with basic Math functionality
  - Basic special and statistical functions
  - Physics and geometry vectors
- MathMore library
  - C++ interface to functions and algorithms from GSL
  - Extra math functions, adaptive integration, derivation, root finders
- Minuit2
  - New OO implementation of Minuit
  - Interface to ROOT TVirtualFitter
- Linear and Robust Fitter
- sPlot

New Math Libraries organization

MATH work-package: Plan
- Complete MathCore with Random numbers
- Adapt ROOT classes to MathCore
  - TF1,2,3, Fitting
- Virtual Fitter extensions
  - corresponding changes in ROOT fitting and roofit
- Fully integrate and extend new Minuit
- Fitting GUI
- Box plots, qqplots
- Many new tools required for LHC Physics analysis (PHYSTAT05 Oxford)

Linear Fitter (0)
- To fit functions linear in parameters
  - Polynomials, hyperplanes, linear combinations of arbitrary functions
- TLinearFitter can be used directly or through TH1, TGraph, TGraph2D::Fit interfaces
- When used directly, can fit multidimensional functions

Linear Fitter (1)
- Special formula syntax:
  - Linear parts separated by "++" signs: "1 ++ sin(x) ++ sin(2*x) ++ cos(3*x)"
  - equivalent to "[0] + [1]*sin(x) + [2]*sin(2*x) + [3]*cos(3*x)"
  - Simple to use in the multidimensional case: "x0 ++ x1 ++ exp(x2) ++ log(x3) ++ x4"
- Polynomials (pol0, pol1, ...) and hyperplanes (hyp1, hyp2, ...) are the fastest to compute
- By default, polynomials in TH1, TGraph::Fit functions now go through Linear Fitter
- Data to be used for fitting is not copied into the fitter
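
Because the model is linear in its parameters, the least-squares solution can be obtained in one step, with no starting values. The standalone sketch below shows this via the normal equations for an arbitrary list of basis functions. It illustrates the idea only and makes no claim about how TLinearFitter is implemented; Basis and linearFit are hypothetical names, and the basis functions are assumed to be linearly independent on the data.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Basis = std::vector<std::function<double(double)>>;

// Least-squares coefficients c_k of y ~ sum_k c_k * f_k(x), from the normal
// equations (A^T A) c = A^T y, solved by Gauss-Jordan elimination.
std::vector<double> linearFit(const Basis &f, const std::vector<double> &x,
                              const std::vector<double> &y) {
    const std::size_t n = f.size();
    // Augmented matrix [A^T A | A^T y].
    std::vector<std::vector<double>> m(n, std::vector<double>(n + 1, 0.0));
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t r = 0; r < n; ++r) {
            const double fr = f[r](x[i]);
            for (std::size_t c = 0; c < n; ++c) m[r][c] += fr * f[c](x[i]);
            m[r][n] += fr * y[i];
        }
    }
    for (std::size_t col = 0; col < n; ++col) {
        // Partial pivoting: bring the largest remaining entry onto the diagonal.
        std::size_t piv = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(m[r][col]) > std::fabs(m[piv][col])) piv = r;
        std::swap(m[col], m[piv]);
        // Eliminate this column from every other row.
        for (std::size_t r = 0; r < n; ++r) {
            if (r == col) continue;
            const double factor = m[r][col] / m[col][col];
            for (std::size_t c = col; c <= n; ++c) m[r][c] -= factor * m[col][c];
        }
    }
    std::vector<double> coeff(n);
    for (std::size_t k = 0; k < n; ++k) coeff[k] = m[k][n] / m[k][k];
    return coeff;
}
```

For the formula "1 ++ sin(x) ++ sin(2*x)", the basis would hold three functions: a constant 1, sin(x) and sin(2*x), and the returned vector holds the three fitted coefficients.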

Linear Fitter (2)
Advantages in separating linear and non-linear fitting:
- Doesn't require setting initial parameter values
- The gain in speed

Average CPU time:

    Function                                           Linear fitter   Minuit
    pol3 in TGraphErrors, 1000 fits of 1000 points     1.95            [value missing]
    TMath::Sin(x) + TMath::Sin(2*x)                    2.39            21.34

Robust fitting (0)
- Least Trimmed Squares regression: extension of the TLinearFitter class
- Motivation: least-squares fitting is very sensitive to bad observations
- Robust fitter is used to fit datasets with outliers
- The algorithm tries to fit h points (out of N) that have the smallest sum of squared residuals

Robust fitting (1)
- High breakdown point: the smallest proportion of outliers that can cause the estimator to produce values arbitrarily far from the true parameters

    Graph.Fit("pol3", "rob=0.75", -2, 2);

- 2nd parameter: fraction h of the good points

Multivariate covariance
- Minimum Covariance Determinant Estimator: a highly robust estimator of multivariate location and scatter
- Motivation: arithmetic mean and regular covariance estimator are very sensitive to bad observations
- Class TRobustEstimator
- The algorithm tries to find a subset of h observations (out of N) with the minimal covariance matrix determinant

Multivariate covariance
- Left: covariance ellipses of a 1000-point dataset with 250 outliers
- Right: distances of points from the robust mean, calculated using robust covariance matrix
- High breakdown point
- Indices of outlying points can be returned

sPlot - A statistical tool to unfold data distributions
- Left: Projection plot
  - cut on the likelihood ratio
  - Excess of events: Signal? Background?
- Right: sPlot
  - no cut
  - getting rid of background by statistical methods
  - Signal!!!

News in TH1 and TF1
- TH1:
  - Chi2 test
  - Mean & RMS error, skewness and kurtosis
- TF1:
  - Derivatives (1st, 2nd and 3rd)
  - Improved minimization: a combination of grid search and Brent's method (golden section search and parabolic interpolation)
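
The two-stage minimization can be sketched as follows: a coarse grid scan brackets the global minimum, then an interval-shrinking search refines it. This standalone sketch is deliberately simplified and is not the TF1 code: it uses golden-section steps only and omits the parabolic interpolation that Brent's method adds. minimize1D and its default parameters are hypothetical, and the fixed absolute tolerance assumes the search interval is of order unity.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>

// Minimize f on [a, b]: grid search to bracket the minimum, then
// golden-section search inside the bracketing interval.
double minimize1D(const std::function<double(double)> &f, double a, double b,
                  int nGrid = 100, double tol = 1e-8) {
    // Stage 1: evaluate f on nGrid + 1 equidistant points, keep the best.
    const double step = (b - a) / nGrid;
    int iBest = 0;
    double fBest = f(a);
    for (int i = 1; i <= nGrid; ++i) {
        const double fx = f(a + i * step);
        if (fx < fBest) {
            fBest = fx;
            iBest = i;
        }
    }
    double lo = a + std::max(iBest - 1, 0) * step;
    double hi = a + std::min(iBest + 1, nGrid) * step;

    // Stage 2: golden-section search; the interval shrinks by ~0.618 per step.
    const double invPhi = (std::sqrt(5.0) - 1.0) / 2.0;
    double x1 = hi - invPhi * (hi - lo);
    double x2 = lo + invPhi * (hi - lo);
    double f1 = f(x1), f2 = f(x2);
    while (hi - lo > tol) {
        if (f1 < f2) {
            hi = x2; x2 = x1; f2 = f1;
            x1 = hi - invPhi * (hi - lo);
            f1 = f(x1);
        } else {
            lo = x1; x1 = x2; f1 = f2;
            x2 = lo + invPhi * (hi - lo);
            f2 = f(x2);
        }
    }
    return 0.5 * (lo + hi);
}
```

The grid stage is what protects against landing in a local minimum: for a function with two minima, the refinement starts from the grid point with the lowest value, provided the grid is fine enough to resolve the two.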

Graphics work-package
- zillions of micro/mini features
- reimplement (TGaxis)
- GL with new GUI
- GL for dynamic tracks
- GL in Pad

TImageDump
Many extensions to the libAfterImage library to support line, marker and (filled) polygon drawing. Accessible via TASImage.
The new class TImageDump uses TASImage and derives from TVirtualPS to allow the saving of canvases in gif, jpg, png, tiff, etc., image formats in batch mode:
Or to display any gif, jpg, png, tiff in a canvas, do:

    $ root -b
    root [0] .x hsimple.C
    root [1] c1->Print("c1.gif");

    TCanvas *c1;
    TImageDump *imgdump = new TImageDump("test.png");
    c1->Paint();
    imgdump->Close();

GL in Pad

GL Features: Clipping

GUI work-package
- zillions of micro/mini features
- GUI Builder completion
- New Editor Widgets
- Fit Panel widget

GUI work-package: Plan
[Layer diagram. Elements: TVirtualX; TGX11, TGWin32Gdk, TGQt; Building Blocks; Widgets (combos, scroll bars, dialogs, sliders, MDI, etc); Code generators; High Level Widgets (editors, browsers). Status labels: Very stable, Changing, New features, Bug fixes.]

Graphics Editor
- Object orientation of editor design
- Manage GUI complexity by object editors
- Presents the right GUI at the right time according to the selected object in the canvas
- Easy-to-use
- Capacity for growth

Style Manager / Style Editor
- Top level interface
  - Manage a collection of TStyle objects
  - Create a new style
  - Delete a selected style
  - Import from a canvas / a C++ macro
  - Export to a C++ macro
  - Apply on all canvases or a selected object
  - Activate the style editor
- Preview window
  - Show the predicted results
  - On line update or by request
  - Placed in front of the selected canvas

GUI Builder
GUI Builder simplifies the process of designing GUIs based on the ROOT widget classes. Using Ctrl+S or the SaveAs dialog, users can generate C++ code in a macro that can be edited and executed via the CINT interpreter:

    root [0] .x example.C

    // transient frame
    TGTransientFrame *frame2 = new TGTransientFrame(gClient->GetRoot(),760,590);
    // group frame
    TGGroupFrame *frame3 = new TGGroupFrame(frame2,"curve");
    TGRadioButton *frame4 = new TGRadioButton(frame3,"gaus",10);
    frame3->AddFrame(frame4);
    frame2->SetWindowName("Fit Panel");
    frame2->MapSubwindows();
    frame2->Resize(frame2->GetDefaultSize());
    frame2->MapWindow();
    }

GEOM work-package
- Support for parameterized shapes. This will reduce the geometry size in memory for certain geometries defined in G3 style.
- CAD geometry import
- Geometry builder GUI

LHC detectors in ROOT TGeo

The Virtual MC
[Diagram. Elements: User Code, VMC, Geometrical Modeller, G3, G3 transport, G4, G4 transport, FLUKA, FLUKA transport, Reconstruction, Visualisation, Generators.]

Virtual Monte Carlo and ROOT Geometry
- TGeant3
  - Used in production: native GEANT3
  - New: TGeant3TGeo, an interface to G3 using TGeo geometry
  - No modification required in the user code
  - See presentation from Ivana Hrivnacova
- TGeant4
  - Used for Geant4 physics validation: G4 native geometry built after g3tog4 conversion
  - No interface yet between G4 and ROOT geometry
  - A few possible strategies for this implementation were discussed in detail at the last VMC workshop
  - Expect an interface with G4 in June 06
- TFluka
  - Old geometry interface using G4 geometry via FLUGG
  - Currently a fully validated geometry interface based on TGeo

PROOF work-package
- Connect/Disconnect modes
- Multiple queries in parallel
- Feedback histograms
- Multi-core CPUs

PROOF - Parallel ROOT Facility
(G. Ganis, ROOT05, 29 Sept 2005)
- System to export the ROOT analysis model on clusters of computers for interactive analysis of large data sets
- Flexible multi-tier architecture in GRID contexts
  - adapts to cluster of clusters or wide area virtual clusters
- Exploit inter-independence of entries in a tree or directory to achieve basic parallelism
  - data set split into packets assigned to worker nodes on demand
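
The "packets assigned on demand" scheme is a pull model: an idle worker asks for the next range of entries, so faster workers naturally process more packets. The standalone, single-threaded sketch below shows only that bookkeeping. It is not PROOF code; Packet and Packetizer are hypothetical names and the fixed packet size is an assumption made for brevity.

```cpp
#include <algorithm>
#include <cstddef>

// A contiguous range of tree entries handed to one worker.
struct Packet {
    std::size_t first;
    std::size_t count;
};

// Hands out the data set in packets, one per request.
class Packetizer {
public:
    Packetizer(std::size_t nEntries, std::size_t packetSize)
        : fNext(0), fEnd(nEntries), fSize(packetSize) {}

    // Called by a worker when it is idle. Fills p and returns true,
    // or returns false once the whole data set has been handed out.
    bool next(Packet &p) {
        if (fNext >= fEnd) return false;
        p.first = fNext;
        p.count = std::min(fSize, fEnd - fNext);
        fNext += p.count;
        return true;
    }

private:
    std::size_t fNext;  // first entry not yet assigned
    std::size_t fEnd;   // total number of entries
    std::size_t fSize;  // nominal packet size
};
```

Because entries are independent, the order in which packets are processed, and by which worker, does not affect the merged result.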

PROOF - Multi-tier Architecture
(G. Ganis, ROOT05, 29 Sept 2005)
[Architecture diagram. Annotations: good connection? VERY important / less important.]
- Optimize for data locality
- If not possible, remote data via (x)rootd, rfiod, dCache ...

PROOF - data analysis
(G. Ganis, ROOT05, 29 Sept 2005)
- Normal ROOT
  - TChain: collection of TTree
  - TSelector: Begin(), Process(), Terminate()
- PROOF
  - Same chain, same selector

Local processing:

    TChain a("h42");
    {
       // Define the data set
       a.Add("root://oplapro62.cern.ch//tmp/dstarmb.root");
       a.Add("root://oplapro62.cern.ch//tmp/dstarp1a.root");
       a.Add("root://oplapro62.cern.ch//tmp/dstarp1b.root");
       a.Add("root://oplapro62.cern.ch//tmp/dstarp2.root");
       // Process the selector
       a.Process("h1analysis.C");
    }

Remote processing:

    {
       // Open PROOF
       TProof proof("master");
       // Process the selector
       a.Process("h1analysis.C");
    }

PROOF and Selectors
[Diagram: Client and Slaves]
- Initialize each slave
- Many Trees are being processed
- No user control over the order
- The same code works also without PROOF (of course!)
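
The selector contract can be mimicked in a few lines of standalone C++ to show why the processing order must not matter. This is a toy model, not the TSelector API: Selector, CountEven and run are hypothetical, and entries are reduced to plain integers.

```cpp
#include <vector>

// Minimal stand-in for the selector pattern: set up, per-entry work, wrap-up.
struct Selector {
    virtual ~Selector() = default;
    virtual void Begin() {}
    virtual void Process(long entry) = 0;
    virtual void Terminate() {}
};

// Example selector: counts the even entry numbers.
struct CountEven : Selector {
    long n = 0;
    long result = -1;
    void Begin() override { n = 0; }
    void Process(long entry) override { if (entry % 2 == 0) ++n; }
    void Terminate() override { result = n; }
};

// Drives a selector over the entries in whatever order they arrive.
void run(Selector &s, const std::vector<long> &entries) {
    s.Begin();
    for (long e : entries) s.Process(e);
    s.Terminate();
}
```

A selector written this way gives the same answer whether it sees entries 0..4 or 4..0, which is the property that lets the same code run locally or be spread over slaves.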

PROOF - User Sandbox
(G. Ganis, ROOT05, 29 Sept 2005)
- Users have their own sandbox on each worker node
- File transfers minimized
  - cache packages, selector
  - File integrity: MD5 checksums, timestamps
- Package manager to upload files or packages
  - binary or source
  - PAR (PROOF Archive, like Java jar)
    - provides ROOT-INF directory, BUILD.sh, SETUP.C to control setup in each worker
- TProof API to handle all this

Typical query-time distribution
(G. Ganis, ROOT05, 29 Sept 2005)
[Plot: Blocking / Non-blocking, with regions labelled blocking and non-blocking.]

Analysis Session Example
Monday at 10h15, ROOT session on my laptop:
- AQ1: 1s query produces a local histogram
- AQ2: a 10mn query submitted to PROOF1
- AQ3->AQ7: short queries
- AQ8: a 10h query submitted to PROOF2
Monday at 16h25, ROOT session on my laptop:
- BQ1: browse results of AQ2
- BQ2: browse temporary results of AQ8
- BQ3->BQ6: submit 4 10mn queries to PROOF1
Wednesday at 8h40, Carrot session on any web browser:
- CQ1: browse results of AQ8, BQ3->BQ6

GUI manager
Allows full one-click control on everything:
- define a new session
- submit a query, execute a command
- query editor
  - execute macro to define or pick up a TChain
  - browse directories with selectors
- online monitoring of feedback histograms
- browse folders with results of query
- retrieve, delete, archive functionality

Query processing
[Screenshot. Labels: Feedback histograms, Processing information.]

PROOF and XROOTD
- XROOTD is already playing a very important role in PROOF. It will continue to play a growing role.
- The PROOF and XROOTD teams are cooperating to get even more from XROOTD: caching, read ahead, new XROOTD services.
- Still a lot to do to have a good integration of XROOTD with other services like CASTOR.

PROOF test facility
- For the past few weeks we have had access to a dedicated farm with 32 dual-processor nodes.
- These machines are intended for testing, not production.
- The nodes are slow machines (800 MHz), but are extremely useful (vital) to test our PROOF prototypes (Alice, CMS, Phobos).
- We expect to have many more (100) and faster machines next year.
- CAF for Alice and CMS.

Multi Core CPUs
This is going to affect the evolution of ROOT in many areas

Moore's law revisited

CPU/Node hierarchy

                 Laptop node         Local cluster    GRID(s)
    CPUs         1->32->?? N cpus    1000xN cpus      100x1000 nodes
    Latency      100 nanos           100 micros       100 millis
    Disk         Giga                100 Tera         10 Peta
    Disk 2012    > 1 Tera            > 1 Peta         > 100 Peta

Many implications for ROOT
- More and more multi-threaded applications
- Will have to make many classes thread aware
- ACLIC compilation in parallel
- GL viewer could take advantage of multi cpus
- Fitting too
- I/O with threads for
  - Read ahead
  - Unzipping
- And obviously TTree queries and PROOF

Summary
- After 10 years of development, ROOT is widely used in HEP and elsewhere.
- The team has been extended with LCG2 and cooperates with many external developers.
- Consolidation phase for I/O and Trees
- Intensive developments in most packages
- Pushing PROOF data analysis model
- Pro release 5.12 (15 December)
- Next Pro release 6.06 (June 2006)

SEAL ROOT transition
- Adiabatic changes towards the experiments
- SEAL functionality will be maintained as long as the experiments require
[Diagram. Axes: time/versions versus functionality. Elements: Expts S/W, SEAL, ROOT.]