A couple of slides on containers…


1 A couple of slides on containers…
Federico Carminati Offline Week 10-05

2 Generalities
  The purpose of a container is to hold several instances of similar information
  Elements in a container are accessed via an index, an iterator, or both
  Three kinds of containers will be considered:
    “C-style” arrays
    ROOT containers
    STL containers
06/10/2005

3 C-style containers
    #include <stdio.h>
    #include <Rtypes.h>   // Float_t

    void cont1 () {
      struct point { Float_t x; Float_t y; };
      point P[100];
      // sizeof yields a size_t, so it is printed with %zu
      printf("Sizeof P = %zu\n", sizeof(P));
    }

    root [5] .x cont1.C++
    Sizeof P = 800

4 C-style containers
Advantages
  Minimum size overhead
  Fast access (direct and sequential)
  Very clear semantics
Drawbacks
  Lack of “encapsulation”
  Minimal functionality (I/O, browsing…)
  Fixed dimension
  No safety against out-of-bounds addressing
Where to use
  Data structures within algorithms
Where to avoid
  For dynamic data structures
  When I/O and inspection are required
  For publicly accessible data (i.e. outside a single method)
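
The “minimum size overhead” and “no safety against out-of-bounds addressing” points can be illustrated in standard C++, without ROOT. This is only a minimal sketch: `Point`, `kN`, `ArrayOverhead` and `CheckedX` are hypothetical names introduced for the example, and `float` stands in for `Float_t`.

```cpp
#include <cstddef>
#include <stdexcept>

// Plain struct standing in for the slide's `point` (float replaces Float_t).
struct Point { float x; float y; };

const std::size_t kN = 100;

// Bytes a C-style array occupies beyond its payload.
std::size_t ArrayOverhead() {
  Point p[kN];
  (void)p;  // only the size of the array is of interest here
  return sizeof(p) - kN * sizeof(Point);
}

// A C-style array never checks its index, so any safety has to be
// added by hand around the raw access.
float CheckedX(const Point* p, std::size_t n, std::size_t i) {
  if (i >= n) throw std::out_of_range("index outside array");
  return p[i].x;
}
```

The array itself carries no bookkeeping at all, which is why the caller has to pass the length `n` separately to get any checking.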

5 Array of classes
    #include <stdio.h>
    #include <TObject.h>

    void cont1 () {
      class Cpoint : public TObject {
      public:
        Cpoint() {}
        ~Cpoint() {}
        Float_t X() const {return fX;}
        Float_t Y() const {return fY;}
      private:
        Float_t fX;
        Float_t fY;
      };

      Cpoint CP[100];
      printf("Sizeof CP = %zu\n", sizeof(CP));
    }

    root [7] .x cont1.C++
    Sizeof CP = 2000

6 Array of classes
Advantages
  Fast access (direct and sequential)
  Very clear semantics
  Full “ROOT” functionality (I/O, browsing)
  C++ “encapsulation”
Drawbacks
  12 bytes overhead per object
  Fixed dimension
  No safety against out-of-bounds addressing
Where to use
  Data structures within algorithms
  Class members with fixed dimensions
Where to avoid
  For dynamic data structures
  When objects are small
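
The per-object overhead can be reproduced without ROOT by giving a class a polymorphic base. The sketch below is an assumption-laden stand-in, not ROOT code: `FakeObject` is a hypothetical class that only mimics a header made of a virtual table pointer and two 32-bit words, and `PlainPoint`, `ObjPoint` and `ArrayOfClassesOverhead` are names invented for the example.

```cpp
#include <cstddef>

// Hypothetical stand-in for a TObject-like base: a virtual table pointer
// plus two 32-bit words. It mimics the layout only.
class FakeObject {
public:
  virtual ~FakeObject() {}
private:
  unsigned int fUniqueID = 0;
  unsigned int fBits = 0;
};

// The same payload without any base class.
struct PlainPoint { float x; float y; };

class ObjPoint : public FakeObject {
public:
  float X() const { return fX; }
  float Y() const { return fY; }
private:
  float fX = 0;
  float fY = 0;
};

// Extra bytes paid by an array of n ObjPoint compared with n PlainPoint:
// the header is repeated for every element.
std::size_t ArrayOfClassesOverhead(std::size_t n) {
  return n * (sizeof(ObjPoint) - sizeof(PlainPoint));
}
```

The exact figure depends on the platform's pointer size and padding; the 12 bytes quoted on the slide would correspond to 4-byte pointers.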

7 Classes of arrays
    #include <stdio.h>
    #include <TObject.h>

    void cont1 () {
      class CApoint : public TObject {
      public:
        CApoint() {}
        ~CApoint() {}
        Float_t X(Int_t i) const {return fX[i];}
        Float_t Y(Int_t i) const {return fY[i];}
      private:
        Float_t fX[100];
        Float_t fY[100];
      };

      CApoint CAP;
      printf("Sizeof CAP = %zu\n", sizeof(CAP));
    }

    root [20] .x cont1.C++
    Sizeof CAP = 812

8 Classes of arrays
    #include <stdio.h>
    #include <TObject.h>

    void cont1 () {
      class Cpoint : public TObject {
      public:
        Cpoint() {}
        ~Cpoint() {}
        Float_t X() const {return fX;}
        Float_t Y() const {return fY;}
        void Set(Float_t x, Float_t y) {fX=x; fY=y;}
      private:
        Float_t fX;
        Float_t fY;
      };

      class CBpoint : public TObject {
      public:
        CBpoint() {}
        ~CBpoint() {}
        // Copy element i into a caller-supplied Cpoint
        void GetPoint(Cpoint &p, Int_t i) const {p.Set(fX[i], fY[i]);}
      private:
        Float_t fX[100];
        Float_t fY[100];
      };

      CBpoint CBP;
      printf("Sizeof CBP = %zu\n", sizeof(CBP));
    }

9 Classes of arrays
Advantages
  Fast access (direct and sequential)
  Very clear semantics
  Full “ROOT” functionality (I/O, browsing)
  C++ “encapsulation”
  Possibility to add your own memory management
  Low overhead (12 bytes for the whole array!)
Drawbacks
  “Roll-your-own” management of dynamic dimensions
  No safety against out-of-bounds addressing
Where to use
  Class members with fixed dimensions
Where to avoid
  For highly dynamic data structures
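
The “one header for the whole array” idea can be sketched in standard C++ as a single polymorphic object that holds the coordinate arrays. This is an illustration only: `PointArray`, `Point2` and `ClassOfArraysOverhead` are hypothetical names, the virtual destructor stands in for a ROOT base class, and the bounds check is an addition that the slide's version does not have.

```cpp
#include <cstddef>
#include <stdexcept>

struct Point2 { float x; float y; };

// One object holds all coordinates, so the polymorphic header is paid once.
class PointArray {
public:
  static const std::size_t kSize = 100;
  virtual ~PointArray() {}

  void Set(std::size_t i, float x, float y) {
    Check(i);
    fX[i] = x;
    fY[i] = y;
  }
  // Copy element i into a caller-supplied point.
  void GetPoint(Point2& p, std::size_t i) const {
    Check(i);
    p.x = fX[i];
    p.y = fY[i];
  }

private:
  // Hand-written bounds check: the raw arrays provide none.
  static void Check(std::size_t i) {
    if (i >= kSize) throw std::out_of_range("PointArray index");
  }
  float fX[kSize] = {};
  float fY[kSize] = {};
};

// Bytes beyond the 2 * kSize floats of payload, for the whole array.
std::size_t ClassOfArraysOverhead() {
  return sizeof(PointArray) - 2 * PointArray::kSize * sizeof(float);
}
```

Unlike the array of classes, the overhead here does not grow with the number of points.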

10 ROOT containers

11 ROOT containers I - TObjArray
    #include <stdio.h>
    #include <TObject.h>
    #include <TObjArray.h>

    void cont2 () {
      class Cpoint : public TObject {
      public:
        Cpoint() {}
        ~Cpoint() {}
        Float_t X() const {return fX;}
        Float_t Y() const {return fY;}
        void Set(Float_t x, Float_t y) {fX=x; fY=y;}
      private:
        Float_t fX;
        Float_t fY;
      };

      TObjArray CP(100);
      for (Int_t i=0; i<100; ++i) CP[i]=new Cpoint();
      // The array does not own its objects by default: delete them explicitly
      CP.Delete();
    }

12 ROOT containers I - TObjArray
Advantages
  Fast direct access; sequential access may be slower
  Polymorphic container
  Full “ROOT” functionality (I/O, browsing)
  C++ “encapsulation”
  Fully automated dynamic management
  Overhead is 40+<n>*4 bytes
Drawbacks
  Have to use TObjects, with their own overhead
  Object creation is expensive
  Object ownership has to be handled carefully to avoid leaks
Where to use
  Dynamic data structures with direct access
  Need for polymorphism
Where to avoid
  Where the above conditions are not met
  When you need to recreate objects frequently
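
The two points “polymorphic container” and “object ownership has to be handled carefully” can be shown with a standard C++ analogue. This sketch does not use the TObjArray API: `Hit`, `TrackHit` and `FillAndSum` are hypothetical names, and ownership is expressed with `std::unique_ptr` rather than with any ROOT mechanism.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Base class that counts live instances so leaks become visible.
class Hit {
public:
  Hit() { ++fgAlive; }
  virtual ~Hit() { --fgAlive; }
  virtual int Kind() const { return 0; }
  static int Alive() { return fgAlive; }
private:
  static int fgAlive;
};
int Hit::fgAlive = 0;

class TrackHit : public Hit {
public:
  int Kind() const override { return 1; }
};

// Fill a container of base-class pointers with mixed types and call a
// virtual function on every element.
int FillAndSum(std::size_t n) {
  std::vector<std::unique_ptr<Hit>> hits;
  for (std::size_t i = 0; i < n; ++i) {
    if (i % 2) hits.push_back(std::unique_ptr<Hit>(new TrackHit()));
    else       hits.push_back(std::unique_ptr<Hit>(new Hit()));
  }
  int sum = 0;
  for (const auto& h : hits) sum += h->Kind();
  return sum;
}  // the vector owns its elements, so they are all deleted here
```

With raw pointers in the container instead, `Hit::Alive()` would stay at `n` after the call unless the elements were deleted by hand, which is the kind of leak the slide warns about.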

13 ROOT containers II - TClonesArray
    #include <stdio.h>
    #include <TObjArray.h>
    #include <TClonesArray.h>
    #include <TStopwatch.h>

    void cont3 (Int_t nrep) {
      class Cpoint : public TObject {
      public:
        Cpoint() {}
        ~Cpoint() {}
        Float_t X() const {return fX;}
        Float_t Y() const {return fY;}
        void Set(Float_t x, Float_t y) {fX=x; fY=y;}
      private:
        Float_t fX;
        Float_t fY;
      };

      const Int_t size=20000;
      TStopwatch t;

      // TObjArray: every object is allocated with new and deleted each round
      t.Start();
      TObjArray a(size);
      for(Int_t i=0; i<nrep; ++i) {
        for(Int_t j=0; j<size; ++j) a[j]=new Cpoint();
        a.Delete();
      }
      t.Print();

      // TClonesArray: objects are built in place with placement new.
      // The outer loop is missing from the transcript and is restored here
      // on the assumption that both containers were timed over nrep rounds.
      t.Reset();
      t.Start();
      TClonesArray b("Cpoint", size);
      for(Int_t i=0; i<nrep; ++i) {
        for(Int_t j=0; j<size; ++j) new(b[j]) Cpoint();
        b.Clear();
      }
      t.Print();
    }

    root [23] .x cont3.C++(1000)
    Real time 0:01:11, CP time
    Real time 0:00:22, CP time

14 ROOT containers II - TClonesArray
Advantages
  Fast direct and sequential access
  Polymorphic container
  Full “ROOT” functionality (I/O, browsing)
  C++ “encapsulation”
  Fully automated dynamic management
  Overhead is 48+<n>*8 bytes
  Very cheap object creation
Drawbacks
  Have to use TObjects, with their overhead
  Array owns the objects
Where to use
  Dynamic data structures with direct access which are recreated several times
  Need for polymorphism
Where to avoid
  Where the above conditions are not met
  When you do not need to recreate objects frequently
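
The “very cheap object creation” comes from constructing objects with placement new into memory the array already holds, as in `new(b[j]) Cpoint()` on the previous slide. The sketch below shows that idea in standard C++ only. It is loosely analogous and is not how TClonesArray is implemented: `SlotPool` and `Point` are hypothetical names, and the capacity is fixed at compile time to keep the example short.

```cpp
#include <cstddef>
#include <new>

struct Point {
  float x;
  float y;
  Point() : x(0), y(0) {}
};

// Hypothetical fixed-capacity pool: the memory is reserved once and objects
// are constructed into it with placement new, round after round.
template <class T, std::size_t N>
class SlotPool {
public:
  SlotPool() : fUsed(0) {}
  ~SlotPool() { Clear(); }
  SlotPool(const SlotPool&) = delete;
  SlotPool& operator=(const SlotPool&) = delete;

  // Construct an object in the next free slot; returns nullptr when full.
  T* New() {
    if (fUsed == N) return nullptr;
    return new (Slot(fUsed++)) T();
  }
  // Destroy the objects but keep the memory for the next round.
  void Clear() {
    while (fUsed > 0) reinterpret_cast<T*>(Slot(--fUsed))->~T();
  }
  std::size_t Size() const { return fUsed; }

private:
  void* Slot(std::size_t i) { return fBuf + i * sizeof(T); }
  alignas(T) unsigned char fBuf[N * sizeof(T)];
  std::size_t fUsed;
};
```

After `Clear()`, the next `New()` returns the same address as before, so refilling the pool never goes back to the heap.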

15 Trees
  Trees are not containers
  Trees simulate containers for collections of similar objects written on a file
  When the collection is small, it is convenient to read it all in memory
  When it is large, Trees give you the “look and feel” of a container in memory with a sophisticated “behind your back” management of I/O
  Trees have a very nice “player” interface that you do not have for normal containers
    Unless you implement it!

16 Maps
    #include <map>
    #include <iostream>
    #include <cstring>   // strcmp
    using namespace std;

    // Order C strings by their content, not by pointer value
    struct ltstr {
      bool operator()(const char* s1, const char* s2) const {
        return strcmp(s1, s2) < 0;
      }
    };

    void cont4() {
      map<const char*, int, ltstr> months;
      const char *mname[12]={"january", "february", "march", "april",
                             "may", "june", "july", "august", "september",
                             "october", "november", "december"};
      int days[12]={31,28,31,30,31,30,31,31,30,31,30,31};
      for (int i=0; i<12; i++) months[mname[i]]=days[i];

      cout << "june -> " << months["june"] << endl;
      map<const char*, int, ltstr>::iterator cur = months.find("june");
      map<const char*, int, ltstr>::iterator prev = cur;
      map<const char*, int, ltstr>::iterator next = cur;
      ++next;
      --prev;
      cout << "Previous (in alphabetical order) is " << (*prev).first << endl;
      cout << "Next (in alphabetical order) is " << (*next).first << endl;
    }

    june -> 30
    Previous (in alphabetical order) is july
    Next (in alphabetical order) is march

17 Maps
  Map is a Sorted Associative Container that associates objects of type Key with objects of type Data
  Map is a Pair Associative Container, meaning that its value type is pair<const Key, Data>
  It is also a Unique Associative Container, meaning that no two elements have the same key
  Map has the important property that inserting a new element into a map does not invalidate iterators that point to existing elements
  Erasing an element from a map also does not invalidate any iterators, except, of course, for iterators that actually point to the element that is being erased
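
The iterator guarantees described above can be checked directly. In this minimal sketch, `IteratorSurvives` is a hypothetical helper written for the example, and `std::string` keys are used instead of `const char*` so that no custom comparison functor is needed.

```cpp
#include <map>
#include <string>

// Returns true if an iterator obtained before later inserts and erases
// still designates the same element afterwards.
bool IteratorSurvives() {
  std::map<std::string, int> months;
  months["june"] = 30;
  months["july"] = 31;

  std::map<std::string, int>::iterator june = months.find("june");

  months["march"] = 31;    // insertion: existing iterators stay valid
  months["august"] = 31;
  months.erase("july");    // erasing a different element: still valid

  return june->first == "june" && june->second == 30;
}
```

Only an iterator to the erased element itself (here, one pointing at "july") would become unusable.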

18 Maps
Advantages
  Fast direct access and sequential access (but no indexing)
  Supported by ROOT
  Fully automated dynamic management (see before)
Drawbacks
  Large overhead (I could not calculate it EXACTLY, but it includes a tree node for every element)
  Using “AliRoot-forbidden STL’s”
Where to use
  Need for quick access to data with non-integer keys
Where to avoid
  Where you do NOT desperately need the above
  Where you can use TMap
  For integer keys the overhead of building a sorted tree is massive and unjustified -- you are using a bazooka to kill a fly!
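
The “bazooka to kill a fly” remark about integer keys can be made concrete: when the keys are small consecutive integers, a plain array indexed by the key gives the same answers as a map. The sketch below is an illustration only; `DaysByIndex` and `DaysByMap` are hypothetical helpers, and the month is assumed to be numbered 0 to 11.

```cpp
#include <array>
#include <cstddef>
#include <map>

// Direct indexing: one contiguous block, no per-element bookkeeping.
int DaysByIndex(std::size_t month) {
  static const std::array<int, 12> days =
      {{31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}};
  return days.at(month);  // at() adds the bounds check a raw array lacks
}

// Same lookup through a map: a std::map allocates a separate node for
// every element, and the map is rebuilt on each call here for simplicity.
int DaysByMap(int month) {
  static const int kDays[12] = {31, 28, 31, 30, 31, 30,
                                31, 31, 30, 31, 30, 31};
  std::map<int, int> days;
  for (int i = 0; i < 12; ++i) days[i] = kDays[i];
  std::map<int, int>::const_iterator it = days.find(month);
  return it == days.end() ? -1 : it->second;
}
```

Both functions return the same values for valid months; the map only pays off when the keys are not usable as array indices.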

19 … and if I had more time … I would have told you about all the rest
… but

20 Conclusion
  It might be tempting to use the “most functional” container to do the job
  Functionality comes at a cost
  AliRoot is already too slow and too big to afford this
  So please use a judicious blend of brain and the simplest collection that does the job
  Don’t delude yourself with 10-line benchmarks
    They can be tuned to provide any result with a bit of skill

