An Introduction to STL - Standard Template Library Jiang Hu Department of Electrical Engineering Texas A&M University
What Is STL? C++ based implementations of general purpose data structures and algorithms
When Do You Need STL? To reduce your C/C++ programming time: use data structures and algorithms from STL directly instead of implementing everything from scratch. For easier maintenance: bugs in low-level data structures are the most difficult to find, and STL components are robust building blocks. When program runtime is not very critical: general-purpose implementations may run a little slower than customized implementations.
Assumptions: You know the basics of C++; data structures such as linked lists, hash tables, binary search trees …; algorithms such as sorting …
Bottom Line: Understand class. A class is an essential concept in C++ and is what STL is built on. It is an abstract form that has member variables to record data and member functions to do something. Members can be open to the public or encapsulated as private.
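Example of class
#include <cstddef>   // for NULL
class node {
private:
    long x, y, ID;
    double rat, cap;
    node* parent;
public:
    node(long id);
    void setCoordinates( long, long );
    long getID();
};
node::node(long id) : x(0), y(0), ID(id), rat(0.0), cap(0.0), parent(NULL) {}
void node::setCoordinates( long a, long b ) { x = a; y = b; }  // assumed definition: stores the coordinates
long node::getID() { return ID; }
int main() {
    node myNode(1);
    myNode.setCoordinates(5,8);
    long p = myNode.getID();
    return 0;
}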
STL Containers Sequence containers: vector, deque, list. Sorted associative containers: set, multiset, map, multimap.
STL list A list is a sequence container that is functionally similar to a doubly linked list. It supports insertion, deletion, sorting, … …
Example of list
#include <list>
void doSomething() {
    std::list<node> nodeList;
    … …
    nodeList.push_back( node(1) );   // node 1 is added at the end
    nodeList.push_front( node(0) );  // node 0 is added at the front
    nodeList.pop_back();             // remove the element at the end
    nodeList.pop_front();            // remove the element at the front
}
STL iterator
begin() refers to the first element of the list; end() refers to the position one past the last element.
#include <list>
void doSomething() {
    std::list<node> nodeList;
    std::list<node>::iterator j;     // generalized pointer
    … …
    for ( j = nodeList.begin(); j != nodeList.end(); j++ ) {
        if ( (*j).getID() > 3 ) break;
    }
}
insert(), erase() and remove()
#include <list>
void doSomething() {
    std::list<node> nodeList;
    std::list<node>::iterator j;     // generalized pointer
    … …
    for ( j = nodeList.begin(); j != nodeList.end(); j++ ) {
        if ( (*j).getID() > 3 ) nodeList.insert( j, node(2) );  // inserts before j
        if ( (*j).getID() < 0 ) nodeList.erase( j );            // problem?? j is invalid after erase, yet the loop still does j++
    }
    nodeList.remove( nodeA );        // removes all elements equal to nodeA; operator== must be defined for node
}
Careful erase in a loop
#include <list>
void doSomething() {
    std::list<node> nodeList;
    std::list<node>::iterator j, k;
    … …
    j = nodeList.begin();
    while ( j != nodeList.end() ) {
        k = j++;                     // advance j before k can be erased
        if ( (*k).getID() < 0 ) {
            nodeList.erase( k );     // only k is invalidated; j stays valid
        }
    }
}
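An alternative to the two-iterator loop is to use the iterator that list::erase() returns, which refers to the element after the erased one. The sketch below assumes a simplified stand-in, Node, for the slides' node class; eraseNegative is a hypothetical helper name.

```cpp
#include <cstddef>
#include <list>

// Hypothetical stand-in for the slides' node class; only the ID matters here.
struct Node {
    long id;
    explicit Node(long i) : id(i) {}
    long getID() const { return id; }
};

// Erase every node with a negative ID and return how many were removed.
std::size_t eraseNegative(std::list<Node>& nodes) {
    std::size_t removed = 0;
    std::list<Node>::iterator j = nodes.begin();
    while (j != nodes.end()) {
        if (j->getID() < 0) {
            j = nodes.erase(j);  // erase() returns the iterator following the erased element
            ++removed;
        } else {
            ++j;                 // advance only when nothing was erased
        }
    }
    return removed;
}
```

Either form works for list; the returned-iterator form also carries over to vector and deque, where erasing invalidates more iterators.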
front() and back()
#include <list>
void doSomething() {
    std::list<node> nodeList;
    … …
    node A = nodeList.front();       // copy the first node in the list
    node B( nodeList.back() );       // construct a node B that is a copy of the last node in the list
}
size(), sort() and merge()
#include <list>
void doSomething() {
    std::list<node> listA, listB;
    … …
    int numNodes = listA.size();     // number of elements in the list
    listA.sort();                    // operator< must be defined for node
    listB.sort();
    listA.merge( listB );            // both lists should be sorted; listB becomes empty after merge
}
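The sort() and merge() calls above rely on operator< for the element type. A minimal sketch, assuming a simplified Node that is ordered by its ID (the slides do not say how node is ordered, so this ordering is an assumption):

```cpp
#include <list>

// Hypothetical stand-in for the slides' node class.
struct Node {
    long id;
    explicit Node(long i) : id(i) {}
};

// Assumed ordering: compare nodes by ID.
inline bool operator<(const Node& a, const Node& b) { return a.id < b.id; }

// Sort both lists, merge b into a, and return the merged IDs in order.
std::list<long> sortAndMerge(std::list<Node>& a, std::list<Node>& b) {
    a.sort();
    b.sort();
    a.merge(b);  // b is empty afterwards; a holds all elements in sorted order
    std::list<long> ids;
    for (std::list<Node>::const_iterator it = a.begin(); it != a.end(); ++it) {
        ids.push_back(it->id);
    }
    return ids;
}
```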
STL vector Functionality-wise, roughly a subset of list: there are no push_front(), pop_front(), remove(), sort() or merge() member functions. It supports operator [], such as vectorA[3]. Then why do we need vector? Access is faster if the data are mostly static or there is not much insertion/deletion.
STL deque Very similar to vector, except that it supports push_front() and pop_front(). It also supports operator []. Also friendly to accessing static data.
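A small sketch of the deque interface, showing insertion at both ends together with indexed access. The function name dequeDemo and the digit-packing return value are illustrative only.

```cpp
#include <deque>

// Build the deque {1, 2, 3} using both ends, then read it back by index.
int dequeDemo() {
    std::deque<int> d;
    d.push_back(2);
    d.push_back(3);
    d.push_front(1);                        // constant-time insertion at the front
    return d[0] * 100 + d[1] * 10 + d[2];   // indexed access, packed into one number
}
```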
vector vs. array in C/C++ A vector manages its memory dynamically. A certain amount of memory may be allocated when a vector is constructed. When a new element is added and the memory space is insufficient, a new block, typically a multiple of the current size, is allocated automatically and the existing elements are copied into it.
Use reserve() if possible If you have a rough idea of the size of the vector, use reserve() to allocate sufficient memory space at the beginning. Dynamic memory allocation is nice, but it costs runtime, and the related copying also takes time. For example, if you will work on a routing tree for a net with k sinks, then:
    std::vector<node> nodeVec;
    const int a = 2;
    nodeVec.reserve( a*k );
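The effect of reserve() can be observed by watching capacity() while elements are appended. This is a minimal sketch; countReallocations is a hypothetical helper, and how often an unreserved vector grows depends on the library implementation.

```cpp
#include <cstddef>
#include <vector>

// Count how many times the capacity changes while appending n ints.
std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> v;
    if (reserveFirst) {
        v.reserve(n);  // one allocation up front
    }
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {  // the vector had to grow
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

With reserveFirst set, no growth occurs during the loop; without it, the count is implementation dependent but greater than zero for any sizeable n.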
Sequence Container Summary Contiguous-memory based (vector, and deque in chunks): vector has constant time addition and deletion at the end; deque has constant time addition and deletion at the beginning and the end; both have constant time access and linear time insertion/deletion in the middle. Node based (list): linear time access; constant time insertion/deletion at any position, given an iterator to that position.
Sorted Associative Containers set<key>: a sorted set of keys; the key can be any data type with operator< defined. multiset<key>: allows duplicate keys. map<key, T>: T is the mapped data type; the data are always sorted according to their keys. multimap<key, T>: allows duplicate keys. Internally, these are typically implemented via a red-black tree.
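A brief sketch of how set and multiset treat duplicate keys, assuming int keys for simplicity; sortedUnique and occurrences are hypothetical helper names.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// A set keeps one copy of each key, in sorted order.
std::vector<int> sortedUnique(const std::vector<int>& in) {
    std::set<int> s(in.begin(), in.end());
    return std::vector<int>(s.begin(), s.end());
}

// A multiset keeps duplicates, so count() can exceed 1.
std::size_t occurrences(const std::vector<int>& in, int key) {
    std::multiset<int> m(in.begin(), in.end());
    return m.count(key);
}
```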
STL map
#include <map>
#include <string>
void doSomething() {
    std::map<std::string, double, std::less<std::string> > tempMap;
    tempMap[ "Austin" ] = 93.2;      // [] can be employed to insert
    tempMap.insert( std::pair<std::string,double>( "Chicago", 84.6 ) );
    tempMap[ "Chicago" ] = 86.8;     // [] is better for update
    std::map<std::string, double, std::less<std::string> >::iterator k;
    for ( k = tempMap.begin(); k != tempMap.end(); k++ ) {
        if ( (*k).first == "Austin" ) (*k).second = 97.4;
    }
    k = tempMap.find( "Austin" );    // logarithmic-time lookup by key
    if ( k != tempMap.end() ) (*k).second = 95.3;
}
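One detail worth illustrating: operator [] inserts a default-constructed value when the key is absent, whereas find() never modifies the map. The sketch below reuses the slide's city data; lookup, sampleTemps and the fallback parameter are hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>

// Build the small temperature map used in the slide.
std::map<std::string, double> sampleTemps() {
    std::map<std::string, double> t;
    t["Austin"] = 93.2;
    t.insert(std::make_pair(std::string("Chicago"), 84.6));
    t["Chicago"] = 86.8;  // overwrites the value inserted above
    return t;
}

// Look up a city without inserting; return fallback if the key is absent.
double lookup(const std::map<std::string, double>& m,
              const std::string& city, double fallback) {
    std::map<std::string, double>::const_iterator k = m.find(city);
    return k != m.end() ? k->second : fallback;
}
```

Reading a missing key through [] instead, for example t["Dallas"], would silently add an entry with value 0.0.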
Data Insertion to STL Containers == Copy list::push_back(objA) map::insert(objB) What is actually saved in a container is a copy, not the original object. Thus, each insertion incurs an object copy. If a data type is expensive to copy, it is sometimes better to save pointers to the objects instead.
empty() vs. size() == 0 There are two ways to check whether a container is empty: empty() always takes constant time; checking size() == 0 may take linear time for some containers (for example, list in pre-C++11 libraries)!
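The copy semantics can be made visible with a type that counts its copy constructions. This is a sketch under assumptions: Tracked, Outcome and the two functions are hypothetical, and the exact number of copies a container makes can vary by implementation.

```cpp
#include <list>

// Hypothetical element type that counts how often it is copied.
struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

struct Outcome {
    int copies;         // copy constructions caused by the insertion
    int originalValue;  // value of the original object afterwards
};

// Store the object itself: the container holds an independent copy.
Outcome storeByValue() {
    Tracked a(1);
    Tracked::copies = 0;
    std::list<Tracked> objs;
    objs.push_back(a);
    objs.front().value = 99;  // changes the copy, not a
    Outcome o;
    o.copies = Tracked::copies;
    o.originalValue = a.value;
    return o;
}

// Store a pointer: no copy is made, and edits reach the original.
Outcome storeByPointer() {
    Tracked a(1);
    Tracked::copies = 0;
    std::list<Tracked*> ptrs;
    ptrs.push_back(&a);
    ptrs.front()->value = 99;  // changes a itself
    Outcome o;
    o.copies = Tracked::copies;
    o.originalValue = a.value;
    return o;
}
```

When pointers are stored, the container does not own the objects, so each object must outlive every pointer to it that is kept in the container.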
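Container Adaptors: stack, queue, priority_queue. Each adaptor takes the element type first and the underlying container second. stack: stack< T, vector<T> >, stack< T, list<T> >. queue: queue< T, deque<T> >. priority_queue: priority_queue< int, vector<int>, less<int> >.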
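A short sketch of the three adaptors with int elements; the function names are illustrative only.

```cpp
#include <functional>
#include <queue>
#include <stack>
#include <vector>

// stack: last in, first out (here built on a vector).
int stackTop() {
    std::stack<int, std::vector<int> > s;
    s.push(1); s.push(2); s.push(3);
    return s.top();
}

// queue: first in, first out (deque is the default underlying container).
int queueFront() {
    std::queue<int> q;
    q.push(1); q.push(2); q.push(3);
    return q.front();
}

// priority_queue with less<int>: the largest element is on top.
int priorityTop() {
    std::priority_queue<int, std::vector<int>, std::less<int> > pq;
    pq.push(2); pq.push(9); pq.push(4);
    return pq.top();
}
```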
STL Algorithms Many generic algorithms for_each() find() count(), count_if() copy(), swap(), replace(), remove(), merge() sort(), nth_element() min(), max() partition() … …
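A few of these algorithms applied to a vector of ints, as a minimal sketch. Summary, isEven and summarize are hypothetical names; min_element() and max_element() are used here as the range counterparts of min() and max(), and the input is assumed to be non-empty.

```cpp
#include <algorithm>
#include <vector>

bool isEven(int x) { return x % 2 == 0; }

struct Summary {
    long evens;
    int smallest;
    int largest;
    int median;
    bool hasSeven;
};

// Takes the vector by value because nth_element() reorders it.
Summary summarize(std::vector<int> v) {
    Summary s;
    s.evens = std::count_if(v.begin(), v.end(), isEven);
    s.smallest = *std::min_element(v.begin(), v.end());
    s.largest = *std::max_element(v.begin(), v.end());
    s.hasSeven = std::find(v.begin(), v.end(), 7) != v.end();
    // Partially sort so the middle element is the one a full sort would put there.
    std::nth_element(v.begin(), v.begin() + v.size() / 2, v.end());
    s.median = v[v.size() / 2];
    return s;
}
```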
Algorithm Example remove()
#include <vector>
#include <algorithm>
#include <iostream>
… …
std::vector<int> v;
v.reserve(10);
for ( int j = 1; j <= 10; j++ ) {
    v.push_back(j);
}
std::cout << v.size();               // prints 10
v[3] = v[5] = v[9] = 99;
std::remove( v.begin(), v.end(), 99 );  // "remove" all elements == 99
std::cout << v.size();               // still prints 10!
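The Myth of remove()
At beginning:       1 2 3 4 5 6 7 8 9 10
After change data:  1 2 3 99 5 99 7 8 9 99
After remove():     1 2 3 5 7 8 9 | 8 9 99
remove() returns an iterator to the new logical end (the position marked |). v.end() is unchanged, and the elements after the mark are leftovers whose values should not be relied on.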
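The returned iterator can be inspected directly. The sketch below repeats the slide's data; RemoveResult and removeNinetyNines are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct RemoveResult {
    std::size_t sizeAfter;   // v.size() after remove()
    std::vector<int> kept;   // elements in front of the returned iterator
};

RemoveResult removeNinetyNines() {
    std::vector<int> v;
    for (int j = 1; j <= 10; ++j) {
        v.push_back(j);
    }
    v[3] = v[5] = v[9] = 99;
    // remove() shifts the kept elements forward and returns the new logical end.
    std::vector<int>::iterator newEnd = std::remove(v.begin(), v.end(), 99);
    RemoveResult r;
    r.sizeAfter = v.size();                         // the container itself did not shrink
    r.kept = std::vector<int>(v.begin(), newEnd);   // only the surviving elements
    return r;
}
```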
Make remove() More Meaningful
#include <vector>
#include <algorithm>
#include <iostream>
… …
std::vector<int> v;
v.reserve(10);
for ( int j = 1; j <= 10; j++ ) {
    v.push_back(j);
}
std::cout << v.size();               // prints 10
v[3] = v[5] = v[9] = 99;
v.erase( std::remove( v.begin(), v.end(), 99 ), v.end() );  // erase from the new logical end to the old end
std::cout << v.size();               // now prints 7!
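The same erase-plus-remove pattern works with remove_if() when elements should be dropped by a condition rather than by value. A minimal sketch; isNegative and dropNegatives are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

bool isNegative(int x) { return x < 0; }

// Return a copy of v with every negative element erased.
std::vector<int> dropNegatives(std::vector<int> v) {
    v.erase(std::remove_if(v.begin(), v.end(), isNegative), v.end());
    return v;
}
```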
What Else for STL? Many other features Internal implementation Suggestion: Buy an STL book Use it as a dictionary
Something About C++ A nice book Effective C++, Scott Meyers, Addison-Wesley, 1998
Data Organization in Declaring Class
// Member types interleaved
class node {
private:
    double rat;
    long x, y;
    double cap;
    long ID;
    node* parent;
    … …
};
// Members of the same type grouped together
class node {
private:
    long x, y, ID;
    double rat, cap;
    node* parent;
    … …
};
// Put member variables of the same type together; this can reduce
// alignment padding and so improve memory efficiency
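How much grouping helps depends on the member types and the platform. The sketch below uses hypothetical structs with char and double members, where the padding effect is typically easy to see; it is not the slides' node class, and the actual sizes are compiler and platform dependent.

```cpp
#include <cstddef>

// Small members separated by a large one: padding is usually needed twice.
struct Interleaved {
    char flagA;
    double value;
    char flagB;
};

// Large member first, small members together: they can share one padded slot.
struct Grouped {
    double value;
    char flagA;
    char flagB;
};

// Bytes saved by grouping on the current platform (0 if both are padded alike).
std::size_t bytesSaved() {
    return sizeof(Interleaved) - sizeof(Grouped);
}
```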
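Prefer Initialization to Assignment in Constructors
class node {
private:
    const int ID;
    long x, y;
    node* parent;
public:
    node( int j, long a, long b );
};
// Preferred: initialize the members in the initialization list
node::node( int j, long a, long b ) : ID(j), x(a), y(b) { }
// Alternative (less preferred): initialize ID, then assign x and y in the body
node::node( int j, long a, long b ) : ID(j) { x = a; y = b; }
const or reference member variables have to be initialized in the initialization list. Members of class type are always initialized (default-constructed) before the constructor body runs, even if you assign to them later, so assignment does the work twice. List members in an initialization list in the same order in which they are declared, because that is the order in which they are initialized.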
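The ordering rule matters when one member is initialized from another. In this sketch the hypothetical Segment class declares end_ last, so it can safely be computed from the two members declared before it.

```cpp
// Hypothetical class: end_ depends on begin_ and length_.
class Segment {
    long begin_;
    long length_;
    long end_;  // declared last, so it is initialized last
public:
    // The initialization list follows the declaration order.
    Segment(long b, long len) : begin_(b), length_(len), end_(begin_ + length_) {}
    long end() const { return end_; }
};
```

If end_ were declared before begin_, it would be initialized first and read begin_ and length_ before they had values, regardless of how the list is written.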
Inline Function
// Ordinary member function, defined outside the class
class node {
    … …
    long getID();
};
long node::getID() { return ID; }
// Inline member function, defined inside the class
class node {
    … …
    inline long getID() { return ID; }
};
An inline function usually runs faster, but compiling time may be longer.
Minimize Compiling Dependencies
// This is file A.h; class typeB is declared in file B.h
#include "B.h"
class typeA {
    … …
    typeB* memberB;
    inline void doIt() { memberB->doSome(); }
};
// A.h depends on B.h: any change to B.h forces every file that
// includes A.h to be recompiled, thus compiling time is long

// This is file A.h; class typeB is declared in file B.h
class typeB;             // forward declaration is enough for a pointer member
class typeA {
    … …
    typeB* memberB;
    void doIt();         // defined in A.cpp, which includes B.h
};
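The forward-declaration layout can be sketched in a single listing, with comments marking where the file boundaries would be. The names TypeA, TypeB, run and the return value 42 are hypothetical.

```cpp
// ---- would be A.h: no #include of B.h is needed ----
class TypeB;  // forward declaration; enough because TypeA only stores a pointer

class TypeA {
public:
    explicit TypeA(TypeB* b) : memberB(b) {}
    int run();  // declared here, defined in A.cpp
private:
    TypeB* memberB;
};

// ---- would be B.h ----
class TypeB {
public:
    int doSome() { return 42; }
};

// ---- would be A.cpp: includes both A.h and B.h ----
int TypeA::run() {
    return memberB->doSome();  // the full definition of TypeB is needed only here
}
```

With this split, a change to B.h forces A.cpp to recompile but not the other files that include only A.h.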
Prefer Pass-by-Reference to Pass-by-Value
class tree {
    … …
    node root;
    node getRoot();      // returns a copy of root
};
// For a complex data type, returning by value takes time on copying
class tree {
    … …
    node root;
    node& getRoot();     // returns a reference, no copy
};
// Passing a reference is faster, but sometimes you have to pass by value
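The copying cost can be counted explicitly. This sketch assumes a hypothetical Payload type in place of node and returns a const reference so callers cannot modify the root; the number of copies in the by-value case may vary with compiler optimizations.

```cpp
// Hypothetical element type that counts its copy constructions.
struct Payload {
    static int copies;
    int value;
    explicit Payload(int v) : value(v) {}
    Payload(const Payload& other) : value(other.value) { ++copies; }
};
int Payload::copies = 0;

class Tree {
    Payload root;
public:
    explicit Tree(int v) : root(v) {}
    Payload getRootByValue() const { return root; }       // copies root
    const Payload& getRootByRef() const { return root; }  // no copy
};

int copiesByValue() {
    Tree t(7);
    Payload::copies = 0;
    Payload r = t.getRootByValue();
    (void)r;
    return Payload::copies;
}

int copiesByRef() {
    Tree t(7);
    Payload::copies = 0;
    const Payload& r = t.getRootByRef();
    (void)r;
    return Payload::copies;
}
```

Returning a reference is safe here because root lives as long as the Tree object does.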
Never Return Reference to Local Variable
node& someClass::doSomething() {
    … …
    node localNode;
    return localNode;    // This causes trouble, since the local variable
                         // is discarded at the exit of the function
}
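When the result is built inside the function, returning it by value is the safe form. A minimal sketch with a hypothetical Node struct and makeNode function:

```cpp
// Hypothetical stand-in for the slides' node class.
struct Node {
    long id;
};

// Return by value: the caller receives its own object, so nothing dangles.
Node makeNode(long id) {
    Node localNode;
    localNode.id = id;
    return localNode;  // copied (or constructed in place) into the caller's object
}
```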