Basic Data Structures
A key part of most problems is identifying a basic structure that fits the problem. Sometimes the key to solving a problem is just knowing the right structure to use! These are some basic data structures, ones you should already be familiar with. Later, we will talk about other structures that you might be less familiar with. We will skip more general graph structures for now.
Basic Data TYPES
  bool
  char
  short (short int)
  int
  long (long int)
  long long (long long int)
  float
  double
  long double
  string (provided via library)
Data structures (and the C++ implementation)
  Static arrays     int a[10];
  Dynamic arrays    vector<int>
  Linked list       list<int>
  Stack             stack<int>
  Queue             queue<int>
  Priority Queue    priority_queue<int>
  Set               set<int>
  Map               map<int, int>
Array vs. Vector vs. List
  Many (most?) problems require storing data in one of these formats
  An array is generally faster and quicker to write, if you know the size in advance
  A list does not work with the generic sort the way an array/vector does (sort needs random access); it has its own member function, sort(), instead
  A list can insert and erase in the middle more efficiently than a vector, and especially an array
    Matters if you are doing a lot of insertion/deletion
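To make the sorting difference concrete, here is a minimal sketch. The helper names sortedVector and sortedList are made up for illustration. The generic sort needs random-access iterators, which a list does not provide, so the list version calls the member function instead.

```cpp
#include <algorithm>
#include <list>
#include <vector>
using namespace std;

// Hypothetical helper: sort a copy of a vector with the generic algorithm.
vector<int> sortedVector(vector<int> v) {
    sort(v.begin(), v.end());
    return v;
}

// Hypothetical helper: sort(l.begin(), l.end()) would not compile for a list,
// so use the list's own member function.
list<int> sortedList(list<int> l) {
    l.sort();
    return l;
}
```

Both helpers take their argument by value, so the caller's container is left unchanged.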
Arrays: sort and search
  Sort: don’t bother writing your own sort in most cases
    Unless the sort comparison is tricky or the sort needs to be optimized somehow
  Sort (in C++):
    for a vector: sort(a.begin(), a.end())
    for a static array of size n: sort(a, a+n)
  Search (in C++) for an UNSORTED array: gives an iterator (or pointer) to the array element, not the index
    for a vector: find(a.begin(), a.end(), x)
    for a static array: find(a, a+n, x)
    Subtract a.begin() (or a) from the result to get the index
  Search (in C++) for a SORTED array (implements binary search):
    for a vector: lower_bound(a.begin(), a.end(), x)
    for a static array: lower_bound(a, a+n, x)
  Note: lower_bound gives an iterator to the first element not less than x, so you can check whether x exists by testing that the result is not the end and that the element equals x
  Note: binary_search just returns true/false, not an index
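The search calls above return iterators, so turning them into an index takes one extra step. A minimal sketch, where indexOfUnsorted and indexOfSorted are hypothetical helper names and -1 is an assumed "not found" convention:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>
using namespace std;

// Linear search in an unsorted vector: index of the first x, or -1.
int indexOfUnsorted(const vector<int>& a, int x) {
    auto it = find(a.begin(), a.end(), x);
    if (it == a.end()) return -1;
    return static_cast<int>(it - a.begin());  // iterator -> index
}

// Binary search in a sorted vector: index of the first x, or -1.
int indexOfSorted(const vector<int>& a, int x) {
    assert(is_sorted(a.begin(), a.end()));  // lower_bound requires sorted input
    auto it = lower_bound(a.begin(), a.end(), x);
    // lower_bound only finds the first element >= x; check it really is x.
    if (it == a.end() || *it != x) return -1;
    return static_cast<int>(it - a.begin());
}
```

For a static array the same pattern works with pointers: lower_bound(a, a+n, x) - a gives the index.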
Stacks / Queues
  LIFO / FIFO
  Stacks:
    Simulate recursion/function calls
    Good for matching: match parentheses, for instance
    Good for processing in reverse order when you don’t have a simple loop
  Queues:
    Good for processing things in order when you don’t have a simple loop
  When we get to graphs: stacks are needed for DFS, queues for BFS
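The matching use of a stack can be sketched as follows. The function name isBalanced and the choice of three bracket types are assumptions for illustration; other characters are simply ignored.

```cpp
#include <stack>
#include <string>
using namespace std;

// Hypothetical helper: true if every bracket in s is closed by the right
// kind of bracket, in the right order.
bool isBalanced(const string& s) {
    stack<char> open;  // brackets opened but not yet closed
    for (char c : s) {
        if (c == '(' || c == '[' || c == '{') {
            open.push(c);
        } else if (c == ')' || c == ']' || c == '}') {
            if (open.empty()) return false;  // closer with nothing to match
            char o = open.top();
            open.pop();
            if ((c == ')' && o != '(') ||
                (c == ']' && o != '[') ||
                (c == '}' && o != '{')) return false;  // wrong kind of bracket
        }
    }
    return open.empty();  // leftovers mean something was never closed
}
```

The most recently opened bracket must be the first one closed, which is exactly the LIFO order a stack provides.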
Priority Queue
  Implemented using a heap, underneath
  Use when you want items by priority without having to sort everything
    Good when new items will come in over time, rather than all being read at the beginning
  Many greedy algorithms can use this
    To prioritize the next thing to do
  Graphs: use for shortest path, MST
  Basic operations: empty, push, pop, top (in C++, top reads the highest-priority element; pop removes it without returning it)
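As one illustration of a greedy algorithm where new items arrive over time, consider repeatedly merging the two smallest piles, where each merge costs the sum of the two. This problem and the name minMergeCost are assumptions chosen for the sketch, not something from the slides. Note that priority_queue is a max-heap by default, so greater is passed to get the smallest element on top.

```cpp
#include <functional>
#include <queue>
#include <vector>
using namespace std;

// Hypothetical example: total cost of merging all piles into one, always
// merging the two smallest piles first.
long long minMergeCost(const vector<long long>& piles) {
    // Min-heap: greater<> reverses the default max-heap ordering.
    priority_queue<long long, vector<long long>, greater<long long>> pq;
    for (long long p : piles) pq.push(p);

    long long total = 0;
    while (pq.size() > 1) {
        long long a = pq.top(); pq.pop();  // top reads, pop removes
        long long b = pq.top(); pq.pop();
        total += a + b;
        pq.push(a + b);  // the merged pile re-enters the queue
    }
    return total;
}
```

For piles 4, 3, 2, 6 the merges cost 5, 9, and 15, for a total of 29. Sorting once up front would not be enough here, because each merged pile has to be placed back among the remaining ones.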
Sets
  Keep track of distinct elements
  The set type in C++ supports this, but sets can be implemented other ways
    The default implementation is a binary search tree
  C++ algorithms:
    set_union
    set_intersection
    set_difference
    set_symmetric_difference
  Any sorted ranges can use these algorithms, not just sets!
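A minimal sketch of the set algorithms, once on set<int> and once on plain sorted vectors. The helper names unionOf and intersectionOf are hypothetical; the output is collected with back_inserter.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>
using namespace std;

// Hypothetical helper: union of two sets, returned as a sorted vector.
vector<int> unionOf(const set<int>& a, const set<int>& b) {
    vector<int> out;
    set_union(a.begin(), a.end(), b.begin(), b.end(), back_inserter(out));
    return out;
}

// Hypothetical helper: the same family of algorithms on sorted vectors.
vector<int> intersectionOf(const vector<int>& a, const vector<int>& b) {
    // The algorithms assume both inputs are sorted.
    assert(is_sorted(a.begin(), a.end()) && is_sorted(b.begin(), b.end()));
    vector<int> out;
    set_intersection(a.begin(), a.end(), b.begin(), b.end(), back_inserter(out));
    return out;
}
```

Iterating a set<int> already yields its elements in sorted order, which is why it can be passed to these algorithms directly.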
A simple set representation
  For small sets of items (n <= 30), labeled with indices 0 to n-1
  Represent the set as an integer, where bit i indicates whether element i is in the set
  Bit for element i: (1 << i), where i is how many places to shift over (i.e. the element number)
  Combine (or union) with bitwise or: |
    Set of {5, 17, 8}: int x = (1 << 5) | (1 << 17) | (1 << 8);
  Intersection with bitwise and: &
  All elements: (1 << n) - 1
  Complement: ~x & ((1 << n) - 1)
  Element i in set?: x & (1 << i) (nonzero if it is)
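The bit operations above can be wrapped in small helpers. This is only a sketch: the function names are made up, and it assumes int is at least 32 bits, which is what makes the n <= 30 limit safe.

```cpp
#include <cassert>

// Hypothetical helpers; a set over elements 0..n-1 (n <= 30) is one int.

// The set containing only element i.
int singleton(int i) { return 1 << i; }

// The set of all n elements: bits 0..n-1 are set.
int fullSet(int n) {
    assert(0 <= n && n <= 30);  // keeps 1 << n inside a 32-bit signed int
    return (1 << n) - 1;
}

// Everything among the n elements that is NOT in x.
int complementOf(int x, int n) { return ~x & fullSet(n); }

// Is element i in the set x?
bool contains(int x, int i) { return (x & (1 << i)) != 0; }

// Number of elements in the set x.
int countElements(int x) {
    int count = 0;
    for (; x != 0; x &= x - 1) ++count;  // each step clears the lowest set bit
    return count;
}
```

With these, the set {5, 17, 8} is singleton(5) | singleton(17) | singleton(8), and union and intersection remain plain | and & on the integers.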
Maps
  Key-value pairs
  Useful in some dynamic programming
  Implemented as a binary search tree (sorted on key)
  Can use the bracket operator with a key
    Side effect: if the key does not exist, it is created (with a default value); use find if you want to avoid this
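The bracket operator's side effect is sometimes helpful and sometimes not. A small sketch of both cases, with hypothetical helper names countOccurrences and countOf:

```cpp
#include <map>
#include <vector>
using namespace std;

// Helpful side effect: c[x] creates a zero entry the first time x is seen,
// so counting needs no "is it there yet?" check.
map<int, int> countOccurrences(const vector<int>& xs) {
    map<int, int> c;
    for (int x : xs) ++c[x];
    return c;
}

// Lookup that does not insert: use find and compare against end().
int countOf(const map<int, int>& counts, int key) {
    auto it = counts.find(key);
    return it == counts.end() ? 0 : it->second;
}
```

Calling countOf for a missing key leaves the map's size unchanged, whereas reading counts[key] on a non-const map would add an entry with value 0.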
Defining a comparison operator
  class mycomp {
  public:
      bool operator() (const TYPE& lhs, const TYPE& rhs) const {
          // return true if lhs is "less than" rhs, false otherwise
      }
  };
  Can be especially useful for priority queues, or places where different comparisons are needed
  Can specify this when defining a priority queue, sort, lower_bound, set, map, etc.
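Here is one way such a comparison class might be used with a priority queue. The Job type, the LaterDeadline comparator, and byEarliestDeadline are all hypothetical names for this sketch. Because a priority queue puts the "greatest" element on top, the comparator treats a later deadline as "less than" so that the earliest deadline comes out first.

```cpp
#include <algorithm>
#include <queue>
#include <string>
#include <vector>
using namespace std;

// Hypothetical record type for illustration.
struct Job {
    string name;
    int deadline;
};

// lhs is "less than" rhs (lower priority) when its deadline is later.
class LaterDeadline {
public:
    bool operator()(const Job& lhs, const Job& rhs) const {
        return lhs.deadline > rhs.deadline;
    }
};

// Names of the jobs in order of earliest deadline first.
vector<string> byEarliestDeadline(const vector<Job>& jobs) {
    priority_queue<Job, vector<Job>, LaterDeadline> pq;
    for (const Job& j : jobs) pq.push(j);

    vector<string> order;
    while (!pq.empty()) {
        order.push_back(pq.top().name);
        pq.pop();
    }
    return order;
}
```

The same class can be passed to sort as a third argument, sort(jobs.begin(), jobs.end(), LaterDeadline()), which then places the latest deadline first, since sort arranges elements from "least" to "greatest".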
Augmenting Data Structures
  Can often get more functionality by augmenting a data structure and writing your own code.
  Example: augment a binary search tree so each node stores the size of its subtree
    Increase sizes on the path when adding, decrease sizes on the path when deleting
    Can quickly count how many items are less than a given element
      Add up the sizes of the left subtrees passed over on the path to the element (plus the node itself each time the path goes right)
    Can quickly find the i-th element in the tree
      Follow a path by comparing i against the sizes of the child subtrees
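The size-augmented tree can be sketched as below. This is a simplified illustration under stated assumptions: the SizedBST name and its layout are made up, the tree is not balanced (so "quickly" only holds when the tree stays shallow), deletion is omitted, and duplicates are sent to the right subtree.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical unbalanced BST whose nodes store their subtree size.
struct SizedBST {
    struct Node { int key, left, right, size; };  // left/right index nodes, -1 = none
    vector<Node> nodes;
    int root = -1;

    int sizeOf(int t) const { return t == -1 ? 0 : nodes[t].size; }

    void insert(int key) {
        int id = static_cast<int>(nodes.size());
        nodes.push_back(Node{key, -1, -1, 1});
        if (root == -1) { root = id; return; }
        int t = root;
        while (true) {
            ++nodes[t].size;  // the new key ends up somewhere below t
            int& child = key < nodes[t].key ? nodes[t].left : nodes[t].right;
            if (child == -1) { child = id; return; }
            t = child;
        }
    }

    // Number of stored keys strictly less than x.
    int countLess(int x) const {
        int count = 0;
        for (int t = root; t != -1; ) {
            if (nodes[t].key < x) {
                count += sizeOf(nodes[t].left) + 1;  // left subtree plus the node
                t = nodes[t].right;
            } else {
                t = nodes[t].left;
            }
        }
        return count;
    }

    // The k-th smallest key, counting from 0.
    int kth(int k) const {
        assert(0 <= k && k < sizeOf(root));
        int t = root;
        while (true) {
            int leftSize = sizeOf(nodes[t].left);
            if (k < leftSize) t = nodes[t].left;
            else if (k == leftSize) return nodes[t].key;
            else { k -= leftSize + 1; t = nodes[t].right; }
        }
    }
};
```

After inserting 50, 30, 70, 20, 40, 60, 80 in that order, countLess(50) is 3 and kth(3) is 50. Both queries walk a single root-to-leaf path, so their cost is proportional to the tree's height.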