Basic Data Structures
- A key part of most problems is identifying a basic data structure that fits the problem.
  - Sometimes the key to solving a problem is simply knowing the right structure to use!
- These are some basic data structures, ones you should already be familiar with.
  - Later, we will talk about other structures that you might be less familiar with.
  - We will skip more general graph structures for now.
Basic Data Types
- bool
- char
- short (short int)
- int
- long (long int)
- long long (long long int)
- Unsigned versions of the integer types
- float
- double
- long double
- string (provided via library)
Data structures (and the C++ implementation)

  Structure             C++ implementation
  Static array          int a[10];
  Dynamic array         vector<int>
  Linked list           list<int>
  Pair                  pair<int, int>
  Tuple                 tuple<int, int, int, int>
  Stack                 stack<int>
  Queue                 queue<int>
  Double-ended queue    deque<int>
  Priority queue        priority_queue<int>
  Set                   set<int>
  Unordered set         unordered_set<int>
  Map                   map<int, int>
  Unordered map         unordered_map<int, int>
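Array vs. Vector vs. List
- Many (most?) problems require storing data in one of these formats.
- An array is generally faster and quicker to write, if you know the size in advance.
- A list cannot be sorted with sort() the way an array or vector can, because sort() needs random access; use the member function list::sort() instead.
- A list can insert and erase in the middle more efficiently than a vector, and especially more efficiently than an array.
  - This matters if you are doing a lot of insertion/deletion.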
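To make the list trade-offs concrete, here is a minimal sketch. The helper names eraseEvens and sortedCopy are made up for this illustration and do not come from the slides.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Erase every even value from a linked list in a single pass.
// list::erase takes constant time per element and returns the
// iterator that follows the erased element.
void eraseEvens(std::list<int>& items) {
    for (auto it = items.begin(); it != items.end(); ) {
        if (*it % 2 == 0) {
            it = items.erase(it);
        } else {
            ++it;
        }
    }
}

// std::sort needs random-access iterators, which a list does not have,
// so the list is sorted with its own member function instead.
std::vector<int> sortedCopy(std::list<int> items) {
    items.sort();
    return std::vector<int>(items.begin(), items.end());
}
```

Doing the same erase loop on a vector also works, but each erase there shifts all later elements, which is where the list pulls ahead when deletions are frequent.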
Arrays: sort and search
- Sort: don't bother writing your own sort in most cases.
  - Exceptions: the sort comparison is tricky, or the sort needs to be optimized somehow.
- Sort (in C++):
  - for a vector: sort(a.begin(), a.end())
  - for a static array of size n: sort(a, a+n)
- Search (in C++) in an UNSORTED array: returns an iterator (or pointer) to the element, not its index; returns the end of the range if x is not found.
  - for a vector: find(a.begin(), a.end(), x)
  - for a static array: find(a, a+n, x)
- Search (in C++) in a SORTED array (implements binary search):
  - for a vector: lower_bound(a.begin(), a.end(), x)
  - for a static array: lower_bound(a, a+n, x)
- Note: lower_bound returns an iterator to the first element not less than x, so you can check whether x exists by testing that the result is not the end of the range and that it points to x.
- Note: upper_bound (returns an iterator to the first element greater than x) can also be used, but not for checking existence.
- Note: binary_search just returns true/false, not an index.
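The search calls above return iterators, so turning the result into an index or an existence check takes one more step. A minimal sketch follows; the function names are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Linear search in an unsorted vector: index of x, or -1 if x is absent.
int indexOfUnsorted(const std::vector<int>& a, int x) {
    auto it = std::find(a.begin(), a.end(), x);
    return it == a.end() ? -1 : static_cast<int>(it - a.begin());
}

// Binary search in a sorted vector: index of the first element equal to x,
// or -1 if x is absent. lower_bound alone only finds "first element >= x",
// so the equality check is what confirms existence.
int indexOfSorted(const std::vector<int>& a, int x) {
    assert(std::is_sorted(a.begin(), a.end()));
    auto it = std::lower_bound(a.begin(), a.end(), x);
    if (it == a.end() || *it != x) return -1;
    return static_cast<int>(it - a.begin());
}

// Number of copies of x in a sorted vector: the distance between
// lower_bound (first element >= x) and upper_bound (first element > x).
int countInSorted(const std::vector<int>& a, int x) {
    assert(std::is_sorted(a.begin(), a.end()));
    auto lo = std::lower_bound(a.begin(), a.end(), x);
    auto hi = std::upper_bound(a.begin(), a.end(), x);
    return static_cast<int>(hi - lo);
}
```

Subtracting a.begin() from the returned iterator is what converts it to an index; for a static array the same subtraction works with the array name.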
Pairs and Tuples
- Pair: a simple way of pairing 2 elements
  - pair<type1, type2> p;
  - p = make_pair(a, b);
  - p.first and p.second are used to access the elements
- Tuple: a simple way of grouping 3+ elements (C++11)
  - tuple<type1, type2, …, typen> t;
  - t = make_tuple(a, b, …, n);
  - get<0>(t), get<1>(t), etc. are used to access the elements
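One common use of pairs and tuples is returning several values from a function. The sketch below assumes two hypothetical helpers, minMax and splitTime, that are not part of the slides.

```cpp
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Returns (smallest, largest) of a non-empty vector as a pair.
std::pair<int, int> minMax(const std::vector<int>& v) {
    assert(!v.empty());
    std::pair<int, int> p = std::make_pair(v[0], v[0]);
    for (int x : v) {
        if (x < p.first) p.first = x;
        if (x > p.second) p.second = x;
    }
    return p;
}

// Splits a number of seconds into (hours, minutes, seconds) as a tuple.
std::tuple<int, int, int> splitTime(int totalSeconds) {
    return std::make_tuple(totalSeconds / 3600,
                           (totalSeconds / 60) % 60,
                           totalSeconds % 60);
}
```

A caller reads the pair with p.first and p.second, and the tuple with std::get<0>(t), std::get<1>(t), and std::get<2>(t). Pairs and tuples also compare lexicographically with <, which makes them convenient sort keys.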
Stacks / Queues / Deque
- A stack is LIFO, a queue is FIFO, and a deque supports both.
- Stacks:
  - Simulate recursion/function calls
  - Good for matching: match parentheses, for instance
  - Good for processing in reverse order when you don't have a simple loop
  - push(), top(), pop(), size(), empty()
- Queues:
  - Good for processing things in order when you don't have a simple loop
  - push(), front(), pop(), size(), empty()
- Deque:
  - Good when you need both LIFO and FIFO
  - push_back(), push_front(), back(), front(), pop_back(), pop_front(), size(), empty(), insert(), []
- When we get to graphs: stacks are needed for DFS, queues for BFS.
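The parenthesis-matching use of a stack mentioned above can be sketched as follows. The function name isBalanced and the choice of three bracket kinds are assumptions made for this example.

```cpp
#include <cassert>
#include <stack>
#include <string>

// Returns true if every (, [ and { in s is closed by the matching
// bracket in the right order. Other characters are ignored.
bool isBalanced(const std::string& s) {
    std::stack<char> open;  // brackets opened but not yet closed
    for (char c : s) {
        if (c == '(' || c == '[' || c == '{') {
            open.push(c);
        } else if (c == ')' || c == ']' || c == '}') {
            if (open.empty()) return false;  // closer with nothing open
            char o = open.top();
            open.pop();
            bool matches = (o == '(' && c == ')') ||
                           (o == '[' && c == ']') ||
                           (o == '{' && c == '}');
            if (!matches) return false;
        }
    }
    return open.empty();  // anything left open is unmatched
}
```

The most recently opened bracket must be the first one closed, which is exactly the LIFO order a stack provides.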
Priority Queue
- Implemented using a heap underneath
- Top is always the GREATEST element (unless a different comparison is used)
- Use when you want items by priority without having to re-sort as items are entered
  - Good when new items will come in over time, rather than all being read at the beginning
- Many greedy algorithms can use this
  - To prioritize the next thing to do
- Graphs: used for shortest path, MST
- Basic operations: empty(), push(), top(), pop()
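As one illustration of a greedy algorithm driven by a priority queue, the sketch below repeatedly merges the two smallest items, where each merge costs the sum of the two sizes. The problem and the name minMergeCost are chosen for this example only. Because the default priority_queue puts the greatest element on top, std::greater is supplied to get the smallest on top instead.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Total cost of merging all items into one, always merging the two
// smallest remaining items first. Returns 0 for fewer than two items.
long long minMergeCost(const std::vector<int>& sizes) {
    // Min-heap: std::greater reverses the default "greatest on top" order.
    std::priority_queue<long long, std::vector<long long>,
                        std::greater<long long>>
        pq(sizes.begin(), sizes.end());

    long long total = 0;
    while (pq.size() > 1) {
        long long a = pq.top(); pq.pop();
        long long b = pq.top(); pq.pop();
        total += a + b;
        pq.push(a + b);  // the merged item comes back in as new work
    }
    return total;
}
```

The merged item is pushed back while the loop is still running, which is the "new stuff comes in over time" situation where a heap beats sorting once up front.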
Sets / Unordered Sets
- Keep track of distinct elements
- set keeps its elements in sorted order; unordered_set has no defined order
- set is generally implemented as a balanced binary search tree underneath
- unordered_set is generally implemented as a hash table underneath
- Other set representations are possible, and sometimes better.
- C++ algorithms: set_union, set_intersection, set_difference, set_symmetric_difference
  - Any sorted ranges can use these algorithms, not just sets!
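A short sketch of both points: a set used to count distinct values, and the set algorithms applied to sorted vectors as well as to sets. The function names are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Number of distinct values in v: a set silently drops duplicates.
int countDistinct(const std::vector<int>& v) {
    std::set<int> s(v.begin(), v.end());
    return static_cast<int>(s.size());
}

// Intersection of two sorted vectors; no set object is needed,
// only sorted input ranges.
std::vector<int> intersectSorted(const std::vector<int>& a,
                                 const std::vector<int>& b) {
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    return out;
}

// Union of two sets; iterating a std::set already yields sorted order.
std::vector<int> unionOfSets(const std::set<int>& a, const std::set<int>& b) {
    std::vector<int> out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(out));
    return out;
}
```

These algorithms rely on sorted input, so an unordered_set cannot be passed to them directly; its elements would have to be copied out and sorted first.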
A simple set representation
- For small sets of items (n <= 32, 64, or 128, depending on the integer width), labeled with indices 0 to n-1.
- Represent the set as an integer (int or long long), where bit i indicates whether element i is in the set.
- Bit for element i: (1 << i), where i is how many places to shift over (i.e. the element number)
  - For a long long, shift 1LL rather than 1.
- Combine (union) with bitwise or: |
  - Set of {5, 17, 8}: int x = (1 << 5) | (1 << 17) | (1 << 8);
- Intersection with bitwise and: &
- All elements: (1 << n) - 1 (valid only when n is smaller than the bit width of the type)
- Complement: ~x & ((1 << n) - 1)
- Element i in set?: x & (1 << i) is nonzero
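The bit operations above can be wrapped in small helpers. This is a minimal sketch assuming a 64-bit unsigned integer as the set type; the alias Mask and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using Mask = std::uint64_t;  // room for elements 0..63

// Set containing only element i.
Mask singleton(int i) {
    assert(0 <= i && i < 64);
    return Mask(1) << i;
}

// Set of all elements 0..n-1. Shifting by the full width (n == 64)
// is undefined, so that case is handled separately.
Mask fullSet(int n) {
    assert(0 <= n && n <= 64);
    return n == 64 ? ~Mask(0) : (Mask(1) << n) - 1;
}

// Elements of 0..n-1 that are NOT in x.
Mask complementOf(Mask x, int n) { return ~x & fullSet(n); }

// Is element i in x?
bool contains(Mask x, int i) { return (x & singleton(i)) != 0; }

// Number of elements in x: clear the lowest set bit until none remain.
int cardinality(Mask x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}
```

Union and intersection stay as plain | and & on two Mask values, for example singleton(5) | singleton(17) | singleton(8) for the set {5, 17, 8}.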
Maps / Unordered Maps
- Key-value pairs
- Useful in some dynamic programming
- Implemented as a binary search tree sorted on key (map) or as a hash table (unordered_map)
- unordered_map is generally better, unless the keys are large (hashing can take longer than a comparison) or you need an ordering
- Can use the bracket operator with a key
  - Side effect: if the key does not exist, it is created; use find() if you want to avoid this
  - map<string, int> m;
  - m["this"] = 5;
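The bracket-operator side effect is easiest to see in code. In this sketch, countWords relies on the side effect on purpose, while lookupOrDefault uses find to avoid it; both names are made up for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Counts occurrences of each word. operator[] creates a missing key
// with value 0, which is exactly what the increment needs.
std::map<std::string, int> countWords(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];
    }
    return counts;
}

// Read-only lookup: find() never inserts, so the map stays unchanged
// when the key is missing.
int lookupOrDefault(const std::unordered_map<std::string, int>& m,
                    const std::string& key, int fallback) {
    auto it = m.find(key);
    return it == m.end() ? fallback : it->second;
}
```

Note that operator[] cannot even be called on a const map, because it may need to insert; find is the only option there.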
Defining a comparison operator

    class mycomp {
    public:
        bool operator()(const TYPE& lhs, const TYPE& rhs) const {
            // return true if lhs is "less than" rhs, false otherwise
        }
    };

    bool gtcmp(int a, int b) { return a > b; }  // compares ints with >

- Can be especially useful for priority queues, or places where different comparisons are needed
- Can specify this when defining a priority queue, sort, lower_bound, set, map, etc.
  - A class like mycomp can be given as a template argument (priority_queue, set, map); a function like gtcmp can be passed directly to sort or lower_bound.
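Here is a sketch of both comparator styles in use. The Task type, the LaterDeadline functor, and the helper functions are assumptions introduced for this example; only gtcmp mirrors the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<int, std::string> Task;  // (deadline, name)

// Functor comparator. A priority_queue puts the "greatest" element on
// top, so treating a later deadline as "less" brings the earliest
// deadline to the top.
struct LaterDeadline {
    bool operator()(const Task& lhs, const Task& rhs) const {
        return lhs.first > rhs.first;
    }
};

// Plain function comparator: orders ints with >, i.e. descending.
bool gtcmp(int a, int b) { return a > b; }

std::vector<int> sortDescending(std::vector<int> v) {
    std::sort(v.begin(), v.end(), gtcmp);
    return v;
}

// Name of the task with the earliest deadline, or "" if there are none.
std::string earliestTask(const std::vector<Task>& tasks) {
    std::priority_queue<Task, std::vector<Task>, LaterDeadline>
        pq(tasks.begin(), tasks.end());
    return pq.empty() ? std::string() : pq.top().second;
}
```

A comparator must return false when both arguments are equal (a strict ordering); using >= instead of > in gtcmp would break sort.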
Augmenting Data Structures
- Can often get more functionality by augmenting a data structure and writing your own code.
- Example: augment a binary search tree so each node also stores the size of its subtree
  - Increase the sizes on the path when adding; decrease the sizes on the path when deleting.
  - Can quickly count how many items are less than a given element
    - Add up the sizes of the opposite (left) subtrees on the path to the element, plus the node itself each time the path goes right
  - Can quickly find the i-th element in the tree
    - Follow a path by comparing i with the sizes of the child subtrees
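The size-augmented tree can be sketched as below. This is a simplified illustration, not a full implementation: the tree is not balanced (so operations are only fast when the tree happens to stay shallow), deletion is omitted, duplicates go to the right subtree, and all names are hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Node {
    int key;
    int size;  // number of nodes in the subtree rooted here
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k), size(1) {}
};

int subtreeSize(const Node* n) { return n ? n->size : 0; }

// Insert key, increasing the size of every node on the path down.
void insert(std::unique_ptr<Node>& root, int key) {
    std::unique_ptr<Node>* cur = &root;
    while (*cur) {
        ++(*cur)->size;
        cur = key < (*cur)->key ? &(*cur)->left : &(*cur)->right;
    }
    cur->reset(new Node(key));
}

// How many stored keys are strictly less than x.
int countLess(const Node* n, int x) {
    int count = 0;
    while (n) {
        if (n->key < x) {
            // Going right: the whole left subtree and this node are < x.
            count += subtreeSize(n->left.get()) + 1;
            n = n->right.get();
        } else {
            n = n->left.get();
        }
    }
    return count;
}

// The i-th smallest node (0-indexed), or nullptr if i is out of range.
const Node* kth(const Node* n, int i) {
    while (n) {
        int leftSize = subtreeSize(n->left.get());
        if (i < leftSize) {
            n = n->left.get();
        } else if (i == leftSize) {
            return n;
        } else {
            i -= leftSize + 1;  // skip the left subtree and this node
            n = n->right.get();
        }
    }
    return nullptr;
}
```

Both queries walk a single root-to-leaf path, so their cost is proportional to the tree height; a balanced tree would be needed to guarantee that the height stays logarithmic.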