2 - Arrays
Introducing Arrays
Declaring Array Variables, Creating Arrays, and Initializing Arrays
Ordered and Unordered Arrays
Common Operations: Insertion, Searching, Deletion
Search Methods: Linear Search, Binary Search
Big O Notation
Array The array is the most commonly used data storage structure. It’s built into most programming languages.
Introducing Arrays An array is a group of related data items that all have the same name and the same data type. Arrays can be of any data type we choose. Arrays are static in that they remain the same size throughout program execution. Each of the data items is known as an element of the array. Each element can be accessed individually.
Introducing Arrays An array’s data items are stored contiguously in memory. [Figure: an array of 10 elements of type double]
Array Declaration and Initialization
int numbers[ 5 ] ;
The name of this array is “numbers”. This declaration sets aside a chunk of memory that is big enough to hold 5 integers. For a local (non-static) array it does not initialize those memory locations to 0 or any other value; they contain garbage. An array may be initialized in its declaration, as in:
int numbers[ 5 ] = { 5, 2, 6, 9, 3 } ;
Accessing Array Elements Each element in an array has a subscript (index) associated with it. Subscripts are integers and always begin at zero. Values of individual elements can be accessed by indexing into the array. For example, result = numbers[ 2 ]; would assign the value of the third element, 6, to result.
Accessing Array Elements (cont.) A subscript can also be an expression that evaluates to an integer: numbers[ (a + b) * 2 ] ; Caution! It is a logical error when a subscript evaluates to a value that is out of range for the particular array. Some systems will handle an out-of-range error gracefully and some will not. C and C++ perform no bounds check on array subscripts, so an out-of-range access is undefined behavior.
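One way to guard against out-of-range subscripts is to test the index before using it. The following is a minimal sketch, not something taken from the slides: the helper name checkedGet and its parameters are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the slides): copies list[index] into `out`
// and returns true only when `index` is a valid subscript for an array
// holding `size` elements. Otherwise it returns false and leaves `out` alone.
bool checkedGet(const int list[], std::size_t size, long index, int& out)
{
    assert(list != nullptr);  // precondition: the caller passes a real array
    if (index < 0 || static_cast<std::size_t>(index) >= size)
        return false;         // out of range: refuse instead of reading stray memory
    out = list[index];
    return true;
}
```

With int numbers[ 5 ] = { 5, 2, 6, 9, 3 }, calling checkedGet(numbers, 5, 2, result) stores 6 in result and returns true, while an index of 5 or -1 returns false.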
Filling Large Arrays Since many arrays are quite large, using an array initialization can be impractical. Large arrays are often filled using a for loop. for ( i = 0; i < 100; i++ ) { values [ i ] = 0 ; } would set every element of the 100 element array “values” to 0.
Common Operations The common operations on arrays are searching, insertion, deletion, retrieval, and traversal. Searching, retrieval, and traversal are straightforward, but insertion and deletion can be time-consuming, because elements may need to be shifted to make room before an insertion and to close the gap after a deletion.
Insertion The insertion process is very fast if the list is unordered, requiring only a single step. This is true because a new item is always inserted in the first vacant cell in the array, and the algorithm knows this location because it knows how many items are already in the array. The new item is simply inserted in the next available space.
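The single-step insertion described above can be sketched as follows. This is only an illustration under stated assumptions: the function name insertUnordered, the count variable tracking how many cells are in use, and the capacity parameter are all hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: places `value` in the first vacant cell of an
// unordered array. `count` is the number of items already stored and
// `capacity` is the declared size of the array. Returns false when full.
bool insertUnordered(int list[], int& count, int capacity, int value)
{
    assert(count >= 0 && count <= capacity);
    if (count == capacity)
        return false;        // no vacant cell left
    list[count] = value;     // one step, regardless of how many items exist
    ++count;
    return true;
}
```

No loop appears in the function, which is why the cost does not depend on the number of items already stored.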
Deletion To delete an item, you must first find it. Implicit in the deletion algorithm is the assumption that holes (empty cells) are not allowed in the array. Therefore, after locating the specified item and deleting it, each subsequent item must be shifted one cell toward the front to fill in the hole. How long does it take? On average, a deletion requires searching through N/2 elements and then moving the remaining elements (an average of N/2 moves) to fill up the resulting hole. This is about N steps in all. The total is about N wherever the item sits: the later it is found, the more comparisons are needed but the fewer items remain to be moved.
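The find-then-shift procedure above might look like the sketch below. The name deleteItem and the count parameter are hypothetical, and the sketch removes only the first matching item.

```cpp
#include <cassert>

// Hypothetical helper: removes the first occurrence of `key` from the
// first `count` cells of `list`, shifting later items left so that no
// hole remains. Returns false if `key` is not present.
bool deleteItem(int list[], int& count, int key)
{
    assert(count >= 0);
    int pos = 0;
    while (pos < count && list[pos] != key)   // step 1: linear search
        ++pos;
    if (pos == count)
        return false;                          // not found, nothing to delete
    for (int k = pos; k < count - 1; ++k)      // step 2: close the hole
        list[k] = list[k + 1];
    --count;
    return true;
}
```

Deleting 6 from { 5, 2, 6, 9, 3 } takes three comparisons and two moves and leaves { 5, 2, 9, 3 } with a count of 4.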
SEARCHING TECHNIQUES
1. LINEAR (SEQUENTIAL) SEARCH
2. BINARY SEARCH
3. COMPLEXITY OF ALGORITHMS
SEARCHING TECHNIQUES
Searching means finding out whether a particular element is present in the list.
Two methods: linear search and binary search.
The method we use depends on how the elements of the list are organized:
– unordered list: linear search (simple, slow)
– ordered list: binary search (more complex, faster) or linear search
1. LINEAR (SEQUENTIAL) SEARCH How? – Proceeds by sequentially comparing the key with elements in the list – Continues until either we find a match or the end of the list is encountered. – If we find a match, the search terminates successfully by returning the index of the element – If the end of the list is encountered without a match, the search terminates unsuccessfully.
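1. LINEAR (SEQUENTIAL) SEARCH
#include <iostream>
using namespace std;

// Prints the position of the first element equal to `element`,
// or "not found" if the end of the list is reached without a match.
void lsearch(int list[], int n, int element)
{
    int i, flag = 0;
    for (i = 0; i < n; i++)
        if (list[i] == element) {
            cout << "found at position " << i;
            flag = 1;
            break;
        }
    if (flag == 0)
        cout << "not found";
}
Average time: O(n)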
2. BINARY SEARCH The list must be sorted. We compare the key with the element placed approximately in the middle of the list. If a match is found, the search terminates successfully. Otherwise, we continue the search for the key in the same manner in either the upper half or the lower half.
Binary Search, cont.
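#include <iostream>
using namespace std;

// Iterative binary search; list must be sorted in ascending order.
void bsearch(int list[], int n, int element)
{
    int l = 0, u = n - 1, m, flag = 0;   // l and u bound the part still to search
    while (l <= u) {
        m = l + (u - l) / 2;             // midpoint, written so that l + u cannot overflow
        if (list[m] == element) {
            cout << "found: " << m;
            flag = 1;
            break;
        }
        else if (list[m] < element)
            l = m + 1;                   // key can only be in the upper half
        else
            u = m - 1;                   // key can only be in the lower half
    }
    if (flag == 0)
        cout << "not found";
}
Average time: O(log2 n)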
BINARY SEARCH: Recursion
// Returns the index of key in list[left..right] (sorted ascending), or -1 if absent.
int Search(int list[], int key, int left, int right)
{
    if (left <= right) {
        int middle = left + (right - left) / 2;   // avoids overflow of left + right
        if (key == list[middle])
            return middle;
        else if (key < list[middle])
            return Search(list, key, left, middle - 1);
        else
            return Search(list, key, middle + 1, right);
    }
    return -1;
}
Ordered Arrays
Major advantage: search times are much faster than in an unordered array.
Disadvantage: insertion takes longer because all data items with a higher key must be moved to make room.
Deletions are slow in both ordered and unordered arrays because items must be moved to fill the hole.
Other disadvantages of arrays:
– wasted memory space, because it is difficult to estimate the required size of the array in advance
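The cost of insertion into an ordered array comes from the shifting step. A minimal sketch, assuming ascending order and a fixed capacity; the name insertOrdered and its parameters are hypothetical:

```cpp
#include <cassert>

// Hypothetical helper: inserts `value` into an array kept in ascending
// order. Items with a higher key are moved one cell to the right to
// make room. Returns false when the array is already full.
bool insertOrdered(int list[], int& count, int capacity, int value)
{
    assert(count >= 0 && count <= capacity);
    if (count == capacity)
        return false;
    int pos = count;
    while (pos > 0 && list[pos - 1] > value) {
        list[pos] = list[pos - 1];   // shift a larger item right
        --pos;
    }
    list[pos] = value;               // drop the new item into the gap
    ++count;
    return true;
}
```

Inserting 1 into { 2, 5, 6, 9 } moves all four existing items, which is the O(N) worst case; inserting a value larger than every existing item moves none.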
Ordered Arrays An array is a suitable structure when a small number of insertions and deletions are required, but a lot of searching and retrieval is needed. An ordered array suits a database of company employees, for example, where hiring new employees and laying off existing ones would probably be infrequent compared with accessing or updating an existing employee’s record. A retail store inventory, on the other hand, would not be a good candidate for an ordered array, because frequent insertions and deletions, as items arrive and are sold, would run slowly.
COMPLEXITY OF ALGORITHMS In computer science, it is important to measure the quality of algorithms, especially the amount of a given resource an algorithm needs. Resources: time or memory storage. Different algorithms perform the same task with different sets of instructions, using more or less time, space, or effort than one another. It’s useful to have a shorthand way to say how efficient a computer algorithm is. In computer science, this rough measure is called “Big O” notation.
Big O Notation Big O notation is the most common way of quantifying an algorithm’s efficiency. More formally, it is a form of asymptotic notation. It provides a comparison that tells how an algorithm’s speed is related to the number of items. Time complexity is expressed in Big O notation: how much time does it take to process N data elements? Constant factors and lower-order details are ignored.
Big O Notation
It is generally written as follows.
Polynomial time algorithms:
– O(1) --- Constant time --- the time does not change in response to the size of the problem.
– O(n) --- Linear time --- the time grows linearly with the size (n) of the problem.
– O(n^2) --- Quadratic time --- the time grows quadratically with the size (n) of the problem.
In Big O notation, all polynomials with the same degree are equivalent, so O(3n^2 + 3n + 7) = O(n^2).
Sub-linear time algorithms:
– O(log n) --- Logarithmic time
Super-polynomial time algorithms:
– O(n!)
– O(2^n)
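3. COMPLEXITY OF ALGORITHMS
Example 1: complexity of an algorithm
#include <cstdio>
#include <iostream>
using namespace std;

void f(int a[], int n)
{
    int i;
    cout << "N = " << n;         // O(1)
    for (i = 0; i < n; i++)      // O(N): the body runs n times
        cout << a[i];
    printf("\n");                // O(1)
}
Total: 2 * O(1) + O(N) = O(N)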
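3. COMPLEXITY OF ALGORITHMS
Example 2: complexity of an algorithm
#include <cstdio>
#include <iostream>
using namespace std;

void f(int a[], int n)
{
    int i;
    cout << "N = " << n;               // O(1)
    for (i = 0; i < n; i++)            // O(N^2): the inner body runs n * n times
        for (int j = 0; j < n; j++)
            cout << a[i] << a[j];
    for (i = 0; i < n; i++)            // O(N): the body runs n times
        cout << a[i];
    printf("\n");                      // O(1)
}
Total: 2 * O(1) + O(N) + O(N^2) = O(N^2)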
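To see why the quadratic term dominates, the output statements of Example 2 can be replaced by a counter. This is an illustrative sketch only; the function name countLoopBodies is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: mirrors the loop structure of Example 2 but
// counts how many times the loop bodies run instead of printing.
long countLoopBodies(int n)
{
    assert(n >= 0);
    long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            ++count;               // nested loops: n * n executions
    for (int i = 0; i < n; i++)
        ++count;                   // single loop: n executions
    return count;
}
```

The result is n^2 + n: 110 for n = 10 and 10,100 for n = 100. As n grows, the n term contributes a shrinking share of the total, which is why the whole function is classed as O(N^2).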
Insertion in an Unordered Array: Constant Insertion into an unordered array doesn’t depend on how many items are in the array. The new item is always placed in the next available position, so insertion requires the same amount of time no matter how big N is. Hence the time, T, to insert an item into an unsorted array is a constant K: T = K
Linear Search: Proportional to N The number of comparisons needed to find a specified item is, on average, half of the total number of items. Thus, if N is the number of items, the search time T is proportional to half of N: T = K * N / 2
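The N / 2 average can be checked by counting comparisons directly. The sketch below assumes every item in the list is distinct and that each one is equally likely to be the search key; both function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: number of comparisons a linear search makes
// before it finds `key` (or reaches the end of the list).
int comparisonsLinear(const int list[], int n, int key)
{
    int comparisons = 0;
    for (int i = 0; i < n; ++i) {
        ++comparisons;
        if (list[i] == key)
            break;
    }
    return comparisons;
}

// Hypothetical helper: average comparisons when each stored item is
// searched for exactly once. Assumes all items are distinct.
double averageComparisons(const int list[], int n)
{
    assert(n > 0);
    long total = 0;
    for (int i = 0; i < n; ++i)
        total += comparisonsLinear(list, n, list[i]);
    return static_cast<double>(total) / n;
}
```

For 1,000 distinct items the average comes out as 500.5, that is (N + 1) / 2, which agrees with the N / 2 estimate once constant details are ignored.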
Comparisons in B-search
Number of Items      Comparisons in B-search
10                   4
100                  7
1,000                10
10,000               14
100,000              17
1,000,000            20
10,000,000           24
100,000,000          27
1,000,000,000        30
Comparisons in B-search Notice the differences between binary search times and linear search times (N/2): for 1,000 items, the numbers are 500 (for linear search) versus 10 (for binary), and for 1,000,000 items, they’re 500,000 versus 20. We can conclude that for all but very small arrays, the binary search is greatly superior. If s represents steps (number of comparisons) and n represents the number of items, then the equation is n = 2^s. That is, s = log2(n).
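The comparison counts quoted for binary search can be reproduced by repeatedly halving the range until nothing is left. A minimal sketch; the function name maxComparisonsBinary is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: the largest number of comparisons a binary
// search can need on n sorted items. Each comparison discards the
// middle item and leaves at most n / 2 items (integer division).
int maxComparisonsBinary(long n)
{
    assert(n >= 0);
    int steps = 0;
    while (n > 0) {
        n /= 2;      // at most half of the range survives each comparison
        ++steps;
    }
    return steps;
}
```

The function returns 10 for 1,000 items and 20 for 1,000,000 items, matching the figures above. Because the count is rounded up to a whole number of steps, it equals log2(n) only approximately.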
Binary Search: Proportional to log(N) Therefore, a formula relating T and N for a binary search is T = K * log2(N): the time is proportional to the base 2 logarithm of N. Because any logarithm is related to any other logarithm by a constant factor (multiply a base 10 logarithm by 3.322 to get the base 2 logarithm), we can lump this constant into K as well. Then we don’t need to specify the base: T = K * log(N)
Don’t Need the Constant Big O notation looks like the formulas just described, but it dispenses with the constant K. When comparing algorithms, you don’t really care about the particular microprocessor chip or compiler; all you want to compare is how T changes for different values of N, not what the actual numbers are. Therefore, the constant isn’t needed.
Big O notation Big O notation uses the uppercase letter O, which you can think of as meaning “order of.” In Big O notation, a linear search takes O(N) time, and a binary search takes O(log N) time. Insertion into an unordered array takes O(1), or constant time. (That’s the numeral 1 in the parentheses.)
Running Times in Big O Notation
Example of algorithm              Running Time in Big O Notation
Linear search                     O(N)
Binary search                     O(log N)
Insertion for unordered array     O(1)
Insertion for ordered array       O(N)
Deletion for unordered array      O(N)
Deletion for ordered array        O(N)
BIG O : O( ) [Figure: Big O running times plotted against the number of items N]
The figure on the previous slide graphs some of the Big O relationships between time and number of items. Based on this graph, we might rate the various Big O values (very subjectively) like this: O(1) is excellent, O(log N) is good, O(N) is fair, and O(N^2) is poor. O(N^2) occurs in the bubble sort and also in certain graph algorithms that we’ll see later.
Array Implementation of List ADT
Disadvantages:
– insertion and deletion are very slow, because elements of the list must be moved
– wasted memory space, because it is difficult to estimate the required size of the array