CSE 326: Data Structures Lecture #4 Alon Halevy Spring Quarter 2001
Agenda Today: –Finish complexity issues. –Linked links (Read Ch 3; skip “radix sort”)
Direct Proof of Recursive Fibonacci Recursive Fibonacci: int Fib(n) if (n == 0 or n == 1) return 1 else return Fib(n - 1) + Fib(n - 2) Lower bound analysis T(0), T(1) >= b T(n) >= T(n - 1) + T(n - 2) + c if n > 1 Analysis let be (1 + 5)/2 which satisfies 2 = + 1 show by induction on n that T(n) >= b n - 1
Direct Proof Continued Basis: T(0) b > b -1 and T(1) b = b 0 Inductive step: Assume T(m) b m - 1 for all m < n T(n) T(n - 1) + T(n - 2) + c b n-2 + b n-3 + c b n-3 ( + 1) + c = b n-3 2 + c b n-1
Fibonacci Call Tree
Learning from Analysis To avoid recursive calls –store all basis values in a table –each time you calculate an answer, store it in the table –before performing any calculation for a value n check if a valid answer for n is in the table if so, return it Memoization –a form of dynamic programming How much time does memoized version take?
Kinds of Analysis So far we have considered worst case analysis We may want to know how an algorithm performs “on average” Several distinct senses of “on average” –amortized average time per operation over a sequence of operations –average case average time over a random distribution of inputs –expected case average time for a randomized algorithm over different random seeds for any input
Amortized Analysis Consider any sequence of operations applied to a data structure –your worst enemy could choose the sequence! Some operations may be fast, others slow Goal: show that the average time per operation is still good
Stack ADT Stack operations –push –pop –is_empty Stack property: if x is on the stack before y is pushed, then x will be popped after y is popped What is biggest problem with an array implementation? A BCDEFBCDEF E D C B A F
Stretchy Stack Implementation int data[]; int maxsize; int top; Push(e){ if (top == maxsize){ temp = new int[2*maxsize]; copy data into temp; deallocate data; data = temp; } else { data[++top] = e; } Best case Push = O( ) Worst case Push = O( )
Stretchy Stack Amortized Analysis Consider sequence of n operations push(3); push(19); push(2); … What is the max number of stretches? What is the total time? –let’s say a regular push takes time a, and stretching an array containing k elements takes time kb, for some constants a and b. Amortized = (an+b(2n-1))/n = a+2b-(1/n)= O(1) log n
Average Case Analysis Attempt to capture the notion of “typical” performance Imagine inputs are drawn from some random distribution –Ideally this distribution is a mathematical model of the real world –In practice usually is much more simple – e.g., a uniform random distribution
Example: Find a Red Card Input: a deck of n cards, half red and half black Algorithm: turn over cards (from top of deck) one at a time until a red card is found. How many cards will be turned over? –Best case = –Worst case = –Average case: over all possible inputs (ways of shuffling deck)
Summary Asymptotic Analysis – scaling with size of input Upper bound O, Lower bound O(1) or O(log n) great O(2 n ) almost never okay Worst case most important – strong guarantee Other kinds of analysis sometimes useful: –amortized –average case
List ADT List properties –A i precedes A i+1 for 1 i < n –A i succeeds A i-1 for 1 < i n –Size 0 list is defined to be the empty list Key operations –Find(item) = position –Find_Kth(integer) = item –Insert(item, position) –Delete(position) –Next(position) = position What are some possible data structures? ( A 1 A 2 … A n-1 A n ) length = n
Implementations of Linked Lists abc (a b c) (optional header) L W1ISES Array: Linked list: HAY Can we apply binary search to an array representation?
Linked List vs. Array Find(item) = position Find_Kth(integer)=item Find_Kth(1)=item Insert(item, position) Insert(item) Delete(position) Next(position) = position linked listarraysorted array
Tradeoffs For what kinds of applications is a linked list best? Examples for an unsorted array? Examples for a sorted array?
Implementing in C++ Create separate classes for –Node –List (contains a pointer to the first node) –List Iterator (specifies a position in a list; basically, just a pointer to a node) Pro: syntactically distinguishes uses of node pointers Con: a lot of verbage! Also, is a position in a list really distinct from a list? abc (a b c) (optional header) L