nalysis of lgorithms A A Universidad Nacional de Colombia Facultad de Ingeniería Departamento de Sistemas
OUTLINE Getting started Insertion sort Merge sort
Insertion Sort INPUT: A sequence of numbers <a1,a2,a3,...,an> INCREMENTAL PARADIGM Solve for 1 Solve for 2: using the solution for 1 Solve for the next: using the previous solution Solve for n: using the solution for n-1
procedure insertion_sort( A[1..n] ) for j 2 to n do x A[j] i j-1 while x<A[i] and i>0 do A[i+1] A[i] i i-1 end-while A[i+1] x end-for end
j= 2 A x 4 3 5 2 1 3 i = j-1 = 1 i j 3 4 5 2 1 3 i = i-1 = 0 i j Exit while because i<0
j=3 A x 3 4 5 2 1 5 i = j-1 = 2 i j Exit while because x >A[i]
j=4 3 4 5 2 1 2 i = j-1 = 3 i j 3 4 5 5 1 2 i = i-1 = 2 i j
3 4 4 5 1 2 i = i-1 =1 j i 2 3 4 5 1 2 i = i-1 = 0 j i Exit while because i<0
j=5 2 3 4 5 1 1 i = j-1 = 4 i j 2 3 4 5 5 1 i = i-1 =3 i j
2 3 4 4 5 1 i i = i-1 =2 j 2 3 3 4 5 1 i j i = i-1 =2 Exit while because i<0 1 2 3 4 5 1 i j
Using a loop invariant to prove correctness of an algorithms A loop invariant is a statement about variables of the algorithms that help us to prove why an algorithms is correct. We must show three things about a loop invariant:
INITILIZATION The loop invariant is true prior to the first iteration of the loop. MAINTENANCE If it is true before an iteration, it remains true before the next iteration. TERMINATION When the loop terminates, the invariant gives an useful property that help us to show that the algorithms is correct ( we usualy use loop invatients that at termination imply the correctness of the algorithm).
Loop Invariant for Insertion-sort At the start of each iteration of the for loop the subarray A[1,..,i-1] consist of the elements originally in A[1,..,i-1] but in sorted order.
INITILIZATION Before the beginning of the loop j=2, then the subarray A[1,..,j-1] consist only of the A[1], which is in fact the original element A[1]. This subarray is sorted (trivially, of course).
Before the beginning of iteration j=2 MAINTENANCE Before the beginning of iteration j=2 ... x1 x2 x3 ... xn i j Before the beginning of iteration j=3 min{x1 , x2 } max{x1 , x2 } ... xn j
Before the beginning of iteration j=3 min{x1 , x2 } max{x1 , x2 } x3 ... xn i j Before the beginning of iteration j=3 xi1 xi2 xi3 ... xn {xi1 ,xi2 ,xi3 }= {x1 ,x2 ,x3 } j xi1< xi2 < xi3
Before the beginning of iteration j=k xi1 xi2 xi3 ... xik-1 xk ... xn i j Before the beginning of iteration j=k+1 xi1 xi2 xi3 ... xik-1 xik ... xn j {xi1 ,xi2 ,...,xik }= {x1 ,x2 ,...,x3 } xi1< xi2 <...< xi3
TERMINATION At termination j=n+1 the subarray A[1,..,j-1] consist of the elements in original array A[1,..,n] but in a sorted way. Therefore the algorithm is correct.
Time complexity cost times procedure insertion_sort( A[1..n] ) for j 2 to n do c1 n x A[j] c2 n-1 insert A[j] into A[1,..,j-1] c3=0 n-1 i j-1 c4 n-1 while x<A[i] and i>0 do c5 i=2,..,n tj A[i+1] A[i] c6 i=2,..,n (ti -1) i i-1 c7 i=2,..,n (ti -1) end-while A[i+1] x c8 n-1 end-for end
tj is the number questions ask in the while(x<A[j] …. time complexity t(n) t (n) = c1 n +(c2+c4+c8) (n-1) + c5 j=2,..,n tj + (c6+c7) j=2,..,n (tj -1) If ci= 1 for i = 1,…8 and i ≠ 3, c3 =0 t (n) = 2n-1 + 3 j=2,..,n tj tj is the number questions ask in the while(x<A[j] …. 1 ≤ tj ≤ i
Best case time complexity tb(n) In this case tj = 1 then tb(n) = 2n-1 + 3 j=2,..,n 1 = (n). tb(n) = 5n-4= (n). Worst case time complexity tw(n) In this case tj = j then tw(n) = 2n-1 + 3 j=2,..,n j = (n2). tw(n) = 3/2 n2 + 7/2 n – 4 = (n2).
Average case time complexity ta(n) The only random variable in t(n) is ti , lets study the case with {1,2,3} t2 t3 123 1 1 132 1 2 213 2 1 231 1 3 312 2 2 321 2 3 Then t2 =1,2 with prob. ½ E(t2) = 3/2 t3 =1,2,3 with prob. 1/3 E(t3) = 2 : tj =1,2,3,..,j with prob. 1/j E(ti) = tj=1,..,j tj (1/j) = (j+1)/2
Average case time complexity ta(n) ta(n) = 2n-1 + 3 j=2,..,n (j+1)/2 = (n2). ta(n) = 3/4 n2 + 17/4 n – 4 = (n2).
DIVIDE AND CONQUER PARADIGM Merge-Sort INPUT: A sequence of numbers <a1,a2,a3,...,an> DIVIDE AND CONQUER PARADIGM Divide: the problem into a number of sub-problems Conquer: the sub-problems by solving them recursively Combine: the solutions to the sub-problems into the solution for the original problem
MERGE-SORT DIVIDE: Divide the n-element sequence to be sorted into two subsequences of n/2 elements each CONQUER: sort the two subsequences recursively using merge sort COMBINE: merge the two sorted subsequences to produce the sorted answer
MERGE-SORT STEPS: on a problem of size 1 do nothing on a problem of size at least 2 Split the sequence into two halves Sort one half of the numbers Sort the second half of the numbers Merge the two sorted lists
MERGE n m n+m Given two lists to merge size n and m (First we study merge ) Given two lists to merge size n and m Maintain pointer to head of each list Move smaller element to output and advance pointer n m n+m
MERGE MERGE ( A, p,q, r ) n1 q-p+1 n2 r-q create arrays L[1..n1+1] and R[1.. n2+1] for i 1 to n1 do L[i] A[p+i-1] for j 1 to n2 do R[j] A[q+j] L[n1+1] R[n2+1] i 1 j 1 for k p to r do if L[i] R[j] then A[k] L[i] i i+1 else A[k] R[j] j j+1 end_if end_for end.
MERGE ( A, 5,8,11 ) 4 5 6 7 8 9 10 11 12 ... ... 1 2 7 9 3 5 8 k A 1 2 7 9 3 5 8 i j L R
4 5 6 7 8 9 10 11 12 ... ... 1 2 7 9 3 5 8 k A 1 2 7 9 3 5 8 i j L R
4 5 6 7 8 9 10 11 12 ... ... 1 2 7 9 3 5 8 k A 1 2 7 9 3 5 8 i j L R
4 5 6 7 8 9 10 11 12 ... ... 1 2 3 9 3 5 8 k A 1 2 7 9 3 5 8 i j L R
4 5 6 7 8 9 10 11 12 ... ... 1 2 3 5 3 5 8 k A 1 2 7 9 3 5 8 i j L R
4 5 6 7 8 9 10 11 12 ... ... 1 2 3 5 7 5 8 k A 1 2 7 9 3 5 8 i j L R
4 5 6 7 8 9 10 11 12 ... ... 1 2 3 5 7 8 8 k A 1 2 7 9 3 5 8 i j L R
4 5 6 7 8 9 10 11 12 ... ... 1 2 3 5 7 8 9 k A 1 2 7 9 3 5 8 i j L R
MERGE-Correcteness Loop Invariant At the start of each iteration of the for loop for k, the subarray A[p,..,k-1] consist of the k-p smallest elements of L[1,.., n1+1] and R[1,.., n2+1] in sorted order. Moreover L[i] and R[j] are the smallest of their arrays that have not been copied back into A.
INITILIZATION Before the beginning of the loop k=p, then the subarray A[p,..,k-1] is empty and consist of the k-p = 0 smallest elements of L[1,.., n1+1] and R[1,.., n2+1]. Since i=j=1 then L[1] and R[1] are the smallest of their arrays that have not been copied back into A.
MAINTENANCE Before the beginning of the l-th iteration of the loop k=p+l, then the subarray A[p,..,k-1] consist of the (k-p = l) smallest elements of L[1,.., n1+1] and R[1,.., n2+1] in sorted order and L[i] and R[j] are the smallest elements of their arrays that have not been copied back into A.
Let us first assume that L[i] R[j] then following the loop L[i] is copied to A[k =p+l] therefore before the beginning of the (l+1)th k=p+l+1 and A[p,..,k-1] = A[p,..,p+l] consist of the (l+1) smallest elements of L[1,.., n1+1] and R[1,.., n2+1] and L[i] and R[j] are the smallest elements of their arrays that have not been copied back into A.
Now suppose that R[j] < L[i] then following the loop R[i] is copied to A[k =p+l] therefore before the beginning of the (l+1)th k=p+l+1 and A[p,..,k-1] = A[p,..,p+l] consist of the (l+1) smallest elements of L[1,.., n1+1] and R[1,.., n2+1] and L[i] and R[j] are the smallest elements of their arrays that have not been copied back into A.
TERMINATION At termination k=r+1, then the subarray A[p,..,k-1]= A[p,..,r] is consist of the r smallest elements of L[1,.., n1+1] and R[1,.., n2+1] in sorted order. Since i= n1 +1 and j= n2 +1 then L[i] and R[j] are .
MERGE-SORT if p<r then q (p+r)/2 MERGE-SORT( A, p, q ) procedure MERGE-SORT( A, p, r ) if p<r then q (p+r)/2 MERGE-SORT( A, p, q ) MERGE-SORT( A, q+1, r ) MERGE ( A, p, q, r ) To sort call MERGE(A, 1, length[A]) with A = [a1,a2,a3,...,an ] MERGE(A, p, r): procedure that takes time (n), where n=r-p+1
Initial Sequence divide divide divide merge merge merge 5 2 4 6 1 3 2 6 divide 5 2 4 6 1 3 2 6 divide 5 2 4 6 1 3 2 6 divide 5 2 4 6 1 3 2 6 merge 2 5 4 6 1 3 2 6 merge 2 4 5 6 1 2 3 6 merge 1 2 2 3 4 5 6 6 Sorted Sequence
Initial Sequence divide divide divide merge merge merge 5 2 4 6 1 3 2 6 divide 5 2 4 6 1 3 2 6 divide 5 2 4 6 1 3 2 6 divide 5 2 4 6 1 3 2 6 merge 2 5 4 6 1 3 2 6 merge 2 4 5 6 1 2 3 6 merge 1 2 2 3 4 5 6 6 Sorted Sequence
Analyzing divide and conquer algorithms Time complexity Analyzing divide and conquer algorithms The time can often be described by recurrence equation fo the form (1), if n c, T(n) = aT(n/b)+D(n)+C(n) if n>c With a,b and c be nonnegative constants. If the problem is the small enough, say n c, then the solution takes constant time (1). If not the problem is divided in a subproblems with (1/b) size of the original. The division takes time D(n) and the combinations of sub-solutions takes time C(n).
a=2, b=2, c=1, D(n)=(1) and C(n)= (n) Analyzing MERGE-SORT In this case a=2, b=2, c=1, D(n)=(1) and C(n)= (n) then (1), if n 1, T(n) = 2T(n/2)+ (1)+ (n) if n>1 c if n 1, 2T(n/2)+ cn if n>1
Sorted Sequence merge merge merge Initial Sequence 1 2 2 3 4 5 6 6 1 2 2 3 4 5 6 6 2 4 5 6 1 2 3 6 2 5 2 6 1 3 4 6 5 2 4 6 1 3 merge merge merge Initial Sequence
The construction of the recursion tree Lets suppose n power of two n=2k c, if n 1, T(n) = 2T(n/2)+ cn if n>1 T(n) cn T(n/2) T(n/2)
cn cn/2 cn/2 T(n/4) T(n/4) T(n/4) T(n/4)
cn cn/2 cn/2 cn/4 cn/4 cn/4 cn/4 T(n/8) T(n/8) T(n/8) T(n/8) T(n/8) T(n/8) T(n/8) T(n/8)
cn cn cn/2 cn/2 cn cn/4 cn/4 cn/4 cn/4 cn cn/8 cn/8 cn/8 cn/8 cn/8 lg n+1 cn/8 cn/8 cn/8 cn/8 cn/8 cn/8 cn/8 cn/8 cn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . cn c c c c c c c c c c c c c c c c n Total: cn (lg n + 1) cn lg n + cn
Recursive substitution Lets assume that n is power of two, i.e., n=2k, k = lg n T(n) = 2T(n/2)+ cn = 2[2T(n/4)+ cn/2]+ cn = 4T(n/4)+ cn+ cn = 4[2T(n/8)+cn/4]+ cn+ cn= 8T(n/8)+cn+ cn+ cn . = 2kT(n/2k)+cn+ . . . +cn = = 2kT(1)+k(cn) = cn+cn lg n k times
Exact recursive recurrence In total including other operations let’s say each merge costs 3 per element output T(n)=T(n/2)+T(n/2)+3n for n2 T(1)=1 Can use this to figure out T for any value of n T(5)=T(3)+T(2)+3x5 =(T(2)+T(1)+3x3)+(T(1)+T(1)+3x2)+15 = ((T(1)+T(1)+6)+1+9)+(1+1+6)+15 = 8+10+8+15 = 41 “ceiling” round up “floor” round down