CS Main Questions Given that the computer is the Great Symbol Manipulator, there are three main questions in the field of computer science: What kinds.

CS Main Questions Given that the computer is the Great Symbol Manipulator, there are three main questions in the field of computer science: What kinds of symbol manipulations can be performed by a computer? [The Computability Question] How efficiently can a computer do these manipulations? [The Efficiency Question] How do we know that the computer is actually doing the manipulations intended? [The Correctness Question]

Algorithms Basic Goals Basic goals for an algorithm: always correct always terminates Performance  Performance often draws the line between what is possible and what is impossible.

Analysis: predict the cost of an algorithm in terms of resources and performance Design: design algorithms which minimize the cost Design and Analysis of Algorithms

Generic Random Access Machine (RAM) Executes operations sequentially Set of primitive operations:  Arithmetic. Logical, Comparisons, Function calls Simplifying assumption: all ops cost 1 unit  Eliminates dependence on the speed of our computer, otherwise impossible to verify and to compare A Machine Model

Sorting

Input: A sequence of n numbers a 1, …, a n Output: A reordering a 1 ’, …, a n ’, such that a 1 ’ < … < a n ’ 14326816 23141668

Insertion Sort INSERTION-SORT(A) // sort the array A for j = 2 to length(A) key = A[ j ] i = j – 1 while i > 0 and A[ i ] > key A[i+1] = A[ i ] i-- A[i+1] = key 14326816

Insertion Sort INSERTION-SORT(A) for j = 2 to length(A) key = A[j] i = j – 1 while i > 0 and A[i] > key A[i+1] = A[i] i-- A[i+1] = key 14326816 j

Insertion Sort INSERTION-SORT(A) for j = 2 to length(A) key = A[j] i = j – 1 while i > 0 and A[i] > key A[i+1] = A[i] i-- A[i+1] = key 23141668

Pseudo-Code Not really a program, just an outline Enough details to establish the correctness and running time Even pseudo-code is too complicated Note that for a simple algorithm it obscures what is going on A simpler version of Insertion Sort, using an auxiliary array:  Go over the numbers one-by-one, starting from the first, copy to new array  Each time copy to the correct place in the new array  In order to create empty space, shift the numbers that are larger than the current number one cell to the right Credits: Serj Plotkyn

The running time depends on the input: an already sorted sequence is easier to sort. Major Simplifying Convention: Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.  TA(n) = time of A on length n inputs Generally, we seek upper bounds on the running time, to have a guarantee of performance. Running time

Analysis of an algorithm Correctness: given a legal input, the algorithm terminates and produces the desired output Running Time: depends on input size, input properties  Worst case: max T(n), on any input of size n  Expected: E(T(n)), where inputs are taken from a distribution  Best case: min T(n) – not really interesting, except to show that an algorithm works well on input with certain properties

Correctness of Insertion Sort INSERTION-SORT(A) for j = 2 to length(A) key = A[j] i = j – 1 while i > 0 and A[i] > key A[i+1] = A[i] i-- A[i+1] = key Loop Invariant: At the start of each iteration of the for loop, A[1…j-1] consists of the original first j-1 elements, but sorted

Loop Invariants Help prove that an algorithm is correct To prove loop invariant, need to show three properties: Initialization: Invariant is true before 1 st iteration Maintenance: If invariant true before iteration k, it remains true right before iteration k+1 Termination: When the loop terminates, the invariant implies some useful property

Initialization INSERTION-SORT(A) for j = 2 to length(A) key = A[j] i = j – 1 while i > 0 and A[i] > key A[i+1] = A[i] i-- A[i+1] = key Show that when j = 2, invariant holds “ A[1] consists of the original first 1 elements, but sorted” Clear: A[1…j-1] is just A[1]

Maintenance INSERTION-SORT(A) for j = 2 to length(A) key = A[j] i = j – 1 while i > 0 and A[i] > key A[i+1] = A[i] i-- A[i+1] = key Show that each iteration maintains the invariant Informally, the relative order of A[1…j-1] is not affected by the body of the loop; A[j] is inserted in its proper place.

Termination INSERTION-SORT(A) for j = 2 to length(A) key = A[j] i = j – 1 while i > 0 and A[i] > key A[i+1] = A[i] i-- A[i+1] = key What does the invariant imply at loop termination? A[1…length(A)] is sorted!

Running Time INSERTION-SORT(A) # times executed for j = 2 to length(A) n = length(A) key = A[j] n – 1 i = j – 1 n – 1 while i > 0 and A[i] > key  j=2…n t j A[i+1] = A[i]  j=2…n (t j – 1) i--  j=2…n (t j – 1) A[i+1] = key n – 1  Assume each operation costs 1  Let t j = # times while loop is executed T(n) = n + 3(n – 1) +  j=2…n t j + 2  j=2…n (t j – 1)

Running Time T(n) = n + 3(n – 1) +  j=2…n t j + 2  j=2…n (t j – 1) Best case: Input A is already sorted Then, t j = 1, for all j T(n) = 4n – 3 + (n-1) + 0 = 5n - 4 Worst case: Input A is in reverse order: A[1] > … > A[n] Then, t j = j, for all j T(n) = 4n – 3 + n(n+1)/2 + n(n-1) = 3/2 n 2 + 7/2 n – 3 23141668 143286

B IG I DEAS : Ignore machine dependent constants, otherwise impossible to verify and to compare algorithms Look at growth of T(n) as n → ∞. “ Asymptotic Analysis” Machine-independent time

 -notation  (g(n)) = { f (n) :there exist positive constants c 1, c 2, and n 0 such that 0  c 1 g(n)  f (n)  c 2 g(n) for all n  n 0 } Drop low-order terms; ignore leading constants. Example: 3n 3 + 90n 2 – 5n + 6046 =  (n 3 ) Basic manipulations: DEF:

Asymptotic performance n T(n)T(n) n0n0. Asymptotic analysis is a useful tool to help to structure our thinking toward better algorithms We shouldn’t ignore asymptotically slower algorithms, however. Real-world design situations often call for a careful balancing When n gets large enough, a  (n 2 ) algorithm always beats a  (n 3 ) algorithm.

Merge Sort Divide A into two sub-arrays of half the size Recursively, sort each sub-array Merge the two sub-arrays into a sorted array  Merge by successively picking & erasing the smallest element from the beginning of the two sub-arrays

The divide-and-conquer approach Divide the problem into a number of subproblems Conquer the subproblems by solving recursively Combine the solutions to the sub-problems into the solution for the original problem

Example :Merge sort M ERGE -S ORT A[1.. n] 1.If n = 1, done. 2.Recursively sort A[ 1..  n/2  ] and A[  n/2  +1.. n ]. 3.“Merge” the 2 sorted lists. Key subroutine: M ERGE

Merging two sorted arrays 20 13 7 2 12 11 9 1

Merging two sorted arrays 20 13 7 2 12 11 9 1 1

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9 2

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9 2 20 13 7 12 11 9

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9 2 20 13 7 12 11 9 7

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9 2 20 13 7 12 11 9 7 20 13 12 11 9

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9 2 20 13 7 12 11 9 7 20 13 12 11 9 9

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9 2 20 13 7 12 11 9 7 20 13 12 11 9 9 20 13 12 11

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9 2 20 13 7 12 11 9 7 20 13 12 11 9 9 20 13 12 11 20 13 12

Merging two sorted arrays 20 13 7 2 12 11 9 1 1 20 13 7 2 12 11 9 2 20 13 7 12 11 9 7 20 13 12 11 9 9 20 13 12 11 20 13 12 Time =  (n) to merge a total of n elements (linear time).

Analyzing merge sort M ERGE -S ORT A[1.. n] 1.If n = 1, done. 2.Recursively sort A[ 1..  n/2  ] and A[  n/2  +1.. n ]. 3.“Merge” the 2 sorted lists T(n)  (1) 2T(n/2)  (n) Sloppiness: Should be T(  n/2  ) + T(  n/2  ), but it turns out not to matter asymptotically.

Recurrence for merge sort T(n) =  (1) if n = 1; 2T(n/2) +  (n) if n > 1. We shall usually omit stating the base case when T(n) =  (1) for sufficiently small n, but only when it has no effect on the asymptotic solution to the recurrence. Lecture 2 provides several ways to find a good upper bound on T(n).

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. T(n)T(n)

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. T(n/2) cn

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. cn T(n/4) cn/2

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. cn cn/4 cn/2  (1) …

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. cn cn/4 cn/2  (1) … h = lg n

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. cn cn/4 cn/2  (1) … h = lg n cn

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. cn cn/4 cn/2  (1) … h = lg n cn …

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. cn cn/4 cn/2  (1) … h = lg n cn #leaves = n (n)(n) …

Recursion tree Solve T(n) = 2T(n/2) + cn, where c > 0 is constant. cn cn/4 cn/2  (1) … h = lg n cn #leaves = n (n)(n) Total  (n lg n) …

Conclusions  (n lg n) grows more slowly than  (n 2 ). Therefore, merge sort asymptotically beats insertion sort in the worst case. In practice, merge sort beats insertion sort for n > 30 or so.

Merge ideas MERGE(A, p, q, r) create arrays L[1…q-p+1] and R[1…r-q] L[1…q-p] = A[p…q] R[1…r-q-1] = A[q+1…r] L[n 1 +1] = R[n 1 +1] =  i = j = 1 For k = p to r if L[i] < R[i] then A[k] = L[i]; i++ else A[k] = R[j]; j++ 38121416610 38141612610 q pr 128 361416

Merge Sort – Example 1432681611014381626110261 143816 26110143816 1102681631438 1612610128 361416

CS Main Questions Given that the computer is the Great Symbol Manipulator, there are three main questions in the field of computer science: What kinds.

Similar presentations

Presentation on theme: "CS Main Questions Given that the computer is the Great Symbol Manipulator, there are three main questions in the field of computer science: What kinds."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

CS Main Questions Given that the computer is the Great Symbol Manipulator, there are three main questions in the field of computer science: What kinds.

Similar presentations

Presentation on theme: "CS Main Questions Given that the computer is the Great Symbol Manipulator, there are three main questions in the field of computer science: What kinds."— Presentation transcript:

Similar presentations

About project

Feedback