Introduction to Algorithm Design and Analysis


Introduction to Algorithm Design and Analysis

Example: the sorting problem.
Input: a sequence of n numbers a1, a2, …, an.
Output: a permutation (reordering) a1', a2', …, an' of the input such that a1' ≤ a2' ≤ … ≤ an'.
We study two different sorting algorithms: insertion sort and merge sort.

Efficiency Comparison of Two Algorithms

Suppose we sort n = 10^6 numbers. Insertion sort takes about c1·n^2 instructions; merge sort takes about c2·n·lg n instructions.
Let the best programmer (c1 = 2), writing in machine language, run insertion sort on computer A, which executes one billion (10^9) instructions per second. Let a bad programmer (c2 = 50), writing in a high-level language, run merge sort on computer B, which executes ten million (10^7) instructions per second.
Insertion sort on A: 2·(10^6)^2 instructions / 10^9 instructions per second = 2000 seconds.
Merge sort on B: 50·(10^6 · lg 10^6) instructions / 10^7 instructions per second ≈ 100 seconds.
Thus merge sort on B is 20 times faster than insertion sort on A! For ten million numbers, the gap is 2.3 days vs. 20 minutes.
Conclusion: algorithms for solving the same problem can differ dramatically in their efficiency, and this difference can be much more significant than differences due to hardware and software.
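The slide's arithmetic can be reproduced directly. A minimal sketch in Python, where the constants c1 = 2 and c2 = 50 and the machine speeds are the slide's assumptions, not measurements:

```python
import math

def insertion_sort_seconds(n, c1=2, ips=1e9):
    """c1 * n^2 instructions on computer A (10^9 instructions/second)."""
    return c1 * n**2 / ips

def merge_sort_seconds(n, c2=50, ips=1e7):
    """c2 * n * lg(n) instructions on computer B (10^7 instructions/second)."""
    return c2 * n * math.log2(n) / ips

print(insertion_sort_seconds(10**6))  # 2000.0 seconds
print(merge_sort_seconds(10**6))      # ~99.7 seconds
```

Scaling n to ten million reproduces the 2.3 days vs. roughly 20 minutes comparison as well.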

Algorithm Design and Analysis

Design an algorithm, then prove the algorithm is correct:
- Loop invariant.
- Recursive function.
- Formal (mathematical) proof.
Analyze the algorithm:
- Time: worst case, best case, average case. For some algorithms the worst case occurs often, and the average case is often roughly as bad as the worst case, so we generally analyze the worst-case running time.
- Space.
Machine models for sequential and parallel algorithms:
- Random-Access Machine (RAM).
- Parallel multi-processor access model: PRAM.

Insertion Sort Algorithm

INSERTION-SORT(A)
  for j = 2 to length[A]
      do key ← A[j]
         // insert A[j] into the sorted sequence A[1..j-1]
         i ← j - 1
         while i > 0 and A[i] > key
             do A[i+1] ← A[i]  // move A[i] one position right
                i ← i - 1
         A[i+1] ← key
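The pseudocode above translates almost line for line into a runnable version. A minimal Python sketch, using 0-based indexing instead of the slide's 1-based A[1..n]:

```python
def insertion_sort(a):
    """In-place insertion sort; a direct translation of the pseudocode
    above (0-based indexing instead of the slide's 1-based A[1..n])."""
    for j in range(1, len(a)):
        key = a[j]                 # insert a[j] into the sorted prefix a[0..j-1]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]        # move a[i] one position right
            i -= 1
        a[i + 1] = key
    return a

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```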

Correctness of Insertion Sort Algorithm

Loop invariant: at the start of each iteration of the for loop, the subarray A[1..j-1] contains the elements originally in A[1..j-1], but in sorted order.
Proof:
Initialization: j = 2, so A[1..j-1] = A[1..1] = A[1], a single element, which is trivially sorted.
Maintenance: each iteration maintains the loop invariant: the while loop shifts the elements of A[1..j-1] that are greater than key one position right, and key is then placed in its correct position, so A[1..j] is sorted when j is incremented.
Termination: the loop ends with j = n+1, so A[1..j-1] = A[1..n] is in sorted order, i.e., the whole array is sorted.

Analysis of Insertion Sort

INSERTION-SORT(A)                                          cost  times
1  for j = 2 to length[A]                                  c1    n
2      do key ← A[j]                                       c2    n-1
3         // insert A[j] into sorted sequence A[1..j-1]    0     n-1
4         i ← j-1                                          c4    n-1
5         while i > 0 and A[i] > key                       c5    Σ_{j=2}^{n} t_j
6             do A[i+1] ← A[i]  // move A[i] right         c6    Σ_{j=2}^{n} (t_j - 1)
7                i ← i-1                                   c7    Σ_{j=2}^{n} (t_j - 1)
8         A[i+1] ← key                                     c8    n-1

(t_j is the number of times the while loop test in line 5 is executed for that value of j.)
The total time cost T(n) is the sum of cost × times over all lines:
T(n) = c1·n + c2(n-1) + c4(n-1) + c5·Σ_{j=2}^{n} t_j + c6·Σ_{j=2}^{n} (t_j - 1) + c7·Σ_{j=2}^{n} (t_j - 1) + c8(n-1)

Analysis of Insertion Sort (cont.)

Best case (already sorted input): t_j = 1, so lines 6 and 7 are executed 0 times.
T(n) = c1·n + c2(n-1) + c4(n-1) + c5(n-1) + c8(n-1)
     = (c1 + c2 + c4 + c5 + c8)·n - (c2 + c4 + c5 + c8)
     = c·n + c', which is linear in n.
Worst case (reverse-sorted input): t_j = j, so Σ_{j=2}^{n} t_j = Σ_{j=2}^{n} j = n(n+1)/2 - 1, and Σ_{j=2}^{n} (t_j - 1) = Σ_{j=2}^{n} (j - 1) = n(n-1)/2.
T(n) = c1·n + c2(n-1) + c4(n-1) + c5(n(n+1)/2 - 1) + c6(n(n-1)/2) + c7(n(n-1)/2) + c8(n-1)
     = ((c5 + c6 + c7)/2)·n^2 + (c1 + c2 + c4 + c5/2 - c6/2 - c7/2 + c8)·n - (c2 + c4 + c5 + c8)
     = a·n^2 + b·n + c, which is quadratic in n.
Average case (random input): on average t_j ≈ j/2, so T(n) is still on the order of n^2, the same as the worst case.
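The best- and worst-case counts of t_j can be checked empirically. A small sketch that instruments the while-loop test (the function name while_tests is ours, not the slide's):

```python
def while_tests(a):
    """Insertion sort on a copy of a, counting how many times the
    while-loop test executes in total (the sum of t_j in the analysis)."""
    a = list(a)
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1                    # one execution of the while test
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return tests

n = 100
print(while_tests(range(n)))          # best case: n - 1 = 99
print(while_tests(range(n, 0, -1)))   # worst case: n(n+1)/2 - 1 = 5049
```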

Merge Sort: Divide-and-Conquer

Divide: split the n-element sequence into two subproblems of n/2 elements each.
Conquer: sort the two subsequences recursively using merge sort. If the length of a sequence is 1, do nothing, since it is already in order.
Combine: merge the two sorted subsequences to produce the sorted answer.

Merge Sort: The Merge Function

Merge is the key operation in merge sort. Suppose the (sub)sequences are stored in the array A; moreover, A[p..q] and A[q+1..r] are two sorted subsequences. MERGE(A,p,q,r) merges the two subsequences into the sorted sequence A[p..r]. MERGE(A,p,q,r) takes Θ(r - p + 1) time.

MERGE-SORT(A, p, r)
  if p < r
      then q ← ⌊(p + r)/2⌋
           MERGE-SORT(A, p, q)
           MERGE-SORT(A, q+1, r)
           MERGE(A, p, q, r)

The initial call is MERGE-SORT(A, 1, n), where n = length[A].
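The pseudocode can be made concrete. A minimal Python sketch of MERGE-SORT and MERGE, 0-indexed and using a temporary buffer rather than the textbook's sentinel variant:

```python
def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] into sorted a[p..r],
    in Theta(r - p + 1) time, using a temporary buffer."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # take from left if right is exhausted, or left's head is smaller
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1

def merge_sort(a, p=0, r=None):
    """MERGE-SORT(A, p, r) from the slide, translated to 0-based indexing."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = (p + r) // 2      # floor((p + r) / 2)
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)
    return a

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```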

Analysis of Divide-and-Conquer

The running time is described by a recurrence. Suppose T(n) is the running time on a problem of size n:
T(n) = Θ(1)                      if n ≤ nc
T(n) = a·T(n/b) + D(n) + C(n)    if n > nc
where
a: the number of subproblems,
n/b: the size of each subproblem,
D(n): the cost of the divide step,
C(n): the cost of the combine step.

Analysis of MERGE-SORT

Divide: D(n) = Θ(1).
Conquer: a = 2 and b = 2, giving 2T(n/2).
Combine: C(n) = Θ(n).
Therefore:
T(n) = Θ(1)              if n = 1
T(n) = 2T(n/2) + Θ(n)    if n > 1
or equivalently, for a constant c:
T(n) = c                 if n = 1
T(n) = 2T(n/2) + c·n     if n > 1

Compute T(n) by Recursion Tree

The recurrence can be solved with a recursion tree. For T(n) = 2T(n/2) + c·n (see its recursion tree below), there are lg n + 1 levels with cost c·n at each level, so the total cost of merge sort is T(n) = c·n·lg n + c·n = Θ(n lg n). Question: what are its best, worst, and average cases? In contrast, insertion sort is T(n) = Θ(n^2).

Recursion tree of T(n)=2T(n/2)+cn
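The closed form read off the recursion tree can be checked against the recurrence directly. A small sketch for n a power of two, with c = 1:

```python
def t(n, c=1):
    """Evaluate T(n) = c if n == 1 else 2*T(n/2) + c*n, for n a power of two."""
    if n == 1:
        return c
    return 2 * t(n // 2, c) + c * n

# The recursion tree has lg(n) + 1 levels costing c*n each, so
# T(n) = c*n*lg(n) + c*n exactly when n is a power of two.
for k in range(1, 11):
    n = 2 ** k
    assert t(n) == n * k + n    # with c = 1: n*lg(n) + n
print(t(1024))                  # 1024*10 + 1024 = 11264
```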

Order of Growth

Lower-order terms are ignored and constant coefficients are ignored: the rate of growth, or order of growth, is what matters. We use Θ(n^2) to represent the worst-case running time of insertion sort.
Typical orders of growth: Θ(1), Θ(lg n), Θ(n), Θ(n lg n), Θ(n^2), Θ(n^3), Θ(2^n), Θ(n!).
Asymptotic notations: Θ, O, Ω, o, ω.

Θ-notation: f(n) = Θ(g(n)) if there exist positive constants c1 and c2 such that there is a positive constant n0 with 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
O-notation: f(n) = O(g(n)) if there exists a positive constant c such that there is a positive constant n0 with 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0.
Ω-notation: f(n) = Ω(g(n)) if there exists a positive constant c such that there is a positive constant n0 with 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.

Example: prove f(n) = a·n^2 + b·n + c = Θ(n^2), where a, b, c are constants and a > 0.
We must find c1 and c2 (and n0) such that c1·n^2 ≤ f(n) ≤ c2·n^2 for all n ≥ n0. It turns out that c1 = a/4, c2 = 7a/4, and n0 = 2·max(|b|/a, sqrt(|c|/a)) work. Here we also see that lower-order terms and constant coefficients can be ignored. Exercise: how about f(n) = a·n^3 + b·n^2 + c·n + d?
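These witnesses can be checked numerically for concrete coefficients; a small sketch (the function name and the n_max cutoff are illustrative choices, not from the slide):

```python
import math

def theta_bounds_hold(a, b, c, n_max=10_000):
    """Check numerically that c1*n^2 <= a*n^2 + b*n + c <= c2*n^2 for all
    n0 <= n < n_max, with the slide's witnesses c1 = a/4, c2 = 7a/4,
    n0 = 2*max(|b|/a, sqrt(|c|/a))."""
    c1, c2 = a / 4, 7 * a / 4
    n0 = 2 * max(abs(b) / a, math.sqrt(abs(c) / a))
    return all(
        c1 * n * n <= a * n * n + b * n + c <= c2 * n * n
        for n in range(math.ceil(n0), n_max)
    )

print(theta_bounds_hold(3, -5, 2))    # True
print(theta_bounds_hold(1, -100, 7))  # True
```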

o-notation

For a given function g(n), o(g(n)) = { f(n) : for any positive constant c, there exists a positive n0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n0 }. We write f(n) ∈ o(g(n)), or simply f(n) = o(g(n)).

Notes on o-notation

O-notation may or may not be asymptotically tight as an upper bound: 2n^2 = O(n^2) is tight, but 2n = O(n^2) is not tight. o-notation is used to denote an upper bound that is not asymptotically tight: 2n = o(n^2), but 2n^2 ≠ o(n^2). The difference: the bound need hold for only some positive constant c in O-notation, but for all positive constants c in o-notation. In o-notation, f(n) becomes insignificant relative to g(n) as n approaches infinity; that is, lim_{n→∞} f(n)/g(n) = 0.

ω-notation

For a given function g(n), ω(g(n)) = { f(n) : for any positive constant c, there exists a positive n0 such that 0 ≤ c·g(n) < f(n) for all n ≥ n0 }. We write f(n) ∈ ω(g(n)), or simply f(n) = ω(g(n)). ω-notation, analogous to o-notation, denotes a lower bound that is not asymptotically tight: n^2/2 = ω(n), but n^2/2 ≠ ω(n^2). f(n) = ω(g(n)) if and only if g(n) = o(f(n)); equivalently, lim_{n→∞} f(n)/g(n) = ∞.
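The limit characterizations of o and ω can be illustrated numerically (an evaluation at one large n, not a proof):

```python
# f = o(g)  iff  f(n)/g(n) -> 0;  f = omega(g)  iff  f(n)/g(n) -> infinity.
def ratio(f, g, n=10**6):
    """Evaluate f(n)/g(n) at a single large n."""
    return f(n) / g(n)

print(ratio(lambda n: 2 * n, lambda n: n * n))      # ~0: 2n = o(n^2)
print(ratio(lambda n: n * n / 2, lambda n: n))      # large: n^2/2 = omega(n)
print(ratio(lambda n: 2 * n * n, lambda n: n * n))  # constant 2: neither o nor omega
```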

Techniques for Algorithm Design and Analysis

Data structures: ways to store and organize data, e.g., disjoint sets and balanced search trees (red-black tree, AVL tree, 2-3 tree).
Design techniques: divide-and-conquer, dynamic programming, prune-and-search, lazy evaluation, linear programming, …
Analysis techniques: recurrences, decision trees, adversary arguments, amortized analysis, …

NP-complete Problems

Most of the problems discussed in this course have efficient (polynomial-time) algorithms. An interesting set of hard problems is the NP-complete problems.
Why interesting: it is not known whether efficient algorithms exist for them, but if an efficient algorithm exists for one of them, then efficient algorithms exist for all of them. Also, a small change to a problem statement can cause a big change in its difficulty.
Why important: they arise surprisingly often in the real world. Recognizing a problem as NP-complete means not wasting time trying to find an efficient algorithm for the optimal solution, and instead finding an approximate or near-optimal solution.
Example: the traveling-salesman problem.