Computation, Complexity, and Algorithm IST 512 Spring 2012 Dongwon Lee, Ph.D.


2 Learning Objectives
Understand what computation and algorithms are
Be able to compare alternative algorithms for a problem with respect to their complexity
Given an algorithm, be able to describe its big O complexity
Be able to explain what NP-completeness means

3 Computational Complexity
Why complexity? Primarily for evaluating the difficulty of scaling up a problem: how will the problem grow as resources increase? Also for knowing whether a claimed solution to a problem is optimal (best). Optimal (best) in what sense?

4 Why do we have to deal with this?
Moore's law, Hwang's law, and the growth of information and information resources: storing stuff and finding stuff.
Moore's law: the empirical observation that the transistor density of integrated circuits (ICs) doubles every 24 (or 18) months.
Hwang's law: memory density doubles every year.

5 Impact
The efficiency of algorithms/methods, and the inherent "difficulty" of problems of practical and/or theoretical importance. A major discovery in computer science was that computational problems can vary tremendously in the effort required to solve them precisely. The technical term for a hard problem is "____________________", which essentially means: "abandon all hope of finding an efficient algorithm for the exact (and sometimes approximate) solution of this problem".

6 Optimality A solution to a problem is sometimes stated as “optimal” Optimal in what sense? Empirically? Theoretically? (the only real definition)

7 You're given a problem
Data informatics: How will our system scale if we store everybody's data every day, ask everyone to send a picture of themselves to everyone else, and also hire a new employee each week for a year?
Social informatics: We want to know how the number of social connections in a growing community increases over time.

8 We will use algorithms
Algorithm: a finite sequence of instructions for doing some task, unambiguous and simple to follow.
E.g., Euclid's algorithm to determine the greatest common divisor of two integers greater than 1:
Subtract the smaller number from the larger one
Repeat until you get 0 or 1
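The two-step procedure above can be sketched directly in code. This is a common subtraction-based variant of Euclid's algorithm; it loops until the two numbers become equal (at which point both hold the GCD), which avoids special-casing the 0-or-1 stopping condition:

```python
def gcd(a, b):
    # Subtraction form of Euclid's algorithm: repeatedly subtract
    # the smaller number from the larger until the two are equal.
    while a != b:
        if a > b:
            a -= b
        else:
            b -= a
    return a

print(gcd(48, 36))  # → 12
```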

9 Eg. Sorting Algorithms
Problem: Sort a set of numbers in ascending order.
Many variations of sorting algorithms: Bubble Sort, Insertion Sort, Merge Sort, Quick Sort, …
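As a concrete instance of one of the variations listed above, here is a minimal bubble sort sketch (the other algorithms differ in strategy and cost but solve the same problem):

```python
def bubble_sort(nums):
    # Repeatedly sweep the list, swapping adjacent out-of-order pairs;
    # each pass "bubbles" the largest remaining value to the end.
    nums = list(nums)  # work on a copy
    for i in range(len(nums) - 1):
        for j in range(len(nums) - 1 - i):
            if nums[j] > nums[j + 1]:
                nums[j], nums[j + 1] = nums[j + 1], nums[j]
    return nums

print(bubble_sort([5, 1, 4, 2, 3]))  # → [1, 2, 3, 4, 5]
```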

10 Scenarios
I've got two algorithms that accomplish the same task. Which is better?
I want to store some data. How do my storage needs scale as more data is stored?
Given an algorithm, can I determine how long it will take to run? The input is unknown, and we don't want to trace all possible paths of execution.
For different input, can I determine how system cost will change?

11 Measuring the Growth of Work While it is possible to measure the work done by an algorithm for a given set of input, we need a way to: Measure the rate of growth of an algorithm or problem based upon the size of the input Compare algorithms or methods to determine which is better for the situation

12 Basic unit of the growth of work: input size N
We describe the relationship as a function f(n), where f(N) is the number of steps required by an algorithm.

13 Why Input Size N? What are some of the factors that affect computation time? Speed of the CPU, choice of programming language, and size of the input. A test case shows a specific mapping between these factors and the computation time. Since an algorithm can be implemented on different machines and in different programming languages, we are interested in comparing algorithms using factors that are not affected by implementations. Hence, we are interested in how the size of the input affects computation time.

14 Comparing the growth of work
[graph: work done vs. size of input N, comparing the growth rates 1, log N, N, and N²]

15 Time vs. Space
Very often, we can trade space for time. For example, maintain a collection of students' SSN information:
Use an array of a billion elements and have immediate access (better time)
Use an array sized to the number of students and have to search (better space)

16 Introducing Big O Notation
Used in a sense to put algorithms and problems into families. Will allow us to evaluate algorithms and the growth of problems. Has a precise mathematical definition.
Q: How long does it take to get from S.C. to Chicago if you drive? Less than 11 years. More than 10 seconds. Driveway vs. Parkway vs. Thruway.

17 Size of Input (measure of work)
In analyzing rate of growth based upon size of input, we'll use a variable. Why? For each factor in the size, use a new variable; N is most common.
Examples: a linked list of N elements; a 2D array of N x M elements; a Binary Search Tree of P elements.

18 Formal Definition of Big-O
For a given function g(n), O(g(n)) is defined to be the set of functions
O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 }
f(n) = O(g(n)) means that f(n) is no worse than the function g(n) when n is large; that is, g(n) is an asymptotic upper bound.

19 Visual O( ) Meaning
[graph: work done vs. size of input, showing our algorithm f(n) staying below the upper bound c·g(n) for all n ≥ n0, illustrating f(n) = O(g(n))]

20 Simplifying O( ) Answers
We say the big O complexity of 3n is O(n²) (drop constants!) because we can show that there exist an n0 and a c such that 0 ≤ 3n ≤ cn² for n ≥ n0. E.g., c = 4 and n0 = 2 yields 0 ≤ 3n ≤ 4n² for n ≥ 2.
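Such a claim can be checked mechanically. A quick sketch that spot-checks the slide's witnesses c = 4 and n0 = 2 over a finite range (a finite check is evidence, not a proof; here the inequality 3 ≤ 4n clearly holds for all n ≥ 1):

```python
# Spot-check the witnesses c = 4, n0 = 2 for the claim 3n = O(n^2):
# 0 <= 3n <= c*n^2 should hold for every n >= n0.
c, n0 = 4, 2
assert all(0 <= 3 * n <= c * n * n for n in range(n0, 10_000))
print("0 <= 3n <= 4n^2 holds for all checked n >= 2")
```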

21 Correct but Meaningless
You could say 3n = O(n⁶) or 3n = O(n⁷). But this is like answering: How long does it take to drive to Chicago? Less than 11 years.

22 Comparing Algorithms
Now that we know the formal definition of O( ) notation (and what it means), if we can determine the O( ) of algorithms and problems, this establishes their worst-case performance. Thus we can compare them and see which has the "better" performance.

23 Classes of Big O Functions
O(1): constant
O(log log n): double logarithmic
O(log n): logarithmic
O((log n)^c), c > 1: polylogarithmic
O(n): linear
O(n log n) = O(log n!): quasilinear
O(n²): quadratic
O(n³): cubic
O(n^c), c > 1: polynomial
O(c^n): exponential (or geometric)
O(n!) = O(n^n): combinatorial
O(c^(c^n)): double exponential

24 Correctly Interpreting O( )
O(1) or "Order One": does not mean that it takes only one operation; does mean that the work doesn't change as N changes; is notation for "constant work".
O(N) or "Order N": does not mean that it takes N operations; does mean that the work changes in a way that is proportional to N; is notation for "work grows at a linear rate".

25 Complex/Combined Factors Algorithms and problems typically consist of a sequence of logical steps/sections We need a way to analyze these more complex algorithms… It’s easy – analyze the sections and then combine them!

26 Eg: Insert into Linked List
Task: Insert an element into an "ordered" list: find the right location, then do the steps to create the node and add it to the list.
[diagram: linked list with head pointer, inserting 75]
Step 1: find the location = O(N)

27 Eg: Insert into Linked List
Insert an element into an ordered list: find the right location, then do the steps to create the node and add it to the list.
[diagram: the new node 75 spliced into the list]
Step 2: do the node insertion = O(1)

28 Combine the Analysis Find the right location = O(N) Insert Node = O(1) Sequential, so add: O(N) + O(1) = O(N + 1) = O(N)
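The two steps above can be sketched as a small Python routine. The Node class and the list values 25/50/100 are illustrative; only the inserted value 75 comes from the slides:

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_sorted(head, value):
    """Insert value into an ascending linked list:
    O(N) to find the spot + O(1) to splice = O(N) overall."""
    node = Node(value)
    if head is None or value < head.value:
        node.next = head          # new smallest element becomes the head
        return node
    cur = head
    while cur.next is not None and cur.next.value < value:
        cur = cur.next            # Step 1: O(N) scan for the location
    node.next = cur.next          # Step 2: O(1) node insertion
    cur.next = node
    return head

# Build 25 -> 50 -> 100, then insert 75 as in the slides.
head = insert_sorted(insert_sorted(insert_sorted(None, 100), 50), 25)
head = insert_sorted(head, 75)
values, cur = [], head
while cur:
    values.append(cur.value)
    cur = cur.next
print(values)  # → [25, 50, 75, 100]
```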

29 Eg: Search a 2D Array
Task: Search an "unsorted" 2D array (row, then column): traverse all rows; for each row, examine all the cells (changing columns).
[diagram: 2D array with rows and columns; traversing the rows = O(N)]

30 Eg: Search a 2D Array
Search an unsorted 2D array (row, then column): traverse all rows; for each row, examine all the cells (changing columns).
[diagram: examining all the cells in a row = O(M)]

31 Combine the Analysis Traverse rows = O(N) Examine all cells in row = O(M) Embedded, so multiply: O(N) x O(M) = O(N*M)
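The row-then-column scan above can be sketched as follows (the sample grid values are made up for illustration):

```python
def search_2d(grid, target):
    # Outer loop over the N rows, inner loop over the M columns:
    # O(N) x O(M) = O(N*M) cell examinations in the worst case.
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == target:
                return (r, c)
    return None  # target not present anywhere in the grid

grid = [[7, 3, 9],
        [1, 8, 2]]
print(search_2d(grid, 8))  # → (1, 1)
```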

32 Sequential Steps
If steps appear sequentially (one after another), then add their respective O( ).
loop (N times) ... endloop
loop (M times) ... endloop
Total: O(N + M)

33 Embedded Steps
If steps appear embedded (one inside another), then multiply their respective O( ).
loop (N times)
    loop (M times) ... endloop
endloop
Total: O(N*M)

34 Correctly Determining O( )
Can have multiple factors: O(N*M), O(log P + N²)
But keep only the dominant factors:
O(N + N log N) → O(N log N)
O(N*M + P) → O(N*M)
O(V² + V log V) → O(V²)
Drop constants:
O(2N + 3N²) → O(N + N²) → O(N²)

35 Summary We use O() notation to discuss the rate at which the work of an algorithm grows with respect to the size of the input. O() is an upper bound, so only keep dominant terms and drop constants

36 Poly-time vs. expo-time
Algorithms with running times of order O(log n), O(n), O(n log n), O(n²), O(n³), etc. are called polynomial-time algorithms. On the other hand, algorithms with complexities that cannot be bounded by polynomial functions are called exponential-time algorithms. These include "exploding-growth" orders which do not contain exponential factors, like n!.

37 The Towers of Hanoi
[diagram: three pegs A, B, C with a stack of rings on peg A]
Goal: Move the stack of rings to another peg.
Rule 1: May move only 1 ring at a time.
Rule 2: May never have a larger ring on top of a smaller ring.

38 Towers of Hanoi: Solution
[diagram: original state followed by moves 1 through 7 solving the 3-ring puzzle]

39 Towers of Hanoi - Complexity
For 3 rings we have 7 operations. In general, the cost is 2^N - 1 = O(2^N). Each time we increment N, we double the amount of work. This grows incredibly fast!
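The standard recursive solution, shown here as a sketch that also records the moves, confirms the 2^N - 1 count for N = 3:

```python
def hanoi(n, source, target, spare, moves):
    # Move n rings from source to target: first move n-1 rings to the
    # spare peg, then move the largest ring, then move the n-1 rings
    # from the spare peg onto the target.
    if n == 0:
        return
    hanoi(n - 1, source, spare, target, moves)
    moves.append((source, target))
    hanoi(n - 1, spare, target, source, moves)

moves = []
hanoi(3, "A", "C", "B", moves)
print(len(moves))  # → 7, i.e. 2^3 - 1
```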

40 Towers of Hanoi (2^N) Runtime
For N = 64: 2^N = 2^64 ≈ 18,450,000,000,000,000,000. If we had a computer that could execute a billion instructions per second, it would take 584 years to complete. But it could get worse: the case of 4 or more pegs is an open problem.

41 The Traveling Salesman Problem A traveling salesman wants to visit a number of cities and then return to the starting point. Of course he wants to save time and energy, so he wants to determine the shortest path for his trip. We can represent the cities and the distances between them by a weighted, complete, undirected graph. The problem then is to find the round-trip of minimum total weight that visits each vertex exactly once.

42 The Traveling Salesman Problem
Example: What path would the traveling salesman take to visit the following cities?
[map: Chicago, Toronto, New York, and Boston with pairwise distances]
Solution: The shortest path is Boston, New York, Chicago, Toronto, Boston (2,000 miles).

43 The Traveling Salesman Problem
Traveling Salesman Problem (TSP): one of the best-known computational problems, known to be "extremely hard to solve" (NP-hard).
A brute force solution takes O(n!).
A dynamic programming solution takes O(c^n), c > 2.
Harder to solve than the Tower of Hanoi problem.
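The O(n!) brute-force approach can be sketched by trying every possible tour. The distance matrix below is a made-up 4-city example, not the slide's actual city distances:

```python
from itertools import permutations

def tsp_brute_force(dist, start=0):
    """Try all (n-1)! orderings of the remaining cities: O(n!).
    dist is a symmetric matrix of pairwise distances."""
    cities = [c for c in range(len(dist)) if c != start]
    best_cost, best_tour = float("inf"), None
    for perm in permutations(cities):
        tour = (start,) + perm + (start,)      # round trip back to start
        cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

# Hypothetical distance matrix (illustrative values only).
dist = [[0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0]]
cost, tour = tsp_brute_force(dist)
print(cost)  # → 80
```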

44 Where Does this Leave Us? Clearly algorithms have varying runtimes or storage costs. We’d like a way to categorize them: Reasonable, so it may be useful Unreasonable, so why bother running

45 Performance Categories
Polynomial:
  Sub-linear: O(log N)
  Linear: O(N)
  Quasilinear: O(N log N)
  Quadratic: O(N²)
Exponential: O(2^N), O(N!), O(N^N)

46 Two Categories of Algorithms
[chart: runtime vs. size of input N (million, billion, trillion); polynomial curves N and N⁵ fall in the "Reasonable" region, while the exponential curve 2^N falls in the "Unreasonable (Don't Care!)" region]

47 Reasonable vs. Unreasonable
Reasonable algorithms feature polynomial factors in their O( ) and may be usable depending upon input size: O(log N), O(N), O(N^K) where K is a constant.
Unreasonable algorithms feature exponential factors in their O( ) and have little practical utility: O(2^N), O(N!), O(N^N).

48 Eg: Computational Complexity
Give the Big O complexity in terms of n of each expression below and order them from most complex to least (all unspecified terms are to-be-determined constants):
a. n: O(n), #7
b. log n: O(log n), #8
c. 3n² log n + 21n²: O(n² log n), #5
d. n log n + n²: O(n²), #6
e. 8n! + 2^n: O(n!), #1
f. 10·k^n: O(k^n), #2 or #3 (by k)
g. a·log n + 3n³: O(n³), #4
h. b·2^n + n²: O(2^n), #2 or #3 (by k)
i. A·n^n: O(n^n), #1

49 Decidable vs. Undecidable
Any problem that can be solved by an algorithm is called decidable.
Problems that can be solved in polynomial time are called tractable (easy).
Problems that can be solved, but for which no polynomial time solutions are known, are called intractable (hard).
Problems that cannot be solved given any amount of time are called undecidable.

50 Undecidable Problems No algorithmic solution exists Regardless of cost These problems aren’t computable No answer can be obtained in finite amount of time

51 Complexity Classes Problems have been grouped into classes based on the most efficient algorithms for solving the problems: Class P: those problems that are solvable in polynomial time. Class NP: problems that are “verifiable” in polynomial time (i.e., given the solution, we can verify in polynomial time if the solution is correct or not.)

52 NP-complete Problems
If problem A is NP-complete, and A can be transformed (reduced) to problem B in polynomial time, then problem B is at least as hard as A; if B is also in NP, then B is NP-complete as well.
Example: the Traveling Salesman Problem.

53 The Challenge Remains …
If one finds a polynomial time algorithm for ONE NP-complete problem, then all NP-complete problems have polynomial time algorithms. A lot of smart computer scientists have tried; none have succeeded. So, people believe NP-complete problems are very, very, very … hard problems without polynomial time solutions.

54 What is a good algorithm?
An algorithm is "good" if it has a running time that is a polynomial function of the size of the input, N; otherwise it is a "bad" algorithm. A problem is considered tractable if it has a polynomial time solution and intractable if it does not. For many problems we still do not know if they are tractable or not.

55 What's this good for anyway?
Knowing the hardness of problems lets us know when an optimal solution can exist (a salesman can't sell you an optimal solution). It keeps us from seeking optimal solutions when none exist; use heuristics instead. Some software/solutions are used because they scale well. It helps us scale up problems as a function of resources. Many interesting problems are very hard (NP-hard)! Use heuristic solutions; they are only appropriate when problems have to scale.

Reference
Many materials are modified from Prof. Yen and Prof. Lee's lecture slides in IST 512/512 over the years.