Computer Science Background for Biologists CSC 487/687 Computing for Bioinformatics Fall 2005.


What is an algorithm?
- A well-defined computational procedure that takes some values as input and produces some value as output.
- We are interested in the correctness and efficiency of computer algorithms.
- We seek to extract clean, well-defined problems from the typically messy "real" problem to gain insight into it.

Example of an algorithm (sorting)
- Input: a sequence of n numbers $(a_1, a_2, \ldots, a_n)$.
- Output: a permutation $(a'_1, a'_2, \ldots, a'_n)$ of the input sequence such that $a'_1 \le a'_2 \le \ldots \le a'_n$.

Exact String Matching
- Input: a text string T, where |T| = n, and a pattern string P, where |P| = m.
- Output: an index i such that $T_{i+k-1} = P_k$ for all $1 \le k \le m$, i.e. showing that P is a substring of T.
Text T: abcabaabcabac
Pattern P: abaa

Exact String Matching
- Brute force search algorithm:
    for i = 1 to n-m+1 do
        j = 1;
        while (j <= m) and (T[i+j-1] == P[j])
            j = j+1;
        if (j > m) then print "pattern at position ", i;
  (The bound j <= m is tested first so that P[m+1] is never read.)
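The brute force search above can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the function name `brute_force_match` is hypothetical, and indices are 0-based (Python convention) rather than the 1-based indices used in the pseudocode.

```python
def brute_force_match(text: str, pattern: str) -> list[int]:
    """Return every 0-based index at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    positions = []
    # Try each alignment of the pattern against the text.
    for i in range(n - m + 1):
        j = 0
        # Test the bound first so pattern[j] is never read out of range.
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:  # all m characters matched
            positions.append(i)
    return positions
```

On the text and pattern from the earlier slide, `brute_force_match("abcabaabcabac", "abaa")` reports index 3, which is position 4 in the slides' 1-based numbering.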

Algorithm Efficiency
- Time efficiency of algorithms
- Space efficiency of algorithms

Machine Independent Analysis
- We assume that every basic operation takes constant time.
- Example basic operations: addition, subtraction, multiplication, memory access.
- The time efficiency of an algorithm is the number of basic operations it performs.
- We do not distinguish between the basic operations.

Time efficiency
- In fact, we will not worry about the exact values, but will look at "broad classes" of values.
- Let there be n inputs. If one algorithm needs n basic operations and another needs 2n basic operations, we consider them to be in the same efficiency category.
- However, we do distinguish between exp(n), n, and log(n).

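Example: Time Complexity
- The brute force string matching algorithm might use only about n steps if we are lucky, e.g. when the first character of P mismatches at every position of T.
- It might need about n*m steps if we are unlucky, e.g. when almost all of P matches at every position of T.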
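To make the lucky and unlucky cases concrete, the sketch below counts character comparisons made by brute force matching. The helper name `count_comparisons` and the sample strings are illustrative assumptions, not part of the original slides.

```python
def count_comparisons(text: str, pattern: str) -> int:
    """Count character comparisons made by brute force matching."""
    n, m = len(text), len(pattern)
    comparisons = 0
    for i in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if text[i + j] != pattern[j]:
                break  # mismatch: move on to the next alignment
    return comparisons


# Lucky case: the first character mismatches at every alignment,
# so each of the n - m + 1 alignments costs one comparison.
lucky = count_comparisons("bbbbbbbb", "ab")

# Unlucky case: every alignment matches until the last pattern character,
# so each of the n - m + 1 alignments costs m comparisons.
unlucky = count_comparisons("aaaaaaaa", "aab")
```

With n = 8, the lucky case costs 7 comparisons (n - m + 1 with m = 2) and the unlucky case costs 18 (6 alignments times m = 3).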

Order of Increase
We care about how quickly the running time of our algorithms grows as the input size increases.
[Plot with curves labeled: n, log n, exp(n)]
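A small numerical table can stand in for the plot. The sketch below is illustrative only: the function names are hypothetical, and the natural logarithm is assumed because the slide does not state a base.

```python
import math


def growth_row(n: int) -> tuple[float, int, float]:
    """Return (log n, n, exp(n)) for one input size (natural log assumed)."""
    return (math.log(n), n, math.exp(n))


def growth_table(sizes: list[int]) -> list[tuple[float, int, float]]:
    """Build one row per input size."""
    return [growth_row(n) for n in sizes]


if __name__ == "__main__":
    # Sizes are kept small because exp(n) overflows a float for large n.
    for log_n, n, exp_n in growth_table([1, 2, 4, 8, 16, 32, 64]):
        print(f"n={n:>3}  log n={log_n:>5.2f}  exp(n)={exp_n:.3e}")
```

Doubling n adds only a constant to log n, doubles n itself, and squares exp(n), which is why the three are treated as different efficiency classes.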

Function Orders
- Informally, a function f(n) is O(g(n)) if f(n) does not grow faster than g(n).
- Formally, f(n) is O(g(n)) if there exist a number $n_0$ and a positive constant $c$ such that $0 \le f(n) \le c\,g(n)$ for all $n \ge n_0$.
- If $\lim_{n \to \infty} f(n)/g(n)$ exists and is finite, then f(n) is O(g(n)).
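As a worked example of the definition, consider the illustrative function $f(n) = 2n + 3$ with $g(n) = n$ (this choice is an assumption made here for illustration, not taken from the slides). Both the constant-based definition and the limit test show that $f(n)$ is $O(n)$:

```latex
% Definition: choose c = 3 and n_0 = 3.
\[
  0 \le 2n + 3 \le 2n + n = 3n = c\,g(n) \quad \text{for all } n \ge n_0 = 3 .
\]
% Limit test: the ratio converges to a finite value.
\[
  \lim_{n \to \infty} \frac{f(n)}{g(n)} = \lim_{n \to \infty} \frac{2n + 3}{n} = 2 < \infty .
\]
```

This matches the earlier remark that algorithms needing n and 2n basic operations fall into the same efficiency category.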

Implication of Big-Oh notation
- Big-Oh notation gives an upper bound on the number of steps that an algorithm takes in the worst case.
- Suppose we know that our algorithm uses O(f(n)) basic steps on any n inputs. Then, for sufficiently large n, it terminates after executing at most a constant times f(n) basic steps.
- A basic step takes constant time on a machine. Hence our algorithm terminates within a constant times f(n) units of time, for all large n.

Algorithm Complexity
- Thus the brute force string matching algorithm is O(mn), which is quadratic when m is proportional to n.
- A quadratic-time algorithm is usually fast enough for small problems, but not for big ones.
- An exponential-time algorithm can only be fast enough for tiny problems.

Any improvement on brute force search?
- Some of these comparisons are wasted work!
- By being more clever, we can reduce the worst-case running time to O(n+m).
- Example: Knuth-Morris-Pratt string matching.
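For readers who want to see how the wasted comparisons are avoided, here is a minimal sketch of Knuth-Morris-Pratt matching. The slides only name the algorithm; the function names `failure_table` and `kmp_search` and the 0-based indexing are assumptions made for this illustration.

```python
def failure_table(pattern: str) -> list[int]:
    """fail[j] = length of the longest proper prefix of pattern[:j+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the current matched prefix
    for j in range(1, len(pattern)):
        while k > 0 and pattern[j] != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter border
        if pattern[j] == pattern[k]:
            k += 1
        fail[j] = k
    return fail


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return every 0-based index at which pattern occurs in text."""
    if not pattern:
        return list(range(len(text) + 1))
    fail = failure_table(pattern)
    positions = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        # On a mismatch, reuse what is already known instead of
        # restarting the comparison from the next text position.
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            positions.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return positions
```

The text index `i` never moves backwards, and the total number of fallbacks is bounded by the number of advances, which is where the O(n+m) bound comes from.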

NP, NP-hard, and NP-complete Problems
- A problem is in the class NP if a proposed solution to it can be verified in polynomial time.
- A problem is NP-hard if an algorithm for solving it can be translated, in polynomial time, into one for solving any other NP problem.
- NP-hard therefore means "at least as hard as any NP problem."
- A problem is NP-complete if it is both in NP and NP-hard.

NP-Completeness
- Unfortunately, for many problems there is no known polynomial-time algorithm.
- Even worse, most of these problems can be proven NP-complete, meaning that no polynomial-time algorithm exists unless P = NP, which is widely believed to be false.
- In practice we fall back on heuristics and approximation algorithms.

Shortest Common Superstring
- Input: a set $S = \{s_1, s_2, \ldots, s_m\}$ of text strings over some alphabet $\Sigma$.
- Output: the shortest possible string T such that each $s_i$ is a substring of T.
- This problem arises in DNA sequencing.

Shortest common superstring

- Finding the shortest common superstring is an NP-complete problem.
- Can you suggest an algorithm to find the shortest common superstring?
- Greedy heuristic: approximates the optimal solution.

Greedy Heuristic
- Always merge the two strings with the longest overlap.
- Put the combined string back.
- Repeat until only one string remains.
- GREEDY is conjectured to find a superstring of length at most twice the optimal; the proven guarantees are weaker (a factor of 4, later improved to 3.5).
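The merge loop described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the names `overlap` and `greedy_superstring` are hypothetical, overlaps are computed naively, ties are broken by list order, and strings contained in another input string are dropped up front since they add nothing to the superstring.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for length in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def greedy_superstring(strings: list[str]) -> str:
    """Repeatedly merge the pair with the longest overlap (a heuristic)."""
    # Remove duplicates (order-preserving) and strings inside other strings.
    pool = list(dict.fromkeys(strings))
    pool = [s for s in pool if not any(s != t and s in t for t in pool)]
    while len(pool) > 1:
        best_len, best_i, best_j = -1, 0, 1
        # Compare every ordered pair; this is the O(n^2) step per round.
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j:
                    k = overlap(a, b)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        # Merge the best pair and put the combined string back.
        merged = pool[best_i] + pool[best_j][best_len:]
        pool = [s for idx, s in enumerate(pool) if idx not in (best_i, best_j)]
        pool.append(merged)
    return pool[0] if pool else ""
```

For the fragments `["abc", "bcd", "cde"]`, the first round merges "abc" and "bcd" into "abcd", and the second merges that with "cde" to give "abcde". The result is always a common superstring, but it is not guaranteed to be the shortest one.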

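Time complexity of the greedy heuristic
- We assume n strings, each of length k.
- There are about n rounds, since each merge reduces the number of strings by one.
- Each round makes O($n^2$) string comparisons.
- Each string comparison takes about $k^2$ steps.
- If all pairwise comparisons are recomputed in every round, this gives roughly O($n^3 k^2$) steps in total.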