smallest number of inserts/deletes to turn arg#1 into arg#2

Slides:



Advertisements
Similar presentations
MATH 224 – Discrete Mathematics
Advertisements

Longest Common Subsequence
Chapter 11 Proof by Induction. Induction and Recursion Two sides of the same coin.  Induction usually starts with small things, and then generalizes.
Chapter 3: The Efficiency of Algorithms Invitation to Computer Science, Java Version, Third Edition.
9.7 Special Factors. a. 0 y y y-4y Ans:________________ (y+4)(y-4)
One of the most important problems is Computational Geometry is to find an efficient way to decide, given a subdivision of E n and a point P, in which.
Computability and Complexity 23-1 Computability and Complexity Andrei Bulatov Search and Optimization.
Dynamic Programming Solving Optimization Problems.
DAST, Spring © L. Joskowicz 1 Data Structures – LECTURE 1 Introduction Motivation: algorithms and abstract data types Easy problems, hard problems.
Chapter 3: The Efficiency of Algorithms Invitation to Computer Science, C++ Version, Third Edition Additions by Shannon Steinfadt SP’05.
This material in not in your text (except as exercises) Sequence Comparisons –Problems in molecular biology involve finding the minimum number of edit.
Chapter 3: The Efficiency of Algorithms Invitation to Computer Science, C++ Version, Fourth Edition.
Chapter 3: The Efficiency of Algorithms
2 -1 Chapter 2 The Complexity of Algorithms and the Lower Bounds of Problems.
UNIVERSITY OF SOUTH CAROLINA College of Engineering & Information Technology Bioinformatics Algorithms and Data Structures Chapter 11: Core String Edits.
CSC 3323 Notes – Introduction Algorithm Analysis.
The Complexity of Algorithms and the Lower Bounds of Problems
1 Theory I Algorithm Design and Analysis (11 - Edit distance and approximate string matching) Prof. Dr. Th. Ottmann.
SIGCSE Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan
The Complexity of Optimization Problems. Summary -Complexity of algorithms and problems -Complexity classes: P and NP -Reducibility -Karp reducibility.
Analysis of Algorithms
Chapter 19: Searching and Sorting Algorithms
Complexity of algorithms Algorithms can be classified by the amount of time they need to complete compared to their input size. There is a wide variety:
P, NP, and Exponential Problems Should have had all this in CS 252 – Quick review Many problems have an exponential number of possibilities and we can.
1 Functional Programming Lecture 6 - Algebraic Data Types.
Hoogλe Finding Functions from Types Neil Mitchell haskell.org/hoogle community.haskell.org/~ndm/ [α]  [α]
CISC 235: Topic 1 Complexity of Iterative Algorithms.
A * Search A* (pronounced "A star") is a best first, graph search algorithm that finds the least-cost path from a given initial node to one goal node out.
Search Engines WS 2009 / 2010 Prof. Dr. Hannah Bast Chair of Algorithms and Data Structures Department of Computer Science University of Freiburg Lecture.
Dynamic Programming & Memoization. When to use? Problem has a recursive formulation Solutions are “ordered” –Earlier vs. later recursions.
Functional Programming Lecture 3 - Lists Muffy Calder.
Dynamic Programming for the Edit Distance Problem.
Polymorphic Functions
What is a Parser? A parser is a program that analyses a piece of text to determine its syntactic structure  3 means 23+4.
Functional Programming
MA/CSSE 473 Day 05 Factors and Primes Recursive division algorithm.
Analysis of Algorithms
Functional Programming Lecture 12 - more higher order functions
PROGRAMMING IN HASKELL
Relationship to Other Classes
A lightening tour in 45 minutes
Haskell Chapter 4.
Approximate Algorithms (chap. 35)
PROGRAMMING IN HASKELL
Courtsey & Copyright: DESIGN AND ANALYSIS OF ALGORITHMS Courtsey & Copyright:
Sorting by Tammy Bailey
PROGRAMMING IN HASKELL
Introduction to the Analysis of Complexity
The Complexity of Algorithms and the Lower Bounds of Problems
Haskell Strings and Tuples
Computer Science 312 Haskell Lists 1.
Data Structures ADT List
Chapter 3: The Efficiency of Algorithms
Multiplying a Polynomial by a Monomial
PROGRAMMING IN HASKELL
Dynamic Programming Computation of Edit Distance
Haskell Types, Classes, and Functions, Currying, and Polymorphism
PROGRAMMING IN HASKELL
PROGRAMMING IN HASKELL
CS 457/557: Functional Languages Folds
PROGRAMMING IN HASKELL
Lecture 8. Paradigm #6 Dynamic Programming
CSCE 314: Programming Languages Dr. Dylan Shell
Bioinformatics Algorithms and Data Structures
CSCE 314: Programming Languages Dr. Dylan Shell
Objective Factor polynomials by using the greatest common factor.
Ch Part 1 Multiplying Polynomials
PROGRAMMING IN HASKELL
PROGRAMMING IN HASKELL
Applications of Arrays
Presentation transcript:

smallest number of inserts/deletes to turn arg#1 into arg#2 Edit distance smallest number of inserts/deletes to turn arg#1 into arg#2 dist :: Eq a => [a] -> [a] -> Int 6 4 Main> dist ”abcd” ”xaby” Main> dist ”” ”monkey” Main> dist ”Haskell” ”” Main> dist ”hello” ”hello” 7

Edit distance implementation challenge #0: implement a polynomial time version dist :: Eq a => [a] -> [a] -> Int dist [] ys = length ys dist xs [] = length xs dist (x:xs) (y:ys) | x == y = dist xs ys | otherwise = (1 + dist (x:xs) ys) `min` (1 + dist xs (y:ys)) either insert y or delete x two recursive calls: exponential time

How to test? -- ”Test Oracle” think QuickCheck Formal specification Executable Efficient (polynomial time) comparing against naive dist is no good... challenge #1: find an practical way to test your implementation!

(answer)

upper-bound: easy, lower-bound: hard An efficient dist dynamic programming dist :: Eq a => [a] -> [a] -> Int dist xs ys = head (dists xs ys) dists :: Eq a => [a] -> [a] -> [Int] dists [] ys = [n,n-1..0] where n = length ys dists (x:xs) ys = line x ys (dists xs ys) line :: Eq a => a -> [a] -> [Int] -> [Int] line x [] [d] = [d+1] line x (y:ys) (d:ds) | x == y = head ds : ds' | otherwise = (1+(d`min`head ds')) : ds' where ds' = line x ys ds testing upper-bound: easy, lower-bound: hard

Naive dist dist :: Eq a => [a] -> [a] -> Int dist [] ys = length ys dist xs [] = length xs dist (x:xs) (y:ys) | x == y = dist xs ys | otherwise = (1 + dist (x:xs) ys) `min` (1 + dist xs (y:ys)) base case #1 base case #2 step case #1 step case #2

”Inductive Testing” prop_BaseXs (ys :: String) = dist [] ys == length ys prop_BaseYs (xs :: String) = dist xs [] == length xs prop_StepSame x xs (ys :: String) = dist (x:xs) (x:ys) == dist xs ys prop_StepDiff x y xs (ys :: String) = x /= y ==> dist (x:xs) (y:ys) == (1 + dist (x:xs) ys) `min` (1 + dist xs (y:ys)) specialization

(Alternative) distFix :: Eq a => ([a] -> [a] -> Int) distFix f [] ys = length ys distFix f xs [] = length xs distFix f (x:xs) (y:ys) | x == y = f xs ys | otherwise = (1 + f (x:xs) ys) `min` (1 + f xs (y:ys)) prop_Dist xs (ys :: String) = dist xs ys == distFix dist xs ys no recursion

What is happening? bugs

Applications Search algorithms SAT-solvers other kinds of solvers Optimization algorithms LP-solvers (edit distance) Symbolic algorithms? substitution, unification, anti-unification, ...