Construction and Analysis of Efficient Algorithms Introduction Autumn 2017, Juris Vīksna
Timetable
Regular lecture times:
  Wednesdays 12.30 - 14.00, auditorium 413
  Fridays 12.30 - 14.00, auditorium 413
The lectures on the following dates will be rescheduled: 13.09., 15.09.
Some other changes are possible (but hopefully not too many).
Replacement lectures will most likely be scheduled starting from the second half of October.
Outline
  Motivation and some examples
  Subjects covered in the course
  Requirements
  Textbooks
  Other practical information
Motivation
Is there a sufficiently fast algorithm for a given problem?
Problem 1a: For a given pair of vertices in a connected undirected graph, find the length of the shortest path between these vertices.
Problem 1b: For a given pair of vertices in a connected undirected graph, find the length of the longest non-intersecting (simple) path between these vertices.
Problem 1a
For a given pair of vertices in a connected undirected graph, find the length of the shortest path between these vertices.
There is an algorithm that solves this problem in time less than const · (|V| + |E|).
Problem 1a
[Figure: step-by-step illustration on an example graph, with frames labelled d = 0, 1, 2, 3, 4 and d = 5 (OK).]
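A linear-time bound of this kind is achieved, for example, by breadth-first search. The sketch below is a minimal illustration, assuming the graph is given as a dictionary of adjacency lists; the function name `shortest_path_length` and the example graph are hypothetical and do not come from the slides.

```python
from collections import deque


def shortest_path_length(adj, source, target):
    """Return the number of edges on a shortest path, or None if unreachable."""
    dist = {source: 0}          # distance d at which each vertex was first reached
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:   # every vertex enters the queue at most once
                dist[v] = dist[u] + 1
                queue.append(v)
    return None


# Hypothetical example graph (adjacency lists of an undirected graph).
example_graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
```

Each vertex is queued once and each adjacency list is scanned once, which gives the const · (|V| + |E|) running time.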
Problem 1b
For a given pair of vertices in a connected undirected graph, find the length of the longest non-intersecting (simple) path between these vertices.
No algorithm is known that solves this problem in polynomial time. (And it is widely believed that no such algorithm exists.)
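For contrast with Problem 1a, the only obvious approach here is exhaustive search. The sketch below simply tries every simple path, so its running time can grow exponentially with the size of the graph; the function name `longest_simple_path_length` and the adjacency-list representation are assumptions made for illustration.

```python
def longest_simple_path_length(adj, source, target):
    """Length (in edges) of the longest simple path, or -1 if there is none."""
    best = -1
    visited = {source}

    def extend(u, length):
        nonlocal best
        if u == target:
            best = max(best, length)   # a path ends once it reaches the target
            return
        for v in adj[u]:
            if v not in visited:
                visited.add(v)
                extend(v, length + 1)
                visited.remove(v)      # backtrack and try other continuations

    extend(source, 0)
    return best
```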
Problem 1c
For a given pair of vertices in a connected undirected graph, find a non-intersecting (simple) path between these vertices consisting of an even number of edges.
Problem 2a
For a given undirected graph, find an Euler cycle (if one exists). An Euler cycle is defined as a closed path that contains every edge exactly once.
There is an algorithm that solves this problem in time less than const · (|V| + |E|).
Eulerian path (cycle) problem
For a given graph find a path (cycle) that visits every edge exactly once (or show that such a path does not exist).
A connected graph has an Eulerian cycle if and only if each of its vertices has even degree. Moreover, there is a simple linear time algorithm for finding an Eulerian cycle.
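One well-known linear-time method is Hierholzer's algorithm; the slides do not say which algorithm is meant, so the sketch below is only one possible illustration. It assumes the graph is given as a list of undirected edges, and the function name `euler_cycle` is hypothetical.

```python
def euler_cycle(edges):
    """Return the vertex sequence of an Euler cycle, or None if none exists."""
    if not edges:
        return []
    adj = {}
    for idx, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, idx))
        adj.setdefault(v, []).append((u, idx))
    # Necessary condition: every vertex has even degree.
    if any(len(nbrs) % 2 for nbrs in adj.values()):
        return None
    used = [False] * len(edges)
    stack = [edges[0][0]]
    cycle = []
    while stack:
        u = stack[-1]
        # Discard edges that were already traversed from the other endpoint.
        while adj[u] and used[adj[u][-1][1]]:
            adj[u].pop()
        if adj[u]:
            v, idx = adj[u].pop()
            used[idx] = True
            stack.append(v)
        else:
            cycle.append(stack.pop())
    # If some edge was never reached, the edges are not all connected.
    if len(cycle) != len(edges) + 1:
        return None
    return cycle
```

Every adjacency entry is removed at most once, so the work is proportional to |V| + |E|.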
Hamiltonian path (cycle) problem
For a given graph find a path (cycle) that visits every vertex exactly once (or show that such a path does not exist).
Problem 2b
For a given undirected graph, find a Hamiltonian cycle (if one exists). A Hamiltonian cycle is defined as a closed path that contains every vertex exactly once.
No algorithm is known that solves this problem in polynomial time. (And it is widely believed that no such algorithm exists.)
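The difference from the Euler cycle case can be seen in code: without a simple criterion like the even-degree test, a straightforward solution has to fall back on backtracking over possible vertex orders. The sketch below is such an exhaustive search (exponential time in the worst case); the function name `hamiltonian_cycle` and the adjacency-list input are assumptions made for illustration.

```python
def hamiltonian_cycle(adj):
    """Return a Hamiltonian cycle as a vertex list, or None if none is found."""
    vertices = list(adj)
    n = len(vertices)
    if n < 3:                  # a cycle in a simple graph needs at least 3 vertices
        return None
    start = vertices[0]
    path = [start]
    on_path = {start}

    def extend():
        if len(path) == n:
            # All vertices are used; the cycle closes if an edge leads back to start.
            return start in adj[path[-1]]
        for v in adj[path[-1]]:
            if v not in on_path:
                path.append(v)
                on_path.add(v)
                if extend():
                    return True
                path.pop()     # backtrack
                on_path.remove(v)
        return False

    return path + [start] if extend() else None
```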
Genome sequence assembly
Sequence assembly problem
Ok, let us assume that we have these hybridizations. How can we reconstruct the initial DNA sequence from them?
[Figure: Affymetrix GeneChip]
W. Bains, C. Smith (1988) A novel method for nucleic acid sequence determination. Journal of Theoretical Biology, Vol. 135:3, 303-307.
SBH – Hamiltonian path approach
SBH – Eulerian path approach
An Eulerian path approach to SBH problem PNAS (Proceedings of the National Academy of Sciences), 2001
SBH and de Bruijn graphs (~ 1950 :)
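Only the slide titles survive here, so the following is an assumption about the construction being illustrated: in the usual Eulerian formulation, each k-mer from the hybridization data becomes a directed edge from its (k-1)-letter prefix to its (k-1)-letter suffix, and a path that uses every edge once spells a candidate sequence. The function name `reconstruct_from_kmers` is hypothetical, the input is assumed to list each k-mer once per occurrence, and k is assumed to be at least 2.

```python
from collections import defaultdict


def reconstruct_from_kmers(kmers):
    """Spell one sequence whose k-mers are exactly `kmers`, or return None."""
    if not kmers:
        return ""
    out_edges = defaultdict(list)
    balance = defaultdict(int)              # out-degree minus in-degree
    for kmer in kmers:
        prefix, suffix = kmer[:-1], kmer[1:]
        out_edges[prefix].append(suffix)    # one directed edge per k-mer
        balance[prefix] += 1
        balance[suffix] -= 1
    starts = [v for v, b in balance.items() if b == 1]
    ends = [v for v, b in balance.items() if b == -1]
    if any(abs(b) > 1 for b in balance.values()) or len(starts) > 1 or len(ends) > 1:
        return None                         # degree condition for an Eulerian path fails
    start = starts[0] if starts else kmers[0][:-1]
    stack, walk = [start], []
    while stack:                            # Hierholzer-style traversal
        u = stack[-1]
        if out_edges[u]:
            stack.append(out_edges[u].pop())
        else:
            walk.append(stack.pop())
    walk.reverse()
    if len(walk) != len(kmers) + 1:
        return None                         # some k-mers could not be reached
    return walk[0] + "".join(v[-1] for v in walk[1:])
```

The result need not be unique: when the graph has several Eulerian paths, different sequences share the same set of k-mers, and this sketch returns just one of them.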
Using de Bruijn graphs for NGS data assembly
Motivation
How can we compute the running time of a given algorithm?
Does the given algorithm work correctly?
Problem 3: Given two strings P and T over the same alphabet Σ, determine whether P occurs as a substring in T.
Problem 3
SimpleMatcher(string P, string T)
  n ← length[T]
  m ← length[P]
  for s ← 0 to n − m do
    if P[1...m] = T[s+1 ... s+m] then
      print s
Time = const · n · m
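As a rough runnable counterpart of SimpleMatcher, the sketch below uses 0-based Python indexing and returns the list of shifts instead of printing them; the name `simple_matcher` is only an illustrative choice.

```python
def simple_matcher(pattern, text):
    """Return all shifts s such that pattern occurs in text starting at s."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):
        # Comparing a window of m characters costs up to m steps,
        # which gives the const · n · m bound overall.
        if text[s:s + m] == pattern:
            shifts.append(s)
    return shifts
```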
Problem 3
KnuthMorrisPrattMatcher(string P, string T)
  n ← length[T]
  m ← length[P]
  π ← PrefixFunction(P)
  q ← 0
  for i ← 1 to n do
    while q > 0 & P[q+1] ≠ T[i] do
      q ← π[q]
    if P[q+1] = T[i] then
      q ← q + 1
    if q = m then
      print i − m
      q ← π[q]
Problem 3
PrefixFunction(string P)
  m ← length[P]
  π[1] ← 0
  k ← 0
  for q ← 2 to m do
    while k > 0 & P[k+1] ≠ P[q] do
      k ← π[k]
    if P[k+1] = P[q] then
      k ← k + 1
    π[q] ← k
  return π
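The two procedures can be sketched in Python as follows. This is an illustrative translation with 0-based indexing (so `pi[q]` here corresponds to π[q+1] in the pseudocode), the pattern is assumed to be non-empty, and the names `prefix_function` and `kmp_matcher` are chosen for this example.

```python
def prefix_function(pattern):
    """pi[q] = length of the longest proper prefix of pattern[:q+1]
    that is also a suffix of it."""
    m = len(pattern)
    pi = [0] * m
    k = 0
    for q in range(1, m):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(pattern, text):
    """Return all shifts at which pattern occurs in text."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")  # assumption of this sketch
    pi = prefix_function(pattern)
    shifts = []
    q = 0                                  # number of pattern characters matched
    for i, ch in enumerate(text):
        while q > 0 and pattern[q] != ch:
            q = pi[q - 1]                  # fall back to a shorter matched prefix
        if pattern[q] == ch:
            q += 1
        if q == m:
            shifts.append(i - m + 1)
            q = pi[q - 1]                  # keep looking for overlapping matches
    return shifts
```

For the pattern P = gaga the prefix function is [0, 0, 1, 2]: after matching "gag" the prefix "g" is also a suffix, and after "gaga" the prefix "ga" is.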
Problem 3
Running time of KnuthMorrisPrattMatcher:
  T(n,m) = T_P(m) + n · T_While(m)
  T_P(m) = const · m
  T(n,m) ≤ const · n · m
It can be shown that T(n,m) = const · (n + m)
Knuth-Morris-Pratt Algorithm - Idea
T = gadjama gramma berida
P = gaga
[Figure: diagram labelled with the letters g, a, j, m, r, b, e, i, d]
Subjects covered in the course
  Models of computation and complexity of algorithms
  Complexity analysis
  Data structures
  Sorting algorithms
  Algorithms on graphs
  String searching algorithms
  Dynamic programming
  Computational geometry
  Algorithms for arithmetical problems
  NP-completeness
Requirements
Two short-term homeworks (20% each)
  Most likely will be given in October-November-December
  Strict (or almost strict) one-week deadline
Programming assignment (20%)
  Problem to be announced in the second half of October
  No deadline, but it must be submitted before the exam
Exam (40%)
To qualify for grade 10 you may be asked to cope with some additional question(s)/problem(s)
Academic honesty
You are expected to submit only your own work!
Sanctions:
  Receiving a zero on the assignment (under no circumstances will a resubmission be allowed)
  No admission to the exam and no grade for the course
Textbooks
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms. The MIT Press, 2009 (3rd ed., 1st ed. 1990)
Harry R. Lewis, Larry Denenberg. Data Structures and their Algorithms. Harper Collins Publishers, 1991
Steven S. Skiena. The Algorithm Design Manual. Springer Verlag, 2008 (2nd ed., 1st ed. 1997)
Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley Publishing Company, 1976
David Harel. Algorithmics: The Spirit of Computing. Addison-Wesley Publishing Company, 2004 (3rd ed.)
Dan Gusfield. Algorithms on Strings, Trees and Sequences. Cambridge University Press, 1997
Michael R. Garey, David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979
Web page
http://susurs.mii.lu.lv/juris/courses/alg2017.html
It is expected to contain:
  short summaries of lectures
  announcements
  PowerPoint presentations
  homework and programming assignment problems
  frequently asked questions (???)
  your grades
  other useful information
Course information is also available at https://estudijas.lu.lv
Contact information Juris Vīksna Room 421, Rainis boulevard 29 phone: 67213716