CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Polynomial time Fall 2008

Efficient algorithms
The running time of an algorithm depends on the input. For longer inputs, we allow more time: efficiency is measured as a function of input size.
[Diagram: the efficient languages form a subclass of the decidable languages, in contrast with undecidable problems such as A_TM and PCP]

Running time
The running time of TM M is the function t_M(n):
t_M(n) = maximum number of steps that M takes on any input of length n

Example: L = {w$w : w ∈ {a, b}*}
M: On input x, until you reach $:
  Read and cross off the first a or b before $
  Read and cross off the first a or b after $
  If there is a mismatch, reject
If all symbols but $ are crossed off, accept
The loop runs O(n) times and each pass takes O(n) steps, so the running time is O(n^2).
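To see the same procedure outside the Turing machine setting, here is a minimal sketch that mirrors the crossing-off passes on a character array. The class name MatchCheck and the 'X' cross-off marker are illustrative choices, not part of the slides; each pass scans the tape once and there are O(n) passes, matching the O(n^2) count above.

    // Sketch of the single-tape crossing-off procedure for L = {w$w : w in {a,b}*}.
    // 'X' plays the role of a crossed-off tape symbol.
    public class MatchCheck {
        static boolean accepts(String input) {
            char[] tape = input.toCharArray();
            int dollar = input.indexOf('$');
            if (dollar < 0) return false;                        // no separator: reject
            while (true) {
                int i = 0;                                       // first uncrossed symbol before $
                while (i < dollar && tape[i] == 'X') i++;
                int j = dollar + 1;                              // first uncrossed symbol after $
                while (j < tape.length && tape[j] == 'X') j++;
                if (i == dollar && j == tape.length) return true;   // everything crossed off: accept
                if (i == dollar || j == tape.length) return false;  // one side ran out: reject
                if (tape[i] != tape[j]) return false;               // mismatch: reject
                tape[i] = 'X';                                      // cross off the matching pair
                tape[j] = 'X';
            }
        }

        public static void main(String[] args) {
            System.out.println(accepts("ab$ab"));   // true
            System.out.println(accepts("ab$ba"));   // false
        }
    }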

Running time
L = {$0^n 1^n : n ≥ 0}
M: On input x,
  Check that the input is of the form $0*1*
  Until everything is crossed off:
    Move the head left to $
    Cross off the leftmost 0
    Cross off the following 1
  If everything is crossed off, accept.
The loop runs O(n) times and each pass takes O(n) steps, so the running time is O(n^2).

A faster way
L = {$0^n 1^n : n ≥ 0}
M: On input x,
  Check that the input is of the form $0*1* (O(n) steps)
  Until everything is crossed off:
    Move the head left to $
    Find the parities of the number of remaining 0s and 1s
    If one is even and the other odd, reject
    Otherwise, cross off every other 0 and every other 1
  If everything is crossed off, accept.
Each pass takes O(n) steps and the loop runs only O(log n) times, so the running time is O(n log n).
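A minimal sketch of the idea, written RAM-style rather than as a TM (the class name FastZeroOne is an illustrative choice, not from the slides): counting the symbols directly and then halving the counts mirrors "cross off every other 0 and every other 1", and the parity comparison in each round is exactly the reject test above.

    // RAM-style sketch of the O(n log n) method for L = {$0^n 1^n : n >= 0}.
    public class FastZeroOne {
        static boolean accepts(String x) {
            if (x.isEmpty() || x.charAt(0) != '$') return false;
            int i = 1;
            while (i < x.length() && x.charAt(i) == '0') i++;              // count the 0s
            int zeros = i - 1, ones = 0;
            while (i < x.length() && x.charAt(i) == '1') { i++; ones++; }  // count the 1s
            if (i != x.length()) return false;                             // not of the form $0*1*
            while (zeros > 0 || ones > 0) {
                if (zeros % 2 != ones % 2) return false;   // one even, the other odd: reject
                zeros /= 2;                                // crossing off every other 0
                ones /= 2;                                 // crossing off every other 1
            }
            return true;                                   // everything crossed off: accept
        }

        public static void main(String[] args) {
            System.out.println(accepts("$000111"));  // true
            System.out.println(accepts("$00111"));   // false
        }
    }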

Running time vs. model
What if we have a two-tape Turing machine?
L = {$0^n 1^n : n ≥ 0}
M: On input x,
  Check that the input is of the form $0*1*
  Copy the 0* part of the input onto the second tape
  Until ☐ is reached:
    Cross off the next 1 from the first tape and the next 0 from the second tape
  If both tapes reach ☐ at the same time, accept
Each phase takes O(n) steps, so the running time is O(n).

Running time vs. model
How about a java program?
L = {$0^n 1^n : n ≥ 0}

M(string x) {
  n = x.len;
  if (x[1] != '$' || n % 2 == 0) reject;
  for (i = 1; i <= (n-1)/2; i++) {
    if (x[i+1] != '0') reject;
    if (x[n-i+1] != '1') reject;
  }
  accept;
}

running time: O(n)

1-tape TM: O(n log n)    2-tape TM: O(n)    java: O(n)
The running time can change depending on the model of computation!

Measuring running time
What does it mean when we say: "This algorithm runs in 1000 steps"?
One "step" means different things in different models:
  java: if (x > 0) y = 5*y + x;
  RAM machine: write r3;
  1-tape TM: δ(q3, a) = (q7, b, R)

Efficiency and the Church-Turing thesis
The Church-Turing thesis says all these models (java, RAM machine, Turing machine, multitape TM, UNIVAC) are equivalent in power...
... but not in running time!

The Cobham-Edmonds thesis
However, there is an extension of the Church-Turing thesis that says:
For any realistic models of computation M1 and M2, M1 can be simulated on M2 with at most polynomial slowdown.
So any task that takes time T on M1 can be done in time (say) T^2 or T^3 on M2.

Efficient simulation
The running time of a program depends on the model of computation (from fast to slow: java, RAM machine, multitape TM, ordinary TM)...
... but in the grand scheme, this is irrelevant:
Every reasonable model of computation can be simulated efficiently on every other.

Example of efficient simulation
Recall simulating multiple tapes on a single tape: the multitape TM M has tape alphabet Σ = {0, 1, ☐}, while the single-tape TM S uses an extended alphabet Σ' containing 0, 1, ☐, the separator #, and "dotted" copies of 0, 1, ☐ that mark the head positions.
[Diagram: the contents of M's tapes are written side by side on S's single tape, separated by #, with dotted symbols recording where each of M's heads is]

Running time of simulation
Each move of the multiple-tape TM might require traversing the whole single tape.
One step of the 3-tape TM takes O(s) steps of the single-tape TM, where s = the rightmost cell ever visited; after t steps, s ≤ 3t.
So t steps of the 3-tape TM take O(ts) = O(t^2) single-tape steps: simulating a multi-tape TM on a single-tape TM incurs a quadratic slowdown.
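Spelling the bound out as a back-of-the-envelope calculation (the symbols s_i, the used portion of the single tape after i simulated steps, and the constant c hidden in the O(s) above are introduced here only for illustration):

    \text{time}(S) \;\le\; \sum_{i=1}^{t} c \cdot s_i \;\le\; \sum_{i=1}^{t} c \cdot 3t \;=\; 3c\,t^{2} \;=\; O(t^{2})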

Simulation slowdown
Cobham-Edmonds Thesis: M1 can be simulated on M2 with at most polynomial slowdown.
[Diagram: a chain of models (single-tape TM, multi-tape TM, RAM machine, java) in which each model simulates its neighbours with slowdown O(t) or O(t^2) per t steps]

Examples of running time (on a RAM)
n = input size

Problems with efficient algorithms:
  0^n1^n: O(n)
  parsing: LR(1) O(n), CYK O(n^2)
  short paths: Dijkstra, O(n log n)
  matching: Edmonds, O(n^3)

Problems with no known efficient algorithm:
  routing: 2^O(n)
  scheduling: 2^O(n)
  theorem proving: 2^O(n log n)

Input representation
Since we measure efficiency in terms of input size, how the input is represented will make a difference. For us, any "reasonable" representation will be okay:
  The number 17: written in base two (10001) OK; written in unary NO
  This graph: as an adjacency matrix (0000, 0010, 0001, ...) OK; as a list of vertices and edges ({1,2,3,4}, {2,3}, {3,4}) OK
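A tiny illustration of why the representation matters, assuming (as reconstructed above) that the "NO" example is the unary encoding; the class name Encodings is an illustrative choice, and String.repeat requires Java 11+. The binary encoding of a number n has about log2(n) symbols, while the unary encoding has n symbols, an exponential gap in input length.

    // Compare the lengths of the binary and unary encodings of 17.
    public class Encodings {
        public static void main(String[] args) {
            int n = 17;
            String binary = Integer.toBinaryString(n);   // "10001", 5 symbols
            String unary = "1".repeat(n);                // 17 symbols (Java 11+)
            System.out.println(binary + " has length " + binary.length());
            System.out.println("unary encoding has length " + unary.length());
        }
    }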

Nondeterminism and the CE thesis
The Cobham-Edmonds Thesis says: any two realistic models of computation can simulate one another with polynomial slowdown.
But is nondeterministic computation realistic?

Running time of nondeterministic TM
What about nondeterministic TMs?
For ordinary TMs, the running time of M on input x is the number of transitions M makes before it halts.
But a nondeterministic TM can run for a different number of steps on different "computation paths".

Example
Definition of running time for a nondeterministic TM:
  computation path: any possible sequence of transitions
  running time = max length of any computation path
[Diagram: a nondeterministic TM with states q0, q1, q_acc, q_rej and transitions labelled 0/0R and 1/1R; the question "what is the running time?" has answer 5, the length of the longest computation path]

Simulation of nondeterministic TM
A nondeterministic TM N can be simulated by a multi-tape TM M with three tapes: an input tape holding x, a simulation tape z, and an address tape a. The string a represents the possible choices made at each step, so each a describes one possible computation path.
M: For all k > 0:
  For all possible strings a of length k:
    Copy x to z. Simulate N on input z using a as the choices.
    If a specifies an invalid choice, or the simulation loops or rejects, abandon this simulation.
    If N enters its accept state, accept and halt.
  If N rejected on all a's of length k, reject and halt.

Simulation slowdown for nondeterminism
If the running time of N is t, the simulation from the previous slide halts by the time k reaches t.
running time of simulation = (running time for a specific a) × (number of a's of length ≤ t)
                           = O(t) × 2^O(t) = 2^O(t)

Simulation slowdown
[Diagram: the deterministic models (single-tape TM, multi-tape TM, RAM machine, java) simulate one another with slowdown O(t) or O(t^2), but the only known simulation of a nondeterministic TM on them has slowdown 2^O(t)]
Do nondeterministic TMs violate the Cobham-Edmonds thesis?

Example
Recall the scheduling problem: can you schedule final exams (CSC 3230, CSC 2110, CSC 3160, CSC 3130) so that there are no conflicts?
Exams → vertices; slots (Y, R, B) → colors; conflicts → edges.
Scheduling with nondeterminism:

schedule(int n, Edges edges) {
  for i := 1 to n:
    choose { c[i] := Y; } or { c[i] := R; } or { c[i] := B; }
  for all e in edges:
    if c[e.left] == c[e.right] reject;
  accept;
}

Example
In reality, programming languages don't allow us to choose: we have to tell the computer how to make these choices. Nondeterminism does not seem like a realistic feature of a programming language or computer...
... but if we had it, we could schedule in linear time with the procedure from the previous slide!
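A sketch of what happens when we spell the choices out ourselves: each "choose" becomes a loop over the three colors, with backtracking. The class name Schedule and the edge/color representation are illustrative choices, not the slides' code. In the worst case this explores up to 3^n colorings, i.e. 2^O(n) time, instead of the linear time of the nondeterministic procedure.

    import java.util.*;

    // Deterministic replacement for the nondeterministic schedule(): try all
    // three colors for each exam, then verify, backtracking on failure.
    public class Schedule {
        static int[][] edges;   // each edge is a pair {u, v} of exam indices
        static int[] c;         // c[i] = color of exam i (0, 1, or 2; -1 = unset)

        static boolean schedule(int i, int n) {
            if (i == n) return conflictFree();       // all exams assigned: verify
            for (int color = 0; color < 3; color++) {
                c[i] = color;                        // "choose" replaced by a loop
                if (schedule(i + 1, n)) return true; // recurse on the next exam
            }
            c[i] = -1;
            return false;                            // no color worked: backtrack
        }

        static boolean conflictFree() {
            for (int[] e : edges)
                if (c[e[0]] == c[e[1]]) return false; // same slot on a conflict edge
            return true;
        }

        public static void main(String[] args) {
            int n = 4;                                // 4 exams
            edges = new int[][]{{0,1},{1,2},{2,3},{3,0},{0,2}};
            c = new int[n];
            Arrays.fill(c, -1);
            System.out.println(schedule(0, n));       // true: a conflict-free schedule exists
        }
    }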

Nondeterministic simulation
Simulating a nondeterministic TM on a multi-tape TM costs a 2^O(t) slowdown. Is this the best we can do?
If we could do better, this would improve all known combinatorial optimization algorithms!

Millennium prize problems
Recall how in 1900, Hilbert gave 23 problems that guided mathematics in the 20th century.
In 2000, the Clay Mathematics Institute gave 7 problems for the 21st century, each worth $1,000,000:
  1. P versus NP (computer science)
  2. The Hodge conjecture
  3. The Poincaré conjecture (Perelman 2006, refused the money)
  4. The Riemann hypothesis (Hilbert's 8th problem)
  5. Yang–Mills existence and mass gap
  6. Navier–Stokes existence and smoothness
  7. The Birch and Swinnerton-Dyer conjecture

The P versus NP question
Can a nondeterministic TM be simulated on an ordinary TM with polynomial slowdown, i.e. in time poly(t)?
Among other things, this asks:
  – Is nondeterminism a realistic feature of computation?
  – Can the choose construct be efficiently implemented?
  – Can we efficiently optimize any "well-posed" problem?
Most people think not, but nobody knows for sure!

The class P
P is the class of all languages that can be decided on an ordinary TM whose running time is some polynomial in the length of the input.
By the CE thesis, we can replace "ordinary TM" by any realistic model of computation: multi-tape TM, java, RAM.
[Diagram: regular ⊆ context-free ⊆ efficient (P) ⊆ decidable]

Examples of languages in P
  L01 = {$0^n1^n : n > 0}: decided in time O(n)
  LG = {x : x is generated by G} (G is some CFG): CYK algorithm, O(n^2)
  PATH = {(G, a, b, L): G is a graph with a path of length L from a to b}: Dijkstra, O(n log n)
  MATCH = {G: G is a graph with a "perfect" matching}: Edmonds, O(n^3)
(n = input size)
[Diagram: L01, LG, PATH, MATCH and the context-free languages all lie inside P (efficient), which lies inside the decidable languages]
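As one concrete data point that PATH is in P, here is a sketch of a polynomial-time decider. It is not the slides' algorithm: it uses breadth-first search instead of Dijkstra (the graph here is unweighted) and decides the "path of length at most L" reading of PATH; the class name Path and the adjacency-list representation are illustrative choices.

    import java.util.*;

    // BFS-based decider: is there a path of length at most L from a to b?
    public class Path {
        static boolean path(List<List<Integer>> adj, int a, int b, int L) {
            int n = adj.size();
            int[] dist = new int[n];
            Arrays.fill(dist, -1);                    // -1 means "not reached yet"
            Deque<Integer> queue = new ArrayDeque<>();
            dist[a] = 0;
            queue.add(a);
            while (!queue.isEmpty()) {                // standard breadth-first search
                int u = queue.poll();
                for (int v : adj.get(u)) {
                    if (dist[v] == -1) {
                        dist[v] = dist[u] + 1;
                        queue.add(v);
                    }
                }
            }
            return dist[b] != -1 && dist[b] <= L;     // b reachable within L edges?
        }

        public static void main(String[] args) {
            // 4 vertices with edges 0-1, 1-2, 2-3
            List<List<Integer>> adj = new ArrayList<>();
            for (int i = 0; i < 4; i++) adj.add(new ArrayList<>());
            int[][] edges = {{0,1},{1,2},{2,3}};
            for (int[] e : edges) { adj.get(e[0]).add(e[1]); adj.get(e[1]).add(e[0]); }
            System.out.println(path(adj, 0, 3, 3));   // true: 0-1-2-3 has length 3
            System.out.println(path(adj, 0, 3, 2));   // false
        }
    }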

Languages believed to be outside P
  ROUTE (routing): best known algorithm runs in time 2^O(n)
  SCHED (scheduling): best known algorithm runs in time 2^O(n)
  PROVE (theorem proving): best known algorithm runs in time 2^O(n log n)
We do not know if these problems have faster algorithms, but we suspect not.
To explain why, first we need to understand what these languages have in common.
[Diagram: ROUTE, SCHED and PROVE sit outside P (efficient) but inside the decidable languages, marked with a "?"]

More problems
[Figure: a graph G on vertices 1, 2, 3, 4]
A clique is a subset of vertices that are all interconnected: {1, 4}, {2, 3, 4}, {1} are cliques.
An independent set is a subset of vertices no two of which are connected: {1, 2}, {1, 3}, {4} are independent sets; there is no independent set of size 3.
A vertex cover is a set of vertices that touches (covers) all edges: {2, 4}, {3, 4}, {1, 2, 3} are vertex covers.

Boolean formula satisfiability
A boolean formula is an expression made up of variables, ands, ors, and negations, like
  f = (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1)
The formula is satisfiable if one can assign values to the variables so that the expression evaluates to true.
The formula above is satisfiable because the assignment x1 = F, x2 = F, x3 = T, x4 = T makes it true.

Status of these problems
  CLIQUE = {(G, k): G is a graph with a clique of k vertices}
  IS = {(G, k): G is a graph with an independent set of k vertices}
  VC = {(G, k): G is a graph with a vertex cover of k vertices}
  SAT = {f: f is a satisfiable Boolean formula}
For each of these problems, the best known algorithm runs in time 2^O(n).
What do these problems have in common?

Checking solutions efficiently
We don't know how to solve these problems efficiently.
But if someone told us the solution, we would be able to check it very quickly.
Example: Is (G, 5) in CLIQUE? Proposed solution: the vertices 1, 5, 9, 12, 14.
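Here is a sketch of that checking step: given a graph and a proposed clique of k vertices, verify in polynomial time that every pair of proposed vertices is connected. The class name CliqueCheck and the small example graph are made up for illustration; they are not the graph from the slide.

    // Verify a proposed clique: O(k^2) adjacency tests.
    public class CliqueCheck {
        static boolean isClique(boolean[][] adj, int[] candidate, int k) {
            if (candidate.length != k) return false;           // wrong number of vertices
            for (int i = 0; i < k; i++)
                for (int j = i + 1; j < k; j++)
                    if (!adj[candidate[i]][candidate[j]])      // a missing edge
                        return false;
            return true;                                       // all pairs connected
        }

        public static void main(String[] args) {
            // a small graph on 4 vertices: edges 0-1, 0-2, 1-2, 2-3
            boolean[][] adj = new boolean[4][4];
            int[][] edges = {{0,1},{0,2},{1,2},{2,3}};
            for (int[] e : edges) { adj[e[0]][e[1]] = true; adj[e[1]][e[0]] = true; }
            System.out.println(isClique(adj, new int[]{0,1,2}, 3));  // true
            System.out.println(isClique(adj, new int[]{0,1,3}, 3));  // false: 0-3 missing
        }
    }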

Cliques via nondeterminism
Checking solutions efficiently is the same as designing efficient nondeterministic algorithms.
Example: Is (G, k) in CLIQUE?

clique(Graph G, int k) {
  choose C := list of k vertices
  for i in C:
    for j in C:
      if i != j and G.is_edge(i,j) == false reject;
  accept;
}

Example: Formula satisfiability
  f = (x1 ∨ ¬x2) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1)
Nondeterministic algorithm:

sat(Formula f) {
  x = new bool[f.n];
  for i := 1 to n:
    choose { x[i] := true; } or { x[i] := false; }
  if f.eval(x) == true accept;
  else reject;
}

Checking a solution: substitute x1 = F, x2 = F, x3 = T, x4 = T and evaluate:
  f = (F ∨ T) ∧ (F ∨ T ∨ F) ∧ (T) = T
This can be done in linear time.
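A small sketch of the checking step for the formula as reconstructed on this slide (the class name SatCheck and the hard-coded formula are illustrative): plugging in an assignment and evaluating takes time linear in the length of the formula.

    // Evaluate f = (x1 or not x2) and (x2 or x3 or not x4) and (not x1)
    // under a given assignment.
    public class SatCheck {
        static boolean eval(boolean x1, boolean x2, boolean x3, boolean x4) {
            return (x1 || !x2) && (x2 || x3 || !x4) && (!x1);
        }

        public static void main(String[] args) {
            // the satisfying assignment from the slide: x1=F, x2=F, x3=T, x4=T
            System.out.println(eval(false, false, true, true));   // true
            System.out.println(eval(true, true, true, true));     // false: the last clause fails
        }
    }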

The class NP
NP is the class of all languages that can be decided on a nondeterministic TM whose running time is some polynomial in the length of the input.
Equivalently: L can be solved on a nondeterministic TM in polynomial time iff its solutions can be checked in time polynomial in the input length.

P versus NP
P is contained in NP, because an ordinary TM is just a special case of a nondeterministic one.
Conceptually, finding solutions can only be harder than checking them.
[Diagram: P (efficient) inside NP (efficiently checkable) inside the decidable languages; LG, PATH and MATCH lie in P, while CLIQUE, SAT, IS and VC lie in NP]

P versus NP
Is P equal to NP? ($1,000,000)
The answer to this question is not known. But one reason it is believed to be negative is that, intuitively, searching is harder than verifying.
For example, solving homework problems (searching for solutions) is harder than grading (verifying that a solution is correct).

Searching versus verifying
  Mathematician: Given a mathematical claim, come up with a proof for it.
  Scientist: Given a collection of data on some phenomenon, find a theory explaining it.
  Engineer: Given a set of constraints (on cost, physical laws, etc.), come up with a design (of an engine, bridge, etc.) which meets them.
  Detective: Given the crime scene, find "who's done it".