Approximating Optimal Binary Decision Trees Brent Heeringa (joint work with Micah Adler) 18 November 2005.



Question:
- I am thinking of a computer scientist. Which one?
- Rule: Ask YES/NO questions from a finite set Q

Q1: MIT Professor? YES
Q2: Author of a popular CS text? YES
Q3: Inventor of RSA? NO
(Figure: the questions form a binary tree with Q1 at the root and Q2 below its YES branch; each question has a YES branch and a NO branch.)

Decision Tree Problem (DT)
- Input: A set $X = (x_1, \ldots, x_n)$ of binary strings (called items)
  - Each item has exactly $m$ bits
  - E.g. if $m = 5$ then each $x_i$ is a 5-bit string
- Solution: A binary tree with $n$ leaves
  - Each internal node indexes some bit $k$
    - This partitions the items into two groups (bit $k$ is 0 or bit $k$ is 1)
  - Each item is a leaf ($n$ leaves total)
- Cost: Total sum of leaf depths
- Optimal Solution: DT with minimum cost
(Figure: an internal node labeled k with outgoing branches 0 and 1.)

Example (tree figure): Cost = 9

Example (tree figure): Cost = 8, OPTIMAL!

Alternative tree (figure): Cost = 8
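
The cost definition above can be checked on tiny inputs by exhaustive search. The following is a minimal sketch, not part of the talk: the function name `optimal_cost` and the two example item sets are hypothetical (the items drawn on the slides are not preserved), and items are assumed to be distinct, equal-length strings of '0' and '1'. It uses the fact that the total leaf depth of a tree equals the number of items at the root plus the costs of the two subtrees.

```python
from functools import lru_cache


def optimal_cost(items):
    """Minimum total leaf depth over all decision trees for the given items.

    Assumes the items are distinct bit strings of equal length.
    Exponential time in general; intended only for very small examples.
    """
    m = len(items[0])

    @lru_cache(maxsize=None)
    def best(group):
        # A single item is a leaf; it adds no further depth.
        if len(group) <= 1:
            return 0
        result = None
        for k in range(m):
            zeros = frozenset(x for x in group if x[k] == "0")
            ones = group - zeros
            if not zeros or not ones:
                continue  # bit k does not separate this group
            # Every item in the group sits one level below this node.
            cost = len(group) + best(zeros) + best(ones)
            if result is None or cost < result:
                result = cost
        return result

    return best(frozenset(items))


# Hypothetical item sets chosen for illustration.
print(optimal_cost(["00", "01", "10", "11"]))      # 8
print(optimal_cost(["000", "001", "010", "100"]))  # 9
```

In the second item set every bit isolates exactly one item, so no tree can do better than a path-like shape with depths 1, 2, 3, 3.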

Decision Trees
- Decision Trees (DT) model many natural tasks in
  - Medical Diagnosis
  - Experiment Design
- DT is the 20-questions problem
- DT is NP-Complete
  - Reduction from Set Cover (Exact Cover by 3 Sets)
    - [Hyafil and Rivest]

Outline
- Problem Introduction
- A Greedy Approximation Algorithm for DT
- An Analysis of the Greedy Algorithm
  - (ln n + 1)-approximation
- Other Results and Open Problems

A Greedy DT Algorithm
IDEA: Always choose the bit which most evenly partitions the items

A Greedy DT Algorithm
IDEA: Always choose the bit which most evenly partitions the items

GREEDY-DT(X)
  If X = Ø
    Return NIL
  If |X| = 1
    Return a leaf holding the single item of X
  k ← index of the bit most evenly separating X
  T ← new tree node testing bit k
  T[left] ← GREEDY-DT({x ∈ X | x(k) = 0})
  T[right] ← GREEDY-DT({x ∈ X | x(k) = 1})
  Return T
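
The greedy procedure can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the tuple representation `(k, left, right)`, the helper `tree_cost`, and the example items are assumptions, and items are assumed to be distinct, equal-length strings of '0' and '1'. "Most evenly separating" is interpreted as maximizing the size of the smaller side of the split, with ties broken by the lowest bit index.

```python
def greedy_dt(items):
    """Build a decision tree greedily.

    Returns None for no items, the item itself for a single item (a leaf),
    and otherwise a tuple (k, left, right) where bit k is tested at the node.
    """
    if not items:
        return None
    if len(items) == 1:
        return items[0]

    m = len(items[0])

    def balance(k):
        # Size of the smaller side when splitting on bit k.
        ones = sum(x[k] == "1" for x in items)
        return min(ones, len(items) - ones)

    # Distinct items guarantee that some bit has balance >= 1.
    k = max(range(m), key=balance)
    left = [x for x in items if x[k] == "0"]
    right = [x for x in items if x[k] == "1"]
    return (k, greedy_dt(left), greedy_dt(right))


def tree_cost(tree, depth=0):
    """Total sum of leaf depths of a tree built by greedy_dt."""
    if tree is None:
        return 0
    if isinstance(tree, str):
        return depth
    _, left, right = tree
    return tree_cost(left, depth + 1) + tree_cost(right, depth + 1)


# Hypothetical items chosen for illustration.
tree = greedy_dt(["00", "01", "10", "11"])
print(tree, tree_cost(tree))
```

A single-item base case is needed so the recursion stops at a leaf; without it, a one-item set would be passed unchanged to one of the recursive calls.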

Optimal vs. Greedy
(Figure: the optimal tree T* and the greedy tree T over the items a through h.)
Cost(T*) = 25
Cost(T) = 26

Outline
- Problem Introduction
- A Greedy Approximation Algorithm for DT
- An Analysis of the Greedy Algorithm
  - (ln n + 1)-approximation
- Other Results and Open Problems

Approximation Algorithm Review
- Minimization Problem
- $C$ = cost given by approximation algorithm
- $C_{opt}$ = cost of optimal solution
- $\alpha$-approximation: $C \le \alpha \cdot C_{opt}$
  - $\alpha$ may be a function of the input size $n$

Analysis Outline
- Accounting Scheme
  - Each pair of items $\{x_i, x_j\}$ is separated exactly once in any decision tree
    - True for Greedy and Optimal
  - Distribute cost of the Greedy tree among item pairs
- Analyze cost of greedy tree w.r.t. structure of optimal tree
Theorem: The greedy algorithm yields a tree whose cost is at most (ln n + 1) times the cost of the optimal tree

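Definitions and Notation
- Consider each pair of items $\{x_i, x_j\}$
- $S_{ij}$: the node of the greedy tree $T$ that separates $x_i$ from $x_j$
- $S_{ij}$ also denotes the set of items below that node
- $S_{ij}^+$ and $S_{ij}^-$: the two child sets of $S_{ij}$, named so that $|S_{ij}^+| \ge |S_{ij}^-|$
- $|S_{ij}| = |S_{ij}^+| + |S_{ij}^-|$
(Figure: greedy tree T with node S_ij, its children S_ij^- and S_ij^+, and the items x_i and x_j.)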

Accounting Method
- Assign cost $c_{ij}$ to each pair of items $\{x_i, x_j\}$
  - Distribute $|S_{ij}|$ equally among the $|S_{ij}^+|\,|S_{ij}^-|$ pairs of items split at $S_{ij}$
- $c_{ij} = \dfrac{|S_{ij}|}{|S_{ij}^+|\,|S_{ij}^-|}$
- Example: $S_{ij} = \{a,b,c,d,e,f\}$ is split into $\{a,b,c,d\}$ and $\{e,f\}$
  - $|S_{ij}| = 6$, $|S_{ij}^+| = 4$, $|S_{ij}^-| = 2$
  - $c_{af} = c_{ce} = c_{ij} = 6/8 = 3/4$
(Figure: greedy tree T with node S_ij of size 6, children of sizes 4 and 2, and the items x_i and x_j.)
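
The accounting scheme can be checked numerically: if every internal node of size $|S|$ spreads $|S|$ evenly over the pairs it separates, the pair costs should add up to the total leaf depth. The sketch below is an illustration under assumptions, not the authors' code: `pair_costs` and the six example strings are hypothetical, items are assumed distinct, and the greedy rule is taken to maximize the smaller side of each split. The example strings are chosen so that the root split is 4 versus 2, matching the sizes in the slide's example.

```python
from fractions import Fraction
from itertools import product


def pair_costs(items):
    """Run the greedy splits and return (tree_cost, costs).

    costs maps frozenset({x_i, x_j}) to c_ij = |S| / (|S+| * |S-|), where S
    is the node at which the greedy tree separates x_i from x_j.
    """
    costs = {}
    total = 0
    stack = [list(items)]
    while stack:
        group = stack.pop()
        if len(group) < 2:
            continue  # leaves separate nothing

        def balance(k, group=group):
            count = sum(x[k] == "1" for x in group)
            return min(count, len(group) - count)

        k = max(range(len(group[0])), key=balance)
        zeros = [x for x in group if x[k] == "0"]
        ones = [x for x in group if x[k] == "1"]

        # Each item below this node gains one unit of depth here.
        total += len(group)
        share = Fraction(len(group), len(zeros) * len(ones))
        for a, b in product(zeros, ones):
            costs[frozenset((a, b))] = share
        stack += [zeros, ones]
    return total, costs


# Hypothetical items: the root separates four strings from two.
items = ["0000", "0001", "0010", "0011", "1000", "1100"]
total, costs = pair_costs(items)
print(total, sum(costs.values()))
print(costs[frozenset(("0000", "1000"))])
```

Exact fractions are used so that the equality between the summed pair costs and the tree cost is not blurred by floating-point rounding.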

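Greedy Tree Cost
Cost of Greedy Tree T: $\mathrm{Cost}(T) = \sum_{\{x_i, x_j\}} c_{ij}$
Free to order pair costs in any way we like!
(Figure: greedy tree T with node S_ij, its children S_ij^- and S_ij^+, and the items x_i and x_j.)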

Reorder $c_{ij}$ according to $T^*$
Free to order pair costs in any way we like!
(Figure: optimal tree T* with node Z, its children Z^- and Z^+, and the items x_i and x_j.)

A Lemma
Lemma: For any node Z in T*
(Figure: optimal tree T* with node Z, its children Z^- and Z^+, and the items x_i and x_j.)

Proof of the Theorem
Lemma: For any node Z in T*
Steps: (lemma) (|Z| ≤ n) (Def of tree cost) (CLRS)
(Figure: optimal tree T* with node Z, its children Z^- and Z^+, and the items x_i and x_j.)
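
The step labels on this slide (lemma, |Z| ≤ n, definition of tree cost, CLRS) are consistent with the chain of inequalities below. This is a reconstruction under assumptions, not the slide's own formula: it assumes the lemma bounds the pair costs regrouped at each internal node $Z$ of $T^*$ by $|Z|\,H(|Z|)$, where $H(k) = 1 + 1/2 + \cdots + 1/k$ is the $k$-th harmonic number, and it uses the fact that the total leaf depth of $T^*$ equals the sum of $|Z|$ over its internal nodes.

```latex
\begin{align*}
\mathrm{Cost}(T)
  &= \sum_{Z \in T^*} \; \sum_{x_i \in Z^+} \; \sum_{x_j \in Z^-} c_{ij}
     && \text{(each pair is split once in } T^*\text{)} \\
  &\le \sum_{Z \in T^*} |Z| \, H(|Z|)
     && \text{(lemma, assumed form)} \\
  &\le H(n) \sum_{Z \in T^*} |Z|
     && (|Z| \le n) \\
  &= H(n) \, \mathrm{Cost}(T^*)
     && \text{(def. of tree cost)} \\
  &\le (\ln n + 1) \, \mathrm{Cost}(T^*)
     && \text{(harmonic number bound, CLRS)}
\end{align*}
```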

Proving the Lemma
Lemma: For any node Z in T*
Goal: Relate pair cost (defined w.r.t. greedy tree) to the optimal tree
Claim 1:
(Figure: greedy tree T with node S_ij containing x_i and x_j; its two parts are labeled S_ij ∩ Z^- and S_ij ∩ Z^+.)
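
One inequality that fits the figure labels and the stated goal is sketched below. It is a hedged reconstruction, not the slide's own statement of Claim 1. It takes $c_{ij} = |S_{ij}| / (|S_{ij}^+|\,|S_{ij}^-|)$ from the accounting rule, and it assumes that "most evenly" means the greedy split of $S_{ij}$ maximizes the product of the two side sizes. The sets $A$ and $B$ are introduced here for illustration: they are the two parts of $S_{ij}$ obtained by applying the bit tested at $Z$, so $A \supseteq S_{ij} \cap Z^+$ and $B \supseteq S_{ij} \cap Z^-$, and both are non-empty because $x_i \in Z^+$ and $x_j \in Z^-$.

```latex
\begin{align*}
c_{ij}
  &= \frac{|S_{ij}|}{|S_{ij}^+|\,|S_{ij}^-|}
   = \frac{1}{|S_{ij}^+|} + \frac{1}{|S_{ij}^-|} \\
  &\le \frac{|S_{ij}|}{|A|\,|B|}
   = \frac{1}{|A|} + \frac{1}{|B|}
     && \text{(greedy split is at least as even as the split by } Z\text{'s bit)} \\
  &\le \frac{1}{|S_{ij} \cap Z^+|} + \frac{1}{|S_{ij} \cap Z^-|}
     && (A \supseteq S_{ij} \cap Z^+,\; B \supseteq S_{ij} \cap Z^-)
\end{align*}
```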

Claim 2
Claim 2: For any $Z$ in $T^*$, for any $x_i$ in $Z^+$:
(claim 1)

Proof of Claim 2
Claim 2: For any $Z$ in $T^*$, for any $x_i$ in $Z^+$:
- $Z^- = \{a, b, c, d, e, f\}$
- Order $Z^-$ from 1 to 6 according to when $x_j$ is split from $x_i$
- When the $t$-th item is split from $x_i$, $|S_{ij} \cap Z^-| \ge 6 - t + 1$
(Figure: the path to x_i in the greedy tree T through nodes S_i1, ..., S_i5; a and b are split off first, then c, then d and e, then f. Node labels: |S_i2| ≥ 6, |S_i3| ≥ 4, |S_i4| ≥ 3, |S_i4| ≥ 1.)
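
The ordering argument on this slide leads naturally to a harmonic-number bound. The block below is a sketch of that bound under the assumption that Claim 2 concerns the sum of $1/|S_{ij} \cap Z^-|$ over $x_j \in Z^-$; the slide's own formula is not preserved, and $H(k) = 1 + 1/2 + \cdots + 1/k$ is notation introduced here. The last line plugs in the node sizes from the six-item example.

```latex
\begin{align*}
\sum_{x_j \in Z^-} \frac{1}{|S_{ij} \cap Z^-|}
  &\le \sum_{t=1}^{|Z^-|} \frac{1}{|Z^-| - t + 1}
   = H(|Z^-|) \\[4pt]
\text{Example } (|Z^-| = 6):\quad
\frac{1}{6} + \frac{1}{6} + \frac{1}{4} + \frac{1}{3} + \frac{1}{3} + \frac{1}{1}
  &= \frac{9}{4}
  \;\le\; H(6) = \frac{49}{20}
\end{align*}
```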

Wrapping up the Proof:
Lemma: For any node Z in T*
Claim 2: For any $Z$ in $T^*$, for any $x_i$ in $Z^+$:
Claim 2 (symmetric case): For any $Z$ in $T^*$, for any $x_j$ in $Z^-$:
Steps: (claim 1) (claim 2) QED
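
The two step labels (claim 1, claim 2) suggest how the pieces combine. The derivation below is a reconstruction under stated assumptions rather than the slide's formula: it assumes Claim 1 has the form $c_{ij} \le 1/|S_{ij} \cap Z^+| + 1/|S_{ij} \cap Z^-|$, that Claim 2 bounds $\sum_{x_j \in Z^-} 1/|S_{ij} \cap Z^-|$ by $H(|Z^-|)$ for each fixed $x_i \in Z^+$ (and symmetrically with the roles of $Z^+$ and $Z^-$ exchanged), and that $H(k)$ denotes the $k$-th harmonic number.

```latex
\begin{align*}
\sum_{x_i \in Z^+} \sum_{x_j \in Z^-} c_{ij}
  &\le \sum_{x_i \in Z^+} \sum_{x_j \in Z^-}
       \left( \frac{1}{|S_{ij} \cap Z^+|} + \frac{1}{|S_{ij} \cap Z^-|} \right)
     && \text{(claim 1)} \\
  &\le |Z^-| \, H(|Z^+|) + |Z^+| \, H(|Z^-|)
     && \text{(claim 2, both directions)} \\
  &\le \left( |Z^+| + |Z^-| \right) H(|Z|)
   = |Z| \, H(|Z|)
     && (H \text{ is nondecreasing})
\end{align*}
```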

Outline
- Problem Introduction
- A Greedy Approximation Algorithm for DT
- An Analysis of the Greedy Algorithm
  - (ln n + 1)-approximation
- Other Results and Open Problems

DT has no PTAS unless P=NP
- MAX3SAT5 [Feige]:
  - 3CNF; each variable appears in exactly 5 clauses
  - Thm: There exists a universal constant $\epsilon > 0$ such that it is NP-hard to distinguish 3SAT5 formulas that are satisfiable from those in which at most $(1-\epsilon)|C|$ clauses can be simultaneously satisfied.
- Gap-preserving reduction from MAX3SAT5 to DT
  - Via set cover

DT has no PTAS unless P=NP All clauses satisfied: Cost:

DT has no PTAS unless P=NP
At most $(1-\epsilon)|C|$ clauses satisfied: Cost:

The ConDT Problem:
- Input: A set $X = (x_1, \ldots, x_n)$ of $m$-bit binary strings (called items)
  - Each item $x_i$ has a label TRUE or FALSE
- Solution: A binary tree
  - Each internal node is a bit $k$; each leaf is a label
  - The tree correctly labels each item (consistent)
- Cost: Total number of leaves
- Optimal Solution: Consistent decision tree with minimum number of leaves
- Not possible to approximate size-$s$ DTs with size-$s^k$ DTs (for any constant $k$) unless NP is in DTIME[$2^{m^{\epsilon}}$] for some $\epsilon < 1$
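
The notions of consistency and cost in ConDT can be made concrete with a small checker. This is a minimal sketch under assumptions: the tree encoding (a `bool` for a leaf, a tuple `(k, zero_child, one_child)` for an internal node), the function names, and the labeled example (the OR of two bits) are all hypothetical and chosen only for illustration.

```python
def classify(tree, item):
    """Follow the bit tests from the root until a leaf label is reached."""
    while not isinstance(tree, bool):
        k, zero_child, one_child = tree
        tree = one_child if item[k] == "1" else zero_child
    return tree


def num_leaves(tree):
    """ConDT cost: the total number of leaves."""
    if isinstance(tree, bool):
        return 1
    _, zero_child, one_child = tree
    return num_leaves(zero_child) + num_leaves(one_child)


def is_consistent(tree, labeled_items):
    """True if the tree assigns every item its given label."""
    return all(classify(tree, x) == label for x, label in labeled_items.items())


# Hypothetical instance: label is TRUE when at least one bit is 1.
labeled = {"00": False, "01": True, "10": True, "11": True}
tree = (0, (1, False, True), True)  # test bit 0, then bit 1 if bit 0 is 0
print(is_consistent(tree, labeled), num_leaves(tree))
```

Unlike DT, several items may share a leaf here, which is why the cost counts leaves rather than summing leaf depths.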

Open Problems
- Gap in approximation ratios between lower and upper bounds
  - Techniques from ConDT don't work
- Items with weights
  - Tests with weights
  - Minimize:

Fin