Intelligent Backtracking Algorithms Foundations of Constraint Processing CSCE421/821, Fall2016: Berthe Y. Choueiry (Shu-we-ri) Avery Hall, Room 360
Reading Required reading: Hybrid Algorithms for the Constraint Satisfaction Problem [Prosser, CI 93] Recommended reading: Chapters 5 and 6 of Dechter’s textbook; Tsang, Chapter 5
Outline Review of terminology of search Hybrid backtracking algorithms
Backtrack search (BT) Variable/value ordering Variable instantiation (Current) path Current variable Past variables Future variables Shallow/deep levels /nodes Search space / search tree Back-checking Backtracking
Outline Review of terminology of search Hybrid backtracking algorithms Vanilla: BT Improving back steps: {BJ, CBJ} Improving forward step: {BM, FC}
Two main mechanisms in BT Backtracking: To recover from dead-ends To go back Consistency checking: To expand consistent paths To move forward
Backtracking To recover from dead-ends Chronological (BT) Intelligent Backjumping (BJ) Conflict directed backjumping (CBJ) With learning algorithms (Dechter Chapt 6.4) Etc.
Consistency checking To expand consistent paths Back-checking: against past variables Backmarking (BM) Look-ahead: against future variables Forward checking (FC) (partial look-ahead) Directional Arc-Consistency (DAC) (partial look-ahead) Maintaining Arc-Consistency (MAC) (full look-ahead)
Hybrid algorithms Backtracking + checking = new hybrids BT BJ CBJ BM BMJ BM-CBJ FC FC-BJ FC-CBJ Evaluation: Empirical: Prosser 93. 450 instances of Zebra Theoretical: Kondrak & Van Beek 95
Notations (in Prosser’s paper) Variables: Vi, i in [1, n] Domain: Di = {vi1, vi2, …,viMi} Constraint between Vi and Vj: Ci,j Constraint graph: G Arcs of G: Arc(G) Instantiation order (static or dynamic) Language primitives: list, push, pushnew, remove, set-difference, union, max-list
Main data structures v: a (1xn) array to store assignments v[i] gives the value assigned to ith variable v[0]: pseudo variable (root of tree), backtracking to v[0] indicates insolvability domain[i]: a (1xn) array to store the original domains of variables current-domain[i]: a (1xn) array to store the current domains of variables Upon backtracking, current-domain[i] of future variables must be refreshed check(i,j): a function that checks whether the values assigned to v[i] and v[j] are consistent
Generic search: bcssp
Procedure bcssp (n, status)
Begin
  consistent ← true
  status ← unknown
  i ← 1
  While status = unknown Do
  Begin
    If consistent Then i ← label (i, consistent)
    Else i ← unlabel (i, consistent)
    If i > n Then status ← “solution”
    Else If i = 0 Then status ← “impossible”
  End
End
Forward move: x-label. Backward move: x-unlabel. Input: i: current variable, consistent: Boolean. Return: i: new current variable
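The driver above can be sketched in Python as follows. This is a minimal sketch, not Prosser's code: because Python has no by-reference Boolean, label and unlabel are assumed to return the pair (new i, consistent), and the three toy move functions at the end are hypothetical stand-ins used only to exercise the loop.

```python
def bcssp(n, label, unlabel):
    """Alternate forward (label) and backward (unlabel) moves until done.

    label(i) and unlabel(i) each return (new_i, consistent); returning the
    flag replaces the by-reference `consistent` argument of the pseudocode.
    """
    consistent = True
    status = "unknown"
    i = 1
    while status == "unknown":
        if consistent:
            i, consistent = label(i)      # forward move
        else:
            i, consistent = unlabel(i)    # backward move
        if i > n:
            status = "solution"
        elif i == 0:
            status = "impossible"         # backtracked to the pseudo-root v[0]
    return status


# Toy moves (illustrative only, not part of any real algorithm).
def always_ok(i):
    return i + 1, True       # every instantiation succeeds


def never_ok(i):
    return i, False          # every instantiation fails


def step_back(i):
    return i - 1, False      # chronological step back, domain exhausted
```

Plugging in bt-, bj-, cbj-, bm-, or fc- versions of the two moves yields the corresponding algorithm without touching the driver.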
Chronological backtracking (BT) Uses bt-label and bt-unlabel. bt-label: When v[i] is assigned a value from current-domain[i], we perform back-checking against past variables (check(i,k)). If back-checking succeeds, bt-label returns i+1. If back-checking fails, we remove the assigned value from current-domain[i], assign the next value in current-domain[i], etc. If no other value exists, consistent ← nil (bt-unlabel will be called). bt-unlabel: Current level is set to i-1 (notation for current variable: v[h]). For all future variables j: current-domain[j] ← domain[j]. If current-domain[h] is not empty, consistent ← true (bt-label will be called). Note: for all past variables g, current-domain[g] ⊆ domain[g]
BT-label
Function bt-label(i, consistent): INTEGER
BEGIN
  consistent ← false
  For v[i] ← each element of current-domain[i] While not consistent Do
  Begin
    consistent ← true
    For h ← 1 to (i-1) While consistent Do
      consistent ← check(i,h)
    If not consistent Then current-domain[i] ← remove(v[i], current-domain[i])
  End
  If consistent Then return(i+1) Else return(i)
END
Terminates with either consistent = true (returns i+1), or consistent = false and current-domain[i] = nil (returns i)
BT-unlabel
FUNCTION bt-unlabel(i, consistent): INTEGER
BEGIN
  h ← i-1
  current-domain[i] ← domain[i]
  current-domain[h] ← remove(v[h], current-domain[h])
  consistent ← current-domain[h] ≠ nil
  return(h)
END
Is called when consistent = false and current-domain[i] = nil. Selects v[h] to backtrack to (uninstantiates all variables between v[h] and v[i]). Uninstantiates v[h]: removes v[h] from current-domain[h]. Sets consistent to true if current-domain[h] ≠ nil. Returns h
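bt-label and bt-unlabel can be put together into a small runnable solver. The following is a sketch under stated assumptions: check takes the two variable indices and their values explicitly (check(i, a, h, b)), the moves return (new i, consistent) instead of mutating a shared flag, and queens_ok is a hypothetical example constraint (n-queens, one queen per column) that is not part of the lecture.

```python
def bt_solve(n, domain, check):
    """Chronological backtracking over variables 1..n (v[0] is the pseudo-root)."""
    v = [None] * (n + 1)
    current = {i: list(domain[i]) for i in range(1, n + 1)}

    def label(i):
        for value in list(current[i]):       # copy: we remove while iterating
            v[i] = value
            # back-check against past variables, stopping at the first failure
            if all(check(i, v[i], h, v[h]) for h in range(1, i)):
                return i + 1, True
            current[i].remove(value)
        return i, False                       # current-domain[i] is now empty

    def unlabel(i):
        h = i - 1
        current[i] = list(domain[i])          # refresh the abandoned variable
        if h >= 1:
            current[h].remove(v[h])           # uninstantiate v[h]
        return h, h >= 1 and bool(current[h])

    i, consistent = 1, True
    while 1 <= i <= n:
        i, consistent = label(i) if consistent else unlabel(i)
    return v[1:] if i > n else None


def queens_ok(i, a, h, b):
    """Example constraint (assumption): queens in columns i and h, rows a and b."""
    return a != b and abs(a - b) != abs(i - h)
```

With values tried in increasing order, the 4-queens instance returns the first solution in lexicographic order, and 3-queens backtracks to v[0] and reports insolvability as None.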
Example: BT (the dumbest example ever) Variables V1, ..., V5, each with domain {1,2,3,4,5}. Constraints: CV3,V4 = {(V3=1,V4=3)}, CV2,V5 = {(V2=5,V5=1),(V2=5,V5=4)}. [Figure: BT search tree over v[1], ..., v[5], starting with v[1] = 1, v[2] = 1, v[3] = 1, etc.]
Outline Review of terminology of search Hybrid backtracking algorithms Vanilla: BT Improving back steps: BJ, CBJ Improving forward step: BM, FC
Danger of BT: thrashing BT assumes that the instantiation of v[i] was prevented by a bad choice at (i-1). It tries to change the assignment of v[i-1]. When this assumption is wrong, we suffer from thrashing (exploring ‘barren’ parts of solution space). Backjumping (BJ) tries to avoid that: it jumps to the reason of failure, then proceeds as BT
Backjumping (BJ) Tries to reduce thrashing by saving some backtracking effort. When v[i] is instantiated, BJ remembers v[h], the deepest node of past variables that v[i] has checked against. Uses: max-check[i], global, initialized to 0. At level i, each time check(i,h) is performed (whether it succeeds or fails), max-check[i] ← max(max-check[i], h). If current-domain[h] is getting empty, simple chronological backtracking is performed from h. BJ jumps then steps! [Figure: past variables 1, 2, 3, ..., h-1, h and current variable i]
BJ: label/unlabel bj-label: same as bt-label, but updates max-check[i]. bj-unlabel: same as bt-unlabel, but backtracks to h = max-check[i] and resets max-check[j] ← 0 for j in [h+1,i]. Important: max-check is the deepest level we checked against; the check could have been a success or a failure
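A sketch of bj-label and bj-unlabel in Python, following the description above. The explicit check(i, a, h, b) signature, the returned (new i, consistent) pair, and the queens_ok example constraint are assumptions made for this illustration.

```python
def bj_solve(n, domain, check):
    """Backjumping: like BT, but unlabel jumps to max_check[i]."""
    v = [None] * (n + 1)
    current = {i: list(domain[i]) for i in range(1, n + 1)}
    max_check = [0] * (n + 1)                 # deepest level checked against

    def label(i):
        for value in list(current[i]):
            v[i] = value
            consistent = True
            for h in range(1, i):
                consistent = check(i, v[i], h, v[h])
                max_check[i] = max(max_check[i], h)   # success or failure
                if not consistent:
                    break
            if consistent:
                return i + 1, True
            current[i].remove(value)
        return i, False

    def unlabel(i):
        h = max_check[i]                      # jump target
        for j in range(h + 1, i + 1):         # reset everything jumped over
            max_check[j] = 0
            current[j] = list(domain[j])
        if h >= 1:
            current[h].remove(v[h])
        return h, h >= 1 and bool(current[h])

    i, consistent = 1, True
    while 1 <= i <= n:
        i, consistent = label(i) if consistent else unlabel(i)
    return v[1:] if i > n else None


def queens_ok(i, a, h, b):
    """Example constraint (assumption): queens in columns i and h, rows a and b."""
    return a != b and abs(a - b) != abs(i - h)
```

Note that max_check[h] is not reset when the search lands on h. Since h was labeled successfully it holds h-1, which is why a second consecutive backward move from h is only a chronological step.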
Example: BJ v[0] = 0. Variables V1, ..., V5 with domains {1,2,3,4,5}. Constraints: CV2,V4 = {(V2=1,V4=3)}, CV1,V5 = {(V1=1,V5=2)}, CV2,V5 = {(V2=5,V5=1)}. Trace: v[1] = 1, max-check[1] = 0. v[2] = 1, max-check[2] = 1. v[3] = 1. v[4]: V4=1 fails for V2 (mc=2); V4=2 fails for V2 (mc=2); V4=3 succeeds; max-check[4] = 3. v[5]: V5=1 fails for V1 (mc=1); V5=2 fails for V2 (mc=2); V5=3, V5=4, and V5=5 fail for V1; max-check[5] = 2
Conflict-directed backjumping (CBJ) Backjumping jumps from v[i] to v[h], but then it steps back from v[h] to v[h-1]. CBJ improves on BJ: it jumps from v[i] to v[h], and jumps back again, across conflicts involving both v[i] and v[h]. To maintain completeness, we jump back to the level of deepest conflict
CBJ: data structure Maintains a conflict set: conf-set. conf-set[i] are first initialized to {0}. At any point, conf-set[i] is a subset of past variables that are in conflict with i. [Figure: past variables 1, 2, ..., g, ..., h-1, h and current variable i, with conf-set[g] = conf-set[h] = conf-set[i] = {0}]
CBJ: conflict-set When a check(i,h) fails: conf-set[i] ← conf-set[i] ∪ {h}. When current-domain[i] is empty: jumps to the deepest past variable h in conf-set[i], and updates conf-set[h] ← conf-set[h] ∪ (conf-set[i] \ {h}). Primitive form of learning (while searching). [Figure: past variables 1, 2, 3, ..., g, ..., h-1, h and current variable i. With conf-set[g] = {x}, conf-set[h] = {3}, conf-set[i] = {1, g, h}, jumping from i to h gives conf-set[h] = {3, 1, g}; jumping from h to g then gives conf-set[g] = {x, 3, 1}]
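The two conflict-set updates can be sketched as cbj-label and cbj-unlabel in Python. As assumptions for this illustration, conflict sets are Python sets that always contain the pseudo-variable 0, check takes indices and values explicitly, and queens_ok is a hypothetical example constraint.

```python
def cbj_solve(n, domain, check):
    """Conflict-directed backjumping with one conflict set per variable."""
    v = [None] * (n + 1)
    current = {i: list(domain[i]) for i in range(1, n + 1)}
    conf_set = [{0} for _ in range(n + 1)]

    def label(i):
        for value in list(current[i]):
            v[i] = value
            # first past variable that conflicts with v[i] = value, if any
            culprit = next(
                (h for h in range(1, i) if not check(i, value, h, v[h])), None
            )
            if culprit is None:
                return i + 1, True
            conf_set[i].add(culprit)          # conf-set[i] <- conf-set[i] U {h}
            current[i].remove(value)
        return i, False

    def unlabel(i):
        h = max(conf_set[i])                  # deepest conflicting variable
        # conf-set[h] <- conf-set[h] U (conf-set[i] \ {h})
        conf_set[h] = (conf_set[h] | conf_set[i]) - {h}
        for j in range(h + 1, i + 1):         # reset everything jumped over
            conf_set[j] = {0}
            current[j] = list(domain[j])
        if h >= 1:
            current[h].remove(v[h])
        return h, h >= 1 and bool(current[h])

    i, consistent = 1, True
    while 1 <= i <= n:
        i, consistent = label(i) if consistent else unlabel(i)
    return v[1:] if i > n else None


def queens_ok(i, a, h, b):
    """Example constraint (assumption): queens in columns i and h, rows a and b."""
    return a != b and abs(a - b) != abs(i - h)
```

Because conf_set[h] inherits the conflicts of i, a later dead-end at h can jump again instead of stepping, which is the difference from BJ.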
Example CBJ Variables V1, ..., V6 with domains {1,2,3,4,5}. Constraints: {(V1=1, V6=3)}, {(V4=5, V6=3)}, {(V2=1, V4=3), (V2=4, V4=5)}, {(V1=1, V5=3)}. Conflict sets during the trace: conf-set[1] = conf-set[2] = conf-set[3] = {0}; conf-set[4] = {2}; conf-set[5] = {1}; conf-set[6] = {1}, then {1,4}; then conf-set[4] = {1, 2}
CBJ for finding all solutions After finding a solution, if we jump from this last variable, then we may miss some solutions and lose completeness Two solutions, proposed by Chris Thiel (S08) Using conflict sets Using cbf of Kondrak, a clear pseudo-code Rationale by Rahul Purandare (S08) We cannot skip any variable without chronologically backtracking to it at least once In fact, exactly once
CBJ/All solutions without cbf When a solution is found, force the last variable, N, to conflict with everything before it: conf-set[N] ← {1, 2, ..., N-1}. This operation, in turn, forces some chronological backtracking as the conf-sets are propagated backward
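One way to read the idea above is sketched below: a CBJ loop that, on each solution, records it, overwrites conf-set[N] with all earlier levels, discards the value just used for N, and resumes at N. The resumption step, the check(i, a, h, b) signature, and the queens_ok example constraint are assumptions of this sketch and do not come from the lecture.

```python
def cbj_all_solutions(n, domain, check):
    """Enumerate all solutions with CBJ, forcing a full conflict set on success."""
    v = [None] * (n + 1)
    current = {i: list(domain[i]) for i in range(1, n + 1)}
    conf_set = [{0} for _ in range(n + 1)]
    solutions = []

    def label(i):
        for value in list(current[i]):
            v[i] = value
            culprit = next(
                (h for h in range(1, i) if not check(i, value, h, v[h])), None
            )
            if culprit is None:
                return i + 1, True
            conf_set[i].add(culprit)
            current[i].remove(value)
        return i, False

    def unlabel(i):
        h = max(conf_set[i])
        conf_set[h] = (conf_set[h] | conf_set[i]) - {h}
        for j in range(h + 1, i + 1):
            conf_set[j] = {0}
            current[j] = list(domain[j])
        if h >= 1:
            current[h].remove(v[h])
        return h, h >= 1 and bool(current[h])

    i, consistent = 1, True
    while i >= 1:
        i, consistent = label(i) if consistent else unlabel(i)
        if i > n:                               # a solution was just completed
            solutions.append(v[1:])
            conf_set[n] = set(range(n))         # {0, 1, ..., N-1}: forces a step back
            current[n].remove(v[n])             # assumption: try N's other values next
            i, consistent = n, bool(current[n])
    return solutions


def queens_ok(i, a, h, b):
    """Example constraint (assumption): queens in columns i and h, rows a and b."""
    return a != b and abs(a - b) != abs(i - h)
```

The full conflict set travels one level up at each dead-end, so every variable on the solution path is backtracked to chronologically before any jump can skip it.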
CBJ/All solutions with cbf Kondrak proposed to fix the problem using cbf (flag), a 1xn vector. Initially, ∀i, cbf[i] ← 0. When you find a solution, ∀i, cbf[i] ← 1. In unlabel: If (cbf[i] = 1) Then h ← i-1; cbf[i] ← 0 Else h ← max-list(conf-set[i])
Backtracking: summary Chronological backtracking Steps back to previous level No extra data structures required Backjumping Jumps to deepest checked-against variable, then steps back Uses array of integers: max-check[i] Conflict-directed backjumping Jumps across deepest conflicting variables Uses array of sets: conf-set[i]
Outline Review of terminology of search Hybrid backtracking algorithms Vanilla: BT Improving back steps: BJ, CBJ Improving forward step: BM, FC
Backmarking: goal Tries to reduce the amount of consistency checking. Situation: v[i] is about to be re-assigned k; v[i] ← k was already checked against v[h] ← g; v[h] has not been modified since
BM: motivation Two situations: either (v[i]=k, v[h]=g) has failed, so it will fail again; or (v[i]=k, v[h]=g) was found consistent, so it will remain consistent. In either case, back-checking effort against v[h] can be saved!
Data structures for BM: 2 arrays Maximum checking level: mcl (n x m). Minimum backup level: mbl (n x 1). n: number of variables; m: max domain size
Maximum checking level mcl[i,k] stores the deepest variable that v[i] ← k checked against. mcl[i,k] is a finer version of max-check[i]. mcl is an n x m array (n: number of variables, m: max domain size)
Minimum backup level mbl[i] gives the shallowest past variable whose value has changed since v[i] was the current variable. mbl is an n x 1 array (n: number of variables). BM (and all its hybrids) do not allow dynamic variable ordering
When mcl[i,k] = mbl[i] = j BM is aware that: the deepest variable that (v[i] ← k) checked against is v[j]; values of variables in the past of v[j] (h < j) have not changed. So: we do need to check (v[i] ← k) against the values of the variables between v[j] and v[i]; we do not need to check (v[i] ← k) against the values of the variables in the past of v[j]
Type a savings When mcl[i,k] < mbl[i], do not check v[i] ← k because it will fail. [Figure: v[h], v[j], v[i] ← k with mcl[i,k] = h < mbl[i] = j]
Type b savings When mcl[i,k] ≥ mbl[i], do not check (i, h < j) because they will succeed. [Figure: v[h], v[j], v[g], v[i] ← k with mcl[i,k] = g ≥ mbl[i] = j]
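The two savings can be sketched as bm-label and bm-unlabel in Python. As assumptions of this illustration: mcl is a dictionary keyed by (variable, value) rather than an n x m array, both mcl and mbl start at 0, the checking loop starts at max(mbl[i], 1) so that the pseudo-variable v[0] is never checked, and queens_ok is a hypothetical example constraint.

```python
def bm_solve(n, domain, check):
    """Backmarking: chronological BT that skips redundant back-checks."""
    v = [None] * (n + 1)
    current = {i: list(domain[i]) for i in range(1, n + 1)}
    mcl = {(i, k): 0 for i in range(1, n + 1) for k in domain[i]}
    mbl = [0] * (n + 1)

    def label(i):
        for k in list(current[i]):
            v[i] = k
            # type (a): k failed against an unchanged variable, so it fails again
            consistent = mcl[i, k] >= mbl[i]
            if consistent:
                # type (b): levels below mbl[i] are unchanged and already passed
                for h in range(max(mbl[i], 1), i):
                    consistent = check(i, k, h, v[h])
                    mcl[i, k] = h
                    if not consistent:
                        break
            if consistent:
                return i + 1, True
            current[i].remove(k)
        return i, False

    def unlabel(i):
        h = i - 1
        mbl[i] = h                            # v[h] is about to change
        for j in range(h + 1, n + 1):
            mbl[j] = min(mbl[j], h)
        current[i] = list(domain[i])
        if h >= 1:
            current[h].remove(v[h])
        return h, h >= 1 and bool(current[h])

    i, consistent = 1, True
    while 1 <= i <= n:
        i, consistent = label(i) if consistent else unlabel(i)
    return v[1:] if i > n else None


def queens_ok(i, a, h, b):
    """Example constraint (assumption): queens in columns i and h, rows a and b."""
    return a != b and abs(a - b) != abs(i - h)
```

BM visits exactly the same nodes as BT; only the number of calls to check changes, which is why it returns the same first solution.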
Hybrids of BM mcl can be used to allow backjumping in BM. Mixing BJ & BM yields BMJ, which avoids redundant consistency checking (types a+b savings) and reduces the number of nodes visited during search (by jumping). Mixing BM & CBJ yields BM-CBJ
Problem of BM and its hybrids: warning BMJ enjoys only some of the advantages of BM. Assume: mbl[h] = m and max-check[i] = max(mcl[i,x]) = g. Backjumping from v[i]: v[i] backjumps up to v[g]. Backmarking of v[h]: when reconsidering v[h], v[h] will be checked against all f ∈ [m,g), an effort that could be saved. Phenomenon will worsen with CBJ. Problem fixed by Kondrak & van Beek 95. [Figure: variables v[m], v[f], v[g], v[h], v[i] along the search path]
Forward checking (FC) Looking ahead: from the current variable, consider all future variables and clear from their domains the values that are not consistent with the current partial solution. FC does more work at every instantiation, but will expand fewer nodes. When FC moves forward, the values in current-domain of future variables are all compatible with past assignments, thus saving back-checking. FC may “wipe out” the domain of a future variable (aka, domain annihilation) and thus discover conflicts early on. FC then backtracks chronologically. Goal of FC is to fail early (avoid expanding fruitless subtrees)
FC: data structures When v[i] is instantiated, current-domain[j] are filtered for all j connected to i with i < j ≤ n. reductions[j] stores the sets of values removed from current-domain[j] by some variable before v[j], e.g., reductions[j] = {{a, b}, {c, d, e}, {f, g, h}}. future-fc[i]: subset of the future variables that v[i] checks against (redundant), e.g., future-fc[i] = {k, j, n}. past-fc[i]: past variables that checked against v[i]. All these sets are treated like stacks. [Figure: v[i] and the future variables v[k], v[l], v[m], v[j], v[n]]
Forward Checking: functions check-forward undo-reductions update-current-domain fc-label fc-unlabel
FC: functions check-forward(i,j) is called when instantiating v[i] It performs Revise(j,i) Returns false if current-domain[j] is empty, true otherwise Values removed from current-domain[j] are pushed, as a set, into reductions[j] These values will be popped back if we have to backtrack over v[i] (undo-reductions)
FC: functions update-current-domain: current-domain[i] ← domain[i] \ reductions[i] (actually, we have to iterate over reductions, which is a set of sets). fc-label: attempts to instantiate the current variable, then filters the domains of all future variables (push into reductions). Whenever the current-domain of a future variable is wiped out, v[i] is un-instantiated and domain filtering is undone (pop reductions)
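The functions above fit together roughly as in the following Python sketch. It makes several assumptions: every pair of variables is treated as connected, values are hashable, undo-reductions rebuilds a future domain from domain and reductions (to keep the original value order) instead of taking a union, past-fc is omitted because this plain FC sketch never reads it, and queens_ok is a hypothetical example constraint.

```python
def fc_solve(n, domain, check):
    """Forward checking with chronological backtracking."""
    v = [None] * (n + 1)
    current = {i: list(domain[i]) for i in range(1, n + 1)}
    reductions = {i: [] for i in range(1, n + 1)}   # stack of removed-value lists
    future_fc = {i: [] for i in range(1, n + 1)}    # variables filtered by v[i]

    def update_current_domain(j):
        removed = {b for reduction in reductions[j] for b in reduction}
        current[j] = [b for b in domain[j] if b not in removed]

    def check_forward(i, j):
        removed = [b for b in current[j] if not check(i, v[i], j, b)]
        if removed:
            current[j] = [b for b in current[j] if b not in removed]
            reductions[j].append(removed)           # push
            future_fc[i].append(j)
        return bool(current[j])                     # False on domain wipe-out

    def undo_reductions(i):
        for j in future_fc[i]:
            reductions[j].pop()                     # pop what v[i] removed
            update_current_domain(j)
        future_fc[i] = []

    def label(i):
        for value in list(current[i]):
            v[i] = value
            if all(check_forward(i, j) for j in range(i + 1, n + 1)):
                return i + 1, True
            current[i].remove(value)
            undo_reductions(i)
        return i, False

    def unlabel(i):
        h = i - 1
        if h >= 1:
            undo_reductions(h)
        update_current_domain(i)
        if h >= 1:
            current[h].remove(v[h])
        return h, h >= 1 and bool(current[h])

    i, consistent = 1, True
    while 1 <= i <= n:
        i, consistent = label(i) if consistent else unlabel(i)
    return v[1:] if i > n else None


def queens_ok(i, a, h, b):
    """Example constraint (assumption): queens in columns i and h, rows a and b."""
    return a != b and abs(a - b) != abs(i - h)
```

No back-checking appears in label: every value left in current-domain[i] is already compatible with the past assignments, so all the work goes into filtering the future.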
Hybrids of FC FC suffers from thrashing: it is based on BT FC-BJ: max-check is integrated in fc-bj-label and fc-bj-unlabel Enjoys advantages of FC and BJ… but suffers malady of BJ (first jumps, then steps back) FC-CBJ: Best algorithm so far fc-cbj-label and fc-cbj-unlabel
Consistency checking: summary Chronological backtracking Uses back-checking No extra data structures Backmarking Uses mcl and mbl Two types of consistency-checking savings Forward-checking Works more at every instantiation, but expands fewer subtrees Uses: reductions[i], future-fc[i], past-fc[i]
Empirical evaluations on Zebra Representative of design/scheduling problems. 25 variables, 122 binary constraints. Permutation of variable ordering yields new search spaces. Variable ordering: different bandwidth/induced width of graph. 450 problem instances were generated. Each algorithm was applied to each instance. Experiments were carried out under static variable ordering
Analysis of experiments Algorithms compared with respect to: Number of consistency checks (average) FC-CBJ ≼ FC-BJ ≼BM-CBJ ≼ FC ≼ CBJ ≼ BMJ ≼ BM ≼ BJ ≼ BT Number of nodes visited (average) FC-CBJ ≼ FC-BJ ≼ FC ≼ BM-CBJ ≼ BMJ=BJ ≼ BM=BT CPU time (average) FC-CBJ ≼ FC-BJ ≼ FC ≼ BM-CBJ ≼ CBJ ≼ BMJ ≼ BJ ≼ BT ≼ BM FC-CBJ apparently the champion
Additional developments Other backtracking algorithms exist: Graph-based backjumping (GBJ), etc. [Dechter] Pseudo-trees [Freuder 85] Other look-ahead techniques exist DAC, MAC, etc. More empirical evaluations over randomly generated problems Theoretical comparisons [Kondrak & van Beek IJCAI’95]
Implementing BT-based algorithms Preprocessing Enforce NC, do not include in #CC (e.g., Zebra) Normalize all constraints (fapp01-0200-0) Check for empty relations (bqwh-15-106-0_ext) Interrupt as soon as you detect domain wipe out Dynamic variable ordering Apply domino effect