1
Constraint Satisfaction Problems
2/10 Constraint Satisfaction Problems
2
Search when states are factored
Until now, we assumed states are black boxes. We will now assume that states are made up of “state variables” and their “values”.
Two interesting problem classes:
-- CSP & SAT (Constraint Satisfaction Problems)
-- Planning
4
Constraint Satisfaction Problems (a brief animated overview)
[Figure: animated overview. A coloring problem over X, Y, Z with values red, green, blue (problem statement) is represented as a CSP with variables, values and constraints, drawn as a constraint graph. The CSP algorithm combines search (backtracking, variable/value heuristics) with inference (consistency enforcement, forward checking) to produce a solution, e.g. X: red, Y: blue, Z: green.]
Source: Sqalli, Tutorial on Constraint Satisfaction Problems, December 2, 1998
6
Example: N-queen problem
Variables: one queen per column
Values: the N rows that the queen can be in
Constraints: no pair of queens in the same row, column or diagonal
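The formulation above can be sketched in a few lines of Python. This is only an illustration: the names `conflicts` and `is_solution` are made up here, and rows are numbered from 0. Because there is one queen per column, the column constraint holds by construction and only rows and diagonals need checking.

```python
from itertools import combinations

def conflicts(rows):
    """Count attacking pairs; rows[c] is the row of the queen in column c."""
    count = 0
    for c1, c2 in combinations(range(len(rows)), 2):
        same_row = rows[c1] == rows[c2]
        same_diagonal = abs(rows[c1] - rows[c2]) == abs(c1 - c2)
        if same_row or same_diagonal:
            count += 1
    return count

def is_solution(rows):
    """A complete assignment is a solution when no pair of queens attacks."""
    return conflicts(rows) == 0

print(is_solution([1, 3, 0, 2]))  # a 4-queens assignment with no attacking pair
```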
7
Constraint Graphs will be hyper-graphs for non-binary CSPs
8
“Real world” CSP problems..
Most assignment problems, including:
-- Time-tabling. Variables: courses; values: rooms, times
-- Jobshop scheduling. Variables: jobs; values: machines
-- Sudoku; KenKen
-- Crossword puzzles
-- Boolean satisfiability
9
Complexity of CSP
Boolean satisfiability is a special case of the discrete-variable CSP problem, so CSP is NP-hard.
Specific types of CSP may be tractable. E.g., if all the variables are boolean and all the constraints are binary, you have 2-SAT, which is tractable.
The topology of the “constraint graph” also affects the complexity of the CSP problem. E.g., if the constraint graph is a chain graph or a multi-tree, we can solve it in polynomial time.
11
How about: Breadth-first search? IDDFS? All solutions are at depth d!
13
Review of CSP/SAT concepts
Constraint Satisfaction Problem (CSP)
Given:
-- A set of variables (normally discrete, but can be continuous)
-- Legal domains for each of the variables
-- A set of constraints on the values that groups of variables can take. Constraints can be “unary”, “binary” or “multi-ary” based on how many variables they connect.
Find an assignment of values to all the variables so that none of the constraints are violated.
SAT problem = CSP with boolean variables

Example
Domains: x, y, u, v: {A,B,C,D,E}; w: {D,E}; l: {A,B}
Constraints: x=A ⇒ w≠E; y=B ⇒ u≠D; u=C ⇒ l≠A; v=D ⇒ l≠B
A solution: x=B, y=C, u=D, v=E, w=D, l=B

Search trace (each node adds one assignment to the previous node):
N1: {x=A}
N2: {x=A & y=B}
N3: {x=A & y=B & v=D}
N4: {x=A & y=B & v=D & u=C}
N5: {x=A & y=B & v=D & u=C & w=E}
N6: {x=A & y=B & v=D & u=C & w=D}

Presenter notes: It may be worth putting this in the middle of the tutorial. Ask for a show of hands as to how many know CSPs. If they don’t, keep the CSP in a different window and show it to them. It may be worth putting some numbers from Bacchus’s slides.
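A minimal sketch of chronological backtracking on this example, assuming the four constraints are implications of the form "x=A implies w is not E" (and likewise for the other three). The names `DOMAINS`, `CONSTRAINTS`, `consistent` and `backtrack` are illustrative, and values are tried in alphabetical order, which need not match the order used in the trace.

```python
DOMAINS = {
    "x": "ABCDE", "y": "ABCDE", "u": "ABCDE", "v": "ABCDE",
    "w": "DE", "l": "AB",
}

# Each tuple (v1, a, v2, b) is read as: v1 == a implies v2 != b (assumed reading).
CONSTRAINTS = [
    ("x", "A", "w", "E"),
    ("y", "B", "u", "D"),
    ("u", "C", "l", "A"),
    ("v", "D", "l", "B"),
]

def consistent(assignment):
    """True if no constraint is violated by the (possibly partial) assignment."""
    for v1, a, v2, b in CONSTRAINTS:
        if assignment.get(v1) == a and assignment.get(v2) == b:
            return False
    return True

def backtrack(assignment, order):
    """Depth-first search that assigns variables in the given fixed order."""
    if len(assignment) == len(order):
        return dict(assignment)
    var = order[len(assignment)]
    for value in DOMAINS[var]:
        assignment[var] = value
        if consistent(assignment):
            result = backtrack(assignment, order)
            if result is not None:
                return result
        del assignment[var]
    return None

print(backtrack({}, ["x", "y", "v", "u", "w", "l"]))
```

Under the same assumed constraints, the node N6 of the trace is a dead end: with u=C and v=D, neither l=A nor l=B is allowed, so the search has to back up.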
15
“Most Constrained Variable First”
“Least-constraining Value First”
16
2/12 I do not think much of a man who
I will say then that I am not, nor ever have been in favor of bringing about in anyway the social and political equality of the white and black races - that I am not nor ever have been in favor of making voters or jurors of negroes, nor of qualifying them to hold office, nor to intermarry with white people; and I will say in addition to this that there is a physical difference between the white and black races which I believe will forever forbid the two races living together on terms of social and political equality. And inasmuch as they cannot so live, while they do remain together there must be the position of superior and inferior, and I as much as any other man am in favor of having the superior position assigned to the white race. I say upon this occasion I do not perceive that because the white man is to have the superior position the negro should be denied everything." September 18, 1858 2/12 I do not think much of a man who is not wiser today than he was yesterday. My paramount object in this struggle is to save the Union, and is not either to save or to destroy slavery. If I could save the Union without freeing any slave I would do it, and if I could save it by freeing all the slaves I would do it; and if I could save it by freeing some and leaving others alone I would also do that. What I do about slavery, and the colored race, I do because I believe it helps to save the Union; and what I forbear, I forbear because I do not believe it would help to save the Union. I shall do less whenever I shall believe what I am doing hurts the cause, and I shall do more whenever I shall believe doing more will help the cause. ---August 22, 1862 (born 2/12/1809) In giving freedom to the slave, we assure freedom to the free - honorable alike in what we give, and what we preserve. We shall nobly save, or meanly lose, the last best hope of earth. Other means may succeed; this could not fail. 
The way is plain, peaceful, generous, just - a way which, if followed, the world will forever applaud, and God must forever bless. ---December, 1, 1862.
17
“It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change.” The universe we observe has precisely the properties we should expect if there is, at bottom, no design, no purpose, no evil, no good, nothing but blind, pitiless indifference (b 2/12/1809) “We can allow satellites, planets, suns, universe, nay whole systems of universes, to be governed by laws, but the smallest insect, we wish to be created at once by special act.” “We must, however, acknowledge, as it seems to me, that man with all his noble qualities... still bears in his bodily frame the indelible stamp of his lowly origin.”
18
[They brought..] the change from soul to mind as the engine of existence, and then from angels to ages as the overseers of life
19
General Search vs. CSP
General search:
-- Blackbox state; external child-generator
-- State-space can be infinite
-- External goal test
-- Goals can occur at any depth
-- Goals can have different costs
-- All the search algorithms we discussed until now are appropriate
-- Heuristics are aimed at estimating the cost to the goal node
CSP:
-- State is made up of state variables
-- Children generation involves assigning values to more variables
-- State space is finite
-- A state is a goal state if all variables are assigned and no constraints are violated
-- All goals occur at the same depth
-- In the basic formulation, all goals have the same cost (this can be generalized)
-- Only depth-first search makes sense!
-- Heuristics are aimed at picking the right variable to assign next, and deciding the right value to assign to it
20
Dynamic variable ordering: Pick the variable with the smallest “live” (remaining) domain next.
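As a rough sketch, dynamic variable ordering amounts to a one-line selection rule. The function name `select_variable` and the dictionary layout (variable name mapped to its set of live values) are assumptions made for this illustration.

```python
def select_variable(live_domains, assignment):
    """Pick the unassigned variable with the fewest remaining (live) values."""
    unassigned = [v for v in live_domains if v not in assignment]
    return min(unassigned, key=lambda v: len(live_domains[v]))

live = {"x": {1, 2, 3}, "y": {2}, "z": {1, 3}}
print(select_variable(live, {}))        # y has the smallest live domain
print(select_variable(live, {"y": 2}))  # z is next once y is assigned
```

The rule is re-applied at every node, so the ordering can differ from branch to branch as domains shrink.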
21
Why are CSP problems hard?
Because what seems like a locally good decision (a value assignment to a variable) may wind up leading to global inconsistency.
But what if we pre-process the CSP problem such that the local choices are more likely to be globally consistent?
Two CSP problems CSP1 and CSP2 are considered equivalent if both of them have the same solutions.
Related to the way artificial potential fields can be set up for improving hill-climbing.
22
Pre-processing to enforce consistency
An n-variable CSP problem is said to be k-consistent iff every consistent assignment for (k-1) of the n variables can be extended to include any k-th variable.
It is strongly k-consistent if it is j-consistent for all j from 1 to k.
Special terminology for binary CSPs:
-- 2-consistency is called “arc” consistency (since you need only consider pairs of variables connected by an edge in the constraint graph)
-- 3-consistency is called “path” consistency
The higher the level of (strong) consistency of a problem, the less backtracking is required to solve it.
A CSP with strong n-consistency can be solved without any backtracking.
We can improve the level of consistency of a problem by explicating implicit constraints.
Enforcing k-consistency is of O(n^k) complexity.
Break-even seems to be around k=2 (“arc consistency”) or 3 (“path consistency”).
One measure of hardness of a problem is the minimum level of consistency that needs to be enforced on the problem so it becomes strongly n-consistent.
23
How much consistency should we enforce?
[Figure: total cost incurred in search, together with its two components (cost of enforcing the consistency and cost of searching with the heuristic), plotted against the degree of consistency (measured in k-strong-consistency, from 1 to n).]
Overloading new semantics on an old graphic
24
Enforcing Arc Consistency: An example
Before: X:{1,2,3} X<Y Y:{1,2,3} Y<Z Z:{1,2,3}
After: X:{1} X<Y Y:{2} Y<Z Z:{3}
When we enforce arc-consistency on the top left CSP (shown as a constraint graph), we get the CSP shown in the bottom left. Notice how the domains have reduced. Here is an explanation of what happens.
Suppose we start from Z. If Z=1, then Y can’t have any valid values. So, we remove 1 from Z’s domain. If Z=2 or 3, then Y can have a valid value (since Y can be 1 or 2).
Now we go to Y. If Y=1, then X can’t have any value. So, we remove 1 from Y’s domain. If Y=3, then Z can’t have any value. So, we remove 3 from Y’s domain. So Y has just 2 in its domain!
Now notice that Y’s domain has changed. So, we re-consider Z (since anytime Y’s domain changes, it is possible that Z’s domain gets affected). Sure enough, Z=2 no longer works, since Y can only be 2 and so it can’t take any value if Z=2. So, we remove 2 also from Z’s domain. So, Z’s domain is now just 3!
Now, we go to X. X can’t be 2 or 3 (since for either of those values, Y will not have a value). So, X has domain 1!
Notice that in this case, arc-consistency SOLVES the problem, since X, Y and Z have exactly 1 value each and that is the only possible solution. This is not always the case (see next example).
You can do arc-consistency either as pre-processing or in lieu of forward checking.
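The propagation described here can be sketched with the standard AC-3 queue-based procedure (named plainly: the slide does not say which arc-consistency algorithm it has in mind). The function names `revise`, `ac3` and `less_than_chain`, and the encoding of each directed arc as a predicate, are assumptions for this illustration.

```python
from collections import deque

def revise(domains, xi, xj, allowed):
    """Remove values of xi that have no supporting value in xj's domain."""
    removed = {a for a in domains[xi]
               if not any(allowed(a, b) for b in domains[xj])}
    domains[xi] -= removed
    return bool(removed)

def ac3(domains, constraints):
    """constraints maps a directed arc (xi, xj) to a predicate allowed(a, b)."""
    queue = deque(constraints)
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, xi, xj, constraints[(xi, xj)]):
            # xi's domain shrank, so arcs pointing at xi must be re-checked.
            queue.extend(arc for arc in constraints
                         if arc[1] == xi and arc[0] != xj)
    return domains

def less_than_chain(values):
    """The X < Y < Z example, with every variable starting from `values`."""
    domains = {v: set(values) for v in "XYZ"}
    constraints = {
        ("X", "Y"): lambda a, b: a < b,
        ("Y", "X"): lambda a, b: a > b,
        ("Y", "Z"): lambda a, b: a < b,
        ("Z", "Y"): lambda a, b: a > b,
    }
    return ac3(domains, constraints)

print(less_than_chain([1, 2, 3]))
```

The order in which arcs are processed differs from the walk-through, but the fixed point is the same: each variable is left with a single value.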
25
Things discussed on board
If the constraint graph is disconnected, then you essentially have independent subproblems.
-- For example, suppose you mixed up a coloring problem CSP with a queens problem CSP. You are better off solving them separately and concatenating the results.
-- You may ask, “Why should I solve them separately? Can’t my search algorithm find the independence itself?” The answer is that normal search algorithms that do chronological backtracking are unable to recognize and exploit problem independence dynamically. You need “dependency directed backtracking”.
Another question is how to draw constraint graphs when you have non-binary (ternary etc.) constraints.
-- When you have n-ary (n>2) constraints, your constraint graph is a hypergraph (with edges connecting a set rather than a pair of vertices).
-- It is possible to convert every non-binary CSP into a binary CSP by introducing new variables. If there is a constraint between X, Y, and Z, I can introduce a super variable called X-Y and make a binary constraint between it and Z.
-- Of course, when you do this, the resulting constraints may not be natural for someone who knows the domain, just as an assembly language program may not make as much sense to a domain expert as a high-level language program does.
-- Binary CSPs and Boolean CSPs are canonical classes of CSP in that any arbitrary CSP can be “compiled down” to an equivalent binary or boolean CSP.
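One way to exploit a disconnected constraint graph up front is to split it into connected components before searching. The sketch below assumes a binary CSP given as a list of variable names and a list of edges; the function name `components` and the sample variable names are made up for illustration.

```python
def components(variables, edges):
    """Return the connected components of the constraint graph as sets."""
    neighbours = {v: set() for v in variables}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for start in variables:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(neighbours[v] - component)
        seen |= component
        result.append(component)
    return result

# A coloring problem (X, Y, Z) mixed with a two-queen fragment (Q1, Q2).
print(components(["X", "Y", "Z", "Q1", "Q2"],
                 [("X", "Y"), ("Y", "Z"), ("Q1", "Q2")]))
```

Each component can then be solved on its own and the partial solutions concatenated, which is the separation the slide recommends.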
26
Not enough to show the correct configuration
of the 18-puzzle problem or Rubik’s cube (although by including the list of actions as part of the state, you can support hill-climbing)
27
What is needed:
-- A neighborhood function. The larger the neighborhood you consider, the less myopic the search (but the more costly each iteration).
-- A “goodness” function, which needs to give a value to non-solution configurations too. For 8 queens: the negative of the number of pair-wise conflicts.
29
Applying min-conflicts based hill-climbing to 8-puzzle
Understand the tradeoffs in defining a smaller vs. larger neighborhood.
Local minima: no single queen move can improve h.
30
Problematic scenarios for hill-climbing
[Figure: ridges]
When the state-space landscape has local minima, any search that moves only in the greedy direction cannot be (asymptotically) complete.
Random walk, on the other hand, is asymptotically complete.
Idea: put random walk into greedy hill-climbing.
Solution(s):
-- Random restart hill-climbing
-- Do the non-greedy thing with some probability p>0
-- Use simulated annealing
31
A greedier version of the above (pick both the best var and val):
For each variable v, let l(v) be the value that it can take so that the number of conflicts is minimized. Let n(v) be the number of conflicts with this value.
-- Pick the variable v with the lowest n(v) value.
-- Assign it the value l(v).
This one basically searches the 1-neighborhood of the current assignment (where the k-neighborhood is all assignments that differ from the current assignment in at most k variable values).
Neighborhood 1 is subsumed by neighborhood 2.
Ideas for improving convergence:
-- Random restart hill-climbing: after every N iterations, start with a completely random assignment.
-- Probabilistic greedy: with probability p, do what the greedy strategy suggests; with probability (1-p), pick a random variable and change its value randomly. p can increase as the search progresses.
32
Making Hill-Climbing Asymptotically Complete
Random restart hill-climbing
-- Keep some bound B. When you have made more than B moves, reset the search with a new random initial seed and start again.
-- Getting a random new seed in an implicit search space is non-trivial! In the 8-puzzle, if you generate a random state by making random moves from the current state, you are still not truly random (as you will continue to be in one of the two components).
“Biased random walk”: avoid being greedy when choosing the seed for the next iteration
-- With probability p, choose the best child; but with probability (1-p), choose one of the children randomly.
Use simulated annealing
-- Similar to the previous idea, except that the probability p itself is increased asymptotically to one (so you are more likely to tolerate a non-greedy move in the beginning than towards the end).
With the random restart or biased random walk strategies, we can solve very large problems: million-queen problems in minutes!
33
“Beam search” for Hill-climbing
Hill climbing, as described, uses one seed solution that is continually updated. Why not use multiple seeds?
Stochastic hill-climbing uses multiple seeds (k seeds, k>1).
-- In each iteration, the neighborhoods of all k seeds are evaluated.
-- From the neighborhood, k new seeds are selected probabilistically.
-- The probability that a seed is selected is proportional to how good it is.
-- Not the same as running k hill-climbing searches in parallel.
Stochastic hill-climbing is sort of “almost” close to the way evolution seems to work, with one difference:
-- Define the neighborhood in terms of the combination of pairs of current seeds (sexual reproduction; crossover).
-- The probability that a seed from the current generation gets to “mate” to produce offspring in the next generation is proportional to the seed’s goodness.
-- To introduce “randomness”, do mutation over the offspring.
This type of stochastic beam-search hill-climbing algorithm is called a genetic algorithm.
Genetic algorithms limit the number of matings to keep the number of seeds the same.
34
Illustration of Genetic Algorithms in Action
Very careful modeling needed so the things emerging from crossover and mutation are still potential seeds (and not monkeys typing Hamlet) Is the “genetic” metaphor really buying anything?
35
Hill-climbing in “continuous” search spaces
Example: finding the cube root using the Newton-Raphson approximation
[Figure: Err = |x^3 - a| plotted against x, with the minimum at a^(1/3) and the starting point x0]
Gradient descent (that you study in calculus of variations) is a special case of hill-climbing search applied to continuous search spaces.
The local neighborhood is defined in terms of the “gradient” or derivative of the error function.
Since the error function gradient will be zero near the minimum, and higher farther from it, you tend to take smaller steps near the minimum and larger steps farther away from it (just as you would want).
Gradient descent is guaranteed to converge to the global minimum if alpha (the step size; see on the right) is small and the error function is “uni-modal” (i.e., has only one minimum).
Versions of gradient-descent algorithms will be used in neural network learning. Unfortunately, the error function is NOT unimodal for multi-layer neural networks. So, you will have to augment gradient descent with ideas such as “simulated annealing” to increase the chance of reaching the global minimum.
Tons of variations based on how alpha is set.
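A small sketch of gradient descent on the cube-root example. Two assumptions are made loudly here: the error is taken as the squared error (x^3 - a)^2 rather than the absolute error on the slide, so that the gradient really does vanish at the minimum, and the step size `alpha` is a fixed value hand-picked for inputs around a=27. The function name and defaults are illustrative only.

```python
def cube_root_by_gradient_descent(a, x0=1.0, alpha=0.0005, steps=2000):
    """Minimize E(x) = (x**3 - a)**2 by repeatedly stepping against dE/dx."""
    x = x0
    for _ in range(steps):
        gradient = 2.0 * (x**3 - a) * 3.0 * x**2  # dE/dx by the chain rule
        x -= alpha * gradient                     # small step downhill
    return x

print(cube_root_by_gradient_descent(27.0))
```

The steps shrink automatically as x approaches a^(1/3) because the gradient shrinks. A larger `alpha`, or a much larger `a` with the same `alpha`, can overshoot and diverge, which is why the choice of alpha gets so much attention.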
36
--didn’t discuss the remaining slides--
40
N-queens vs. Boolean Satisfiability
N-queens:
-- Given an n x n board, find an assignment of positions to n queens so no queen constraints are violated.
-- Assign: each queen can take values 1..n corresponding to its position in its column.
-- Find a complete assignment for all queens.
-- The approach we discussed is called “min-conflict” search, which does hill climbing in terms of the number of conflicts.
Boolean satisfiability:
-- Given n boolean variables and m clauses that constrain the values that those variables can take.
-- Each clause is of the form [v1, ~v2, v7], meaning that one of those must hold (either v1 is true or v7 is true or v2 is false).
-- Find an assignment of T/F values to the n variables that ensures that all clauses are satisfied.
-- So a boolean variable is like a queen; T/F values are like queen positions; clauses are like queen constraints; the number of violated clauses is like the number of queen conflicts.
-- You can do min-conflict search!
-- Extremely useful in large-scale circuit verification etc.
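The analogy can be made concrete with a small local-search sketch for satisfiability. Clauses are encoded as lists of signed integers (so [v1, ~v2, v7] becomes [1, -2, 7]); that encoding, the names `violated` and `local_search_sat`, and the parameters `p_greedy` and `max_flips` are assumptions of this illustration, not something the slides specify.

```python
import random

def violated(clauses, assignment):
    """Number of clauses with no true literal under the assignment."""
    return sum(1 for clause in clauses
               if not any(assignment[abs(lit)] == (lit > 0) for lit in clause))

def local_search_sat(clauses, n_vars, p_greedy=0.8, max_flips=10_000, rng=None):
    """Hill-climb on the number of violated clauses, with random-walk moves."""
    rng = rng or random.Random(0)
    assignment = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def cost_after_flip(v):
        assignment[v] = not assignment[v]
        cost = violated(clauses, assignment)
        assignment[v] = not assignment[v]  # undo the trial flip
        return cost

    for _ in range(max_flips):
        if violated(clauses, assignment) == 0:
            return assignment
        if rng.random() < p_greedy:
            var = min(assignment, key=cost_after_flip)  # greedy flip
        else:
            var = rng.randrange(1, n_vars + 1)          # random flip
        assignment[var] = not assignment[var]
    return assignment if violated(clauses, assignment) == 0 else None

clauses = [[1, -2], [2, 3], [-1, -3], [1, 2, 3]]
print(local_search_sat(clauses, n_vars=3))
```

Flipping a variable plays the role of moving a queen, and `violated` plays the role of the conflict count.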
41
Consistency and Hardness
In the worst case, a CSP can be solved efficiently (i.e., without backtracking) only if it is strongly n-consistent.
However, in many problems, enforcing k-consistency automatically renders the problem n-consistent as a side-effect.
In such a case, we can clearly see that the problem is solvable in O(n^k) time (basically the time taken for pre-processing).
The hardness of a CSP problem can be thought of in terms of the “degree of consistency” that needs to be enforced on that CSP before it can be solved efficiently (backtrack-free).
42
Graph rectification as an analog for local consistency in normal search
Local consistency involves “pre-processing” the search space so later search is faster. One way we could do it for normal graph search is to do a k-lookahead from each state and revise a node’s actual distance from its neighbors Running value iteration for a few iterations has exactly this effect..
43
Consistency enforcement as inferring implicit constraints
In general, enforcing consistency involves explicitly adding constraints that are implicit given the existing constraints.
-- E.g., in enforcing 3-consistency, if we find that for a particular 2-label {xi=v1 & xj=v2} there is no possible consistent value of xk, then we write this as an additional constraint {xi=v1} => {xj != v2}.
-- [Domain reduction is just a special case] When enforcing 2-consistency (or arc-consistency), the new constraints will be of the form xi != v1, and so these can be represented by just contracting the domain of xi by pruning v1 from it.
Unearthing implicit constraints can also be interpreted as “inferring” new constraints that hold (are “entailed”) given the existing constraints.
-- In the context of boolean CSPs (i.e., propositional satisfiability problems), the analogy is even more striking, since unearthing new constraints means writing down new clauses (or “facts”) that are entailed given the existing clauses.
-- This interpretation shows that consistency enforcement is just a form of “inference”/“entailment computation” process.
[Conditioning & Inference: The Yin and Yang] There is a general idea that in solving a search problem, you can interleave two different processes:
-- “inference”, trying to either infer the solution itself or conclude that no solution exists
-- “conditioning or enumeration”, which attempts to systematically go through potential solutions looking for a real solution
Good search algorithms interleave both inference and conditioning.
-- E.g., the CSP algorithm we discussed in the class uses backtracking search (enumeration) and forward checking (inference).
44
More on arc-consistency
Arc-consistency doesn’t always imply that the CSP has a solution or that there is no search required.
In the previous example, if each variable had domains 1,2,3,4, then at the end of enforcing arc-consistency, each variable will still have 2 values in its domain, thus necessitating search.
Here is another example which shows that the search may find that there is no solution for the CSP, even though it is arc-consistent.
Here is a binary CSP that is arc-consistent but has no solution.
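Since the slide's figure is not available here, the sketch below uses a standard textbook instance that may differ from it: three variables that must be pairwise different, each with only two values. The helper names `is_arc_consistent` and `solutions` are made up for illustration, and the solution check is plain brute-force enumeration.

```python
from itertools import product

def is_arc_consistent(domains, constraints):
    """Every value of xi must have some supporting value in xj, for every arc."""
    for (xi, xj), allowed in constraints.items():
        for a in domains[xi]:
            if not any(allowed(a, b) for b in domains[xj]):
                return False
    return True

def solutions(domains, constraints):
    """Enumerate all complete assignments that satisfy every constraint."""
    names = list(domains)
    found = []
    for values in product(*(sorted(domains[n]) for n in names)):
        assignment = dict(zip(names, values))
        if all(allowed(assignment[xi], assignment[xj])
               for (xi, xj), allowed in constraints.items()):
            found.append(assignment)
    return found

def different(a, b):
    return a != b

domains = {"X": {1, 2}, "Y": {1, 2}, "Z": {1, 2}}
constraints = {(p, q): different for p in domains for q in domains if p != q}

print(is_arc_consistent(domains, constraints))  # every value has support
print(solutions(domains, constraints))          # yet no assignment works
```

Each pair of variables looks fine in isolation, which is all arc-consistency inspects; the failure only shows up when all three variables are considered together.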
45
Approximating K-Consistency
K-consistency enforcement takes O(n^k) effort. Since we are doing this only to reduce the overall search effort (rather than to get a separate badge for consistency), we can cut corners.
[Directional K-consistency] Given a fixed permutation (order) over the n variables, an assignment to the first k-1 variables can be extended to the k-th variable.
-- Clearly cheaper than K-consistency.
-- If we know that the search will consider variables in a fixed order, then enforcing directional consistency w.r.t. that order is better.
[Partial K-consistency enforcement] Run the K-consistency enforcement algorithm partially (i.e., stop before reaching the fixed point).
-- Put a time limit on the consistency computation. Recall how we could cut corners in computing Pattern Database heuristics by spending only a limited time on the PDB and substituting other cheaper heuristics in their place.
-- Only do one pass of the consistency enforcement. This is what “forward checking” does.
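The "one pass" idea behind forward checking can be sketched as follows, using a small chain CSP X < Y < Z with domains {1,2,3} as the test case. The function name `forward_check` and the arc-to-predicate encoding are assumptions of this illustration.

```python
def forward_check(domains, constraints, var, value):
    """Assign var=value and prune only var's direct neighbours (single level)."""
    new_domains = {v: set(d) for v, d in domains.items()}
    new_domains[var] = {value}
    for (xi, xj), allowed in constraints.items():
        if xi == var:
            # Keep only neighbour values compatible with the new assignment.
            new_domains[xj] = {b for b in new_domains[xj] if allowed(value, b)}
    return new_domains

domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}, "Z": {1, 2, 3}}
constraints = {
    ("X", "Y"): lambda a, b: a < b,
    ("Y", "X"): lambda a, b: a > b,
    ("Y", "Z"): lambda a, b: a < b,
    ("Z", "Y"): lambda a, b: a > b,
}

print(forward_check(domains, constraints, "X", 1))
```

After setting X to 1, Y loses the value 1, but Z is left untouched even though Z=1 and Z=2 can no longer be part of a solution: the pruning of Y is not propagated any further, which is exactly the corner being cut.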
46
Arc-Consistency > directed arc-consistency > Forward Checking
> : “stronger than”
Original CSP: X:{1,2,3} X<Y Y:{1,2,3} Y<Z Z:{1,2,3}
AC is the strongest. It propagates changes in all directions until we reach a fixed point (no further changes).
-- After arc-consistency: X:{1} X<Y Y:{2} Y<Z Z:{3}
DAC: For each variable u, we only consider the effects on the variables that come after u in the ordering.
-- After directional arc-consistency, assuming the variable order X<Y<Z: X:{1,2} X<Y Y:{2} Y<Z Z:{1,2,3}
FC: We start with the current assignment for some of the variables, and only consider their effects on the future variables. Only a single level of propagation is done: after we find that a value of Y is pruned, we don’t try to see if that changes the domain of Z.
-- After forward checking, assuming X<Y<Z and that X has been set to value 1: X:{1} X<Y Y:{2,3} Y<Z Z:{1,2,3}
(Added after class. Important.)