Scalable Knowledge Representation and Reasoning Systems Henry Kautz AT&T Shannon Laboratories.

Introduction In recent years, we've seen substantial progress in scaling up knowledge representation and reasoning systems Shift from toy domains to real-world applications autonomous systems - NASA Remote Agent “just in time” manufacturing - I2, PeopleSoft deductive approaches to verification - Nitpick (D. Jackson), bounded model checking (E. Clarke) solutions to open problems in mathematics - group theory (W. McCune, H. Zhang) New emphasis on propositional reasoning and search

Approaches to Scaling Up KR&R Traditional approach: specialized languages / specialized reasoning algorithms; difficult to share / evaluate results. New direction: compile combinatorial reasoning problems into a common propositional form (SAT) and apply new, highly efficient general search engines. Pipeline: Combinatorial Task → SAT Encoding → SAT Solver → Decoder

Methodology Compare with use of linear / integer programming packages: emphasis on mathematical modeling after modeling, problem is handed to a state of the art solver Compare with reasoning under uncertainty: convergence to Bayes nets and MDP's

Would a specialized solver not be better? Perhaps theoretically, but often not in practice. Rapid evolution of fast solvers: 1990: 100 variable hard SAT problems; 1999: 10, ,000 variables. Competitions encourage sharing of algorithms and implementations: Germany 91 / China 96 / DIMACS-93/97/98. Encodings can compensate for much of the loss due to going to a uniform representation

Two Kinds of Knowledge Compilation Compilation to a tractable subset of logic shift inference costs offline guaranteed fast run-time response E.g.: real-time diagnosis for NASA Deep Space One - 35 msec response time! fundamental limits to tractable compilation Compilation to a minimal combinatorial core can reduce SAT size by compiling together problem spec & control knowledge inference for core still NP-hard new randomized SAT algorithms - low exponential growth E.g.: optimal planning with states!

OUTLINE I. Compilation to tractable languages Horn approximations Fundamental limits II. Compilation to a combinatorial core SATPLAN III. Improved encodings Compiling control knowledge IV. Improved SAT solvers Randomized restarts

I. Compilation to Tractable Languages

Expressiveness vs. Complexity Tradeoff Consider the problem of determining whether a query follows from a knowledge base: KB ⊨ q ? Highly expressive KB languages make querying intractable: ( ignition_on & engine_off ) → ( battery_dead V tank_empty ) requires general CNF - query answering is NP-complete. Less expressive languages allow polynomial time query answering: Horn clauses, binary clauses, DNF

Tractable Knowledge Compilation Goal: guaranteed fast online query answering; cost shifted to offline compilation. Exact compilation often not possible. Can approximate original theory yet retain soundness / completeness for queries (Kautz & Selman 1993, 1996, 1999; Papadimitriou 1994). expressive source language → tractable target language

Example: Compilation into Horn Source: clausal propositional theories Inference: NP-complete –example: (~a V ~b V c V d) –equivalently: (a & b) → (c V d) Target: Horn theories Inference: linear time –at most one positive literal per clause –example: (a & b) → c strictly less expressive
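The Horn restriction and its cheap inference can be illustrated with a small sketch. It assumes clauses are lists of nonzero integers (positive for an atom, negative for its negation), a convention that is not in the slides, and the function names are hypothetical. The forward-chaining loop is a simple quadratic version; linear time needs extra indexing that is omitted here.

```python
def is_horn(clause):
    """A clause is Horn if it has at most one positive literal."""
    return sum(1 for lit in clause if lit > 0) <= 1


def horn_entails_atom(horn_clauses, atom):
    """Forward chaining: does the Horn theory entail the positive atom?"""
    assert all(is_horn(c) for c in horn_clauses)
    true_atoms = set()
    changed = True
    while changed:
        changed = False
        for clause in horn_clauses:
            heads = [lit for lit in clause if lit > 0]
            body = [-lit for lit in clause if lit < 0]
            if all(b in true_atoms for b in body):
                if not heads:
                    # A purely negative clause is violated by derived atoms:
                    # the theory is inconsistent and entails everything.
                    return True
                if heads[0] not in true_atoms:
                    true_atoms.add(heads[0])
                    changed = True
    return atom in true_atoms


# Assumed numbering: a=1, b=2, c=3, d=4
non_horn = [-1, -2, 3, 4]   # (~a V ~b V c V d)
horn = [-1, -2, 3]          # (a & b) -> c
```

With facts a and b added as unit clauses, the Horn rule derives c; without b it does not.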

Horn Bounds Idea: compile CNF into a pair of Horn theories that approximate it. Model = truth assignment which satisfies a theory. Can logically bound theory from above and below: LB ⊨ S ⊨ UB. lower bound = fewer models = logically stronger; upper bound = more models = logically weaker. BEST bounds: LUB and GLB

Using Approximations for Query Answering S ⊨ q ? If LUB ⊨ q then S ⊨ q (linear time). If GLB ⊭ q then S ⊭ q (linear time). Otherwise, use S directly (or return "don't know"). Queries answered in linear time lead to improvement in overall response time to a series of queries
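The query procedure on this slide can be sketched as follows, under some assumptions that are not in the slides: clauses are lists of signed integers, the query is a single clause, and the bounds have already been computed. All function names are hypothetical. Entailment of a clause from a Horn theory is tested by adding the negated query literals as unit clauses and checking for inconsistency by forward chaining.

```python
def horn_unsat(horn_clauses):
    """Forward chaining; True if a purely negative clause gets violated."""
    true_atoms = set()
    changed = True
    while changed:
        changed = False
        for clause in horn_clauses:
            heads = [lit for lit in clause if lit > 0]
            body = [-lit for lit in clause if lit < 0]
            if all(b in true_atoms for b in body):
                if not heads:
                    return True
                if heads[0] not in true_atoms:
                    true_atoms.add(heads[0])
                    changed = True
    return False


def horn_entails(horn_clauses, query_clause):
    """T entails clause q iff T plus the negation of q is unsatisfiable."""
    negated_query = [[-lit] for lit in query_clause]
    return horn_unsat(list(horn_clauses) + negated_query)


def answer(lub, glb, query_clause, fallback=None):
    """True / False when a bound decides the query, else fallback or None."""
    if horn_entails(lub, query_clause):
        return True          # LUB entails q, so S entails q
    if not horn_entails(glb, query_clause):
        return False         # GLB does not entail q, so neither does S
    return fallback(query_clause) if fallback else None


# Assumed example with a=1, b=2, c=3:
# S = (~a V c) & (~b V c) & (a V b); LUB = c; one GLB = a & (~a V c) & (~b V c)
lub = [[3]]
glb = [[1], [-1, 3], [-2, 3]]
```

In this example the query c is settled by the LUB, the query b is refuted by the GLB, and the query a falls through to the fallback (or "don't know").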

Computing Horn Approximations Theorem: Computing LUB or GLB is NP-hard Amortize cost over total set of queries Query-algorithm still correct if weaker bounds are used anytime computation of bounds desirable

Computing the GLB Horn strengthening: r → (p V q) has two Horn-strengthenings: r → p and r → q. Horn-strengthening of a theory = conjunction of one Horn-strengthening of each clause. Theorem: Each GLB of S is equivalent to some Horn-strengthening of S. Algorithm: search space of Horn-strengthenings for a local maximum (GLB)
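To make the search space concrete, here is a minimal sketch that enumerates Horn-strengthenings. It assumes the signed-integer clause convention (not from the slides) and takes a strengthening of a non-Horn clause to be the clause with all negative literals kept and exactly one positive literal kept. The function names are hypothetical, and the search for a locally maximal strengthening is not shown.

```python
from itertools import product


def clause_strengthenings(clause):
    """All Horn-strengthenings of one clause."""
    negatives = [lit for lit in clause if lit < 0]
    positives = [lit for lit in clause if lit > 0]
    if len(positives) <= 1:
        return [sorted(clause)]          # already Horn
    return [sorted(negatives + [p]) for p in positives]


def theory_strengthenings(theory):
    """One strengthening per clause, in every combination."""
    options = [clause_strengthenings(c) for c in theory]
    return [list(choice) for choice in product(*options)]


# Assumed numbering: r=1, p=2, q=3; the clause r -> (p V q) is [-1, 2, 3]
example = clause_strengthenings([-1, 2, 3])
```

The number of theory-level strengthenings is the product of the number of positive literals in each non-Horn clause, which is why the slide proposes local search rather than enumeration.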

Computing the LUB Basic strategy: Compute all resolvents of original theory, and collect all Horn resolvents Problem: Even a Horn theory can have exponentially many Horn resolvents Solution: Resolve only pairs of clauses where exactly one clause is Horn Theorem: Method is complete

Properties of Bounds GLB: Anytime algorithm. Not unique - any GLB may be used for query answering. Size of GLB ≤ size of original theory. LUB: Anytime algorithm. Is unique. No space blow-up for Horn. Can construct non-Horn theories with exponentially larger LUB

Empirical Evaluation 1. Hard random theories, random queries 2. Plan-recognition domain, e.g.: query (obs1 & obs2) → (goal1 V goal2) ? [Table: time to answer 1000 queries, original theory vs. with bounds, for the rand, rand and plan test sets; values not preserved] Cost of compilation amortized in fewer than 500 queries

Limits of Tractable Compilation Some theories have an exponentially larger clausal form LUB. QUESTION: Can we always find a clever way to keep the LUB small (new variables, non-clausal form, structure sharing, ...)? Theorem: There do exist theories whose Horn LUB is inherently large: any representation that enables polytime inference is exponentially large –Proof based on non-uniform circuit complexity - if false, polynomial hierarchy collapses to Σ2

Other Tractable Target Languages Model-based representations (Kautz & Selman 1992, Dechter & Pearl 1992, Papadimitriou 1994, Roth & Khardon 1996, Mannila 1999, Eiter 1999) Prime Implicates (Reiter & DeKleer 1987, del Val 1995, Marquis 1996, Williams 1998) Compilation from nonmonotonic logics (Nerode 1995, Cadoli & Donini 1996) Similar limits to compilability hold for all!

Truly Combinatorial Problems Tractable compilation is not a universal solution for building scalable KRR systems: often useful, but there are theoretical and empirical limits; not applicable if you only care about a single query, since there is no opportunity to amortize the cost of compilation. Sometimes must face NP-hard reasoning problems head on: will describe how advances in modeling and SAT solvers are pushing the envelope of the size of problems that can be handled in practice

II. Compilation to a Combinatorial Core

Example : Planning Planning: find a (partially) ordered set of actions that transform a given initial state to a specified goal state. in most general case, can cover most forms of problem solving scheduling: fixes set of actions, need to find optimal total ordering planning problems typically highly non-linear, require combinatorial search

Some Applications of Planning Autonomous systems Deep Space One Remote Agent (Williams & Nayak 1997) Mission planning (Muscettola 1998) Natural language understanding TRAINS (Allen 1998) - mixed initiative dialog Software agents Softbots (Etzioni 1994) Goal-driven characters in games (Nareyek 1998) Help systems - plan recognition (Kautz 1989) Manufacturing Supply chain management (Crawford 1998) Software understanding / verification Bug-finding (goal = undesired state) (Jackson 1998)

State-space Planning State = complete truth assignment to a set of variables (fluents). Goal = partial truth assignment (set of states). Operator = a partial function State → State, specified by three sets of variables: precondition, add list, delete list (STRIPS, Fikes & Nilsson 1971)
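The STRIPS view of an operator as a partial function on states can be sketched in a few lines. The representation is an assumption for illustration: a state is the frozenset of fluents that are true (everything else is false), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    precondition: frozenset
    add_list: frozenset
    delete_list: frozenset


def apply_operator(state, op):
    """Return the successor state, or None where the operator is undefined."""
    if not op.precondition <= state:
        return None                      # partial function: precondition fails
    return (state - op.delete_list) | op.add_list


def satisfies_goal(state, goal):
    """A goal is a partial assignment: here, a set of fluents that must hold."""
    return goal <= state


fly = Operator(
    name="fly(plane, A, B)",
    precondition=frozenset({("at", "plane", "A")}),
    add_list=frozenset({("at", "plane", "B")}),
    delete_list=frozenset({("at", "plane", "A")}),
)
start = frozenset({("at", "plane", "A")})
```

Applying `fly` to `start` moves the plane; applying it to a state where the plane is not at A returns None.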

Abundance of Negative Complexity Results I. Domain-independent planning: PSPACE-complete or worse (Chapman 1987; Bylander 1991; Backstrom 1993) II. Bounded-length planning: NP-complete (Chenoweth 1991; Gupta and Nau 1992) III. Approximate planning: NP-complete or worse (Selman 1994)

Practice Traditional domain-independent planners can generate plans of only a few steps. Most practical systems try to eliminate search: Tractable compilation Custom, domain-specific algorithms Scaling remains problematic when state space is large or not well understood!

Planning as Satisfiability SAT encodings are designed so that plans correspond to satisfying assignments. Use recent efficient satisfiability procedures (systematic and stochastic) to solve. Evaluate performance on benchmark instances

SATPLAN problem description + plan length → instantiate axiom schemas → instantiated propositional clauses → SAT engine(s) → satisfying model → interpret → plan

SAT Encodings Propositional CNF: no variables or quantifiers. Sets of clauses specified by axiom schemas. Use modeling conventions (Kautz & Selman 1996). Compile STRIPS operators (Kautz & Selman 1999). Discrete time, modeled by integers: upper bound on number of time steps; predicates indexed by time at which fluent holds / action begins –each action takes 1 time step –many actions may occur at the same step. Example: fly(Plane, City1, City2, i) → at(Plane, City2, i+1)
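Instantiating an axiom schema means grounding it over all objects and time steps and emitting one clause per instance. The sketch below does this for the fly/at effect axiom only. The `VarMap` helper, the signed-integer clause format and the function name are assumptions for illustration and are not taken from SATPLAN itself.

```python
from itertools import permutations


class VarMap:
    """Assigns a positive integer to each ground atom on first use."""

    def __init__(self):
        self.ids = {}

    def var(self, *atom):
        return self.ids.setdefault(atom, len(self.ids) + 1)


def fly_effect_clauses(planes, cities, horizon, vm):
    """Ground fly(p, c1, c2, i) -> at(p, c2, i+1) as clauses (~fly V at)."""
    clauses = []
    for i in range(horizon):
        for plane in planes:
            for c1, c2 in permutations(cities, 2):
                fly = vm.var("fly", plane, c1, c2, i)
                at_next = vm.var("at", plane, c2, i + 1)
                clauses.append([-fly, at_next])
    return clauses


vm = VarMap()
clauses = fly_effect_clauses(["plane1"], ["A", "B"], horizon=2, vm=vm)
```

A full encoding would add precondition, frame and exclusion axioms in the same style; only the effect schema from the slide is grounded here.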

Solution to a Planning Problem A solution is specified by any model (satisfying truth assignment) of the conjunction of the axioms describing the initial state, goal state, and operators Easy to convert back to a STRIPS-style plan
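Reading a plan back out of a model amounts to collecting the action atoms that are true and grouping them by time step. This sketch assumes ground atoms are tuples of the form (name, arguments..., time) and that the set of true atoms is already available; both are illustrative conventions, not the SATPLAN decoder.

```python
def decode_plan(true_atoms, action_names):
    """Group true action atoms by time step, earliest step first."""
    steps = {}
    for atom in true_atoms:
        name, *args, time = atom
        if name in action_names:
            steps.setdefault(time, []).append((name, *args))
    return [sorted(steps[t]) for t in sorted(steps)]


model = {
    ("at", "plane1", "A", 0),
    ("fly", "plane1", "A", "B", 0),
    ("at", "plane1", "B", 1),
    ("fly", "plane1", "B", "C", 1),
    ("at", "plane1", "C", 2),
}
plan = decode_plan(model, action_names={"fly"})
```

Each inner list holds the actions that occur in parallel at one step, matching the convention that many actions may occur at the same step.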

Satisfiability Testing Procedures Systematic, complete procedures: Davis-Putnam (DP) –backtrack search + unit propagation (1961) –little progress until then explosion of improved algorithms & implementations; satz (1997) - best branching heuristic. See SATLIB 1998 / Hoos & Stützle: csat, modoc, rel_sat, sato, ... Stochastic, incomplete procedures: Walksat (Kautz, Selman & Cohen 1993) –greedy local search + noise to escape local minima –outperforms systematic algorithms on random formulas, graph coloring, … (DIMACS 1993, 1997)
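A minimal sketch of the systematic approach, backtrack search with unit propagation, is given below. It assumes signed-integer clauses and branches on the first unassigned variable it finds, with none of the branching heuristics that solvers such as satz rely on. The function names are hypothetical.

```python
def unit_propagate(clauses, assignment):
    """Simplify under the assignment; return None on an empty clause."""
    changed = True
    while changed:
        changed = False
        simplified = []
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                          # clause already satisfied
            rest = [l for l in clause if abs(l) not in assignment]
            if not rest:
                return None                       # conflict
            if len(rest) == 1:
                assignment[abs(rest[0])] = rest[0] > 0   # forced value
                changed = True
            else:
                simplified.append(rest)
        clauses = simplified
    return clauses


def dpll(clauses, assignment=None):
    """Return a (possibly partial) satisfying assignment, or None."""
    assignment = dict(assignment or {})
    clauses = unit_propagate(clauses, assignment)
    if clauses is None:
        return None
    if not clauses:
        return assignment
    var = abs(clauses[0][0])                      # naive branching choice
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

Variables missing from the returned dictionary are unconstrained and can take either value.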

Walksat Procedure Start with random initial assignment. Pick a random unsatisfied clause. Select and flip a variable from that clause: With probability p, pick a random variable. With probability 1-p, pick greedily –a variable that minimizes the number of unsatisfied clauses Repeat until time limit reached.
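The procedure above can be sketched directly. The assumptions here are the signed-integer clause format, a flip limit standing in for the time limit, and a greedy step that recounts unsatisfied clauses from scratch (real implementations keep incremental counts). Parameter names other than p are hypothetical.

```python
import random


def walksat(clauses, num_vars, p=0.5, max_flips=100_000, seed=0):
    """Return a satisfying assignment (dict var -> bool) or None."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}

    def satisfied(clause):
        return any(assign[abs(l)] == (l > 0) for l in clause)

    def unsat_count_if_flipped(var):
        assign[var] = not assign[var]             # tentative flip
        count = sum(not satisfied(c) for c in clauses)
        assign[var] = not assign[var]             # undo
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)                # random unsatisfied clause
        if rng.random() < p:
            var = abs(rng.choice(clause))         # noise step
        else:
            var = min((abs(l) for l in clause), key=unsat_count_if_flipped)
        assign[var] = not assign[var]
    return None                                   # gave up: proves nothing
```

Because the procedure is incomplete, a None result does not show that the formula is unsatisfiable.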

Planning Benchmark Test Set Extension of Graphplan benchmark set. Graphplan (Blum & Furst 1995) - best domain-independent state-space planning algorithm. logistics - complex, highly-parallel transportation domain, ranging up to 14 time slots, unlimited parallelism; 2,165 possible actions per time slot; optimal solutions containing 150 distinct actions. Problems of this size (10^18 configurations) not previously handled by any state-space planning system

Scaling Up Logistics Planning

What SATPLAN Shows General propositional theorem provers can compete with state of the art specialized planning systems. New, highly tuned variations of DP are surprisingly powerful –result of sharing ideas and code in large SAT/CSP research community –specialized engines can catch up, but by then new general techniques. Radically new stochastic approaches to SAT can provide very low exponential scaling –2+ orders of magnitude speedup on hard benchmark problems. Reflects general shift from first-order & non-standard logics to propositional logic as basis of scalable KRR systems

Further Paths to Scale-Up Efficient representations and new SAT engines extend the range of domain-independent planning Ways for further improvement: Better SAT encodings Better general search algorithms

III. Improved Encodings: Compiling Control Knowledge

Kinds of Control Knowledge About domain itself: a truck is only in one location; airplanes are always at some airport. About good plans: do not remove a package from its destination location; do not unload a package and immediately load it again. About how to search: plan air routes before land routes; work on hardest goals first

Expressing Knowledge Such information is traditionally incorporated in the planning algorithm itself –or in a special programming language. Instead: use additional declarative axioms –(Bacchus 1995; Kautz 1998; Chen, Kautz, & Selman 1999). Problem instance: operator axioms + initial and goal axioms + control axioms. Control knowledge: constraints on search and solution spaces. Independent of any search engine strategy

Axiomatic Control Knowledge State Invariant: A truck is at only one location: at(truck,loc1,i) & loc1 ≠ loc2 → ~at(truck,loc2,i). Optimality: Do not return a package to a location: at(pkg,loc,i) & ~at(pkg,loc,i+1) & i<j → ~at(pkg,loc,j). Simplifying Assumption: Once a truck is loaded, it should immediately move: ~in(pkg,truck,i) & in(pkg,truck,i+1) & at(truck,loc,i+1) → ~at(truck,loc,i+2)
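A control axiom is added to the problem in the same way as any other schema: ground it and append the resulting clauses. The sketch below grounds only the state invariant that a truck is at one location at a time, which becomes a binary clause forbidding any two distinct locations at the same step. The variable-numbering helper and function name are assumptions for illustration.

```python
from itertools import combinations

ids = {}


def var(*atom):
    """Hypothetical helper: map a ground atom to a positive integer."""
    return ids.setdefault(atom, len(ids) + 1)


def one_location_clauses(trucks, locations, horizon):
    """For every truck, step and pair of distinct locations: not both."""
    clauses = []
    for i in range(horizon + 1):
        for truck in trucks:
            for loc1, loc2 in combinations(locations, 2):
                clauses.append([-var("at", truck, loc1, i),
                                -var("at", truck, loc2, i)])
    return clauses


invariant = one_location_clauses(["truck1"], ["l1", "l2", "l3"], horizon=1)
```

Unordered pairs suffice because the clause for (loc1, loc2) is the same as the clause for (loc2, loc1). The optimality and simplifying axioms on the slide would be grounded the same way, with three-literal clauses.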

Adding Control Kx to SATPLAN Problem Specification Axioms + Control Knowledge Axioms → Instantiated Clauses → SAT Simplifier → SAT “Core” → SAT Engine. As control knowledge increases, Core shrinks!

Logistics - Control Knowledge

Scale Up with Compiled Control Knowledge Significant scale-up using axiomatic control knowledge Same knowledge useful for both systematic and local search engines –simple DP now scales from to states –order of magnitude speedup for Walksat Control axioms summarize general features of domain / good plans: not a detailed program! Obtained benefits using only “admissible” control axioms: no loss in solution quality (Cheng, Kautz, & Selman 1999) Many kinds of control knowledge can be created automatically Machine learning (Minton 1988, Etzioni 1993, Weld 1994, Kambhampati 1996) Type inference (Fox & Long 1998, Rintanen 1998) Reachability analysis (Kautz & Selman 1999)

IV. Improved SAT Solvers: Randomized Restarts

Background Combinatorial search methods often exhibit a remarkable variability in performance. It is common to observe significant differences between: different heuristics same heuristic on different instances different runs of same heuristic with different random seeds

Example: SATZ

Preview of Strategy We’ll put variability / unpredictability to our advantage via randomization / averaging.

Cost Distributions Consider distribution of running times of backtrack search on a large set of “equivalent” problem instances renumber variables change random seed used to break ties Observation (Gomes 1997): distributions often have heavy tails infinite variance mean increases without limit probability of long runs decays by power law (Pareto-Levy), rather than exponentially (Normal)

Heavy-Tailed Distributions: infinite variance, even infinite mean. Introduced by Pareto in the 1920’s as a “probabilistic curiosity”. Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock market, earthquakes, weather, ...

How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of the tail index α: α ≤ 1 means infinite mean and infinite variance; 1 < α < 2 means infinite variance
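The log-log check can be approximated numerically by fitting a straight line to the empirical survival function in the upper tail. This is a rough sketch under stated assumptions: the tail is taken to be the top 10% of samples, the fit is ordinary least squares, and the name α for the tail exponent is used for illustration. For a power-law tail P(X ≥ x) ~ x^(-α) the fitted slope is about -α.

```python
import math


def tail_slope(samples, tail_fraction=0.1):
    """Least-squares slope of log P(X >= x) against log x over the tail."""
    xs = sorted(samples)
    n = len(xs)
    start = int(n * (1 - tail_fraction))
    points = [(math.log(xs[i]), math.log((n - i) / n))
              for i in range(start, n) if xs[i] > 0]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Synthetic run times placed exactly on a power-law tail with exponent 1.5
n, alpha = 1000, 1.5
synthetic = [((n - i) / n) ** (-1 / alpha) for i in range(n)]
slope = tail_slope(synthetic)
```

On real run-time data the points scatter around the line, so the plot should be inspected as well; a clearly curved log-log tail suggests exponential rather than power-law decay.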

Heavy Tails Bad scaling of systematic solvers can be caused by heavy tailed distributions Deterministic algorithms get stuck on particular instances but that same instance might be easy for a different deterministic algorithm! Expected (mean) solution time increases without limit over large distributions

Randomized Restarts Solution: randomize the systematic solver. Add noise to the heuristic branching (variable choice) function. Cut off and restart search after a fixed number of backtracks. Provably eliminates heavy tails. In practice: rapid restarts with low cutoff can dramatically improve performance (Gomes 1996, Gomes, Kautz, and Selman 1997, 1998)
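The restart strategy can be sketched around a small randomized backtracking solver. Several things here are assumptions rather than the authors' implementation: uniform random choice of branching literal stands in for a noisy heuristic, every failed branch counts as one backtrack, and the cutoff and restart limits are arbitrary defaults.

```python
import random


class Cutoff(Exception):
    """Raised when a run exceeds its backtrack budget."""


def simplify(clauses, lit):
    """Set lit true; return reduced clauses, or None on an empty clause."""
    out = []
    for clause in clauses:
        if lit in clause:
            continue
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None
        out.append(reduced)
    return out


def randomized_dpll(clauses, rng, cutoff):
    backtracks = 0

    def search(cls, model):
        nonlocal backtracks
        if not cls:
            return model
        units = [c[0] for c in cls if len(c) == 1]
        if units:
            choices = [units[0]]                  # forced by unit clause
        else:
            lit = rng.choice(rng.choice(cls))     # randomized branching
            choices = [lit, -lit]
        for lit in choices:
            reduced = simplify(cls, lit)
            if reduced is not None:
                result = search(reduced, model + [lit])
                if result is not None:
                    return result
            backtracks += 1
            if backtracks > cutoff:
                raise Cutoff
        return None

    return search(clauses, [])


def solve_with_restarts(clauses, cutoff=20, max_restarts=1000, seed=0):
    """Restart with fresh random choices whenever a run hits the cutoff."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        try:
            return randomized_dpll(clauses, rng, cutoff)
        except Cutoff:
            continue
    return None
```

A model is returned as the list of literals set true. A None result means either that one run completed without finding a model (the formula is unsatisfiable) or that all restarts were cut off; a fuller version would distinguish the two.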

Rapid Restart on LOG.D Note Log Scale: Exponential speedup!

Increased Predictability

Overall insight: Randomized tie-breaking with rapid restarts can boost systematic search. Related analysis: Luby & Zuckerman 1993; Alt & Karp. Other applications: sports scheduling, circuit synthesis, quasigroup completion, …

Conclusions Discussed approaches to scalable KRR systems based on propositional reasoning and search. Shift to 10,000+ variables and 10^6 clauses has opened up new applications. Methodology: Model as SAT; compile away as much complexity as possible; use off-the-shelf SAT solver for remaining core – analogous to LP approaches

Conclusions, cont. Example: AI planning / SATPLAN system Order of magnitude improvement (last 3yrs): 10 step to 200 step optimal plans Huge economic impact possible with 2 more! up to 20,000 steps... Discussed themes in Encodings & Solvers Local search Control knowledge Heavy-tails / Randomized restarts

Tractable Knowledge Compilation: Summary Many techniques have been developed for compiling general KR languages to computationally tractable languages Horn approximations (Kautz & Selman 1993, Cadoli 1994, Papadimitriou 1994) Model-based representations (Kautz & Selman 1992, Dechter & Pearl 1992, Roth & Khardon 1996, Mannila 1999, Eiter 1999) Prime Implicates (Reiter & DeKleer 1987, del Val 1995, Marquis 1996, Williams 1998)

Limits to Compilability While practical for some domains, there are fundamental theoretical limitations to the approach: some KB’s cannot be compiled into a tractable form unless polynomial hierarchy collapses (Kautz). Sometimes must face NP-hard reasoning problems head on: will describe how advances in modeling and SAT solvers are pushing the envelope of the size of problems that can be handled in practice

Logistics: Increased Predictability

Example: SATZ