Probabilistic Inference Modulo Theories

Rodrigo de Salvo Braz, Ciaran O’Reilly, Vibhav Gogate, Rina Dechter
Presented by Shaked Or
Special thanks to Ian Gent for the watched literals explanation

Lecture Objective
Introduce probabilistic inference
Explain why current approaches are lacking
Introduce SGDPLL(T)
See SGDPLL(T) applications in probabilistic calculations

Lecture layout:
What is probabilistic inference?
Definitions
Real life example
Boolean Satisfiability problem (SAT) – reminder
Satisfiability Modulo Theories (SMT) – SAT generalization
SGDPLL(T) – one step further
Generalize the problem further
The SGDPLL(T) algorithm
Applications in probability
Experiments and Optimizations

Probabilistic Inference
a = P(Retake Course)
As we find more evidence, the probability changes.

Probabilistic Inference - Formally
P - probability distribution
a - event
{Ei | 1 ≤ i ≤ n} – evidence
Given new incoming facts (evidence), we derive the posterior probability of a: P(a | E1, …, En)

Let's look at an example of a “real” problem. Given N voters, the number of voters that like an incumbent senator is influenced by:
The number of terror attacks
The Dow Jones index
The number of newly created jobs
The number of people who like the challenging candidate

Let us take:
attacks ~ Uniform[20]
newJobs ~ Uniform[10000]
dow ~ Uniform[11000,18000]
likeChallenger ~ Uniform[N]
And take the influence on P(likeIncumbent) to be:

We want to calculate queries like:
For a constant value of newJobs
While treating newJobs as a parameter

Problems with current approaches
Regular approach: building a database of the problem
In interesting problems, the database will be too large
Requires many calculations to get parametric relationships
Logical approach:
Can help reduce redundancies while inferring data from logical relationships
Has really efficient solvers for special cases
Not expressive enough for real problems
We are going to solve this last problem.

Reminder – SAT problem
Given a CNF formula f over Boolean variables xi, with X = (x1, x2, x3):
We ask: is f satisfiable? Meaning: is there an assignment to X that makes f true?
In this case, for We get:
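The satisfiability question can be illustrated with a brute-force check. This is only a sketch: the slide's own formula was not preserved, so the CNF below is a made-up stand-in, and the names `cnf` and `satisfying_assignment` are hypothetical.

```python
from itertools import product

# Hypothetical CNF over x1, x2, x3 (not the formula from the slide):
#   (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
# A literal is a pair (variable_index, is_positive).
cnf = [
    [(0, True), (1, False)],
    [(1, True), (2, True)],
    [(0, False), (2, False)],
]


def satisfying_assignment(cnf, n_vars):
    """Try all 2^n assignments; return the first one satisfying every clause."""
    for values in product([False, True], repeat=n_vars):
        # A clause holds if at least one of its literals matches the assignment.
        if all(any(values[var] == positive for var, positive in clause)
               for clause in cnf):
            return values
    return None  # no assignment works: the formula is unsatisfiable


result = satisfying_assignment(cnf, 3)
unsat = satisfying_assignment([[(0, True)], [(0, False)]], 1)
print(result, unsat)
```

Enumeration is exponential in the number of variables, which is why practical solvers use DPLL-style search instead.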

Generalization - SMT
We add literals from a first-order theory T, but still connect them with Boolean operators.
Example: T is the theory of difference arithmetic, whose literals compare variables up to a constant offset using ≤, ≥, <, >, = and + (for example, x ≤ y + 3).

SMT - formally
Given:
a first-order logic theory T
an expression containing literals of T and Boolean operators
We ask if that expression is satisfiable.
In SAT, the assignment of each literal is independent of the others.
In SMT, that is not always the case.

SMT -formally Example: We want g(x)=T and h(x)=F But

SMT -formally Variables are not only True and False:
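The dependence between theory literals can be seen in a small brute-force check. This is a sketch under assumptions: the literals `g` and `h` below are hypothetical (the slide's own definitions were not preserved), and the integer domain is bounded only so that enumeration terminates.

```python
from itertools import product


def smt_satisfiable(formula, names, domain):
    """Return a satisfying environment for `formula`, or None (bounded search)."""
    for values in product(domain, repeat=len(names)):
        env = dict(zip(names, values))
        if formula(env):
            return env
    return None


# Hypothetical difference-arithmetic literals over integers x and y.
g = lambda env: env["x"] < env["y"]        # x < y
h = lambda env: env["x"] <= env["y"] + 1   # x <= y + 1, which g implies

domain = range(-5, 6)  # assumption: small bounded domain for the demo

# Treated as independent Booleans, "g = T and h = F" would be a fine assignment.
# Under the theory it is impossible, because x < y implies x <= y + 1.
conflict = smt_satisfiable(lambda e: g(e) and not h(e), ["x", "y"], domain)
fine = smt_satisfiable(lambda e: g(e) and h(e), ["x", "y"], domain)
print(conflict, fine)
```

A real SMT solver reaches the same verdict without enumeration by handing the chosen conjunction of literals to a theory-specific consistency checker.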

SMT - Currently
There are very efficient SMT solvers.
However, SMT is not expressive enough: we can't use it for probability.

So far (diagram labels: SAT, SMT, Probabilistic Inference)

And now (diagram labels: SGDPLL(T), SAT, SMT, Probabilistic Inference)

Taking it one step further – SGDPLL(T)
We can be symbolic-parametric, quantifier-parametric, and input-theory-parametric.

Symbolic-parametric: our answer can be in terms of y; not all terms are implicitly quantified. This gives us greater problem-solving capabilities.
Example:
Possible answer:

Quantifier-parametric: in SAT and SMT we only ask whether a solution exists. Here we don't only ask for existence; we can also ask for summation, etc.
Caveat: we can only use associative quantifiers.
Examples:

Input-theory-parametric: not only Boolean operators. We can use if-then-else, ×, etc.

SGDPLL(T) – formally
SGDPLL(T) takes a theory T = (TC, TL)
TC = constraint theory
TL = input theory

SGDPLL(T) – formally
SGDPLL(T) receives a T problem of the form ⊕_{x : F(x,y)} E(x,y), where:
⊕ = associative quantifier
x = indexed variable
y = free variables ( y = (y1, y2, …) )
F(x,y) = conjunction of literals in TC
E(x,y) = expression in TL connecting literals in TC

SGDPLL(T) – formally
The output of SGDPLL(T) is a T solution.
T solution:
A quantifier-free expression in TL
Equivalent to the T problem

SGDPLL(T) – formally Example: TC = Difference Arithmetic TL = F(x,y,w)= E(x,y,w)=

SGDPLL(T) – formally T problem: T Solution:

SGDPLL algorithm – formally
Base cases and non-base cases:
Base case: E(x,y) contains no literals in TC.
Non-base case: the general case.

SGDPLL algorithm - illustration

Base case: now it's time for the base case solver.

SGDPLL algorithm – illustration
The problem: Becomes:

SGDPLL algorithm – formally
Requirements:
Base quantifier solvers (quantifiers must be associative)
Base case solvers
Consistency checkers
Note that the following steps are theory-independent operations. This is the power of SGDPLL(T): it lets you do complex calculations while supplying only the basic tools.

SGDPLL algorithm – Algorithm
SGDPLL(T) pseudocode:
Choose a splitter literal L
If L contains x, do quantifier splitting
Else, do if splitting

Quantifier splitting: if x is in L, create 2 sub-problems.
First with L
Second with not L
Combine them using the quantifier

If splitting: if x is not in L, create 2 sub-problems.
First with the implications of L
Second with the implications of not L
Combine them using the if clause
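The two kinds of splitting can be traced by hand on one tiny problem. The sketch below is not the SGDPLL(T) algorithm itself: it hard-codes the result of splitting for the single hypothetical problem Σ_{x : lo ≤ x ≤ hi} (if x > y then a else b), with y free, and checks it against brute force. The names `solve_sum` and `brute_force` are made up for this illustration.

```python
def solve_sum(lo, hi, a, b):
    """Return a quantifier-free function of y equal to
    sum over x in [lo, hi] of (a if x > y else b)."""
    n = hi - lo + 1

    def solution(y):
        # Quantifier splitting on the literal "x > y" (it mentions x) gives
        #   sum_{x: lo<=x<=hi, x>y} a  +  sum_{x: lo<=x<=hi, x<=y} b.
        # Both parts are base cases: a constant times a count of x values.
        # The counts depend on y, so we if-split on literals about y only.
        if y < lo:           # every x satisfies x > y
            return n * a
        if y >= hi:          # no x satisfies x > y
            return n * b
        # lo <= y < hi: (hi - y) values above y, (y - lo + 1) values at or below
        return (hi - y) * a + (y - lo + 1) * b

    return solution


def brute_force(lo, hi, a, b, y):
    """Reference answer: iterate over every x explicitly."""
    return sum(a if x > y else b for x in range(lo, hi + 1))


sol = solve_sum(1, 10, 2, 5)
print([sol(y) for y in (0, 4, 10)])
```

The point of the symbolic answer is that evaluating `sol(y)` costs the same no matter how wide the range [lo, hi] is, while `brute_force` grows linearly with it.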

SGDPLL algorithm – Applications in probability
Discrete probabilistic calculations can be reduced to SGDPLL(T) summation problems.
Example: calculating a marginal probability
P = joint probability distribution
X = X1, X2, X3, …

Example: calculating a posterior probability, useful in probabilistic inference.
P and X as before.
We solve this with 2 SGDPLL(T) calls:
One to get a summation-free expression for the numerator
One for the denominator
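The reduction to summations can be checked numerically on a toy distribution. This sketch enumerates every assignment, which is exactly what SGDPLL(T) is meant to avoid; it only shows which two sums are involved. The joint distribution and the helper `summation` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over three binary variables (X1, X2, X3),
# built from made-up weights and then normalized.
weights = {xs: 1 + xs[0] + 2 * xs[1] * xs[2]
           for xs in product([0, 1], repeat=3)}
total = sum(weights.values())
P = {xs: Fraction(w, total) for xs, w in weights.items()}


def summation(P, fixed):
    """Sum P over all assignments that agree with `fixed` ({index: value})."""
    return sum(p for xs, p in P.items()
               if all(xs[i] == v for i, v in fixed.items()))


# Marginal: P(X1 = 1) = sum over x2, x3 of P(1, x2, x3)
marginal_x1 = summation(P, {0: 1})

# Posterior: P(X1 = 1 | X3 = 1) = one summation for the numerator,
# one for the denominator.
posterior = summation(P, {0: 1, 2: 1}) / summation(P, {2: 1})
print(marginal_x1, posterior)
```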

SGDPLL algorithm – Back to our problem
Given N voters, the number of voters that like an incumbent senator is influenced by:
The number of terror attacks
The Dow Jones index
The number of newly created jobs
The number of people who like the challenging candidate

Let us take:
attacks ~ Uniform[20]
newJobs ~ Uniform[10000]
dow ~ Uniform[11000,18000]
likeChallenger ~ Uniform[N]
And take the influence on P(likeIncumbent) to be:

Since our constraints are expressed in a language that SGDPLL(T) accepts, we don't have to generate a database. We can just transform the rules into SGDPLL(T) format.
For example, attacks ~ Uniform[20] becomes:
if attacks ≥ 0 and attacks ≤ 20 then 1/21 else 0
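The if-then-else encoding of a uniform prior can be written down directly. The sketch below mirrors the attacks rule from the slide; the generic helper `uniform_factor` is a hypothetical convenience, not part of SGDPLL(T).

```python
from fractions import Fraction


def attacks_prior(attacks):
    """if attacks >= 0 and attacks <= 20 then 1/21 else 0"""
    return Fraction(1, 21) if 0 <= attacks <= 20 else Fraction(0)


def uniform_factor(lo, hi):
    """Hypothetical helper: the same encoding for any integer range [lo, hi]."""
    width = hi - lo + 1
    return lambda v: Fraction(1, width) if lo <= v <= hi else Fraction(0)


dow_prior = uniform_factor(11000, 18000)

# Outside the range the factor contributes 0, so summing over any superset
# of the range still yields total probability 1.
total_mass = sum(attacks_prior(a) for a in range(-5, 30))
print(total_mass, dow_prior(11000))
```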

P(likeIncumbent | dow, newJobs, attacks, likeChallenger)
After converting, we can easily calculate complex queries on this system.

We want to calculate queries like:
Now, this is much simpler. For N = 10^8 we get the answer:

This is a T solution:
An expression in TL
No quantifier
Equivalent to the problem

This answer is parametric: newJobs is a free variable here.
We did not have to iterate over all values of newJobs.
We did not have to iterate over all values of attacks, etc. for the posterior probability.

Experimental results
SGDPLL(T) was compared against a state-of-the-art probabilistic inference solver, VEC.
Given the aforementioned system and the query:
SGDPLL(T) took 2 seconds to complete the task, a running time that is constant in N.
For N = 10^8, VEC could not solve the problem exactly because the database it needed to create was too big to instantiate.
For N = 500, newJobs ~ Uniform[100], dow ~ Uniform[110,180]: it took VEC 51 seconds to complete the task.

SGDPLL algorithm – Possible optimization
Implication trimming: sometimes we get redundancies.
For example:
Solution: we can keep a conjunction of all chosen literals. If we find a contradiction, we prune the search.
Notice that this check can be done by any SMT solver (and there are many), so we can always use a state-of-the-art SMT solver.

Unit propagation: in SAT, if a clause has exactly one unassigned literal and all its other literals are false, that literal must be true, so we assign it.
For example: Assume a=T
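Unit propagation can be sketched as a loop that rescans the clauses until nothing changes. This is a naive illustration, not the optimized scheme solvers use: the function name `unit_propagate` and the integer encoding of literals (3 for x3, -3 for not x3) are assumptions, and the clauses are made up.

```python
def unit_propagate(clauses, assignment):
    """Extend `assignment` (variable -> bool) with all forced literals.

    Returns the extended assignment, or None if some clause becomes
    entirely false (a conflict).
    """
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return None              # every literal is false: conflict
            if len(unassigned) == 1:     # unit clause: the literal is forced
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment


# Hypothetical clauses: (not x1 or x2), (not x2 or x3), (not x3 or not x1 or x4)
forced = unit_propagate([[-1, 2], [-2, 3], [-3, -1, 4]], {1: True})
clash = unit_propagate([[1], [-1]], {})
print(forced, clash)
```

Rescanning every clause after every assignment is the cost that smarter bookkeeping is designed to avoid.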

Watched literals: there are smart data structures that help us decide when to attempt unit propagation.
Idea: unit propagation only fires when all literals in a clause except one are false. So, as long as 2 literals are unassigned or true, we do nothing.

Watched literals: data structure: an array with 2 triggers per clause.
(Figure: a clause over the variables A, B, C, D with two trigger pointers; each variable is shown as unassigned (T/F), true (T), or false (F).)
If a watched variable is assigned false, update the pointer/trigger.
When backtracking, don't move the pointer back.
If other variables are assigned, do nothing.
If we can't find something new and unassigned for the trigger, unit propagation fires.

Watched literals:
Not useful for small problems (3-SAT)
We only have to watch 2 literals per or-clause
The rest of the time there is no work
Useful for big problems
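The two-trigger idea can be sketched as follows. This is a simplified illustration rather than a production solver: the class name `WatchedClauses`, the integer literal encoding (3 for x3, -3 for not x3), and the convention that positions 0 and 1 of each clause hold the watched literals are all assumptions, and every clause is assumed to have at least two distinct literals.

```python
from collections import defaultdict


class WatchedClauses:
    def __init__(self, clauses):
        self.clauses = [list(c) for c in clauses]
        self.assignment = {}                # variable -> bool
        self.watchers = defaultdict(list)   # literal -> clauses watching it
        for i, clause in enumerate(self.clauses):
            self.watchers[clause[0]].append(i)
            self.watchers[clause[1]].append(i)

    def value(self, lit):
        v = self.assignment.get(abs(lit))
        return None if v is None else v == (lit > 0)

    def assign(self, lit):
        """Make `lit` true and propagate; return False on conflict."""
        queue = [lit]
        while queue:
            lit = queue.pop()
            if self.value(lit) is True:
                continue
            if self.value(lit) is False:
                return False
            self.assignment[abs(lit)] = lit > 0
            # Only clauses watching the literal that just became false wake up.
            if not self._visit_watchers(-lit, queue):
                return False
        return True

    def _visit_watchers(self, false_lit, queue):
        watching = self.watchers[false_lit]
        self.watchers[false_lit] = kept = []
        ok = True
        for i in watching:
            clause = self.clauses[i]
            if not ok:                      # after a conflict, just keep watches
                kept.append(i)
                continue
            if clause[0] == false_lit:      # keep the false watch at position 1
                clause[0], clause[1] = clause[1], clause[0]
            for k in range(2, len(clause)):
                if self.value(clause[k]) is not False:
                    # Move the trigger to a literal that is not false.
                    clause[1], clause[k] = clause[k], clause[1]
                    self.watchers[clause[1]].append(i)
                    break
            else:
                kept.append(i)              # no replacement found
                other = self.value(clause[0])
                if other is False:
                    ok = False              # both watches false: conflict
                elif other is None:
                    queue.append(clause[0])  # unit propagation fires
        return ok

    def backtrack(self, variables):
        """Undo assignments; the triggers are deliberately left where they are."""
        for var in variables:
            self.assignment.pop(var, None)


solver = WatchedClauses([[-1, 2], [-2, 3], [-3, -1, 4]])
ok = solver.assign(1)
conflict_free = WatchedClauses([[-1, 2], [-1, -2]]).assign(1)
print(ok, solver.assignment, conflict_free)
```

Assigning a variable touches only the clauses that watch the literal that became false, and `backtrack` never has to restore a trigger, which is where the savings on large problems come from.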

References Original Paper Watched literal presentation:
