BLOG: Probabilistic Models with Unknown Objects
Brian Milch
Harvard CS 282
November 29, 2007

Handling Unknown Objects
Fundamental task: given observations, make inferences about initially unknown objects
But most probabilistic modeling languages assume the set of objects is fixed and known
Bayesian logic (BLOG) lifts this assumption

Outline
Motivating examples
Bayesian logic (BLOG)
–Syntax
–Semantics
Inference on BLOG models using MCMC

Example 1: Bibliographies
Two citation strings for the same book:
S. Russel and P. Norvig (1995). Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ: Prentice Hall.
Russell, Stuart and Norvig, Peter. Articial Intelligence. Prentice-Hall, …
[Figure: citation strings linked to publication records (Title: …) and researcher records (Name: …) via the PubCited and AuthorOf relations]

Example 2: Aircraft Tracking
[Figure: radar blips from aircraft over time; a detection failure leaves an aircraft with no blip at one time step]

Example 2: Aircraft Tracking (continued)
[Figure: a false detection (blip with no aircraft) and an unobserved object (aircraft that produces no blip)]

Simple Example: Balls in an Urn
Draws (with replacement)
[Figure: prior P(n balls in urn) vs. posterior P(n balls in urn | draws), plotted for n = 1, 2, 3, 4]

Possible Worlds
[Figure: possible worlds for the urn model with different numbers of balls and different draw assignments, each annotated with its probability]

Typed First-Order Language
Types: Ball, Draw, Color
(Built-in types: Boolean, NaturalNum, Real, RkVector, String)
Function symbols:
TrueColor: (Ball) → Color
BallDrawn: (Draw) → Ball
ObsColor: (Draw) → Color
Constant symbols (zero-ary functions):
Blue: () → Color
Green: () → Color
Draw1: () → Draw
Draw2: () → Draw
Draw3: () → Draw

First-Order Structures
A structure for a typed first-order language maps…
–Each type → a set of objects
–Each function symbol → a function on those objects
A BLOG model defines:
–A typed first-order language
–A probability distribution over structures of that language

BLOG Model for Urn and Balls: Header

Type declarations:
type Color; type Ball; type Draw;

Function declarations:
random Color TrueColor(Ball);
random Ball BallDrawn(Draw);
random Color ObsColor(Draw);

Guaranteed object statements (introduce constant symbols and assert that they denote distinct objects):
guaranteed Color Blue, Green;
guaranteed Draw Draw1, Draw2, Draw3, Draw4;

Defining the Distribution: Known Objects
Suppose only guaranteed objects exist
Then a possible world is fully specified by values for the basic random variables V_f[o_1, …, o_k], one for each random function f and each tuple (o_1, …, o_k) of objects of f's argument types
The model will define conditional distributions for these variables

Dependency Statements

TrueColor(b) ~ TabularCPD[[0.5, 0.5]]();
BallDrawn(d) ~ Uniform({Ball b});
ObsColor(d)
  if (BallDrawn(d) != null) then
    ~ TabularCPD[[0.8, 0.2], [0.2, 0.8]]
      (TrueColor(BallDrawn(d)));

Here TabularCPD is the elementary CPD, [[0.8, 0.2], [0.2, 0.8]] its parameters, and TrueColor(BallDrawn(d)) its argument.

Syntax of Dependency Statements

Function(x_1, ..., x_k)
  if Cond_1 then ~ ElemCPD_1[params](Arg_1,1, ..., Arg_1,m)
  elseif Cond_2 then ~ ElemCPD_2[params](Arg_2,1, ..., Arg_2,m)
  ...
  else ~ ElemCPD_n[params](Arg_n,1, ..., Arg_n,m);

Conditions are arbitrary first-order formulas
Elementary CPDs are names of Java classes
Arguments can be terms or set expressions
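To make the clause semantics concrete, here is a minimal Python sketch (illustrative only; the clause representation and CPD interface are hypothetical, not BLOG's actual Java API). A dependency statement is read as an ordered list of (condition, CPD, arguments) clauses; the function's value is sampled from the first clause whose condition holds, and is null if no clause applies.

def sample_dependency(clauses, world):
    """clauses: list of (cond, cpd, arg_exprs), where cond and each
    arg_expr are functions of the current (partial) world."""
    for cond, cpd, arg_exprs in clauses:
        if cond(world):
            args = [expr(world) for expr in arg_exprs]
            return cpd.sample(args)  # sample from the elementary CPD
    return None  # no clause applies: the function value is null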

BLOG Model So Far

type Color; type Ball; type Draw;
random Color TrueColor(Ball);
random Ball BallDrawn(Draw);
random Color ObsColor(Draw);
guaranteed Color Blue, Green;
guaranteed Draw Draw1, Draw2, Draw3, Draw4;

TrueColor(b) ~ TabularCPD[[0.5, 0.5]]();
BallDrawn(d) ~ Uniform({Ball b});
ObsColor(d)
  if (BallDrawn(d) != null) then
    ~ TabularCPD[[0.8, 0.2], [0.2, 0.8]]
      (TrueColor(BallDrawn(d)));

??? Distribution over what balls exist?

Challenge of Unknown Objects
[Figure: worlds over objects A, B, C, D illustrating three kinds of uncertainty — attribute uncertainty (same objects, different attribute values), relational uncertainty (same objects, different relations, e.g., {A, C} vs. {B, D}), and unknown objects (different groupings of observations into objects, e.g., {A, B, C, D}; {A, C}, {B, D}; {A}, {C, D}, {B})]

Number Statements
Define conditional distributions for basic RVs called number variables, e.g., N_Ball
Can have the same syntax as dependency statements:

#Ball ~ Poisson[6]();
#Candies
  if Unopened(Bag) then
    ~ RoundedNormal[10](MeanCount(Manuf(Bag)))
  else ~ Poisson[50];

Full BLOG Model for Urn and Balls

type Color; type Ball; type Draw;
random Color TrueColor(Ball);
random Ball BallDrawn(Draw);
random Color ObsColor(Draw);
guaranteed Color Blue, Green;
guaranteed Draw Draw1, Draw2, Draw3, Draw4;

#Ball ~ Poisson[6]();
TrueColor(b) ~ TabularCPD[[0.5, 0.5]]();
BallDrawn(d) ~ Uniform({Ball b});
ObsColor(d)
  if (BallDrawn(d) != null) then
    ~ TabularCPD[[0.8, 0.2], [0.2, 0.8]]
      (TrueColor(BallDrawn(d)));
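The urn model is small enough to approximate the posterior over #Ball directly from its generative semantics. Below is a minimal likelihood-weighting sketch in Python, assuming the CPD parameters above (Poisson[6] prior on #Ball, uniform ball colors, 0.8 observation accuracy); it illustrates the model, and is not part of the BLOG system.

import math, random
from collections import defaultdict

def poisson_sample(lam):
    # Knuth's method; fine for small rates like 6
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def posterior_num_balls(obs_colors, num_samples=100000):
    """Likelihood weighting for P(#Ball = n | observed draw colors)."""
    weights = defaultdict(float)
    for _ in range(num_samples):
        n = poisson_sample(6)                   # #Ball ~ Poisson[6]
        if n == 0:
            continue                            # no balls: observing colors has weight 0
        colors = [random.choice(['Blue', 'Green']) for _ in range(n)]
        w = 1.0
        for obs in obs_colors:
            true = colors[random.randrange(n)]  # BallDrawn(d) ~ Uniform({Ball b})
            w *= 0.8 if obs == true else 0.2    # ObsColor CPD
        weights[n] += w
    total = sum(weights.values())
    return {n: w / total for n, w in sorted(weights.items())}

# e.g., posterior_num_balls(['Green', 'Green', 'Green', 'Blue'])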

Model for Citations: Header

type Res; type Pub; type Cit;
random String Name(Res);
random NaturalNum NumAuthors(Pub);
random Res NthAuthor(Pub, NaturalNum);
random String Title(Pub);
random Pub PubCited(Cit);
random String Text(Cit);
guaranteed Cit Cit1, Cit2, Cit3, Cit4;

Model for Citations: Body

#Res ~ NumResearchersPrior();
Name(r) ~ NamePrior();
#Pub ~ NumPubsPrior();
NumAuthors(p) ~ NumAuthorsPrior();
NthAuthor(p, n)
  if (n < NumAuthors(p)) then ~ Uniform({Res r});
Title(p) ~ TitlePrior();
PubCited(c) ~ Uniform({Pub p});
Text(c) ~ FormatCPD
  (Title(PubCited(c)),
   {n, Name(NthAuthor(PubCited(c), n))
    for NaturalNum n : n < NumAuthors(PubCited(c))});

Probability Model for Aircraft Tracking
[Figure: aircraft in the sky observed by a radar, which produces blips on a screen]
Existence of radar blips depends on existence and locations of aircraft

BLOG Model for Aircraft Tracking

origin Aircraft Source(Blip);
origin NaturalNum Time(Blip);
…
#Aircraft ~ NumAircraftDistrib();
State(a, t)
  if t = 0 then ~ InitState()
  else ~ StateTransition(State(a, Pred(t)));
#Blip(Source = a, Time = t)
  ~ NumDetectionsDistrib(State(a, t));
#Blip(Time = t)
  ~ NumFalseAlarmsDistrib();
ApparentPos(r)
  if (Source(r) = null) then ~ FalseAlarmDistrib()
  else ~ ObsDistrib(State(Source(r), Time(r)));

[Figure: the first #Blip statement generates detection blips with a given Source a and Time t; the second generates false-alarm blips with only a Time t]

Families of Number Variables

#Blip(Source = a, Time = t) ~ NumDetectionsDistrib(State(a, t));

This defines a family of number variables N_Blip[Source = o_s, Time = o_t], where o_s ranges over objects of type Aircraft and o_t over objects of type NaturalNum
Note: no dependency statements for origin functions

Outline
Motivating examples
Bayesian logic (BLOG)
–Syntax
–Semantics
Inference on BLOG models using MCMC

Declarative Semantics
What is the set of possible worlds?
–They're first-order structures, but with what objects?
What is the probability distribution over worlds?

What Exactly Are the Objects?
Potential objects are tuples that encode generation history
–Aircraft: (Aircraft, 1), (Aircraft, 2), …
–Blips from (Aircraft, 2) at time 8:
(Blip, (Source, (Aircraft, 2)), (Time, 8), 1)
(Blip, (Source, (Aircraft, 2)), (Time, 8), 2)
…
Point: If we specify a value for the number variable N_Blip[Source = (Aircraft, 2), Time = 8], there's no ambiguity about which blips have this source and time
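A minimal sketch of this encoding in Python (illustrative, not the BLOG implementation): the tuples are the potential objects themselves, so fixing a number variable's value fixes exactly which of them exist.

def blips(source, time, n):
    """The blips generated by `source` at `time` when the number variable
    N_Blip[Source=source, Time=time] takes value n: exactly the tuples
    numbered 1..n, so their identity is unambiguous."""
    return [('Blip', ('Source', source), ('Time', time), i)
            for i in range(1, n + 1)]

aircraft2 = ('Aircraft', 2)
print(blips(aircraft2, 8, 2))
# [('Blip', ('Source', ('Aircraft', 2)), ('Time', 8), 1),
#  ('Blip', ('Source', ('Aircraft', 2)), ('Time', 8), 2)]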

Worlds and Random Variables
Recall basic random variables:
–One for each random function on each tuple of potential arguments
–One for each number statement and each tuple of potential generating objects
Lemma: A full instantiation of the basic RVs uniquely identifies a possible world
Caveat: Infinitely many potential objects → infinitely many basic RVs

Contingent Bayesian Network
Each BLOG model defines a contingent Bayesian network (CBN) over the basic RVs
–Edges are active only under certain conditions
[Figure: #Ball and BallDrawn(D1) are parents of ObsColor(D1); each node TrueColor((Ball, i)) is a parent of ObsColor(D1) only under the condition BallDrawn(D1) = (Ball, i)]
[Milch et al., AI/Stats 2005]

BN Semantics
Usual semantics for a BN with N nodes:
P(X_1 = x_1, …, X_N = x_N) = ∏_{i=1}^{N} p(x_i | pa(X_i))
If the BN is infinite but has a topological numbering X_1, X_2, …, then it suffices to make the same assertion for each finite prefix of this numbering
But a CBN may fail to have a topological numbering!

Self-Supporting Instantiations
x_1, …, x_n is self-supporting if for all i ≤ n:
–x_1, …, x_{i-1} determines which parents of X_i are active
–These active parents are all among X_1, …, X_{i-1}
[Figure: the CBN from the previous slide with a self-supporting instantiation built up in order, e.g., #Ball = 2, then BallDrawn(D1) = (Ball, 2), then TrueColor((Ball, 2)) = Green, then ObsColor(D1) = Blue]
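The definition can be read as a checking procedure. Here is a hedged Python sketch, assuming a hypothetical callback active_parents(prefix, var) that returns the set of active parents of var given the partial instantiation prefix, or None when prefix does not yet determine them:

def is_self_supporting(assignment, active_parents):
    """assignment: ordered list of (var, value) pairs x_1..x_n."""
    seen = set()
    for i, (var, value) in enumerate(assignment):
        prefix = dict(assignment[:i])          # the values x_1..x_{i-1}
        parents = active_parents(prefix, var)
        if parents is None or not parents <= seen:
            return False                       # parents undetermined, or not all earlier
        seen.add(var)
    return True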

Semantics for CBNs and BLOG
A CBN asserts that for each self-supporting instantiation x_1, …, x_n:
P(X_1 = x_1, …, X_n = x_n) = ∏_{i=1}^{n} p(x_i | the parents of X_i that are active given x_1, …, x_{i-1})
Theorem: If the CBN satisfies certain conditions (analogous to BN acyclicity), these constraints fully define the distribution
So by the earlier lemma, a BLOG model fully defines a distribution over possible worlds
[Milch et al., IJCAI 2005]

Outline
Motivating examples
Bayesian logic (BLOG)
–Syntax
–Semantics
Inference on BLOG models using MCMC

Review: Markov Chain Monte Carlo
Markov chain s_1, s_2, … over outcomes in E
Designed so that the unique stationary distribution is proportional to p(s)
Fraction of s_1, s_2, …, s_N in query event Q converges to p(Q | E) as N → ∞
[Figure: outcome space with query event Q inside evidence event E]

Metropolis-Hastings MCMC
Let s_1 be an arbitrary state in E
For n = 1 to N
–Sample s′ ∈ E from proposal distribution q(s′ | s_n)
–Compute acceptance probability α = min(1, (p(s′) q(s_n | s′)) / (p(s_n) q(s′ | s_n)))
–With probability α, let s_{n+1} = s′; else let s_{n+1} = s_n
Stationary distribution is proportional to p(s)
Fraction of visited states in Q converges to p(Q | E)
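A generic sketch of this loop in Python (the textbook algorithm, not the BLOG engine; p, propose, and q_ratio are assumed to be supplied by the user):

import random

def metropolis_hastings(p, propose, q_ratio, s0, num_steps):
    """p(s): unnormalized density; propose(s): sample s' ~ q(.|s);
    q_ratio(s, s_new): returns q(s | s_new) / q(s_new | s)."""
    s, samples = s0, []
    for _ in range(num_steps):
        s_new = propose(s)
        alpha = min(1.0, (p(s_new) / p(s)) * q_ratio(s, s_new))
        if random.random() < alpha:
            s = s_new                 # accept the proposal
        samples.append(s)             # on rejection, s_n is repeated
    return samples

# Estimate p(Q | E) as the fraction of visited states in Q:
#   sum(1 for s in samples if in_q(s)) / len(samples)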

Toward General-Purpose Inference
Successful applications of MCMC with domain-specific proposal distributions:
–Citation matching [Pasula et al., 2003]
–Multi-target tracking [Oh et al., 2004]
But each application requires new code for:
–Proposing moves
–Representing MCMC states
–Computing acceptance probabilities
Goal:
–User specifies model and proposal distribution
–General-purpose code does the rest

General MCMC Engine
[Diagram: a custom proposal distribution (Java class) proposes an MCMC state s′ given s_n and computes the ratio q(s_n | s′) / q(s′ | s_n); the model (in BLOG) defines p(s); the general-purpose engine (Java code) computes the acceptance probability from these and sets s_{n+1}]
1. What are the MCMC states?
2. How does the engine handle arbitrary proposals efficiently?
[Milch et al., UAI 2006]

Proposer for Citations
Split-merge moves:
–Propose titles and author names for affected publications based on citation strings
Other moves change the total number of publications
[Pasula et al., NIPS 2002]

MCMC States
Not complete instantiations!
–No titles or author names for uncited publications
States are partial instantiations of random variables
–Each state corresponds to an event: the set of outcomes satisfying the description
#Pub = 100, PubCited(Cit1) = (Pub, 37), Title((Pub, 37)) = "Calculus"
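One natural in-memory representation is a mapping from basic RVs to values (a hypothetical sketch, not BLOG's actual data structure); every variable absent from the mapping is left unconstrained, so the state denotes an event rather than a single world:

# A hypothetical MCMC state as a partial instantiation: basic RVs
# mapped to values; anything absent (e.g., titles of uncited
# publications) is unconstrained, so this state denotes an event.
state = {
    '#Pub': 100,
    'PubCited(Cit1)': ('Pub', 37),
    'Title((Pub, 37))': 'Calculus',
}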

MCMC over Events
Markov chain over events σ_1, σ_2, …, with stationary distribution proportional to p(σ)
Theorem: Fraction of visited events in Q converges to p(Q | E) if:
–Each σ is either a subset of Q or disjoint from Q
–The events form a partition of E
[Figure: E partitioned into events, some of which lie inside Q]

Computing Probabilities of Events
Engine needs to compute p(σ′) / p(σ_n) efficiently (without summations)
Solution: use self-supporting instantiations
Then the probability is a product of CPDs:
p(σ) = ∏_i p(x_i | the active parents of X_i in σ)
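A sketch of this computation in log space (the cpd_for and active_parents callbacks are hypothetical, not BLOG's actual API):

import math

def log_prob(inst, cpd_for, active_parents):
    """p(sigma) for a self-supporting instantiation `inst` (a dict of
    basic RV -> value), computed as a product of CPD factors in log space."""
    total = 0.0
    for var, value in inst.items():
        parents = {u: inst[u] for u in active_parents(inst, var)}
        total += math.log(cpd_for(var).prob(value, parents))
    return total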

States That Are Even More Abstract
Typical partial instantiation:
#Pub = 100, PubCited(Cit1) = (Pub, 37), Title((Pub, 37)) = "Calculus", PubCited(Cit2) = (Pub, 14), Title((Pub, 14)) = "Psych"
–Specifies particular publications, even though publications are interchangeable
Let states be abstract partial instantiations:
∃x ∃y ≠ x [#Pub = 100, PubCited(Cit1) = x, Title(x) = "Calculus", PubCited(Cit2) = y, Title(y) = "Psych"]
There are conditions under which we can compute the probabilities of such events

Computing Acceptance Probabilities Efficiently
The first part of the acceptance probability is the ratio:
p(σ′) / p(σ_n) = ∏_i p(x′_i | its active parents in σ′) / p(x_i | its active parents in σ_n)
If moves are local, most factors cancel
Need to compute the factor for X_i only if the proposal changes X_i or one of its parents
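A sketch of the cancellation (hypothetical interfaces as before; children_of is assumed to return a variable's children in the old or new context-specific BN, anticipating the next slide):

import math

def log_cpd(inst, var, cpd_for, active_parents):
    parents = {u: inst[u] for u in active_parents(inst, var)}
    return math.log(cpd_for(var).prob(inst[var], parents))

def log_prob_ratio(old, new, changed, children_of, cpd_for, active_parents):
    """log of p(sigma') / p(sigma_n) for a local move: only factors for
    changed variables and their children differ; all others cancel."""
    affected = set(changed)
    for v in changed:
        affected |= children_of(v)
    ratio = 0.0
    for v in affected:
        ratio += log_cpd(new, v, cpd_for, active_parents)
        ratio -= log_cpd(old, v, cpd_for, active_parents)
    return ratio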

Identifying Factors to Compute
Maintain a list of changed variables
To find the children of changed variables, use a context-specific BN
Update the context-specific BN as active dependencies change
[Figure: context-specific BN before and after a split move — before, Text(Cit1) and Text(Cit2) both have parents PubCited(Cit1)/PubCited(Cit2) and Title((Pub, 37)); after the split, Text(Cit2)'s title parent becomes the new node Title((Pub, 14))]

Results on Citation Matching

               Face        Reinforce   Reasoning   Constraint
               (349 cits)  (406 cits)  (514 cits)  (295 cits)
Hand-coded
  Accuracy:    95.1%       81.8%       88.6%       91.7%
  Time:        14.3 s      19.4 s      19.0 s      12.1 s
BLOG engine
  Accuracy:    95.6%       78.0%       88.7%       90.7%
  Time:        69.7 s      99.0 s      99.4 s      59.9 s

Hand-coded version uses:
–Domain-specific data structures to represent the MCMC state
–Proposer-specific code to compute acceptance probabilities
The BLOG engine takes 5x as long to run
But it's faster than the hand-coded version was in 2003! (the hand-coded version took 120 secs on old hardware and JVM)

BLOG Software
Bayesian Logic inference engine available:

Summary
Modeling unknown objects is essential
BLOG models define probability distributions over possible worlds with
–Varying sets of objects
–Varying mappings from observations to objects
Can do inference on BLOG models using MCMC over partial worlds