The open universe
Stuart Russell
Computer Science Division, UC Berkeley

Outline
- Why we need expressive probabilistic languages [a.k.a. preaching to the choir]
- Expressiveness and openness
- BLOG: a generative language

The world has things in it!!
- Expressive language => concise models => fast learning, sometimes fast reasoning
- E.g., rules of chess: 1 page in first-order logic, ~100000 pages in propositional logic, ~100000000000000000000000000000000000000 pages as an atomic-state model [Note: chess is a teeny problem]
- Expressiveness essential for general-purpose AI via learning (rather than constant reprogramming)

Hidden expressiveness
- Alpha-beta is an atomic state-space search
- Chess programs don't really express the transition model atomically; they use:
  - Universal quantification: (defun f (x y) …)
  - Loops: (loop for x from 1 to 8 do …)
  - Complex logical terms: (cons (cons x y) z)
- But this "knowledge" is single-purpose and inaccessible
- Most learning algorithms don't output procedural programs; declarative languages seem more suited

Brief history of expressiveness

               atomic     propositional    first-order/relational
  logic                   5th C B.C.       19th C
  probability  17th C     20th C           21st C (be patient!)

First-order probabilistic languages
- Gaifman [1964]: distributions over first-order possible worlds
- Halpern [1990]: syntax for constraints on such distributions
- Poole [1993], Sato [1997], Koller and Pfeffer [1998], various others: KB defines the distribution exactly (cf. Bayes nets); assumes unique names and domain closure, like Prolog, databases (Herbrand semantics)

Herbrand vs. full first-order semantics
Given Father(Bill,William) and Father(Bill,Junior), how many children does Bill have?
- Herbrand semantics: 2
- First-order logical semantics: between 1 and ∞ (first-order logic makes no unique-names assumption, so William and Junior may denote the same child, and no domain-closure assumption, so there may be children not named by any term)

Possible worlds
[Figure: sets of possible worlds over constants A, B, C, D under three semantics — propositional; first-order with unique names and domain closure; and first-order open-universe, where symbols may co-refer and additional unnamed objects may exist.]

Possible worlds contd.
- First-order logic is just one way to define sets of open-universe relational worlds
- The distinction between "programming languages" and "logic" is not completely clear (cf. Prolog's dual semantics)
- Every "program" is an assertion in temporal logic with exactly one model per input

Open-universe models
- Essential for learning about what exists, e.g., vision, NLP, information integration, tracking, life
- [Note the GOFAI Gap: logic-based systems going back to Shakey assumed that perceived objects would be named correctly]
- [IJCAI 97, IJCAI 99, IJCAI 01, NIPS 02, CDC 04, IJCAI 05, AI/Stats 06, UAI 06]
- Tim Huang, Hanna Pasula, Brian Milch, Bhaskara Marthi, David Sontag, Songhwai Oh, Nimar Arora, Rodrigo Braz, Erik Sudderth

Open-universe models in BLOG
Construct worlds using two kinds of steps, proceeding in topological order (see the sketch below):
- Dependency statements: set the value of a function or relation on a tuple of (quantified) arguments, conditioned on parent values; this includes setting the referent of a constant symbol (a 0-ary function)
- Number statements: add some objects to the world, conditioned on what objects and relations exist so far
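As a concrete illustration (a minimal sketch in the same notation, not a slide from the talk; it assumes a known set of draws d, and NumBallsPrior, ColorPrior, and NoisyObs are hypothetical prior names), the classic balls-in-an-urn model uses one number statement and three dependency statements:

#Ball ~ NumBallsPrior();                      // number statement: an unknown number of balls exists
Color(b) ~ ColorPrior();                      // dependency statement: sets Color for each ball b
BallDrawn(d) ~ Uniform({Ball b});             // dependency statement: each draw d picks an existing ball
ObsColor(d) ~ NoisyObs(Color(BallDrawn(d)));  // dependency statement conditioned on a parent value

Given observed ObsColor values, the posterior over #Ball answers "how many distinct balls have we seen?" — the same pattern the citation-matching model below uses for papers.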

Semantics
Every well-formed* BLOG model specifies a unique proper probability distribution over open-universe possible worlds; equivalent to an infinite contingent Bayes net.
* No infinite receding ancestor chains, no conditioned cycles, all expressions finitely evaluable
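For instance (an illustrative sketch, not from the talk), a single dependency statement can violate the first condition: below, every X(t) has the infinite receding ancestor chain X(t+1), X(t+2), …, so no generative construction order exists.

// Not well-formed: the ancestors of X(t) recede forever
X(t) ~ Gaussian(X(t + 1), 1.0);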

Example: Citation Matching [Lashkari et al. 94]

"Collaborative Interface Agents, Yezdi Lashkari, Max Metral, and Pattie Maes, Proceedings of the Twelfth National Conference on Articial Intelligence, MIT Press, Cambridge, MA, 1994."

"Metral M. Lashkari, Y. and P. Maes. Collaborative interface agents. In Conference of the American Association for Artificial Intelligence, Seattle, WA, August 1994."

Are these descriptions of the same object? This is a core task in CiteSeer, Google Scholar, and over 300 companies in the record-linkage industry.

(Simplified) BLOG model

#Researcher ~ NumResearchersPrior();                      // number statement: unknown number of researchers
Name(r) ~ NamePrior();                                    // each researcher r has a name
#Paper(FirstAuthor = r) ~ NumPapersPrior(Position(r));    // number statement: unknown number of papers per researcher
Title(p) ~ TitlePrior();                                  // each paper p has a title
PubCited(c) ~ Uniform({Paper p});                         // each citation c refers to some existing paper
Text(c) ~ NoisyCitationGrammar(
    Name(FirstAuthor(PubCited(c))), Title(PubCited(c)));  // observed text is a noisy rendering of name and title
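To make the task concrete (a hypothetical sketch, mirroring the Evidence/Query pattern of the Sibyl slide below; c1 and c2 are illustrative citation constants), the observed strings enter as evidence on Text, and "same object?" becomes a coreference query:

Evidence: Text(c1) = "Collaborative Interface Agents, Yezdi Lashkari, …"
          Text(c2) = "Metral M. Lashkari, Y. and P. Maes. …"
Query:    PubCited(c1) = PubCited(c2)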

Citation Matching Results
Four data sets of ~300-500 citations, referring to ~150-300 papers

Example: Sibyl attacks
- Typically between 100 and 10,000 real entities
- About 90% are honest and have one identity
- Dishonest entities own between 10 and 1000 identities
- Transactions may occur between identities:
  - If two identities are owned by the same entity (sibyls), a transaction is highly likely
  - Otherwise, a transaction is less likely (depending on the honesty of each identity's owner)
- An identity may recommend another after a transaction:
  - Sibyls with the same owner usually recommend each other
  - Otherwise, the probability of recommendation depends on the honesty of the two entities

#Entity ~ LogNormal[6.9, 2.3]();
Honest(x) ~ Boolean[0.9]();
#Identity(Owner = x) ~ if Honest(x) then 1 else LogNormal[4.6, 2.3]();
Transaction(x,y) ~ if Owner(x) = Owner(y) then SibylPrior()
                   else TransactionPrior(Honest(Owner(x)), Honest(Owner(y)));
Recommends(x,y) ~ if Transaction(x,y) then
                    if Owner(x) = Owner(y) then Boolean[0.99]()
                    else RecPrior(Honest(Owner(x)), Honest(Owner(y)));

Evidence: lots of transactions and recommendations, maybe some Honest(.) assertions
Query: Honest(x)

Example: classical data association
[Figure sequence not reproduced: successive frames showing sensor detections arriving over time and being associated with object tracks.]

State Estimation for “Aircraft”
Dependency statements for a simple model:

#Aircraft ~ NumAircraftPrior();
State(a, t)
  if t = 0 then ~ InitState()
  else ~ StateTransition(State(a, t-1));
#Blip(Source = a, Time = t) ~ NumDetectionsCPD(State(a, t));  // blips caused by aircraft a
#Blip(Time = t) ~ NumFalseAlarmsPrior();                      // false-alarm blips with no source
ApparentPos(r)
  if (Source(r) = null) then ~ FalseAlarmDistrib()
  else ~ ObsCPD(State(Source(r), Time(r)));

Aircraft Entering and Exiting

#Aircraft(EntryTime = t) ~ NumAircraftPrior();
Exits(a, t)
  if InFlight(a, t) then ~ Bernoulli(0.1);
InFlight(a, t)
  if t < EntryTime(a) then = false
  elseif t = EntryTime(a) then = true
  else = (InFlight(a, t-1) & !Exits(a, t-1));
State(a, t)
  if t = EntryTime(a) then ~ InitState()
  elseif InFlight(a, t) then ~ StateTransition(State(a, t-1));
#Blip(Source = a, Time = t)
  if InFlight(a, t) then ~ NumDetectionsCPD(State(a, t));

…plus the last two statements from the previous slide

Extending the Model: Air Bases
- Suppose aircraft don't just enter and exit, but actually take off and land at bases
- Want to track how many aircraft there are at each base
- Aircraft have destinations (particular bases) that they generally fly towards
- Assume the set of bases is known

Extending the Model: Air Bases

#Aircraft(InitialBase = b) ~ InitialAircraftPerBasePrior();
CurBase(a, t)
  if t = 0 then = InitialBase(a)
  elseif TakesOff(a, t-1) then = null
  elseif Lands(a, t-1) then = Dest(a, t-1)
  else = CurBase(a, t-1);
InFlight(a, t) = (CurBase(a, t) = null);
TakesOff(a, t)
  if !InFlight(a, t) then ~ Bernoulli(0.1);
Lands(a, t)
  if InFlight(a, t) then ~ LandingCPD(State(a, t), Location(Dest(a, t)));
Dest(a, t)
  if TakesOff(a, t) then ~ Uniform({Base b})
  elseif InFlight(a, t) then = Dest(a, t-1);
State(a, t)
  if TakesOff(a, t-1) then ~ InitState(Location(CurBase(a, t-1)))
  elseif InFlight(a, t) then ~ StateTrans(State(a, t-1), Location(Dest(a, t)));

Unknown Air Bases
Just add two more lines:

#AirBase ~ NumBasesPrior();
Location(b) ~ BaseLocPrior();

Experience at UC Irvine
"The first model we designed was the model implemented in BLOG. It is a very intuitive model, which seems to be true of most BLOG models. Writing the BLOG model … was nearly trivial."

Inference
- BLOG inference algorithms (rejection sampling, importance sampling, MCMC) converge to correct posteriors for any well-formed model, for any first-order query
- Built-in MCMC is Metropolis-Hastings on partial possible worlds, with a generic proposal that conditions on parents only => SLOOOOW
- The user may substitute any other proposer

Experience at UC Irvine, contd.
"One author set about writing another Markov logic model, while the other began writing a custom Metropolis-Hastings proposer for the BLOG model. This turned out to be a time consuming and non-trivial task…"

BLOG status
- BLOG available online
- npBLOG (Carbonetto et al., UAI 05) provided nonparametric extensions
- DBLOG (open-universe state estimation): see Rodrigo's poster
- pyBLOG (a much faster reimplementation with generalized Gibbs and BUGS-like subproposal "experts"): see Nimar's poster

BLOG status contd.
- Blocking M-H seems to be essential for many applications with deterministic or near-deterministic relations
- Need to develop a large library of models to gain experience and develop idioms
- Structure-learning algorithms would be helpful
- Compiler technology would also be helpful
- Develop inference benchmarks
- Explore multicore implementations, or not

Closing remarks
- Fruit flies can learn to recognize handwritten digits
- First-order logic is the mathematics of objects and relations; for a world containing many related objects, it (or some close relative) is probably useful
- Prediction: knowledge representation will make a comeback as it finds niches that connect to real data