Relational Probability Models
Brian Milch
MIT 9.66
November 27, 2007

2 Objects, Attributes, Relations
[Figure: researchers with Specialty attributes (Theory, RL, BNs) and papers with Topic attributes (Learning, Theory, BNs), connected by AuthorOf and Reviews relations]

3 Specific Scenario
[Figure: Prof. Smith is the AuthorOf papers Smith98a and Smith00; each paper consists of word positions (word pos 1, word pos 2, ..., e.g., "Bayesian networks have become a ...") linked to the paper by the InDoc relation]

4 Graphical Model for This Scenario
[Figure: Bayesian network in which Smith's Specialty is a parent of the Topic variables of Smith98a and Smith00, and each paper's Topic is a parent of that paper's word variables (Smith98a word 1, word 2, ..., Smith00 word 1, word 2, ...)]
Dependency models are repeated at each node.
The graphical model is specific to Smith98a and Smith00.

5 Abstract Knowledge
Humans have abstract knowledge that can be applied to any individuals:
– Within a scenario
– Across scenarios
How can such knowledge be:
– Represented?
– Learned?
– Used in reasoning?

Outline
– Logic: first-order versus propositional
– Relational probability models (RPMs): first-order logic meets probability
– Relational uncertainty in RPMs
– Thursday: models with unknown objects

7 Kinds of possible worlds
– Atomic: each possible world is an atom or token with no internal structure, e.g., Heads or Tails
– Propositional: each possible world is defined by values assigned to variables, e.g., propositional logic, graphical models
– First-order: each possible world is defined by objects and relations
[Slide credit: Stuart Russell]

Specialties and Topics
Propositional:
– Spec_Smith_BNs ⇒ Topic_Smith98a_BNs
– Spec_Smith_Theory ⇒ Topic_Smith98a_Theory
– Spec_Smith_Learning ⇒ Topic_Smith98a_Learning
– Spec_Smith_BNs ⇒ Topic_Smith00_BNs
– Spec_Smith_Theory ⇒ Topic_Smith00_Theory
– Spec_Smith_Learning ⇒ Topic_Smith00_Learning
First-order:
– ∀r ∀t ∀p [(Spec(r, t) ∧ AuthorOf(r, p)) ⇒ Topic(p, t)]
– AuthorOf(Smith, Smith98a)
– AuthorOf(Smith, Smith00)
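To make the contrast concrete, here is a minimal sketch (not from the slides) that grounds the single first-order rule into the propositional implications above, using the objects and AuthorOf facts of this scenario.

    # Illustrative sketch: grounding the rule
    #   forall r, t, p: (Spec(r, t) and AuthorOf(r, p)) => Topic(p, t)
    # into the propositional sentences it stands for, given known AuthorOf facts.
    researchers = ["Smith"]
    topics = ["BNs", "Theory", "Learning"]
    papers = ["Smith98a", "Smith00"]
    author_of = {("Smith", "Smith98a"), ("Smith", "Smith00")}

    ground_rules = []
    for r in researchers:
        for t in topics:
            for p in papers:
                if (r, p) in author_of:   # AuthorOf(r, p) is known, so it drops out
                    ground_rules.append(f"Spec_{r}_{t} => Topic_{p}_{t}")

    for rule in ground_rules:
        print(rule)
    # Six propositional sentences from one first-order rule; adding researchers,
    # papers, or topics grows this list, while the first-order rule stays fixed.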

9 Expressiveness matters
Expressive language ⇒ concise models ⇒ fast learning, sometimes fast reasoning
E.g., rules of chess: 1 page in first-order logic, orders of magnitude more pages in propositional logic, and vastly more again as an atomic-state model (Note: chess is a teeny problem)
[Slide credit: Stuart Russell]

10 Brief history of expressiveness
– Deterministic, propositional: Boolean logic (5th century B.C. – 19th century)
– Deterministic, first-order: first-order logic (19th – early 20th century)
– Probabilistic, atomic: histograms (17th–18th centuries)
– Probabilistic, propositional: probabilistic logic [Nilsson 1986], graphical models (late 20th century)
– Probabilistic, first-order: first-order probabilistic languages (FOPLs) (20th–21st centuries)

11 First-Order Logic Syntax
– Constants: Brian, 2, AIMA2e, MIT, ...
– Predicates: AuthorOf, >, ...
– Functions: PublicationYear, ...
– Variables: x, y, a, b, ...
– Connectives: ∧ ∨ ¬ ⇒ ⇔
– Equality: =
– Quantifiers: ∀ ∃
[Slide credit: Stuart Russell]

12 Terms
A term refers (according to a given possible world) to an object in that world.
Term =
– function(term_1, ..., term_n) or
– constant symbol or
– variable
E.g., PublicationYear(AIMA2e)
Arbitrary nesting ⇒ infinitely many terms
[Slide credit: Stuart Russell]

13 Atomic sentences
Atomic sentence =
– predicate(term_1, ..., term_n) or
– term_1 = term_2
E.g.,
– AuthorOf(Norvig, AIMA2e)
– NthAuthor(AIMA2e, 2) = Norvig
Atomic sentences can be combined using connectives, e.g., (Peter = Norvig) ⇒ (NthAuthor(AIMA2e, 2) = Peter)
[Slide credit: Stuart Russell]

14 Semantics: Truth in a world
Each possible world contains ≥ 1 objects (domain elements), and maps:
– constant symbols → objects
– predicate symbols → relations (sets of tuples of objects satisfying the predicate)
– function symbols → functional relations
An atomic sentence predicate(term_1, ..., term_n) is true iff the objects referred to by term_1, ..., term_n are in the relation referred to by predicate.
[Slide credit: Stuart Russell]

15 Example
[Figure: a world containing the objects Newell, Simon, and HumanProblemSolving, with AuthorOf links from Newell and from Simon to HumanProblemSolving]
AuthorOf(Newell, HumanProblemSolving) is true in this world.
[Slide credit: Stuart Russell]
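As an illustration of this semantics, here is a minimal sketch (not from the slides) that encodes the example world as plain Python data and checks whether an atomic sentence holds in it; the lower-case names are arbitrary labels for the domain elements.

    # Illustrative sketch: a possible world as plain data, and a truth check.
    world = {
        "objects": {"newell", "simon", "hps"},                  # domain elements
        "constants": {"Newell": "newell", "Simon": "simon",
                      "HumanProblemSolving": "hps"},            # constant symbol -> object
        "predicates": {"AuthorOf": {("newell", "hps"),
                                    ("simon", "hps")}},         # predicate -> set of tuples
    }

    def holds(world, predicate, *constant_terms):
        """True iff predicate(term_1, ..., term_n) is true in this world."""
        objs = tuple(world["constants"][t] for t in constant_terms)
        return objs in world["predicates"][predicate]

    print(holds(world, "AuthorOf", "Newell", "HumanProblemSolving"))  # True
    print(holds(world, "AuthorOf", "HumanProblemSolving", "Newell"))  # False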

Outline
– Logic: first-order versus propositional
– Relational probability models (RPMs): first-order logic meets probability
– Relational uncertainty in RPMs
– Thursday: models with unknown objects

17 Relational Probability Models
– Abstract probabilistic model for attributes
– Relational skeleton: objects & relations
– Together these determine a graphical model

18 Representation
Have to represent:
– Set of variables
– Dependencies
– Conditional probability distributions (CPDs)
All of these depend on the relational skeleton.
Many languages have been proposed; we'll use Bayesian logic (BLOG) [Milch et al. 2005].

19 Typed First-Order Logic
Objects are divided into types: Boolean, Researcher, Paper, WordPos, Word, Topic.
Attributes and relations are expressed with functions (predicates are just Boolean functions):
– FirstAuthor(paper) → Researcher (non-random)
– Specialty(researcher) → Topic (random)
– Topic(paper) → Topic (random)
– Doc(wordpos) → Paper (non-random)
– WordAt(wordpos) → Word (random)

20 Set of Random Variables
For each random function, there is one random variable per tuple of argument objects.
Objects in the skeleton:
– Researcher: Smith, Jones
– Paper: Smith98a, Smith00, Jones00
– WordPos: Smith98a_1, ..., Smith98a_3212, Smith00_1, ..., etc.
Resulting random variables:
– Specialty: Specialty(Smith), Specialty(Jones)
– Topic: Topic(Smith98a), Topic(Smith00), Topic(Jones00)
– WordAt: WordAt(Smith98a_1), ..., WordAt(Smith98a_3212), WordAt(Smith00_1), ..., WordAt(Smith00_2774), WordAt(Jones00_1), ..., WordAt(Jones00_4893)
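The enumeration is mechanical, so a small sketch may help; this is illustrative code, not the BLOG implementation, and the word-position list is truncated for readability.

    # Illustrative sketch: one random variable per tuple of argument objects,
    # for each random function, given a relational skeleton.
    from itertools import product

    # Skeleton: objects of each type (word positions shortened for readability).
    objects = {
        "Researcher": ["Smith", "Jones"],
        "Paper": ["Smith98a", "Smith00", "Jones00"],
        "WordPos": ["Smith98a_1", "Smith98a_2", "Smith00_1", "Jones00_1"],
    }

    # Random functions: name -> list of argument types.
    random_functions = {
        "Specialty": ["Researcher"],
        "Topic": ["Paper"],
        "WordAt": ["WordPos"],
    }

    variables = []
    for fname, arg_types in random_functions.items():
        for args in product(*(objects[t] for t in arg_types)):
            variables.append(f"{fname}({', '.join(args)})")

    print(variables)
    # ['Specialty(Smith)', 'Specialty(Jones)', 'Topic(Smith98a)', ..., 'WordAt(Jones00_1)']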

21 Dependency Statements
Specialty(r) ~ TabularCPD[[0.5, 0.3, 0.2]];
Topic(p) ~ TabularCPD[[0.90, 0.01, 0.09],
                      [0.02, 0.85, 0.13],
                      [0.10, 0.10, 0.80]]
           (Specialty(FirstAuthor(p)));
WordAt(wp) ~ TabularCPD[[0.03, ..., 0.02, 0.001, ...],
                        [0.03, ..., 0.001, 0.02, ...],
                        [0.03, ..., 0.003, 0.003, ...]]
           (Topic(Doc(wp)));
The rows and columns of the Topic table are indexed by the topic values (BNs, RL, Theory); the rows of the WordAt table are indexed by topics and its columns by words (e.g., "the", "Bayesian", "reinforcement"). The logical term in parentheses identifies the parent node.
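A minimal sketch of what these dependency statements amount to, written as plain Python tables rather than BLOG: the Specialty prior and the Topic CPD use the numbers shown on the slide, while the WordAt table uses a made-up three-word vocabulary because the slide elides the full word distributions.

    # Illustrative sketch: the three dependency statements as plain Python tables,
    # plus top-down sampling of one paper.
    import random

    # Prior over Specialty(r).
    specialty_prior = {"BNs": 0.5, "RL": 0.3, "Theory": 0.2}

    # P(Topic(p) | Specialty(FirstAuthor(p))): row = parent value, column = child value.
    topic_cpd = {
        "BNs":    {"BNs": 0.90, "RL": 0.01, "Theory": 0.09},
        "RL":     {"BNs": 0.02, "RL": 0.85, "Theory": 0.13},
        "Theory": {"BNs": 0.10, "RL": 0.10, "Theory": 0.80},
    }

    # P(WordAt(wp) | Topic(Doc(wp))): tiny illustrative vocabulary (not from the slide).
    word_cpd = {
        "BNs":    {"the": 0.4, "Bayesian": 0.4, "reinforcement": 0.2},
        "RL":     {"the": 0.4, "Bayesian": 0.1, "reinforcement": 0.5},
        "Theory": {"the": 0.6, "Bayesian": 0.2, "reinforcement": 0.2},
    }

    def sample(dist):
        """Draw a value from a {value: probability} table."""
        values, weights = zip(*dist.items())
        return random.choices(values, weights=weights)[0]

    # Sample one paper top-down, as the dependency statements prescribe.
    specialty = sample(specialty_prior)
    topic = sample(topic_cpd[specialty])
    words = [sample(word_cpd[topic]) for _ in range(5)]
    print(specialty, topic, words)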

22 Variable Numbers of Parents
What if we allow multiple authors?
– Let the skeleton specify a predicate AuthorOf(r, p)
Topic(p) now depends on the specialties of multiple authors.
The number of parents depends on the skeleton.

23 Aggregation
Aggregate distributions:
Topic(p) ~ TopicAggCPD({Specialty(r) for Researcher r : AuthorOf(r, p)});
– the argument is a multiset defined by a formula
– the CPD is a mixture of distributions conditioned on the individual elements of the multiset [Taskar et al., IJCAI 2001]
Aggregate values:
Topic(p) ~ TopicCPD(Mode({Specialty(r) for Researcher r : AuthorOf(r, p)}));
– Mode is an aggregation function applied to the multiset
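A sketch of the two aggregation schemes, under the assumption that the aggregate distribution mixes the per-parent distributions with uniform weights (the slide does not specify the mixing weights); TopicAggCPD and Mode are mimicked by the two Python functions below, and the numbers come from the earlier Topic CPD.

    # Illustrative sketch: letting Topic(p) depend on a variable-sized multiset
    # of author specialties, via a mixture (aggregate distribution) or via the
    # mode of the multiset (aggregate value).
    from collections import Counter

    TOPICS = ["BNs", "RL", "Theory"]

    topic_cpd = {
        "BNs":    {"BNs": 0.90, "RL": 0.01, "Theory": 0.09},
        "RL":     {"BNs": 0.02, "RL": 0.85, "Theory": 0.13},
        "Theory": {"BNs": 0.10, "RL": 0.10, "Theory": 0.80},
    }

    def mixture_agg_cpd(parent_specialties):
        """Aggregate distribution: uniform mixture of the per-parent distributions."""
        n = len(parent_specialties)
        return {t: sum(topic_cpd[s][t] for s in parent_specialties) / n for t in TOPICS}

    def mode_agg_cpd(parent_specialties):
        """Aggregate value: condition a single-parent CPD on the mode of the multiset."""
        mode = Counter(parent_specialties).most_common(1)[0][0]
        return topic_cpd[mode]

    authors_of_p = ["BNs", "BNs", "Theory"]   # specialties of p's authors (a multiset)
    print(mixture_agg_cpd(authors_of_p))
    print(mode_agg_cpd(authors_of_p))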

24 Semantics: Ground BN
[Figure: a skeleton listing researchers R1, R2 and papers P1 (3212 words), P2 (2774 words), P3 (4893 words) together with the FirstAuthor relation; the corresponding ground BN contains Spec(R1), Spec(R2), Topic(P1), Topic(P2), Topic(P3), and the word variables W(P1_1), ..., W(P1_3212), W(P2_1), ..., W(P2_2774), W(P3_1), ..., W(P3_4893), with each paper's Topic depending on its first author's Spec and each word depending on its paper's Topic]
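A sketch of how the ground BN's structure follows from the skeleton plus the dependency statements; the FirstAuthor assignments and the (shrunken) word counts here are illustrative, not read off the figure.

    # Illustrative sketch: building the ground BN's parent structure by
    # instantiating the dependency statements over a skeleton.
    skeleton = {
        "researchers": ["R1", "R2"],
        "papers": ["P1", "P2", "P3"],
        "first_author": {"P1": "R1", "P2": "R1", "P3": "R2"},
        "num_words": {"P1": 3, "P2": 2, "P3": 4},
    }

    parents = {}   # node -> list of parent nodes

    # Specialty(r) has no parents.
    for r in skeleton["researchers"]:
        parents[f"Spec({r})"] = []

    # Topic(p) depends on Specialty(FirstAuthor(p)).
    for p in skeleton["papers"]:
        parents[f"Topic({p})"] = [f"Spec({skeleton['first_author'][p]})"]

    # WordAt(wp) depends on Topic(Doc(wp)).
    for p in skeleton["papers"]:
        for i in range(1, skeleton["num_words"][p] + 1):
            parents[f"W({p}_{i})"] = [f"Topic({p})"]

    for node, ps in parents.items():
        print(node, "<-", ps)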

25 When Is the Ground BN Acyclic?
Look at the symbol graph:
– a node for each random function
– edges read off from the dependency statements
For this model the symbol graph is Specialty → Topic → WordAt.
Theorem: If the symbol graph is acyclic, then the ground BN is acyclic for every skeleton. [Koller & Pfeffer, AAAI 1998]
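A sketch of the symbol-graph check for this model: each random function is a node, with an edge from every function mentioned in a dependency statement's parent term to the function being defined, followed by an ordinary cycle test.

    # Illustrative sketch: the symbol graph as a map from each random function to
    # the random functions appearing in its dependency statement, plus a DFS cycle check.
    symbol_parents = {
        "Specialty": [],
        "Topic": ["Specialty"],    # Topic(p) depends on Specialty(FirstAuthor(p))
        "WordAt": ["Topic"],       # WordAt(wp) depends on Topic(Doc(wp))
    }

    def is_acyclic(parents):
        """DFS-based cycle check over the symbol graph (node -> list of parents)."""
        WHITE, GRAY, BLACK = 0, 1, 2
        color = {n: WHITE for n in parents}

        def visit(n):
            color[n] = GRAY
            for p in parents[n]:
                if color[p] == GRAY:
                    return False               # back edge: cycle found
                if color[p] == WHITE and not visit(p):
                    return False
            color[n] = BLACK
            return True

        for n in parents:
            if color[n] == WHITE and not visit(n):
                return False
        return True

    print(is_acyclic(symbol_parents))  # True, so the ground BN is acyclic for any skeleton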

26 Inference: Knowledge-Based Model Construction (KBMC)
Construct the relevant portion of the ground BN
[Figure: from the skeleton with researchers R1, R2 and papers P1, P2, P3, only the relevant nodes are constructed: Spec(R1), Spec(R2), the Topic variables, and the word variables of the observed papers (W(P1_1), ..., W(P1_3212) and W(P3_1), ..., W(P3_4893)); a "?" marks the query node]
[Breese 1992; Ngo & Haddawy 1997]
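A sketch of the KBMC idea in its simplest form, assuming that the "relevant portion" is the set of ancestors of the query and evidence nodes in the ground BN; the parent map is a small stand-in for the full network.

    # Illustrative sketch: KBMC as "pull in the ancestors of the query and
    # evidence nodes" over the ground-BN parent structure.
    parents = {
        "Spec(R1)": [], "Spec(R2)": [],
        "Topic(P1)": ["Spec(R1)"], "Topic(P2)": ["Spec(R1)"], "Topic(P3)": ["Spec(R2)"],
        "W(P1_1)": ["Topic(P1)"], "W(P2_1)": ["Topic(P2)"], "W(P3_1)": ["Topic(P3)"],
    }

    def relevant_subnetwork(parents, query, evidence):
        """Return the set of nodes reachable from query/evidence via parent edges."""
        frontier = list(query) + list(evidence)
        relevant = set()
        while frontier:
            node = frontier.pop()
            if node not in relevant:
                relevant.add(node)
                frontier.extend(parents[node])
        return relevant

    # Query Spec(R1) given the observed words of P1 and P3; P2's words never enter
    # the constructed network, because nothing forces us to instantiate them.
    print(relevant_subnetwork(parents, query=["Spec(R1)"], evidence=["W(P1_1)", "W(P3_1)"]))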

27 Inference on Constructed Network
Run a standard BN inference algorithm:
– Exact: variable elimination / junction tree
– Approximate: Gibbs sampling, loopy belief propagation
Exploit some of the repeated structure with lifted inference [Pfeffer et al., UAI 1999; Poole, IJCAI 2003; de Salvo Braz et al., IJCAI 2005]

28 References
Wellman, M. P., Breese, J. S., and Goldman, R. P. (1992) "From knowledge bases to decision models". Knowledge Engineering Review 7.
Breese, J. S. (1992) "Construction of belief and decision networks". Computational Intelligence 8(4).
Ngo, L. and Haddawy, P. (1997) "Answering queries from context-sensitive probabilistic knowledge bases". Theoretical Computer Science 171(1-2).
Koller, D. and Pfeffer, A. (1998) "Probabilistic frame-based systems". In Proc. 15th AAAI National Conf. on AI.
Friedman, N., Getoor, L., Koller, D., and Pfeffer, A. (1999) "Learning probabilistic relational models". In Proc. 16th Int'l Joint Conf. on AI.
Pfeffer, A., Koller, D., Milch, B., and Takusagawa, K. T. (1999) "SPOOK: A system for probabilistic object-oriented knowledge". In Proc. 15th Conf. on Uncertainty in AI.
Taskar, B., Segal, E., and Koller, D. (2001) "Probabilistic classification and clustering in relational data". In Proc. 17th Int'l Joint Conf. on AI.
Getoor, L., Friedman, N., Koller, D., and Taskar, B. (2002) "Learning probabilistic models of link structure". J. Machine Learning Research 3.
Taskar, B., Abbeel, P., and Koller, D. (2002) "Discriminative probabilistic models for relational data". In Proc. 18th Conf. on Uncertainty in AI.

29 References
Poole, D. (2003) "First-order probabilistic inference". In Proc. 18th Int'l Joint Conf. on AI.
de Salvo Braz, R., Amir, E., and Roth, D. (2005) "Lifted first-order probabilistic inference". In Proc. 19th Int'l Joint Conf. on AI.
Dzeroski, S. and Lavrac, N., eds. (2001) Relational Data Mining. Springer.
Flach, P. and Lavrac, N. (2002) "Learning in clausal logic: A perspective on inductive logic programming". In Computational Logic: Logic Programming and Beyond (Essays in Honour of Robert A. Kowalski), Springer Lecture Notes in AI, volume 2407.
Pasula, H. and Russell, S. (2001) "Approximate inference for first-order probabilistic languages". In Proc. 17th Int'l Joint Conf. on AI.
Milch, B., Marthi, B., Russell, S., Sontag, D., Ong, D. L., and Kolobov, A. (2005) "BLOG: Probabilistic models with unknown objects". In Proc. 19th Int'l Joint Conf. on AI.