Lorentz workshop on Human Probabilistic Inferences A complexity-theoretic perspective on the preconditions for Bayesian tractability Johan Kwisthout (joint work with Iris van Rooij, Todd Wareham, Maria Otworowska, and Mark Blokpoel)

The Tractability Constraint “The computations postulated by a model of cognition need to be tractable in the real world in which people live, not only in the small world of an experiment... This eliminates NP-hard models that lead to computational explosion.” (Gigerenzer et al., 2008)

The Tractability Constraint “The (neuro-)computations postulated by a model of cognition need to be tractable in the real world in which people live, not only in the small world of an experiment... This eliminates NP-hard models that lead to computational explosion.” (Gigerenzer et al., 2008; “(neuro-)” added on the slide)

Marr’s three levels of explanation
Level 1, Computational: What?
Level 2, Algorithmic: Method?
Level 3, Implementation: Implementation?
[Diagram: input i → Cognitive process → output o = f(i)]

Computational-level Models of Cognition
[Diagram: input i → Cognitive process → output o = f(i), with the cognitive process modelled as Bayesian inference]

The Tractability Constraint (recap): “... This eliminates NP-hard models that lead to computational explosion.” (Gigerenzer et al., 2008)

Scalability
Baker, C.L., Tenenbaum, J.B., & Saxe, R.R. (2007). Goal Inference as Inverse Planning. CogSci’07.
Does this scale to, e.g., recognizing intentions in a job interview?

Scalability (2)
Friston, K. (2013). Life as we know it. J. R. Soc. Interface.
Friston, K. (2003). Neural Netw.

The Tractability Constraint (recap): “... This eliminates NP-hard models that lead to computational explosion.” (Gigerenzer et al., 2008)

Why NP-hard is considered intractable
NP-hard functions cannot be computed in polynomial time (assuming P ≠ NP). Instead they require exponential time (or worse) for their computation, which is why they are considered intractable (in other words, unrealistic to compute for all but small inputs).
[Table: running times for n² versus 2ⁿ steps as n grows, ranging from fractions of a millisecond to many years.]

[The same table, with the n² column labelled “polynomial” and the 2ⁿ column labelled “exponential”.]
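
To make the contrast concrete, here is a small illustrative script (not part of the original slides; the rate of 10⁶ elementary steps per millisecond is an assumption) that prints how long n² versus 2ⁿ steps would take:

```python
# Hedged illustration of polynomial vs. exponential growth.
# Assumed speed: 10**6 elementary steps per millisecond (an assumption, not from the slides).
STEPS_PER_MSEC = 10**6
MSEC_PER_YEAR = 1000 * 60 * 60 * 24 * 365

def runtime(steps: int) -> str:
    """Convert a step count into a human-readable duration."""
    msec = steps / STEPS_PER_MSEC
    if msec < 1000:
        return f"{msec:.3f} msec"
    if msec < MSEC_PER_YEAR:
        return f"{msec / 1000:.1f} sec"
    return f"{msec / MSEC_PER_YEAR:.2e} years"

for n in (10, 20, 40, 60, 100):
    print(f"n = {n:>3}   n^2: {runtime(n ** 2):>12}   2^n: {runtime(2 ** n):>15}")
```

Under this assumed speed the n² column stays below a millisecond for every n shown, while by n = 60 the 2ⁿ column is already measured in years: this is the computational explosion the slide refers to.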

Bayesian Inference is Intractable (NP-hard)
[Diagram: the computational-level scheme from before, with the cognitive process modelled as Bayesian inference.]

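To see where the exponential cost comes from, here is a toy illustration (a generic brute-force sketch with made-up numbers, not the construction used in the NP-hardness proofs): computing a posterior by enumeration sums the joint distribution over every assignment to the unobserved variables, and the number of such assignments grows exponentially with the number of unobserved variables.

```python
from itertools import product

# Toy binary Bayesian network (hypothetical numbers, for illustration only):
# Burglary -> Alarm <- Earthquake, Alarm -> JohnCalls.
# Each CPT maps a tuple of parent values to P(variable = True | parents).
parents = {
    "Burglary": (),
    "Earthquake": (),
    "Alarm": ("Burglary", "Earthquake"),
    "JohnCalls": ("Alarm",),
}
cpt = {
    "Burglary": {(): 0.01},
    "Earthquake": {(): 0.02},
    "Alarm": {(True, True): 0.95, (True, False): 0.94,
              (False, True): 0.29, (False, False): 0.001},
    "JohnCalls": {(True,): 0.90, (False,): 0.05},
}

def joint(assignment):
    """P(full assignment) = product over variables of P(var | parents)."""
    p = 1.0
    for var, pars in parents.items():
        p_true = cpt[var][tuple(assignment[q] for q in pars)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

def posterior(query, evidence):
    """P(query = True | evidence) by brute-force enumeration.
    Cost: 2**(number of unobserved variables) terms -- the exponential blow-up."""
    hidden = [v for v in parents if v != query and v not in evidence]
    num = den = 0.0
    for qval in (True, False):
        for values in product((True, False), repeat=len(hidden)):
            a = {**evidence, query: qval, **dict(zip(hidden, values))}
            p = joint(a)
            den += p
            if qval:
                num += p
    return num / den

print(posterior("Burglary", {"JohnCalls": True}))
```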

Can approximation help us?
Computational-level: input i → intractable function f → output o = f(i)
Algorithmic-level: input i → tractable algorithm A → output A(i) ≈ f(i)
The algorithmic level must be consistent with the computational level.

Computational-level: input i → intractable function f → output o = f(i)
Algorithmic-level: input i → tractable algorithm A → output A(i) ≈ f(i)
For Bayesian computations: in general, NO.

Classical Bayesian approximation results. It is NP-hard to:
- compute posteriors approximately (Dagum & Luby, 1994)
- fixate beliefs approximately (Abdelbar & Hedetniemi, 1998) or with a non-zero probability (Kwisthout, 2011)
- fixate a belief that resembles the most probable belief (Kwisthout, 2012)
- fixate a belief that is likely the most probable belief (Kwisthout & van Rooij, 2012)
- fixate a belief that ranks within the k most probable beliefs (Kwisthout, 2014)
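
One way to build intuition for why approximation does not automatically rescue us (an illustrative sketch, not the reductions used in the papers above): a naive sampler such as rejection sampling needs on the order of 1/P(evidence) samples before any are accepted, and P(evidence) can be exponentially small.

```python
import random

def rejection_sample_posterior(sample_prior, matches_evidence, query, n_samples=10000):
    """Estimate P(query | evidence) by rejection sampling.
    If P(evidence) is tiny, almost every sample is rejected and the estimate
    is useless unless n_samples is made astronomically large."""
    accepted = hits = 0
    for _ in range(n_samples):
        world = sample_prior()          # draw a complete state from the prior
        if matches_evidence(world):     # keep only samples consistent with the evidence
            accepted += 1
            hits += query(world)
    return hits / accepted if accepted else None   # None: no sample matched the evidence

# Hypothetical toy model: k independent fair coins; evidence = "all k came up heads".
# P(evidence) = 2**-k, so the expected number of accepted samples is n_samples * 2**-k.
k = 30
print(rejection_sample_posterior(
    sample_prior=lambda: [random.random() < 0.5 for _ in range(k)],
    matches_evidence=all,
    query=lambda w: w[0],
))  # almost certainly prints None for k = 30 and 10,000 samples
```

This is only an intuition pump; the cited results are worst-case hardness proofs that hold regardless of the particular sampler.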

But wait… Isn’t this all just a property of discrete distributions? No, not in general: any discrete probability distribution can be approximated by a (multivariate) Gaussian mixture distribution, and vice versa (one direction is sketched below). But shouldn’t we then assume that the distributions used in computational cognitive models are constrained to simple Gaussian distributions? We might. But can we then reasonably scale models up to, e.g., intention recognition? Or even as far as counterfactual reasoning?
Blokpoel, Kwisthout, and Van Rooij (2012). When can predictive brains be truly Bayesian? Commentary on Andy Clark’s BBS paper; see also Clark’s reply.
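
A minimal sketch of one direction of that claim (discrete distribution → Gaussian mixture), with made-up numbers; the distribution and the component width are assumptions for illustration only:

```python
import random
from collections import Counter

# Hypothetical discrete distribution over {0, 1, 2}, approximated by a Gaussian mixture
# with one narrow component per support point (component weights = the probabilities).
discrete = {0: 0.2, 1: 0.5, 2: 0.3}
sigma = 0.01   # the smaller sigma, the closer the mixture is to the discrete distribution

def sample_mixture():
    r, acc = random.random(), 0.0
    for value, weight in discrete.items():
        acc += weight
        if r <= acc:
            return random.gauss(value, sigma)   # draw from the component centred at `value`
    return random.gauss(2, sigma)               # guard against floating-point rounding

random.seed(1)
samples = [sample_mixture() for _ in range(100000)]
print(Counter(round(s) for s in samples))       # frequencies roughly 20000 / 50000 / 30000
```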

Intractability as “bad news”: intractability is a theoretical problem.
Fokke & Sukke cartoon: “Very impressive, colleague… but does it also work in theory?”
If it is tractable in practice, then it should also be tractable in theory. If not, then our theories fail to explain what happens in practice.

Intractability as “good news”: tractability is a model constraint.
Tractability as a theoretical constraint helps cognitive modelers by ruling out unrealistic models.
[Diagram: cognitive functions (e.g., inference, belief fixation) constrained to lie within the tractable functions, a subset of all possible functions.]

How to constrain Bayesian inferences?
Step 1. Identify parameters of the model that can be proven to be sources of intractability.
In general, NP-hard problems take exponential time in the worst case to solve: some instances are easy, some are hard. Identify what makes these instances hard (or easy).

How to constrain Bayesian inferences?
Step 1. Identify parameters of the model that can be proven to be sources of intractability:
exp(n) → exp(k1, k2, …, km) · poly(n)
For example, k1: max out-degree; k2: max # unknowns; k3: etc.

How to constrain Bayesian inferences?
Step 1. Identify parameters of the model that can be proven to be sources of intractability: exp(n) → exp(k1, k2, …, km) · poly(n)
Step 2. Constrain the model to small values for the parameters k1, k2, …, km. (Note: n can still be large!)
Step 3. Verify that the constraints hold for humans in real-life situations, and test in the lab if performance breaks down when the parameters are large.
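
The following is a hypothetical sketch of what the pattern exp(k1, …, km) · poly(n) looks like in code: if the parameter is, say, the number k of unobserved binary variables, brute-forcing their 2^k joint values while doing only polynomial-time work per assignment is feasible whenever k is small, no matter how large n grows. The function and score names are illustrative, not taken from the cited work.

```python
from itertools import product

def fpt_map_by_enumeration(hidden_vars, poly_score):
    """Fixed-parameter-tractable pattern: exponential only in k = len(hidden_vars).

    hidden_vars: the k unobserved (binary) variables -- the parameter.
    poly_score:  given a full assignment to hidden_vars, does polynomial-time
                 work in the network size n and returns a score (e.g. the joint
                 probability of that assignment).

    Total cost: 2**k iterations times poly(n) work per iteration = exp(k) * poly(n).
    """
    best_score, best_assignment = float("-inf"), None
    for values in product((True, False), repeat=len(hidden_vars)):  # 2**k assignments
        assignment = dict(zip(hidden_vars, values))
        s = poly_score(assignment)                                   # poly(n) work
        if s > best_score:
            best_score, best_assignment = s, assignment
    return best_assignment, best_score

# Toy usage: k = 3 hidden binary variables and a dummy polynomial-time score.
print(fpt_map_by_enumeration(["X1", "X2", "X3"], lambda a: sum(a.values())))
```

The point of Step 2 is exactly this: keep k small by assumption (or by empirical fact), and the exponential factor stops being a problem even for large networks.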

What makes Bayesian inferences tractable?
Exact inferences: degree of network? cardinality of variables? length of paths/chains? structure of dependences? posterior probability?
Approximate inferences: degree of network? cardinality of variables? length of paths/chains? structure of dependences? posterior probability? characteristics of the probability distribution?

CPDs and approximate inference
Local search techniques, MC sampling, etc., are dependent on the landscape of the probability distribution. For some Bayesian inference problems, this landscape can be parameterized: we can prove bounds on the success of the approximation algorithm relative to this parameter.
Kwisthout & Van Rooij (2013). Bridging the gap between theory and practice of approximate Bayesian inference. Cognitive Systems Research, 24, 2–8.
Kwisthout (under review). Treewidth and the Computational Complexity of MAP Approximations.
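
As a minimal illustration of the landscape dependence (a hypothetical example, not one of the parameterisations from the papers above): greedy hill climbing over bit-flip neighbours finds the most probable assignment on a smooth landscape but misses a sharply peaked global optimum.

```python
import random

def hill_climb_map(score, n_bits):
    """Greedy local search for a high-scoring assignment (MAP-style).
    Whether it finds the global optimum depends entirely on the landscape induced by `score`."""
    x = [random.random() < 0.5 for _ in range(n_bits)]
    improved = True
    while improved:
        improved = False
        for i in range(n_bits):
            y = x.copy()
            y[i] = not y[i]              # flip one bit
            if score(y) > score(x):      # accept only uphill moves
                x, improved = y, True
    return x, score(x)

# Smooth landscape: score = number of ones; every uphill step leads to the global optimum.
smooth = lambda x: sum(x)
# Peaked landscape: all-zeros is the global optimum (score 1000) but offers no uphill path to it.
peaked = lambda x: 1000 if not any(x) else sum(x)

random.seed(0)
print(hill_climb_map(smooth, 20))  # reaches the global optimum (all ones, score 20)
print(hill_climb_map(peaked, 20))  # almost always stuck in the all-ones local optimum
```

Parameterizing how peaked the distribution is, as in the cited work, is what allows bounds on when such approximation strategies succeed.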

Our version of the Tractability Constraint
“The computations postulated by a model of cognition need to be tractable in the real world in which people live, not only in the small world of an experiment... This eliminates NP-hard models that lead to computational explosion.” (Gigerenzer et al., 2008)
This poses the need for a thorough analysis of the sources of complexity underlying NP-hard models, and eliminates NP-hard models except those that can be proven to be fixed-parameter tractable for parameters that may safely be assumed to be small in the real world.

The Tractable-design cycle
[Diagram: a cycle linking an informal theory and a formal computational-level theory, via tractability testing and empirical testing.]