Graphical models: approximate inference and learning CA6b, lecture 5.

Bayesian Networks General Factorization

D-separation: Example

Trees: Undirected Tree, Directed Tree, Polytree

Converting Directed to Undirected Graphs (2) Additional links

Inference on a Chain
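
As an illustration (not taken from the slides), here is a minimal sum-product pass on a chain of discrete variables, checked against brute-force enumeration; the potentials and variable names are made up:

```python
# Sketch: sum-product message passing on a chain x1 - x2 - x3 - x4 of discrete
# variables, compared against brute-force enumeration of the joint.
import numpy as np

K = 3                                          # number of states per variable
rng = np.random.default_rng(0)
psi = [rng.random((K, K)) for _ in range(3)]   # pairwise potentials psi[i](x_i, x_{i+1})

# Forward messages mu_f[i] = message arriving at node i from the left.
mu_f = [np.ones(K)]
for p in psi:
    mu_f.append(mu_f[-1] @ p)                  # sum over the previous variable

# Backward messages mu_b[i] = message arriving at node i from the right.
mu_b = [np.ones(K)]
for p in reversed(psi):
    mu_b.append(p @ mu_b[-1])
mu_b = mu_b[::-1]

# The marginal of any node is the normalized product of its two incoming messages.
marg_x2 = mu_f[1] * mu_b[1]
marg_x2 /= marg_x2.sum()

# Brute-force check: enumerate the full joint and sum out x1, x3, x4.
joint = np.einsum('ab,bc,cd->abcd', *psi)
check = joint.sum(axis=(0, 2, 3))
print(np.allclose(marg_x2, check / check.sum()))   # True
```

On a chain the same two sweeps give every node's marginal, which is the point of reusing messages rather than running variable elimination once per query.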

Inference in an HMM. E step: belief propagation

Belief propagation in an HMM. E step: belief propagation

Expectation maximization in an HMM. E step: belief propagation
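
A minimal sketch of what "E step: belief propagation" means for a discrete HMM: the forward-backward recursions that compute the posterior state marginals. The parameter names (pi, A, B) and the toy numbers are illustrative assumptions, not taken from the slides:

```python
# Sketch: the E step for a discrete HMM is forward-backward message passing.
# pi: initial state distribution, A: transition matrix, B: observation matrix.
import numpy as np

def forward_backward(y, pi, A, B):
    """Return posterior state marginals gamma[t, s] = p(s_t = s | y_1..T)."""
    T, S = len(y), len(pi)
    alpha = np.zeros((T, S))            # forward (filtering) messages, rescaled
    beta = np.zeros((T, S))             # backward messages, rescaled
    alpha[0] = pi * B[:, y[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, y[t]]
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, y[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Illustrative two-state example (numbers made up for the sketch).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.8, 0.2], [0.3, 0.7]])
print(forward_backward([0, 0, 1, 1, 0], pi, A, B))
```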

The Junction Tree Algorithm Exact inference on general graphs. Works by turning the initial graph into a junction tree and then running a sum-product-like algorithm.

Factor Graphs

Factor Graphs from Undirected Graphs

The Sum-Product Algorithm (6)

The Sum-Product Algorithm (5)

The Sum-Product Algorithm (3)

The Sum-Product Algorithm (7) Initialization
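
To make the two message types and the initialization concrete, here is a hedged sketch of sum-product on a tiny factor graph with one three-way factor; leaf factors send their own tables and leaf variables would send vectors of ones. The graph and potentials are invented for illustration:

```python
# Sketch: the two sum-product message types on a tiny factor graph
#   f1 - x1 - g - x2 - f2
#              \- x3 - f3
# where g is a three-way factor and f1, f2, f3 are unary (leaf) factors.
import numpy as np

rng = np.random.default_rng(1)
K = 2
f1, f2, f3 = (rng.random(K) for _ in range(3))
g = rng.random((K, K, K))                # g(x1, x2, x3)

# Initialization: leaf factors send their own table.
mu_f1_x1, mu_f2_x2, mu_f3_x3 = f1, f2, f3

# Variable-to-factor messages: product of incoming messages from the other factors.
mu_x2_g = mu_f2_x2
mu_x3_g = mu_f3_x3

# Factor-to-variable message: sum out all other variables attached to the factor.
mu_g_x1 = np.einsum('abc,b,c->a', g, mu_x2_g, mu_x3_g)

# Belief at x1: product of all incoming messages, then normalize.
belief_x1 = mu_f1_x1 * mu_g_x1
belief_x1 /= belief_x1.sum()

# Brute-force check against the full joint p(x1, x2, x3) proportional to f1 f2 f3 g.
joint = np.einsum('a,b,c,abc->abc', f1, f2, f3, g)
check = joint.sum(axis=(1, 2))
print(np.allclose(belief_x1, check / check.sum()))   # True
```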

Consequence of failing inhibition in hierarchical inference. [Figure: hierarchical model (Forest, Tree, Leaf, Stem, Green, Root) with bottom-up sensory observations and top-down prior expectations.]

Bayesian network and factor graph. [Figure: causal model and the corresponding pairwise factor graph.]

Pairwise graphs: log belief ratios and log message ratios
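
For binary variables on a pairwise graph, messages and beliefs can be carried as log ratios. A minimal sketch, assuming a symmetric coupling psi(xi, xj) = w if xi = xj and 1 - w otherwise; this parameterization and the numbers are assumptions for illustration, not taken from the slides:

```python
# Sketch (assumed parameterization): belief propagation for binary variables in
# log-ratio form on a small chain x1 - x2 - x3.
import numpy as np

def F(L, w):
    """Log-ratio of the outgoing message given the incoming log-odds L."""
    return np.log((w * np.exp(L) + 1 - w) / ((1 - w) * np.exp(L) + w))

b = np.array([0.8, -0.3, 0.5])          # local evidence as log-odds (made up)
w = 0.7                                  # coupling strength of every edge

# On a chain, a single forward and backward sweep is exact.
m12 = F(b[0], w)                         # message x1 -> x2
m23 = F(b[1] + m12, w)                   # message x2 -> x3
m32 = F(b[2], w)                         # message x3 -> x2
m21 = F(b[1] + m32, w)                   # message x2 -> x1

L = np.array([b[0] + m21,                # log belief ratios (posterior log-odds)
              b[1] + m12 + m32,
              b[2] + m23])
print(1 / (1 + np.exp(-L)))              # posterior p(x_i = 1)
```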

Belief propagation and inhibitory loops

Tight excitatory/inhibitory balance is required, and sufficient (Okun and Lampl, Nat Neurosci 2008). [Figure: excitation and inhibition traces.]

Support for impaired inhibition in schizophrenia: reduced GAD67 in schizophrenia vs. controls (Lewis et al., Nat Rev Neurosci 2005). See also: Benes, Neuropsychopharmacology 2010; Uhlhaas and Singer, Nat Rev Neurosci 2010.

Circular inference: Impaired inhibitory loops

Circular inference and overconfidence:

The Fisher Task (Renaud Jardri, Alexandra Litvinova & Sandrine Duverne). [Figure: prior, sensory evidence, posterior confidence.]

Mean group responses. [Panels: controls; patients with schizophrenia; simple Bayes.]

[Figure: controls vs. patients.]

Mean parameter values. [Figure: parameter values (mean + s.d.) for SCZ vs. CTL, with significance markers.]

Inference loops and psychosis. [Figure: strength of loops vs. PANSS positive factor, and vs. non-clinical beliefs (PDI-21 scores).]

The Junction Tree Algorithm Exact inference on general graphs. Works by turning the initial graph into a junction tree and then running a sum-product-like algorithm. Intractable on graphs with large cliques.

What if exact inference is intractable? Loopy belief propagation works in some scenarios. Markov chain Monte Carlo (MCMC) sampling methods. Variational methods (not covered here).

Loopy Belief Propagation: Sum-Product on general graphs. Initial unit messages are passed across all links, after which messages are passed around until convergence (not guaranteed!). Approximate but tractable for large graphs. Sometimes works well, sometimes not at all.
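
A minimal sketch of the procedure just described: unit initial messages on a small pairwise model with a cycle, iterated until the messages stop changing, compared against exact marginals (the graph and numbers are invented):

```python
# Sketch: loopy sum-product on a small pairwise MRF with a cycle (a triangle of
# binary variables), starting from unit messages and iterating to convergence.
import numpy as np

rng = np.random.default_rng(2)
K, nodes = 2, 3
edges = [(0, 1), (1, 2), (0, 2)]                      # contains a loop
phi = [rng.random(K) for _ in range(nodes)]           # unary potentials
psi = {e: rng.random((K, K)) for e in edges}          # pairwise potentials

# messages[(i, j)] is the message from node i to node j; start with unit messages.
msgs = {(i, j): np.ones(K) for (a, b) in edges for (i, j) in [(a, b), (b, a)]}

def neighbours(i):
    return [j for (a, b) in edges for j in (a, b) if i in (a, b) and j != i]

for it in range(100):
    new = {}
    for (i, j) in msgs:
        pre = phi[i].copy()
        for k in neighbours(i):
            if k != j:
                pre *= msgs[(k, i)]
        table = psi[(i, j)] if (i, j) in psi else psi[(j, i)].T
        m = pre @ table                                # sum over x_i
        new[(i, j)] = m / m.sum()
    delta = max(np.abs(new[e] - msgs[e]).max() for e in msgs)
    msgs = new
    if delta < 1e-8:                                   # convergence is not guaranteed in general
        break

beliefs = []
for i in range(nodes):
    b = phi[i].copy()
    for k in neighbours(i):
        b *= msgs[(k, i)]
    beliefs.append(b / b.sum())

# Brute-force marginals for comparison: loopy BP is only approximate here.
joint = np.einsum('a,b,c,ab,bc,ac->abc', phi[0], phi[1], phi[2],
                  psi[(0, 1)], psi[(1, 2)], psi[(0, 2)])
exact = [joint.sum(axis=ax) / joint.sum() for ax in [(1, 2), (0, 2), (0, 1)]]
print(np.round(beliefs, 3))
print(np.round(exact, 3))
```

On this triangle the BP beliefs are close to, but not exactly equal to, the exact marginals; on a tree they would match exactly.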

Neural code for uncertainty: sampling

Alternative neural code for uncertainty: sampling Berkes et al, Science 2011

Alternative neural code for uncertainty: sampling
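
A minimal sketch of the sampling alternative (the MCMC option mentioned earlier): a Gibbs sampler for a small binary pairwise model, with marginals estimated from sample frequencies. The fields and couplings are made up for illustration:

```python
# Sketch: Gibbs sampling as the MCMC alternative. Draw samples from a small
# binary pairwise model and estimate marginals from sample frequencies.
import numpy as np

rng = np.random.default_rng(3)
n = 3
h = rng.normal(size=n)                   # local fields (log-odds of each unit)
J = np.array([[0.0, 0.8, -0.5],
              [0.8, 0.0, 0.4],
              [-0.5, 0.4, 0.0]])         # symmetric couplings, zero diagonal

x = rng.integers(0, 2, size=n)           # random initial state in {0, 1}^n
counts = np.zeros(n)
n_sweeps, burn_in = 5000, 500

for sweep in range(n_sweeps):
    for i in range(n):
        # Conditional log-odds of unit i given all the others.
        L = h[i] + J[i] @ x
        x[i] = rng.random() < 1 / (1 + np.exp(-L))
    if sweep >= burn_in:
        counts += x

print(counts / (n_sweeps - burn_in))     # Monte Carlo estimate of p(x_i = 1)
```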

Learning in graphical models. More generally: learning parameters in latent variable models. [Figure: visible and hidden nodes.]

Learning in graphical models. More generally: learning parameters in latent variable models. [Figure: visible and hidden nodes; the sum over hidden configurations is huge!]

Mixture of Gaussians (clustering algorithm) Data (unsupervised)

Mixture of Gaussians (clustering algorithm). Data (unsupervised). Generative model: M possible clusters, each a Gaussian distribution.

Mixture of Gaussians (clustering algorithm). Data (unsupervised). Generative model: M possible clusters, each a Gaussian distribution with its own parameters.

Expectation stage: given the current parameters and the data, what are the expected hidden states? (The responsibility of each cluster for each data point.)

Maximization stage: given the responsibilities of each cluster, update the parameters to maximize the likelihood of the data.
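
Putting the two stages together, a hedged sketch of EM for a mixture of Gaussians on synthetic data; the initialization, iteration count, and the data itself are illustrative choices:

```python
# Sketch: EM for a mixture of Gaussians with full covariances.
import numpy as np

def gauss(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    D = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2 * np.pi) ** D * np.linalg.det(cov))
    return np.exp(-0.5 * np.einsum('nd,de,ne->n', diff, inv, diff)) / norm

def em_gmm(X, M, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    N, D = X.shape
    means = X[rng.choice(N, M, replace=False)]      # initialize on random data points
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(D)] * M)
    weights = np.full(M, 1.0 / M)
    for _ in range(n_iter):
        # E step: responsibilities r[n, m] = p(cluster m | x_n, current parameters).
        r = np.column_stack([weights[m] * gauss(X, means[m], covs[m])
                             for m in range(M)])
        r /= r.sum(axis=1, keepdims=True)
        # M step: re-estimate weights, means, and covariances from responsibilities.
        Nm = r.sum(axis=0)
        weights = Nm / N
        means = (r.T @ X) / Nm[:, None]
        for m in range(M):
            d = X - means[m]
            covs[m] = (r[:, m, None] * d).T @ d / Nm[m] + 1e-6 * np.eye(D)
    return weights, means, covs

# Tiny synthetic two-cluster demo.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0, 0], 0.5, size=(100, 2)),
               rng.normal([3, 3], 0.5, size=(100, 2))])
print(em_gmm(X, M=2)[1])                            # learned cluster means
```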

Learning in hidden Markov models. [Figure: the hidden state causes the observations; forward model (sensory likelihood) and inverse model.]

[Figure: hidden state (object present / not) and observations (receptor spike / not) over time.]

Bayesian integration corresponds to leaky integration. [Figure: leak and synaptic input.]
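
A sketch of that correspondence, under simplifying assumptions: exact log-odds filtering for a two-state HMM observed through spikes, next to a plain leaky integrator of the spike log-likelihood ratios. The leak factor 1 - p_on - p_off used here is a small-log-odds approximation chosen for illustration, and all numbers are made up:

```python
# Sketch: exact log-odds filtering for a two-state HMM with spiking observations,
# compared with leaky integration of the synaptic input (the log-likelihood
# ratio of each spike / no-spike).
import numpy as np

rng = np.random.default_rng(4)
T = 400
p_on, p_off = 0.02, 0.02                 # transition probabilities per time bin
q1, q0 = 0.30, 0.05                      # spike probability if object present / absent

# Simulate the hidden state (object present or not) and the spikes.
s = np.zeros(T, dtype=int)
for t in range(1, T):
    s[t] = rng.random() < (1 - p_off if s[t - 1] else p_on)
y = rng.random(T) < np.where(s == 1, q1, q0)

llr = np.where(y, np.log(q1 / q0), np.log((1 - q1) / (1 - q0)))

L_exact = np.zeros(T)                    # exact posterior log-odds
L_leaky = np.zeros(T)                    # leaky integration of the input
for t in range(1, T):
    sig = 1 / (1 + np.exp(-L_exact[t - 1]))
    prior_odds = ((1 - p_off) * sig + p_on * (1 - sig)) / \
                 (p_off * sig + (1 - p_on) * (1 - sig))
    L_exact[t] = np.log(prior_odds) + llr[t]
    L_leaky[t] = (1 - p_on - p_off) * L_leaky[t - 1] + llr[t]

print(np.corrcoef(L_exact, L_leaky)[0, 1])   # the two traces track each other closely
```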

Expectation maximization in an HMM. Multiple training sequences. What are the parameters? Transition probabilities and observation probabilities.

Expectation stage E step: belief propagation

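
The M step that follows: re-estimate the transition and observation probabilities from the expected counts delivered by belief propagation. In this sketch, gamma (single-node posteriors) and xi (pairwise posteriors) are random placeholders standing in for the E-step output, so the snippet runs on its own:

```python
# Sketch: the M step of EM for a discrete HMM over several training sequences.
import numpy as np

def m_step(gammas, xis, ys, n_obs):
    """Re-estimate pi, A, B from expected counts over the training sequences."""
    S = gammas[0].shape[1]
    pi = sum(g[0] for g in gammas)
    A = sum(x.sum(axis=0) for x in xis)                  # expected transition counts
    B = np.zeros((S, n_obs))
    for g, y in zip(gammas, ys):
        for o in range(n_obs):
            B[:, o] += g[np.array(y) == o].sum(axis=0)   # expected emission counts
    return (pi / pi.sum(),
            A / A.sum(axis=1, keepdims=True),
            B / B.sum(axis=1, keepdims=True))

# Placeholder E-step output for two sequences of length 5 (two states, two symbols).
rng = np.random.default_rng(5)
ys = [[0, 1, 1, 0, 1], [1, 0, 0, 0, 1]]
gammas = [rng.dirichlet(np.ones(2), size=5) for _ in ys]
xis = [rng.dirichlet(np.ones(4), size=4).reshape(4, 2, 2) for _ in ys]
print(m_step(gammas, xis, ys, n_obs=2))
```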

Using “on-line” expectation maximization, a neuron can adapt to the statistics of its input.
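
A heavily simplified sketch of that idea: filter on-line with the current parameters and update exponentially forgotten expected counts. Proper on-line EM uses smoothed rather than filtered statistics, so this is only an illustration, and all numbers are made up:

```python
# Sketch (simplified): "on-line" EM for the observation probabilities of a
# two-state HMM, using filtered posteriors and a forgetting factor.
import numpy as np

rng = np.random.default_rng(6)
A = np.array([[0.95, 0.05], [0.05, 0.95]])    # assumed known transitions
B = np.array([[0.6, 0.4], [0.4, 0.6]])        # observation probs to adapt (asymmetric start)
counts = B.copy()                             # running expected counts
alpha = np.array([0.5, 0.5])                  # filtered state posterior
eta = 0.01                                    # forgetting rate

# Stream of observations from a "true" model the neuron should adapt to.
B_true = np.array([[0.9, 0.1], [0.2, 0.8]])
s = 0
for t in range(20000):
    s = rng.random() < A[s, 1]
    y = rng.random() < B_true[int(s), 1]
    # On-line E step: one step of filtering with the current parameters.
    alpha = (alpha @ A) * B[:, int(y)]
    alpha /= alpha.sum()
    # On-line M step: decay old counts, add the new expected count, renormalize.
    counts *= (1 - eta)
    counts[:, int(y)] += eta * alpha
    B = counts / counts.sum(axis=1, keepdims=True)

print(np.round(B, 2))    # adapts toward B_true (approximately, up to state relabelling)
```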

Fast adaptation in single neurons: adaptation to temporal statistics? (Fairhall et al., 2001)