Download presentation
Presentation is loading. Please wait.
Published byHorace Dickerson Modified over 9 years ago
1
Abstract We offer a formal treatment of choice behaviour based on the premise that agents minimise the expected free energy of future outcomes. Crucially, the negative free energy or quality of a policy can be decomposed into extrinsic and epistemic (intrinsic) value. Minimising expected free energy is therefore equivalent to maximising extrinsic value or expected utility (defined in terms of prior preferences or goals), while maximising information gain or intrinsic value; i.e., reducing uncertainty about the causes of valuable outcomes. The resulting scheme resolves the exploration-exploitation dilemma: epistemic value is maximised until there is no further information gain, after which exploitation is assured through maximisation of extrinsic value. This is formally consistent with the Infomax principle, generalising formulations of active vision based upon salience (Bayesian surprise) and optimal decisions based on expected utility and risk sensitive (KL) control. Furthermore, as with previous active inference formulations of discrete (Markovian) problems; ad hoc softmax parameters become the expected (Bayes-optimal) precision of beliefs about – or confidence in – policies. We focus on the basic theory – illustrating the minimisation of expected free energy using simulations. A key aspect of this minimisation is the similarity of precision updates and dopaminergic discharges observed in conditioning paradigms. Active inference and epistemic value Karl Friston, Francesco Rigoli, Dimitri Ognibene, Christoph Mathys, Thomas FitzGerald and Giovanni Pezzulo
2
Premise All agents minimize free energy (under a generative model) All agents possess prior beliefs (preferences) Free energy is minimized when priors are (actively) realized All agents believe they will minimize (expected) free energy
3
Set-up and definitions: active inference Free energy if action perception world agent Generative process Generative model Approximate posterior Perception-action cycle
4
Hidden states Control states ? ? Reject or stay Low offer High offer ? ? Accept or shift Low offer High offer An example:
5
if Full priors – control states Empirical priors – hidden states Likelihood The (normal form) generative model Hidden states Action Control states
6
KL or risk-sensitive control In the absence of ambiguity: Expected utility theory In the absence of posterior uncertainty or risk: Priors over policies Bayesian surprise and Infomax In the absence of prior beliefs about outcomes: Prior beliefs about policies Expected free energy Extrinsic value Epistemic value Predicted divergence Predicted ambiguity Predicted divergence Extrinsic value Bayesian surprise Predicted mutual information
7
The quality of a policy corresponds to (negative) expected free energy Generative model of future statesFuture generative model of states Prior preferences (goals) over future outcomes
8
The mean field partition And variational updates Minimising free energy
9
midbrain motor Cortex occipital Cortex striatum Variational updates Perception Action selection Precision Functional anatomy Forward sweeps over future states if prefrontal Cortex hippocampus Predicted divergence Predicted ambiguity
10
The (T-maze) problem Y Y or
11
YY Generative model Hidden states Control states Observations Extrinsic value Epistemic value Prior beliefs about control Posterior beliefs about states YYYY location stimulus CS CS US NS YYYY location context YYYY location
12
Comparing different schemes 00.20.40.60.81 0 20 40 60 80 Performance Prior preference success rate (%) FE KL EU DA Sensitive to risk or ambiguity - - + - + +
13
Simulating conditioned responses (in terms of precision updates)
14
-8-7-6-5-4-3-20 0 1 2 3 4 5 6 7 8 Expected value Expected precision Expected precision and value Changes in expected precision reflect changes in expected value: c.f., dopamine and reward prediction error
15
Simulating conditioned responses
16
Learning as inference YY Hidden states YYYY location context Control states YYYY YYYY maze Hierarchical augmentation of state space Bayesian belief updating between trials: of conserved (maze) states
17
Learning as inference performance uncertainty
18
Summary Optimal behaviour can be cast as a pure inference problem, in which valuable outcomes are defined in terms of prior beliefs about future states. Exact Bayesian inference (perfect rationality) cannot be realised physically, which means that optimal behaviour rests on approximate Bayesian inference (bounded rationality). Variational free energy provides a bound on Bayesian model evidence that is optimised by bounded rational behaviour. Bounded rational behaviour requires (approximate Bayesian) inference on both hidden states of the world and (future) control states. This mandates beliefs about action (control) that are distinct from action per se – beliefs that entail a precision. These beliefs can be cast in terms of minimising the expected free energy, given current beliefs about the state of the world and future choices. The ensuing quality of a policy entails epistemic value and expected utility that account for exploratory and exploitative behaviour respectively. Variational Bayes provides a formal account of how posterior expectations about hidden states of the world, policies and precision depend upon each other; and may provide a metaphor for message passing in the brain. Beliefs about choices depend upon expected precision while beliefs about precision depend upon the expected quality of choices. Variational Bayes induces distinct probabilistic representations (functional segregation) of hidden states, control states and precision – and highlights the role of reciprocal message passing. This may be particularly important for expected precision that is required for optimal inference about hidden states (perception) and control states (action selection). The dynamics of precision updates and their computational architecture are consistent with the physiology and anatomy of the dopaminergic system.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.