Aims
- The research aims to provide an internally consistent, practically applicable methodology of dynamic decision making (DM).
- The talk aims to provide a DM-based justification of Bayesian learning and to point to open problems worth a research effort.

Design of an optimal DM strategy
- The optimal strategy R^T should reach the best behavior Q = (d^T, Θ^T) using the available information while respecting the given restrictions.
- The system S^T is a part of the world (S_t: world state); it responds to actions a_t with innovations y_t and unseen quantities Θ_t; data d_t = (y_t, a_t).
- The strategy R^T generates actions by rules R_t: (prior, d^{t-1}) → a_t, under information, complexity, and range restrictions.
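To make the notation concrete, here is a minimal Python sketch of these objects; all names (DataRecord, Behavior, run_strategy, system_step) and the scalar types are illustrative assumptions, not part of the talk.

```python
# Minimal sketch (hypothetical names) of the objects on this slide:
# data records d_t = (y_t, a_t), behavior Q = (d^T, Theta^T), and
# decision rules R_t mapping past data d^{t-1} to an action a_t.
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class DataRecord:
    """d_t = (y_t, a_t): innovation seen at time t and the action taken."""
    y: float
    a: float


@dataclass
class Behavior:
    """Q = (d^T, Theta^T): seen data trajectory plus the unseen quantities."""
    data: List[DataRecord]   # d^T = (d_1, ..., d_T)
    theta: List[float]       # Theta^T, never available to the strategy


# A rule R_t uses only the prior knowledge and d^{t-1}, never Theta.
Rule = Callable[[Sequence[DataRecord]], float]


def run_strategy(rules: List[Rule],
                 system_step: Callable[[float, Sequence[DataRecord]], Tuple[float, float]]
                 ) -> Behavior:
    """Closed loop S-R: at each t the rule picks a_t from d^{t-1},
    the system responds with (y_t, theta_t)."""
    data: List[DataRecord] = []
    thetas: List[float] = []
    for rule in rules:                      # t = 1, ..., T
        a_t = rule(data)                    # R_t: d^{t-1} -> a_t
        y_t, theta_t = system_step(a_t, data)
        data.append(DataRecord(y=y_t, a=a_t))
        thetas.append(theta_t)
    return Behavior(data=data, theta=thetas)
```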

Behaviors' & strategies' ordering
- The aim orders strategies: R is better than R′ iff the behavior Q = (d^T, Θ^T) of the loop S-R is closer to the aim than the behavior Q′ = (d′^T, Θ′^T) of S-R′.
- Behaviors are ordered via a loss Z: Q → fully ordered set; R is better than R′ ⇔ Z(Q) ≤ Z(Q′).
- But Z is unsuitable for design if the dependence of Q on R is unknown!

Loss as a function of uncertainty
- Uncertainty ≠ ∅: split Q = (knowingly R-given part, U); then for each R ∈ {R} and U, Z(U) = Z(Q).
- Sources of the uncertainty U: the unseen Θ^T, complexity, modeling errors, vague quantification, neglected influences.
- For U ≠ ∅, a prior ordering on {R} is needed: E: Z(·) → fully ordered set; R is better than R′ ⇔ E[Z] ≤ E[Z′].
- ... what E's are good?

Requirements on E
- The best R implied by the chosen E must not be a priori bad.
- The E should be as objective as possible:
  - it has to be good on designer-chosen subsets of {R};
  - it has to serve for any (reasonable) designer-chosen loss.
- The best R must not depend on the designer's attitude to uncertainty.
- The best R has to minimize the loss when the uncertainty is empty.

Consequences of the requirements
- Additivity on losses with uncommon (disjoint) supports: Z = Z′ + Z″ with supp Z′ ∩ supp Z″ = ∅ ⇒ E[Z] = E[Z′] + E[Z″].
- Neutral attitude to uncertainty: Φ(Z) = Z.
- ... E is the mathematical expectation of a utility function Φ(Z) of the loss Z!
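As a toy illustration of this ordering (all numbers invented), the sketch below compares two strategies by the expected value of the neutral utility Φ(Z) = Z over a finite uncertainty U.

```python
# Tiny numeric illustration (made-up numbers) of ordering strategies by
# E[Phi(Z)]: with the neutral utility Phi(Z) = Z, strategy R is preferred
# to R' iff its expected loss is smaller.
probs_U = [0.2, 0.5, 0.3]          # belief over the uncertain part U
loss_R  = [1.0, 4.0, 2.0]          # Z(U) under strategy R
loss_R2 = [0.0, 5.0, 3.0]          # Z(U) under strategy R'

phi = lambda z: z                  # neutral attitude to uncertainty

E_R  = sum(p * phi(z) for p, z in zip(probs_U, loss_R))
E_R2 = sum(p * phi(z) for p, z in zip(probs_U, loss_R2))

print(f"E[Z] under R  = {E_R:.2f}")    # 2.80
print(f"E[Z] under R' = {E_R2:.2f}")   # 3.40 -> R is the better strategy
```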

E ⇒ Bayesian calculus
- Basic DM lemma: for a rule R: D → a, min_a E[Z] = E[ min_a E[Z | a, D] ].
- Design of dynamic DM (backward recursion for t = T, T-1, ..., 1; M_t is the model of S_t):
  V(d^T) = E[ Z(d^T, Θ^T) | d^T, M_T ]
  V(d^{t-1}) = min_{a_t} E[ V(d^t) | a_t, d^{t-1}, M_{t-1} ]
- The optimal R^T is randomized with supp f(a_t | d^{t-1}, M_{t-1}) = Arg min_{a_t} E[ V(d^t) | a_t, d^{t-1}, M_{t-1} ].
- Needed pdfs: f(Θ^T | d^T, M_T) ... Bayesian filtering; f(y_t | a_t, d^{t-1}, M_{t-1}) ... Bayesian prediction.
- The required models M_t of S_t are implied by the need to filter & predict!
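The backward recursion above can be sketched on a toy finite problem; the discrete state x, the transition table P, and the loss table below are hypothetical stand-ins for the information state d^{t-1} and the models M_t, not part of the talk.

```python
# Minimal sketch of the backward recursion, specialized to a toy finite
# problem (hypothetical numbers): x plays the role of the information state
# d^{t-1}, and V[t][x] approximates V(d^{t-1}).
import numpy as np

T = 3                                   # horizon
n_states, n_actions = 2, 2
rng = np.random.default_rng(0)

# P[a][x, x'] = probability of the next state x' given state x and action a
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
# loss[x, a] = expected per-step loss E[Z_t | x, a]
loss = rng.uniform(0.0, 1.0, size=(n_states, n_actions))

V = np.zeros((T + 1, n_states))         # V[T] = terminal loss (here 0)
policy = np.zeros((T, n_states), dtype=int)

for t in range(T - 1, -1, -1):          # backward over stages
    for x in range(n_states):
        # q[a] = E[ loss + V_{t+1} | x, a ]  -- the inner expectation
        q = [loss[x, a] + P[a, x] @ V[t + 1] for a in range(n_actions)]
        policy[t, x] = int(np.argmin(q))   # Arg min of the expectation
        V[t, x] = min(q)                   # V(d^{t-1}) = min_a E[...]

print("optimal rules per (t, x):\n", policy)
print("expected loss of the optimal strategy from the start:", V[0])
```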

M_t: observation & time evolution
- Natural conditions of DM: f(a_t, Θ_{t-1} | d^{t-1}) = f(a_t | d^{t-1}) f(Θ_{t-1} | d^{t-1}), i.e., the rules do not use the unseen Θ.
- Observation model f(y_t | a_t, d^{t-1}, Θ_t) relates the seen to the unseen.
- Evolution model f(Θ_{t+1} | a_{t+1}, d^t, Θ_t) models the unseen.
- Predictive pdf: f(y_t | a_t, d^{t-1}) = ∫ f(y_t | a_t, d^{t-1}, Θ_t) f(Θ_t | a_t, d^{t-1}) dΘ_t.
- Filtering, data update: f(Θ_t | d^t) ∝ f(y_t | a_t, d^{t-1}, Θ_t) f(Θ_t | a_t, d^{t-1}).
- Filtering, time update: f(Θ_{t+1} | a_{t+1}, d^t) = ∫ f(Θ_{t+1} | a_{t+1}, d^t, Θ_t) f(Θ_t | d^t) dΘ_t.
- Prior pdf: f(Θ_1 | d^0) = f(Θ_1) = belief in the possible values of Θ_1.
- ... why should f(Θ_1) > 0 when M^T ≠ S^T for any Θ_1?
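A minimal numerical sketch of the two filtering updates follows, with Θ restricted to a three-point grid and the integrals replaced by sums; the observation and evolution tables are invented for illustration, and the dependence on the actions a_t is dropped for brevity.

```python
# Discrete-grid sketch (hypothetical models) of the filtering recursions:
# data update f(theta_t | d^t) and time update f(theta_{t+1} | d^t),
# with integrals over theta replaced by sums over a 3-point grid.
import numpy as np

# observation model f(y | theta): y in {0, 1}, one Bernoulli rate per theta
obs = np.array([[0.9, 0.1],
                [0.5, 0.5],
                [0.2, 0.8]])                     # obs[theta, y]
# evolution model f(theta' | theta): a slowly drifting chain
evo = np.array([[0.8, 0.2, 0.0],
                [0.1, 0.8, 0.1],
                [0.0, 0.2, 0.8]])                # evo[theta, theta']

belief = np.full(3, 1.0 / 3)                     # prior f(theta_1)
for y_t in [1, 1, 0, 1]:                         # observed innovations
    # data update: f(theta_t | d^t) proportional to f(y_t | theta_t) f(theta_t | d^{t-1})
    belief = obs[:, y_t] * belief
    belief /= belief.sum()
    print("posterior after y =", y_t, "->", np.round(belief, 3))
    # time update: f(theta_{t+1} | d^t) = sum_theta f(theta_{t+1} | theta) f(theta | d^t)
    belief = evo.T @ belief
```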

Bayesian paradigm & reality
- A part of the world generates d^T, Θ^T; the model set is indexed by Θ^T.
- The nearest model, the best projection of S^T, is unknown because S^T itself is unknown.
- Prior pdf = belief that Θ_1 is the best projection; posterior pdf = that belief corrected by the data d^T.
- ... any practical consequence?

Projection consequences
- The world model is a subjective thing ⇒ the DM is free to see the world ... at their own responsibility.
- Bayes' rule learns the best projection ⇒ it minimizes the entropy rate ⇒ the entropy rate is an inherent Bayesian discrepancy measure.
- The quality of the best projection depends heavily on the model set ⇒ careful modeling of the world pays back.
- The projection error cannot be measured ... a range of methodologies ignores this!
- ... any chance to get information about better models?
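The second bullet can be illustrated by a small simulation (all numbers invented): the true source lies outside the model set, yet the posterior concentrates on the parameter whose model has the smallest Kullback-Leibler divergence from the truth, i.e., on the best projection.

```python
# Small simulation (made-up numbers) of "Bayes' rule learns the best
# projection": the true source Bernoulli(0.7) is NOT in the model set
# {Bernoulli(0.2), Bernoulli(0.5), Bernoulli(0.9)}, yet the posterior
# concentrates on theta = 0.5, the model with the smallest KL divergence
# from the true source.
import numpy as np

rng = np.random.default_rng(1)
y = rng.random(500) < 0.7                       # data from the unknown world

thetas = np.array([0.2, 0.5, 0.9])              # misspecified model set
log_lik = y.sum() * np.log(thetas) + (~y).sum() * np.log(1 - thetas)
post = np.exp(log_lik - log_lik.max())
post /= post.sum()                              # posterior, uniform prior

kl = 0.7 * np.log(0.7 / thetas) + 0.3 * np.log(0.3 / (1 - thetas))
for th, p, d in zip(thetas, post, kl):
    print(f"theta={th:.1f}  posterior={p:.3f}  KL(truth || model)={d:.3f}")
# The posterior mass piles up on the theta minimizing the KL divergence.
```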

Model comparison f(dT|M)= f(dT,T) dT f(dT |M)= f(dT,T) dT Model set M indexed by T Model set M indexed by T The best M*  {M M} uncertain Point estimation Model combination f(M* | dT)  f(dT |M*) f(M*) f(dT) = M* f(dT|M*) f(M*) preserves complexity avoids unnecessary DM

Lesson from model comparison
- Values of the predictive pdf serve for model comparison.
- Compound hypothesis testing is straightforward if the alternative model sets are specified (modeling!).
- Unnecessary DM should be avoided (valid generally!).
- No new techniques are needed ... just modeling!

Open problems
- Is it useful to exploit that strategies R ≠ R′ often come with different uncertainties U ≠ U′? ... probably yes.
- Must E be additive on losses with uncommon support? ... probably a reasonable requirement; I see no alternative.
- Is the notion of a loss ordering a posteriori behaviors needed? ... probably unnecessary and just a matter of explanation: worth trying when justifying the fully probabilistic design of DM strategies.
- Does the Bayesian scheme lead to quantum-mechanical effects? ... probably yes, via modeling measurement as generally non-commuting projections of physical (societal) quantities.

Open problems
- Is it a priori possible to point to the drawbacks of non-Bayesian DM? ... conceptually probably yes, but the job has not been done even for important classes of methodologies.
- Is DM possible without explicit modeling? ... definitely yes, at the cost of the final quality & reliability.
- Is there a systematic and feasible way of generating alternative model sets M? ... probably not, but guides fixing good practice (!?) are needed.
- How can partial, domain-specific pieces of knowledge be translated into the model and the prior pdf? ... I do not know a sufficiently general way, but partial results make me optimistic.

Open problems
- Is it possible to make modeling fully algorithmic? ... probably not, but guides fixing good practice (!?) are needed.
- How to cope with the fact that filtering (estimation) has to take its outputs from the feasible class in which the prior pdf is naturally chosen? ... a methodological unification of the Bayesian paradigm with approximation theory is probably the only systematic, though hard, way; otherwise, partial ad hoc solutions have to be elaborated.
- Is it possible to formulate (and finally solve) DM design under complexity restrictions? ... I do not know, so it is time to stop here!