Value of Information, 1st Year Review, UCLA 2012: ARO MURI on Value-centered Information Theory for Adaptive Learning, Inference, Tracking, and Exploitation

Presentation transcript:

Information Value in Registration and Sensor Management
Doug Cochran, Arizona State University
ARO MURI on Value-centered Information Theory for Adaptive Learning, Inference, Tracking, and Exploitation
ARO/OSD MURI 1st Year Review, UCLA, 28 October 2012
Joint work with Steve Howard, Utku Ilkturk, Bill Moran, and Rodrigo Platte

Summary of 2012 Research Thrusts
1. Gauge-invariant estimation in networks: in parameter estimation with a sensor network, which links contribute most?
2. Sensor management via Riemannian geometry: the metric structure induced on a parameter manifold by the Fisher information in estimation problems provides an approach to managing sensor configurations.
3. Measurement selection for observability (started in collaboration with ARL): what are the considerations when deciding which linear measurement map to choose from a library at each stage of an iterative observation problem?

Gauge-invariant Estimation in Networks: Tenets
[Figure: example sensor network graph with eight nodes]
- The purpose of sensor networks is to sense, i.e., to enable detection, estimation, classification, and tracking.
- The value of network infrastructure is to enable sharing and fusion of data collected at different sensors.
- To exploit this, data at the nodes must be registered:
  - Intrinsic data, e.g., clocks, platform orientation
  - Extrinsic data, collected by the sensors
- Can we quantify the value of adding particular links in terms that are meaningful to the sensing mission?

Gauge-invariant Estimation in Networks: Registration on a Graph
- Network graph: a directed graph Γ on a vertex set V(Γ) with edge set E(Γ).
- Vertex labels: an element of a Lie group G is associated with each node of Γ.
- Edge labels: an element of G on each edge e ∈ E(Γ), representing a noisy measurement of the difference between the values at the target vertex t(e) and the source vertex s(e).
- Goal: estimate the connection, i.e., the true relative offsets between the vertex values.
- The state of the network is a G-valued function x on V(Γ), but it is never directly observed.
  - If the network were aligned, this function would be constant (a flat connection).
- What is observed is the connection in G^|E(Γ)| relative to the chosen gauge.
  - In the absence of noise, for an aligned network this is the identity connection.
  - If the network is not aligned, a gauge transformation can be found that takes the connection to the identity.
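A minimal numerical sketch of this setup for the abelian case G = (ℝ, +), e.g., per-node clock offsets. The specific graph, offsets, and noise level below are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy directed graph Gamma: edges as (source, target) pairs over 4 vertices
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n_vertices = 4

# Hidden G-valued state x on V(Gamma); here G = (R, +), e.g., per-node clock offsets
x = rng.normal(size=n_vertices)

# Noiseless edge labels: the difference x[t(e)] - x[s(e)] on each edge
d = np.array([x[t] - x[s] for s, t in edges])

# Gauge invariance: shifting every vertex by the same group element g leaves
# the edge labels unchanged, so only relative offsets are identifiable.
g = 5.0
d_shifted = np.array([(x[t] + g) - (x[s] + g) for s, t in edges])
assert np.allclose(d, d_shifted)

# Observed connection: noisy edge measurements
sigma = 0.1
y = d + sigma * rng.normal(size=len(edges))
```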

Gauge-invariant Estimation in Networks: Representative Results
- If G is the real line and the noise on the edges of Γ is zero-mean Gaussian with covariance matrix R = σ²I:
  - The Fisher information is F = L/σ², where L is any cofactor of the Laplacian of Γ.
  - det F = t(Γ)/σ^(2(|V(Γ)|−1)), where t(Γ) denotes the number of spanning trees in Γ.
  - The ML estimator of the connection x modulo any chosen gauge is unbiased, with covariance F⁻¹ and determinant 1/det F.
- Additional results for large classes of Lie groups G, including compact, non-compact, abelian, and non-abelian cases.
- More precise general formulation of the notion of gauge-invariant estimation on graphs, and properties of the estimators (Allerton 2012).
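The determinant formula can be checked numerically via the Matrix-Tree theorem, and the ML estimate in a fixed gauge is ordinary least squares. A self-contained sketch on the same toy graph as above (again an illustrative example, not the talk's data):

```python
import numpy as np

rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]    # toy graph Gamma
n, sigma = 4, 0.1
x = rng.normal(size=n)                               # hidden vertex values (G = R)
y = np.array([x[t] - x[s] for s, t in edges]) + sigma * rng.normal(size=len(edges))

# Incidence matrix B: row e has -1 at the source s(e) and +1 at the target t(e)
B = np.zeros((len(edges), n))
for e, (s, t) in enumerate(edges):
    B[e, s], B[e, t] = -1.0, 1.0

L = B.T @ B              # graph Laplacian of Gamma
L_red = L[1:, 1:]        # a cofactor of L: ground vertex 0 to fix the gauge

# Matrix-Tree theorem: any cofactor of the Laplacian has determinant t(Gamma)
t_gamma = round(np.linalg.det(L_red))    # 8 spanning trees for this graph

# det F = t(Gamma) / sigma^(2(|V|-1)) in the chosen gauge
F = L_red / sigma**2
assert np.isclose(np.linalg.det(F), t_gamma / sigma**(2 * (n - 1)))

# ML estimate of the offsets modulo gauge (vertex 0 pinned to zero): least squares
x_hat = np.zeros(n)
x_hat[1:] = np.linalg.solve(L_red, B[:, 1:].T @ y)
print("true offsets:", x - x[0])
print("ML estimate :", x_hat)
```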

Sensor Management via Riemannian Geometry
- Mutual information and divergences have successful histories in sensor management as surrogates for actual cost functions (i.e., as representations of VoI).
- Problem: given what is known, what sensor trajectory optimizes information gathering over the next T seconds?
- Fact: the set of all Riemannian metrics on a Riemannian manifold M is a (weak) infinite-dimensional Riemannian manifold M(M).
- Observation 1: the Riemannian metrics on M corresponding to Fisher information constitute a submanifold of M(M).
- Observation 2: for a particular problem of estimating a parameter from sensor data, each choice of sensor corresponds to a Riemannian metric on M lying in this submanifold of M(M).
- Sensing action S → log-likelihood ℓ_S on M → Fisher information F_S on M → Riemannian metric on M → point in M(M)

Sensor Management via Riemannian Geometry: Estimation-theoretic Preliminaries
- Consider a family of conditional densities p(x|θ) for a random variable x on X, parameterized by θ in a smooth d-dimensional manifold M.
- For given x, p(x|θ) defines the likelihood function on M.
- The log-likelihood ℓ_x : M → ℝ is defined by ℓ_x(θ) = log p(x|θ).
- The optimal test for θ versus θ′ given data x is a likelihood ratio test, of the form ℓ_x(θ) − ℓ_x(θ′) ≷ η.
- The Kullback-Leibler (KL) divergence D(θ‖θ′) = ∫ p(x|θ) log[p(x|θ)/p(x|θ′)] dx is a natural measure of discrimination on M.
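As a concrete illustration (mine, not the talk's): for nearby parameters, the KL divergence behaves like the quadratic form that the Fisher information assigns to the step, which is exactly the local behavior the next slide turns into a Riemannian metric. A quick check for the Bernoulli family:

```python
import numpy as np

def kl_bernoulli(p, q):
    """KL divergence D(p || q) between Bernoulli(p) and Bernoulli(q)."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

p, delta = 0.3, 1e-3
fisher = 1.0 / (p * (1 - p))      # Fisher information of Bernoulli(p)

# Locally, KL is half the Fisher quadratic form: D(p || p+d) ~ 0.5 * F(p) * d^2
print(kl_bernoulli(p, p + delta))  # ~2.4e-06
print(0.5 * fisher * delta**2)     # ~2.4e-06
```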

Sensor Management via Riemannian Geometry: Metric Geometry of Sensor Selection
- A Riemannian metric on a smooth manifold is a (positive definite) inner product on each tangent space that varies smoothly from point to point.
- Although the KL divergence is not symmetric, it induces a Riemannian metric on M.
- This Fisher information metric is defined by F_ij(θ) = E_θ[∂ℓ_x(θ)/∂θ^i · ∂ℓ_x(θ)/∂θ^j].
- The corresponding volume form, vol_F = √(det F) dθ, is the Jeffreys prior on M.
- Under suitable assumptions, a metric on M(M) is obtained by integrating a pointwise inner product of metric perturbations over M against dP, where dP = vol_F or, more generally, is a probability density on M.
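An illustrative sketch (not from the slides): for the Bernoulli family, the Fisher information estimated from the score matches the closed form 1/(p(1−p)), and the Jeffreys prior is the corresponding √F volume form.

```python
import numpy as np

def fisher_bernoulli_mc(p, n_samples=1_000_000, seed=0):
    """Monte Carlo estimate of F(p) = E[(d/dp log p(x|p))^2] for Bernoulli(p)."""
    rng = np.random.default_rng(seed)
    x = rng.random(n_samples) < p
    score = np.where(x, 1.0 / p, -1.0 / (1.0 - p))   # d/dp log p(x|p)
    return np.mean(score**2)

p = 0.3
print(fisher_bernoulli_mc(p), 1.0 / (p * (1 - p)))  # MC estimate vs closed form, ~4.76

# The Jeffreys prior on (0,1) is proportional to sqrt(F(p)) = 1/sqrt(p(1-p)),
# i.e., the Beta(1/2, 1/2) density.
```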

Sensor Management via Riemannian Geometry: Manifold of Sensor Configurations
[Figure: the map g from the sensor-configuration manifold S into the manifold of Riemannian metrics M(M)]
- Suppose that the sensor configuration is parameterized by a smooth manifold S.
- A configuration s ∈ S gives rise to a particular Riemannian metric g_s on M.
- The mapping g taking s to g_s will be assumed to be smooth and one-to-one (e.g., an immersion).
- Although M(M) is infinite-dimensional, the trajectory planning takes place in a finite-dimensional submanifold that inherits its metric structure from M(M).
- Geometrically, optimal navigation in S is via geodesics.
- The geometry here is defined directly in terms of Fisher information.

Sensor Management via Riemannian Geometry: Geodesics
- The geodesic structure of M(M) has been studied outside the context of information geometry.
- The "energy integral" of a smooth curve γ : [0,1] → M(M) is E_γ = ½ ∫₀¹ ⟨γ̇(t), γ̇(t)⟩_γ(t) dt.
- Geodesics in M(M) are extremals of E_γ; they satisfy the associated Euler-Lagrange equations.
- With γ restricted to S, E_γ becomes an integral I_γ on S, defined with respect to pullbacks of the quantities in the energy integral on M(M).
- Geodesics in S are extremals of I_γ, which satisfy γ̈^k + Γ^k_ij γ̇^i γ̇^j = 0, where Γ denotes the Christoffel symbols of the Levi-Civita connection on M(M), pulled back to S.
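A concrete sketch, not from the talk: the Fisher information metric of the Gaussian family p(x|μ,σ) is ds² = (dμ² + 2dσ²)/σ², a hyperbolic metric, and its information geodesics can be integrated numerically from the geodesic equation above (Christoffel symbols worked out by hand for this metric).

```python
import numpy as np
from scipy.integrate import solve_ivp

def geodesic_rhs(t, state):
    """Geodesic ODE for the Fisher metric ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2
    of the Gaussian family p(x | mu, sigma)."""
    mu, sigma, dmu, dsigma = state
    # From Gamma^mu_{mu,sigma} = -1/sigma, Gamma^sigma_{mu,mu} = 1/(2 sigma),
    # Gamma^sigma_{sigma,sigma} = -1/sigma:
    ddmu = (2.0 / sigma) * dmu * dsigma
    ddsigma = -dmu**2 / (2.0 * sigma) + dsigma**2 / sigma
    return [dmu, dsigma, ddmu, ddsigma]

# Shoot a geodesic from (mu, sigma) = (0, 1) with initial velocity (1, 0)
sol = solve_ivp(geodesic_rhs, (0.0, 2.0), [0.0, 1.0, 1.0, 0.0])
mu_path, sigma_path = sol.y[0], sol.y[1]
# Starting horizontally, sigma decreases: the geodesic bows toward smaller sigma,
# tracing a (stretched) Poincare half-plane semicircle rather than a straight line.
print(mu_path[-1], sigma_path[-1])
```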

Sensor Management via Riemannian Geometry: Computational Example
- Mobile sensor platforms with bearings-only sensors seek to localize a stationary emitter.
- The parameter manifold M is ℝ², the position of the emitter in the plane.
- Noise is independent von Mises, with mean zero and concentration parameter κ.
- To simplify computation, the sensors are constrained to remain at right angles with respect to the emitter.
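A hedged sketch of the Fisher information computation underlying this example (the sensor geometry and parameter values are my illustrative assumptions): using the standard result that a von Mises observation carries Fisher information κ·I₁(κ)/I₀(κ) about its mean direction, the chain rule maps each bearing measurement to an information matrix over emitter position.

```python
import numpy as np
from scipy.special import i0, i1

def bearing_fim(emitter, sensors, kappa):
    """Fisher information matrix for emitter position in the plane, from
    bearings-only measurements with independent von Mises noise."""
    A = kappa * i1(kappa) / i0(kappa)    # Fisher information about the mean bearing
    F = np.zeros((2, 2))
    for s in sensors:
        dx, dy = emitter - s
        r2 = dx**2 + dy**2
        grad = np.array([-dy, dx]) / r2  # gradient of atan2(dy, dx) w.r.t. emitter
        F += A * np.outer(grad, grad)
    return F

emitter = np.array([0.0, 0.0])
# Two sensors at right angles about the emitter, as in the constrained example
sensors = [np.array([10.0, 0.0]), np.array([0.0, 10.0])]
F = bearing_fim(emitter, sensors, kappa=5.0)
print(np.linalg.det(F))   # larger det F means a tighter CRB ellipse for the emitter
```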

Measurement Selection for Observability: New Topic in Collaboration with ARL
- In a stochastic dynamical system, suppose the state can be measured at each time instant via a measurement map that is selectable from a library.
  - E.g., x ∈ ℝ^d with x(k+1) = A x(k) + w(k) and y(k) = C_s x(k) + n(k), k = 1, 2, …, where C_s is selectable.
- What is the most informative measurement sequence, subject to constraints, for "observation"? (A greedy baseline is sketched below.)
  - In terms of estimation fidelity for x(0)? For numerical conditioning?
  - For hypothesis testing on functions of x(0), with myopic, finite-horizon, and infinite-horizon objectives?
  - What if the dynamics are driven by an adversary?
- How do biological systems manage measurement of dynamical information for sensorimotor control? (Brian's ongoing collaborations at UMD)
  - What can we learn about quantifying value of information to support multi-faceted and dynamic tasks?
- Started summer 2012 during Utku Ilkturk's six-week visit to ARL; continuing in collaboration with Brian Sadler.
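A minimal sketch of one natural baseline for this selection problem (my assumption, not the talk's method): greedily pick, at each step k, the library matrix that most increases the log-determinant of the accumulated Fisher information about x(0), using the noiseless-dynamics map x(k) = A^k x(0) and white measurement noise.

```python
import numpy as np

rng = np.random.default_rng(1)
d, horizon, sigma2 = 4, 8, 0.1

A = rng.normal(scale=0.5, size=(d, d))                 # state dynamics
library = [rng.normal(size=(1, d)) for _ in range(5)]  # selectable measurement maps C_s

F = 1e-9 * np.eye(d)    # tiny prior information keeps log det finite early on
Ak = np.eye(d)          # running power A^k
choices = []
for k in range(horizon):
    # Greedy step: choose the map that maximizes log det of the updated information
    best = max(
        range(len(library)),
        key=lambda i: np.linalg.slogdet(
            F + (library[i] @ Ak).T @ (library[i] @ Ak) / sigma2
        )[1],
    )
    C = library[best] @ Ak
    F = F + C.T @ C / sigma2    # accumulate Fisher information about x(0)
    Ak = A @ Ak
    choices.append(best)

print("greedy choices:", choices)
print("final log det F:", np.linalg.slogdet(F)[1])  # proxy for fidelity of x(0) estimate
```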