Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions, by S. Mahadevan & M. Maggioni. Discussion led by Qi An, ECE, Duke University.

Presentation transcript:

Value Function Approximation with Diffusion Wavelets and Laplacian Eigenfunctions, by S. Mahadevan & M. Maggioni. Discussion led by Qi An, ECE, Duke University.

Outline: Introduction; Approximate policy iteration; Value function approximation; Laplacian eigenfunction approximation; Diffusion wavelet approximation; Experimental results; Conclusions

Introduction In MDP models, it is often necessary to approximate the value function when the state space is large or when the model must be learned from experience, as in reinforcement learning. This paper explores two novel approaches to value function approximation on state-space graphs.

Approximate policy iteration In a reinforcement-learning MDP, value function approximation is one step of the approximate policy iteration process, which is used to solve the RL problem iteratively.

Approximate policy iteration [Diagram of the approximate policy iteration loop, driven by samples (s, a, r, s')]
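To make the loop concrete, here is a minimal sketch of approximate policy iteration on a small randomly generated MDP, using an LSTD-Q-style projection onto basis functions (the representation Q ≈ Φw introduced later in the talk); the MDP, the random features, and all parameter values are illustrative assumptions rather than anything from the paper.

```python
import numpy as np

# Minimal sketch of approximate policy iteration on a small random MDP.
rng = np.random.default_rng(0)
nS, nA, k, gamma = 20, 2, 5, 0.95

P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[s, a] is a distribution over s'
R = rng.normal(size=(nS, nA))                   # expected reward for (s, a)
Phi = rng.normal(size=(nS * nA, k))             # k basis functions over (s, a) pairs

def greedy(w):
    """Greedy policy with respect to the approximate Q-values Phi @ w."""
    return (Phi @ w).reshape(nS, nA).argmax(axis=1)

w = np.zeros(k)
for it in range(20):
    pi = greedy(w)
    # Policy evaluation (LSTD-Q style): solve Phi w = Phi^T-projection of R + gamma P^pi Phi w.
    Phi_next = np.zeros_like(Phi)
    for s in range(nS):
        for a in range(nA):
            # Expected next-state features under pi: sum_{s'} P(s'|s,a) phi(s', pi(s'))
            Phi_next[s * nA + a] = P[s, a] @ Phi[np.arange(nS) * nA + pi, :]
    A = Phi.T @ (Phi - gamma * Phi_next)
    b = Phi.T @ R.reshape(-1)
    w_new = np.linalg.solve(A, b)
    if np.allclose(w, w_new, atol=1e-8):
        break
    w = w_new                                    # policy improvement happens via greedy(w)
print("greedy policy:", greedy(w))
```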

Value function approximation A variety of linear and non-linear architectures have been widely studied because they offer many advantages in the context of value function approximation. However, many of them are hand-coded by a human designer in an ad hoc, trial-and-error process.

Value function approximation A finite MDP can be defined as a tuple (S, A, P, R). Any policy π defines a unique value function V^π, which satisfies the Bellman equation V^π = R^π + γ P^π V^π. We want to project the value function onto a lower-dimensional space spanned by a set of basis functions.
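For a fixed policy the Bellman equation is a linear system, so on a small MDP it can be solved exactly; a minimal NumPy illustration (the transition matrix and rewards below are made up for the example):

```python
import numpy as np

# Bellman equation for a fixed policy:  V = R + gamma * P V,
# hence V = (I - gamma * P)^{-1} R.  P and R here are illustrative.
gamma = 0.9
P = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.3, 0.7]])   # P[s, s'] under the fixed policy
R = np.array([0.0, 1.0, 5.0])     # expected one-step reward in each state

V = np.linalg.solve(np.eye(3) - gamma * P, R)
print(V)
```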

Value function approximation In the approximation Q^π ≈ Φ w^π, Φ is a |S||A| × k matrix, each column of which is a basis function evaluated at the (s, a) points, k is the number of basis functions selected, and w^π is a weight vector. The problem is how to efficiently and effectively construct those basis functions.
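Once a basis matrix Φ is chosen, fitting the weight vector is a least-squares problem. A small sketch, using low-order polynomial features as a stand-in for the graph-derived bases discussed in the following slides:

```python
import numpy as np

# Least-squares projection of a value function onto k basis functions.
n, k = 100, 5
s = np.linspace(0.0, 1.0, n)                 # normalized state index (illustrative)
Q = np.sin(3 * s) + 0.5 * s**2               # a target value function to approximate
Phi = np.vander(s, k, increasing=True)       # columns = 1, s, s^2, ..., s^{k-1}

w, *_ = np.linalg.lstsq(Phi, Q, rcond=None)  # weights minimizing ||Phi w - Q||_2
Q_hat = Phi @ w
print("max approximation error:", np.abs(Q - Q_hat).max())
```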

Laplacian eigenfunctions We model the state space as a finite undirected weighted graph G = (V, E, W). The combinatorial Laplacian is defined as L = D − W, where D is the diagonal matrix of vertex degrees. The normalized Laplacian is 𝓛 = D^{-1/2}(D − W)D^{-1/2}. We use the eigenfunctions of the Laplacian as the orthonormal basis.
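A minimal sketch of constructing this basis on a simple chain graph; the graph here is only an example, whereas in the paper the graph is built from observed state transitions:

```python
import numpy as np

# Chain graph on n states: W[i, j] = 1 if |i - j| == 1 (illustrative adjacency).
n = 50
W = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
d = W.sum(axis=1)
D = np.diag(d)

L = D - W                                            # combinatorial Laplacian
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_norm = D_inv_sqrt @ L @ D_inv_sqrt                 # normalized Laplacian

# Eigenvectors with the smallest eigenvalues are the smoothest basis functions.
eigvals, eigvecs = np.linalg.eigh(L_norm)
Phi = eigvecs[:, :8]                                 # first k = 8 Laplacian eigenfunctions
print(eigvals[:8])
```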

Diffusion wavelets Diffusion wavelets generalize wavelet analysis and the associated signal processing techniques to functions on manifolds and graphs. They allow fast and accurate computation of high powers of a Markov chain P on the graph, including direct computation of the Green's function (I − P)^{-1} of the Markov chain, for solving Bellman's equation.

Diffusion wavelets The Markov random walk on the graph is P = D^{-1}W. We symmetrize P and take powers: T = D^{1/2} P D^{-1/2} = I − 𝓛, so T^t = Σ_i (1 − λ_i)^t φ_i φ_i^T, where λ_i and φ_i are the eigenvalues and eigenfunctions of the normalized Laplacian.
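The spectral identity above can be checked numerically; a small sketch on a chain graph (again an illustrative example, not one of the paper's domains):

```python
import numpy as np

# Symmetrized random-walk operator on a small chain graph.
n = 50
W = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
d = W.sum(axis=1)
P = W / d[:, None]                                         # random walk P = D^{-1} W
T = np.diag(np.sqrt(d)) @ P @ np.diag(1.0 / np.sqrt(d))    # T = D^{1/2} P D^{-1/2}, symmetric

# T = I - L_norm, so T^t = sum_i (1 - lambda_i)^t phi_i phi_i^T,
# where (lambda_i, phi_i) are eigenpairs of the normalized Laplacian.
lam, phi = np.linalg.eigh(np.eye(n) - T)                   # eigenpairs of L_norm
t = 8
T_pow_spectral = phi @ np.diag((1.0 - lam) ** t) @ phi.T
T_pow_direct = np.linalg.matrix_power(T, t)
print(np.max(np.abs(T_pow_spectral - T_pow_direct)))       # ~0: the two agree
```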

Diffusion wavelets A diffusion wavelet tree consists of orthogonal diffusion scaling functions φ_j and orthogonal wavelets ψ_j. The scaling functions φ_j span a subspace V_j with the property V_{j+1} ⊆ V_j, and the span of the wavelets ψ_j, denoted W_j, is the orthogonal complement of V_{j+1} in V_j.

Diffusion wavelets

The detail subspaces [Figure: downsampling, orthogonalization, and operator compression via diffusion maps. Legend: X – the data set, A – the diffusion operator, G – Gram-Schmidt orthonormalization, M – A∘G]
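A very rough sketch of one level of this construction, using a pivoted, thresholded QR factorization in place of the paper's sparse rank-revealing Gram-Schmidt; the graph, the tolerance, and the power of T are all illustrative assumptions:

```python
import numpy as np
from scipy.linalg import qr

def one_level(T, eps=1e-2):
    """One level of a diffusion-wavelet-style compression (illustrative sketch).

    Columns orthonormalized to precision eps span V_1 (scaling functions); the
    remaining directions play the role of the detail (wavelet) subspace W_0;
    T^2 is then represented on the scaling-function basis.
    """
    Q, R, piv = qr(T, pivoting=True)                # pivoted QR as a rank-revealing step
    keep = np.abs(np.diag(R)) > eps * np.abs(R[0, 0])
    scaling = Q[:, keep]                            # basis for V_1
    detail = Q[:, ~keep]                            # complement of V_1 in V_0
    T2 = scaling.T @ (T @ T) @ scaling              # T^2 compressed onto V_1
    return scaling, detail, T2

# Example: one level applied to a power of the symmetrized walk operator
# of a small chain graph.
n = 30
W = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
d = W.sum(axis=1)
T = W / np.sqrt(np.outer(d, d))                     # D^{-1/2} W D^{-1/2}
scaling, detail, T2 = one_level(np.linalg.matrix_power(T, 8))
print(scaling.shape[1], "scaling functions,", detail.shape[1], "detail directions")
```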

Diffusion wavelets

Experimental results

Conclusions Two novel value function approximation methods are explored. The underlying representation and the policies are learned simultaneously. Diffusion wavelets are a powerful tool for signal processing of functions on manifolds and graphs.