
A statistical learning approach to subspace identification of dynamical systems
Tijl De Bie, John Shawe-Taylor
ECS, ISIS, University of Southampton

Warning! work in progress…

Overview
- Dynamical systems and the state space representation
- Subspace identification for linear systems
- Regularized subspace identification
- Kernel version for nonlinear dynamical systems
- Preliminary experiments

Dynamical systems
A dynamical system:
- accepts inputs $u_t \in \mathbb{R}^m$,
- generates outputs $y_t \in \mathbb{R}^l$,
- is affected by noise.
E.g.: $y_t = (a_1 y_{t-1} + a_2 y_{t-2}) + (b_0 u_t + b_1 u_{t-1}) + n_t$
[Diagram: inputs $u_0, u_1, \dots, u_t, \dots$ enter the dynamical system, which produces outputs $y_0, y_1, \dots, y_t, \dots$]

Dynamical systems
Practical examples:
- Car: inputs = position of the steering wheel, force applied to the wheels, clutch position, road conditions, ...; outputs = position of the car
- Chemical reactor: inputs = inflow of reactants, heat added, ...; outputs = temperature, pressure, ...
- Others: bridges, the vocal tract, electrical systems, ...

State space representation
- The next position of the car (output) depends on the current inputs, and on the current position and speed.
- Temperature and pressure (outputs) depend on the current inputs, and on the current composition of the mixture, temperature and pressure.
Summarize the total effect of past inputs in the state $x_t \in \mathbb{R}^n$ of the system: the speed of the car / the composition of the mixture.
This leads to the state space representation.

State space representation
State space representation (SSR): memory, stored in the state vector $x_t$, summarizes the past inputs $u_0, u_1, \dots, u_t, \dots$ and outputs $y_0, y_1, \dots, y_t, \dots$
State update equation: $x_{t+1} = f_{\text{state}}(x_t, u_t, w_t)$
Output equation: $y_t = f_{\text{output}}(x_t, u_t, v_t)$

State space representation
Linear state space model:
State update equation: $x_{t+1} = A x_t + B u_t + w_t$
Output equation: $y_t = C x_t + D u_t + v_t$
Interpretation: the states are latent variables with a Markov dependency.
Note: even simple systems like the car are nonlinear, but a linear model is often a good approximation around a working point.
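To make the linear SSR concrete, here is a minimal simulation sketch; the function name, the particular system matrices, and all conventions are illustrative, not from the talk:

```python
import numpy as np

def simulate_lss(A, B, C, D, U, x0, noise_std=0.0, seed=0):
    """Simulate x_{t+1} = A x_t + B u_t + w_t, y_t = C x_t + D u_t + v_t
    with i.i.d. Gaussian noise of standard deviation noise_std."""
    rng = np.random.default_rng(seed)
    T, n, l = U.shape[0], A.shape[0], C.shape[0]
    X, Y = np.zeros((T, n)), np.zeros((T, l))
    x = x0
    for t in range(T):
        X[t] = x
        Y[t] = C @ x + D @ U[t] + noise_std * rng.standard_normal(l)
        x = A @ x + B @ U[t] + noise_std * rng.standard_normal(n)
    return X, Y

# Example: a stable 2-state, single-input, single-output system (made up).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
U = np.random.default_rng(1).standard_normal((200, 1))
X, Y = simulate_lss(A, B, C, D, U, x0=np.zeros(2), noise_std=0.1)
```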

State space representation
Advantages of the SSR:
- Intuitive, often close to 'first principles'.
- A state observer (such as the Kalman filter) allows one to estimate the states based on inputs and outputs, and hence to do optimal control based on these Kalman states. This holds if the system (i.e. $f_{\text{state}}$ and $f_{\text{output}}$) is known.
- If the system is not specified: algorithms for system identification exist, often identifying an SSR.

Overview
- Dynamical systems and the state space representation
- Subspace identification for linear systems
- Regularized subspace identification
- Kernel version for nonlinear systems
- Preliminary experiments

Subspace identification for linear systems
System identification:
- Given: input data $u_t$ and output data $y_t$; noise unknown but i.i.d. (plus other technical conditions).
- Determine: the system parameters, i.e. the system matrices.
Classical system identification focuses on asymptotic unbiasedness (sometimes consistency) of the estimators, which is fine for large samples. Regularization issues are at most a side note (some 10 pages in the 600-page reference work by Ljung). Explanation: datasets are often relatively low dimensional and large.

Subspace identification for linear systems
Fact: if an estimate for the state sequence is known, determining the system matrices is a (least-squares) regression problem:
$$\begin{bmatrix} x_{t+1} \\ y_t \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix} + \begin{bmatrix} w_t \\ v_t \end{bmatrix}$$
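A minimal sketch of this regression step (function name and data conventions are mine, not from the talk), stacking the targets $[x_{t+1}; y_t]$ against the regressors $[x_t; u_t]$:

```python
import numpy as np

def fit_system_matrices(X, U, Y):
    """Least-squares fit of [A B; C D] from estimated states X (T x n),
    inputs U (T x m), and outputs Y (T x l):
        [x_{t+1}; y_t] ~= [A B; C D] [x_t; u_t]."""
    Z = np.hstack([X[:-1], U[:-1]])   # regressors [x_t, u_t], (T-1) x (n+m)
    W = np.hstack([X[1:], Y[:-1]])    # targets [x_{t+1}, y_t], (T-1) x (n+l)
    Theta, *_ = np.linalg.lstsq(Z, W, rcond=None)
    n, m = X.shape[1], U.shape[1]
    A, B = Theta[:n, :n].T, Theta[n:, :n].T
    C, D = Theta[:n, n:].T, Theta[n:, n:].T
    return A, B, C, D
```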

Subspace identification for linear systems Estimate state sequence first ! Two observations: We can estimate state based on past inputs and past outputs (and of initial state x0) So, for any vector there are vectors and such that: PASCAL workshop - Bohinj

Subspace identification for linear systems
2. Future outputs depend on the current state and on future inputs. Hence, any linear functional of the future outputs can be written as a linear combination of the current state and of future inputs.

Subspace identification for linear systems
Notation: collect the 'past inputs' as columns of a block-Hankel matrix, time shifted, and likewise the 'future inputs'; similarly for past and future outputs. Collect the state sequence $x_t, x_{t+1}, \dots$ as columns of a state matrix (see the sketch below).
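One common way to build these time-shifted blocks is the following; the function name and conventions are mine, not from the talk:

```python
import numpy as np

def block_hankel(S, k, T):
    """Stack k time-shifted copies of a signal S (N x d, with N >= k + T - 1)
    as columns: column j is [s_j; s_{j+1}; ...; s_{j+k-1}]."""
    d = S.shape[1]
    H = np.zeros((k * d, T))
    for j in range(T):
        H[:, j] = S[j:j + k].ravel()
    return H

# 'Past' and 'future' block-Hankel matrices from an input signal U (N x m):
#   Up = block_hankel(U, k, T); Uf = block_hankel(U[k:], k, T)
# and likewise Yp, Yf from the output signal Y.
```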

Subspace identification for linear systems
Combining the two observations yields an optimization problem that is strongly related to CVA (canonical variate analysis), a well-known subspace identification method.
Solution: a generalized eigenvalue problem (like CCA), with as many significantly nonzero eigenvalues as the dimensionality of the state space. The state sequence is then reconstructed from the leading canonical variates.
However: generalization problems arise with a high-dimensional input space and small sample sizes.
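A minimal sketch of a CCA-style generalized eigenvalue computation; the interface and the ridge term `reg` are illustrative assumptions, since the talk's exact objective is not recoverable from this transcript:

```python
import numpy as np
from scipy.linalg import eigh

def cca(P, F, reg=1e-6):
    """Canonical correlations between the row spaces of P (dp x T) and
    F (df x T), solved as a generalized eigenvalue problem; the small
    ridge term reg keeps the right-hand-side matrix positive definite."""
    dp, df = P.shape[0], F.shape[0]
    Cpp = P @ P.T / P.shape[1] + reg * np.eye(dp)
    Cff = F @ F.T / F.shape[1] + reg * np.eye(df)
    Cpf = P @ F.T / P.shape[1]
    A = np.zeros((dp + df, dp + df))
    A[:dp, dp:] = Cpf
    A[dp:, :dp] = Cpf.T
    B = np.zeros((dp + df, dp + df))
    B[:dp, :dp] = Cpp
    B[dp:, dp:] = Cff
    vals, vecs = eigh(A, B)            # correlations come in +/- pairs
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]
```

The number of significantly nonzero canonical correlations then serves as an estimate of the state dimension $n$.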

Overview
- Dynamical systems and the state space representation
- Subspace identification for linear systems
- Regularized subspace identification
- Kernel version for nonlinear systems
- Preliminary experiments

Regularized subspace identification
Introduce regularization into the CCA-like objective.
Solution: again a generalized eigenvalue problem.

Regularized subspace identification
Notes on the regularized problem:
- Different regularization is possible: if the output space is high dimensional, regularization can be applied on the output-side weights as well.
- Different constraints are possible (e.g. if one constraint term is omitted and no regularization is used, we recover exactly CVA).
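In terms of the illustrative `cca` sketch above, this regularization corresponds to the ridge term `reg` added to the empirical covariance blocks; an output-side variant would simply use separate `reg` values for the two blocks.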

Overview
- Dynamical systems and the state space representation
- Subspace identification for linear systems
- Regularized subspace identification
- Kernel version for nonlinear systems
- Preliminary experiments

Kernel version for nonlinear systems
Hammerstein models (input nonlinearity):
Examples: in the car example, the relations between forces and the angle of the steering wheel involve sine functions; also, saturation in actuators corresponds to ('soft') sign functions.
Using a feature map $\phi$ on the inputs, the state update becomes $x_{t+1} = A x_t + B \phi(u_t) + w_t$ (with $B$ having potentially infinitely many columns).
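A toy illustration of the Hammerstein structure, with a made-up explicit feature map standing in for $\phi$ (in the kernel version, $\phi$ stays implicit):

```python
import numpy as np

def phi(u):
    """Made-up explicit input feature map; a kernel would keep this implicit."""
    return np.array([u, np.sin(u), np.tanh(5 * u)])  # linear, periodic, saturation

def hammerstein_step(x, u, A, B_phi):
    """One step of x_{t+1} = A x_t + B_phi phi(u_t): nonlinearity on the input only."""
    return A @ x + B_phi @ phi(u)

A = np.array([[0.9, 0.1], [0.0, 0.8]])
B_phi = np.array([[0.5, 1.0, 0.3], [0.2, 0.0, 0.7]])
x_next = hammerstein_step(np.zeros(2), 0.4, A, B_phi)
```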

Kernel version for nonlinear systems
Use the representer theorem: $B \phi(u) = \alpha^\top \mathbf{k}(u)$, where $\mathbf{k}(u)$ is the vector containing the kernel evaluations $k(u_i, u)$ (with $u_i$ the training samples) and $\alpha$ contains the dual variables.

Kernel version for nonlinear systems
This yields a kernel version of the subspace identification algorithm; regularization is absolutely necessary here.
Solution: a similar generalized eigenvalue problem (cf. kernel CCA).
Extensions (further work): output nonlinearity (Wiener-Hammerstein systems); this requires the inverse of the feature map.
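A minimal regularized kernel-CCA sketch in the spirit of this slide; the kernel choice, centering, and regularization scheme are illustrative assumptions, not the talk's exact formulation:

```python
import numpy as np
from scipy.linalg import eigh

def rbf_gram(Z, gamma=1.0):
    """RBF Gram matrix between the rows of Z (T x d)."""
    sq = np.sum(Z**2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * Z @ Z.T))

def kernel_cca(Kp, Kf, reg=1e-1):
    """Regularized kernel CCA between two views given Gram matrices (T x T).
    Without regularization, kernel CCA finds spurious perfect correlations."""
    T = Kp.shape[0]
    J = np.eye(T) - np.ones((T, T)) / T      # centering in feature space
    Kp, Kf = J @ Kp @ J, J @ Kf @ J
    A = np.zeros((2 * T, 2 * T))
    A[:T, T:] = Kp @ Kf
    A[T:, :T] = Kf @ Kp
    Rp, Rf = Kp + reg * np.eye(T), Kf + reg * np.eye(T)
    B = np.zeros((2 * T, 2 * T))
    B[:T, :T] = Rp @ Rp
    B[T:, T:] = Rf @ Rf
    vals, vecs = eigh(A, B)                  # generalized eigenvalue problem
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:T, order], vecs[T:, order]
```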

Overview
- Dynamical systems and the state space representation
- Subspace identification for linear systems
- Regularized subspace identification
- Kernel version for nonlinear systems
- Preliminary experiments

Preliminary experiments
Experiments with a test system (200 samples): random Gaussian inputs with standard deviation 1, Gaussian noise.
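The specific system equations shown on this slide did not survive the transcript, so the following sketch only reconstructs the described setup (200 samples, unit-variance Gaussian inputs, Gaussian noise) with a made-up stable system; it assumes the `simulate_lss` sketch from earlier is in scope:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.8, 0.3], [-0.2, 0.7]])   # made-up stable system (|eig| < 1)
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, 1.0]])
D = np.array([[0.0]])
U = rng.standard_normal((200, 1))         # random Gaussian inputs, std = 1
X, Y = simulate_lss(A, B, C, D, U, x0=np.zeros(2), noise_std=0.1)
```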

Further work
Conclusions:
- Regularization ideas imported into system identification for large dimensionalities / small samples.
- The kernel trick allows for identification of Hammerstein nonlinear systems.
Further work:
- Motivating our type of regularization with learning-theory bounds.
- Wiener-Hammerstein models (also output nonlinearity).
- Extending notions like controllability to such nonlinear models.
- Designing Kalman filters and controllers based on this kind of Hammerstein system.

Thanks! Questions?