Instabilities of SVD
Small singular values -> the pseudoinverse solution m+ is sensitive to small amounts of noise.
Small singular values may be indistinguishable from 0.
It is possible to remove small singular values to stabilize the solution -> truncated SVD (TSVD).
Condition number: cond(G) = s_1/s_k.

TSVD
Example: removing an instrument response.
g(t) = g_0 t exp(-t/T_0)   (t ≥ 0)
g(t) = 0                   (t < 0)
v(t) = ∫ g(t - τ) m_true(τ) dτ   (recorded acceleration)
Problem: deconvolve g(t) from v(t) to get m_true.
Discretized as d = Gm, with
G_{i,j} = (t_i - t_j) exp[-(t_i - t_j)/T_0] Δt   (t_i ≥ t_j)
G_{i,j} = 0                                      (t_i < t_j)

TSVD
Time window [-5, 100] s, Δt = 0.5 s -> G with m = n = 210.
Singular values range from 25.3 down to 0.017, cond(G) ≈ 1480; e.g., noise at the 1/1000 level already creates instability.
True signal: m_true(t) = exp[-(t - 8)²/(2σ²)] + exp[-(t - 25)²/(2σ²)]

TSVD
d_true = G m_true
m = V S^{-1} U^T d_true

TSVD
d_true = G m_true
m = V S^{-1} U^T (d_true + η),   η ~ N(0, (0.05 V)²)
The solution fits the data perfectly, but is worthless…

TSVD
d_true = G m_true
m = V_p S_p^{-1} U_p^T (d_true + η),   η ~ N(0, (0.05 V)²)
Solution for p = 26 (184 singular values removed!)
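As a rough illustration of the recipe on these slides, here is a minimal NumPy sketch that builds the discretized instrument-response matrix G, generates a two-pulse true model, adds noise, and compares the full SVD solution with a rank-p truncated solution. Only Δt = 0.5 s, the [-5, 100] s window, the 0.05 noise level, and p = 26 come from the slides; the time constant T0 and the pulse width sig are assumed values chosen for illustration.

```python
import numpy as np

# Time discretization from the slides: t in [-5, 100] s, dt = 0.5 s -> 210 samples
dt = 0.5
t = np.arange(-5.0, 100.0, dt)
n = len(t)

T0 = 10.0    # assumed instrument time constant (not given on the slides)
sig = 2.0    # assumed width of the Gaussian pulses in m_true (not given)

# Discretized forward operator: G[i, j] = (t_i - t_j) exp[-(t_i - t_j)/T0] dt for t_i >= t_j
dtij = t[:, None] - t[None, :]
G = np.where(dtij >= 0, dtij * np.exp(-dtij / T0) * dt, 0.0)

# True model: two Gaussian pulses centered at t = 8 s and t = 25 s
m_true = np.exp(-(t - 8.0) ** 2 / (2 * sig ** 2)) + np.exp(-(t - 25.0) ** 2 / (2 * sig ** 2))
d_true = G @ m_true

# Noisy data, eta ~ N(0, 0.05^2)
rng = np.random.default_rng(0)
d = d_true + rng.normal(0.0, 0.05, size=n)

# Full SVD solution m = V S^{-1} U^T d (unstable for noisy d)
U, s, Vt = np.linalg.svd(G)
print("cond(G) =", s[0] / s[-1])
m_full = Vt.T @ (U.T @ d / s)

# Truncated SVD solution keeping the p largest singular values
p = 26
m_tsvd = Vt[:p].T @ (U[:, :p].T @ d / s[:p])
```

With all singular values retained the reconstruction is dominated by amplified noise; keeping only the largest p discards the noise-amplifying components at the cost of some resolution, which is the trade-off the slides illustrate.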

Nonlinear Regression
Linear regression we now know how to handle (e.g., by least squares).
Now assume a nonlinear system of m equations in m unknowns, F(x) = 0 … what are we going to do?
We will try to find a sequence of vectors x_0, x_1, … that converges toward a solution x*.
Linearize (assuming F is continuously differentiable):
F(x_0 + Δx) ≈ F(x_0) + ∇F(x_0) Δx,   where ∇F(x_0) is the Jacobian.

Nonlinear Regression
Assume that the step Δx puts us at the (unknown) solution x*:
F(x_0 + Δx) ≈ F(x_0) + ∇F(x_0) Δx = F(x*) = 0
-F(x_0) ≈ ∇F(x_0) Δx
This is Newton's method: given F(x) = 0 and an initial solution x_0, generate a sequence of solutions x_1, x_2, … and stop if the sequence converges to a solution with F(x) = 0.
1. Solve -F(x_k) ≈ ∇F(x_k) Δx for Δx (e.g., using Gaussian elimination).
2. Let x_{k+1} = x_k + Δx.
3. Let k = k + 1.
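A minimal sketch of this iteration in NumPy, for a generic system with user-supplied F and Jacobian. The example 2x2 system, the tolerance, and the iteration limit are illustrative choices, not part of the slides.

```python
import numpy as np

def newton_system(F, J, x0, tol=1e-10, max_iter=50):
    """Newton's method for F(x) = 0 with Jacobian J(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(J(x), -F(x))   # solve J(x_k) dx = -F(x_k)
        x = x + dx                          # x_{k+1} = x_k + dx
        if np.linalg.norm(F(x)) < tol:
            break
    return x

# Illustrative system: x^2 + y^2 = 1, x - y = 0
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2*x[0], 2*x[1]], [1.0, -1.0]])
print(newton_system(F, J, x0=[1.0, 0.5]))   # converges to (1/sqrt(2), 1/sqrt(2))
```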

Properties of Newton’s Method
If x_0 is close enough to x*, F(x) is continuously differentiable in a neighborhood of x*, and ∇F(x*) is nonsingular, then Newton's method will converge to x*. The convergence is quadratic:
||x_{k+1} - x*||_2 ≤ c ||x_k - x*||_2²

Newton’s Method applied to a scalar function
Problem: minimize f(x).
If f(x) is twice continuously differentiable,
f(x_0 + Δx) ≈ f(x_0) + ∇f(x_0)^T Δx + (1/2) Δx^T ∇²f(x_0) Δx,
where ∇f(x_0) is the gradient and ∇²f(x_0) is the Hessian.

Newton’s Method applied to a scalar function
A necessary condition for x* to be a minimum of f(x) is that ∇f(x*) = 0. In the vicinity of x_0 we can approximate the gradient (neglecting higher-order terms) as
∇f(x_0 + Δx) ≈ ∇f(x_0) + ∇²f(x_0) Δx
Setting the gradient to zero (assuming x_0 + Δx puts us at x*) gives
-∇f(x_0) ≈ ∇²f(x_0) Δx,
which is Newton's method for minimizing f(x): given a twice differentiable function f(x) and an initial solution x_0, generate a sequence of solutions x_1, x_2, … and stop if the sequence converges to a solution with ∇f(x) = 0.
1. Solve -∇f(x_k) ≈ ∇²f(x_k) Δx for Δx.
2. Let x_{k+1} = x_k + Δx.
3. Let k = k + 1.
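The same loop as in the root-finding sketch above, but applied to ∇f(x) = 0; the quadratic test function is an illustrative assumption, not from the slides.

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    """Newton's method for minimizing f: solve H(x_k) dx = -grad(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        dx = np.linalg.solve(hess(x), -grad(x))
        x = x + dx
        if np.linalg.norm(grad(x)) < tol:
            break
    return x

# Illustrative example: f(x, y) = (x - 1)^2 + 10*(y + 2)^2
grad = lambda x: np.array([2*(x[0] - 1.0), 20*(x[1] + 2.0)])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 20.0]])
print(newton_minimize(grad, hess, x0=[0.0, 0.0]))   # -> approximately (1, -2)
```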

Newton’s Method applied to a scalar function
This is the same as solving the nonlinear system of equations ∇f(x) = 0, so: if f(x) is twice continuously differentiable in a neighborhood of x*, there is a constant L such that ||∇²f(x) - ∇²f(y)||_2 ≤ L ||x - y||_2 for every x, y in the neighborhood, ∇²f(x*) is positive definite, and x_0 is close enough to x*, then Newton's method will converge quadratically to x*.

Newton’s Method applied to LS
Newton's method is not directly applicable to most nonlinear regression and inverse problems (the number of model parameters does not equal the number of data points, and there is no exact solution to G(m) = d). Instead we will use Newton's method to minimize a nonlinear least-squares problem, e.g., fitting a vector m of n parameters to a data vector d:
f(m) = Σ_{i=1}^{m} [(G(m)_i - d_i)/σ_i]²
Let f_i(m) = (G(m)_i - d_i)/σ_i, i = 1, 2, …, m, and F(m) = [f_1(m) … f_m(m)]^T,
so that f(m) = Σ_{i=1}^{m} f_i(m)² and ∇f(m) = Σ_{i=1}^{m} ∇[f_i(m)²].

Newton’s Method applied to LS  f(m) j =∑ 2  f i (m) j F(m) j ]  f(m)=2J(m) T F(m), where J(m) is the Jacobian   f(m) j =∑    f i (m) 2 ] = ∑ H i (m), where H i (m) is the Hessian of f i (m) 2 H i j,k (m)= m i=i m i=i m i=i

Newton’s Method applied to LS   f(m)=2J(m) T J(m)+Q(m), where Q(m)=∑ f i (m)   f i (m) Gauss-Newton (GN) method ignores Q(m),   f(m)≈2J(m) T J(m), assuming f i (m) will be reasonably small as we approach m*. That is, NM: Solve -  f(x k ) ≈  2 f(x k )  x  f(m) j =∑ 2  f i (m) j F(m) j, i.e. J(m k ) T J(m k )  m=-J(m k ) T F(m k ) m i=i

Newton’s Method applied to LS
The Levenberg-Marquardt (LM) method uses
[J(m_k)^T J(m_k) + λI] Δm = -J(m_k)^T F(m_k)
λ -> 0: GN.
λ -> large: steepest descent (SD), moving down-gradient most rapidly. SD provides slow but certain convergence.
Which value of λ to use? Small values when GN is working well; switch to larger values in problem areas. Start with a small value of λ, then adjust.
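A minimal sketch of that λ-adjustment strategy, reusing the residual/jacobian conventions from the Gauss-Newton sketch above; the factor-of-10 update rule and the starting λ are common illustrative choices, not values prescribed by the slides.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, m0, lam=1e-3, n_iter=50):
    """LM: solve (J^T J + lam*I) dm = -J^T F, adjusting lam based on the step outcome."""
    m = np.asarray(m0, dtype=float)
    cost = lambda m: np.sum(residual(m) ** 2)
    for _ in range(n_iter):
        F, J = residual(m), jacobian(m)
        A = J.T @ J + lam * np.eye(len(m))
        dm = np.linalg.solve(A, -J.T @ F)
        if cost(m + dm) < cost(m):
            m = m + dm
            lam = max(lam / 10.0, 1e-12)   # step accepted: move toward Gauss-Newton
        else:
            lam = lam * 10.0               # step rejected: move toward steepest descent
    return m
```

Called with the same residual, jacobian, and m0 as the Gauss-Newton example, it should take nearly identical steps once λ has shrunk.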

Statistics of iterative methods
Cov(Ad) = A Cov(d) A^T   (d has a multivariate normal distribution)
Cov(m_L2) = (G^T G)^{-1} G^T Cov(d) G (G^T G)^{-1}
If Cov(d) = σ²I: Cov(m_L2) = σ² (G^T G)^{-1}
However, for nonlinear regression we do not have a linear relationship between the data and the estimated model parameters, so we cannot use these formulas. Instead, linearize about m*:
F(m* + Δm) ≈ F(m*) + J(m*) Δm
Cov(m*) ≈ (J(m*)^T J(m*))^{-1}
r_i = G(m*)_i - d_i
s = [Σ_{i=1}^{m} r_i²/(m - n)]^{1/2}
Cov(m*) = s² (J(m*)^T J(m*))^{-1}
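A short sketch of this covariance estimate, continuing from the Gauss-Newton example above; it assumes m_star is the converged parameter vector and that s² absorbs whatever data variance is not already built into the residual scaling.

```python
import numpy as np

def approximate_covariance(residual, jacobian, m_star):
    """Linearized covariance: Cov(m*) = s^2 (J^T J)^{-1}, with s^2 = sum(r_i^2)/(m - n)."""
    r = residual(m_star)        # residuals at the estimated parameters
    J = jacobian(m_star)
    m_obs, n_par = J.shape
    s2 = np.sum(r ** 2) / (m_obs - n_par)
    return s2 * np.linalg.inv(J.T @ J)

# Illustrative use (assumes approximate normality of the estimates):
# cov = approximate_covariance(residual, jacobian, m_star)
# ci95 = 1.96 * np.sqrt(np.diag(cov))
```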

Implementation Issues
1. Explicit (analytical) expressions for the derivatives.
2. Finite-difference approximation of the derivatives.
3. When to stop iterating?
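For point 2, a hedged sketch of a forward-difference Jacobian that could stand in for the analytical jacobian() in the earlier examples; the step size h is a typical choice, not a value from the slides.

```python
import numpy as np

def finite_difference_jacobian(residual, m, h=1e-7):
    """Forward-difference approximation of J[i, j] = dF_i/dm_j."""
    m = np.asarray(m, dtype=float)
    F0 = residual(m)
    J = np.empty((F0.size, m.size))
    for j in range(m.size):
        m_step = m.copy()
        m_step[j] += h             # perturb one parameter at a time
        J[:, j] = (residual(m_step) - F0) / h
    return J
```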