Zeroing in on the Implicit Function Theorem
The Implicit Function Theorem for several equations in several unknowns.

So where do we stand? Solving a system of m equations in n unknowns is equivalent to finding the "zeros" of a vector-valued function from ℝⁿ → ℝᵐ. When n > m, such a system will "typically" have infinitely many solutions. In "nice" cases, the solution will be a function from ℝⁿ⁻ᵐ → ℝᵐ.

So where do we stand? Solving linear systems is easy; we are interested in the non-linear case. We will ordinarily not be able to solve a system "globally." But under reasonable conditions, there will be a solution function in a (possibly small!) region around a single "known" solution.

[Figure: the contour f(x,y) = 0 in the xy-plane, with a small box around the point (a,b) in which the contour is the graph of y = g(x).]
For a Single Equation in Two Unknowns: In a small box around (a,b), we hope to find g(x). And we can, provided that the y-partials of f are continuous near (a,b) and non-zero at (a,b).

[Figure: the same contour line and the box around (a,b).]
Start with a point (a,b) on the contour line, where the partial with respect to y is not 0: D = ∂f/∂y(a,b) ≠ 0. Make the box around (a,b) small enough so that all of the y-partials in this box are "close" to D.

[Figure: the contour line, the box around (a,b), and a vertical line at a fixed x inside the box.]
Start with a point (a,b) on the contour line, where the partial with respect to y is not 0: D = ∂f/∂y(a,b) ≠ 0. Fix x. For this x, construct the function Φₓ(y) = y − f(x,y)/D. Iterate Φₓ(y). What happens?

A Whole Family of Quasi-Newton Functions. Remember that (a,b) is a "known" solution: f(a,b) = 0 and D = ∂f/∂y(a,b) ≠ 0. There are a whole bunch of these functions Φₓ(y) = y − f(x,y)/D: there is one for each x value.

A Whole Family of Quasi-Newton Functions. Remember that (a,b) is a "known" solution to f(x,y) = 0, and D = ∂f/∂y(a,b). If x = a, then we have Φₐ(y) = y − f(a,y)/D: the "best of all possible worlds" Leibniz method! If x ≠ a, then we have Φₓ(y) = y − f(x,y)/D: the "pretty good" quasi-Newton method!
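
The family Φₓ(y) = y − f(x,y)/D can be sketched in a few lines of code. A minimal illustration (the function f and the point (a,b) are my choices, not from the slides): take f(x,y) = x² + y² − 1 with known solution (a,b) = (0,1), so D = ∂f/∂y(0,1) = 2.

```python
# Quasi-Newton iteration for one equation f(x, y) = 0.
# Illustrative choice: f(x, y) = x^2 + y^2 - 1, known solution (a, b) = (0, 1),
# so D = f_y(0, 1) = 2.

def f(x, y):
    return x**2 + y**2 - 1

D = 2.0  # y-partial of f at the known solution (0, 1)

def solve_for_y(x, y0=1.0, steps=50):
    """Iterate Phi_x(y) = y - f(x, y)/D, starting from the known b."""
    y = y0
    for _ in range(steps):
        y = y - f(x, y) / D
    return y

x = 0.3
y = solve_for_y(x)
print(y)                      # close to sqrt(1 - 0.3^2) ≈ 0.9539
print(abs(f(x, y)) < 1e-12)   # the iterate really is a zero of f(x, .)
```

Note that D is computed once, at the known solution, and reused for every x; that frozen derivative is exactly what makes this "quasi"-Newton.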

What are the issues? We have to make sure the iterated maps converge. How do we do this? For the "pretty good" quasi-Newton's method, if we choose D near enough to f′(p) so that |Q′(p)| < ½, iterated maps will converge in a neighborhood of p. How does that work in our case? If we make sure that the partials of f with respect to y are all near enough to D to guarantee that |Φₓ′(y)| < ½ for all (x,y) in the square, then the iterated maps will converge.

The Role of the Derivative. If we have a function f: [a,b] → ℝ which is differentiable on (a,b) and continuous on [a,b], the Mean Value Theorem says that for any x and y in [a,b] there is some c between them such that f(x) − f(y) = f′(c)(x − y). If |f′(x)| < k < 1 for all x in [a,b], distances contract by a factor of k! Likewise, if |f′(x)| > k > 1 for all x in [a,b], distances expand by a factor of k!

The Role of the Derivative. If p is a fixed point of f in [a,b], and |f′(x)| < k < 1 for all x in [a,b], then |f(x) − f(p)| ≤ k|x − p|. But f(p) = p, so |f(x) − p| ≤ k|x − p|: f moves other points closer to p by a factor of k! Each time we re-apply f, the next iterate is even closer to p. (Attracting fixed point!) Likewise, if |f′(x)| > k > 1 for all x in [a,b], f moves other points farther and farther away from p. (Repelling fixed point!)
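
The attracting-fixed-point picture is easy to check numerically. An illustrative map (my choice, not from the slides): f(y) = cos(y), which has |f′(y)| = |sin(y)| < 1 near its fixed point p ≈ 0.739.

```python
import math

# Toy attracting fixed point: f(y) = cos(y) has |f'(y)| = |sin(y)| < 1
# near its fixed point p (the solution of cos(p) = p).
def f(y):
    return math.cos(y)

p = 0.7390851332151607   # fixed point of cos, cos(p) = p
y = 1.0
dists = []
for _ in range(30):
    y = f(y)
    dists.append(abs(y - p))

print(y)   # ≈ 0.739085...
# Each application moves the iterate strictly closer to p:
print(all(dists[i + 1] < dists[i] for i in range(20)))
```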

What are the issues? Not so obvious... we have to work a bit to make sure we get the right fixed point. (We don't leave the box!)
[Figure: the box around (a,b) with a vertical line at a fixed x; the iterates must stay inside the box.]

Systems of Equations and Differentiability. A vector-valued function f of several variables is differentiable at a vector p if in some small neighborhood of p, the graph of f "looks a lot like" an affine function. That is, there is a linear transformation Df(p) so that for all z "close" to p, f(z) ≈ f(p) + Df(p)(z − p), where Df(p) is the Jacobian matrix made up of all the partial derivatives of f. Suppose that f(p) = 0. When can we solve f(z) = 0 in some neighborhood of p?

Systems of Equations and Differentiability. For z "close" to p, f(z) ≈ f(p) + Df(p)(z − p). Since f(p) = 0, f(z) ≈ Df(p)(z − p). When can we solve f(z) = 0 in some neighborhood of p? Answer: Whenever we can solve the linear system Df(p)(z − p) = 0, because near p the existence of a solution depends on the geometry of the linearization.
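
The affine approximation can be sanity-checked numerically. The map f below is a made-up illustration, not from the slides:

```python
import numpy as np

# Check f(z) ≈ f(p) + Df(p)(z - p) for an illustrative map f: R^2 -> R^2.
def f(z):
    x, y = z
    return np.array([x**2 + y, np.sin(x) + y**3])

def Df(z):
    # Jacobian: all partial derivatives of f, arranged in a matrix.
    x, y = z
    return np.array([[2 * x, 1.0],
                     [np.cos(x), 3 * y**2]])

p = np.array([0.5, -0.2])
for eps in (1e-1, 1e-2, 1e-3):
    z = p + eps * np.array([1.0, 1.0])
    err = np.linalg.norm(f(z) - (f(p) + Df(p) @ (z - p)))
    print(err < 10 * eps**2)   # affine error shrinks like ||z - p||^2
```

The error dropping like ‖z − p‖² (not just ‖z − p‖) is exactly what "looks a lot like an affine function" means.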

When can we do it? Df(p)(z − p) = 0 is a linear system, and we understand linear systems extremely well. We can solve for the variables y₁, y₂, and y₃ in terms of x₁ and x₂ if and only if the sub-matrix of Df(p) holding the partial derivatives with respect to y₁, y₂, and y₃ is invertible.
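
A sketch of this solvability test with NumPy, using a hypothetical 3×5 Jacobian (the numbers are invented purely for illustration):

```python
import numpy as np

# Hypothetical Jacobian of f: R^5 -> R^3 at p, columns ordered (y1, y2, y3, x1, x2).
Df = np.array([[2.0, 1.0, 0.0, 1.0, 3.0],
               [0.0, 1.0, 1.0, 2.0, 0.0],
               [1.0, 0.0, 2.0, 0.0, 1.0]])

Dy = Df[:, :3]   # 3x3 sub-matrix of y-partials
Dx = Df[:, 3:]   # 3x2 sub-matrix of x-partials

print(abs(np.linalg.det(Dy)) > 1e-12)   # invertible -> we can solve for the y's

# Linearized system: Dy @ dy + Dx @ dx = 0  =>  dy = -Dy^{-1} (Dx @ dx)
dx = np.array([0.1, -0.2])
dy = -np.linalg.solve(Dy, Dx @ dx)
print(np.allclose(Dy @ dy + Dx @ dx, 0.0))
```

Any choice of dx then determines dy uniquely, which is the linear-algebra shadow of "y is a function of x."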

A bit of notation. To simplify things, we will write our vector-valued function F : ℝⁿ⁺ᵐ → ℝⁿ. We will write our "input" variables as concatenations of n-vectors and m-vectors, e.g. (y,x) = (y₁, y₂,..., yₙ, x₁, x₂,..., xₘ). So when we solve F(y,x) = 0 we will be solving for the y-variables in terms of the x-variables.

The System. We can solve for the variables y₁, y₂,..., yₙ in terms of x₁, x₂,..., xₘ if and only if the n × n sub-matrix of the Jacobian holding the y-partials, (∂Fᵢ/∂yⱼ) for i, j = 1,..., n, is invertible.

So the Condition We Need is the invertibility of the n × n matrix D = (∂Fᵢ/∂yⱼ(b,a)), i, j = 1,..., n. We will refer to the inverse of the matrix D as D⁻¹.

The Implicit Function Theorem. Suppose F : ℝⁿ⁺ᵐ → ℝⁿ has continuous partials, and suppose b ∈ ℝⁿ and a ∈ ℝᵐ with F(b,a) = 0. If the n × n matrix that corresponds to the y-partials of F at (b,a) (denoted by D) is invertible, then "near" a there exists a unique function g(x) such that F(g(x),x) = 0; moreover, g(x) is continuous.

What Function Do We Iterate? The 2-variable case: Fix x. For this x, construct the function Φₓ(y) = y − f(x,y)/D, where D = ∂f/∂y(a,b). Iterate Φₓ(y).

What Function Do We Iterate? The 2-variable case: Fix x. For this x, construct the function Φₓ(y) = y − f(x,y)/D and iterate Φₓ(y). The multi-variable parallel, the n-variable case: Fix x. For this x, construct the function Φₓ(y) = y − D⁻¹ F(y,x) and iterate Φₓ(y).
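
A sketch of the n-variable iteration, with an illustrative system of my own choosing (2 equations, y ∈ ℝ², x ∈ ℝ, known solution b at x = a = 1):

```python
import numpy as np

# Illustrative system F(y, x) = 0:
#   F1 = y1^2 + y2 - x^2,   F2 = y1 + y2^2 - x.
# At x = a = 1, the point b = (t, t) with t = (sqrt(5) - 1)/2 satisfies
# t^2 + t = 1, so F(b, a) = 0.
def F(y, x):
    return np.array([y[0]**2 + y[1] - x**2,
                     y[0] + y[1]**2 - x])

t = (np.sqrt(5) - 1) / 2
b, a = np.array([t, t]), 1.0

# D = n x n matrix of y-partials of F at the known solution (b, a)
D = np.array([[2 * t, 1.0],
              [1.0, 2 * t]])
D_inv = np.linalg.inv(D)

def g(x, steps=100):
    """Iterate Phi_x(y) = y - D^{-1} F(y, x), starting from the known b."""
    y = b.copy()
    for _ in range(steps):
        y = y - D_inv @ F(y, x)
    return y

x = 1.02                                  # stay near a = 1
y = g(x)
print(np.max(np.abs(F(y, x))) < 1e-12)    # g(x) solves F(g(x), x) = 0
```

As in the 2-variable case, D is frozen at the known solution; the iteration still converges because for x near a the y-partials in the box stay close to D.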

[Figure: the ball Bᵣ(b,a) of radius r around (b,a) in ℝⁿ⁺ᵐ; beneath it, a ball of radius t around a in ℝᵐ is mapped by x ↦ g(x) into the ball. We want g continuous & unique with F(g(x),x) = 0, and the partials of F in the ball all close to the partials at (b,a).]

[Figure: the ball Bᵣ(b,a) of radius r around (b,a) in ℝⁿ⁺ᵐ.]
Notation: Let dF(y,x) denote the n × n submatrix made up of all the y-partial derivatives of F at (y,x). Step I: Since D is invertible, ‖D⁻¹‖ ≠ 0. We choose r as follows:

We choose r as follows: use the continuity of the partials of F to choose r small enough so that for all (y,x) ∈ Bᵣ(b,a), ‖dF(y,x) − D‖ < 1/(2‖D⁻¹‖). Then ‖I − D⁻¹ dF(y,x)‖ ≤ ‖D⁻¹‖ ‖D − dF(y,x)‖ < ½. So, by the multivariable Mean Value Theorem, Φₓ contracts distances on the ball: ‖Φₓ(y₁) − Φₓ(y₂)‖ ≤ ½ ‖y₁ − y₂‖!
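
The Step I estimate can be checked numerically with random matrices (a purely illustrative sanity check, not part of the proof):

```python
import numpy as np

# If ||dF - D|| < 1/(2 ||D^{-1}||), then
#   ||I - D^{-1} dF|| <= ||D^{-1}|| ||D - dF|| < 1/2.
# Operator 2-norms throughout; D and the perturbations are random.
rng = np.random.default_rng(0)
n = 4
D = rng.normal(size=(n, n)) + n * np.eye(n)   # comfortably invertible
D_inv = np.linalg.inv(D)
bound = 1.0 / (2.0 * np.linalg.norm(D_inv, 2))

ok = True
for _ in range(100):
    E = rng.normal(size=(n, n))
    E *= 0.9 * bound / np.linalg.norm(E, 2)   # force ||E|| < 1/(2 ||D^{-1}||)
    dF = D + E
    lhs = np.linalg.norm(np.eye(n) - D_inv @ dF, 2)
    ok &= lhs <= np.linalg.norm(D_inv, 2) * np.linalg.norm(D - dF, 2) + 1e-12
    ok &= lhs < 0.5
print(ok)
```

The inequality is just submultiplicativity of the operator norm applied to I − D⁻¹ dF = D⁻¹(D − dF), which is why it holds for every sample.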

There are two (highly non-trivial) steps in the proof: In Step 1 we choose the radius r of the ball in ℝⁿ⁺ᵐ so that the partials in the ball are all "close" to the (known) partials at (b,a). This makes Φₓ contract distances on the ball, forcing convergence to a fixed point. (Uses continuity of the partial derivatives.) In Step 2 we choose the radius t around a "small" so as to guarantee that our iterates stay in the ball. In other words, our "initial guess" for the quasi-Newton's method is good enough to guarantee convergence to the "correct" root. The two together guarantee that we can find a value g(x) which solves the equation. We then map x to g(x).

Since this is the central idea of the proof, I will reiterate it: In Step 1 we make sure that iterated maps on our "pretty good" quasi-Newton function actually converge. In Step 2, by making sure that they didn't "leave the box," we made sure the iterated maps converged to the fixed point we were aiming for; that is, that they didn't march off to some other fixed point outside the region.

Final thoughts. The Implicit Function Theorem is a DEEP theorem of Mathematics. Fixed point methods show up in many other contexts. For instance: they underlie many numerical approximation techniques, and they are used in other theoretical contexts, such as in Picard's iteration proof of the fundamental theorem of differential equations. The iteration functions frequently take the form of a "quasi-Newton's method."
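
As a closing illustration of Picard's iteration as a fixed-point method (my own grid-based sketch, not from the slides): for y′ = y, y(0) = 1, iterate y_{k+1}(x) = 1 + ∫₀ˣ y_k(t) dt, whose fixed point is eˣ.

```python
import numpy as np

# Picard iteration for y' = y, y(0) = 1, as a fixed-point method on a grid.
# The integral is done with a cumulative trapezoid rule.
xs = np.linspace(0.0, 1.0, 1001)
h = xs[1] - xs[0]

def picard_step(y):
    # cumulative trapezoid rule for integral_0^x y(t) dt
    integral = np.concatenate(([0.0], np.cumsum((y[1:] + y[:-1]) * h / 2)))
    return 1.0 + integral

y = np.ones_like(xs)          # y_0(x) = 1
for _ in range(30):
    y = picard_step(y)

print(np.max(np.abs(y - np.exp(xs))) < 1e-6)   # converged to e^x (up to grid error)
```

Each Picard step is an application of a contracting map on a space of functions; the convergence mechanism is the same attracting-fixed-point picture used throughout these slides, just with functions in place of numbers.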