More Mathematics: Finding Minimum
Numerical Optimization
Find the minimum of a function f(x). If the function is continuous and differentiable, find a root of its derivative, f'(x) = 0. How could we do this by using loops? =)
Why do we need numerical optimization?
The derivative of a function is not always available. The bisection algorithm cannot be used to find a minimum of a function. Root finding for a multivariate function is usually of little interest because, in general, the set of roots is not a single point but a hypersurface. Many real-world problems reduce to optimization.
Line (or linear) search
At the minimum, f(x) <= f(x - h) and f(x) <= f(x + h). Find any x that satisfies this with a sufficiently small h. The precision is limited by the chosen h.
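A minimal sketch of such a search with a plain loop, assuming the test "f(x) is no larger than f(x - h) and f(x + h)" as the stopping condition. The function name `line_search_min`, the interval, and the example function are illustrative choices, not taken from the slides.

```python
def line_search_min(f, lo, hi, h):
    """Scan [lo, hi] in steps of h and return the first x whose value
    is no larger than both neighbors, f(x - h) and f(x + h)."""
    n_steps = int(round((hi - lo) / h))
    for i in range(1, n_steps):
        x = lo + i * h
        if f(x) <= f(x - h) and f(x) <= f(x + h):
            return x
    return None  # no interior local minimum found at this step size


def example(x):
    # Hypothetical test function with its minimum at x = 2.
    return (x - 2.0) ** 2 + 1.0


x_min = line_search_min(example, 0.0, 5.0, 0.01)
print(x_min)
```

The returned x can be off by up to about h, which is the precision limit mentioned above. A smaller h improves the precision at the cost of more loop iterations.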
Application of minimization
The definition of an average: a number that minimizes the sum of squared deviations.
Application of minimization Example
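One way to check this definition numerically: write F(x) as the sum of (x_i - x)^2 over the data, scan x on a fine grid, and compare the best x with the arithmetic mean. Analytically, setting dF/dx = -2 * sum(x_i - x) = 0 gives x = (1/n) * sum(x_i). The data values and helper names below are made up for illustration.

```python
from statistics import mean


def sum_sq_dev(x, data):
    """Sum of squared deviations of the data from x."""
    return sum((xi - x) ** 2 for xi in data)


def argmin_on_grid(f, lo, hi, h):
    """Return the grid point in [lo, hi] (spacing h) with the smallest f."""
    best_x, best_val = lo, f(lo)
    n_steps = int(round((hi - lo) / h))
    for i in range(1, n_steps + 1):
        x = lo + i * h
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x


data = [2.0, 3.0, 5.0, 7.0, 11.0]  # hypothetical sample
x_best = argmin_on_grid(lambda x: sum_sq_dev(x, data), min(data), max(data), 0.001)
print(x_best, mean(data))
```

The two printed numbers should agree to within the grid spacing.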
Application of minimization
What do you think minimizes the sum of absolute deviations?
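One way to explore this question is to repeat the grid scan with absolute values in place of squares. The sketch below uses made-up data with an odd number of points (so the minimizer is unique) and compares the result with the median; the helper names are illustrative.

```python
from statistics import median


def sum_abs_dev(x, data):
    """Sum of absolute deviations of the data from x."""
    return sum(abs(xi - x) for xi in data)


def argmin_on_grid(f, lo, hi, h):
    """Return the grid point in [lo, hi] (spacing h) with the smallest f."""
    best_x, best_val = lo, f(lo)
    n_steps = int(round((hi - lo) / h))
    for i in range(1, n_steps + 1):
        x = lo + i * h
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x


data = [2.0, 3.0, 5.0, 7.0, 11.0]  # hypothetical sample
x_best = argmin_on_grid(lambda x: sum_abs_dev(x, data), min(data), max(data), 0.001)
print(x_best, median(data))
```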
Minimization of multivariate functions
A line search can find a minimum only of a univariate (one-dimensional) function. What if we have two or more variables? Examples: finding the average of a set of points, automatic face tracking, and linear regression.
Linear Regression
Sample data with two columns: Hours Spent Studying and Math SAT Score.
Linear Regression
Scatter plot. X-axis: hours spent studying math. Y-axis: the math SAT score.
Linear Regression
The goal is to find the line that best describes the trend of the given data.
Linear Regression
Let's take a mathematical approach. The equation of a line: y = a x + b. The slope is a, and the intercept is b. The unknowns are a and b, not x and y. The line is the set of points (x, y) that satisfy y = a x + b.
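Line Equation between Two Points
The line equation between two points (x_1, y_1) and (x_2, y_2): y - y_1 = ((y_2 - y_1) / (x_2 - x_1)) (x - x_1). Why? The two points must satisfy the system of linear equations: y_1 = a x_1 + b and y_2 = a x_2 + b.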
Line Equation between Two Points
These two equations can be rearranged into y_2 - y_1 = a (x_2 - x_1). The slope is computed as a = (y_2 - y_1) / (x_2 - x_1). Substituting a into either equation gives b = y_1 - a x_1.
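The same computation as a short sketch: the slope is a = (y_2 - y_1) / (x_2 - x_1) and the intercept follows from b = y_1 - a x_1. The function name `line_through` and the two sample points are illustrative.

```python
def line_through(p1, p2):
    """Return slope a and intercept b of the line y = a*x + b through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    a = (y2 - y1) / (x2 - x1)  # slope from the difference of the two equations
    b = y1 - a * x1            # substitute a back into the first equation
    return a, b


a, b = line_through((1.0, 3.0), (3.0, 7.0))
print(a, b)
```

Note the guard for x_1 = x_2: a vertical line cannot be written as y = a x + b.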
Is This Enough?
The qualitative answer is yes, but the quantitative answer is no. Qualitatively, the line shows a trend similar to the given data. Quantitatively, we should first ask: what is the measure of accuracy (or goodness)?
Accuracy of the line graph
How can we measure the accuracy of the line graph? In the 1-dimensional case, we computed the deviation. Recall the following slide.
Application of minimization
Example. Look closely at the term (x_i - x)^2. It is the square of the distance between x_i and x.
Accuracy of the line graph
We can add up the distance between each data point and the line we are seeking, using a for loop. The distance between a point (x_i, y_i) and the line y = a x + b is given as d_i = |a x_i - y_i + b| / sqrt(a^2 + 1). Why?
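A sketch of the point-to-line distance, assuming the perpendicular distance is meant. Writing the line as a x - y + b = 0, its normal vector is (a, -1), so projecting onto the unit normal gives d = |a x_0 - y_0 + b| / sqrt(a^2 + 1). The function names and sample values are illustrative.

```python
import math


def point_line_distance(a, b, x0, y0):
    """Perpendicular distance from (x0, y0) to the line y = a*x + b."""
    return abs(a * x0 - y0 + b) / math.sqrt(a * a + 1.0)


def total_sq_distance(a, b, points):
    """Sum of squared distances from each point to the line, with a for loop."""
    total = 0.0
    for x0, y0 in points:
        total += point_line_distance(a, b, x0, y0) ** 2
    return total


d = point_line_distance(1.0, 0.0, 0.0, 1.0)  # point (0, 1) and the line y = x
print(d)
```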
Accuracy of the line graph
Just as we computed the average by numerical minimization, let us compute the best slope and intercept: the ones that minimize the total distance between the data points and the line.
Namely, the best data line
How do we compute the minimum of F(a,b) with respect to a and b? What is the derivative of F(a,b)? There should be two derivative functions: one with respect to a, and another with respect to b.
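When the two derivative functions are tedious to derive by hand, they can be approximated with central differences. The sketch below assumes F(a,b) is the sum of squared perpendicular distances from the points to the line y = a x + b; the data points are made up and lie exactly on y = 2x + 1, so both partial derivatives should vanish at (a, b) = (2, 1).

```python
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # hypothetical data


def F(a, b):
    """Sum of squared perpendicular distances to the line y = a*x + b."""
    return sum((a * x - y + b) ** 2 for x, y in points) / (a * a + 1.0)


def dF_da(a, b, h=1e-5):
    """Central-difference approximation of the partial derivative in a."""
    return (F(a + h, b) - F(a - h, b)) / (2.0 * h)


def dF_db(a, b, h=1e-5):
    """Central-difference approximation of the partial derivative in b."""
    return (F(a, b + h) - F(a, b - h)) / (2.0 * h)


print(dF_da(2.0, 1.0), dF_db(2.0, 1.0))  # both near zero at the minimum
print(dF_da(0.0, 0.0), dF_db(0.0, 0.0))  # nonzero away from the minimum
```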
SciPy Helps!
The scipy package provides functions for minimization (or optimization) in scipy.optimize. Please refer to the documentation. Some of these functions do not use the derivative, fprime: fmin(func, x0) and fmin_powell(func, x0). Most optimizations can be made more efficient by using derivatives.
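A minimal sketch of calling both derivative-free functions, assuming an objective F that sums the squared perpendicular distances between the data and the line y = a x + b. The data below are synthetic (points exactly on y = 2x + 1), not the SAT data from the slides, and the starting point x0 is arbitrary.

```python
import numpy as np
from scipy.optimize import fmin, fmin_powell

# Synthetic data lying exactly on y = 2x + 1 (an assumption for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0


def F(params):
    """Sum of squared perpendicular distances to the line y = a*x + b."""
    a, b = params
    d = (a * x - y + b) / np.sqrt(a * a + 1.0)
    return float(np.sum(d ** 2))


x0 = [1.0, 0.0]  # initial guess for (a, b)
best_nm = fmin(F, x0, disp=False)         # Nelder-Mead simplex
best_pw = fmin_powell(F, x0, disp=False)  # Powell's method
print(best_nm, best_pw)
```

Both calls take only the function and a starting point. Neither needs fprime.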
Demo for scipy.optimize
Understanding the landscape
The space of a and b. Which plot do you think is most appropriate or helpful? Let us use the contourf() function.
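A sketch of how the landscape over (a, b) could be evaluated on a grid and passed to contourf(). It assumes the objective is the sum of squared perpendicular distances and uses synthetic data on y = 2x + 1. The grid ranges and names are illustrative, and the plotting function is defined but not called, so the numerical part runs without a display.

```python
import numpy as np

# Synthetic data lying exactly on y = 2x + 1 (an assumption for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

a_vals = np.linspace(0.0, 4.0, 81)    # candidate slopes
b_vals = np.linspace(-1.0, 3.0, 81)   # candidate intercepts
a_grid, b_grid = np.meshgrid(a_vals, b_vals)

# Residuals for every (a, b) pair and every data point: shape (81, 81, 5).
resid = a_grid[..., None] * x - y + b_grid[..., None]
F_grid = np.sum(resid ** 2, axis=-1) / (a_grid ** 2 + 1.0)

# Location of the smallest value on the grid.
i, j = np.unravel_index(np.argmin(F_grid), F_grid.shape)
a_best, b_best = a_grid[i, j], b_grid[i, j]
print(a_best, b_best)


def plot_landscape():
    """Draw filled contours of F over the (a, b) plane."""
    import matplotlib.pyplot as plt
    plt.contourf(a_grid, b_grid, F_grid, levels=30)
    plt.colorbar(label="F(a, b)")
    plt.xlabel("a (slope)")
    plt.ylabel("b (intercept)")
    plt.show()
```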
Ain't over til it's over
Matrix Diagonalization
A matrix-vector multiplication produces another vector. Have you ever been curious about where the vectors go? Where does a circle go?
Matrix Diagonalization
Matrix-matrix multiplication. It is easy to compute the row vectors individually. How do we create a matrix? NP.hstack(), NP.vstack()
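A sketch of building a matrix of points with vstack() and transforming all of them in one matrix-matrix product. It assumes NP in the slides is an alias for the NumPy module (written `np` here). The matrix A below is a hypothetical example, since the one used in the lecture is not shown.

```python
import numpy as np

# 32 sample points on the unit circle, stacked as a 2 x 32 matrix.
theta = np.linspace(0.0, 2.0 * np.pi, 32, endpoint=False)
xy = np.vstack((np.cos(theta), np.sin(theta)))

# Hypothetical matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

mapped = A @ xy                            # every column is A times one point
scaling = np.linalg.norm(mapped, axis=0)   # length of each mapped point
print(scaling.min(), scaling.max())
```

Because the input points have unit length, `scaling` directly shows how much each point was stretched. For this particular A, the smallest and largest values match its eigenvalues, 1 and 3.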
Recitation for Matrix Diagonalization
Matrix Diagonalization
A circle has been transformed into an ellipse!
Matrix Diagonalization
Each matrix maps a vector into another vector by multiplication. Create a sequence of vectors by repeated multiplication: x, A x, A^2 x, ..., A^j x. After each multiplication, scale linearly by the maximum element in A^j x.
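A sketch of this repeated multiplication with rescaling (commonly known as power iteration). The matrix A is a hypothetical example, the function name is illustrative, and the rescaling divides by the largest absolute element so that the vector neither grows nor shrinks.

```python
import numpy as np


def iterate_direction(A, x, n_iter=50):
    """Multiply x by A repeatedly, rescaling by the largest absolute element."""
    x = np.asarray(x, dtype=float)
    for _ in range(n_iter):
        x = A @ x
        x = x / np.max(np.abs(x))
    return x


# Hypothetical matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

v = iterate_direction(A, [1.0, 0.0])
print(v)
```

For this A, different starting vectors end up along the same direction, (1, 1), unless the start is exactly orthogonal to it.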
Matrix Diagonalization: Observation
Every vector converges to a major vector, which gives the major principal direction. The minor principal direction is orthogonal to the major principal direction. It is not obvious in the plot, but it does exist.
Diagonal Matrix
A diagonal matrix has nonzero elements only at the diagonal indices. Playing with normal (or unit) vectors.
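A small sketch of what a diagonal matrix does to the unit vectors along the axes: each one is only scaled by the matching diagonal entry, and its direction stays the same. The entries of D are hypothetical.

```python
import numpy as np

D = np.diag([3.0, 1.0])     # hypothetical diagonal matrix
e1 = np.array([1.0, 0.0])   # unit vector along the first axis
e2 = np.array([0.0, 1.0])   # unit vector along the second axis

print(D @ e1)  # e1 scaled by D[0, 0]
print(D @ e2)  # e2 scaled by D[1, 1]
```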
What is the major principal vector in this plot?
Matrix Diagonalization
What is the difference between A and D? The off-diagonal elements are different. In the previous example, we multiplied 32 sample points from a circle by the matrix A. What was the minimum scaling? What was the maximum scaling?
Matrix Diagonalization: Question
In the previous example, we multiplied 32 sample points from a circle by the matrix A. What was the minimum scaling after multiplying xy by A? What was the maximum scaling after multiplying xy by A? By how many degrees were the corresponding points rotated?