CSCI-256 Data Structures & Algorithm Analysis Lecture Note: Some slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved. 18
Elements of DP From an engineering perspective, when should we look for a DP solution to a problem? –Optimal substructure: The first step in solving an optimization problem by DP is to characterize the structure of an optimal solution. A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems. –Overlapping subproblems: The space of subproblems must be “small” in the sense that a recursive algorithm for the problem solves the same subproblems over and over again, rather than always generating new subproblems. Typically, the total number of distinct subproblems is polynomial in the input size. DP algorithms take advantage of this by solving each subproblem once and storing the solution in a table.
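Least Squares Least squares –Foundational problem in statistics and numerical analysis –Given n points in the plane: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ –Find a line $y = ax + b$ that minimizes the sum of the squared error $SSE = \sum_{i=1}^{n} (y_i - a x_i - b)^2$ –Solution: by calculus, the minimum error is achieved when $a = \frac{n \sum_i x_i y_i - (\sum_i x_i)(\sum_i y_i)}{n \sum_i x_i^2 - (\sum_i x_i)^2}$ and $b = \frac{\sum_i y_i - a \sum_i x_i}{n}$ [Figure: points with a fitted line; axes x and y]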
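A minimal Python sketch of the single-line fit, using the standard closed-form slope and intercept obtained by setting the partial derivatives of the squared error to zero. The function name `fit_line` and the fallback used when the denominator is zero are assumptions of this sketch, not part of the slides.

```python
def fit_line(points):
    """Return (a, b, sse) for the least-squares line y = a*x + b.

    points is a non-empty list of (x, y) pairs.
    """
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        # All x-coordinates coincide (e.g. a single point), so the slope is
        # undefined. Assumption: fall back to a horizontal line at the mean y.
        a = 0.0
        b = sum_y / n
    else:
        a = (n * sum_xy - sum_x * sum_y) / denom
        b = (sum_y - a * sum_x) / n

    # Sum of squared vertical distances from the points to the line.
    sse = sum((y - a * x - b) ** 2 for x, y in points)
    return a, b, sse
```

For example, the three collinear points `(0, 1), (1, 3), (2, 5)` give slope 2, intercept 1, and zero error.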
Least Squares Solution? [Figure: plot with axes x and y]
Segmented Least Squares Segmented least squares (first attempt) –Points lie roughly on a sequence of several line segments –Given n points in the plane $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_1 < x_2 < \cdots < x_n$, find a sequence of lines that minimizes SSE? [Figure: plot of the points; axes x and y]
What is the optimal linear interpolation with two line segments?
Optimal interpolation with two segments Give an equation for the optimal interpolation of $p_1, \ldots, p_n$ with two line segments. Let $E_{i,j}$ be the least squares error for the optimal line interpolating $p_i, \ldots, p_j$ (DONE IN CLASS)
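One way to write the two-segment answer, offered as a sketch rather than the in-class derivation. It assumes each segment covers a contiguous, non-empty run of points and that every point belongs to exactly one segment; the split index $i$ marks the last point of the first segment.

```latex
\[
  \min_{1 \le i < n} \bigl( E_{1,i} + E_{i+1,n} \bigr)
\]
```

Evaluating this directly tries $n - 1$ split points.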
What is the optimal linear interpolation with three line segments?
Optimal interpolation with three segments Give an equation for the optimal interpolation of $p_1, \ldots, p_n$ with three line segments. Let $E_{i,j}$ be the least squares error for the optimal line interpolating $p_i, \ldots, p_j$ (DONE IN CLASS)
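A sketch of the three-segment case under the same assumptions (contiguous, non-empty segments that together cover all points); the indices $i$ and $j$ mark the last points of the first and second segments.

```latex
\[
  \min_{1 \le i < j < n} \bigl( E_{1,i} + E_{i+1,j} + E_{j+1,n} \bigr)
\]
```

There are $\binom{n-1}{2}$ choices of $(i, j)$ here, and in general $\binom{n-1}{k-1}$ ways to place the breaks for $k$ segments, so enumerating all splits quickly becomes impractical as $k$ grows.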
What is the optimal linear interpolation with n line segments?
Segmented Least Squares Segmented least squares –Points lie roughly on a sequence of several line segments –Given n points in the plane $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_1 < x_2 < \cdots < x_n$, find a sequence of lines that minimizes f(x) Q: What's a reasonable choice for f(x) to balance accuracy (goodness of fit) and parsimony (number of lines)? [Figure: plot of the points; axes x and y]
Segmented Least Squares Segmented least squares –Points lie roughly on a sequence of several line segments –Given n points in the plane $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_1 < x_2 < \cdots < x_n$, find a sequence of lines that balances two quantities: (1) E, the sum of the sums of the squared errors in each segment; (2) L, the number of lines. Tradeoff function to minimize: $E + cL$, for some constant $c > 0$ [Figure: plot of the points; axes x and y]
Optimal substructure property Optimal solution with k line segments extends an optimal solution of k-1 line segments on a smaller problem
DP: Multiway Choice Notation –$OPT(j)$ = minimum cost for points $p_1, p_2, \ldots, p_j$ –$E_{i,j}$ = minimum sum of squares for points $p_i, p_{i+1}, \ldots, p_j$ Give a recursive definition for OPT(j) (DONE IN CLASS)
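A recurrence consistent with the algorithm slide that follows, written out as a sketch. The idea is that the last segment in an optimal solution for $p_1, \ldots, p_j$ starts at some point $p_i$; that segment costs $E_{i,j} + c$, and the remaining points $p_1, \ldots, p_{i-1}$ must themselves be covered optimally.

```latex
\[
  OPT(j) =
  \begin{cases}
    0 & \text{if } j = 0, \\[4pt]
    \displaystyle \min_{1 \le i \le j} \bigl( E_{i,j} + c + OPT(i-1) \bigr) & \text{otherwise.}
  \end{cases}
\]
```

The minimum ranges over $j$ possible starting points for the last segment, which is the "multiway choice" in the slide title.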
Segmented Least Squares: Algorithm Running time: $O(n^3)$ –Bottleneck = computing $E_{i,j}$ for $O(n^2)$ pairs, $O(n)$ per pair using the previous formula –Can be improved to $O(n^2)$ by pre-computing various statistics
INPUT: n, p_1, …, p_n, c
Segmented-Least-Squares() {
   Opt[0] = 0
   for j = 1 to n
      for i = 1 to j
         compute the least squares error E_ij for the segment p_i, …, p_j
   for j = 1 to n
      Opt[j] = min over 1 ≤ i ≤ j of (E_ij + c + Opt[i-1])
   return Opt[n]
}
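The pseudocode can be sketched in Python as follows. This is the straightforward $O(n^3)$ version, without the pre-computed statistics mentioned on the slide. The function names `segment_error` and `segmented_least_squares_cost` are assumptions of this sketch, and the points are assumed to be sorted by strictly increasing x.

```python
def segment_error(points, i, j):
    """Least-squares error of the best single line through points[i..j].

    Indices are 0-based and inclusive.
    """
    seg = points[i:j + 1]
    n = len(seg)
    sx = sum(x for x, _ in seg)
    sy = sum(y for _, y in seg)
    sxy = sum(x * y for x, y in seg)
    sxx = sum(x * x for x, _ in seg)
    denom = n * sxx - sx * sx
    if denom == 0:
        # With distinct x-values this only happens for a single point,
        # which any line through it fits exactly.
        return 0.0
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in seg)


def segmented_least_squares_cost(points, c):
    """Return Opt[n], the minimum of E + c*L over all segmentations."""
    n = len(points)

    # E[i][j] for 1 <= i <= j <= n, 1-based to match the slides.
    E = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            E[i][j] = segment_error(points, i - 1, j - 1)

    opt = [0.0] * (n + 1)  # opt[0] = 0: no points, no cost
    for j in range(1, n + 1):
        opt[j] = min(E[i][j] + c + opt[i - 1] for i in range(1, j + 1))
    return opt[n]
```

With a small penalty `c`, points that lie on two different lines are covered by two segments; with a large `c`, a single poorly fitting line becomes cheaper than paying for a second segment.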
Determining the solution When Opt[j] is computed, record the value of i that minimized the sum. Store this value in an auxiliary array. Use the array to reconstruct the solution by tracing back from j = n.
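A minimal sketch of the reconstruction step. It assumes the error table `E` has already been computed and is indexed 1-based, so that `E[i][j]` is the error for points i through j and row and column 0 are unused. The names `optimal_segments` and `choice` are hypothetical, chosen for this illustration.

```python
def optimal_segments(E, c):
    """Return (cost, segments) for a precomputed 1-based error table E.

    segments is a list of (i, j) pairs, meaning one line covers
    points i..j (1-based, inclusive), listed left to right.
    """
    n = len(E) - 1
    opt = [0.0] * (n + 1)
    choice = [0] * (n + 1)  # choice[j] = the i that minimized the sum for Opt[j]

    for j in range(1, n + 1):
        best_i = min(range(1, j + 1),
                     key=lambda i: E[i][j] + c + opt[i - 1])
        choice[j] = best_i
        opt[j] = E[best_i][j] + c + opt[best_i - 1]

    # Trace back from j = n: the last segment is choice[j]..j,
    # then continue with the points before it.
    segments = []
    j = n
    while j > 0:
        i = choice[j]
        segments.append((i, j))
        j = i - 1
    segments.reverse()
    return opt[n], segments
```

The traceback visits each segment once, so it adds only $O(n)$ time on top of filling the table.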