Published by Aron Johns. Modified over 9 years ago.
1
CSCI 256 Data Structures and Algorithm Analysis
Lecture 14
Some slides by Kevin Wayne (copyright 2005, Pearson Addison Wesley, all rights reserved) and some by Iker Gondra
2
Elements of DP
From an engineering perspective, when should we look for a DP solution to a problem?
– Optimal substructure: The first step in solving an optimization problem by DP is to characterize the structure of an optimal solution. A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems
– Overlapping subproblems: The space of subproblems must be “small” in the sense that a recursive algorithm for the problem solves the same subproblems over and over again, rather than always generating new subproblems. Typically, the total number of distinct subproblems is polynomial in the input size. DP algorithms take advantage of this by solving each subproblem once and storing the solution in a table
3
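Least Squares
– Foundational problem in statistics and numerical analysis
– Given n points in the plane: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
– Find a line $y = ax + b$ that minimizes the sum of the squared errors: $SSE = \sum_{i=1}^{n} (y_i - a x_i - b)^2$
– Solution (calculus): the minimum error is achieved when
$a = \dfrac{n \sum_i x_i y_i - \left(\sum_i x_i\right)\left(\sum_i y_i\right)}{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}, \qquad b = \dfrac{\sum_i y_i - a \sum_i x_i}{n}$
(Figure: plot of the points, axes x and y.)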
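The closed-form solution can be checked numerically. The following is a minimal sketch, assuming the points are given as a list of `(x, y)` pairs; the function name `fit_line` and the fallback to a horizontal line when all x values coincide are illustrative choices, not part of the slides.

```python
def fit_line(points):
    """Return (a, b, sse) for the least-squares line y = a*x + b."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)

    denom = n * sxx - sx * sx
    if denom == 0:
        # Assumption: with a single point (or identical x values) the slope
        # is undefined, so fall back to a horizontal line through the mean.
        a = 0.0
    else:
        a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n

    # Sum of squared vertical deviations from the fitted line.
    sse = sum((y - a * x - b) ** 2 for x, y in points)
    return a, b, sse


if __name__ == "__main__":
    # Three collinear points on y = 2x + 1, so the error should be zero.
    print(fit_line([(0, 1), (1, 3), (2, 5)]))
```

The error returned here plays the role of the per-segment error used in the rest of the lecture.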
4
Least Squares
Solution? Sensible??
(Figure: plot of the points, axes x and y.)
5
Segmented Least Squares
Segmented least squares (first attempt)
– Points lie roughly on a sequence of several line segments
– Given n points in the plane $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_1 < x_2 < \cdots < x_n$, find a sequence of lines that minimizes the SSE, which we could call the error
(Figure: plot of the points, axes x and y.)
6
Segmented Least Squares: how many line segments should we choose?
To optimize, we assign a greater penalty for a larger number of segments as well as for a larger error (the squared deviations of the points from their corresponding lines).
The penalty of a partition is the sum of:
– the number of segments into which we partition the points, times a given multiplier c
– for each segment, the error value of the optimal line through that segment
This is a partitioning problem. It is an instance of an important problem in data mining and statistics known as change detection: given a sequence of data points, identify a few points in the sequence at which a discrete change occurs (in this case, a change from one linear approximation to another)
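To make the penalty concrete, here is a small sketch that evaluates it for a given partition. The representation of a partition as a list of 0-based inclusive `(start, end)` index pairs, and the names `segment_error` and `partition_penalty`, are assumptions made for illustration.

```python
def segment_error(points):
    """Least-squares error of the best single line through the given points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = n * sxx - sx * sx
    # A single point has no defined slope; any line through it has zero error.
    a = 0.0 if denom == 0 else (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in points)


def partition_penalty(points, segments, c):
    """Penalty = c * (number of segments) + sum of the per-segment errors.

    `segments` is a list of (start, end) pairs, 0-based and inclusive,
    assumed to cover the points in order without gaps or overlaps.
    """
    penalty = c * len(segments)
    for start, end in segments:
        penalty += segment_error(points[start:end + 1])
    return penalty


if __name__ == "__main__":
    # Two clean linear pieces with a jump between x = 3 and x = 4.
    pts = [(1, 1), (2, 2), (3, 3), (4, 10), (5, 11), (6, 12)]
    print(partition_penalty(pts, [(0, 5)], c=1.0))          # one segment
    print(partition_penalty(pts, [(0, 2), (3, 5)], c=1.0))  # two segments
```

On this data the two-segment partition pays more for segments but nothing for error, which is the trade-off the multiplier c controls.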
7
Segmented Least Squares Goal in segmented Least Squares Problem: find a partition of minimal penalty
8
What is the optimal linear interpolation with two line segments?
9
Optimal interpolation with two segments
Give an equation for the error of the optimal fit (the one with minimal least squares error) through $p_1, \ldots, p_n$ using two line segments. Let $E_{i,j}$ be the least squares error for the optimal line through $p_i, \ldots, p_j$. (DONE IN CLASS)
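The slide leaves the answer to the in-class discussion. One plausible form, written here as an assumption that mirrors the three-segment expression on the next slide, chooses the index $i$ of the last point of the first segment:

```latex
\min_{1 \le i < n} \left( E_{1,i} + E_{i+1,n} \right)
```

As with the three-segment version, no penalty term for the number of segments is included.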
10
What is the optimal linear interpolation with three line segments?
11
Optimal interpolation with three segments
Give an equation for the error of the optimal fit (the one with minimal least squares error) through $p_1, \ldots, p_n$ using three line segments. Let $E_{i,j}$ be the least squares error for the optimal line through $p_i, \ldots, p_j$.
We need to find i and j, with $1 \le i < j < n$, that minimize $E_{1,i} + E_{i+1,j} + E_{j+1,n}$
(Note that we haven’t included a penalty term accounting for the number of segments)
Can we do this recursively?
12
What is the optimal linear interpolation with n line segments?
13
Segmented Least Squares
– Points lie roughly on a sequence of several line segments
– Given n points in the plane $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_1 < x_2 < \cdots < x_n$, find a sequence of lines that minimizes f(x)
Question: What's a reasonable choice for f(x) to balance accuracy (goodness of fit) and parsimony (number of lines)?
(Figure: plot of the points, axes x and y.)
14
Segmented Least Squares
– Points lie roughly on a sequence of several line segments
– Given n points in the plane $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with $x_1 < x_2 < \cdots < x_n$, find a sequence of lines that minimizes a combination of:
   the sum E of the sums of the squared errors in each segment
   the number of lines L
– Tradeoff function: E + cL, for some constant c > 0
(Figure: plot of the points, axes x and y.)
15
Optimal substructure property Optimal solution with k line segments extends an optimal solution of k-1 line segments on a smaller problem
16
DP: Multiway Choice
Notation
– OPT[j] = minimum cost for points $p_1, p_2, \ldots, p_j$
– $E_{i,j}$ = minimum sum of squares for points $p_i, p_{i+1}, \ldots, p_j$
Give a recursive definition for OPT[j]
17
Notation. OPT[j] = minimum cost for points $p_1, p_2, \ldots, p_j$. $E_{i,j}$ = minimum sum of squares for points $p_i, p_{i+1}, \ldots, p_j$.
To compute OPT[j]:
– The last segment uses points $p_i, p_{i+1}, \ldots, p_j$ for some i.
– Cost = $E_{i,j}$ + c + OPT[i-1].
– Which i?
$\mathrm{OPT}[j] = \min_{1 \le i \le j} \left( E_{i,j} + c + \mathrm{OPT}[i-1] \right)$
18
Segmented Least Squares: Algorithm
INPUT: n, p_1, …, p_n, c

Segmented-Least-Squares() {
   Opt[0] = 0
   for j = 1 to n
      for i = 1 to j
         compute the least squares error E_{i,j} for the segment p_i, …, p_j
      endfor
   endfor
   for j = 1 to n
      Opt[j] = min over 1 ≤ i ≤ j of (E_{i,j} + c + Opt[i-1])
   endfor
   return Opt[n]
}

Note: the running time can be improved to O(n^2) by pre-computing various statistics
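The pseudocode translates fairly directly into Python. This is a sketch of the straightforward cubic-time version, assuming the points are `(x, y)` pairs already sorted by x; the helper `segment_error` and the 1-based tables are illustrative choices that keep the indices aligned with the slides.

```python
def segment_error(points):
    """Least-squares error of the best single line through the given points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = n * sxx - sx * sx
    a = 0.0 if denom == 0 else (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in points)


def segmented_least_squares(points, c):
    """Return Opt[n], the minimum penalty E + c*L over all partitions."""
    n = len(points)

    # E[i][j] for 1 <= i <= j <= n; row and column 0 are unused padding.
    E = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            E[i][j] = segment_error(points[i - 1:j])

    # opt[j] is the minimum cost for the first j points; opt[0] = 0.
    opt = [0.0] * (n + 1)
    for j in range(1, n + 1):
        opt[j] = min(E[i][j] + c + opt[i - 1] for i in range(1, j + 1))
    return opt[n]


if __name__ == "__main__":
    pts = [(1, 1), (2, 2), (3, 3), (4, 10), (5, 11), (6, 12)]
    print(segmented_least_squares(pts, c=1.0))   # cheap segments
    print(segmented_least_squares(pts, c=20.0))  # expensive segments
```

With a small c the optimum tends to use more segments; with a large c it collapses toward a single line.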
19
Total running time: $O(n^3)$
– Computing $E_{i,j}$ for $O(n^2)$ pairs, at $O(n)$ per pair using the previous formula, gives $O(n^3)$ to compute all $E_{i,j}$ values
– After this, the algorithm has n iterations, for j = 1, …, n; for each value of j we have to compute the minimum in the recurrence to fill the array entry Opt[j], which takes $O(n)$ per j, so this part takes $O(n^2)$
Remark: there is an exercise in the text which shows how to reduce the total running time from $O(n^3)$ to $O(n^2)$
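One way to reach the quadratic bound is to pre-compute prefix sums so that each $E_{i,j}$ costs constant time. The sketch below is an assumption about how that could be done, not necessarily the approach the textbook exercise intends. It uses the identity $SSE = \sum y^2 - a \sum xy - b \sum y$, which holds when a and b are the least-squares coefficients; the name `all_segment_errors` is illustrative.

```python
from itertools import accumulate


def all_segment_errors(points):
    """Return E with E[i][j] (1-based, i <= j) = least-squares error of the
    best line through points i..j, using O(n^2) total time."""
    n = len(points)

    def prefix(values):
        # prefix(v)[k] = v_1 + ... + v_k, with prefix(v)[0] = 0.
        return [0.0] + list(accumulate(values))

    sx = prefix(x for x, _ in points)
    sy = prefix(y for _, y in points)
    sxx = prefix(x * x for x, _ in points)
    sxy = prefix(x * y for x, y in points)
    syy = prefix(y * y for _, y in points)

    E = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            m = j - i + 1
            # Range sums over points i..j from differences of prefix sums.
            X = sx[j] - sx[i - 1]
            Y = sy[j] - sy[i - 1]
            XX = sxx[j] - sxx[i - 1]
            XY = sxy[j] - sxy[i - 1]
            YY = syy[j] - syy[i - 1]
            denom = m * XX - X * X
            a = 0.0 if denom == 0 else (m * XY - X * Y) / denom
            b = (Y - a * X) / m
            # Clamp at zero to absorb small negative rounding errors.
            E[i][j] = max(0.0, YY - a * XY - b * Y)
    return E
```

Because this formula subtracts large, nearly equal quantities, it can lose precision on data with large coordinates; the direct residual sum is slower but numerically safer.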
20
Determining the solution
When Opt[j] is computed, record the value of i that minimized the sum
Store this value in an auxiliary array
Use it to reconstruct the solution
21
Determining the solution
Find-Segments(j)
   If j = 0 then
      Output nothing
   Else
      Get i that minimizes E_{i,j} + c + Opt[i-1]
      Output the segment {p_i, …, p_j} and the result of Find-Segments(i-1)
   Endif
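The traceback can be sketched as follows. This version is iterative rather than recursive and recomputes the minimizing i at each step, as Find-Segments does, instead of reading it from an auxiliary array; both choices, along with the names `segment_error` and `find_segments`, are assumptions made to keep the example short and self-contained.

```python
def segment_error(points):
    """Least-squares error of the best single line through the given points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = n * sxx - sx * sx
    a = 0.0 if denom == 0 else (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in points)


def find_segments(points, c):
    """Return an optimal partition as a list of 1-based (i, j) index pairs."""
    n = len(points)
    E = [[0.0] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            E[i][j] = segment_error(points[i - 1:j])

    opt = [0.0] * (n + 1)
    for j in range(1, n + 1):
        opt[j] = min(E[i][j] + c + opt[i - 1] for i in range(1, j + 1))

    # Walk back from j = n: pick the i that achieves Opt[j], emit the
    # segment p_i..p_j, then continue with the prefix ending at i - 1.
    segments = []
    j = n
    while j > 0:
        i = min(range(1, j + 1), key=lambda k: E[k][j] + c + opt[k - 1])
        segments.append((i, j))
        j = i - 1
    segments.reverse()
    return segments


if __name__ == "__main__":
    pts = [(1, 1), (2, 2), (3, 3), (4, 10), (5, 11), (6, 12)]
    print(find_segments(pts, c=1.0))
```

When several values of i tie for the minimum, this sketch takes the smallest one; any tie-breaking rule yields a partition of the same total penalty.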