Curve Simplification under the $L_2$-Norm. Ben Berg. Advisor: Pankaj Agarwal. Mentor: Swaminathan Sankararaman.

1 Curve Simplification under the $L_2$-Norm. Ben Berg. Advisor: Pankaj Agarwal. Mentor: Swaminathan Sankararaman.

2 Motivation: We are given large sets of data sampled from an underlying function that we wish to estimate. The data sets are too large. The data has some noise, but the degree of accuracy changes over time. P: a sequence of n points in $\mathbb{R}^2$. Each trajectory is given as such a sequence P.

3 Problem Formulation: We want a compact representation of the curve underlying P. k ≥ 0 is a parameter. Compute a continuous, piecewise-linear function f of at most k pieces. Let $p_i = (x_i, y_i)$. The error is the sum of the squared vertical distances: $\Delta(P, f) = \sum_{i=1}^{n} (y_i - f(x_i))^2$. Find the f which minimizes this error.
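The error measure on this slide can be sketched in a few lines of Python. This is only an illustration: the representation of f as a sorted list of vertices and the names `evaluate_pl` and `l2_error` are assumptions made here, not part of the talk.

```python
from bisect import bisect_right

def evaluate_pl(vertices, x):
    """Evaluate a continuous piecewise-linear function at x.

    vertices: list of (x, y) pairs sorted by x (at least two); consecutive
    vertices are joined by line segments, and the first and last segments
    are extended beyond the end vertices.
    """
    xs = [vx for vx, _ in vertices]
    # Index of the segment that contains x, clamped to a valid segment.
    j = min(max(bisect_right(xs, x) - 1, 0), len(vertices) - 2)
    (x0, y0), (x1, y1) = vertices[j], vertices[j + 1]
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)

def l2_error(points, vertices):
    """Sum of squared vertical distances from the points to the function."""
    return sum((y - evaluate_pl(vertices, x)) ** 2 for x, y in points)
```

For example, with vertices (0,0), (1,1), (2,0) the function has two pieces, and `l2_error` adds up the squared vertical gap at each input point.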

4 Some Definitions: A function f can be decomposed into its component functions (labeled $f_1$, $f_2$, $f_3$, $f_4$ in the figure). We call the intersection point between two consecutive component functions a breakpoint.

5 Previous Results: $L_\infty$-norm instead of $L_2$-norm: $\Delta_\infty(P, f) = \max_i |y_i - f(x_i)|$. A near-linear-time algorithm is shown by Agarwal et al. Imai and Iri consider fixing breakpoints to occur at input points. Guha et al. give a linear-time $(1+\varepsilon)$-approximation for the discontinuous case. A shortest-path formulation yields a dynamic programming solution for the discontinuous case. The discontinuous case is often unhelpful for estimating objects such as trajectories, which are often continuous curves.

6 Discontinuous Case: Do not require f to be continuous. There is an $O(n^2 k)$ dynamic programming solution. Every point becomes a vertex. Every optimal regression line becomes an edge; in the figure, the edge from i to j is labeled $\Delta(f^*_{i,j}, P_{i,j})$. Use a modified Bellman-Ford algorithm to find the shortest path of length k.
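The dynamic program for the discontinuous case can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's code: it assumes each piece covers a non-empty run of consecutive points (sorted by x), uses prefix sums so each regression error costs O(1), and allows at most k pieces. The names `make_segment_error` and `min_error_discontinuous` are hypothetical.

```python
from itertools import accumulate

def make_segment_error(points):
    """Return err(i, j): squared error of the best-fit line for points[i:j]."""
    def prefix(values):
        return [0.0] + list(accumulate(values))
    sx = prefix(x for x, _ in points)
    sy = prefix(y for _, y in points)
    sxx = prefix(x * x for x, _ in points)
    sxy = prefix(x * y for x, y in points)
    syy = prefix(y * y for _, y in points)

    def err(i, j):
        m = j - i
        X, Y = sx[j] - sx[i], sy[j] - sy[i]
        var_x = (sxx[j] - sxx[i]) - X * X / m
        cov = (sxy[j] - sxy[i]) - X * Y / m
        var_y = (syy[j] - syy[i]) - Y * Y / m
        if var_x <= 1e-12:          # one point, or all x equal: flat line
            return max(var_y, 0.0)
        return max(var_y - cov * cov / var_x, 0.0)
    return err

def min_error_discontinuous(points, k):
    """Least total squared error using at most k independent line pieces."""
    points = sorted(points)
    n = len(points)
    err = make_segment_error(points)
    inf = float("inf")
    best = [0.0] + [inf] * n        # best[j]: cost of covering the first j points
    for _ in range(k):              # each round permits one more piece
        nxt = best[:]
        for j in range(1, n + 1):
            for i in range(j):
                if best[i] < inf:
                    nxt[j] = min(nxt[j], best[i] + err(i, j))
        best = nxt
    return best[n]
```

Each of the k rounds relaxes every edge (i, j) once, which mirrors a Bellman-Ford search limited to k edges and gives the $O(n^2 k)$ bound.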

7 An Exact Algorithm: First consider the case where k = 2. For a given coverage, we can find the optimal function with this coverage. This generalizes to solving for the optimal k-piece function over a given coverage. Try all coverages of the point set. Runs in $O(n^{O(k)}(k^5 + n))$ time.
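To give a feel for the "try all coverages" step, here is a small sketch that enumerates coverages. It assumes a coverage assigns each of the k pieces a non-empty run of consecutive points; the talk may also allow pieces that cover no points. The function name `coverages` is hypothetical.

```python
from itertools import combinations

def coverages(n, k):
    """Yield every split of point indices 0..n-1 into k non-empty runs.

    Each coverage is a list of half-open (start, stop) index ranges,
    one per piece, in left-to-right order.
    """
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        yield [(bounds[t], bounds[t + 1]) for t in range(k)]
```

Under this assumption there are "n-1 choose k-1" coverages, which is where the $n^{O(k)}$ factor in the running time comes from.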

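8 The Two Piece Case: Let OPT: [equation not preserved in the transcript]. Suppose [condition not preserved in the transcript]. We have the following convex program to find $f^*$, where $f_L(x) = m_L x + c_L$ and $f_R(x) = m_R x + c_R$. (Figure: the pieces $f_L(x)$ and $f_R(x)$, with $x_L$ and $x_R$ marked.) The coverage of f describes which function covers which subset of points.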

9 The Two Piece Case: $g_L$: OPT regression line for $x \le x_i$. $g_R$: OPT regression line for $x \ge x_{i+1}$. x: breakpoint of $g_L$, $g_R$. Claim: if [condition not preserved in the transcript]; else, x lies on a boundary due to the convexity of ∆. (Figure labels: $g_L$, $g_R$, $f^*_L$, $f^*_R$, $x'$, $x^*$.)
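The two-piece procedure for a fixed coverage can be sketched as below. The condition of the claim did not survive in the transcript, so this sketch assumes it reads "the intersection of $g_L$ and $g_R$ lies between $x_i$ and $x_{i+1}$"; otherwise it follows the slide and puts the breakpoint on a boundary, trying both $x_i$ and $x_{i+1}$. All function names are hypothetical, and each side is assumed to contain at least two points with distinct x-values.

```python
def fit_line(pts):
    """Least-squares line y = m*x + c (needs two distinct x-values)."""
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    m = sxy / sxx
    return m, my - m * mx

def fit_with_breakpoint(pts, b):
    """Least-squares fit of f(x) = a + mL*min(x-b, 0) + mR*max(x-b, 0)."""
    rows = [(1.0, min(x - b, 0.0), max(x - b, 0.0)) for x, _ in pts]
    ys = [y for _, y in pts]
    # Augmented 3x3 normal equations, solved by Gauss-Jordan elimination.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(3):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [u - factor * v for u, v in zip(M[r], M[col])]
    a, mL, mR = (M[i][3] / M[i][i] for i in range(3))
    error = sum((y - (a * r[0] + mL * r[1] + mR * r[2])) ** 2
                for r, y in zip(rows, ys))
    return error, (b, a, mL, mR)

def two_piece_fit(points, i):
    """points sorted by x; left piece covers points[:i+1], right the rest.

    Returns (error, (breakpoint, value at breakpoint, left slope, right slope)).
    """
    left, right = points[: i + 1], points[i + 1:]
    (mL, cL), (mR, cR) = fit_line(left), fit_line(right)
    xi, xj = left[-1][0], right[0][0]
    if mL != mR:
        x = (cR - cL) / (mL - mR)          # where g_L and g_R intersect
        if xi <= x <= xj:                  # assumed condition of the claim
            err = (sum((y - (mL * px + cL)) ** 2 for px, y in left)
                   + sum((y - (mR * px + cR)) ** 2 for px, y in right))
            return err, (x, mL * x + cL, mL, mR)
    # Otherwise the breakpoint is placed on a boundary of the gap.
    return min(fit_with_breakpoint(points, xi), fit_with_breakpoint(points, xj))
```

Fixing the breakpoint at b makes the continuous two-piece function linear in its three parameters, so the boundary case reduces to an ordinary least-squares problem.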

10 Extension to k Pieces: Given a coverage of P, let f[i,j] be the optimal function covering pieces i through j, and let ∆[i,j] be the error associated with this function. Two functions f[i,j] and f[j+1,m] form a valid solution if they intersect such that no constraints are violated. The figure on this slide shows pieces which do not form a valid solution.

11 Extension to k Pieces: If, for some i < m < j, f[i,m] and f[m+1,j] form a valid solution, this solution is optimal. If any particular solution is not valid, what does this tell us? We can make another argument based on the convexity of ∆.

12 Extension to k Pieces: So, for a given coverage: try to find two optimal functions which form a valid solution. If none do, then there are k constraints; solve a constrained optimization problem. Solve for all optima of length 2, 3, …, k. Try all coverages of P. If all constraints are tight, solve a constrained optimization problem.

13 A $(1+\varepsilon)$ Dynamic Programming Solution: To simplify things, let's find a set of "candidate" lines. What if there were a discrete grid over the point set P? All functions would be somewhat close to a function which passes through pairs of grid marks.

14 Discretization Scheme: Claim: [statement not preserved in the transcript]. Comparing error in OPT solutions: $(\Delta_\infty(P))^2 \le \Delta(P) \le n(\Delta_\infty(P))^2$. Discretize x = $x_i$ for all i. Two consecutive points y, y' on x = $x_i$ [relation not preserved; the figure labels read "Factor of ε/n" and "$\varepsilon(\Delta_\infty)^2/n$"]. For any piece covering zero or one point, adjust this line to cover at least two points. For every piece, translate the piece vertically until it passes through a grid mark. Rotate it about this grid mark until it hits a second grid mark. Proof idea: each piece in OPT can be shifted to pass through two points in C.
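The comparison between the two optimal errors can be justified in two lines. The derivation below is a sketch under the assumption that $\Delta(P)$ and $\Delta_\infty(P)$ denote the optimal k-piece errors under each norm; the names $f_2$ and $f_\infty$ for the corresponding optimal functions are introduced here for illustration only.

```latex
\begin{align*}
\Delta(P) = \sum_{i=1}^{n} (y_i - f_2(x_i))^2
  &\ge \max_i \, (y_i - f_2(x_i))^2
   = \Delta_\infty(P, f_2)^2
   \ge \Delta_\infty(P, f_\infty)^2 = (\Delta_\infty(P))^2, \\
\Delta(P) \le \Delta(P, f_\infty)
  &= \sum_{i=1}^{n} (y_i - f_\infty(x_i))^2
   \le n \max_i \, (y_i - f_\infty(x_i))^2
   = n (\Delta_\infty(P))^2 .
\end{align*}
```

The first line uses that a sum of non-negative terms is at least its largest term and that $f_\infty$ minimizes the maximum error; the second uses that $f_2$ minimizes the sum and that each of the n terms is at most the largest one.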

15 Claim 1: Any two functions f and g which pass between the same pair of grid marks above or below every point are within a factor $(1+\varepsilon)$ of each other in error. Every point's error is increased by either $\varepsilon\Delta_\infty/n$ or a factor of $(1+\varepsilon)$; thus the error doesn't increase by more than a factor of $(1+\varepsilon)$.

16 Claim 2: Any function f can be transformed to a function g which passes through the same intervals above or below every point, such that every piece of g passes through 2 grid marks. Consider some piece $f_i(x)$. Through rotation, $f_i(x)$ cannot become vertical. No line leaves any interval it lies in. Rotation and translation may change coverage, but only between lines in the same interval.

17 Claim 2


19 Finishing Up: There exists a function through pairs of grid marks with error no more than a factor $(1+\varepsilon)$ of OPT. Find the minimal such function. Let T(i,p,q) be the optimal i-piece solution whose final piece passes through grid marks p and q. We can solve this in [running time not preserved in the transcript]. Let Φ(p,q,r,s) be the error of the line through p, q to the right of its intersection with the line through r, s.

20 Future Work: Can we draw fewer grid marks? Can we choose lines through these grid marks in a more clever manner? Coreset-based approach: a coreset is a subset of the input points whose error is not too different from that of the input set. For example, we could find a $(1+\varepsilon)$ coreset for the problem. Har-Peled describes coresets for similar problems. This suggests we could find an $O(n+k)^{O(1)}$ approximation.

