2IMG15 Algorithms for Geographic Data

2IMG15 Algorithms for Geographic Data
Fall Lecture 3: Simplification

Problem Many algorithms handling movement data are slow, e.g.
similarity O(nm log nm) general segmentation O(n2) grouping O(t n3) … Observation: Many trajectories can be simplified without losing much information.

Problem Many algorithms handling movement data are slow, e.g.
similarity O(nm log nm) general segmentation O(n2) grouping O(t n3) … Observation: Many trajectories can be simplified without losing much information. Example:

Steps in Algorithmic Problem Solving
Understand the input Understand what you really want from the output Write an input and output specification and double-check it! Find geometric properties of the desired output Construct an algorithm Verify that it actually solves the problem you specified Analyze the efficiency

Assumptions Due to lack of further information:
constant speed (and velocity) on each segment ➨ location changes continuously, but speed/velocity is piecewise constant and changes “in jumps” (xn,yn),tn (x2,y2),t2 (xi,yi),ti (x1,y1),t1

Assumptions We do not want to assume that …
sampling occurred at regular intervals no data is missing the object is only present at the measured locations but not in between A B ? How much did A deviate from the route of B?

Steps in algorithmic problem solving
 Understand the input Understand what you really want from the output Write an input and output specification and double-check it! Find geometric properties of the desired output Construct an algorithm Verify that it actually solves the problem you specified Analyze the efficiency

Specifying the output We want a trajectory with the smallest number of vertices and with error at most  Without thinking about the output, one could say: “Just use a line simplification method” 1 2 3 4 5 6 assume the numbers correspond to minutes 1 6 we have made speed uniform! 1 6 5 correct!

Specifying the output We want a trajectory with the smallest number of vertices and with error at most  … but what do we mean with “error” ? Option 1: the maximum error in location for any moment of time: max distance( T(t), S(t) ) time t T input trajectory S simplified trajectory 1 2 3 4 5 6 T S1 S2 (5) (3)

Specifying the output We want a trajectory with the smallest number of vertices and with error at most  … but what do we mean with “error” ? Option 2: the error in speed as a multiplicative factor, for any moment Option 3: the error in velocity (speed and heading combined) … or any combination of the above

Let’s start simple In this lecture mostly: curve simplification
You will see in Excercise A2.2 an example of how to incorporate time

Today Simplifying polygonal curves Ramer–Douglas–Peucker, 1973
Imai-Iri, 1988 Agarwal, Har-Peled, Mustafa and Wang, 2005

Ramer-Douglas-Peucker
1972 by Urs Ramer and 1973 by David Douglas and Thomas Peucker The most frequently used simplification algorithm. Used in GIS, geography, computer vision, pattern recognition… Very easy to implement and “works well” in practice.

Input polygonal path P = p1,…,pn and threshold  Initially i=1 and j=n Algorithm DP(P,i,j) Find the vertex vf between pi and pj farthest from pipj. dist := the distance between vf and pipj. if dist >  then DP(P, vi , vf) DP(P, vf , vj) else Output(vivj) pj pi

Input polygonal path P = p1,…,pn and threshold  Initially i=1 and j=n Algorithm DP(P,i,j) Find the vertex vf between pi and pj farthest from pipj. dist := the distance between vf and pipj. if dist >  then DP(P, vi , vf) DP(P, vf , vj) else Output(vivj) pj pi dist

q > e > e > e p

Time complexity? Testing a shortcut between pi and pj takes O(j-i) time. Worst-case recursion? DP(P, vi , vi+1) DP(P, vi+1 , vj) Time complexity T(n) = O(n) + T(n-1) = O(n2) Algorithm DP(P,i,j) Find the vertex vf farthest from pipj. dist := the distance between vf and pipj. if dist >  then DP(P, vi , vf) DP(P, vf , vj) else Output(vivj)

Summary: Ramer-Douglas-Peucker
Worst-case time complexity: O(n2) In most realistic cases: T(n) = T(n1) + T(n2) + O(n) = O(n log n), where n1 and n2 are smaller than n/c for some constant c. For curve in 2D Can be computed in O(n log n) time, and if curve not self-intersect, in O(n log* n) time using a type of dynamic convex hull data structure [Hershberger & Snoeyink ’92&98] Does not give any bound on the complexity of the simplification! > e p

Imai-Iri Previous algorithm is simple and fast but do not give a bound on the complexity of the simplification Examples for which they perform poorly? ε ε ε Douglas-Peucker Optimal

Imai-Iri Previous algorithm is simple and fast but do not give a bound on the complexity of the simplification Examples for which they perform poorly? Imai-Iri 1988 gave an algorithm that produces an -simplification with the minimum number of links. Generally, two variants Min-vertices (for given ε): this is the one we mostly consider Min-ε (for given number of vertices): often uses binary search with Min-vertices as subroutine Another distinction: does the simplification only use vertices of the input or not? In lecture only input vertices, but see Ex A2.1

Imai-Iri Input polygonal path P = p1,…,pn and threshold 
Build a graph G containing all valid shortcuts. Find a minimum link path from p1 to pn in G

Imai-Iri Find all possible valid shortcuts

Imai-Iri Find all possible valid shortcuts > e

Imai-Iri Find all possible valid shortcuts <e <e

Imai-Iri Find all possible valid shortcuts >e

Imai-Iri Find all possible valid shortcuts

Imai-Iri All possible shortcuts!

Imai-Iri 1. Build a directed graph of valid shortcuts. 2. Compute a shortest path from p1 to pn using breadth-first search.

Imai-Iri Brute force running time: ? #possible shortcuts ?

Analysis: Imai-Iri Brute force running time: ? #possible shortcuts ?
Running time: O(n3) O(n2) possible shortcuts O(n) per shortcut  O(n3) to build graph O(n2) BFS in the graph Output: A path with minimum number of edges Improvements: Chan and Chin’92: O(n2) for Min-ε: O(n2 log n)

Speeding up Imai-Iri The graph can have ~n2 edges; testing one shortcut takes time linear in the number of vertices in between The Imai-Iri algorithm avoids spending ~n3 time by testing all shortcuts from a single vertex in linear time

Speeding up Imai-Iri A shortcut (pi pj) is feasible
iff it intersects all ε-disks of vertices in between (i+1,…,j-1) iff both rays (from pi through pj and from pj through pi) intersect all ε-disks of vertices inbetween Checking if a ray intersects all disks can be done efficiently: Algorithm (vertex i) Wi = whole plane for j = i+1,...,n Wj = Wj-1 ∩ wedge wi,j defined by pi & ε-disk at pj if Wj = empty then break if ray rj in Wj then (pi pj) is feasible Note: after projecting you can see this as interval intersection problem

Is the problem in Ω(n2)? If we compute the “shortest-path graph” we can’t avoid Ω(n2). For curves in Rd with d = Ω(log n), it is unlikely that an O(n2-ε)-time algorithm exists [Buchin et al. ‚16] For curves in 2d there are cases that can be solved in subquadratic time using a clique-cover of the shortest-path graph [Agarwal, Varadarajan ’00] General 2d Euclidean case still open …

Imai-Iri with Fréchet distance
The framework of Imai-Iri can also be used for simplification under the Fréchet distance Min-vertices (for given ε) : solve decision problem for each edge: O(n3) Min-ε (for given length of simplification): solve computation problem for each edge: O(n3 log n) Adapt [Guibas et al. ’93] to compute incrementally

Results so far RDP: O(n2) time (simple and fast in practice)
Output: A path with no bound on the size of the path Imai-Iri: O(n2) time by Chan and Chin Output: A path with minimum number of edges RDP and II use the Hausdorff distance II can use Fréchet distance but slow Question: Can we get something that is simple, fast and has a worst-case bound using the Fréchet error measure?

Agarwal et al. Agarwal, Har-Peled, Mustafa and Wang, 2002
Time: Running time O(n log n) Measure: Fréchet distance Output: Path has at most the same complexity as a minimum link (/2)-simplification

Algorithm - Agarwal et al.
Let P = p1, p2, ... , pn be a polygonal curve Notation: Let (pipj) denote the Fréchet distance between (pi,pj) and the subpath (pi,pj) of P between pi and pj. pi pm pj pl

Algorithm(P = p1,…,pn, ) i := 1 P’= p1 while i < n do find any j>i such that ((pi,pj)   and ((pi,pj+1) > ) or ((pi,pj)   and j=n) P’= concat(P,pj) i := j end return P’

Algorithm(P = p1,…,pn, ) i := 1 P’= p1 while i < n do find any j>i such that ((pi,pj)   and ((pi,pj+1) > ) or ((pi,pj)   and j=n) P’= concat(P,pj) i := j end return P’ p1 pj Pj+1

Efficient algorithm? Recall: Computing the Fréchet distance between (pi,pj) and (pi,pj) can be done in O(j-i) time pi pj Pj+1

Efficient algorithm? Recall: Computing the Fréchet distance between (pi,pj) and (pi,pj) can be done in O(j-i) time Finding the first j such that (pi,pj)   and (pi,pj+1) >  takes O(n2) time. pi pj Pj+1

How can we speed up the search for pj? Note that we just want to find any vertex pj such that (pi,pj)   and (pi,pj+1) >  Idea: search for pj using exponential search followed by binary search (double&search)

Exponential search: Test pi+1, pi+2, pi+4, pi+8… until found pi+2k, such that (pi,pi+2k) > . Binary search: Search for pj between pi+2k-1 and pi+2k s.t. (pi,pj)   and (pi,pj+1) > .   >  i+2k-1 i+2k

Exponential search: Test pi+1, pi+2, pi+4, pi+8… until found pi+2k, such that (pi,pi+2k) > . Binary search: Search for pj between pi+2k-1 and pi+2k s.t. (pi,pj)   and (pi,pj+1) > . if   then search right   >  i+2k-1 i+2k

Exponential search: Test pi+1, pi+2, pi+4, pi+8… until found pi+2k, such that (pi,pi+2k) > . Binary search: Search for pj between pi+2k-1 and pi+2k s.t. (pi,pj)   and (pi,pj+1) > . if   then search right   >  i+2k-1 i+2k if >  then search left

Exponential search: Test pi+1, pi+2, pi+4, pi+8… until found pi+2k, such that (pi,pi+2k) > . Binary search: Search for pj between pi+2k-1 and pi+2k s.t. (pi,pj)   and (pi,pj+1) > .   >   > pj pj+1 i+2k-1 i+2k

Exponential search: Test pi+1, pi+2, pi+4, pi+8… until found pi+2k, such that (pi,pi+2k) > . Iterations: O(log n) Binary search: Search for pj between pi+2k-1 and pi+2k s.t. (pi,pj)   and (pi,pj+1) > . The algorithm computes indices ij = ij-1 + rj with rj in [2l,2l+1] and ∑rj = n. To compute ij requires l = O(log n) iterations of both searches, costing O(l∙2l) : sums to overall O(n log n)

Analysis - Agarwal et al.
Lemma 1: Let P = p1, p2, ... , pn be a polygonal curve. For l ≤ i ≤ j ≤ m, (pipj) ≤ 2·(plpm). pi pm pj pl

Lemma 1: Let P = p1, p2, ... , pn be a polygonal curve. For l ≤ i ≤ j ≤ m, (pipj) ≤ 2·(plpm). Proof: Fix matching that realises (plpm). Set  = (plpm). pi pm pj pl

Lemma 1: Let P = p1, p2, ... , pn be a polygonal curve. For l ≤ i ≤ j ≤ m, (pipj) ≤ 2·(plpm). Proof: Fix matching that realises (plpm). Set  = (plpm). ((pipj),(qiqj)) ≤  pi pm pj pl qi qj

Lemma 1: Let P = p1, p2, ... , pn be a polygonal curve. For l ≤ i ≤ j ≤ m, (pipj) ≤ 2·(plpm). Proof: Fix matching that realises (plpm). Set  = (plpm). ((pipj),(qiqj)) ≤  qi pi pj (pipj) qj

Lemma 1: Let P = p1, p2, ... , pn be a polygonal curve. For l ≤ i ≤ j ≤ m, (pipj) ≤ 2·(plpm). Proof: Fix matching that realises (plpm). Set  = (plpm). ((pipj),(qiqj)) ≤  ((qiqj),(pipj)) ≤  pi pm pj pl qi qj

Lemma 1: Let P = p1, p2, ... , pn be a polygonal curve. For l ≤ i ≤ j ≤ m, (pipj) ≤ 2·(plpm). Proof: Fix matching that realises (plpm). Set  = (plpm). ((pipj),(qiqj)) ≤  ((qiqj),(pipj)) ≤  ≤  qi pi pj qj ≤ 

Lemma 1: Let P = p1, p2, ... , pn be a polygonal curve. For l ≤ i ≤ j ≤ m, (pipj) ≤ 2·(plpm). Proof: Fix matching that realises (plpm). Set  = (plpm). ((pipj),(qiqj)) ≤  ((qiqj),(pipj)) ≤  ((pipj),pipj) ≤ ((pipj),(qiqj)) + ((qiqj),(pipj)) ≤ 2 pi pm pj pl qi (pipj) qj □

Theorem: (P,P’)   and |P’|  Popt(/2) Proof: (P,P’)   follows by construction.

Theorem: |P’|  Popt(/2) Proof: Q = p1= pj1,…, pjl=pn - optimal (/2)-simplification P’ = p1= pi1,…, pik=pn Prove (by induction) that im  jm , m  1. pim-1 pjm pim

pjm-1 Theorem: |P’|  Popt(/2) Proof: Q = p1= pj1,…, pjl=pn - optimal (/2)-simplification P’ = p1= pi1,…, pik=pn Prove (by induction) that im  jm , m  1. Assume im-1  jm-1 and let i’ = im+1. pim-1 pjm pim pi’

pjm-1 Theorem: |P’|  Popt(/2) Proof: Q = p1= pj1,…, pjl=pn - optimal (/2)-simplification P’ = p1= pi1,…, pik=pn Prove (by induction) that im  jm , m  1. Assume im-1  jm-1 and let i’ = im+1. By construction we have: (pim-1pi’) >  and (pim-1pi’-1)   If i’ > jm then we are done! pim-1 > pjm pim  pi’

pjm-1 Theorem: |P’|  Popt(/2) Proof: Q = p1= pj1,…, pjl=pn - optimal (/2)-simplification P’ = p1= pi1,…, pik=pn Prove (by induction) that im  jm , m  1. Assume im-1  jm-1 and let i’ = im+1. By construction we have: (pim-1pi’) >  and (pim-1pi’-1)   Assume the opposite i.e. i’  jm pim-1 pim pi’ pjm

pjm-1 Theorem: |P’|  Popt(/2) Proof: Q = p1= pj1,…, pjl=pn - optimal (/2)-simplification P’ = p1= pi1,…, pik=pn Prove (by induction) that im  jm , m  1. Assume im-1  jm-1 and let i’ = im+1. By construction we have: (pim-1pi’) >  and (pim-1pi’-1)   Assume the opposite i.e. i’  jm Q is an (/2)-simplification  (pjm-1pjm)  /2 pim-1 pim pi’ pjm

pjm-1 Theorem: |P’|  Popt(/2) Proof: Q = p1= pj1,…, pjl=pn - optimal (/2)-simplification P’ = p1= pi1,…, pik=pn Assume the opposite i.e. i’  jm Q is an (/2)-simplification  (pjm-1pjm)  /2 pim-1 pim pi’ pjm

pjm-1 Theorem: |P’|  Popt(/2) Proof: Q = p1= pj1,…, pjl=pn - optimal (/2)-simplification P’ = p1= pi1,…, pik=pn Assume the opposite i.e. i’  jm Q is an (/2)-simplification  (pjm-1pjm)  /2 According to Lemma 1: (pim-1pi’)  2 (pjm-1pjm)   pim-1 pim pi’ pjm

pjm-1 Theorem: |P’|  Popt(/2) Proof: Q = p1= pj1,…, pjl=pn - optimal (/2)-simplification P’ = p1= pi1,…, pik=pn Assume the opposite i.e. i’  jm Q is an (/2)-simplification  (pjm-1pjm)  /2 According to Lemma 1: (pim-1pi’)  2 (pjm-1pjm)   This is a contradiction since (pim-1pi’) >  by construction.  i’ > jm and we are done! pim-1 pim pi’ pjm

Summary - Agarwal et al. Theorem:
Given a polygonal curve P in Rd and a parameter   0, an -simplification of P of size at most Popt(/2) can be constructed in O(n log n) time and O(n) space.

Summary Ramer–Douglas–Peucker 1973 Used very often in practice
Easy to implement and fast in practice Hausdorff error Imai-Iri 1988 Gives an optimal solution but slow Hausdorff or Fréchet error Agarwal, Har-Peled, Mustafa and Wang 2005 Fast and easy to implement Gives a worst-case performance guarantee Fréchet error

More algorithms t Simplifying trajectories
Time has to be incorporated, possibly as third dimension Gudmundsson et al. 2009 based on Ramer–Douglas–Peucker Assignment 2, Exercise 2: Imai-Iri Streaming setting Abam, de Berg, Hachenberger, Zarei 2007 Non-vertex constrained Guibas, Hershberger, Mitchell, Snoeyink, 1993 … and more, e.g. topological or geometric constraints t Where-at (t)

Where-at (t) t Where-at (t)

Main references W. S. Chan and F. Chin. Approximation of polygonal curves with minimum number of line segments. In Proceedings of the 3rd International Symposium on Algorithms and Computation, 1992. D. H. Douglas and T. K. Peucker. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. The Canadian Cartographer, 1973. P. K. Agarwal, S. Har-Peled, N. Mustafa, and Y. Wang. Near-linear time approximation algorithms for curve simplification in two and three dimensions. Algorithmica, 2005.

2IMG15 Algorithms for Geographic Data

Similar presentations

Presentation on theme: "2IMG15 Algorithms for Geographic Data"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

2IMG15 Algorithms for Geographic Data

Similar presentations

Presentation on theme: "2IMG15 Algorithms for Geographic Data"— Presentation transcript:

Similar presentations

About project

Feedback