2IMA20 Algorithms for Geographic Data

Slides:



Advertisements
Similar presentations
Trajectory Segmentation Marc van Kreveld. Algorithms Researchers … … want their problems to be well-defined (fully specified) … care about efficiency.
Advertisements

Introduction to Computer Science 2 Lecture 7: Extended binary trees
Fundamental tools: clustering
Fast Algorithms For Hierarchical Range Histogram Constructions
Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
AVL Trees COL 106 Amit Kumar Shweta Agrawal Slide Courtesy : Douglas Wilhelm Harder, MMath, UWaterloo
Greedy Algorithms Amihood Amir Bar-Ilan University.
Indexing and Range Queries in Spatio-Temporal Databases
On Map-Matching Vehicle Tracking Data
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part C Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
2IL50 Data Structures Spring 2015 Lecture 8: Augmenting Data Structures.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 11.
Binary Heaps CSE 373 Data Structures Lecture 11. 2/5/03Binary Heaps - Lecture 112 Readings Reading ›Sections
Sequence Alignment Variations Computing alignments using only O(m) space rather than O(mn) space. Computing alignments with bounded difference Exclusion.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part B Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
1 Indexing Structures for Files. 2 Basic Concepts  Indexing mechanisms used to speed up access to desired data without having to scan entire.
DAST, Spring © L. Joskowicz 1 Data Structures – LECTURE 1 Introduction Motivation: algorithms and abstract data types Easy problems, hard problems.
Indexing. Goals: Store large files Support multiple search keys Support efficient insert, delete, and range queries.
Computational Movement Analysis Lecture 5: Segmentation, Popular Places and Regular Patterns Joachim Gudmundsson.
Spring 2015 Lecture 11: Minimum Spanning Trees
The X-Tree An Index Structure for High Dimensional Data Stefan Berchtold, Daniel A Keim, Hans Peter Kriegel Institute of Computer Science Munich, Germany.
Jessie Zhao Course page: 1.
B-trees and kd-trees Piotr Indyk (slides partially by Lars Arge from Duke U)
CSCI 256 Data Structures and Algorithm Analysis Lecture 14 Some slides by Kevin Wayne copyright 2005, Pearson Addison Wesley all rights reserved, and some.
CSIS7101 – Advanced Database Technologies Spatio-Temporal Data (Part 1) On Indexing Mobile Objects Kwong Chi Ho Leo Wong Chi Kwong Simon Lui, Tak Sing.
CSC 211 Data Structures Lecture 13
2IL50 Data Structures Fall 2015 Lecture 9: Range Searching.
Spatio-temporal Pattern Queries M. Hadjieleftheriou G. Kollios P. Bakalov V. J. Tsotras.
A New Framework for Criteria-­based Trajectory Segmentation Kevin Buchin Joint work with Sander Alewijnse, Maike Buchin, Andrea Kölzsch, Helmut Kruckenberg.
15.082J & 6.855J & ESD.78J September 30, 2010 The Label Correcting Algorithm.
2IMA20 Algorithms for Geographic Data Spring 2016 Lecture 6: Segmentation.
2IL05 Data Structures Spring 2010 Lecture 9: Augmenting Data Structures.
2IMA20 Algorithms for Geographic Data Spring 2016 Lecture 5: Simplification.
2IMA20 Algorithms for Geographic Data Spring 2016 Lecture 7: Labeling.
CSE332: Data Abstractions Lecture 7: AVL Trees
Analysis of Algorithms CS 477/677
Data Structures – LECTURE Balanced trees
Indexing Structures for Files and Physical Database Design
RE-Tree: An Efficient Index Structure for Regular Expressions
COMP 430 Intro. to Database Systems
Haim Kaplan and Uri Zwick
Summary of General Binary search tree
Chapter 12: Query Processing
Spatio-temporal Pattern Queries
Spatial Online Sampling and Aggregation
CSE373: Data Structures & Algorithms Lecture 11: Implementing Union-Find Linda Shapiro Spring 2016.
Enumerating Distances Using Spanners of Bounded Degree
Greedy Algorithms / Dijkstra’s Algorithm Yin Tat Lee
B+-Trees and Static Hashing
Algorithms (2IL15) – Lecture 2
Indexing and Hashing Basic Concepts Ordered Indices
Ch. 8 Priority Queues And Heaps
Greedy Algorithms Many optimization problems can be solved more quickly using a greedy approach The basic principle is that local optimal decisions may.
CSE373: Data Structures & Algorithms Lecture 5: AVL Trees
Tree-Structured Indexes
Minimizing the Aggregate Movements for Interval Coverage
CS 583 Analysis of Algorithms
Hashing Sections 10.2 – 10.3 Lecture 26 CS302 Data Structures
2IMG15 Algorithms for Geographic Data
Kinetic Collision Detection for Convex Fat Objects
Algorithms (2IL15) – Lecture 7
Lecture 14 Shortest Path (cont’d) Minimum Spanning Tree
Dynamic Programming II DP over Intervals
Binary SearchTrees [CLRS] – Chap 12.
Richard Anderson Spring 2016
Clustering.
Hashing.
Lecture 13 Shortest Path (cont’d) Minimum Spanning Tree
Donghui Zhang, Tian Xia Northeastern University
Analysis of Algorithms CS 477/677
Presentation transcript:

2IMA20 Algorithms for Geographic Data Spring 2016 Lecture 4: Segmentation

Motivation: Geese Migration Two behavioural types stopover migration flight Input: GPS tracks expert description of behaviour Goal: Delineate stopover sites of migratory geese

Abstract / general purpose questions Single trajectory simplification, cleaning segmentation into semantically meaningful parts finding recurring patterns (repeated subtrajectories) Two trajectories similarity computation subtrajectory similarity Multiple trajectories clustering, outliers flocking/grouping pattern detection finding a typical trajectory or computing a mean/median trajectory visualization

Problem For analysis, it is often necessary to break a trajectory into pieces according to the behaviour of the entity (e.g., walking, flying, …). Input: A trajectory T, where each point has a set of attribute values, and a set of criteria. Attributes: speed, heading, curvature… Criteria: bounded variance in speed, curvature, direction, distance… Aim: Partition T into a minimum number of subtrajectories (so-called segments) such that each segment fulfils the criteria. “Within each segment the points have similar attribute values”

Criteria-Based Segmentation Goal: Partition trajectory into a small number of segments such that a given criterion is fulfilled on each segment

Example Trajectory T sampled with equal time intervals Criterion: speed cannot differ more than a factor 2?

Example speed Trajectory T sampled with equal time intervals Criterion: speed cannot differ more than a factor 2? Criterion: direction of motion differs by at most 90⁰? speed

Example speed speed Trajectory T sampled with equal time intervals Criterion: speed cannot differ more than a factor 2? Criterion: direction of motion differs by at most 90⁰? speed speed heading

Example speed speed Trajectory T sampled with equal time intervals Criterion: speed cannot differ more than a factor 2? Criterion: direction of motion differs by at most 90⁰? speed speed speed & heading heading

Decreasing monotone criteria Definition: A criterion is decreasing monotone, if it holds on a segment, it holds on any subsegment. Examples: disk criterion (location), angular range (heading), speed… Theorem: A combination of conjunctions and disjunctions of decreasing monotone criteria is a decreasing monotone criterion.

Greedy Algorithm Observation: If criteria are decreasing monotone, a greedy strategy works.

Designing greedy algorithms try to discover structure of optimal solutions: what properties do optimal solutions have ? what are the choices that need to be made ? do we have optimal substructure ? optimal solution = first choice + optimal solution for subproblem do we have greedy-choice property for the first choice ? prove that optimal solutions indeed have these properties prove optimal substructure and greedy-choice property use these properties to design an algorithm and prove correctness proof by induction (possible because optimal substructure)

lemma obviously holds, so assume OPT does not include ai. Lemma: Let k be the smallest index such that 𝜏 𝑡 0 , 𝑡 𝑘 fails for a monotone decreasing criterion 𝜏. If there is a solution, then there is an optimal solution including 𝜏 𝑡 0 , 𝑡 𝑘−1 as segment. Proof. Let OPT be an optimal solution for A. If OPT includes ai then the lemma obviously holds, so assume OPT does not include ai. We will show how to modify OPT into a solution OPT* such that (i) OPT* is a valid solution (ii) OPT* includes ai (iii) size(OPT*) ≤ size(OPT) Thus OPT* is an optimal solution including 𝜏 𝑡 0 , 𝑡 𝑘−1 , and so the lemma holds. To modify OPT we proceed as follows. standard text you can basically use in proof for any greedy-choice property quality OPT* ≥ quality OPT here comes the modification, which is problem-specific

Greedy Algorithm Observation: If criteria are decreasing monotone, a greedy strategy works. For many decreasing monotone criteria Greedy requires O(n) time, e.g. for speed, heading…

Greedy Algorithm Observation: For some criteria, iterative double & search is faster. Double & search: An exponential search followed by a binary search.

Criteria-Based Segmentation Boolean or linear combination of decreasing monotone criteria Greedy Algorithm incremental in O(n) time or constant-update criteria e.g. bounds on speed or heading double & search in O(n log n) time for non-constant update criteria [M.Buchin et al.’11]

Motivation: Geese Migration Two behavioural types stopover migration flight Input: GPS tracks expert description of behaviour Goal: Delineate stopover sites of migratory geese

Case Study: Geese Migration Data Spring migration tracks White-fronted geese 4-5 positions per day March – June Up to 10 stopovers during spring migration Stopover: 48 h within radius 30 km Flight: change in heading <120° Kees Adri

Comparison manual computed

Evaluation Few local differences: Shorter stops, extra cuts in computed segmentation

Criteria A combination of decreasing and increasing monotone criteria stopover migration flight Within radius 30km At least 48h Change in heading <120° AND OR

Decreasing and Increasing Monotone Criteria Observation: For a combination of decreasing and increasing monotone criteria the greedy strategy does not always work. Example: Min duration 2 AND Max speed range 4 5 speed 1

Non-Monotone Segmentation Many Criteria are not (decreasing) monotone: Minimum time Standard deviation Fixed percentage of outliers For these Aronov et al. introduced the start-stop diagram Example: Geese Migration

Start-Stop Diagram Algorithmic approach input trajectory compute start-stop diagram compute segmentation

Start-Stop Diagram Given a trajectory T over time interval I = {t0,…,t} and criterion C The start-stop diagram D is (the upper diagonal half of) the n x n grid, where each point (i,j) is associated to segment [ti,tj] with (i,j) is in free space if C holds on [ti,tj] (i,j) is in forbidden space if C does not hold on [ti,tj] A (minimal) segmentation of T corresponds to a (min-link) staircase in D forbidden space free space staircase

Start-Stop Diagram A (minimal) segmentation of T corresponds to a (min-link) staircase in D. 12 11 10 9 8 7 6 5 4 3 2 1 6 7 8 5 9 11 10 4 3 12 2 123456789101112 1 2 3 4 5 6 7 8 9 10 11 12 1

Start-Stop Diagram A (minimal) segmentation of T corresponds to a (min-link) staircase in D. 123456789101112 1 2 3 4 5 6 7 8 9 10 11 12 12 11 10 9 8 7 6 5 4 3 2 1 6 7 8 5 9 11 10 4 3 12 2 1

Start-Stop Diagram Discrete case: A non-monotone segmentation can be computed in O(n2) time. 123456789101112 1 2 3 4 5 6 7 8 9 10 11 12 12 11 10 9 8 7 6 5 4 3 2 1 4 5 8 11 1 12 6 9 10 7

Start-Stop Diagram Discrete case: A non-monotone segmentation can be computed in O(n2) time. 123456789101112 1 2 3 4 5 6 7 8 9 10 11 12 12 11 10 9 8 7 6 5 4 3 2 1 4 5 8 11 1 12 6 9 10 7

Stable Criteria Definition: A criterion is stable if and only if 𝑖=𝑜 𝑛 𝑣 𝑖 =𝑂(𝑛) where 𝑣 𝑖 = number of changes of validity on segments [0,i], [1,i], …, [i-1,i]

Stable Criteria Definition: A criterion is stable if and only if 𝑖=𝑜 𝑛 𝑣 𝑖 =𝑂(𝑛) where 𝑣 𝑖 = number of changes of validity on segments [0,i], [1,i], …, [i-1,i] Observations: Decreasing and increasing monotone criteria are stable. A conjunction or disjunction of stable criterion are stable.

Compressed Start-Stop Diagram For stable criteria the start-stop diagram can be compressed by applying run-length encoding. Examples: decreasing monotone increasing monotone 9

Computing the Compressed Start-Stop Diagram For a decreasing criterion consider the algorithm: ComputeLongestValid(crit C, traj T) Algorithm: Move two pointers i,j from n to 0 over the trajectory. For every trajectory index j the smallest index i for which 𝜏 𝑡 𝑖 , 𝑡 𝑗 satisfies the criterion C is stored. 9 i j

Computing the Compressed Start-Stop Diagram For a decreasing criterion ComputeLongestValid(crit C, traj T): Move two pointers i,j from n to 0 over the trajectory 9 i j Requires a data structure for segment [ti,tj] allowing the operations isValid, extend, and shorten, e.g., a balanced binary search tree on attribute values for range or bound criteria. Runs in O(nc(n)) time where c(n) is the time to update & query. Analogously for increasing criteria.

Computing the Compressed Start-Stop Diagram The start-stop diagram of a conjunction (or disjunction) of two stable criteria is their intersection (or union). The start-stop diagram of a negated criteria is its inverse. The corresponding compressed start-stop diagrams can be computed in O(n) time. 9

Attributes and criteria Examples of stable criteria Lower bound/Upper bound on attribute Angular range criterion Disk criterion Allow an approximate fraction of outliers …

Computing the Optimal Segmentation Observation: The optimal segmentation for [0,i] is either one segment, or an optimal sequence of segments for [0,j<i] appended with a segment [j,i], where j is an index such [j,i] is valid. Dynamic programming algorithm for each row from 0 to n find white cell with min-link That is, iteratively compute a table S[0,n] where entry S[i] for row i stores last: index of last link count: number of links so far runs in O(n2) time n 10, 4 8, 3 8, 3 7, 2 0, 1 0, 0+1=1 nil, ∞ S nil, ∞ nil, ∞ nil, ∞ nil, ∞ nil, 0

Computing the Optimal Segmentation More efficient dynamic programming algorithm for compressed diagrams Process blocks of white cells using a range query in a binary search tree T (instead of table S) storing index: row index last: index of last link count: number of links so far augmented by minimal count in subtree T [Alewijnse et al.’14]

Methodology for augmenting a data structure Choose an underlying data structure. Determine additional information to maintain. Verify that we can maintain additional information for existing data structure operations. Develop new operations. R-B tree min-count[x] theorem next slide count(block) using range query

Theorem Augment a R-B tree with field f, where f[x] depends only on information in x, left[x], and right[x] (including f[left[x]] and f[right[x]]). Then can maintain values of f in all nodes during insert and delete without affecting O(log n) performance. When we alter information in x, changes propagate only upward on the search path for x …

Computing the Optimal Segmentation More efficient dynamic programming algorithm for compressed diagrams Process blocks of white cells using a range query in a binary search tree T (instead of table S) storing index: row index last: index of last link count: number of links so far augmented by minimal count in subtree runs in O(n log n) time T [Alewijnse et al.’14]

Beyond criteria-based segmentation When is criteria-based segmentation applicable? What can we do in other cases?

Excursion: Movement Models

Movement Models Movement data: sequence of observations, e.g. (xi,yi,ti) Linear movement realistic?

Movement Models Movement data: sequence of observations, e.g. (xi,yi,ti) Linear movement realistic? Space-time Prisms

Movement Models Movement data: sequence of observations, e.g. (xi,yi,ti) Linear movement realistic? Space-time Prisms

Movement Models Movement data: sequence of observations, e.g. (xi,yi,ti) Linear movement realistic? Space-time Prisms

Movement Models Movement data: sequence of observations, e.g. (xi,yi,ti) Linear movement realistic? Space-time Prisms Random Motion Models (e.g. Brownian bridges)

Movement Models Movement data: sequence of observations, e.g. (xi,yi,ti) Linear movement realistic? Space-time Prisms Random Motion Models (e.g. Brownian bridges)

Brownian Bridges - Examples no bridges

Segment by diffusion coefficient

Summary Greedy algorithm for decreasing monotone criteria O(n) or O(n log n) time [M.Buchin, Driemel, van Kreveld, Sacristan, 2010] Case Study: Geese Migration [M.Buchin, Kruckenberg, Kölzsch, 2012] Start-stop diagram for arbitrary criteria O(n2) time [Aronov, Driemel, van Kreveld, Löffler, Staals, 2012] Compressed start-stop diagram for stable criteria O(n log n) time [Alewijnse, Buchin, Buchin, Sijben, Westenberg, 2014]

References References S. Alewijnse, T. Bagautdinov, M. de Berg, Q. Bouts, A. ten Brink, K. Buchin and M. Westenberg. Progressive Geometric Algorithms. SoCG, 2014. M. Buchin, A. Driemel, M. J. van Kreveld and V. Sacristan. Segmenting trajectories: A framework and algorithms using spatiotemporal criteria. Journal of Spatial Information Science, 2011. B. Aronov, A. Driemel, M. J. Kreveld, M. Loffler and F. Staals. Segmentation of Trajectories for Non-Monotone Criteria. SODA, 2013. M. Buchin, H. Kruckenberg and A. Kölzsch. Segmenting Trajectories based on Movement States. SDH, 2012.