Evaluating Queries over Route Collections Panagiotis Bouros, PhD defense.

Outline  Introduction  Route collections examples  Query evaluation challenges  Evaluating path queries  Dynamic Pickup and Delivery with Transfers  Most Trusted Near Shortest Path  Conclusions  Future work June 30, 2011PhD defense

Routes as data
- Several applications involve storing and querying large volumes of sequential data
- Route: a sequence of spatial locations
  - POIs, waypoints, etc.
- Route collection
  - Routes as first-class citizens
  - Frequently updated
    - New routes added
    - Existing routes deleted or modified

Example 1: Sightseeing and activities
- People visit Athens
  - GPS devices
  - Track sightseeing
  - Touristic routes
- Route collections online
  - www.ShareMyRoutes.com
  - www.TravelByGPS.com
- Updates
  - Add new interesting routes
  - Remove existing routes that are no longer interesting

Example 1: Sightseeing and activities
- Traditional graph queries
  - REACH: Is there a sequence of POIs from Academy to Zappeion?
  - PATH: Find a sequence of POIs from Academy to Zappeion
  - PATH is more general
- Graph-based solutions
  - Searching
    - Low maintenance cost
    - Slow
  - Compressing the transitive closure (TC)
    - Fast
    - High maintenance cost
- This thesis
  - Combine advantages
  - Reachability within routes

Example 2: Pickup and delivery  A courier company offering pickup and delivery services  Static plan  Set of requests  Transfers between vehicles  Collection of vehicles routes  Pickup and Delivery with Transfers  Create static plan  Updates  Ad-hoc requests  Modify vehicle routes to satisfy new requests June 30, 2011PhD defense

Example 2: Pickup and delivery
- Query
  - Pick up an object from n_s and deliver it at n_t
  - Minimize company’s expenses
  - dynamic Pickup and Delivery with Transfers (dPDPT)
- Non-graph solution
  - Two-phase local search
- This thesis
  - First work on dPDPT
  - Cost metrics
    - Company’s viewpoint: extra traveling or waiting time
    - Customer’s viewpoint: delivery time
  - Dynamic two-criterion shortest path problem

Example 3: Driving data  Group of people driving through the city  Track their driving  Vehicle routes  Sequence of road network intersections  Collection of vehicle routes  A trusted and familiar way of driving  People consult collection  Updates  New routes added - driving to unknown locations  Existing routes modified – new ways to reach known locations June 30, 2011PhD defense

Example 3: Driving data  Query  Driving directions from n s to n t  Graph-based solution  Shortest path  Time-dependent shortest path  This thesis  Capture how people actually drive  Tend to reuse roads  Consult friends  Prefer a trusted over the fastest way  New graph query  Most Trusted Near Shortest Path  Cost metrics  Unknown time, time outside routes  Length, total time  Path with lowest unknown time and length at most a times larger than SP June 30, 2011PhD defense

Example 3: Driving data  Query  Driving directions from n s to n t  Graph-based solution  Shortest path  Time-dependent shortest path  This thesis  Capture how people actually drive  Tend to reuse roads  Consult friends  Prefer a trusted over the fastest way  Cost metrics  Unknown time, time outside routes  Length, total time  New graph query  Most Trusted Near Shortest Path  Path with lowest unknown time and length at most a times larger than SP June 30, 2011PhD defense

Query evaluation  Frequent updated route collections available  Challenge for query evaluation  Path queries  Sequence of locations contained in routes  Evaluate queries directly on routes  Is it faster?  Route as a set of precomputed answers June 30, 2011PhD defense

Outline  Introduction  Route collections examples  Query evaluation challenges  Evaluating path queries  Dynamic Pickup and Delivery with Transfers  Most Trusted Near Shortest Path  Conclusions  Future work June 30, 2011PhD defense

Evaluating path queries

Evaluating PATH queries
- Query
  - PATH(n_s, n_t)
- Solution
  - Answer: a sequence of locations in routes from n_s to n_t
  - Indexing route collections
  - Route traversal paradigm
  - Link traversal paradigm
  - Methods for index maintenance

Indexing route collections
- R-Index
  - Associates each location of the collection with the routes containing it
- T-Index
  - Captures all possible transitions between routes via links
  - Links are shared nodes

Example route collection:
  r1 = (d, f, y, t, s)
  r2 = (v, b, a, c, d, x)
  r3 = (s, w, a, g)
  r4 = (b, z, c, f)
  r5 = (t, s)

R-Index (each entry reads route:position):
  location  routes[] list
  a         r2:3, r3:3
  s         r1:5, r3:1, r5:2
  t         r1:4, r5:1
  ...       ...

T-Index (each entry reads other route, link:position in this route:position in the other route):
  route  trans[] list
  r2     r1,d:5:1  r3,a:3:3  r4,b:2:1  r4,c:4:3
  r3     r1,s:1:5  r2,a:3:3  r5,s:1:2
  ...    ...
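The two indices can be sketched in a few lines of Python over the five example routes. This is only an in-memory illustration, not the disk-based inverted files of the thesis; the function names `build_r_index` and `build_t_index` and the tuple layout are assumptions made for this sketch, and each location is assumed to occur at most once per route.

```python
from collections import defaultdict

# Example route collection from the slide.
ROUTES = {
    "r1": ["d", "f", "y", "t", "s"],
    "r2": ["v", "b", "a", "c", "d", "x"],
    "r3": ["s", "w", "a", "g"],
    "r4": ["b", "z", "c", "f"],
    "r5": ["t", "s"],
}

def build_r_index(routes):
    """Map each location to the (route id, 1-based position) pairs containing it."""
    r_index = defaultdict(list)
    for rid, locations in routes.items():
        for pos, loc in enumerate(locations, start=1):
            r_index[loc].append((rid, pos))
    return r_index

def build_t_index(routes, r_index):
    """Map each route to (other route, link, position here, position there) tuples."""
    t_index = defaultdict(list)
    for rid, locations in routes.items():
        for pos, loc in enumerate(locations, start=1):
            for other, other_pos in r_index[loc]:
                if other != rid:  # a link is a location shared with another route
                    t_index[rid].append((other, loc, pos, other_pos))
        t_index[rid].sort()
    return t_index

r_index = build_r_index(ROUTES)
t_index = build_t_index(ROUTES, r_index)
print(r_index["s"])   # [('r1', 5), ('r3', 1), ('r5', 2)]
print(t_index["r3"])  # [('r1', 's', 1, 5), ('r2', 'a', 3, 3), ('r5', 's', 1, 2)]
```

The printed lists agree with the R-Index entry for s and the T-Index entry for r3 shown on the slide.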

Traversal paradigms  Route traversal paradigm  Traverse collection similar to depth-first search  For each route, push all locations after current n in search stack  Access indices on routes to terminate search  RTS: current location and target on same route (R-Index)  RTST: current location on route connected to route of target (T-Index)  Link traversal paradigm  Traverse collection similar to depth-first search on links  R-Index+  For each route, push first link after current n in search stack  Access indices to create target list T  LTS: routes containing target (R-Index+)  LTST: routes connected to routes containing target (T-Index)  LTS-k: routes connected to routes containing target via first k links before target (R-Index+) June 30, 2011PhD defense

Traversal paradigms (cont’d)
- Expand path (s)
  - Consider every location after s in routes r1 and r3
  - Route trav.: PUSH w, a, g
  - Link trav.: PUSH a
Routes: r1 = (d, f, y, t, s), r2 = (v, b, a, c, d, x), r3 = (s, w, a, g), r4 = (b, z, c, f), r5 = (t, s)

Traversal paradigms (cont’d)
- RTS, 5th iteration
  - POP d; r1 contains d before t
- RTST, 3rd iteration
  - POP a; r2 connected with r1 containing t via d
- LTS, T_LTS = {r1, r5}, 4th iteration
  - POP f; r1 contains f before t
- LTST, T_LTST = {r1, r2, r3, r4, r5}, 2nd iteration
  - POP a; r2 connected with r1 containing t via link d
- LTS-1, T_LTS-1 = {r1, r4, r5}, 3rd iteration
  - POP c; r2 connected with r1 containing t via link d
Routes: r1 = (d, f, y, t, s), r2 = (v, b, a, c, d, x), r3 = (s, w, a, g), r4 = (b, z, c, f), r5 = (t, s)
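The route traversal paradigm with the RTS termination check can be illustrated with a minimal Python sketch for PATH(s, t) on the example routes. It is a simplified in-memory reading of the slides, not the thesis implementation: the name `rts_path`, the visited set, and the way paths are carried on the stack are assumptions, and each location is assumed to occur at most once per route.

```python
from collections import defaultdict

ROUTES = {
    "r1": ["d", "f", "y", "t", "s"],
    "r2": ["v", "b", "a", "c", "d", "x"],
    "r3": ["s", "w", "a", "g"],
    "r4": ["b", "z", "c", "f"],
    "r5": ["t", "s"],
}

def build_r_index(routes):
    """Map each location to the (route id, 1-based position) pairs containing it."""
    r_index = defaultdict(list)
    for rid, locations in routes.items():
        for pos, loc in enumerate(locations, start=1):
            r_index[loc].append((rid, pos))
    return r_index

def rts_path(routes, source, target):
    """Depth-first route traversal; stops when the current location and the
    target lie on the same route with the target not before it (RTS check)."""
    r_index = build_r_index(routes)
    target_pos = dict(r_index[target])        # route id -> position of target
    stack = [(source, [source])]              # (location, path that reaches it)
    visited = {source}
    while stack:
        loc, path = stack.pop()
        # RTS termination check via the R-Index.
        for rid, pos in r_index[loc]:
            tpos = target_pos.get(rid)
            if tpos is not None and tpos >= pos:
                return path + routes[rid][pos:tpos]
        # Expansion: push every location after loc on each route containing it.
        for rid, pos in r_index[loc]:
            prefix = path
            for nxt in routes[rid][pos:]:
                prefix = prefix + [nxt]
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, prefix))
    return None

print(rts_path(ROUTES, "s", "t"))  # ['s', 'w', 'a', 'c', 'd', 'f', 'y', 't']
```

With this push order the search pops s, g, a, x and then d, so it terminates on the fifth pop because r1 contains d before t, in line with the RTS case on the slide.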

Index maintenance
- Indices as inverted files on disk
- Lazy updates
  - Buffering phase
    - Update main memory indices
  - Flushing phase
    - Propagate changes to disk
- Insertions
  - Buffering: mark new or changed entries in lists
  - Flushing: merge main memory information with disk-based indices
- Deletions
  - No buffering: keep a list of routes deleted since last flushing
  - Flushing: rebuild affected lists
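The buffering and flushing phases can be sketched for an R-Index-like structure as follows. This is a toy model under stated assumptions: a plain dictionary stands in for the disk-resident inverted file, the class name `LazyRIndex` and its methods are hypothetical, and route ids are assumed never to be reused between flushes.

```python
from collections import defaultdict

class LazyRIndex:
    """Toy lazy-update index: inserts are buffered, deletes are only recorded."""

    def __init__(self):
        self.disk = defaultdict(list)    # stand-in for the on-disk inverted file
        self.buffer = defaultdict(list)  # main-memory entries not yet flushed
        self.deleted = set()             # routes deleted since the last flush

    def insert_route(self, rid, locations):
        # Buffering phase: only the main-memory lists change.
        for pos, loc in enumerate(locations, start=1):
            self.buffer[loc].append((rid, pos))

    def delete_route(self, rid):
        # No buffering for deletions: just remember the route id.
        self.deleted.add(rid)

    def lookup(self, loc):
        # Queries see disk and buffer entries, minus deleted routes.
        entries = self.disk.get(loc, []) + self.buffer.get(loc, [])
        return [(rid, pos) for rid, pos in entries if rid not in self.deleted]

    def flush(self):
        # Flushing phase, insertions: merge buffered entries into the disk lists.
        for loc, entries in self.buffer.items():
            self.disk[loc].extend(entries)
        self.buffer.clear()
        # Flushing phase, deletions: rebuild the lists touched by deleted routes.
        if self.deleted:
            for loc in list(self.disk):
                kept = [e for e in self.disk[loc] if e[0] not in self.deleted]
                if kept:
                    self.disk[loc] = kept
                else:
                    del self.disk[loc]
            self.deleted.clear()

idx = LazyRIndex()
idx.insert_route("r5", ["t", "s"])
print(idx.lookup("t"))  # [('r5', 1)] is visible before any flush
```

The point of the design is that updates stay cheap between flushes, while lookups remain correct by combining the three structures.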

Experimental analysis  Rival: DFS, depth-first search over links  Datasets  Synthetic route collections  Vary |R| = {20K, 50K, 100K, 200K, 500K}  Vary |Lr| = {3, 5, 10, 30, 50}  Vary |N| = {20K, 50K, 100K, 200K, 500K}  Vary α = {0.2, 0.4, 0.6, 0.8, 1}  Experiments  Index construction  Query evaluation (queries with/without answer)  RTS, RTST Vs LTS  DFS Vs LTS, LTS-k, LTST  Index maintenance June 30, 2011PhD defense

RTS, RTST vs. LTS: execution time [figure]

DFS vs. LTS, LTS-k, LTST: execution time [figure]

Dynamic Pickup and Delivery with Transfers

Solving dPDPT  Query  dPDPT(n s,n t )  Solution  Modify static plan  4 modifications, called actions, allowed with/without detours  Pickup, delivery, transfer, transport  A sequence of actions, path p  Operational cost Op  Customer cost Cp  Dynamic plan graph  All possible actions  Answer: path p that primarily minimizes Op, secondarily Cp  Algorithms SP and SPM June 30, 2011PhD defense

Solving dPDPT (cont’d)

Solving dPDPT (cont’d)
- If Arr_j^b < Cp < Dep_j^b
- If Cp < Arr_j^b
- If Cp > Dep_j^b

Solving dPDPT (cont’d)

The SP and SPM algorithms  The SP algorithm  Dynamic plan graph violates subpath optimality => path enumeration  Label for each path to V i a  At each iteration select label with lowest combined cost  Compute candidate answer – upper bound  Prune search space  Terminate search  The SPM algorithm  Modified dynamic plan graph  Break Op into Op * and Op R  Subpath optimality  Extends SP  Label for each path to V i a  Most “promising” paths to every vertex June 30, 2011PhD defense

The SP and SPM algorithms (cont’d)
- INITIALIZATION
  - Pickup E_s1^a and E_s3^b
  - SP: Q = {, }
  - SPM: Q = {, }
- p_cand = null
- T = 6

The SP and SPM algorithms (cont’d)
- POP
  - Transport E_12^a
  - SP: Q = {, }
  - SPM: Q = {, }
- p_cand = null
- T = 6

The SP and SPM algorithms (cont’d)
- POP
  - Transfer E_25^ac
  - Arr_5^c = 10 < 26 < Dep_5^c = 40
  - SP: Q = {, }
  - SPM: Q = {, }
- p_cand = null
- T = 6

The SP and SPM algorithms (cont’d)
- POP and
  - Transport E_34^b and transfer E_46^bc
  - 46 > Dep_6^c = 40
  - SP: Q = {, }
  - SPM: Q = {, }
- p_cand = null
- T = 6

The SP and SPM algorithms (cont’d)
- POP
  - Transport E_56^c
  - SP: Q = {, }
  - SPM: Q = {, }
- p_cand = null
- T = 6

The SP and SPM algorithms (cont’d)
- POP
  - Transport E_56^c
  - SP: Q = {, }
  - SPM: Q = { }
- p_cand = null
- T = 6

The SP and SPM algorithms (cont’d)
- POP
  - Transport E_67^c
  - SP: Q = {, }
  - SPM: Q = { }
- p_cand = null
- T = 6

The SP and SPM algorithms (cont’d)
- POP
  - Delivery E_7e^c
  - FOUND p_cand
  - SP: Q = { }
  - SPM: Q = {} END
- p_cand = (V_s, V_1^a, V_2^a, V_5^c, V_6^c, V_7^c)
- Op_cand = 24
- Cp_cand = 59
- T = 6

The SP and SPM algorithms (cont’d)
- POP
  - Op_cand = 24
  - SP: END
- T = 6

Experimental analysis  Rival: two-phase method, HTT  Cheapest insertion for pickup and delivery location, for every new request  After k requests perform tabu search  Datasets  Road networks, OL with 6105 locations, ATH with 22601 locations  Static plan with HTT method  Vary |Reqs| = {200, 500, 1000, 2000}  Vary |R| = {100, 250, 500, 750, 1000}  Stored on disk  Experiments  500 dPDPT requests  HTT1, HTT3, HTT5  Measure  Total operational cost increase  Total execution time June 30, 2011PhD defense

Vary |Reqs|, OL road network: operational cost increase and execution time [figures]

Vary |R|, OL road network: operational cost increase and execution time [figures]

Most Trusted Near Shortest Path

Identifying MTNSP
- Query
  - MTNSP(n_s, n_t, α)
- Solution
  - Known graph
  - Unknown graph
  - Two costs for a path p
    - Unknown time Up
    - Length Lp
  - Answer: path p with lowest unknown time Up and length Lp ≤ α · d_N(n_s, n_t)
- Offline processing phase
  - Lipschitz embedding
- Online processing phase
  - The TRUSTME algorithm
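To make the query definition concrete, the following brute-force sketch enumerates all simple paths on a tiny hypothetical graph and picks the one with the lowest unknown time among those whose length is at most α times the shortest-path length. It only illustrates the definition and is not the TRUSTME algorithm; the graph, the edge weights, the set of known edges, and all function names are assumptions, and unknown time is taken here as the total length of edges outside the known set.

```python
import heapq

def shortest_distance(graph, source, target):
    """Dijkstra over graph[u][v] = edge length; returns inf if unreachable."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist[u]:
            continue
        for v, w in graph.get(u, {}).items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")

def simple_paths(graph, source, target, path=None):
    """Yield every simple path from source to target."""
    path = [source] if path is None else path
    if source == target:
        yield path
        return
    for nxt in graph.get(source, {}):
        if nxt not in path:
            yield from simple_paths(graph, nxt, target, path + [nxt])

def mtnsp_bruteforce(graph, known_edges, source, target, alpha):
    """Return (Up, Lp, path) with lowest Up subject to Lp <= alpha * d_N, or None."""
    limit = alpha * shortest_distance(graph, source, target)
    best = None
    for path in simple_paths(graph, source, target):
        hops = list(zip(path, path[1:]))
        length = sum(graph[u][v] for u, v in hops)
        unknown = sum(graph[u][v] for u, v in hops if (u, v) not in known_edges)
        if length <= limit and (best is None or (unknown, length) < best[:2]):
            best = (unknown, length, path)
    return best

# Hypothetical network: the direct edge is shortest but unknown,
# the detour through a is longer but entirely inside known routes.
graph = {"s": {"t": 10, "a": 6}, "a": {"t": 6}, "t": {}}
known = {("s", "a"), ("a", "t")}
print(mtnsp_bruteforce(graph, known, "s", "t", 1.3))  # (0, 12, ['s', 'a', 't'])
print(mtnsp_bruteforce(graph, known, "s", "t", 1.1))  # (10, 10, ['s', 't'])
```

With α = 1.3 the trusted detour fits within the length budget of 13 and wins; with α = 1.1 the budget drops to about 11 and only the shortest path qualifies.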

The known and unknown graphs [figure: network graph, known subgraph, unknown subgraph]

Offline processing phase
- Embedding
  - For each node n in network graph, precompute shortest paths to every node n_k in known graph
  - Store
    - d_N(n, n_k)
    - U_k: lowest unknown time
- Compute bounds
  - d_N^≥(n_s, n_t), d_N^≤(n_s, n_t)
  - U_p^≥, U_p^≤ for p(n_s, …, n_t)
  - Example: 12 ≤ d_N(n_2, n_t) ≤ 14

d_N (rows: network nodes, columns: known-graph nodes)
         n_s  n_1  n_5  n_6  n_7
  n_s      0    3    9   11   18
  n_1      3    0    6    8   15
  n_2      8    5    2    3   10
  ...    ...  ...  ...  ...  ...
  n_t     20   17   14   11    4

Up (rows: network nodes, columns: known-graph nodes)
         n_s  n_1  n_5  n_6  n_7
  n_s      0    0    0    8    8
  n_1      0    0    0    8    8
  n_2      5    5    2    3    3
  ...    ...  ...  ...  ...  ...
  n_t     17 1444
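The bound 12 ≤ d_N(n_2, n_t) ≤ 14 can be reproduced from the stored d_N values with the usual triangle-inequality bounds of a Lipschitz embedding. The sketch below assumes symmetric distances and uses the five known-graph nodes n_s, n_1, n_5, n_6, n_7 as reference nodes; the function name `distance_bounds` and the dictionary layout are illustrative choices, not taken from the thesis.

```python
# d_N values from the slide: distances from each node to the
# known-graph nodes (n_s, n_1, n_5, n_6, n_7), in that order.
D_N = {
    "n_s": [0, 3, 9, 11, 18],
    "n_1": [3, 0, 6, 8, 15],
    "n_2": [8, 5, 2, 3, 10],
    "n_t": [20, 17, 14, 11, 4],
}

def distance_bounds(d, u, v):
    """Triangle-inequality bounds on d_N(u, v) over all reference nodes n_k:
    lower = max_k |d(u, n_k) - d(v, n_k)|, upper = min_k d(u, n_k) + d(n_k, v)."""
    lower = max(abs(a - b) for a, b in zip(d[u], d[v]))
    upper = min(a + b for a, b in zip(d[u], d[v]))
    return lower, upper

print(distance_bounds(D_N, "n_2", "n_t"))  # (12, 14)
print(distance_bounds(D_N, "n_s", "n_t"))  # (20, 20)
```

For (n_s, n_t) both bounds coincide at 20 because n_s is itself a reference node, which agrees with the value L = 20 used in the online phase example.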

Online processing phase  The TRUSTME algorithm  Label-setting  Label for each path to n  Only the labels of most “promising” paths to every node n  At each iteration select label with lowest Lp  Compute an upper bound of the unknown time of the answer  Prune search space  Terminate search  Expansion:  Exploit d ≤ N, d ≥ N, U ≤ p,U ≥ p to prune search space June 30, 2011PhD defense

Online processing phase (cont’d)
- INITIALIZATION
  - Q = {<n_s, (n_s), 0, 0>}
- L = d_N^≤(n_s, n_t) = 20
- U = null
- p_cand = null
- α = 1.3

Online processing phase (cont’d)
- POP
  - Edges (n_1, n_s), (n_1, n_2), (n_1, n_5)
  - Edge (n_1, n_6)
    - p(n_s, n_1, n_6), Lp = 17
    - Lp + d_N^≥(n_6, n_t) = 17 + 11 = 28 > α · L = 26
    - Discard p
- L = d_N^≤(n_s, n_t) = 20
- U = null
- p_cand = null
- α = 1.3
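The length-based pruning test in this step is a one-line check. The sketch below reproduces its arithmetic, reading d_N^≥ as a lower bound on the remaining distance and L as an upper bound on the shortest-path length; that reading of the notation, like the function name `exceeds_length_budget`, is an assumption.

```python
def exceeds_length_budget(path_length, lower_bound_to_target, alpha, shortest_upper_bound):
    """True if even the most optimistic completion of the partial path
    would be longer than alpha times the shortest-path bound L."""
    return path_length + lower_bound_to_target > alpha * shortest_upper_bound

# Values from the slide: Lp = 17, bound from n_6 to n_t = 11, alpha = 1.3, L = 20.
print(exceeds_length_budget(17, 11, 1.3, 20))  # True, since 28 > 26
```

Because the check uses a lower bound on the remaining distance, a discarded path could never have been completed within the length constraint.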

Online processing phase (cont’d)
- POP
  - Lp = 18 < d_N(n_s, n_t)
  - Lp + d_N(n_7, n_t) = 22 < 1.3 · Lp = 23.4
  - FOUND upper bound for the unknown time of answer
- L = d_N^≤(n_s, n_t) = 20
- U = 12
- p_cand = null
- α = 1.3

Online processing phase (cont’d)
- POP
  - Up > U = 12
  - Not an answer
- L = d_N^≤(n_s, n_t) = 20
- U = 12
- p_cand = null
- α = 1.3

Online processing phase (cont’d)
- POP
  - Q = {}
  - END
- p_cand = (n_s, n_1, n_5, n_2, n_6, n_7, n_4, n_t)
- Lp_cand = 25
- Up_cand = 9
- α = 1.3

Experimental analysis  Rival: label setting SP-EUCLIDEAN  First computing shortest path  Considering euclidean distance as lower bound  Datasets  Road networks, OL with 6105 locations, TG with 18263 locations  Familiar neighborhoods  Vary |H| = {3, 4,,5, 10, 30}  Vary α = {1.1, 1.2, 1.3, 1.4, 1.5}  Three strategies for creating known subgraph  S1: all locations in neighborhoods  S2: all locations on shortest path between neighborhoods centers  S3: combination  Stored on disk June 30, 2011PhD defense

Strategy S1: execution time [figure]

Strategy S2: execution time [figure]

Conclusions
- Framework for evaluating path queries on frequently updated route collections
  - Indexing schemes
  - Evaluation algorithms
- Three query cases
  - PATH query on large disk-resident collections
  - dynamic Pickup and Delivery with Transfers
  - Most Trusted Near Shortest Path

Future work
- Trip planning or optimal-sequence-like queries
  - Find a path passing through a Museum, then a Stadium and finally a Restaurant
- Combine query evaluation with keyword search
  - Find a path passing through a Restaurant relevant to “sea food, lobster”
- Adopt ideas from PATH query for dPDPT
  - Exploit R-Index/T-Index to identify a candidate answer sooner
- Additional constraints for dPDPT
  - Vehicle capacity, time windows
- Handle updates on embedding scheme for MTNSP
  - Inverted index on precomputed shortest paths
- Complexity analysis for dPDPT and MTNSP

Publications
- PATH
  - Evaluating Path Queries over Frequently Updated Route Collections, TKDE’11
  - Evaluating Path Queries over Route Collections, ICDE’10-PhD
  - Evaluating Reachability Queries Over Path Collections, SSDBM’09
  - Evaluating "Find a Path" Reachability Queries, ECAI’08-STRWS
- dPDPT
  - Efficient Dynamic Pickup and Delivery with Transfers, TR KDBSL
  - Dynamic Pickup and Delivery with Transfers, SSTD’11
- MTNSP
  - Most Trusted Near-Shortest Path, TR KDBSL

Other works
- Set-values
  - Efficient Answering of Set Containment Queries for Skewed Item Distributions, EDBT’11
- Skyline queries
  - Caching Dynamic Skyline Queries, SSDBM’08
- Managing and personalizing topic directories
  - Mining User Navigation Patterns for Personalizing Topic Directories, CIKM’07-WIDM
  - PatMan: A Visual Database System to Manipulate Path Patterns and Data in Hierarchical Catalogs, AVIVDiLib’05
  - PatManQL: A language to manipulate patterns and data in hierarchical catalogs, EDBT’04-PaRMa

Thank you!
