Evaluating Queries over Route Collections

Evaluating Queries over Route Collections Panagiotis Bouros, PhD defense

Outline • Introduction • Route collections examples • Query evaluation challenges • Evaluating path queries • Dynamic Pickup and Delivery with Transfers • Most Trusted Near Shortest Path • Conclusions • Future work

Routes as data • Several applications involve storing and querying large volumes of sequential data • Route, a sequence of spatial locations • POIs, waypoints etc. • Route collection • Routes as first-class citizens • Frequently updated • New routes added • Existing routes deleted or modified

Example 1: Sightseeing and activities • People visit Athens • GPS devices • Track sightseeing • Touristic routes • Route collections online • www.ShareMyRoutes.com • www.TravelByGPS.com • Updates • Add new interesting routes • Remove existing routes that are no longer interesting

Example 1: Sightseeing and activities • Traditional graph queries • REACH: Is there a sequence of POIs from Academy to Zappeion? • PATH: Find a sequence of POIs from Academy to Zappeion • PATH more general • Graph-based solution • Searching • Low maintenance cost • Slow • Compressing TC • Fast • High maintenance cost • This thesis • Combine advantages • Reachability within routes

Example 2: Pickup and delivery • A courier company offering pickup and delivery services • Static plan • Set of requests • Transfers between vehicles • Collection of vehicle routes • Pickup and Delivery with Transfers • Create static plan • Updates • Ad-hoc requests • Modify vehicle routes to satisfy new requests

Example 2: Pickup and delivery • Query • Pick up object at ns and deliver it at nt • Minimize company’s expenses • dynamic Pickup and Delivery with Transfers (dPDPT) • Non-graph solution • Two-phase local search • This thesis • First work on dPDPT • Cost metrics • Company’s viewpoint, extra traveling or waiting time • Customer’s viewpoint, delivery time • Dynamic two-criterion shortest path problem

Example 3: Driving data • Group of people driving through the city • Track their driving • Vehicle routes • Sequence of road network intersections • Collection of vehicle routes • A trusted and familiar way of driving • People consult collection • Updates • New routes added (driving to unknown locations) • Existing routes modified (new ways to reach known locations)

Example 3: Driving data • Query • Driving directions from ns to nt • Graph-based solution • Shortest path • Time-dependent shortest path • This thesis • Capture how people actually drive • Tend to reuse roads • Consult friends • Prefer a trusted over the fastest way • Cost metrics • Unknown time, time outside routes • Length, total time • New graph query • Most Trusted Near Shortest Path • Path with lowest unknown time and length at most a times that of the shortest path (SP)
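The Most Trusted Near Shortest Path definition above can be illustrated with a small brute-force sketch. This is not the thesis's algorithm: it simply enumerates simple paths on a toy graph, keeps those whose length is at most a times the shortest-path length, and returns the one with the lowest unknown time. The graph layout, the `in_route` flag on each edge, the function names, and the tie-break on length are all assumptions made for illustration.

```python
import heapq

def shortest_length(graph, src, dst):
    """Plain Dijkstra on total travel time; returns None if dst is unreachable."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, time, _in_route in graph.get(u, []):
            nd = d + time
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return None

def mtnsp_bruteforce(graph, src, dst, a):
    """Return (unknown_time, length, path) or None.

    graph: node -> list of (next_node, time, in_route); in_route is True when
    the edge is covered by some route of the collection (hypothetical encoding).
    """
    sp = shortest_length(graph, src, dst)
    if sp is None:
        return None
    bound = a * sp
    best = None

    def dfs(u, path, length, unknown):
        nonlocal best
        if length > bound:
            return  # violates the "near shortest" constraint
        if u == dst:
            # Assumed tie-break: among equal unknown times prefer the shorter path.
            if best is None or (unknown, length) < best[:2]:
                best = (unknown, length, path)
            return
        for v, time, in_route in graph.get(u, []):
            if v not in path:
                dfs(v, path + [v], length + time,
                    unknown + (0 if in_route else time))

    dfs(src, [src], 0, 0)
    return best

# Toy network: the direct road is fastest but unknown; the detour follows routes.
graph = {
    "s": [("t", 10, False), ("a", 6, True)],
    "a": [("t", 6, True)],
}
trusted = mtnsp_bruteforce(graph, "s", "t", 1.5)
fastest = mtnsp_bruteforce(graph, "s", "t", 1.0)
```

With a = 1.5 the detour over known roads qualifies and wins on unknown time; with a = 1.0 only the shortest path itself is admissible.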

Query evaluation • Frequently updated route collections available • Challenge for query evaluation • Path queries • Sequence of locations contained in routes • Evaluate queries directly on routes • Is it faster? • Route as a set of precomputed answers

Evaluating path queries

Evaluating PATH queries • Query • PATH(ns,nt) • Solution • Answer: a sequence of locations in routes from ns to nt • Indexing route collections • Route traversal paradigm • Link traversal paradigm • Methods for index maintenance

Indexing route collections • R-Index • Associates each location of the collection with the routes containing it • T-Index • Captures all possible transitions between routes via links • Links are shared nodes
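The two indices can be pictured with a minimal in-memory sketch. It assumes a route collection stored as a dictionary from route id to a list of locations; the function names and the exact entry layout are hypothetical, and the T-Index here ignores direction and positions, which a real implementation would likely need.

```python
from collections import defaultdict

def build_r_index(routes):
    """R-Index sketch: location -> list of (route_id, position) entries."""
    r_index = defaultdict(list)
    for rid, locations in routes.items():
        for pos, loc in enumerate(locations):
            r_index[loc].append((rid, pos))
    return r_index

def build_t_index(r_index):
    """T-Index sketch: route_id -> set of (link, other_route_id) transitions.

    A link is a location shared by at least two routes.
    """
    t_index = defaultdict(set)
    for loc, entries in r_index.items():
        rids = {rid for rid, _ in entries}
        if len(rids) > 1:
            for rid in rids:
                for other in rids - {rid}:
                    t_index[rid].add((loc, other))
    return t_index

# Toy collection (assumed data, not taken from the slides).
routes = {
    "r1": ["s", "a", "d", "t"],
    "r2": ["b", "a", "c"],
    "r3": ["c", "e"],
}
r_index = build_r_index(routes)
t_index = build_t_index(r_index)
```

Here location a is a link between r1 and r2, and c is a link between r2 and r3, so r2 has transitions to both other routes.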

Traversal paradigms • Route traversal paradigm • Traverse collection similar to depth-first search • For each route, push all locations after current n in search stack • Access indices on routes to terminate search • RTS: current location and target on same route (R-Index) • RTST: current location on route connected to route of target (T-Index) • Link traversal paradigm • Traverse collection similar to depth-first search on links • R-Index+ • For each route, push first link after current n in search stack • Access indices to create target list T • LTS: routes containing target (R-Index+) • LTST: routes connected to routes containing target (T-Index) • LTS-k: routes connected to routes containing target via first k links before target (R-Index+)
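A rough sketch of the route traversal paradigm with the RTS termination check might look as follows. It assumes routes are lists without repeated locations and that the R-Index stores (route id, position) pairs; the function names are hypothetical and the code only approximates what the slide describes. The returned sequence is a valid walk over the routes and may revisit a location.

```python
from collections import defaultdict

def build_r_index(routes):
    """location -> list of (route_id, position) entries."""
    r_index = defaultdict(list)
    for rid, locations in routes.items():
        for pos, loc in enumerate(locations):
            r_index[loc].append((rid, pos))
    return r_index

def rts_path(routes, r_index, source, target):
    """Depth-first route traversal; returns a list of locations or None."""
    if source == target:
        return [source]
    stack = [(source, [source])]
    visited = {source}
    while stack:
        node, path = stack.pop()
        for rid, pos in r_index.get(node, []):
            route = routes[rid]
            # RTS termination: target lies on the same route, after node.
            for t_rid, t_pos in r_index.get(target, []):
                if t_rid == rid and t_pos > pos:
                    return path + route[pos + 1:t_pos + 1]
            # Otherwise push every location that follows node on this route.
            for nxt_pos in range(pos + 1, len(route)):
                nxt = route[nxt_pos]
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, path + route[pos + 1:nxt_pos + 1]))
    return None

# Toy collection (assumed data).
routes = {
    "r1": ["s", "a", "b"],
    "r2": ["a", "c", "d"],
    "r3": ["d", "t"],
}
r_index = build_r_index(routes)
found = rts_path(routes, r_index, "s", "t")
```

The link traversal paradigm would differ in the push step: instead of every following location, only the first link after the current location on each route would be pushed.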

Traversal paradigms (cont’d) • Expand path (s) • Consider every location after a in routes r1 and r3 • Route traversal: PUSH w,a,g • Link traversal: PUSH a

Traversal paradigms (cont’d) • RTS, 5th iteration • POP d, r1 contains d before t • RTST, 3rd iteration • POP a, r2 connected with r1 containing t via d • LTS, T_LTS = {r1, r5}, 4th iteration • POP f, r1 contains f before t • LTST, T_LTST = {r1,r2,r3,r4,r5}, 2nd iteration • POP a, r2 connected with r1 containing t via link d • LTS-1, T_LTS-1 = {r1,r4,r5}, 3rd iteration • POP c, r2 connected with r1 containing t via link d

Index maintenance • Indices as inverted files on disk • Lazy updates • Buffering phase • Update main memory indices • Flushing phase • Propagate changes to disk • Insertions • Buffering: mark new entries or changed entries in lists • Flushing: merge main memory information with disk-based indices • Deletions • No buffering: a list of deleted routes since last flushing • Flushing: rebuilding affected lists
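The lazy update scheme can be sketched with a toy inverted index in which a plain dictionary stands in for the disk-resident lists. The class and method names are hypothetical, and the flush step scans every list for simplicity, whereas the slide suggests rebuilding only the affected ones.

```python
class LazyRIndex:
    """Toy lazily maintained inverted index: location -> list of route ids."""

    def __init__(self):
        self.disk = {}        # stand-in for the on-disk inverted lists
        self.buffer = {}      # new entries held in main memory
        self.deleted = set()  # routes deleted since the last flush

    def insert_route(self, rid, locations):
        # Buffering phase: record new entries in memory only.
        for loc in locations:
            self.buffer.setdefault(loc, []).append(rid)

    def delete_route(self, rid):
        # Deletions are not buffered per list; only the route id is remembered.
        self.deleted.add(rid)

    def routes_containing(self, loc):
        # Queries see disk lists plus buffered entries, minus deleted routes.
        merged = self.disk.get(loc, []) + self.buffer.get(loc, [])
        return [r for r in merged if r not in self.deleted]

    def flush(self):
        # Flushing phase: merge buffered entries, then rebuild lists
        # so that deleted routes disappear from the "disk" copy.
        for loc, rids in self.buffer.items():
            self.disk.setdefault(loc, []).extend(rids)
        if self.deleted:
            for loc in list(self.disk):
                kept = [r for r in self.disk[loc] if r not in self.deleted]
                if kept:
                    self.disk[loc] = kept
                else:
                    del self.disk[loc]
        self.buffer.clear()
        self.deleted.clear()

idx = LazyRIndex()
idx.insert_route("r1", ["a", "b"])
idx.insert_route("r2", ["b", "c"])
before_flush = idx.routes_containing("b")  # answered from the buffer
idx.flush()
idx.delete_route("r1")
after_delete = idx.routes_containing("b")  # disk still holds r1 until next flush
```

The point of the design is that updates stay cheap between flushes while queries still see a consistent view.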

Experimental analysis • Competitor: DFS, depth-first search over links • Datasets • Synthetic route collections • Vary |R| = {20K, 50K, 100K, 200K, 500K} • Vary |Lr| = {3, 5, 10, 30, 50} • Vary |N| = {20K, 50K, 100K, 200K, 500K} • Vary α = {0.2, 0.4, 0.6, 0.8, 1} • Experiments • Index construction • Query evaluation (queries with/without answer) • RTS, RTST vs. LTS • DFS vs. LTS, LTS-k, LTST • Index maintenance

RTS, RTST vs. LTS • [Figure: two execution time plots]

DFS vs. LTS, LTS-k, LTST • [Figure: two execution time plots]

Dynamic Pickup and Delivery with Transfers

Solving dPDPT • Query • dPDPT(ns,nt) • Solution • Modify static plan • 4 modifications, called actions, allowed with/without detours • Pickup, delivery, transfer, transport • A sequence of actions, path p • Operational cost Op • Customer cost Cp • Dynamic plan graph • All possible actions • Answer: path p that primarily minimizes Op, secondarily Cp • Algorithms SP and SPM

Solving dPDPT (cont’d)

The SP and SPM algorithms • The SP algorithm • Dynamic plan graph violates subpath optimality => path enumeration • Label <Via,p,Op,Cp> for each path to Via • At each iteration select label with lowest combined cost • Compute candidate answer – upper bound • Prune search space • Terminate search • The SPM algorithm • Modified dynamic plan graph • Break Op into Op* and OpR • Subpath optimality • Extends SP • Label <Via,p,Op*,OpR> for each path to Via • Most “promising” paths to every vertex
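The label-based enumeration described for SP can be approximated by the following sketch. It keeps one label per path, always expands the label with the lowest (Op, Cp) pair, records a candidate answer when the target is reached, and uses that candidate as an upper bound for pruning and termination. The toy graph uses fixed additive edge costs, so it does not reproduce the time-dependent transfer costs that make the real dynamic plan graph violate subpath optimality; the function name, graph encoding, and lexicographic reading of "combined cost" are assumptions.

```python
import heapq

def sp_label_search(graph, source, target):
    """Best-first path enumeration over labels (Op, Cp, vertex, path).

    graph: vertex -> list of (next_vertex, op_cost, cp_cost), costs >= 0.
    Returns the best label reaching target, or None (also when source == target).
    """
    queue = [(0, 0, source, (source,))]
    best = None  # candidate answer, used as an upper bound
    while queue:
        op, cp, vertex, path = heapq.heappop(queue)
        # Terminate: no remaining label can improve on the candidate.
        if best is not None and (op, cp) >= best[:2]:
            break
        for nxt, d_op, d_cp in graph.get(vertex, []):
            if nxt in path:
                continue  # enumerate simple paths only
            label = (op + d_op, cp + d_cp, nxt, path + (nxt,))
            if nxt == target:
                if best is None or label[:2] < best[:2]:
                    best = label  # new candidate answer
            elif best is None or label[:2] < best[:2]:
                heapq.heappush(queue, label)  # prune labels beyond the bound
    return best

# Toy plan graph (assumed data, unrelated to the slide's running example).
graph = {
    "s": [("a", 2, 5), ("b", 2, 9)],
    "a": [("c", 0, 4)],
    "b": [("c", 3, 1)],
    "c": [("t", 1, 2)],
}
answer = sp_label_search(graph, "s", "t")
```

Two labels reach vertex c here; the one through b is still in the queue when the candidate is found and is discarded by the termination test, which mirrors the role of the upper bound on the slide.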

The SP and SPM algorithms (cont’d) • INITIALIZATION • Pickup Es1a and Es3b • SP: Q = {<V1a, (Vs,V1a),6,16>, <V3b,(Vs,V3b),6,36>} • SPM: Q = {<V1a, (Vs,V1a),6,0>, <V3b,(Vs,V3b),6,0>} • pcand = null T = 6

The SP and SPM algorithms (cont’d) • POP <V1a, (Vs,V1a),…,…> • Transport E12a • SP: Q = {<V2a, (Vs,V1a,V2a),6,26>, <V3b,(Vs,V3b),6,36>} • SPM: Q = {<V2a, (Vs,V1a,V2a),6,0>, <V3b,(Vs,V3b),6,0>} • pcand = null T = 6

The SP and SPM algorithms (cont’d) • POP <V2a, (Vs, V1a,V2a),…,…> • Transfer E25ac • Arr5c = 10 < 26 < Dep5c = 40 • SP: Q = {<V3b,(Vs,V3b),6,36>, <V5c, (Vs,V1a,V2a,V5c),18,36>} • SPM: Q = {<V3b,(Vs,V3b),6,0>, <V5c, (Vs,V1a,V2a,V5c),6,12>} • pcand = null T = 6

The SP and SPM algorithms (cont’d) • POP <V3b, (Vs,V3b),6,36> and <V4b, (Vs,V3b,V4b),6,46> • Transport E34b and transfer E46bc • 46 > Dep6c = 40 • SP: Q = {<V5c,(Vs,V1a,V2a,V5c),18,36>, <V6c,(Vs,V3b,V4b,V6c),24,52>} • SPM: Q = {<V5c,(Vs,V1a,V2a,V5c),6,12>, <V6c,(Vs,V3b,V4b,V6c),12,12>} • pcand = null T = 6

The SP and SPM algorithms (cont’d) • POP <V5c,(Vs,V1a,V2a,V5c),…,…> • Transport E56c • SP: Q = {<V6c,(Vs,V1a,V2a,V5c,V6c),18, 46>, <V6c,(Vs,V3b,V4b,V6c),24,52>} • SPM: Q = {<V6c,(Vs,V1a,V2a,V5c,V6c),6, 12>, <V6c,(Vs,V3b,V4b,V6c),12,12>} • pcand = null T = 6

The SP and SPM algorithms (cont’d) • POP <V5c,(Vs,V1a,V2a,V5c),…,…> • Transport E56c • SP: Q = {<V6c,(Vs,V1a,V2a,V5c,V6c),18, 46>, <V6c,(Vs,V3b,V4b,V6c),24,52>} • SPM: Q = {<V6c,(Vs,V1a,V2a,V5c,V6c),6, 12>} • pcand = null T = 6

The SP and SPM algorithms (cont’d) • POP <V6c,(Vs,V1a,V2a,V5c,V6c),…,…> • Transport E67c • SP: Q = {<V7c,(Vs,V1a,V2a,V5c,V6c,V7c), 18, 56>, <V6c,(Vs,V3b,V4b,V6c),24,52>} • SPM: Q = {<V7c,(Vs,V1a,V2a,V5c,V6c,V7c), 6,12>} • pcand = null T = 6

The SP and SPM algorithms (cont’d) • POP <V7c,(Vs,V1a,V2a,V5c,V6c,V7c), …,…> • Delivery E7ec • FOUND pcand • SP: Q = {<V6c,(Vs,V3b,V4b,V6c),24,52>} • SPM: Q = {} END • pcand = (Vs,V1a,V2a,V5c,V6c,V7c) • Opcand = 24 • Cpcand = 59 T = 6

The SP and SPM algorithms (cont’d) • POP <V6c,(Vs,V3b,V4b,V6c),24,52> • Opcand = 24 • SP: END T = 6