
Traveling Salesman Problems Motivated by Robot Navigation



  1. Traveling Salesman Problems Motivated by Robot Navigation Maria Minkoff MIT With Avrim Blum, Shuchi Chawla, David Karger, Terran Lane, Adam Meyerson

  2. A Robot Navigation Problem • Robot delivering packages in a building • Goal: to deliver them as quickly as possible • Classic model: Traveling Salesman Problem • find a tour of minimum length • Additional constraints: • some packages have higher priority • uncertainty in the robot's behavior • battery failure • sensor error, motor control error

  3. Markov Decision Process Model • State space S • Choice of actions a ∈ A at each state s • Transition function T(s′|s,a) • the action determines a probability distribution on the next state • a sequence of actions produces a random path through the graph • Rewards R(s) on states • If we arrive in state s at time t, we receive discounted reward γ^t R(s) for γ ∈ (0,1) • MDP Goal: a policy for picking an action from any state that maximizes total discounted reward

  4. Exponential Discounting • Motivates getting to the desired state quickly • Inflation: reward collected in the distant future decreases in value due to uncertainty • at each time step the robot loses power with a fixed probability • so the probability of still being alive at time t is exponentially distributed • discounting reflects the value of the reward in expectation

  5. Solving MDP • Fixing an action at each state produces a Markov chain with transition probabilities p_vw • Can compute the expected discounted reward ρ_v if we start at state v: ρ_v = r_v + Σ_w p_vw · γ^{t(v,w)} · ρ_w • Choosing actions to optimize this recurrence is polynomial-time solvable • linear programming • dynamic programming (like shortest paths); a small sketch of the evaluation step follows
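
With a policy fixed, the recurrence above is just a linear system in the values ρ_v. Below is a minimal sketch of that policy-evaluation step, assuming NumPy and dense matrices; the function name, the encoding of t(v,w) as a matrix, and the toy example are illustrative choices, not from the talk.

```python
import numpy as np

def evaluate_policy(P, travel_time, r, gamma):
    """Solve rho = r + D rho for a fixed policy, where
    D[v, w] = P[v, w] * gamma**travel_time[v, w].

    P[v, w]           -- transition probability from v to w under the policy
    travel_time[v, w] -- time t(v, w) spent on that transition
    r[v]              -- reward collected at v
    gamma             -- discount factor in (0, 1)
    """
    D = P * gamma ** travel_time              # elementwise discounted transitions
    n = len(r)
    # rho = r + D rho  <=>  (I - D) rho = r, solvable since gamma < 1
    return np.linalg.solve(np.eye(n) - D, r)

# Toy 2-state chain: the states alternate, each hop takes 1 time unit.
P = np.array([[0.0, 1.0], [1.0, 0.0]])
t = np.ones((2, 2))
print(evaluate_policy(P, t, np.array([0.0, 10.0]), gamma=0.5))
```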

  6. Solving the wrong problem • A package can only be delivered once • so we should not get the reward each time we reach its target • One solution: expand the state space • new state = current location × set of past locations (packages already delivered) • reward is nonzero only on states where the current location is not among those previously visited • Now apply the MDP algorithm • Problem: the new state space has exponential size (illustrated below)
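
To make the blow-up concrete, here is a hypothetical sketch of the expanded state space: a (location, set-of-visited-targets) pair, with the reward switched off after the first visit. All names are illustrative assumptions.

```python
from itertools import chain, combinations

def expand_states(locations, targets):
    """All (location, visited-subset) pairs: |locations| * 2**|targets| states."""
    subsets = chain.from_iterable(
        combinations(targets, k) for k in range(len(targets) + 1))
    return [(loc, frozenset(sub)) for sub in subsets for loc in locations]

def expanded_reward(state, reward):
    """Reward is nonzero only if the current location was not visited before."""
    loc, visited = state
    return 0.0 if loc in visited else reward.get(loc, 0.0)

print(len(expand_states(range(3), ["a", "b"])))   # 3 locations * 2**2 subsets = 12
```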

  7. Tackle an easier problem • Problem has two novel elements for “theory” • Discounting of reward based on arrival time • Probability distribution on outcome of actions • We will set aside second issue for now • In practice, robot can control errors • Even first issue by itself is hard and interesting • First step towards solving whole problem

  8. Discounted-Reward TSP Given • undirected graph G=(V,E) • edge weights (travel times) d_e ≥ 0 • weights on nodes (rewards) r_v ≥ 0 • discount factor γ ∈ (0,1) • root node s Goal: find a path P starting at s that maximizes the total discounted reward ρ(P) = Σ_{v ∈ P} r_v · γ^{d_P(v)}, where d_P(v) is the time at which P first reaches v
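
A small sketch of this objective, assuming the path is given as a vertex list, edge lengths live in a dict, and each reward is collected only on the first visit; the names are illustrative.

```python
def discounted_reward(path, length, reward, gamma):
    """rho(P) = sum over first visits v in P of reward[v] * gamma**d_P(v)."""
    total, elapsed, seen = 0.0, 0.0, set()
    for i, v in enumerate(path):
        if i > 0:
            elapsed += length[(path[i - 1], v)]   # travel time d_e >= 0 per edge
        if v not in seen:                          # each reward counts once
            seen.add(v)
            total += reward[v] * gamma ** elapsed
    return total

# Unit edges, gamma = 1/2: 1 + 4*(1/2) + 8*(1/4) = 5.0
print(discounted_reward(["s", "a", "b"],
                        {("s", "a"): 1.0, ("a", "b"): 1.0},
                        {"s": 1.0, "a": 4.0, "b": 8.0}, 0.5))
```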

  9. Approximation Algorithms • Discounted-Reward TSP is NP-complete (and so is the more general MDP-type problem) • reduction from minimum-latency TSP • So it is intractable to solve exactly • Goal: an approximation algorithm that is guaranteed to collect at least some constant fraction of the best possible discounted reward

  10. Related Problems The goal of Discounted-Reward TSP seems to be to find a “short” path that collects “lots” of reward • Prize-Collecting TSP • given a root vertex v, find a tour containing v that minimizes total length + foregone reward (undiscounted) • primal-dual 2-approximation algorithm [GW 95]

  11. k-TSP • Find a tour of minimum length that visits at least k vertices • 2-approximation algorithm known for undirected graphs based on algorithm for PC-TSP [Garg 99] • Can be extended to handle node-weighted version

  12. Mismatch A constant-factor approximation on length doesn't exponentiate well • Suppose the optimum solution reaches some vertex v at time t for reward γ^t r • A constant-factor approximation would reach v within time 2t, for reward γ^{2t} r • Result: we get only a γ^t fraction of the optimum discounted reward, not a constant fraction (e.g. with γ = ½ and t = 10, a 1/1024 fraction)

  13. Orienteering Problem Find a path of length at most D that maximizes net reward collected • Complement of k-TSP • approximates reward collected instead of length • avoids changing length, so exponentiation doesn’t hurt • unrooted case can be solved via k-TSP • Drawback: no constant factor approximation for rooted non-geometric version previously known • Our techniques also give a constant factor approximation for Orienteering problem

  14. Our Results Using an α-approximation for k-TSP as a subroutine • (3/2·α + 2)-approximation for Orienteering • e·(3/2·α + 2)-approximation for Discounted-Reward Collection • constant-factor approximations for tree- and multiple-path versions of the problems

  15. Our Results Using an α-approximation for k-TSP as a subroutine, substituting the α = 2 announced by Garg in 1999 • (3/2·α + 2 = 5)-approximation for Orienteering • (e·(3/2·α + 2) ≈ 13.6)-approximation for Discounted-Reward Collection • constant-factor approximations for tree- and multiple-path versions of the problems

  16. Eliminating Exponentiation • Let d_v = shortest-path distance (time) to v • Define the prize at v as p_v = γ^{d_v} r_v • the maximum discounted reward possibly collectable at v • If a given path reaches v at time t_v, define its excess e_v = t_v − d_v • the difference between the shortest path and the chosen one • Then the discounted reward at v is γ^{e_v} p_v • Idea: if the excess is small, prize ≈ discounted reward • Fact: excess only increases as we traverse the path • excess reflects lost time; we can't make it up (a sketch of this bookkeeping follows)
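
A sketch of the prize/excess computation, assuming the graph is an adjacency dict of (neighbor, length) pairs; dijkstra and the other names are illustrative.

```python
import heapq

def dijkstra(adj, s):
    """Shortest travel times d_v from s; adj[u] is a list of (v, length) pairs."""
    dist = {s: 0.0}
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                          # stale heap entry
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def prizes_and_excesses(adj, s, path, reward, gamma):
    """p_v = gamma**d_v * r_v for every vertex, and e_v = t_v - d_v along path."""
    d = dijkstra(adj, s)
    prize = {v: gamma ** d[v] * reward[v] for v in d}
    t, excess = 0.0, []
    for i, v in enumerate(path):
        if i > 0:
            t += dict(adj[path[i - 1]])[v]    # arrival time t_v along the path
        excess.append(t - d[v])               # non-decreasing along the path
    return prize, excess
```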

  17. Optimum path [figure: optimum path from s, with shortest-path distances and excesses marked along it] • assume γ = ½ (we can scale edge lengths) Claim: at least ½ of the optimum path's discounted reward R is collected before the path's excess reaches 1 • Proof by contradiction: • let u be the first vertex with e_u ≥ 1 • suppose more than R/2 of the reward follows u • we can shortcut directly to u and then traverse the rest of the optimum path • this reduces all excesses after u by at least 1 • so it “undiscounts” those rewards by a factor γ^{-1} = 2, doubling the discounted reward collected after u • but that reward was already more than R/2, so the shortcut path collects more than R: contradiction

  18. New problem: Approximate Min-Excess Path • Suppose there exists an s-t path P* with prize value Π and length ℓ(P*) = d_t + ε • Optimization version: find an s-t path P with prize value ≥ Π that minimizes the excess ℓ(P) − d_t over the shortest path to t • equivalent to minimizing total length, e.g. k-TSP • Approximation version: find an s-t path P with prize value ≥ Π that approximates the optimum excess over the shortest path to t, i.e. has length ℓ(P) = d_t + c·ε • better than approximating the entire path length

  19. Using Min-Excess Path • Recall the discounted reward at v is γ^{e_v} p_v • Prefix of the optimum discounted-reward path: • collects discounted reward Σ γ^{e_v} p_v ≥ R/2 ⇒ spans prize Σ p_v ≥ R/2 • and has no vertex with excess over 1 • Guess t = the last node on the optimum path with excess e_t ≤ 1 • Find a path to t of approximately (4 times) minimum excess that spans R/2 prize (we can guess R/2) • Excesses are then at most 4, so γ^{e_v} p_v ≥ p_v/16 ⇒ the discounted reward on the found path is ≥ R/32

  20. Solving the Min-Excess Path problem Exactly solvable case: monotonic paths • Suppose the optimum path goes through vertices in strictly increasing distance from the root • Then we can find the optimum by a dynamic program • just as we can solve longest path in an acyclic graph • Build a table • for each vertex v: is there a monotonic path to v with length l and prize p? (a sketch follows below)
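
A rough sketch of that table, assuming integer prizes and keeping, for each vertex, the minimum length at which each prize value is achievable along a distance-monotonic path from the root; everything here is an illustrative assumption.

```python
import math

def monotonic_min_length(adj, d, prize, s, target):
    """best[v][p] = min length of a monotonic path s -> v collecting prize p;
    adj[u] lists (v, length) edges, d holds shortest-path distances from s."""
    best = {v: {} for v in d}
    best[s] = {prize[s]: 0.0}
    for u in sorted(d, key=d.get):            # topological order by d_v
        for p, length in list(best[u].items()):
            for v, w in adj[u]:
                if d[v] > d[u]:               # only strictly monotonic edges
                    q = p + prize[v]
                    if length + w < best[v].get(q, math.inf):
                        best[v][q] = length + w
    # cheapest monotonic path collecting at least the target prize
    return min((l for tbl in best.values() for p, l in tbl.items()
                if p >= target), default=math.inf)
```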

  21. Solving the Min-Excess Path problem Approximable case: wiggly paths • The length of the path to v is ℓ_v = d_v + e_v • If e_v > d_v then ℓ_v > e_v > ℓ_v/2 • i.e., the path takes more than twice as long as necessary to reach v • So if we approximate ℓ_v to within a constant factor, we also approximate e_v to within twice that constant factor

  22. Approximating path length [figure: s and t merged into a single vertex r, turning the optimum path into a tour] • Can use the k-TSP algorithm to find an approximately shortest s-t path with a specified prize • merge s and t into a vertex r • the optimum path becomes a tour • solve k-TSP with root r • “unmerge”: we can get one or more cycles • connect s and t by a shortest path

  23. Decompose the optimum path [figure: path split into alternating monotone and wiggly segments] • divides into independent problems • > 2/3 of each wiggly path is excess

  24. Decomposition Analysis • 2/3 of each wiggly segment is excess • That excess accumulates over the whole path • total excess of the wiggly segments ≤ excess of the whole path • total length of the wiggly segments ≤ 3/2 of the path excess • Use a dynamic program to find the shortest (min-excess) monotonic segments collecting a target prize • Use k-TSP to find approximately shortest wiggles collecting a target prize • approximates length, so approximates excess • over all monotonic and wiggly segments, approximates total excess

  25. Dynamic program for Min-Excess Path • For each pair of vertices and each (discretized) prize value, find • the shortest monotonic path collecting the desired prize • an approximately shortest wiggly path collecting the desired prize • Note: polynomially many subproblems • Use dynamic programming to find the optimum pasting-together of segments (a sketch of the pasting step follows)
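
A schematic sketch of the pasting step, assuming a precomputed table seg_len[(u, v)][p] holding the (approximately) shortest segment length from u to v that collects discretized prize p; all names are illustrative.

```python
import math

def paste_segments(vertices, seg_len, s, t, target):
    """Min total length of a chain of segments s -> ... -> t collecting target prize."""
    best = {(s, 0): 0.0}                      # (endpoint, prize so far) -> length
    for _ in range(len(vertices)):            # at most |V| segments in a chain
        for (u, p), length in list(best.items()):
            for v in vertices:
                for q, extra in seg_len.get((u, v), {}).items():
                    state = (v, min(p + q, target))   # cap prize to bound states
                    if length + extra < best.get(state, math.inf):
                        best[state] = length + extra
    return best.get((t, target), math.inf)
```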

  26. Solving the Orienteering Problem: special case [figure: path from s through v to t with distances marked] • Given a path from s that • collects prize P • has length ≤ D • ends at t, the farthest point from s • For any constant integer r ≥ 1, there exists a path from s to some v with • prize ≥ P/r • excess ≤ (D − d_v)/r

  27. Solving the Orienteering Problem General case: path ends at an arbitrary t [figure: path from s to t, with u the farthest point from s] • Let u be the farthest point from s • Connect t to s via a shortest path • One of the path segments ending at u • has prize ≥ P/2 • has length ≤ D • Reduced to the special case • Using a 4-approximation for Min-Excess Path, we get an 8-approximation for Orienteering

  28. Budget Prize-Collecting Steiner Tree problem Find a rooted tree of edge cost at most D that spans the maximum amount of prize • Complement of k-MST • Create an Euler tour of the optimum tree T* at cost 2D • Divide this tour into two paths starting at the root, each of length ≤ D • One of them contains at least ½ of the total prize • A path is a type of tree • Use a c-approximation algorithm for Orienteering to obtain a 2c-approximation for Budget PCST (see the sketch below)
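
A minimal sketch of the halving argument, assuming euler_tour is the root-anchored vertex sequence of the doubled tree and length gives hop lengths; names are illustrative.

```python
def better_half(euler_tour, length, prize, D):
    """Split a root-to-root tour of cost <= 2D into two budget-D paths and
    return the half spanning more prize (the second half is walked from the
    root end backwards, so both halves start at the root)."""
    walked, i = 0.0, len(euler_tour)
    for j in range(1, len(euler_tour)):
        walked += length[(euler_tour[j - 1], euler_tour[j])]
        if walked > D:
            i = j                              # first vertex past the D mark
            break
    halves = [euler_tour[:i], euler_tour[i:]]  # each half has length <= D
    return max(halves, key=lambda h: sum(prize[v] for v in set(h)))
```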

  29. Summary • Showed that the maximum discounted reward can be approximated using min-excess path • Showed how to approximate min-excess path using k-TSP • Min-excess path can also be used to solve the rooted Orienteering problem (previously an open question) • Also solves “tree” and “cycle” versions of Orienteering

  30. Open Questions • Non-uniform discount factors • each vertex v has its own γ_v • Non-uniform deadlines • each vertex specifies its own deadline by which it must be visited in order to collect its reward • Directed graphs • we used k-TSP, which is only solved for undirected graphs • for directed graphs, even standard TSP has no known constant-factor approximation • we only use k-TSP/undirectedness in the wiggly parts

  31. Future directions • Stochastic actions • stochastic seems to imply directed • special case: forget rewards; given a choice of actions, choose to minimize the cover time of the graph • Applying the discounting framework to other problems: • scheduling • exponential penalty in place of hard deadlines
