
Near-optimal Observation Selection using Submodular Functions



Presentation Transcript


  1. Near-optimal Observation Selection using Submodular Functions Andreas Krause, joint work with Carlos Guestrin (CMU)

  2. River monitoring • Want to monitor ecological condition of rivers and lakes • Which locations should we observe? [Figure: pH value (7.4–8) along a transect at the mixing zone of the San Joaquin and Merced rivers; NIMS (B. Kaiser, UCLA)]

  3. Water distribution networks • Pathogens in water can affect thousands (or millions) of people • Currently: add chlorine to the source and hope for the best • Sensors in pipes could detect pathogens quickly • 1 sensor: $5,000 (just for chlorine) + deployment, maintenance ⇒ must be smart about where to place sensors • Battle of the Water Sensor Networks challenge • Get model of a metropolitan area water network • Simulator of water flow provided by the EPA • Competition for best placements • Collaboration with VanBriesen et al. (CMU Civil Engineering)

  4. Fundamental question: Observation selection Where should we observe to monitor complex phenomena? • Salt concentration / algae biomass • Pathogen distribution • Temperature and light field • California highway traffic • Weblog information cascades • …

  5. Spatial prediction • Gaussian processes • Model many spatial phenomena well [Cressie '91] • Allow to estimate uncertainty in prediction • Want to select observations minimizing uncertainty • How do we quantify informativeness / uncertainty? Observations A ⊆ V; prediction at unobserved locations V\A [Figure: unobserved process (one pH value per location s ∈ V) plotted against horizontal position]
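As a concrete illustration of the uncertainty being minimized, here is a minimal Gaussian-process sketch: computing the posterior variance at each candidate location after conditioning on a set A of observations. The kernel choice, locations, and noise level are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def sqexp(x, y, ell=1.0):
    """Squared-exponential kernel between 1-D location vectors."""
    return np.exp(-0.5 * (x[:, None] - y[None, :]) ** 2 / ell ** 2)

V = np.linspace(0.0, 4.0, 9)   # candidate locations along the transect
A = np.array([0.5, 2.0])       # hypothetical chosen observation locations
noise = 1e-2                   # assumed observation noise variance

prior_var = np.diag(sqexp(V, V)).copy()
K_AA = sqexp(A, A) + noise * np.eye(len(A))
K_VA = sqexp(V, A)
# Posterior variance at each location v: k(v,v) - k(v,A) K_AA^{-1} k(A,v)
post_var = prior_var - np.sum(K_VA @ np.linalg.inv(K_AA) * K_VA, axis=1)
print(post_var.round(3))       # smallest near the observed locations
```

Observation selection then amounts to choosing A so that the remaining posterior uncertainty (here, `post_var`) is small everywhere.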

  6. Mutual information [Caselton & Zidek '84] • Finite set V of possible locations • Find A* ⊆ V maximizing mutual information: A* = argmax_A MI(A), where MI(A) = H(V\A) − H(V\A | A), i.e. the entropy of uninstrumented locations before sensing minus their entropy after sensing • Often, observations A are expensive ⇒ constraints on which sets A we can pick

  7. Constraints for observation selection • max_A MI(A) subject to some constraints on A • What kind of constraints do we consider? • Want to place at most k sensors: |A| ≤ k • or more complex constraints: sensors need to communicate (form a tree); multiple robots (collection of paths) • All these problems are NP-hard. Can only hope for approximation guarantees!

  8. The greedy algorithm • Want to find: A* = argmax_{|A|=k} MI(A) • Greedy algorithm: • Start with A = ∅ • For i = 1 to k • s* := argmax_s MI(A ∪ {s}) • A := A ∪ {s*} • Problem is NP-hard! How well can this simple heuristic do?
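The greedy loop above can be sketched directly. In this sketch a toy coverage objective (with a hypothetical `regions` map) stands in for mutual information; both are monotone submodular, so the algorithm is identical:

```python
def greedy_select(V, F, k):
    """Greedy heuristic: start with A = {} and repeatedly add the
    element with the largest marginal gain F(A ∪ {s}) - F(A)."""
    A = set()
    for _ in range(k):
        best = max((s for s in V if s not in A),
                   key=lambda s: F(A | {s}) - F(A))
        A.add(best)
    return A

# Toy coverage objective as a stand-in for MI; regions are hypothetical.
regions = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e"}}
def F(A):
    return len(set().union(set(), *(regions[s] for s in A)))

print(greedy_select(list(regions), F, 2))  # greedy picks 3 first, then 1
```

Each iteration evaluates every remaining candidate once, so the cost is O(k · |V|) objective evaluations.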

  9. Performance of greedy • Greedy empirically close to optimal. Why? [Figure: greedy vs. optimal placements on temperature data from a sensor network]

  10. Key observation: diminishing returns • Placement A = {S1, S2}: adding a new sensor S' helps a lot • Placement B = {S1, …, S5}: adding S' doesn't help much • Theorem [UAI 2005, M. Narasimhan, J. Bilmes]: Mutual information is submodular: for A ⊆ B, MI(A ∪ {S'}) − MI(A) ≥ MI(B ∪ {S'}) − MI(B)
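The diminishing-returns inequality can be checked exhaustively on a small example. The coverage function below (with a hypothetical `regions` map) is a stand-in for mutual information, which satisfies the same inequality by the theorem above:

```python
from itertools import combinations

# Hypothetical toy coverage function: F(A) = number of items covered.
regions = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e"}}
def F(A):
    return len(set().union(set(), *(regions[s] for s in A)))

V = list(regions)
subsets = [set(c) for r in range(len(V) + 1) for c in combinations(V, r)]
# For every A ⊆ B and s outside B: gain of s w.r.t. A >= gain w.r.t. B
submodular = all(F(A | {s}) - F(A) >= F(B | {s}) - F(B)
                 for A in subsets for B in subsets if A <= B
                 for s in V if s not in B)
print(submodular)  # True
```

The exhaustive check is exponential in |V|, so it only serves as a sanity check on tiny instances; for MI the property is guaranteed by the theorem rather than by enumeration.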

  11. Result of greedy algorithm • Theorem [ICML 2005, with Carlos Guestrin, Ajit Singh]: The greedy MI algorithm provides a constant-factor approximation: placing k sensors, for all ε > 0, MI(A_greedy) ≥ (1 − 1/e)(OPT − kε) under cardinality constraints • Constant factor 1 − 1/e ≈ 63% of the optimal solution • Proof invokes a fundamental result by Nemhauser et al. '78 on the greedy algorithm for submodular functions
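The underlying 1978 guarantee can be stated as follows: for any monotone submodular F with F(∅) = 0 and cardinality budget k, the greedy set satisfies

```latex
F(A_{\text{greedy}}) \;\ge\; \left(1 - \frac{1}{e}\right)\, \max_{|A| \le k} F(A) \;\approx\; 0.63 \cdot \mathrm{OPT}
```

The kε slack in the MI theorem accounts for mutual information being only approximately monotone in the relevant regime.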

  12. Myopic vs. nonmyopic • Approaches to observation selection • Myopic: only plan ahead on the next observation • Nonmyopic: look for the best set of observations • For finding the best k observations, the myopic greedy algorithm gives near-optimal nonmyopic results! • What about more complex constraints? • Communication constraints • Path constraints • …

  13. Communication constraints: wireless sensor placements should • … be very informative (high mutual information): low uncertainty at unobserved locations • … have low communication cost: minimize the energy spent for communication • Communication cost = expected number of transmissions per link [Figure: network graph with expected transmission counts between 1.2 and 2.1 on its links]

  14. Naive, myopic approach: Greedy-Connect • Simple heuristic: greedily optimize information, then connect nodes via relay nodes to minimize communication cost • Greedy-Connect can select sensors far apart: the most informative and second-most informative locations may have no communication possible without many relay nodes, while placements with efficient communication may not be very informative • Want to find the optimal tradeoff between information and communication cost

  15. The pSPIEL algorithm [with Guestrin, Gupta, Kleinberg, IPSN '06] • pSPIEL: efficient nonmyopic algorithm (padded Sensor Placements at Informative and cost-Effective Locations) • In expectation, both mutual information and communication cost will be close to optimum

  16. Our approach: pSPIEL • Decompose sensing region into small, well-separated clusters C1, …, C4 • Solve cardinality-constrained problem per cluster • Combine solutions using the k-MST algorithm [Figure: clusters C1–C4 with numbered sensor locations]

  17. Guarantees for pSPIEL • Theorem: pSPIEL finds a tree T with mutual information MI(T) ≥ Ω(1) · OPT_MI and communication cost C(T) ≤ O(log |V|) · OPT_cost [IPSN '06, with Carlos Guestrin, Anupam Gupta, Jon Kleinberg]

  18. Prototype implementation • Implemented on Tmote Sky motes from MoteIV • Collect measurements and link information and send them to a base station

  19. Proof of concept study • Learned model from a short deployment of 46 sensors at the Intelligent Workplace (initial deployment and validation set) • Manually selected 20 sensors; used pSPIEL to place 12 and 19 sensors • Compared prediction accuracy of the optimized placements over time

  20. Proof of concept study [Figure: root mean squared error (Lux) vs. communication cost (ETX) for the manual placement (M20) and the pSPIEL placements (pS19, pS12); lower is better on both axes]

  21. Path constraints • Want to plan informative paths • Find collection of paths P1, …, Pk s.t. • MI(P1 ∪ … ∪ Pk) is maximized • Length(Pi) ≤ B [Figure: outline of Lake Fulmor with start points 1–3 and the paths of robots 1–3]

  22. Naïve, myopic algorithm • Go to the most informative reachable observation: may waste (almost) all fuel and have to go back without further observations • Again, the naïve myopic approach can fail badly! • Looking at the benefit-cost ratio doesn't help either • Can get a nonmyopic approximation algorithm [with Amarjeet Singh, Carlos Guestrin, William Kaiser, IJCAI 07]

  23. Comparison with heuristic • Approximation algorithm outperforms state-of-the-art heuristic for orienteering [Chao et al. '96] [Figure: information gathered vs. cost of output path (meters); submodular path planning is more informative than the known heuristic]

  24. Submodular observation selection • Many other submodular objectives (other than MI) • Variance reduction: F(A) = Var(Y) − Var(Y | A) • (Geometric) coverage: F(A) = |area covered| • Influence in social networks (viral marketing) • Size of information cascades in blog networks • … • Key underlying problem: constrained maximization of submodular functions • Our algorithms work for any submodular function!
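Because these algorithms need only oracle access to F, a standard speed-up that applies to any submodular objective is Minoux's lazy greedy: cached marginal gains can only shrink as A grows, so they act as upper bounds and most re-evaluations can be skipped. A sketch, again with a hypothetical coverage objective as a stand-in (lazy greedy itself is a classical technique, not specific to this talk):

```python
import heapq

def lazy_greedy(V, F, k):
    """Lazy greedy (Minoux): stale cached gains are upper bounds by
    submodularity, so we re-evaluate an element only when it tops the heap."""
    A = set()
    base = F(set())
    heap = [(-(F({s}) - base), s) for s in V]   # (negated gain, element)
    heapq.heapify(heap)
    while len(A) < k and heap:
        neg_gain, s = heapq.heappop(heap)
        fresh = F(A | {s}) - F(A)               # recompute the gain
        if not heap or fresh >= -heap[0][0]:    # still the best: accept
            A.add(s)
        else:                                   # stale: reinsert, fresh gain
            heapq.heappush(heap, (-fresh, s))
    return A

# Illustrative coverage objective (any monotone submodular F works).
regions = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e"}}
def F(A):
    return len(set().union(set(), *(regions[s] for s in A)))

print(lazy_greedy(list(regions), F, 2))
```

It returns the same set as plain greedy, typically with far fewer evaluations of F on larger instances.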

  25. Water networks • 12,527 junctions • 3.6 million contamination events • Place 20 sensors to • Maximize detection likelihood • Minimize detection time • Minimize population affected • Theorem: All these objectives are submodular!

  26. Bounds on optimal solution • Submodularity gives online bounds on the performance of any algorithm [Figure: penalty reduction achieved vs. online bound; higher is better]
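One way such an online bound arises: for any monotone submodular F and any candidate set A, the optimum over k elements is at most F(A) plus the sum of the k largest marginal gains with respect to A. A sketch, with an illustrative coverage objective standing in for the real penalty-reduction function:

```python
def online_bound(V, F, A, k):
    """Data-dependent bound from submodularity: for monotone submodular F,
    OPT_k <= F(A) + sum of the k largest marginal gains w.r.t. A."""
    gains = sorted((F(A | {s}) - F(A) for s in V if s not in A), reverse=True)
    return F(A) + sum(gains[:k])

# Illustrative coverage objective; certify a candidate placement.
regions = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e"}}
def F(A):
    return len(set().union(set(), *(regions[s] for s in A)))

A = {1, 3}   # e.g. the set a greedy run returns
print(F(A), online_bound(list(regions), F, A, 2))  # 5 5 -> A is optimal here
```

Because the bound holds for the solution of any algorithm, it can certify near-optimality after the fact, as in the figure above.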

  27. Results of BWSN [Ostfeld et al.] • Multi-criterion optimization • [Ostfeld et al. '07]: count number of non-dominated solutions

  28. Conclusions • Observation selection is an important AI problem • Key algorithmic problem: constrained maximization of submodular functions • For budgeted placements, greedy is near-optimal! • For more complex constraints (paths, etc.): • Myopic (greedy) algorithms fail • Presented near-optimal nonmyopic algorithms • Algorithms perform well on several real-world observation selection problems
