Near-optimal Observation Selection using Submodular Functions Andreas Krause joint work with Carlos Guestrin (CMU)
River monitoring Want to monitor ecological condition of rivers and lakes Which locations should we observe? Mixing zone of San Joaquin and Merced rivers NIMS (B. Kaiser, UCLA)
Water distribution networks Pathogens in water can affect thousands (or millions) of people Currently: Add chlorine to the source and hope for the best Sensors in pipes could detect pathogens quickly 1 Sensor: $5,000 (just for chlorine) + deployment, maintenance Must be smart about where to place sensors Battle of the Water Sensor Networks challenge Get model of a metropolitan area water network Simulator of water flow provided by the EPA Competition for best placements Collaboration with VanBriesen et al. (CMU Civil Engineering)
Fundamental question: Observation Selection Where should we observe to monitor complex phenomena? Salt concentration / algae biomass Pathogen distribution Temperature and light field California highway traffic Weblog information cascades …
Spatial prediction Gaussian processes Model many spatial phenomena well [Cressie ’91] Allow estimating uncertainty in predictions Want to select observations minimizing uncertainty How do we quantify informativeness / uncertainty? [Figure: pH value vs. horizontal position, showing observations A ⊆ V, the unobserved process (one pH value per location s ∈ V), and the prediction at unobserved locations V\A]
Mutual information [Caselton & Zidek ’84] Finite set V of possible locations Find A* ⊆ V maximizing mutual information: A* = argmax MI(A), where MI(A) = H(V\A) - H(V\A | A), i.e. the entropy of uninstrumented locations before sensing minus the entropy of uninstrumented locations after sensing Often, observations A are expensive, so there are constraints on which sets A we can pick
Constraints for observation selection max_A MI(A) subject to some constraints on A What kind of constraints do we consider? Want to place at most k sensors: |A| ≤ k Or more complex constraints: Sensors need to communicate (form a tree) Multiple robots (collection of paths) All these problems are NP-hard. Can only hope for approximation guarantees!
The greedy algorithm
Want to find: A* = argmax_{|A|=k} MI(A)
Greedy algorithm:
  Start with A = ∅
  For i = 1 to k
    s* := argmax_s MI(A ∪ {s})
    A := A ∪ {s*}
Problem is NP-hard! How well can this simple heuristic do?
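The greedy loop can be sketched in a few lines of Python. This is only an illustration under assumptions that are not on the slides: the locations are jointly Gaussian with a known covariance matrix, mutual information is computed in closed form from log-determinants, and the helper names (`logdet`, `mutual_information`, `greedy_mi`, `squared_exp_cov`) and the toy kernel parameters are hypothetical.

```python
import math

def logdet(m):
    """Log-determinant of a symmetric positive-definite matrix (via Cholesky)."""
    n = len(m)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total

def sub(cov, idx):
    """Covariance restricted to the locations in idx."""
    return [[cov[i][j] for j in idx] for i in idx]

def mutual_information(cov, A):
    """MI between locations A and all remaining locations (Gaussian case)."""
    A = sorted(set(A))
    rest = [i for i in range(len(cov)) if i not in A]
    full = list(range(len(cov)))
    # I(A; rest) = H(A) + H(rest) - H(V); the (2*pi*e) constants cancel.
    return 0.5 * (logdet(sub(cov, A)) + logdet(sub(cov, rest))
                  - logdet(sub(cov, full)))

def greedy_mi(cov, k):
    """Start with A empty; k times, add the location maximizing MI(A + {s})."""
    A = []
    for _ in range(k):
        candidates = [s for s in range(len(cov)) if s not in A]
        s_star = max(candidates, key=lambda s: mutual_information(cov, A + [s]))
        A.append(s_star)
    return A

def squared_exp_cov(xs, length_scale=1.5, noise=0.1):
    """Toy covariance (assumed, not from the talk): squared-exponential plus noise."""
    return [[math.exp(-((a - b) ** 2) / (2 * length_scale ** 2))
             + (noise if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]

cov = squared_exp_cov([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
placement = greedy_mi(cov, 2)
print(placement, mutual_information(cov, placement))
```

Each greedy step re-evaluates MI for every remaining candidate, so this naive version is only practical for small V.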
Performance of greedy Greedy empirically close to optimal. Why? [Figure: greedy vs. optimal on temperature data from a sensor network]
Key observation: Diminishing returns Placement A = {S1, S2} Placement B = {S1, …, S5} New sensor S’: adding S’ to A will help a lot! Adding S’ to B doesn’t help much. Theorem [UAI 2005, M. Narasimhan, J. Bilmes] Mutual information is submodular: For A ⊆ B, MI(A ∪ {S’}) - MI(A) ≥ MI(B ∪ {S’}) - MI(B)
Cardinality constraints Theorem [ICML 2005, with Carlos Guestrin, Ajit Singh] Greedy MI algorithm provides constant factor approximation: placing k sensors, ∀ ε > 0: result of greedy algorithm ≥ (1 - 1/e) × optimal solution, up to an additive error that depends on ε (constant factor 1 - 1/e ≈ 63%) Proof invokes fundamental result by Nemhauser et al. ’78 on greedy algorithm for submodular functions
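For reference, the standard argument behind the 63% factor can be sketched as follows. The sketch assumes a monotone submodular F with F(∅) = 0; the slide's ε suggests this holds only approximately for mutual information, which the sketch ignores. The notation is introduced here for illustration: A_i is the greedy set after i steps, A* an optimal set of size k, and OPT = F(A*).

```latex
\begin{align*}
\mathrm{OPT} &\le F(A_i \cup A^*) && \text{(monotonicity)}\\
 &\le F(A_i) + \sum_{s \in A^*} \bigl[F(A_i \cup \{s\}) - F(A_i)\bigr] && \text{(submodularity)}\\
 &\le F(A_i) + k\,\bigl[F(A_{i+1}) - F(A_i)\bigr] && \text{(greedy picks the largest gain)}
\end{align*}
Rearranging gives
$\mathrm{OPT} - F(A_{i+1}) \le (1 - 1/k)\,\bigl(\mathrm{OPT} - F(A_i)\bigr)$,
so after $k$ steps
\[
F(A_k) \;\ge\; \bigl(1 - (1 - 1/k)^k\bigr)\,\mathrm{OPT}
\;\ge\; (1 - 1/e)\,\mathrm{OPT} \;\approx\; 0.63\,\mathrm{OPT}.
\]
```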
Myopic vs. Nonmyopic Approaches to observation selection Myopic: Only plan ahead on the next observation Nonmyopic: Look for best set of observations For finding best k observations, myopic greedy algorithm gives near-optimal nonmyopic results! What about more complex constraints? Communication constraints Path constraints …
Communication constraints: Wireless sensor placements should … be very informative (high mutual information) Low uncertainty at unobserved locations … have low communication cost Minimize the energy spent for communication Communication cost = expected number of transmissions
Naive, myopic approach: Greedy-Connect Simple heuristic: Greedily optimize information Then connect nodes to minimize communication cost Greedy-Connect can select sensors far apart… Want to find optimal tradeoff between information and communication cost [Figure labels: relay node; most informative and second most informative locations, but no communication possible!; efficient communication, but not very informative]
The pSPIEL Algorithm [with Guestrin, Gupta, Kleinberg IPSN ’06] pSPIEL: Efficient nonmyopic algorithm (padded Sensor Placements at Informative and cost-Effective Locations) In expectation, both mutual information and communication cost will be close to optimum
Our approach: pSPIEL Decompose sensing region into small, well-separated clusters Solve cardinality constrained problem per cluster Combine solutions using k-MST algorithm [Figure: clusters C1, C2, C3, C4]
Guarantees for pSPIEL Theorem: pSPIEL finds a tree T with mutual information MI(T) ≥ Ω(1) · OPT_MI and communication cost C(T) ≤ O(log |V|) · OPT_cost [IPSN ’06, with Carlos Guestrin, Anupam Gupta, Jon Kleinberg]
Prototype implementation Implemented on Tmote Sky motes from MoteIV Collect measurement and link information and send to base station
Proof of concept study Learned model from short deployment of 46 sensors at the Intelligent Workplace Manually selected 20 sensors; used pSPIEL to place 12 and 19 sensors Compared prediction accuracy [Figure: timeline of initial deployment and validation set, followed by optimized placements, with accuracy plotted over time]
Proof of concept study [Figure: root mean square error (Lux) and communication cost (ETX) for Manual (M20), pSPIEL (pS19), and pSPIEL (pS12); an arrow marks the better direction]
Path constraints Want to plan informative paths Find collection of paths P1, …, Pk s.t. MI(P1 ∪ … ∪ Pk) is maximized and Length(Pi) ≤ B [Figure: outline of Lake Fulmor with Start 1, Start 2, Start 3 and the paths of Robot-1, Robot-2, Robot-3]
Naïve, myopic algorithm Go to most informative reachable observations Again, the naïve myopic approach can fail badly! Looking at the benefit-cost ratio doesn’t help either Can get nonmyopic approximation algorithm [with Amarjeet Singh, Carlos Guestrin, William Kaiser, IJCAI ’07] [Figure: from Start 1, the robot goes to the most informative observation, wastes (almost) all fuel, and has to go back without further observations]
Comparison with heuristic Approximation algorithm outperforms state-of-the-art heuristic for orienteering [Figure: informativeness vs. cost of output path (meters) for submodular path planning and the known heuristic [Chao et al. ’96]; higher is more informative]
Submodular observation selection Many other submodular objectives (other than MI) Variance reduction: F(A) = Var(Y) – Var(Y | A) (Geometric) coverage: F(A) = |area covered| Influence in social networks (viral marketing) Size of information cascades in blog networks … Key underlying problem: Constrained maximization of submodular functions Our algorithms work for any submodular function!
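Because the greedy rule only needs to evaluate F on sets, the same loop applies to any of these objectives. The following is a minimal sketch using the coverage objective F(A) = |area covered|; the sensor names, covered cells, and the helper functions `coverage`, `greedy`, and `is_submodular` are made up for illustration and do not come from the talk.

```python
from itertools import combinations

def coverage(regions, A):
    """F(A) = number of distinct cells covered by the sensors in A."""
    covered = set()
    for s in A:
        covered |= regions[s]
    return len(covered)

def greedy(F, ground_set, k):
    """Generic greedy: repeatedly add the element with the largest marginal gain."""
    A = []
    for _ in range(k):
        candidates = [s for s in ground_set if s not in A]
        best = max(candidates, key=lambda s: F(A + [s]) - F(A))
        A.append(best)
    return A

def is_submodular(F, ground_set):
    """Brute-force diminishing-returns check over all A within B and s outside B.

    Exponential in the ground set size; only meant for tiny examples.
    """
    items = list(ground_set)
    for size_b in range(len(items) + 1):
        for B in combinations(items, size_b):
            for size_a in range(len(B) + 1):
                for A in combinations(B, size_a):
                    for s in items:
                        if s in B:
                            continue
                        gain_a = F(list(A) + [s]) - F(list(A))
                        gain_b = F(list(B) + [s]) - F(list(B))
                        if gain_a < gain_b:
                            return False
    return True

# Hypothetical sensors and the grid cells each one covers.
regions = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6, 7}, "s4": {1, 7}}
F = lambda A: coverage(regions, A)

chosen = greedy(F, regions, 2)
print(chosen, F(chosen))            # ['s3', 's1'] 7
print(is_submodular(F, regions))    # True
```

Swapping in a different objective only requires a new `F`; the brute-force check is a sanity test for toy instances, not a proof of submodularity.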
Water Networks 12,527 junctions 3.6 million contamination events Place 20 sensors to Maximize detection likelihood Minimize detection time Minimize population affected Theorem: All these objectives are submodular!
Bounds on optimal solution Submodularity gives online bounds on the performance of any algorithm Penalty reduction Higher is better
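One common way to obtain such an online bound, for a monotone submodular F, is: for any set A, the best k-element solution is worth at most F(A) plus the sum of the k largest marginal gains with respect to A. The sketch below illustrates this on a toy coverage objective; the data and function names are hypothetical, and whether the talk used exactly this form of the bound is an assumption.

```python
from itertools import combinations

def coverage(regions, A):
    """F(A) = number of distinct cells covered by the sensors in A."""
    covered = set()
    for s in A:
        covered |= regions[s]
    return len(covered)

def online_bound(F, ground_set, A, k):
    """Upper bound on max F over sets of size k, computed from any set A.

    Valid when F is monotone and submodular:
    OPT <= F(A) + (sum of the k largest marginal gains w.r.t. A).
    """
    base = F(list(A))
    gains = sorted((F(list(A) + [s]) - base
                    for s in ground_set if s not in A), reverse=True)
    return base + sum(gains[:k])

# Hypothetical sensors and the grid cells each one covers.
regions = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6, 7}, "s4": {1, 7}}
F = lambda A: coverage(regions, A)

A = ["s2", "s4"]                      # an arbitrary placement, from any algorithm
bound = online_bound(F, regions, A, 2)
opt = max(F(list(c)) for c in combinations(regions, 2))  # brute force, toy only
print(F(A), bound, opt)               # 4 7 7
```

Here the placement A is certified to achieve at least 4/7 of the optimum without ever computing the optimum, which is what makes the bound usable on large instances.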
Results of BWSN [Ostfeld et al.]
Multi-criterion optimization [Ostfeld et al. ’07]: count number of non-dominated solutions

Author                    # non-dom. (out of 30)
Krause et al.             26
Berry et al.              21
Dorini et al.             20
Wu and Walski             19
Ostfeld and Salomons      14
Propato and Piller        12
Eliades and Polycarpou    11
Huang et al.              7
Guan et al.               4
Ghimire and Barkdoll      3
Trachtman                 2
Gueli                     2
Preis and Ostfeld         1
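Counting non-dominated solutions can be sketched as below. The sketch assumes every objective is to be minimized and compares all submissions in a single pool; how the BWSN organizers grouped solutions per scenario is not stated here, and the team names and scores are invented for illustration.

```python
from collections import Counter

def dominates(u, v):
    """u dominates v: no worse in every objective, strictly better in at least one."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))

def count_non_dominated(solutions):
    """Count, per author, the solutions that no other solution dominates."""
    counts = Counter()
    for author, scores in solutions:
        if not any(dominates(other, scores) for _, other in solutions):
            counts[author] += 1
    return counts

# Hypothetical (detection time, population affected) scores; lower is better.
solutions = [
    ("team_a", (10.0, 300.0)),
    ("team_a", (12.0, 250.0)),
    ("team_b", (11.0, 320.0)),   # dominated by team_a's (10.0, 300.0)
    ("team_b", (15.0, 200.0)),
    ("team_c", (16.0, 260.0)),   # dominated by team_a's (12.0, 250.0)
]

counts = count_non_dominated(solutions)
print(counts["team_a"], counts["team_b"], counts["team_c"])  # 2 1 0
```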
Conclusions Observation selection is an important AI problem Key algorithmic problem: Constrained maximization of submodular functions For budgeted placements, greedy is near-optimal! For more complex constraints (paths, etc.): Myopic (greedy) algorithms fail; we presented near-optimal nonmyopic algorithms Algorithms perform well on several real-world observation selection problems