Carnegie Mellon
Selecting Observations against Adversarial Objectives
Andreas Krause, Brendan McMahan, Carlos Guestrin, Anupam Gupta

Observation selection problems
Place sensors for building automation. Monitor rivers and lakes using robots. Detect contaminations in water networks.
Given a set V of possible observations (sensor locations, ...), we want to pick a subset A* ⊆ V such that
A* = argmax_{|A| ≤ k} F(A)
For most interesting utilities F, this is NP-hard!

Key observation: diminishing returns
For a small placement A = {S1, S2}, adding a new sensor S' will help a lot; for a larger placement B = {S1, ..., S5}, adding S' doesn't help much.
Formalization: submodularity. For A ⊆ B,
F(A ∪ {S'}) − F(A) ≥ F(B ∪ {S'}) − F(B)
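To make the definition concrete, here is a minimal Python sketch (not from the talk) of a coverage utility, which is submodular, together with a brute-force check of the diminishing-returns inequality; the COVERAGE map and sensor names are made up for illustration.

```python
from itertools import combinations

# Toy coverage utility: each sensor covers some locations;
# F(A) = number of locations covered by at least one sensor in A.
COVERAGE = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5, 6}}

def F(A):
    covered = set()
    for s in A:
        covered |= COVERAGE[s]
    return len(covered)

def is_submodular(ground_set):
    """Check F(A ∪ {s}) - F(A) >= F(B ∪ {s}) - F(B) for all A ⊆ B, s ∉ B."""
    elems = sorted(ground_set)
    for rB in range(len(elems) + 1):
        for B in combinations(elems, rB):
            for rA in range(len(B) + 1):
                for A in combinations(B, rA):
                    for s in ground_set - set(B):
                        if F(set(A) | {s}) - F(set(A)) < F(set(B) | {s}) - F(set(B)):
                            return False
    return True

print(is_submodular(set(COVERAGE)))  # True: coverage is submodular
```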

Submodularity [with Guestrin, Singh, Leskovec, VanBriesen, Faloutsos, Glance]
We prove submodularity for:
- Mutual information F(A) = H(unobserved) − H(unobserved | A) [UAI '05, JMLR '07] (spatial prediction)
- Outbreak detection F(A) = impact reduction when sensing at A [KDD '07] (water monitoring, ...)
Also submodular:
- Geometric coverage F(A) = area covered
- Variance reduction F(A) = Var(Y) − Var(Y | A)
- ...

Why is submodularity useful?
Theorem [Nemhauser et al. '78]: The greedy algorithm (forward selection) gives a constant-factor approximation:
F(A_greedy) ≥ (1 − 1/e) F(A_opt), i.e., ~63% of optimal.
- Can get online (data-dependent) bounds for any algorithm
- Can significantly speed up the greedy algorithm
- Can use MIP / branch & bound for the optimal solution
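For reference, a minimal sketch of the greedy (forward-selection) rule the theorem refers to, for an arbitrary monotone submodular F given as a Python callable on sets:

```python
def greedy(F, ground_set, k):
    """Forward selection: repeatedly add the element with the largest
    marginal gain F(A ∪ {s}) - F(A) until |A| = k."""
    A = set()
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for s in ground_set - A:
            gain = F(A | {s}) - F(A)
            if gain > best_gain:
                best, best_gain = s, gain
        A.add(best)
    return A
```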

Robust observation selection
What if ...
- ... the parameters θ of the model P(X_V | θ) are unknown or change? (The best placement for parameters θ_old can miss the region with more variability under θ_new.)
- ... sensors fail?
- ... an adversary selects the outbreak scenario ("Attack here!")?

Robust prediction
Typical objective: minimize the average variance (MSE). But this can yield low average variance with a high maximum variance, in the most interesting part of the space! [Plot: confidence bands for pH value over horizontal positions in V.]
Instead: minimize the "width" of the confidence bands. For every location s ∈ V, define F_s(A) = Var(s) − Var(s | A). Minimizing the width means simultaneously maximizing all F_s(A). Each F_s(A) is (often) submodular! [Das & Kempe '07]
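To ground the notation, a sketch of F_s(A) for a jointly Gaussian model with covariance matrix K over the locations (an illustrative stand-in for the GP posterior computation; K, s, and A are placeholders):

```python
import numpy as np

def variance_reduction(K, s, A):
    """F_s(A) = Var(s) - Var(s | A) under a Gaussian with covariance K.
    K: covariance matrix over all locations; s: target index; A: observed indices."""
    if not A:
        return 0.0
    K_AA = K[np.ix_(A, A)]  # covariance among observed locations
    k_sA = K[s, A]          # cross-covariance between s and A
    posterior_var = K[s, s] - k_sA @ np.linalg.solve(K_AA, k_sA)
    return K[s, s] - posterior_var
```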

Adversarial observation selection
Given: possible observations V, submodular functions F_1, ..., F_m.
Want to solve: max_{|A| ≤ k} min_i F_i(A)
Can model many problems this way:
- Width of confidence bands: F_i is the variance reduction at location i (one F_i for each location)
- Unknown parameters: F_i is the information gain with parameters θ_i
- Adversarial outbreak scenarios: F_i is the utility for scenario i
- ...
Unfortunately, min_i F_i(A) is not submodular!

How does greedy do?
Set A   F_1   F_2   min_i F_i
{x}     1     0     0
{y}     0     2     0
{z}     ε     ε     ε
{x,y}   1     2     1    ← optimal solution (k = 2)
{x,z}   1     ε     ε
{y,z}   ε     2     ε
Greedy picks z first (the only singleton with nonzero min); then it can choose only x or y, ending with value ε. So greedy does arbitrarily badly as ε → 0.
Theorem: The problem max_{|A| ≤ k} min_i F_i(A) does not admit any approximation unless P = NP.
Is there something better?
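The failure is easy to reproduce with the greedy sketch above; F1 and F2 below are lookup tables encoding the example, with ε = 0.01 as an arbitrary small value:

```python
EPS = 0.01
F1 = {frozenset(): 0, frozenset("x"): 1, frozenset("y"): 0, frozenset("z"): EPS,
      frozenset("xy"): 1, frozenset("xz"): 1, frozenset("yz"): EPS, frozenset("xyz"): 1}
F2 = {frozenset(): 0, frozenset("x"): 0, frozenset("y"): 2, frozenset("z"): EPS,
      frozenset("xy"): 2, frozenset("xz"): EPS, frozenset("yz"): 2, frozenset("xyz"): 2}

def F_min(A):
    return min(F1[frozenset(A)], F2[frozenset(A)])

# Greedy grabs z for its tiny-but-nonzero min, then is stuck at value ε;
# the optimal set {x, y} achieves min value 1.
print(greedy(F_min, {"x", "y", "z"}, 2))  # e.g. {'z', 'x'}
```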

Alternative formulation
If somebody told us the optimal value c, could we recover the optimal solution A*? We would need to solve the dual problem:
A* = argmin_A |A| subject to min_i F_i(A) ≥ c
Is this any easier? Yes, if we relax the constraint |A| ≤ k.

Solving the alternative problem
Trick: for each F_i and c, define the truncation F'_i(A) = min(F_i(A), c) [plot: F_i(A) capped at c as |A| grows], and let F'_avg,c(A) = (1/m) Σ_i F'_i(A). Then
min_i F_i(A) ≥ c  ⟺  F'_avg,c(A) = c
Lemma: F'_avg,c(A) is submodular!
With c = 1 in our example:
Set A   F_1   F_2   F'_1   F'_2   F'_avg,1   min_i F_i
{x}     1     0     1      0      ½          0
{y}     0     2     0      1      ½          0
{z}     ε     ε     ε      ε      ε          ε
{x,y}   1     2     1      1      1          1
{x,z}   1     ε     1      ε      (1+ε)/2    ε
{y,z}   ε     2     ε      1      (1+ε)/2    ε
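A direct translation of the trick, assuming each F_i is a Python callable on sets as before:

```python
def make_truncated_avg(Fs, c):
    """Build F'_avg,c(A) = (1/m) * sum_i min(F_i(A), c).
    Truncation preserves submodularity, so F'_avg,c is submodular,
    and F'_avg,c(A) = c exactly when every F_i(A) >= c."""
    def F_avg(A):
        return sum(min(F(A), c) for F in Fs) / len(Fs)
    return F_avg
```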

Why is this useful?
Can use the greedy algorithm to find an (approximate) solution!
Proposition: The greedy algorithm finds A_G with F'_avg,c(A_G) = c and |A_G| ≤ α k, where α = 1 + log max_s Σ_i F_i({s}).

Back to our example
Guess c = 1: greedy on F'_avg,1 first picks x, then picks y. Optimal solution!
Set A   F_1   F_2   min_i F_i   F'_avg,1
{x}     1     0     0           ½
{y}     0     2     0           ½
{z}     ε     ε     ε           ε
{x,y}   1     2     1           1
{x,z}   1     ε     ε           (1+ε)/2
{y,z}   ε     2     ε           (1+ε)/2
How do we find c?
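Continuing the toy example, greedy on the truncated average recovers the optimal set (reusing greedy, make_truncated_avg, F1, and F2 from the sketches above):

```python
F_avg = make_truncated_avg(
    [lambda A: F1[frozenset(A)], lambda A: F2[frozenset(A)]], c=1.0)
print(greedy(F_avg, {"x", "y", "z"}, 2))  # {'x', 'y'}, where F'_avg,1 = 1 = c
```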

Submodular Saturation Algorithm
Given set V, integer k, and functions F_1, ..., F_m:
- Initialize c_min = 0, c_max = min_i F_i(V)
- Do binary search on c = (c_min + c_max)/2:
  - Use the greedy algorithm to find A_G such that F'_avg,c(A_G) = c
  - If |A_G| > α k: c is too high, so decrease c_max
  - If |A_G| ≤ α k: c is too low, so increase c_min
- ... until convergence
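Putting the pieces together, a compact sketch of Saturate under the same assumptions (greedy selection and make_truncated_avg as above; the tolerance tol and float slack are implementation details, not from the talk):

```python
def saturate(Fs, ground_set, k, alpha, tol=1e-3):
    """Binary search over the guessed value c; for each c, greedily grow A
    until the truncated average F'_avg,c saturates at c."""
    c_min, c_max = 0.0, min(F(ground_set) for F in Fs)
    A_best = set()
    while c_max - c_min > tol:
        c = (c_min + c_max) / 2
        F_avg = make_truncated_avg(Fs, c)
        A = set()
        # Small slack guards against float rounding in the saturation test.
        while F_avg(A) < c - 1e-9 and len(A) < len(ground_set):
            A |= {max(ground_set - A, key=lambda s: F_avg(A | {s}) - F_avg(A))}
        if len(A) > alpha * k:
            c_max = c              # c too high: needs too many observations
        else:
            c_min, A_best = c, A   # c achievable within alpha*k observations
    return A_best
```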

Theoretical guarantees
Theorem: The problem max_{|A| ≤ k} min_i F_i(A) does not admit any approximation unless P = NP.
Theorem: Saturate finds a solution A_S such that
min_i F_i(A_S) ≥ OPT_k and |A_S| ≤ α k,
where OPT_k = max_{|A| ≤ k} min_i F_i(A) and α = 1 + log max_s Σ_i F_i({s}).
Theorem: If there were a polytime algorithm with a better constant β < α, then NP ⊆ DTIME(n^{log log n}).

Experiments
- Minimizing maximum variance in GP regression
- Robust biological experimental design
- Outbreak detection against adversarial contaminations
Goals: compare against the state of the art; analyze the appropriateness of the "worst-case" assumption.

Spatial prediction
Compare to the state of the art [Sacks et al. '88, Wiens '05, ...]: highly tuned simulated annealing heuristics (7 parameters). Saturate is competitive and faster, and does better on larger problems.
[Plots: maximum marginal variance vs. number of sensors for Greedy, Simulated Annealing, and Saturate, on environmental monitoring and precipitation data; lower is better.]

Maximum vs. average variance
Minimizing the worst case leads to a good average-case score, but not vice versa. [Plots: environmental monitoring and precipitation data; lower is better.]

Outbreak detection
Results are even more prominent on water network monitoring (12,527 nodes). [Plots: water networks; lower is better.]

Robust experimental design
Learn the parameters θ of a nonlinear function y_i = f(x_i, θ) + w. Choose stimuli x_i to facilitate maximum-likelihood estimation of θ. Difficult optimization problem!
Common approach: linearization:
y_i ≈ f(x_i, θ_0) + ∇f_{θ_0}(x_i)^T (θ − θ_0) + w
This allows a nice closed-form (fractional) solution! But how should we choose θ_0?
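For concreteness, a sketch of this linearization for the Michaelis-Menten model used in the experiments below, assuming the standard form f(x, θ) = θ_1 x / (θ_2 + x); the values of θ_0 and x are arbitrary placeholders:

```python
import numpy as np

def f(x, theta):
    """Michaelis-Menten response: theta[0] * x / (theta[1] + x)."""
    return theta[0] * x / (theta[1] + x)

def grad_f(x, theta0):
    """Gradient of f w.r.t. theta at theta0, the row used in the linearized design."""
    t1, t2 = theta0
    return np.array([x / (t2 + x),              # d f / d theta_1
                     -t1 * x / (t2 + x) ** 2])  # d f / d theta_2

theta0 = np.array([1.0, 0.5])  # placeholder initial estimate
x = 2.0
# Linearized model: y ≈ f(x, theta0) + grad_f(x, theta0) @ (theta - theta0) + w
```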

Robust experimental design
State of the art [Flaherty et al., NIPS '06]:
- Assume a perturbation of the Jacobian ∇f_{θ_0}(x_i)
- Solve a robust SDP against the worst-case perturbation
- Minimize the maximum eigenvalue of the estimation error (E-optimality)
This paper:
- Assume a perturbation of the initial parameter estimate θ_0
- Use Saturate to perform well against all initial parameter estimates
- Minimize the MSE of the parameter estimate (Bayesian A-optimality, typically submodular!)

Experimental setup
Estimate the parameters of the Michaelis-Menten model (to compare results). Evaluate the efficiency of designs: the loss of the optimal design, knowing the true parameter θ_true, versus the loss of the robust design, assuming a (wrong) initial parameter θ_0.

Robust design results
Saturate is more efficient than the SDP approach when optimizing for high parameter uncertainty. [Plots: design efficiency under low vs. high uncertainty in θ_0, conditions A, B, C; higher is better.]

Future (current) work
- Incorporating complex constraints (communication, etc.)
- Dealing with large numbers of objectives: constraint generation
- Improved guarantees for certain objectives (sensor failures)
- Trading off worst-case and average-case scores [Plot: expected score vs. adversarial score for k = 5, 10, 15, 20.]

Conclusions
- Many observation selection problems require optimizing an adversarially chosen submodular function
- The problem is not approximable to any factor!
- Presented an efficient algorithm, Saturate:
  - Achieves the optimal score, with a bounded increase in cost
  - Guarantees are best possible under reasonable complexity assumptions
- Saturate performs well on real-world problems:
  - Outperforms state-of-the-art simulated annealing algorithms for sensor placement, with no parameters to tune
  - Compares favorably with SDP-based solutions for robust experimental design