jmlr jmlr2008 jmlr2008-83 jmlr2008-83-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Andreas Krause, H. Brendan McMahan, Carlos Guestrin, Anupam Gupta
Abstract: In many applications, one has to actively select among a set of expensive observations before making an informed decision. For example, in environmental monitoring, we want to select locations to measure in order to most effectively predict spatial phenomena. Often, we want to select observations which are robust against a number of possible objective functions. Examples include minimizing the maximum posterior variance in Gaussian Process regression, robust experimental design, and sensor placement for outbreak detection. In this paper, we present the Submodular Saturation algorithm, a simple and efficient algorithm with strong theoretical approximation guarantees for cases where the possible objective functions exhibit submodularity, an intuitive diminishing returns property. Moreover, we prove that better approximation algorithms do not exist unless NP-complete problems admit efficient algorithms. We show how our algorithm can be extended to handle complex cost functions (incorporating non-unit observation cost or communication and path costs). We also show how the algorithm can be used to near-optimally trade off expected-case (e.g., the Mean Square Prediction Error in Gaussian Process regression) and worst-case (e.g., maximum predictive variance) performance. We show that many important machine learning problems fit our robust submodular observation selection formalism, and provide extensive empirical evaluation on several real-world problems. For Gaussian Process regression, our algorithm compares favorably with state-of-the-art heuristics described in the geostatistics literature, while being simpler, faster and providing theoretical guarantees. For robust experimental design, our algorithm performs favorably compared to SDP-based algorithms. c 2008 Andreas Krause, H. Brendan McMahan, Carlos Guestrin and Anupam Gupta. K RAUSE , M C M AHAN , G UESTRIN AND G UPTA Keywords: observation selection, experimental design, active learning, submodular functions, Gaussi
B. M. Anthony, V. Goyal, A. Gupta, and V. Nagarajan. A plant location guide for the unsure. In SODA, 2008. 2796 ROBUST S UBMODULAR O BSERVATION S ELECTION A. Asadpour and A. Saberi. An approximation algorithm for max-min fair allocation of indivisible goods. In STOC, pages 114–121, New York, NY, USA, 2007. ACM Press. ISBN 978-1-59593631-8. doi: http://doi.acm.org/10.1145/1250790.1250808. I. Averbakh. On the complexity of a class of combinatorial optimization problems with uncertainty. Mathematical Programming, 90:263–272, 2001. S. Axelrod, S. Fine, R. Gilad-Bachrach, R. Mendelson, and N. Tishby. The information of observations and application for active learning with uncertainty. Technical report, Jerusalem: Leibniz Center, Hebrew University, 2001. M. F. Balcan, A. Beygelzimer, and J. Langford. Agnostic active learning. In ICML, 2006. J. Bar-Ilan, G. Kortsarz, and D. Peleg. Generalized submodular cover problems and applications. Theoretical Computer Science, 250(1-2):179–200, January 2001. J. F. Benders. Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik, 4:238–252, 1962. J. Berger. Robustness of Bayesian Analyses, chapter The robust Bayesian viewpoint, page 63144. North-Holland, 1984. D. Bertsimas and M. Sim. Robust discrete optimization and network flows. Mathematical Programming, 98:49–71, 2003. A. Blum, S. Chawla, D. R. Karger, T. Lane, A. Meyerson, and M. Minkoff. Approximation algorithms for orienteering and discounted-reward tsp. In FOCS, page 46, 2003. S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge UP, March 2004. T.M. Burgess, R. Webster, and A.B. McBratney. Optimal interpolation and isarithmic mapping of soil properties. iv. sampling strategy. Journal of Soil Science, 32:643–659, 1981. R. D. Carr, H. J. Greenberg, W. E. Hart, G. Konjevod, E. Lauer, H. Lin, T. Morrison, and C. A. Phillips. Robust optimization of contaminant sensor placement for community water systems. Mathematical Programming Series B, 107:337–356, 2006. K. Chaloner and I. Verdinelli. Bayesian experimental design: A review. Statistical Science, 10(3): 273–304, Aug. 1995. ISSN 08834237. J. Chuzhoy, S. Guha, E. Halperin, S. Khanna, G. Kortsarz, R. Krauthgamer, and J. Naor. Asymmetric k-center is log∗ n-hard to approximate. Journal of the ACM, 52(4):538–551, 2005. D. A. Cohn. Neural network exploration using optimal experiment design. In Jack D. Cowan, Gerald Tesauro, and Joshua Alspector, editors, Advances in Neural Information Processing Systems, volume 6, pages 679–686. Morgan Kaufmann Publishers, Inc., 1994. N. A. C. Cressie. Statistics for Spatial Data. Wiley, 1991. A. Das and D. Kempe. Algorithms for subset selection in linear regression. In ACM Symposium on the Theory of Computing (STOC), 2008. 2797 K RAUSE , M C M AHAN , G UESTRIN AND G UPTA H. A. Dror and D. M. Steinberg. Robust experimental design for multivariate generalized linear models. Technometrics, 48(4):520–529, 2006. U. Feige. On maximizing welfare when utility functions are subadditive. In STOC, 2006. U. Feige. A threshold of ln n for approximating set cover. J. ACM, 45(4), 1998. P. Flaherty, M. Jordan, and A. Arkin. Robust design of biological experiments. In NIPS, 2006. Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Selective sampling using the query by committee algorithm. Machine Learning, 28(2-3):133–168, 1997. T. Fujito. Approximation algorithms for submodular set cover with applications. TIEICE, 2000. URL citeseer.ist.psu.edu/article/fujito00approximation.html. A. Globerson and S. Roweis. Nightmare at test time: Robust learning by feature deletion. In ICML, 2006. D. Golovin and M. Streeter. Online algorithms for maximizing submodular set functions. In Submitted to SODA, 2008. T. F. Gonzalez. Clustering to minimize the maximum intercluster distance. Theoret. Comput. Sci., 38(2-3):293–306, 1985. ISSN 0304-3975. C. Guestrin, A. Krause, and A. Singh. Near-optimal sensor placements in Gaussian processes. In Machine Learning, Proceedings of the Twenty-Second International Conference (ICML), 2005. T. C. Harmon, R. F. Ambrose, R. M. Gilbert, J. C. Fisher, M. Stealey, and W. J. Kaiser. High resolution river hydraulic and water quality characterization using rapidly deployable networked infomechanical systems (nims rd). Technical Report 60, CENS, 2006. D. Hochbaum and D. Shmoys. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2):180–184, 1985. S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu. Batch mode active learning and its application to medical image classification. In ICML, 2006. S. Iwata, L. Fleischer, and S. Fujishige. A combinatorial strongly polynomial algorithm for minimizing submodular functions. Journal of the ACM, 48(4):761–777, 2001. M. Johanson, M. Zinkevich, and M. Bowling. Computing robust counter-strategies. In NIPS, 2007. D. S. Johnson, M. Minkoff, and S. Phillips. The prize collecting steiner tree problem: theory and practice. In SODA, 2000. P. Kouvelis and G. Yu. Robust Discrete Optimization and its Applications. Kluwer Academic Publishers, 1997. A. Krause and C. Guestrin. Near-optimal value of information in graphical models. In UAI, 2005. A. Krause and C. Guestrin. Near-optimal observation selection using submodular functions. In AAAI Nectar track, 2007a. 2798 ROBUST S UBMODULAR O BSERVATION S ELECTION A. Krause and C. Guestrin. Nonmyopic active learning of gaussian processes: An exploration— exploitation approach. In ICML, 2007b. A. Krause, C. Guestrin, A. Gupta, and J. Kleinberg. Near-optimal sensor placements: Maximizing information while minimizing communication cost. In Proceedings of the Fifth International Symposium on Information Processing in Sensor Networks (IPSN), 2006. A. Krause, B. McMahan, C. Guestrin, and A. Gupta. Selecting observations against adversarial objectives. In NIPS, 2007a. A. Krause, A. Singh, and C. Guestrin. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. In To appear in the JMLR, 2007b. G. Laporte and S. Martello. The selective travelling salesman problem. Disc. App. Math, 26:193– 207, 1990. N. Lawrence, M. Seeger, and R. Herbrich. Fast sparse gaussian process methods: The informative vector machine. In Advances in Neural Information Processing Systems (NIPS) 16, 2003. J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In KDD, 2007. L. Lov´ sz. Submodular functions and convexity. Mathematical Programming - State of the Art, a pages 235–257, 1983. D. MacKay. Information-based objective functions for active data selection. Neural Computation, 4(4):590–604, 1992. S. Martello and P. Toth. Knapsack Problems: Algorithms and Computer Implementations. John Wiley & Sons, 1990. A. Meliou, A. Krause, C. Guestrin, and J. M. Hellerstein. Nonmyopic informative path planning in spatio-temporal models. In AAAI, 2007. E. Minieka. The m-center problem. SIAM Rev, 12(1):138–139, 1970. N. Mladenovic, M. Labb´ , and P. Hansen. Solving the p-center problem with tabu search and e variable neighborhood search. Networks, 42(1):48–64, 2003. M. Narasimhan and J. Bilmes. Pac-learning bounded tree-width graphical models. In Uncertainty in Artificial Intelligence, 2004. M. Narasimhan, N. Jojic, and J. Bilmes. Q-clustering. In NIPS, 2005. G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14:265–294, 1978. A. Ostfeld, J. G. Uber, and E. Salomons. Battle of water sensor networks: A design challenge for engineers and algorithms. In 8th Symposium on Water Distribution Systems Analysis, 2006. 2799 K RAUSE , M C M AHAN , G UESTRIN AND G UPTA A. Ostfeld, J. G. Uber, E. Salomons, J. W. Berry, W. E. Hart, C. A. Phillips, J. Watson, G. Dorini, P. Jonkergouw, Z. Kapelan, F. di Pierro, S. Khu, D. Savic, D. Eliades, M. Polycarpou, S. R. Ghimire, B. D. Barkdoll, R. Gueli, J. J. Huang, E. A. McBean, W. James, A. Krause, J. Leskovec, S. Isovitsch, J. Xu, C. Guestrin, J. VanBriesen, M. Small, P. Fischbeck, A. Preis, M. Propato, O. Piller, G. B. Trachtman, Z. Y. Wu, and T. Walski. The battle of the water sensor networks (BWSN): A design challenge for engineers and algorithms. To appear in the Journal of Water Resources Planning and Management, 2008. R. Panigrahy and S. Vishwanathan. An O (log∗ n) approximation algorithm for the asymmetric pcenter problem. Journal of Algorithms, 27(2):259–268, 1998. C. H. Papadimitriou and M. Yannakakis. The complexity of tradeoffs, and optimal access of web sources. In FOCS, 2000. J. Pilz, G. Spoeck, and M. G. Schimek. Geostatistics Wollongong, volume 1, chapter Taking account of uncertainty in spatial covariance estimation, pages 302–313. Kluwer, 1996. A. K. Ponnuswami and S. Khot. Approximation algorithms for the max-min allocation problem. In APPROX, 2007. R. Price and P. R. Messinger. Optimal recommendation sets: Covering uncertainty over user preferences. In AAAI, 2005. M. Queyranne. A combinatorial algorithm for minimizing symmetric submodular functions. In SODA, 1995. C. E. Rasmussen and C. K. I. Williams. Gaussian Process for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, 2006. T. G. Robertazzi and S. C. Schwartz. An accelerated sequential algorithm for producing D-optimal designs. SIAM Journal of Scientific and Statistical Computing, 10(2):341–358, March 1989. L. A. Rossman. The epanet programmer’s toolkit for analysis of water distribution systems. In Annual Water Resources Planning and Management Conference, 1999. J. Sacks and S. Schiller. Statistical Decision Theory and Related Topics IV, Vol. 2. Springer, 1988. A. Schrijver. A combinatorial algorithm minimizing submodular functions in strongly polynomial time. J. Combin. Theory Ser. B, 80(2):346–355, 2000. ISSN 0095-8956. M. Seeger. Greedy forward selection in the informative vector machine. Technical report, University of California at Berkeley, 2004. M. Seeger, C. K. I. Williams, and N. D. Lawrence. Fast forward selection to speed up sparse gaussian process regression. In Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics (AISTATS), 2003. S. Seo, M. Wallat, T. Graepel, and K. Obermayer. Gaussian process regression: Active data selection and test point rejection. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), volume 3, pages 241–246, 2000. 2800 ROBUST S UBMODULAR O BSERVATION S ELECTION A. Singh, A. Krause, C. Guestrin, W. Kaiser, and M. Batalin. Efficient planning of informative paths for multiple robots. In IJCAI, 2007. P. Sollich. Learning from minimum entropy queries in a large committee machine. Physical Review E, 53:R2060–R2063, 1996. J.W. van Groenigen and A. Stein. Constrained optimization of spatial sampling using continuous simulated annealing. J. Environ. Qual., 27:1078–1086, 1998. V. V. Vazirani. Approximation Algorithms. Springer, 2003. J. Watson, W. E. Hart, and R. Murray. Formulation and optimization of robust sensor placement problems for contaminant warning systems. In Water Distribution System Symposium, 2006. M. Widmann and C. S. Bretherton. 50 km resolution daily precipitation for the pacific northwest. http://www.jisao.washington.edu/data sets/widmann/, May 1999. D. P. Wiens. Robustness in spatial studies ii: minimax design. Environmetrics, 16:205–217, 2005. L.A. Wolsey. An analysis of the greedy algorithm for the submodular set covering problem. Combinatorica, 2:385–393, 1982. 2801