nips nips2007 nips2007-174 nips2007-174-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Andreas Krause, Brendan Mcmahan, Carlos Guestrin, Anupam Gupta
Abstract: In many applications, one has to actively select among a set of expensive observations before making an informed decision. Often, we want to select observations which perform well when evaluated with an objective function chosen by an adversary. Examples include minimizing the maximum posterior variance in Gaussian Process regression, robust experimental design, and sensor placement for outbreak detection. In this paper, we present the Submodular Saturation algorithm, a simple and efficient algorithm with strong theoretical approximation guarantees for the case where the possible objective functions exhibit submodularity, an intuitive diminishing returns property. Moreover, we prove that better approximation algorithms do not exist unless NP-complete problems admit efficient algorithms. We evaluate our algorithm on several real-world problems. For Gaussian Process regression, our algorithm compares favorably with state-of-the-art heuristics described in the geostatistics literature, while being simpler, faster and providing theoretical guarantees. For robust experimental design, our algorithm performs favorably compared to SDP-based algorithms. 1
[1] A. Das and D. Kempe. Algorithms for subset selection in linear regression. In Manuscript, 2007.
[2] A. Krause, A. Singh, and C. Guestrin. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. In To appear in the JMLR, 2007.
[3] G. Nemhauser, L. Wolsey, and M. Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14:265–294, 1978.
[4] U. Feige. A threshold of ln n for approximating set cover. J. ACM, 45(4), 1998.
[5] A. Krause, B. McMahan, C. Guestrin, and A. Gupta. Robust submodular observation selection. Technical report, CMU-ML-08-100, 2008.
[6] C. E. Rasmussen and C. K. I. Williams. Gaussian Process for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, 2006.
[7] P. Flaherty, M. Jordan, and A. Arkin. Robust design of biological experiments. In NIPS, 2006.
[8] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge UP, March 2004.
[9] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In KDD, 2007.
[10] T. Fujito. Approximation algorithms for submodular set cover with applications. TIEICE, 2000.
[11] L.A. Wolsey. An analysis of the greedy algorithm for the submodular set covering problem. Combinatorica, 2:385–393, 1982.
[12] T. G. Robertazzi and S. C. Schwartz. An accelerated sequential algorithm for producing D-optimal designs. SIAM Journal of Scientific and Statistical Computing, 10(2):341–358, March 1989.
[13] M. Widmann and C. S. Bretherton. 50 km resolution daily precipitation for the pacific northwest. http://www.jisao.washington.edu/data sets/widmann/, May 1999.
[14] J. Sacks and S. Schiller. Statistical Decision Theory and Related Topics IV, Vol. 2. Springer, 1988.
[15] D. P. Wiens. Robustness in spatial studies ii: minimax design. Environmetrics, 16:205–217, 2005.
[16] A. Ostfeld, J. G. Uber, and E. Salomons. Battle of water sensor networks: A design challenge for engineers and algorithms. In 8th Symposium on Water Distribution Systems Analysis, 2006.
[17] L. A. Rossman. The epanet programmer’s toolkit for analysis of water distribution systems. In Annual Water Resources Planning and Management Conference, 1999. 8