
74 nips-2005-Faster Rates in Regression via Active Learning


Source: pdf

Author: Rebecca Willett, Robert Nowak, Rui M. Castro

Abstract: This paper presents a rigorous statistical analysis characterizing regimes in which active learning significantly outperforms classical passive learning. Active learning algorithms are able to make queries or select sample locations in an online fashion, depending on the results of the previous queries. In some regimes, this extra flexibility leads to significantly faster rates of error decay than those possible in classical passive learning settings. The nature of these regimes is explored by studying fundamental performance limits of active and passive learning in two illustrative nonparametric function classes. In addition to examining the theoretical potential of active learning, this paper describes a practical algorithm capable of exploiting the extra flexibility of the active setting and provably improving upon the classical passive techniques. Our active learning theory and methods show promise in a number of applications, including field estimation using wireless sensor networks and fault line detection.
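To make the query-selection idea concrete, the sketch below illustrates active sampling for one-dimensional change-point (fault line) estimation in the spirit of the bisection scheme of Burnashev and Zigangirov [6]: each query location is chosen based on the outcomes of previous queries, so the interval known to contain the change point shrinks geometrically. This is a minimal illustrative sketch, not the algorithm of the paper; the noise model, the repeat count, and names such as noisy_query and active_bisection are assumptions made here for illustration.

    # Illustrative sketch of active (feedback-driven) sampling for
    # change-point estimation; not the authors' algorithm. The step
    # location theta, noise level sigma, and repeat count are assumed
    # values chosen only for this example.
    import random

    def noisy_query(x, theta=0.3, sigma=0.5):
        """Observe a noisy step function: 1{x >= theta} + Gaussian noise."""
        return (1.0 if x >= theta else 0.0) + random.gauss(0.0, sigma)

    def active_bisection(budget, repeats=8):
        """Estimate the step location using at most `budget` queries.

        Each midpoint is queried `repeats` times and the average decides
        which half-interval retains the change point, so the interval
        width (and hence the localization error) shrinks geometrically
        in the number of queries.
        """
        lo, hi = 0.0, 1.0
        used = 0
        while used + repeats <= budget:
            mid = 0.5 * (lo + hi)
            avg = sum(noisy_query(mid) for _ in range(repeats)) / repeats
            used += repeats
            if avg >= 0.5:
                # Midpoint looks like the "high" side: step is to the left.
                hi = mid
            else:
                # Midpoint looks like the "low" side: step is to the right.
                lo = mid
        return 0.5 * (lo + hi)

    if __name__ == "__main__":
        random.seed(0)
        print(active_bisection(budget=200))  # close to theta = 0.3

Under a scheme of this kind the localization error decays exponentially in the query budget, whereas uniform (passive) sampling of the same function yields only polynomial decay; quantifying gaps of this type for nonparametric function classes is the subject of the paper.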


reference text

[1] D. Cohn, Z. Ghahramani, and M. I. Jordan, “Active learning with statistical models,” Journal of Artificial Intelligence Research, vol. 4, pp. 129–145, 1996.

[2] D. J. C. MacKay, “Information-based objective functions for active data selection,” Neural Computation, vol. 4, no. 4, pp. 590–604, 1992.

[3] Y. Freund, H. S. Seung, E. Shamir, and N. Tishby, “Information, prediction, and query by committee,” in Proc. Advances in Neural Information Processing Systems, 1993.

[4] K. Sung and P. Niyogi, “Active learning for function approximation,” in Proc. Advances in Neural Information Processing Systems, vol. 7, 1995.

[5] G. Blanchard and D. Geman, “Hierarchical testing designs for pattern recognition,” to appear in The Annals of Statistics, 2005.

[6] M. V. Burnashev and K. Sh. Zigangirov, “An interval estimation problem for controlled observations,” Problems of Information Transmission, vol. 10, pp. 223–231, 1974.

[7] P. Hall and I. Molchanov, “Sequential methods for design-adaptive estimation of discontinuities in regression curves and surfaces,” The Annals of Statistics, vol. 31, no. 3, pp. 921–941, 2003.

[8] A. P. Korostelev, “On minimax rates of convergence in image models under sequential design,” Statistics & Probability Letters, vol. 43, pp. 369–375, 1999.

[9] R. Willett, A. Martin, and R. Nowak, “Backcasting: Adaptive sampling for sensor networks,” in Proc. Information Processing in Sensor Networks, Berkeley, CA, USA, April 26–27, 2004.

[10] C. J. Stone, “Optimal rates of convergence for nonparametric estimators,” The Annals of Statistics, vol. 8, no. 6, pp. 1348–1360, 1980.

[11] A. P. Korostelev and A. B. Tsybakov, Minimax Theory of Image Reconstruction, Lecture Notes in Statistics, Springer, 1993.

[12] R. Castro, R. Willett, and R. Nowak, “Fast rates in regression via active learning,” Tech. Rep. ECE-05-3, University of Wisconsin, Madison, June 2005. Available at http://homepages.cae.wisc.edu/~rcastro/ECE-05-3.pdf.

[13] A. B. Tsybakov, Introduction à l’estimation non-paramétrique, Mathématiques et Applications, vol. 41, Springer, 2004.

[14] R. Nowak, U. Mitra, and R. Willett, “Estimating inhomogeneous fields using wireless sensor networks,” IEEE Journal on Selected Areas in Communications, vol. 22, no. 6, pp. 999–1006, 2004.

[15] A. R. Barron, “Complexity regularization with application to artificial neural networks,” in Nonparametric Functional Estimation and Related Topics, pp. 561–576. Kluwer Academic Publishers, 1991.