nips2010-22-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Christoph Sawade, Niels Landwehr, Tobias Scheffer
Abstract: We address the problem of estimating the Fα-measure of a given model as accurately as possible on a fixed labeling budget. This problem occurs whenever an estimate cannot be obtained from held-out training data; for instance, when data that have been used to train the model are held back for reasons of privacy or do not reflect the test distribution. In this case, new test instances have to be drawn and labeled at a cost. An active estimation procedure selects instances according to an instrumental sampling distribution. An analysis of the sources of estimation error leads to an optimal sampling distribution that minimizes estimator variance. We explore conditions under which active estimates of Fα-measures are more accurate than estimates based on instances sampled from the test distribution.
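To make the estimation setting concrete, the following is a minimal sketch in Python of an importance-weighted plug-in estimate of the Fα-measure when labeled test instances are drawn from an instrumental sampling distribution q instead of the test distribution p. It illustrates the general idea only, not the authors' optimal sampling procedure; the function and variable names are assumptions.

import numpy as np

# Minimal sketch, assuming the F_alpha-measure of van Rijsbergen [10]:
# an importance-weighted plug-in estimate computed from instances drawn
# under an instrumental distribution q rather than the test distribution p.
# This illustrates the general idea, not the authors' exact estimator;
# all names below are hypothetical.

def f_alpha_estimate(y_true, y_pred, weights, alpha=0.5):
    """Weighted plug-in estimate of F_alpha.

    weights[i] is assumed to be the importance weight p(x_i) / q(x_i)
    for an instance x_i sampled from the instrumental distribution q.
    """
    y_true = np.asarray(y_true, dtype=float)   # 1 = positive label
    y_pred = np.asarray(y_pred, dtype=float)   # 1 = predicted positive
    w = np.asarray(weights, dtype=float)

    tp = np.sum(w * y_pred * y_true)           # weighted true positives
    fp = np.sum(w * y_pred * (1 - y_true))     # weighted false positives
    fn = np.sum(w * (1 - y_pred) * y_true)     # weighted false negatives

    # F_alpha = tp / (tp + alpha * fp + (1 - alpha) * fn)
    denom = tp + alpha * fp + (1 - alpha) * fn
    return tp / denom if denom > 0 else 0.0

# Example: five labeled test instances sampled from q, with known weights.
print(f_alpha_estimate(y_true=[1, 0, 1, 1, 0],
                       y_pred=[1, 0, 0, 1, 1],
                       weights=[0.8, 1.2, 1.5, 0.9, 1.1],
                       alpha=0.5))

Because the importance weights appear in both the numerator and the denominator, this ratio estimator is effectively self-normalizing, which is why instances may be drawn from a distribution other than the test distribution without biasing the estimate asymptotically.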
[1] F. Bach. Active learning for misspecified generalized linear models. In Advances in Neural Information Processing Systems, 2007.
[2] A. Beygelzimer, S. Dasgupta, and J. Langford. Importance weighted active learning. In Proceedings of the International Conference on Machine Learning, 2009.
[3] H. Cramér. Mathematical Methods of Statistics, chapter 20. Princeton University Press, 1946.
[4] A. Frank and A. Asuncion. UCI machine learning repository, 2010.
[5] S. Geman, E. Bienenstock, and R. Doursat. Neural networks and the bias/variance dilemma. Neural Computation, 4:1–58, 1992.
[6] J. Hammersley and D. Handscomb. Monte Carlo Methods. Taylor & Francis, 1964.
[7] C. Sawade, N. Landwehr, S. Bickel, and T. Scheffer. Active risk estimation. In Proceedings of the 27th International Conference on Machine Learning, 2010.
[8] M. Sugiyama. Active learning in approximately linear regression based on conditional expectation of generalization error. Journal of Machine Learning Research, 7:141–166, 2006.
[9] S. Tong and D. Koller. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research, 2:45–66, 2002.
[10] C. van Rijsbergen. Information Retrieval. Butterworths, 2nd edition, 1979.
[11] M. Yamada, M. Sugiyama, and T. Matsui. Semi-supervised speaker identification under covariate shift. Signal Processing, 90(8):2353–2361, 2010.