nips nips2012 nips2012-305 nips2012-305-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Quanquan Gu, Tong Zhang, Jiawei Han, Chris H. Ding
Abstract: In many practical machine learning problems, acquiring labeled data is expensive and/or time consuming. This motivates us to study the following problem: given a label budget, how should data points be selected for labeling so that learning performance is optimized? We propose a selective labeling method based on an analysis of the out-of-sample error of Laplacian regularized Least Squares (LapRLS). In particular, we derive a deterministic out-of-sample error bound for LapRLS trained on subsampled data, and propose to select the subset of data points to label by minimizing this upper bound. Since the minimization is a combinatorial problem, we relax it to the continuous domain and solve it by projected gradient descent. Experiments on benchmark datasets show that the proposed method outperforms state-of-the-art methods.
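The abstract describes a relax-and-round recipe: replace the hard subset-selection variable with a soft indicator a ∈ [0,1]^n satisfying Σᵢ aᵢ = m, run projected gradient descent on a smooth surrogate of the error bound, and round the result back to m labeled points. Below is a minimal sketch of that generic idea in Python/NumPy. It is not the paper's actual bound: the A-optimality-style criterion tr[(K diag(a) K + λK + γL)⁻¹], all function names, and the top-m rounding are illustrative assumptions standing in for the LapRLS bound derived in the paper.

```python
import numpy as np

def project_capped_simplex(v, m, tol=1e-8):
    """Euclidean projection onto {a : 0 <= a <= 1, sum(a) = m},
    by bisection on the shift tau in clip(v - tau, 0, 1)."""
    lo, hi = v.min() - 1.0, v.max()
    while hi - lo > tol:
        tau = 0.5 * (lo + hi)
        if np.clip(v - tau, 0.0, 1.0).sum() > m:
            lo = tau          # sum too large: shift further down
        else:
            hi = tau
    return np.clip(v - 0.5 * (lo + hi), 0.0, 1.0)

def projected_gradient_select(K, L, m, lam=1e-2, gamma=1e-2,
                              lr=0.1, iters=200):
    """Relax-and-round selection sketch (a stand-in, NOT the paper's bound).

    Minimizes the A-optimality-style surrogate
        g(a) = tr[(K diag(a) K + lam*K + gamma*L)^{-1}]
    over the soft indicator a in [0,1]^n with sum(a) = m,
    then rounds to the m points with the largest scores.
    """
    n = K.shape[0]
    a = np.full(n, m / n)                      # feasible starting point
    for _ in range(iters):
        M = K @ np.diag(a) @ K + lam * K + gamma * L
        Minv = np.linalg.inv(M)
        # dg/da_i = -k_i^T M^{-2} k_i, where k_i is the i-th column of K
        grad = -np.einsum('ij,jk,ki->i', K, Minv @ Minv, K)
        a = project_capped_simplex(a - lr * grad, m)
    return np.argsort(-a)[:m]                  # heuristic rounding: top-m

if __name__ == "__main__":
    # Toy demo: RBF kernel and unnormalized graph Laplacian on 100 points.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 5))
    D2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    K = np.exp(-D2)
    W = K * (D2 < np.median(D2))               # crude thresholded affinity
    L = np.diag(W.sum(axis=1)) - W
    print(projected_gradient_select(K, L, m=10))
```

The bisection projection works because the coordinate-wise clip(v − τ, 0, 1) has a sum that decreases monotonically in τ, so the budget constraint Σᵢ aᵢ = m pins down a unique τ. Rounding by keeping the m largest relaxed scores is one simple heuristic; the paper's own rounding and optimization details may differ.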
[1] A. Atkinson, A. Donev, and R. Tobias. Optimum Experimental Designs. Oxford University Press, 2007.
[2] M.-F. Balcan, A. Beygelzimer, and J. Langford. Agnostic active learning. In ICML, pages 65–72, 2006.
[3] P. L. Bartlett and S. Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463–482, 2002.
[4] M. Belkin, P. Niyogi, and V. Sindhwani. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7:2399–2434, 2006.
[5] A. Beygelzimer, D. Hsu, J. Langford, and T. Zhang. Agnostic active learning without constraints. In NIPS, pages 199–207, 2010.
[6] O. Bousquet and A. Elisseeff. Stability and generalization. Journal of Machine Learning Research, 2:499–526, 2002.
[7] O. Chapelle, B. Schölkopf, and A. Zien, editors. Semi-Supervised Learning. MIT Press, Cambridge, MA, 2006.
[8] F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.
[9] D. A. Cohn, L. E. Atlas, and R. E. Ladner. Improving generalization with active learning. Machine Learning, 15(2):201–221, 1994.
[10] D. A. Cohn, Z. Ghahramani, and M. I. Jordan. Active learning with statistical models. In NIPS, pages 705–712, 1994.
[11] M. A. Fanty and R. A. Cole. Spoken letter recognition. In NIPS, pages 220–226, 1990.
[12] G. H. Golub and C. F. Van Loan. Matrix Computations (3rd ed.). Johns Hopkins University Press, Baltimore, MD, USA, 1996.
[13] A. Guillory and J. Bilmes. Active semi-supervised learning using submodular functions. In UAI, pages 274–282, 2011.
[14] S. Hanneke. Rates of convergence in active learning. The Annals of Statistics, 39(1):333–361, 2011.
[15] T. Hastie, R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer-Verlag, New York, 2001.
[16] X. He, W. Min, D. Cai, and K. Zhou. Laplacian optimal design for image retrieval. In SIGIR, pages 119–126, 2007.
[17] B. Korte and J. Vygen. Combinatorial Optimization: Theory and Algorithms. Springer, 4th edition, 2007.
[18] G. Schohn and D. Cohn. Less is more: Active learning with support vector machines. In ICML, pages 839–846, 2000.
[19] S. Tong and D. Koller. Support vector machine active learning with applications to text classification. In ICML, pages 999–1006, 2000.
[20] K. Yu, J. Bi, and V. Tresp. Active learning via transductive experimental design. In ICML, pages 1081–1088, 2006.
[21] D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Schölkopf. Learning with local and global consistency. In NIPS, 2003.
[22] X. Zhu, Z. Ghahramani, and J. D. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In ICML, pages 912–919, 2003.