
Active Learning by Spherical Subdivision (JMLR 2008)


Source: pdf

Author: Falk-Florian Henrich, Klaus Obermayer

Abstract: We introduce a computationally feasible, “constructive” active learning method for binary classification. The learning algorithm is initially formulated for separable classification problems, for a hyperspherical data space with constant data density, and for great spheres as classifiers. In order to reduce computational complexity, the version space is restricted to spherical simplices, and learning proceeds by subdividing the edges of maximal length. We show that this procedure optimally reduces a tight upper bound on the generalization error. The method is then extended to other separable classification problems using products of spheres as data spaces and isometries induced by charts of the sphere. An upper bound is provided for the probability of disagreement between classifiers (hence the generalization error) for non-constant data densities on the sphere. The emphasis of this work lies on providing mathematically exact performance estimates for active learning strategies. Keywords: active learning, spherical subdivision, error bounds, simplex halving
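The subdivision step described in the abstract — splitting a spherical simplex along its edge of maximal geodesic length — can be sketched as follows. This is a minimal illustration, not the paper's implementation: vertices are assumed to be stored as unit row vectors, and the function names are illustrative.

```python
import numpy as np

def longest_edge_midpoint(vertices):
    """Find the edge of maximal geodesic length in a spherical simplex
    (rows of `vertices` are unit vectors) and return its spherical midpoint
    together with the endpoint indices."""
    n = len(vertices)
    best, best_len = (0, 1), -1.0
    for i in range(n):
        for j in range(i + 1, n):
            # geodesic distance on the unit sphere, clamped for safety
            d = np.arccos(np.clip(vertices[i] @ vertices[j], -1.0, 1.0))
            if d > best_len:
                best_len, best = d, (i, j)
    i, j = best
    mid = vertices[i] + vertices[j]
    mid /= np.linalg.norm(mid)  # project the chord midpoint back onto the sphere
    return mid, best

def subdivide(vertices):
    """Split the simplex along its longest edge into two child simplices:
    each child replaces one endpoint of that edge by the midpoint."""
    mid, (i, j) = longest_edge_midpoint(vertices)
    child_a = vertices.copy(); child_a[i] = mid
    child_b = vertices.copy(); child_b[j] = mid
    return child_a, child_b
```

In the active learning setting, the midpoint of the longest edge is a natural query candidate: labeling it halves the candidate simplex, which is how the edge-subdivision scheme shrinks the version space.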


reference text

F.R. Bach. Active learning for misspecified generalized linear models. In B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems 19, pages 65–72. MIT Press, Cambridge, MA, 2007.

M.-F. Balcan, A. Beygelzimer, and J. Langford. Agnostic active learning. In ICML ’06: Proceedings of the 23rd International Conference on Machine Learning, pages 65–72, New York, NY, USA, 2006. ACM Press.

M. Belkin and P. Niyogi. Semi-supervised learning on Riemannian manifolds. Machine Learning, 56(1–3):209–239, 2004.

D. J. Best and N. I. Fisher. Efficient simulation of the von Mises distribution. Applied Statistics, 28(2):152–157, 1979.

L. Devroye. Non-Uniform Random Variate Generation. Springer-Verlag, 1986.

S. Fine, R. Gilad-Bachrach, and E. Shamir. Learning using query by committee, linear separation and random walks. Theoretical Computer Science, 284:25–51, 2002.

Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Selective sampling using the query by committee algorithm. Machine Learning, 28(2–3):133–168, 1997.

S. Gallot, D. Hulin, and J. Lafontaine. Riemannian Geometry. Universitext. Springer, 1990.

M. Gromov. Metric Structures for Riemannian and Non-Riemannian Spaces. Progress in Mathematics. Birkhäuser, 1999.

R. Herbrich. Learning Kernel Classifiers: Theory and Algorithms. Adaptive Computation and Machine Learning. MIT Press, 2002.

G. Lebanon and J. Lafferty. Hyperplane margin classifiers on the multinomial manifold. In ICML ’04: Proceedings of the Twenty-First International Conference on Machine Learning. ACM Press, 2004.

C. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, 2000.

J. Milnor. The Schläfli differential inequality. In Collected Papers (Volume 1). Publish or Perish, Houston, 1994.

H. Q. Minh, P. Niyogi, and Y. Yao. Mercer’s theorem, feature maps, and smoothing. In COLT, pages 154–168, 2006.

T. M. Mitchell. Generalization as search. Artificial Intelligence, 18(2):203–226, 1982.

M. Opper, H. S. Seung, and H. Sompolinsky. Query by committee. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 287–294, Pittsburgh, Pennsylvania, United States, 1992.

G. W. Stewart. The efficient generation of random orthogonal matrices with an application to condition estimators. SIAM Journal on Numerical Analysis, 17:403–409, 1980.

S. Tong and D. Koller. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research, 2:45–66, 2001.

M. K. Warmuth, J. Liao, G. Rätsch, M. Mathieson, and C. Lemmen. Active learning in the drug discovery process. In T.G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pages 1449–1456. MIT Press, 2002.