
Active Learning by Spherical Subdivision (JMLR 2008)


Source: pdf

Author: Falk-Florian Henrich, Klaus Obermayer

Abstract: We introduce a computationally feasible, “constructive” active learning method for binary classification. The learning algorithm is initially formulated for separable classification problems, for a hyperspherical data space with constant data density, and for great spheres as classifiers. In order to reduce computational complexity, the version space is restricted to spherical simplices, and learning proceeds by subdividing the edges of maximal length. We show that this procedure optimally reduces a tight upper bound on the generalization error. The method is then extended to other separable classification problems using products of spheres as data spaces and isometries induced by charts of the sphere. An upper bound is provided for the probability of disagreement between classifiers (hence the generalization error) for non-constant data densities on the sphere. The emphasis of this work lies on providing mathematically exact performance estimates for active learning strategies. Keywords: active learning, spherical subdivision, error bounds, simplex halving
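The subdivision step described in the abstract — splitting a spherical simplex along its edge of maximal geodesic length — can be sketched as follows. This is a minimal illustration, not the paper's implementation: vertices are assumed to be stored as unit row vectors, and the function names are illustrative.

```python
import numpy as np

def longest_edge_midpoint(vertices):
    """Find the edge of maximal geodesic length in a spherical simplex
    (rows of `vertices` are unit vectors) and return its spherical midpoint
    together with the endpoint indices."""
    n = len(vertices)
    best, best_len = (0, 1), -1.0
    for i in range(n):
        for j in range(i + 1, n):
            # geodesic distance on the unit sphere, clamped for safety
            d = np.arccos(np.clip(vertices[i] @ vertices[j], -1.0, 1.0))
            if d > best_len:
                best_len, best = d, (i, j)
    i, j = best
    mid = vertices[i] + vertices[j]
    mid /= np.linalg.norm(mid)  # project the chord midpoint back onto the sphere
    return mid, best

def subdivide(vertices):
    """Split the simplex along its longest edge into two child simplices:
    each child replaces one endpoint of that edge by the midpoint."""
    mid, (i, j) = longest_edge_midpoint(vertices)
    child_a = vertices.copy(); child_a[i] = mid
    child_b = vertices.copy(); child_b[j] = mid
    return child_a, child_b
```

In the active learning setting, the midpoint of the longest edge is a natural query candidate: labeling it halves the candidate simplex, which is how the edge-subdivision scheme shrinks the version space.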


reference text

F.R. Bach. Active learning for misspecified generalized linear models. In B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems 19, pages 65–72. MIT Press, Cambridge, MA, 2007.

M.-F. Balcan, A. Beygelzimer, and J. Langford. Agnostic active learning. In ICML ’06: Proceedings of the 23rd International Conference on Machine Learning, pages 65–72, New York, NY, USA, 2006. ACM Press.

M. Belkin and P. Niyogi. Semi-supervised learning on Riemannian manifolds. Machine Learning, 56(1–3):209–239, 2004.

D. J. Best and N. I. Fisher. Efficient simulation of the von Mises distribution. Applied Statistics, 28(2):152–157, 1979.

L. Devroye. Non-Uniform Random Variate Generation. Springer-Verlag, 1986.

S. Fine, R. Gilad-Bachrach, and E. Shamir. Learning using query by committee, linear separation and random walks. Theoretical Computer Science, 284:25–51, 2002.

Y. Freund, H. S. Seung, E. Shamir, and N. Tishby. Selective sampling using the query by committee algorithm. Machine Learning, 28(2–3):133–168, 1997.

S. Gallot, D. Hulin, and J. Lafontaine. Riemannian Geometry. Universitext. Springer, 1990.

M. Gromov. Metric Structures for Riemannian and Non-Riemannian Spaces. Progress in Mathematics. Birkhäuser, 1999.

R. Herbrich. Learning Kernel Classifiers: Theory and Algorithms. Adaptive Computation and Machine Learning. MIT Press, 2002.

G. Lebanon and J. Lafferty. Hyperplane margin classifiers on the multinomial manifold. In ICML ’04: Proceedings of the Twenty-First International Conference on Machine Learning. ACM Press, 2004.

C. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, 2000.

J. Milnor. The Schläfli differential inequality. In Collected Papers (Volume 1). Publish or Perish, Houston, 1994.

H. Q. Minh, P. Niyogi, and Y. Yao. Mercer’s theorem, feature maps, and smoothing. In COLT, pages 154–168, 2006.

T. M. Mitchell. Generalization as search. Artificial Intelligence, 18(2):203–226, 1982.

M. Opper, H. S. Seung, and H. Sompolinsky. Query by committee. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, pages 287–294, Pittsburgh, Pennsylvania, United States, 1992.

G. W. Stewart. The efficient generation of random orthogonal matrices with an application to condition estimators. SIAM Journal on Numerical Analysis, 17:403–409, 1980.

S. Tong and D. Koller. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research, 2:45–66, 2001.

M. K. Warmuth, J. Liao, G. Rätsch, M. Mathieson, and C. Lemmen. Active learning in the drug discovery process. In T.G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems 14, pages 1449–1456. MIT Press, 2002.