
9 jmlr-2009-Analysis of Perceptron-Based Active Learning


Source: pdf

Author: Sanjoy Dasgupta, Adam Tauman Kalai, Claire Monteleoni

Abstract: We start by showing that in an active learning setting, the Perceptron algorithm needs Ω(1/ε²) labels to learn linear separators within generalization error ε. We then present a simple active learning algorithm for this problem, which combines a modification of the Perceptron update with an adaptive filtering rule for deciding which points to query. For data distributed uniformly over the unit sphere, we show that our algorithm reaches generalization error ε after asking for just Õ(d log(1/ε)) labels. This exponential improvement over the usual sample complexity of supervised learning had previously been demonstrated only for the computationally more complex query-by-committee algorithm.

Keywords: active learning, perceptron, label complexity bounds, online learning
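The abstract's algorithm can be sketched in a few lines: maintain a unit-norm hypothesis v, query a label only when a point lands near the current decision boundary (|v · x| ≤ s), apply the modified Perceptron update v ← v − 2(v · x)x on queried mistakes, and shrink the query threshold when queried points stop producing mistakes. The sketch below is illustrative only: the parameters s0 and patience, the halving schedule, and the helper names are assumptions standing in for the paper's exact adaptive filtering rule and constants.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_unit_sphere(n, d):
    """Draw n points uniformly at random from the unit sphere in R^d."""
    x = rng.normal(size=(n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def active_perceptron(xs, label, s0=1.0, patience=50):
    """Hedged sketch of the abstract's active learner. The knobs s0 and
    patience, and the threshold-halving schedule, are assumptions; the
    paper's actual filtering rule and constants differ."""
    d = xs.shape[1]
    v = sample_unit_sphere(1, d)[0]     # random unit-norm start hypothesis
    s, quiet, labels_used = s0, 0, 0
    for x in xs:
        if abs(v @ x) > s:              # far from the boundary: don't query
            continue
        y = label(x)                    # query the oracle for this point
        labels_used += 1
        if y * (v @ x) <= 0:            # queried point was misclassified
            v = v - 2 * (v @ x) * x     # modified Perceptron (reflection) update
            quiet = 0
        else:
            quiet += 1
            if quiet >= patience:       # no recent mistakes: tighten the filter
                s, quiet = s / 2, 0
    return v, labels_used

# Toy run: recover a random linear separator over the unit sphere in R^10.
d = 10
u = sample_unit_sphere(1, d)[0]                  # hidden target direction
xs = sample_unit_sphere(20000, d)                # unlabeled data stream
v, used = active_perceptron(xs, label=lambda x: np.sign(u @ x))
angle = np.arccos(np.clip(u @ v, -1.0, 1.0))
print(f"labels queried: {used}, angle to target: {angle:.3f} rad")
```

Note that because ||x|| = 1 on the unit sphere, the reflection update preserves ||v|| = 1, so the margin |v · x| stays comparable across rounds; this is what makes a single adaptive threshold s a sensible query filter.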


reference text

S. Agmon. The relaxation method for linear inequalities. Canadian Journal of Math., 6(3):382–392, 1954.

D. Angluin. Queries revisited. In Proc. 12th International Conference on Algorithmic Learning Theory, LNAI, 2225:12–31, 2001.

M.-F. Balcan, A. Beygelzimer, and J. Langford. Agnostic active learning. In Proc. International Conference on Machine Learning, 2006.

M.-F. Balcan, A. Broder, and T. Zhang. Margin based active learning. In Proc. 20th Annual Conference on Learning Theory, 2007.

E.B. Baum. The perceptron algorithm is fast for nonmalicious distributions. Neural Computation, 2:248–260, 1990.

A. Blum, A. Frieze, R. Kannan, and S. Vempala. A polynomial-time algorithm for learning noisy linear threshold functions. In Proc. 37th Annual IEEE Symposium on the Foundations of Computer Science, 1996.

N. Cesa-Bianchi, A. Conconi, and C. Gentile. Learning probabilistic linear-threshold classifiers via selective sampling. In Proc. 16th Annual Conference on Learning Theory, 2003.

N. Cesa-Bianchi, C. Gentile, and L. Zaniboni. Worst-case analysis of selective sampling for linear-threshold algorithms. In Advances in Neural Information Processing Systems 17, 2004.

D.A. Cohn, L. Atlas, and R.E. Ladner. Improving generalization with active learning. Machine Learning, 15(2):201–221, 1994.

S. Dasgupta. Analysis of a greedy active learning strategy. In Advances in Neural Information Processing Systems 17, 2004.

S. Dasgupta. Coarse sample complexity bounds for active learning. In Advances in Neural Information Processing Systems 18, 2005.

S. Dasgupta, D. Hsu, and C. Monteleoni. A general agnostic active learning algorithm. In Advances in Neural Information Processing Systems, 2007.

Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Selective sampling using the query by committee algorithm. Machine Learning, 28(2–3):133–168, 1997.

R. Gilad-Bachrach, A. Navot, and N. Tishby. Query by committee made real. In Advances in Neural Information Processing Systems 18, 2005.

S. Hampson and D. Kibler. Minimum generalization via reflection: A fast linear threshold learner. Machine Learning, 37(1):51–73, 1999.

S. Hanneke. A bound on the label complexity of agnostic active learning. In Proc. International Conference on Machine Learning, 2007.

M. Kääriäinen. Active learning in the non-realizable case. In Proc. 17th International Conference on Algorithmic Learning Theory, 2006.

D.D. Lewis and W.A. Gale. A sequential algorithm for training text classifiers. In Proc. of SIGIR-94, 17th ACM International Conference on Research and Development in Information Retrieval, 1994.

C. Monteleoni and M. Kääriäinen. Practical online active learning for classification. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, Online Learning for Classification Workshop, 2007.

C.E. Monteleoni. Learning with Online Constraints: Shifting Concepts and Active Learning. PhD Thesis, MIT Computer Science and Artificial Intelligence Laboratory, 2006.

T.S. Motzkin and I.J. Schoenberg. The relaxation method for linear inequalities. Canadian Journal of Math., 6(3):393–404, 1954.

F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386–407, 1958.

R.A. Servedio. On PAC learning using Winnow, Perceptron, and a Perceptron-like algorithm. In Computational Learning Theory, pages 296–307, 1999.

H.S. Seung, M. Opper, and H. Sompolinsky. Query by committee. In Proc. Fifth Annual ACM Conference on Computational Learning Theory, 1992.

S. Tong and D. Koller. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research, 2:45–66, 2001.