
309 nips-2013-Statistical Active Learning Algorithms

Source: pdf

Author: Maria-Florina Balcan, Vitaly Feldman

Abstract: We describe a framework for designing efficient active learning algorithms that are tolerant to random classification noise and differentially-private. The framework is based on active learning algorithms that are statistical in the sense that they rely on estimates of expectations of functions of filtered random examples. It builds on the powerful statistical query framework of Kearns [30]. We show that any efficient active statistical learning algorithm can be automatically converted to an efficient active learning algorithm which is tolerant to random classification noise as well as other forms of “uncorrelated” noise. We show that commonly studied concept classes including thresholds, rectangles, and linear separators can be efficiently actively learned in our framework. These results combined with our generic conversion lead to the first computationally-efficient algorithms for actively learning some of these concept classes in the presence of random classification noise that provide exponential improvement in the dependence on the error over their passive counterparts. In addition, we show that our algorithms can be automatically converted to efficient active differentially-private algorithms. This leads to the first differentially-private active learning algorithms with exponential label savings over the passive case.
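The abstract's idea of estimating expectations over filtered examples under random classification noise can be illustrated with a small sketch. It is not the paper's algorithm. It assumes labels in {-1, +1}, a known noise rate `eta` below 1/2, a uniform distribution on [0, 1], and a boolean filter (the paper's filters may be more general). All function names and parameters here are hypothetical. The sketch relies only on the standard identity that flipping each label independently with probability `eta` shrinks E[f(x)·y] by a factor of (1 − 2·eta), so dividing the noisy estimate by that factor recovers the clean value.

```python
import random


def estimate_correlation(f, examples):
    """Empirical estimate of E[f(x) * y] over labeled examples (x, y)."""
    return sum(f(x) * y for x, y in examples) / len(examples)


def denoised_correlation(f, noisy_examples, eta):
    """Undo the (1 - 2*eta) shrinkage caused by independent label flips."""
    if not 0 <= eta < 0.5:
        raise ValueError("eta must lie in [0, 0.5)")
    return estimate_correlation(f, noisy_examples) / (1 - 2 * eta)


def run_demo(n=100_000, eta=0.2, seed=0):
    rng = random.Random(seed)
    target = lambda x: 1 if x >= 0.3 else -1   # hypothetical unknown threshold
    query = lambda x: 1 if x >= 0.5 else -1    # function whose correlation we estimate
    keep = lambda x: 0.2 <= x <= 0.8           # hypothetical filter: label only these points

    # Draw unlabeled points, then request labels only inside the filter.
    xs = [x for x in (rng.random() for _ in range(n)) if keep(x)]
    clean = [(x, target(x)) for x in xs]
    # Random classification noise: flip each label with probability eta.
    noisy = [(x, -y if rng.random() < eta else y) for x, y in clean]

    return (
        estimate_correlation(query, clean),
        estimate_correlation(query, noisy),
        denoised_correlation(query, noisy, eta),
    )


if __name__ == "__main__":
    clean, noisy, fixed = run_demo()
    print(f"clean={clean:.3f} noisy={noisy:.3f} denoised={fixed:.3f}")
```

Under these assumptions the clean correlation inside the filter is 1/3, the noisy estimate concentrates near 0.2, and the corrected estimate returns to roughly 1/3. In practice the noise rate is usually unknown and would have to be estimated or searched over, which this sketch does not attempt.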

reference text

[1] D. Angluin and P. Laird. Learning from noisy examples. Machine Learning, 2:343–370, 1988.

[2] J. Aslam and S. Decatur. Specification and simulation of statistical query algorithms for efficiency and noise tolerance. JCSS, 56:191–208, 1998.

[3] M. Balcan, A. Broder, and T. Zhang. Margin based active learning. In COLT, pages 35–50, 2007.

[4] M. F. Balcan, A. Beygelzimer, and J. Langford. Agnostic active learning. In ICML, 2006.

[5] M. F. Balcan and V. Feldman. Statistical active learning algorithms, 2013. arXiv:1307.3102.

[6] M.-F. Balcan and S. Hanneke. Robust interactive learning. In COLT, 2012.

[7] M.-F. Balcan, S. Hanneke, and J. Wortman. The true sample complexity of active learning. In COLT, 2008.

[8] M.-F. Balcan and P. M. Long. Active and passive learning of linear separators under log-concave distributions. JMLR - COLT proceedings (to appear), 2013.

[9] A. Beygelzimer, S. Dasgupta, and J. Langford. Importance weighted active learning. In ICML, pages 49–56, 2009.

[10] A. Beygelzimer, D. Hsu, J. Langford, and T. Zhang. Agnostic active learning without constraints. In NIPS, 2010.

[11] A. Blum, C. Dwork, F. McSherry, and K. Nissim. Practical privacy: the SuLQ framework. In Proceedings of PODS, pages 128–138, 2005.

[12] A. Blum, A. Frieze, R. Kannan, and S. Vempala. A polynomial time algorithm for learning noisy linear threshold functions. Algorithmica, 22(1/2):35–52, 1997.

[13] A. Blum, M. Furst, J. Jackson, M. Kearns, Y. Mansour, and S. Rudich. Weakly learning DNF and characterizing statistical query learning using Fourier analysis. In STOC, pages 253–262, 1994.

[14] A. Blumer, A. Ehrenfeucht, D. Haussler, and M. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 36(4):929–965, 1989.

[15] R. Castro and R. Nowak. Minimax bounds for active learning. In COLT, 2007.

[16] N. Cesa-Bianchi, C. Gentile, and L. Zaniboni. Learning noisy linear classifiers via adaptive and selective sampling. Machine Learning, 2010.

[17] C. Chu, S. Kim, Y. Lin, Y. Yu, G. Bradski, A. Ng, and K. Olukotun. Map-reduce for machine learning on multicore. In Proceedings of NIPS, pages 281–288, 2006.

[18] S. Dasgupta. Coarse sample complexity bounds for active learning. In NIPS, volume 18, 2005.

[19] S. Dasgupta. Active learning. Encyclopedia of Machine Learning, 2011.

[20] S. Dasgupta and D. Hsu. Hierarchical sampling for active learning. In ICML, pages 208–215, 2008.

[21] S. Dasgupta, D.J. Hsu, and C. Monteleoni. A general agnostic active learning algorithm. NIPS, 20, 2007.

[22] S. Dasgupta, A. Tauman Kalai, and C. Monteleoni. Analysis of perceptron-based active learning. Journal of Machine Learning Research, 10:281–299, 2009.

[23] O. Dekel, C. Gentile, and K. Sridharan. Selective sampling and active learning from single and multiple teachers. JMLR, 2012.

[24] J. Dunagan and S. Vempala. A simple polynomial-time rescaling algorithm for solving linear programs. In STOC, pages 315–320, 2004.

[25] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In TCC, pages 265–284, 2006.

[26] V. Feldman. A complete characterization of statistical query learning with applications to evolvability. Journal of Computer and System Sciences, 78(5):1444–1459, 2012.

[27] Y. Freund, H.S. Seung, E. Shamir, and N. Tishby. Selective sampling using the query by committee algorithm. Machine Learning, 28(2-3):133–168, 1997.

[28] A. Gonen, S. Sabato, and S. Shalev-Shwartz. Efficient pool-based active learning of halfspaces. In ICML, 2013.

[29] S. Hanneke. A bound on the label complexity of agnostic active learning. In ICML, 2007.

[30] M. Kearns. Efficient noise-tolerant learning from statistical queries. JACM, 45(6):983–1006, 1998.

[31] V. Koltchinskii. Rademacher complexities and bounding the excess risk in active learning. JMLR, 11:2457–2485, 2010.

[32] L. Lovász and S. Vempala. The geometry of logconcave functions and sampling algorithms. Random Struct. Algorithms, 30(3):307–358, 2007.

[33] A. McCallum and K. Nigam. Employing EM in pool-based active learning for text classification. In ICML, pages 350–358, 1998.

[34] M. Raginsky and A. Rakhlin. Lower bounds for passive and active learning. In NIPS, pages 1026–1034, 2011.

[35] L. G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984.