
Active Learning with a Drifting Distribution (NIPS 2011)



Author: Liu Yang

Abstract: We study the problem of active learning in a stream-based setting, allowing the distribution of the examples to change over time. We prove upper bounds on the number of prediction mistakes and the number of label requests for established disagreement-based active learning algorithms, both in the realizable case and under Tsybakov noise. We further prove minimax lower bounds for this problem.
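For intuition about the disagreement-based algorithms analyzed here, the following is a minimal sketch in the spirit of the CAL selective-sampling strategy [CAL94] applied to a stream: the learner requests a label only when the hypotheses still consistent with past labels disagree on the incoming point, and otherwise predicts their common label. The finite threshold class, the target value 0.37, and the uniform, non-drifting stream are hypothetical choices made for illustration; they are not the paper's construction.

import random

# Hypothesis class (illustrative): thresholds t on a grid, h_t(x) = 1 iff x >= t.
thresholds = [i / 100 for i in range(101)]
target = 0.37                      # hypothetical realizable target threshold

def true_label(x):
    return 1 if x >= target else 0

version_space = set(thresholds)    # hypotheses consistent with all labels seen so far
mistakes, queries = 0, 0

random.seed(0)
for _ in range(1000):
    x = random.random()            # next stream example (a drifting source in general)
    preds = {1 if x >= t else 0 for t in version_space}
    if len(preds) > 1:
        # x falls in the disagreement region: request its label and
        # prune every hypothesis the label contradicts.
        queries += 1
        y = true_label(x)
        version_space = {t for t in version_space if (1 if x >= t else 0) == y}
    else:
        # All surviving hypotheses agree, so predict without a label request.
        mistakes += preds.pop() != true_label(x)

print(f"label requests: {queries}, mistakes: {mistakes}")

In this realizable run the agreed-upon predictions are always correct, so the mistake count stays at zero and labels are requested only while the disagreement region remains wide; the paper's bounds quantify both counts when the example distribution is allowed to drift over time.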


References

[AB99] M. Anthony and P. L. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.

[Bar92] P. L. Bartlett. Learning with a slowly changing distribution. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory (COLT '92), pages 243–252, 1992.

[BHV10] M.-F. Balcan, S. Hanneke, and J. Wortman Vaughan. The true sample complexity of active learning. Machine Learning, 80(2–3):111–139, September 2010.

[BL97] R. D. Barve and P. M. Long. On the complexity of learning from drifting distributions. Information and Computation, 138(2):170–193, 1997.

[CAL94] D. Cohn, L. Atlas, and R. Ladner. Improving generalization with active learning. Machine Learning, 15(2):201–221, 1994.

[CMEDV10] K. Crammer, Y. Mansour, E. Even-Dar, and J. Wortman Vaughan. Regret minimization with concept drift. In COLT, pages 168–180, 2010.

[Das05] S. Dasgupta. Coarse sample complexity bounds for active learning. In Advances in Neural Information Processing Systems 18, 2005.

[DGS10] O. Dekel, C. Gentile, and K. Sridharan. Robust selective sampling from single and multiple teachers. In Conference on Learning Theory, 2010.

[DHM07] S. Dasgupta, D. Hsu, and C. Monteleoni. A general agnostic active learning algorithm. Technical Report CS2007-0898, Department of Computer Science and Engineering, University of California, San Diego, 2007.

[DKM09] S. Dasgupta, A. Kalai, and C. Monteleoni. Analysis of perceptron-based active learning. Journal of Machine Learning Research, 10:281–299, 2009.

[FM97] Y. Freund and Y. Mansour. Learning under persistent drift. In Proceedings of the Third European Conference on Computational Learning Theory (EuroCOLT '97), pages 109–118, 1997.

[Han07] S. Hanneke. A bound on the label complexity of agnostic active learning. In Proceedings of the 24th International Conference on Machine Learning, 2007.

[Han11] S. Hanneke. Rates of convergence in active learning. The Annals of Statistics, 39(1):333–361, 2011.

[HLW94] D. Haussler, N. Littlestone, and M. Warmuth. Predicting {0, 1}-functions on randomly drawn points. Information and Computation, 115:248–292, 1994.

[Lit88] N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285–318, 1988.

[LLW08] L. Li, M. L. Littman, and T. J. Walsh. Knows what it knows: A framework for self-aware learning. In International Conference on Machine Learning, 2008.

[MMR08] Y. Mansour, M. Mohri, and A. Rostamizadeh. Domain adaptation with multiple sources. In Advances in Neural Information Processing Systems (NIPS), pages 1041–1048, 2008.

[MMR09] Y. Mansour, M. Mohri, and A. Rostamizadeh. Domain adaptation: Learning bounds and algorithms. In COLT, 2009.

[MT99] E. Mammen and A. B. Tsybakov. Smooth discrimination analysis. The Annals of Statistics, 27:1808–1829, 1999.

[Vap82] V. Vapnik. Estimation of Dependencies Based on Empirical Data. Springer-Verlag, New York, 1982.

[vdG00] S. van de Geer. Empirical Processes in M-Estimation. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2000.