nips nips2005 nips2005-160 nips2005-160-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ran Gilad-Bachrach, Amir Navot, Naftali Tishby
Abstract: Training a learning algorithm is a costly task. A major goal of active learning is to reduce this cost. In this paper we introduce a new algorithm, KQBC, which is capable of actively learning large scale problems by using selective sampling. The algorithm overcomes the costly sampling step of the well-known Query By Committee (QBC) algorithm by projecting onto a low dimensional space. KQBC also enables the use of kernels, providing a simple way of extending QBC to the non-linear scenario. Sampling the low dimensional space is done using the hit and run random walk. We demonstrate the success of this novel algorithm by applying it to both artificial and real world problems.
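The abstract outlines the QBC selection rule (sample two hypotheses from the version space via the hit and run walk; query a point only if they disagree). Below is a minimal illustrative sketch of that idea for the plain linear, non-kernelized case. It is not the authors' KQBC implementation; the function names, the unit-ball restriction of the version space, and the step counts are assumptions made for illustration.

import numpy as np

def hit_and_run_step(w, X, y, rng):
    """One hit-and-run step inside the version space
    {w : ||w|| <= 1, y_i <w, x_i> >= 0 for every labelled (x_i, y_i)}."""
    d = rng.standard_normal(w.shape)
    d /= np.linalg.norm(d)
    # Chord endpoints along direction d from the unit-ball constraint ||w + t d|| <= 1.
    wd = w @ d
    disc = np.sqrt(max(wd * wd - w @ w + 1.0, 0.0))
    t_lo, t_hi = -wd - disc, -wd + disc
    # Tighten the chord with the label constraints y_i (w + t d) . x_i >= 0.
    for x_i, y_i in zip(X, y):
        a = y_i * (d @ x_i)   # slope of the margin in t
        b = y_i * (w @ x_i)   # current margin (non-negative for a feasible w)
        if a > 1e-12:
            t_lo = max(t_lo, -b / a)
        elif a < -1e-12:
            t_hi = min(t_hi, -b / a)
    if t_hi <= t_lo:
        return w              # degenerate chord: keep the current point
    t = rng.uniform(t_lo, t_hi)
    return w + t * d

def sample_hypothesis(X, y, dim, steps=200, rng=None):
    """Approximately uniform sample from the version space via hit-and-run.
    Starts at the origin (always feasible, though on the constraint boundary,
    so early steps may be rejected when the version space is narrow)."""
    rng = rng or np.random.default_rng()
    w = np.zeros(dim)
    for _ in range(steps):
        w = hit_and_run_step(w, X, y, rng)
    return w

def qbc_should_query(x, X, y, rng=None):
    """QBC rule: query the label of x iff two independently sampled
    hypotheses disagree on its predicted sign."""
    rng = rng or np.random.default_rng()
    w1 = sample_hypothesis(X, y, x.shape[0], rng=rng)
    w2 = sample_hypothesis(X, y, x.shape[0], rng=rng)
    return np.sign(w1 @ x) != np.sign(w2 @ x)

In KQBC the same walk is run in the low dimensional span of the observed examples, so only kernel values are needed; the sketch above stays in input space for brevity.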
[1] D. Cohn, L. Atlas, and R. Ladner. Training connectionist networks with queries and selective sampling. Advances in Neural Information Processing Systems 2, 1990.
[2] H. S. Seung, M. Opper, and H. Sompolinsky. Query by committee. Proc. of the Fifth Workshop on Computational Learning Theory, pages 287–294, 1992.
[3] C. Campbell, N. Cristianini, and A. Smola. Query learning with large margin classifiers. In Proc. 17th International Conference on Machine Learning (ICML), 2000.
[4] S. Tong. Active Learning: Theory and Applications. PhD thesis, Stanford University, 2001.
[5] G. Tur, R. Schapire, and D. Hakkani-Tür. Active learning for spoken language understanding. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, 2003.
[6] Y. Freund, H. Seung, E. Shamir, and N. Tishby. Selective sampling using the query by committee algorithm. Machine Learning, 28:133–168, 1997.
[7] H. Mamitsuka and N. Abe. Efficient data mining by active learning. In Progress in Discovery Science, pages 258–267, 2002.
[8] R. Bachrach, S. Fine, and E. Shamir. Query by committee, linear separation and random walks. Theoretical Computer Science, 284(1), 2002.
[9] L. Lovász and S. Vempala. Hit and run is fast and fun. Technical Report MSR-TR-2003-05, Microsoft Research, 2003.
[10] B. Boser, I. Guyon, and V. Vapnik. Optimal margin classifiers. In Fifth Annual Workshop on Computational Learning Theory, pages 144–152, 1992.
[11] S. Dasgupta. Analysis of a greedy active learning strategy. In Neural Information Processing Systems (NIPS), 2004.
[12] S. Dasgupta, A. T. Kalai, and C. Monteleoni. Analysis of perceptron-based active learning. In Proceedings of the 18th Annual Conference on Learning Theory (COLT), 2005.
[13] R. Herbrich, T. Graepel, and C. Campbell. Bayes point machines. Journal of Machine Learning Research, 1:245–279, 2001.
[14] R. Gilad-Bachrach, A. Navot, and N. Tishby. Query by committee made real - supplementary material. http://www.cs.huji.ac.il/~ranb/kqcb supp.ps.
[15] L. Lovász and S. Vempala. Hit-and-run from a corner. In Proc. of the 36th ACM Symposium on the Theory of Computing (STOC), 2004.
[16] A.M. Martinez and R. Benavente. The AR face database. CVC Technical Report #24, 1998.