117 nips-2005-Learning from Data of Variable Quality

Source: pdf

Author: Koby Crammer, Michael Kearns, Jennifer Wortman

Abstract: We initiate the study of learning from multiple sources of limited data, each of which may be corrupted at a different rate. We develop a complete theory of which data sources should be used for two fundamental problems: estimating the bias of a coin, and learning a classifier in the presence of label noise. In both cases, efficient algorithms are provided for computing the optimal subset of data.

reference text

[1] A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory, pages 92–100, 1998.

[2] V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.

[3] M. J. Kearns and U. V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994.

[4] D. Haussler, M. Kearns, H. S. Seung, and N. Tishby. Rigorous learning curve bounds from statistical mechanics. In Proceedings of the Seventh Annual ACM Conference on Computational Learning Theory, pages 76–87, 1994.

[5] K. Crammer, M. Kearns, and J. Wortman. Forthcoming. 2006.

[6] M. Kearns, R. Schapire, and L. Sellie. Towards efficient agnostic learning. Machine Learning, 17:115–141, 1994.