jmlr jmlr2008 jmlr2008-52 jmlr2008-52-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Koby Crammer, Michael Kearns, Jennifer Wortman
Abstract: We consider the problem of learning accurate models from multiple sources of “nearby” data. Given distinct samples from multiple data sources and estimates of the dissimilarities between these sources, we provide a general theory of which samples should be used to learn models for each source. This theory is applicable in a broad decision-theoretic learning framework, and yields general results for classification and regression. A key component of our approach is the development of approximate triangle inequalities for expected loss, which may be of independent interest. We discuss the related problem of learning parameters of a distribution from multiple data sources. Finally, we illustrate our theory through a series of synthetic simulations.
Keywords: error bounds, multi-task learning
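For intuition on the approximate triangle inequalities mentioned in the abstract: squared loss violates the exact triangle inequality but satisfies a 2-approximate version, since (a - c)^2 <= 2(a - b)^2 + 2(b - c)^2 for all reals a, b, c. The sketch below illustrates the pooling decision the abstract describes; it is a minimal illustration, not the paper's algorithm. It assumes a bound of the hypothetical form "sample-weighted disparity plus an O(sqrt(1/n)) estimation term"; the names pooling_bound and choose_sources and the complexity constant c are inventions for this sketch, and the paper's actual bounds differ in form and constants.

import math

def pooling_bound(target, pool, n, d, c=1.0):
    # Heuristic upper bound on the target source's expected loss when
    # training on the pooled samples of the sources in `pool`.
    # bias: sample-weighted average disparity d[target][j] over the pool;
    # variance: a generic O(sqrt(1/n_pool)) estimation-error term.
    # This specific decomposition is an assumption for illustration only.
    n_pool = sum(n[j] for j in pool)
    bias = sum(n[j] * d[target][j] for j in pool) / n_pool
    variance = c * math.sqrt(1.0 / n_pool)
    return bias + variance

def choose_sources(target, n, d, c=1.0):
    # Decide which sources to pool for `target`: evaluate every prefix of
    # the disparity-sorted order and keep the prefix with the smallest
    # bound. Nearby sources shrink the variance term faster than they
    # inflate the bias term; distant sources do the opposite.
    order = sorted(range(len(n)), key=lambda j: d[target][j])
    best_pool, best_bound = None, float("inf")
    for k in range(1, len(order) + 1):
        pool = order[:k]
        b = pooling_bound(target, pool, n, d, c)
        if b < best_bound:
            best_pool, best_bound = pool, b
    return best_pool, best_bound

# Example: source 0 is the target with few samples, source 1 is nearby,
# source 2 is far. Disparities d[i][j] are taken as given, not estimated.
n = [20, 200, 200]
d = [[0.00, 0.05, 0.40],
     [0.05, 0.00, 0.40],
     [0.40, 0.40, 0.00]]
pool, bound = choose_sources(0, n, d)
print(pool, round(bound, 3))  # [0, 1] 0.113: pool the nearby source only

Under these toy numbers the bound prefers pooling sources 0 and 1 (bias 0.045 + variance 0.067) over using source 0 alone (variance 0.224) or adding the distant source 2 (bias 0.214), which is the qualitative trade-off the theory formalizes.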
M. Anthony and P. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.
P. Bartlett and S. Mendelson. Rademacher and Gaussian complexities: Risk bounds and structural results. Journal of Machine Learning Research, 3:463–482, 2002.
P. Bartlett, S. Boucheron, and G. Lugosi. Model selection and error estimation. Machine Learning, 48:85–113, 2002.
J. Baxter. Learning internal representations. In Proceedings of the Eighth Annual Conference on Computational Learning Theory, 1995.
S. Ben-David. Exploiting task relatedness for multiple task learning. In Proceedings of the Sixteenth Annual Conference on Computational Learning Theory, 2003.
J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, and J. Wortman. Learning bounds for domain adaptation. In Advances in Neural Information Processing Systems 20, 2007.
K. Crammer, M. Kearns, and J. Wortman. Learning from data of variable quality. In Advances in Neural Information Processing Systems 18, 2006.
K. Crammer, M. Kearns, and J. Wortman. Learning from multiple sources. In Advances in Neural Information Processing Systems 19, 2007.
D. Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100(1):78–150, 1992.
W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, 1963.
V. Koltchinskii. Rademacher penalties and structural risk minimization. IEEE Transactions on Information Theory, 47(5):1902–1914, 2001.
V. Koltchinskii and D. Panchenko. Rademacher processes and bounding the risk of function learning. High Dimensional Probability, II:443–459, 2000.
A. Maurer. Algorithmic stability and meta-learning. Journal of Machine Learning Research, 6:967–994, 2005.
C. McDiarmid. On the method of bounded differences. Surveys in Combinatorics, pages 148–188, 1989.
V. Vapnik. Statistical Learning Theory. Wiley, 1998.
P. Wu and T. Dietterich. Improving SVM accuracy by training on auxiliary data sources. In Proceedings of the Twenty-First International Conference on Machine Learning, 2004.