
23 jmlr-2009-Discriminative Learning Under Covariate Shift


Source: pdf

Author: Steffen Bickel, Michael Brückner, Tobias Scheffer

Abstract: We address classification problems for which the training instances are governed by an input distribution that is allowed to differ arbitrarily from the test distribution—problems also referred to as classification under covariate shift. We derive a solution that is purely discriminative: neither training nor test distribution are modeled explicitly. The problem of learning under covariate shift can be written as an integrated optimization problem. Instantiating the general optimization problem leads to a kernel logistic regression and an exponential model classifier for covariate shift. The optimization problem is convex under certain conditions; our findings also clarify the relationship to the known kernel mean matching procedure. We report on experiments on problems of spam filtering, text classification, and landmine detection.

Keywords: covariate shift, discriminative learning, transfer learning
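The discriminative approach the abstract describes avoids modeling the training and test densities separately; one common way to realize this idea is to train a probabilistic classifier that separates training from test inputs and read off importance weights from its predictions. The following is a minimal NumPy sketch of that density-ratio trick (the data, the plain gradient-descent logistic fit, and all variable names are illustrative assumptions, not the paper's actual kernel logistic regression model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic covariate shift: training inputs from N(0,1), test inputs from N(1,1).
X_train = rng.normal(0.0, 1.0, size=(200, 1))
X_test = rng.normal(1.0, 1.0, size=(200, 1))

# Discriminative density-ratio estimation: fit a logistic model that separates
# test examples (label 1) from training examples (label 0); then
#   w(x) = p_test(x) / p_train(x) ≈ (n_train / n_test) * P(test|x) / P(train|x).
X = np.vstack([X_train, X_test])
y = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column

theta = np.zeros(Xb.shape[1])
for _ in range(2000):  # plain batch gradient descent on the logistic loss
    p = 1.0 / (1.0 + np.exp(-Xb @ theta))
    theta -= 0.1 * Xb.T @ (p - y) / len(y)

# Importance weight for each training point from the fitted class probability.
Xb_train = np.hstack([X_train, np.ones((len(X_train), 1))])
p_test_given_x = 1.0 / (1.0 + np.exp(-Xb_train @ theta))
weights = (len(X_train) / len(X_test)) * p_test_given_x / (1.0 - p_test_given_x)

# Training points in regions of high test density receive larger weights.
print(weights[X_train[:, 0] > 1.0].mean() > weights[X_train[:, 0] < 0.0].mean())
```

These weights would then multiply the per-example loss of the final classifier, which is the standard importance-weighting recipe; the paper's contribution is to fold the weight estimation and the classifier training into one integrated optimization problem rather than running them as two separate stages.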


reference text

S. Bickel and T. Scheffer. Dirichlet-enhanced spam filtering based on biased samples. In Advances in Neural Information Processing Systems, 2007.

S. Bickel, M. Brückner, and T. Scheffer. Discriminative learning for differing training and test distributions. In Proceedings of the International Conference on Machine Learning, 2007.

C. Cortes, M. Mohri, M. Riley, and A. Rostamizadeh. Sample selection bias correction theory. In Proceedings of the International Conference on Algorithmic Learning Theory, 2008.

M. Dudik, R. Schapire, and S. Phillips. Correcting sample selection bias in maximum entropy density estimation. In Advances in Neural Information Processing Systems, 2005.

C. Elkan. The foundations of cost-sensitive learning. In Proceedings of the International Joint Conference on Artificial Intelligence, 2001.

J. Heckman. Sample selection bias as a specification error. Econometrica, 47:153–161, 1979.

J. Huang, A. Smola, A. Gretton, K. Borgwardt, and B. Schölkopf. Correcting sample selection bias by unlabeled data. In Advances in Neural Information Processing Systems, 2007.

N. Japkowicz and S. Stephen. The class imbalance problem: A systematic study. Intelligent Data Analysis, 6:429–449, 2002.

T. Joachims. A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. In Proceedings of the 14th International Conference on Machine Learning, 1997.

J. Lunceford and M. Davidian. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Statistics in Medicine, 23(19):2937–2960, 2004.

C. Manski and S. Lerman. The estimation of choice probabilities from choice based samples. Econometrica, 45(8):1977–1988, 1977.

R. Prentice and R. Pyke. Logistic disease incidence models and case-control studies. Biometrika, 66(3):403–411, 1979.

P. Rosenbaum and D. Rubin. The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1):41–55, 1983.

H. Shimodaira. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90:227–244, 2000.

B. Silverman. Density Estimation for Statistics and Data Analysis. Chapman & Hall, London, 1986.

M. Sugiyama and K.-R. Müller. Input-dependent estimation of generalization error under covariate shift. Statistics & Decisions, 23(4):249–279, 2005.

M. Sugiyama, S. Nakajima, H. Kashima, P. von Bünau, and M. Kawanabe. Direct importance estimation with model selection and its application to covariate shift adaptation. In Advances in Neural Information Processing Systems, 2008.

Y. Tsuboi, H. Kashima, S. Hido, S. Bickel, and M. Sugiyama. Direct density ratio estimation for large-scale covariate shift adaptation. In Proceedings of the SIAM International Conference on Data Mining, 2008.

Y. Xue, X. Liao, L. Carin, and B. Krishnapuram. Multi-task learning for classification with Dirichlet process priors. Journal of Machine Learning Research, 8:35–63, 2007.

B. Zadrozny. Learning and evaluating classifiers under sample selection bias. In Proceedings of the International Conference on Machine Learning, 2004.