nips nips2004 nips2004-164 nips2004-164-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yves Grandvalet, Yoshua Bengio
Abstract: We consider the semi-supervised learning problem, where a decision rule is to be learned from labeled and unlabeled data. In this framework, we motivate minimum entropy regularization, which makes it possible to incorporate unlabeled data into standard supervised learning. Our approach includes other approaches to the semi-supervised problem as particular or limiting cases. A series of experiments illustrates that the proposed solution benefits from unlabeled data. The method challenges mixture models when the data are sampled from the distribution class spanned by the generative model. Performance clearly favors minimum entropy regularization when generative models are misspecified, and the weighting of unlabeled data provides robustness to violations of the “cluster assumption”. Finally, we illustrate that the method can also be far superior to manifold learning in high-dimensional spaces.
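The idea of minimum entropy regularization named in the abstract can be illustrated with a minimal sketch. It assumes a softmax classifier that produces one score per class, and it simplifies the setting to fully labeled versus fully unlabeled examples. The function names (`softmax`, `entropy`, `min_entropy_loss`) and the weight `lam` are hypothetical and do not come from the paper. The loss adds the negative log-likelihood on labeled examples to `lam` times the prediction entropy on unlabeled examples, so that confident predictions on unlabeled data are rewarded.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; terms with zero probability contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def min_entropy_loss(labeled_scores, labels, unlabeled_scores, lam):
    """Illustrative loss (an assumption, not the paper's exact criterion):
    labeled negative log-likelihood + lam * summed entropy of the
    predicted class distributions on unlabeled examples."""
    nll = -sum(
        math.log(softmax(s)[y]) for s, y in zip(labeled_scores, labels)
    )
    ent = sum(entropy(softmax(s)) for s in unlabeled_scores)
    return nll + lam * ent
```

Setting `lam = 0` recovers plain supervised maximum likelihood on the labeled examples, while larger values of `lam` put more weight on low-entropy (confident) predictions for the unlabeled examples.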
[1] M. R. Amini and P. Gallinari. Semi-supervised logistic regression. In 15th European Conference on Artificial Intelligence, pages 390–394. IOS Press, 2002.
[2] J. O. Berger. Statistical Decision Theory and Bayesian Analysis. Springer, New York, 2nd edition, 1985.
[3] M. Brand. Structure learning in conditional probability models via an entropic prior and parameter extinction. Neural Computation, 11(5):1155–1182, 1999.
[4] V. Castelli and T. M. Cover. The relative value of labeled and unlabeled samples in pattern recognition with an unknown mixing parameter. IEEE Trans. on Information Theory, 42(6):2102–2117, 1996.
[5] Y. Grandvalet. Logistic regression for partial labels. In 9th Information Processing and Management of Uncertainty in Knowledge-based Systems – IPMU’02, pages 1935–1941, 2002.
[6] G. J. McLachlan. Discriminant analysis and statistical pattern recognition. Wiley, 1992.
[7] K. Nigam and R. Ghani. Analyzing the effectiveness and applicability of co-training. In Ninth International Conference on Information and Knowledge Management, pages 86–93, 2000.
[8] K. Nigam, A. K. McCallum, S. Thrun, and T. Mitchell. Text classification from labeled and unlabeled documents using EM. Machine Learning, 39(2/3):135–167, 2000.
[9] T. J. O’Neill. Normal discrimination with unclassified observations. Journal of the American Statistical Association, 73(364):821–826, 1978.
[10] M. Seeger. Learning with labeled and unlabeled data. Technical report, Institute for Adaptive and Neural Computation, University of Edinburgh, 2002.
[11] M. Szummer and T. S. Jaakkola. Information regularization with partially labeled data. In Advances in Neural Information Processing Systems 15. MIT Press, 2003.
[12] D. Zhou, O. Bousquet, T. Navin Lal, J. Weston, and B. Schölkopf. Learning with local and global consistency. In Advances in Neural Information Processing Systems 16, 2004.
[13] X. Zhu, Z. Ghahramani, and J. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In 20th Int. Conf. on Machine Learning, pages 912–919, 2003.