nips nips2002 nips2002-89 nips2002-89-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Nuno Vasconcelos
Abstract: We address the question of feature selection in the context of visual recognition. It is shown that, besides being efficient from a computational standpoint, the infomax principle is nearly optimal in the minimum Bayes error sense. The concept of marginal diversity is introduced, leading to a generic principle for feature selection (the principle of maximum marginal diversity) of extreme computational simplicity. The relationships between infomax and the maximization of marginal diversity are identified, uncovering the existence of a family of classification procedures for which near-optimal (in the Bayes error sense) feature selection does not require combinatorial search. Examination of this family in light of recent studies on the statistics of natural images suggests that visual recognition problems are a subset of it.
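The abstract does not spell out how marginal diversity is computed. As a rough illustration only, the sketch below assumes it is the class-averaged Kullback-Leibler divergence between each class-conditional marginal density of a feature and that feature's overall marginal density, estimated here with simple histograms. The function names `marginal_diversity` and `rank_features`, the bin count, and the histogram estimator are all hypothetical choices for this sketch, not the author's implementation.

```python
import numpy as np


def marginal_diversity(x, y, bins=8):
    """Histogram estimate of sum_i P(i) * KL(P(x|i) || P(x)) for one feature.

    Assumed definition (not stated in the abstract): the class-averaged KL
    divergence between class-conditional and overall marginal densities.
    """
    edges = np.histogram_bin_edges(x, bins=bins)
    p_x, _ = np.histogram(x, bins=edges)
    p_x = p_x / p_x.sum()
    md = 0.0
    for c in np.unique(y):
        xc = x[y == c]
        prior = len(xc) / len(x)
        p_xc, _ = np.histogram(xc, bins=edges)
        p_xc = p_xc / p_xc.sum()
        # Bins that are empty for this class contribute nothing to the KL sum;
        # p_x is nonzero wherever p_xc is, because xc is a subset of x.
        mask = p_xc > 0
        md += prior * np.sum(p_xc[mask] * np.log(p_xc[mask] / p_x[mask]))
    return md


def rank_features(X, y, bins=8):
    """Score every column independently and sort by decreasing score."""
    scores = np.array(
        [marginal_diversity(X[:, k], y, bins) for k in range(X.shape[1])]
    )
    return np.argsort(scores)[::-1], scores
```

Each feature is scored on its own and the scores are then sorted, so the cost grows linearly with the number of features. This is the sense in which such a ranking avoids combinatorial search over feature subsets.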
[1] S. Basu, C. Micchelli, and P. Olsen. Maximum Entropy and Maximum Likelihood Criteria for Feature Selection from Multivariate Data. In Proc. IEEE International Symposium on Circuits and Systems, Geneva, Switzerland, 2000.
[2] A. Bell and T. Sejnowski. An Information Maximisation Approach to Blind Separation and Blind Deconvolution. Neural Computation, 7(6):1129–1159, 1995.
[3] B. Bonnlander and A. Weigend. Selecting Input Variables using Mutual Information and Nonparametric Density Estimation. In Proc. IEEE International ICSC Symposium on Artificial Neural Networks, Tainan, Taiwan, 1994.
[4] D. Erdogmus and J. Principe. Information Transfer Through Classifiers and its Relation to Probability of Error. In Proc. of the International Joint Conference on Neural Networks, Washington, 2001.
[5] J. Huang and D. Mumford. Statistics of Natural Images and Models. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Fort Collins, Colorado, 1999.
[6] A. Jain and D. Zongker. Feature Selection: Evaluation, Application, and Small Sample Performance. IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(2):153–158, February 1997.
[7] R. Linsker. Self-Organization in a Perceptual Network. IEEE Computer, 21(3):105–117, March 1988.
[8] J. Portilla and E. Simoncelli. Texture Modeling and Synthesis using Joint Statistics of Complex Wavelet Coefficients. In IEEE Workshop on Statistical and Computational Theories of Vision, Fort Collins, Colorado, 1999.
[9] J. Principe, D. Xu, and J. Fisher. Information-Theoretic Learning. In S. Haykin, editor, Unsupervised Adaptive Filtering, Volume 1: Blind-Source Separation. Wiley, 2000.
[10] G. Saon and M. Padmanabhan. Minimum Bayes Error Feature Selection for Continuous Speech Recognition. In Proc. Neural Information Proc. Systems, Denver, USA, 2000.
[11] K. Torkkola and W. Campbell. Mutual Information in Learning Feature Transforms. In Proc. International Conference on Machine Learning, Stanford, USA, 2000.
[12] G. Trunk. A Problem of Dimensionality: a Simple Example. IEEE Trans. on Pattern Analysis and Machine Intelligence, 1(3):306–307, July 1979.
[13] N. Vasconcelos. Feature Selection by Maximum Marginal Diversity: Optimality and Implications for Visual Recognition. Submitted, 2002.
[14] N. Vasconcelos and G. Carneiro. What is the Role of Independence for Visual Recognition? In Proc. European Conference on Computer Vision, Copenhagen, Denmark, 2002.
[15] H. Yang and J. Moody. Data Visualization and Feature Selection: New Algorithms for Nongaussian Data. In Proc. Neural Information Proc. Systems, Denver, USA, 2000.