
77 nips-2005-From Lasso regression to Feature vector machine


Source: pdf

Author: Fan Li, Yiming Yang, Eric P. Xing

Abstract: Lasso regression tends to assign zero weights to most irrelevant or redundant features, and hence is a promising technique for feature selection. Its limitation, however, is that it only offers solutions for linear models. Kernel machines with feature scaling techniques have been studied for feature selection with non-linear models. However, such approaches require solving hard non-convex optimization problems. This paper proposes a new approach named the Feature Vector Machine (FVM). It reformulates the standard Lasso regression into a form isomorphic to SVM, and this form can be easily extended to feature selection with non-linear models by introducing kernels defined on feature vectors. FVM generates sparse solutions in the non-linear feature space and is much more tractable than feature scaling kernel machines. Our experiments with FVM on simulated data show encouraging results in identifying the small number of dominating features that are non-linearly correlated with the response, a task that the standard Lasso fails to complete.
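
The limitation described in the abstract can be illustrated with a minimal sketch of the standard Lasso (not the FVM itself). Everything below is an assumption made for illustration: the toy data, the helper names `soft_threshold` and `lasso_cd`, the choice of cyclic coordinate descent as the solver, and the penalty `lam=0.1`. The response depends linearly on `x0`, quadratically on `x1`, and not at all on `x2`.

```python
import itertools


def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    X is a list of n rows with p features each; y is a list of n responses.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Toy data (an assumption): a full grid over x0 in {-1, 1}, x1 in {-1, 0, 1},
# x2 in {-1, 1}, so every column has zero mean and the columns are orthogonal.
X = [list(row) for row in itertools.product([-1.0, 1.0], [-1.0, 0.0, 1.0], [-1.0, 1.0])]

# y depends linearly on x0 and quadratically on x1; x2 is irrelevant.
# The constant -2.0 centres y, because the mean of 3 * x1**2 on this grid is 2.
y = [2.0 * x0 + 3.0 * x1 ** 2 - 2.0 for x0, x1, x2 in X]

weights = lasso_cd(X, y, lam=0.1)
print(weights)  # roughly [1.9, 0.0, 0.0]
```

In this sketch the Lasso keeps `x0` but gives `x1` the same zero weight as the irrelevant `x2`, because `x1` has no linear correlation with the response even though it drives it. This is the kind of non-linear dependence that the abstract says the standard Lasso cannot pick up.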


reference text

[Canu et al., 2002] Canu, S. and Grandvalet, Y. Adaptive Scaling for Feature Selection in SVMs. NIPS 15, 2002.

[Hochreiter et al., 2004] Hochreiter, S. and Obermayer, K. Gene Selection for Microarray Data. In Kernel Methods in Computational Biology, pp. 319-355, MIT Press, 2004.

[Krishnapuram et al., 2003] Krishnapuram, B. et al. Joint classifier and feature optimization for cancer diagnosis using gene expression data. The Seventh Annual International Conference on Research in Computational Molecular Biology (RECOMB), ACM Press, April 2003.

[Ng et al., 2003] Ng, A. Feature selection, L1 vs L2 regularization, and rotational invariance. ICML, 2004.

[Perkins et al., 2003] Perkins, S., Lacker, K. and Theiler, J. Grafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space. JMLR, 2003, pp. 1333-1356.

[Roth, 2004] Roth, V. The Generalized LASSO. IEEE Transactions on Neural Networks, Vol. 15, No. 1, 2004.

[Tibshirani et al., 1996] Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Statist. Soc. B, Vol. 58, No. 1, pp. 267-288, 1996.

[Cover et al., 1991] Cover, T. M. and Thomas, J. A. Elements of Information Theory. New York: John Wiley & Sons, 1991.

[Weston et al., 2000] Weston, J., Mukherjee, S., Chapelle, O., Pontil, M., Poggio, T. and Vapnik, V. Feature Selection for SVMs. NIPS 13, 2000.