nips nips2007 nips2007-96 nips2007-96-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Shigeyuki Oba, Motoaki Kawanabe, Klaus-Robert Müller, Shin Ishii
Abstract: In bioinformatics it is often desirable to combine data from various measurement sources, so the structured feature vectors to be analyzed possess different intrinsic blocking characteristics (e.g., different patterns of missing values, observation noise levels, effective intrinsic dimensionalities). We propose a new machine learning tool, heterogeneous component analysis (HCA), for feature extraction in order to better understand the factors that underlie such complex structured heterogeneous data. HCA is a linear block-wise sparse Bayesian PCA based not only on a probabilistic model with block-wise residual variance terms but also on a Bayesian treatment of a block-wise sparse factor-loading matrix. We study various algorithms that implement our HCA concept, extracting sparse heterogeneous structure by obtaining common components for the blocks and specific components within each block. Simulations on toy and bioinformatics data underline the usefulness of the proposed structured matrix factorization concept.
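The generative structure described in the abstract (a block-wise sparse factor-loading matrix plus block-wise residual variances) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the block sizes, the sparsity mask, the noise levels, and the helper names `make_block_sparse_loadings` and `sample_data` are all hypothetical, and the Bayesian inference step of HCA is not shown.

```python
import numpy as np


def make_block_sparse_loadings(block_sizes, mask, rng):
    """Build a loading matrix W with one row group per feature block.

    mask[b, k] == 1 means block b uses factor k; otherwise the whole
    (block, factor) sub-block of W is zero. Hypothetical helper.
    """
    n_factors = mask.shape[1]
    rows = []
    for b, size in enumerate(block_sizes):
        W_b = rng.normal(size=(size, n_factors))
        W_b = W_b * mask[b]  # zero out the factors block b does not use
        rows.append(W_b)
    return np.vstack(rows)


def sample_data(W, block_sizes, noise_std, n_samples, rng):
    """Draw X = Z W^T + noise, with one noise level per feature block."""
    n_factors = W.shape[1]
    Z = rng.normal(size=(n_samples, n_factors))
    # Expand the per-block noise levels to one value per feature.
    std_per_feature = np.repeat(noise_std, block_sizes)
    noise = rng.normal(size=(n_samples, W.shape[0])) * std_per_feature
    return Z @ W.T + noise, Z


rng = np.random.default_rng(0)
block_sizes = [4, 3, 5]                 # assumed: three measurement blocks
mask = np.array([[1, 1, 0],             # block 0: common factor + factor 1
                 [1, 0, 1],             # block 1: common factor + factor 2
                 [1, 0, 0]])            # block 2: common factor only
noise_std = np.array([0.1, 0.5, 1.0])   # assumed block-wise noise levels

W = make_block_sparse_loadings(block_sizes, mask, rng)
X, Z = sample_data(W, block_sizes, noise_std, n_samples=200, rng=rng)
```

In this toy layout, factor 0 plays the role of a common component shared by all blocks, while factors 1 and 2 are specific to a single block each. A method in the spirit of HCA would have to recover both the mask and the per-block noise levels from `X` alone.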
[1] I. Nabney and C. Bishop. Netlab: Netlab neural network software. http://www.ncrg.aston.ac.uk/netlab/, 1995.
[2] C. M. Bishop. Bayesian PCA. In Proceedings of the 11th Conference on Advances in Neural Information Processing Systems, pages 382–388. MIT Press, Cambridge, MA, USA, 1999.
[3] N. Srebro and T. Jaakkola. Weighted low rank matrix approximations. In Proceedings of the 20th International Conference on Machine Learning, pages 720–727, 2003.
[4] A. d’Aspremont, F. R. Bach, and L. El Ghaoui. Full regularization path for sparse principal component analysis. In Proceedings of the 24th International Conference on Machine Learning, 2007.
[5] S. Oba, M. Sato, I. Takemasa, M. Monden, K. Matsubara, and S. Ishii. A Bayesian missing value estimation method for gene expression profile data. Bioinformatics, 19(16):2088–2096, 2003.
[6] M. Ohira, S. Oba, Y. Nakamura, E. Isogai, S. Kaneko, A. Nakagawa, T. Hirata, H. Kubo, T. Goto, S. Yamada, Y. Yoshida, M. Fuchioka, S. Ishii, and A. Nakagawara. Expression profiling using a tumor-specific cDNA microarray predicts the prognosis of intermediate risk neuroblastomas. Cancer Cell, 7(4):337–350, Apr 2005.