cvpr cvpr2013 cvpr2013-201 cvpr2013-201-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hua Wang, Feiping Nie, Heng Huang, Chris Ding
Abstract: To better understand, search, and classify image and video information, many visual feature descriptors have been proposed to describe elementary visual characteristics such as shape, color, and texture. How to integrate these heterogeneous visual features and identify the important ones for specific vision tasks has become an increasingly critical problem. In this paper, we propose a novel Sparse Multimodal Learning (SMML) approach that integrates such heterogeneous features by using joint structured sparsity regularizations to learn feature importance for the vision tasks from both the group-wise and the individual points of view. A new optimization algorithm is also introduced to solve the non-smooth objective, with rigorously proved global convergence. We applied our SMML method to five broadly used object categorization and scene understanding image data sets for both single-label and multi-label image classification tasks. For each data set we integrate six different types of popularly used image features. Compared to existing scene and object categorization methods using either a single modality or multiple modalities of features, our approach always achieves better measured performance.
[1] A. Argyriou, T. Evgeniou, and M. Pontil. Multi-task feature learning. In NIPS, 2007.
[2] X. Cai, F. Nie, H. Huang, and F. Kamangar. Heterogeneous image feature integration via multi-modal spectral clustering. In CVPR, 2011.
[3] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR (1), pages 886–893, 2005.
[4] S. Gao, L. Chia, and I. Tsang. Multi-layer group sparse coding for concurrent image classification and annotation. In CVPR, 2011.
[5] P. Gehler and S. Nowozin. On feature combination for multiclass object classification. In ICCV, 2009.
[6] S. Ji, L. Tang, S. Yu, and J. Ye. A shared-subspace learning framework for multi-label classification. ACM Transactions on Knowledge Discovery from Data (TKDD), 4(2):1–29, 2010.
[7] A. Kapoor, K. Grauman, R. Urtasun, and T. Darrell. Gaussian processes for object categorization. International Journal of Computer Vision, 88(2):169–188, 2010.
[8] M. Kloft, U. Brefeld, P. Laskov, and S. Sonnenburg. Nonsparse multiple kernel learning. In NIPS.
[9] G. Lanckriet, N. Cristianini, P. Bartlett, L. Ghaoui, and M. Jordan. Learning the kernel matrix with semidefinite programming. JMLR, 5:27–72, 2004.
[10] D. Lewis, Y. Yang, T. Rose, and F. Li. RCV1: A new benchmark collection for text categorization research. The Journal of Machine Learning Research, 5:361–397, 2004.
[11] F. Nie, H. Wang, H. Huang, and C. Ding. Unsupervised and semi-supervised learning via l1-norm graph. In ICCV, pages 2268–2273, 2011.
[12] A. Oliva and A. B. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42(3):145–175, 2001.
[13] S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf. Large scale multiple kernel learning. JMLR, 7:1531–1565, 2006.
[14] J. Suykens, T. Van Gestel, and J. De Brabanter. Least squares support vector machines. World Scientific Pub Co Inc, 2002.
[15] H. Wang, H. Huang, and C. Ding. Image annotation using multi-label correlated Green’s function. In ICCV, 2009.
[16] H. Wang, F. Nie, H. Huang, S. L. Risacher, C. Ding, A. J. Saykin, L. Shen, and ADNI. A new sparse multi-task regression and feature selection method to identify brain imaging predictors for memory performance. ICCV 2011: IEEE Conference on Computer Vision, pages 557–562, 2011.
[17] H. Wang, F. Nie, H. Huang, S. L. Risacher, A. J. Saykin, L. Shen, et al. Identifying disease sensitive and quantitative trait-relevant biomarkers from multidimensional heterogeneous imaging genetics data via sparse multimodal multitask learning. Bioinformatics, 28(12):i127–i136, 2012.
[18] J. Wu and J. M. Rehg. Where am I: Place instance and category recognition using spatial PACT. In CVPR, 2008.
[19] J. Ye, S. Ji, and J. Chen. Multi-class discriminant kernel learning via convex programming. JMLR, 9:719–758, 2008.
[20] S. Yu, T. Falck, A. Daemen, L. Tranchevent, J. Suykens, B. De Moor, and Y. Moreau. L2-norm multiple kernel learning and its application to biomedical data fusion. BMC Bioinformatics, 11(1):309, 2010.
[21] M. Zhang and Z. Zhou. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognition, 40(7):2038–2048, 2007.