iccv iccv2013 iccv2013-258 iccv2013-258-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Tianzhu Zhang, Bernard Ghanem, Si Liu, Changsheng Xu, Narendra Ahuja
Abstract: In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of codewords. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding [36].
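The abstract describes coding a neighborhood of descriptors jointly as a code matrix that is both low-rank and sparse. As a rough illustration only, the sketch below assumes a generic objective of the form 0.5 * ||X - D Z||_F^2 + lam_rank * ||Z||_* + lam_sparse * ||Z||_1, where X holds the descriptors of one neighborhood as columns, D is the codebook, and Z is the code matrix. The objective, the function names, and the weights lam_rank and lam_sparse are assumptions made for illustration, not the formulation or solver given in the paper. The two helper functions are the standard proximal operators for the l1 and nuclear norms, which are the usual building blocks for problems of this shape.

```python
import numpy as np


def soft_threshold(Z, tau):
    """Proximal operator of tau * ||Z||_1: shrink every entry toward zero."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)


def singular_value_threshold(Z, tau):
    """Proximal operator of tau * ||Z||_*: shrink the singular values of Z."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def objective(X, D, Z, lam_rank, lam_sparse):
    """Assumed objective: reconstruction error + nuclear norm + l1 norm."""
    recon = 0.5 * np.linalg.norm(X - D @ Z, "fro") ** 2
    nuclear = np.linalg.norm(Z, "nuc")  # sum of singular values
    l1 = np.abs(Z).sum()                # sum of absolute entries
    return recon + lam_rank * nuclear + lam_sparse * l1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((128, 32))  # hypothetical codebook: 32 codewords
    X = rng.standard_normal((128, 9))   # hypothetical 3x3 patch neighborhood
    Z = np.linalg.lstsq(D, X, rcond=None)[0]
    # Applying both shrinkage steps lowers the rank and zeroes small entries.
    Z_shrunk = soft_threshold(singular_value_threshold(Z, 0.5), 0.05)
    print(np.linalg.matrix_rank(Z), np.linalg.matrix_rank(Z_shrunk))
    print(objective(X, D, Z_shrunk, lam_rank=1.0, lam_sparse=0.1))
```

The columns of Z are coupled only through the nuclear-norm term; with lam_rank set to zero the assumed objective separates into independent per-descriptor sparse coding problems, which is the contrast with independent coding that the abstract draws.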
[1] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Susstrunk. SLIC superpixels. In Technical report, EPFL, 2010.
[2] H. Bay, T. Tuytelaars, and L. V. Gool. SURF: Speeded up robust features. In ECCV, 2006.
[3] O. Boiman, I. Rehovot, E. Shechtman, and M. Irani. In defense of nearest-neighbor based image classification. In CVPR, 2008.
[4] A. Bosch, A. Zisserman, and X. Muñoz. Scene classification via pLSA. In ECCV, 2006.
[5] Y.-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. Learning mid-level features for recognition. In CVPR, 2010.
[6] Y.-L. Boureau, N. L. Roux, F. Bach, J. Ponce, and Y. LeCun. Ask the locals: multi-way local pooling for image recognition. In ICCV, 2011.
[7] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.
[8] O. Duchenne, A. Joulin, and J. Ponce. A graph-matching kernel for object categorization. In ICCV, 2011.
[9] L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. In CVIU, 2007.
[10] L. Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. In CVPR, 2005.
[11] S. Gao, I. Tsang, L. Chia, and P. Zhao. Local features are not lonely - Laplacian sparse coding for image classification. In CVPR, 2010.
[12] P. Gehler and S. Nowozin. On feature combination for multiclass object classification. In ICCV, 2009.
[13] G. Griffin, A. Holub, and P. Perona. Caltech-256 object category dataset. In Technical report, 2007.
[14] http://www.robots.ox.ac.uk/~vgg/software/MKL/.
[15] Y. Huang, K. Huang, Y. Yu, and T. Tan. Salient coding for image classification. In CVPR, 2011.
[16] A. Hyvarinen and P. O. Hoyer. A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. Vision Research, 41(18):2413–23, 2001.
[17] P. Jain, B. Kulis, and K. Grauman. Fast image search for learned metrics. In CVPR, 2008.
[18] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-constrained linear coding for image classification. In CVPR, 2010.
[19] J. Wu and J. Rehg. Beyond the Euclidean distance: Creating effective visual codebooks using the histogram intersection kernel. In ICCV, 2009.
[20] Y. Ke and R. Sukthankar. PCA-SIFT: A more distinctive representation for local image descriptors. In CVPR, 2004.
[21] J. Kim and K. Grauman. Asymmetric region-to-image matching for comparing images with generic object categories. In CVPR, 2010.
[22] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006.
[23] L. J. Li and L. Fei-Fei. What, where and who? classifying events by scene and object recognition. In ICCV, 2007.
[24] Z. Lin, A. Ganesh, J. Wright, L. Wu, M. Chen, and Y. Ma. Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. In Technical Report UILU-ENG-09-2214, UIUC, August 2009.
[25] L. Liu, L. Wang, and X. Liu. In defense of soft-assignment coding. In ICCV, 2011.
[26] S. Liu, J. Feng, Z. Song, T. Zhang, H. Lu, C. Xu, and S. Yan. Hi, magic closet, tell me what to wear! In ACM Multimedia, 2012.
[27] D. G. Lowe. Distinctive image features from scale-invariant keypoints. In IJCV, 2004.
[28] A. Oliva and A. Torralba. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 42(1):145–175, 2001.
[29] Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma. RASL: Robust Alignment by Sparse and Low-rank Decomposition for Linearly Correlated Images. TPAMI (to appear), 2011.
[30] F. Sadeghi and M. F. Tappen. Latent pyramidal regions for recognizing scenes. In ECCV, 2012.
[31] A. Shabou and H. Le-Borgne. Locality-constrained and spatially regularized coding for scene categorization. In CVPR, 2012.
[32] S. Todorovic and N. Ahuja. Learning subcategory relevances for category recognition. In CVPR, 2008.
[33] J. C. van Gemert, J.-M. Geusebroek, C. J. Veenman, and A. W. M. Smeulders. Kernel codebooks for scene categorization. In ECCV, 2008.
[34] A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman. Multiple kernels for object detection. In ICCV, 2009.
[35] J. Yang, Y. Li, Y. Tian, L. Duan, and W. Gao. Group-sensitive multiple kernel learning for object categorization. In ICCV, 2009.
[36] J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In CVPR, 2009.
[37] K. Yu, T. Zhang, and Y. Gong. Nonlinear learning using local coordinate coding. In NIPS, 2009.
[38] H. Zhang, A. C. Berg, M. Maire, and J. Malik. SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. In CVPR, 2006.
[39] T. Zhang, B. Ghanem, S. Liu, and N. Ahuja. Low-rank sparse learning for robust visual tracking. In European Conference on Computer Vision, pages 1–14, 2012.
[40] T. Zhang, B. Ghanem, S. Liu, and N. Ahuja. Robust visual tracking via structured multi-task sparse learning. International Journal of Computer Vision, 101(2):367–383, 2013.