104 nips-2009-Group Sparse Coding


Source: pdf

Author: Samy Bengio, Fernando Pereira, Yoram Singer, Dennis Strelow

Abstract: Bag-of-words document representations are often used in text, image and video processing. While it is relatively easy to determine a suitable word dictionary for text documents, there is no simple mapping from raw images or videos to dictionary terms. The classical approach builds a dictionary using vector quantization over a large set of useful visual descriptors extracted from a training set, and uses a nearest-neighbor algorithm to count the number of occurrences of each dictionary word in documents to be encoded. More robust approaches have been proposed recently that represent each visual descriptor as a sparse weighted combination of dictionary words. While favoring a sparse representation at the level of visual descriptors, these methods do not, however, ensure that images have a sparse representation. In this work, we use mixed-norm regularization to achieve sparsity at the image level as well as a small overall dictionary. This approach can also be used to encourage using the same dictionary words for all the images in a class, providing a discriminative signal in the construction of image representations. Experimental results on a benchmark image classification dataset show that when compact image or dictionary representations are needed for computational efficiency, the proposed approach yields better mean average precision in classification.
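
The two ideas contrasted in the abstract can be illustrated with a minimal NumPy sketch. The function names `bow_histogram` and `mixed_norm`, the toy data, and the choice of an l1/l2 norm as the mixed norm are assumptions made for illustration only; the abstract does not say which mixed norm is used, and this is not the authors' implementation.

```python
import numpy as np

def bow_histogram(descriptors, dictionary):
    """Classical encoding: count how many descriptors have each
    dictionary word as their nearest neighbor (Euclidean distance)."""
    # Pairwise squared distances, shape (n_descriptors, n_words).
    diffs = descriptors[:, None, :] - dictionary[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    nearest = np.argmin(sq_dists, axis=1)
    return np.bincount(nearest, minlength=dictionary.shape[0])

def mixed_norm(codes, p=2):
    """Assumed l1/lp mixed norm: sum over dictionary words (rows)
    of the lp norm of that word's weights across descriptors (columns)."""
    return float(np.sum(np.linalg.norm(codes, ord=p, axis=1)))

# Toy data, made up for illustration: 3 dictionary words, 4 descriptors in 2-D.
dictionary = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 0.1], [1.1, -0.1], [0.0, 0.8]])
hist = bow_histogram(descriptors, dictionary)

# Two codings of two descriptors with the same total squared weight (25):
shared = np.array([[3.0, 4.0], [0.0, 0.0], [0.0, 0.0]])     # both use word 0
scattered = np.array([[3.0, 0.0], [0.0, 4.0], [0.0, 0.0]])  # each uses its own word

print(hist)                   # occurrences of each dictionary word
print(mixed_norm(shared))     # penalty when descriptors share a word
print(mixed_norm(scattered))  # penalty when they use different words
```

In this toy case the penalty is lower when both descriptors reuse the same dictionary word than when each picks a different one, which is the sense in which a mixed norm can favor sparsity at the image level and not only per descriptor.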


reference text

[1] R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, Harlow, England, 1999.

[2] J. Duchi and Y. Singer. Boosting with structural sparsity. In International Conference on Machine Learning (ICML), 2009.

[3] M. Elad and M. Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12):3736–3745, 2006.

[4] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results. http://www.pascalnetwork.org/challenges/VOC/voc2007/workshop/index.html.

[5] H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. In Advances in Neural Information Processing Systems (NIPS), 2007.

[6] D. G. Lowe. Object recognition from local scale-invariant features. In International Conference on Computer Vision (ICCV), pages 1150–1157, 1999.

[7] J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In International Conference on Machine Learning (ICML), 2009.

[8] J. Mairal, M. Elad, and G. Sapiro. Sparse representation for color image restoration. IEEE Transactions on Image Processing, 17(1), 2008.

[9] J. Mairal, M. Leordeanu, F. Bach, M. Hebert, and J. Ponce. Discriminative sparse image models for class-specific edge detection and image interpretation. In European Conference on Computer Vision (ECCV), 2008.

[10] S. Negahban and M. Wainwright. Phase transitions for high-dimensional joint support recovery. In Advances in Neural Information Processing Systems 22, 2008.

[11] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.

[12] G. Obozinski, B. Taskar, and M. Jordan. Joint covariate selection for grouped classification. Technical Report 743, Dept. of Statistics, University of California, Berkeley, 2007.

[13] B. A. Olshausen and D. J. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37, 1997.

[14] P. Quelhas, F. Monay, J. M. Odobez, D. Gatica-Perez, T. Tuytelaars, and L. J. Van Gool. Modeling scenes with local descriptors and latent aspects. In International Conference on Computer Vision (ICCV), 2005.

[15] E. Tola, V. Lepetit, and P. Fua. A fast local descriptor for dense matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.

[16] J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.