iccv2013-384-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hua Wang, Feiping Nie, Weidong Cai, Heng Huang
Abstract: Representing the raw input of a data set by a set of relevant codes is crucial to many computer vision applications. Due to the intrinsic sparse property of real-world data, dictionary learning, in which the linear decomposition of a data point uses a set of learned dictionary bases, i.e., codes, has demonstrated state-of-the-art performance. However, traditional dictionary learning methods suffer from three weaknesses: sensitivity to noisy and outlier samples, difficulty in determining the optimal dictionary size, and inability to incorporate supervision information. In this paper, we address these weaknesses by learning a Semi-Supervised Robust Dictionary (SSR-D). Specifically, we use the ℓ2,0+ norm as the loss function to improve the robustness against outliers, and develop a new structured sparse regularization to incorporate the supervision information in dictionary learning, without incurring additional parameters. Moreover, the optimal dictionary size is automatically learned from the input data. Minimizing the derived objective function is challenging because it involves many non-smooth ℓ2,0+-norm terms. We present an efficient algorithm to solve the problem with a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of the proposed method.
From the introduction: ... make the learning tasks easier to deal with and reduce the computational cost. For example, in image tagging, instead of using the raw pixel-wise features, semi-local or patch-based features, such as SIFT and geometric blur, are usually more desirable to achieve better performance. In practice, finding a set of compact feature bases, also referred to as a dictionary, with enhanced representative and discriminative power plays a significant role in building a successful computer vision system. In this paper, we explore this important problem by proposing a novel formulation and its solution for learning a Semi-Supervised Robust Dictionary (SSRD), where we examine the challenges in dictionary learning and seek opportunities to overcome them and improve the dictionary quality.
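The abstract's key quantities can be made concrete with a short worked form. This is a hedged sketch, assuming the common convention that ℓ2,0+ denotes the matrix ℓ2,p norm as p → 0+; the symbols X, D, A, γ, and p below are illustrative choices, not the paper's exact notation or objective:

    \|M\|_{2,p}^{p} = \sum_i \|m^i\|_2^p, \quad 0 < p \ll 1,
    \qquad
    \min_{D,\,A}\; \big\|(X - DA)^\top\big\|_{2,p}^{p} + \gamma\, \|A\|_{2,p}^{p},

where m^i is the i-th row of M, the columns of X are data points, the columns of D are dictionary atoms, and A holds the codes. As p → 0+, each sample's residual contributes a nearly constant amount regardless of its magnitude, which blunts the influence of outliers; the row-wise penalty on A drives whole rows (atoms) to zero, which is one way an optimal dictionary size can emerge from the data rather than being fixed in advance.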
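Sums of non-smooth ℓ2,p terms like these are commonly minimized by iteratively reweighted least squares (IRLS), which replaces each term with a weighted quadratic surrogate so that every update is a closed-form linear solve. The sketch below illustrates that generic technique only; it is not the algorithm published in the paper, and every name and default in it (l2p_reweighted_dictionary, p, gamma, n_iter, the pruning threshold) is an assumption made for this example.

import numpy as np

def l2p_reweighted_dictionary(X, m, p=0.1, gamma=0.1, n_iter=50, eps=1e-8):
    # Illustrative IRLS loop for
    #   min_{D,A} sum_i ||x_i - D a_i||_2^p + gamma * sum_k ||A[k, :]||_2^p,
    # a generic l2,p minimization sketch, not the SSR-D algorithm itself.
    d, n = X.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((d, m))
    A = rng.standard_normal((m, n))
    for _ in range(n_iter):
        # IRLS weights: large residuals / large code rows receive small
        # weights, which is what makes the l2,p penalties robust to
        # outliers and row-sparsifying.
        R = X - D @ A
        w_s = (np.linalg.norm(R, axis=0) + eps) ** (p - 2)  # per sample
        w_a = (np.linalg.norm(A, axis=1) + eps) ** (p - 2)  # per atom row
        # Code update: for each sample i, solve the weighted ridge system
        #   (w_s[i] * D^T D + gamma * diag(w_a)) a_i = w_s[i] * D^T x_i.
        G = D.T @ D
        for i in range(n):
            A[:, i] = np.linalg.solve(
                w_s[i] * G + gamma * np.diag(w_a),
                w_s[i] * (D.T @ X[:, i]),
            )
        # Dictionary update: weighted least squares over all samples.
        W = np.diag(w_s)
        D = (X @ W @ A.T) @ np.linalg.pinv(A @ W @ A.T)
    keep = np.linalg.norm(A, axis=1) > 1e-6  # drop atoms with ~zero rows
    return D[:, keep], A[keep]

Rows of A that collapse during the iterations correspond to atoms that can be pruned, mimicking the automatic dictionary-size behavior the abstract describes.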
[1] M. Aharon, M. Elad, and A. Bruckstein. K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation. IEEE Transactions on Signal Processing, 54(11):4311–4322, 2006.
[2] A. Argyriou, T. Evgeniou, and M. Pontil. Multi-task feature learning. In NIPS, pages 41–48, 2007.
[3] S. Bengio, F. Pereira, Y. Singer, and D. Strelow. Group sparse coding. In NIPS, 2009.
[4] X. Cai, F. Nie, H. Huang, and C. Ding. Multi-class ℓ2,1-norm support vector machine. In ICDM, pages 91–100, 2011.
[5] E. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203–4215, 2005.
[6] E. Candès and M. Wakin. An introduction to compressive sampling. IEEE Signal Processing Magazine, 25(2):21–30, 2008.
[7] M. Elad and M. Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12):3736–3745, 2006.
[8] H. Lee, A. Battle, R. Raina, and A. Ng. Efficient sparse coding algorithms. In NIPS, 2007.
[9] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Discriminative learned dictionaries for local image analysis. In CVPR, pages 1–8, 2008.
[10] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Supervised dictionary learning. In NIPS, pages 1033–1040, 2009.
[11] F. Nie, H. Huang, X. Cai, and C. Ding. Efficient and Robust Feature Selection via Joint ℓ2,1-Norms Minimization. In NIPS, 2010.
[12] G. Obozinski, B. Taskar, and M. Jordan. Multi-task feature selection. Technical report, Department of Statistics, University of California, Berkeley, 2006.
[13] R. Raina, A. Battle, H. Lee, B. Packer, and A. Ng. Self-taught learning: Transfer learning from unlabeled data. In ICML, pages 759–766, 2007.
[14] R. Tibshirani. Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society, Series B (Methodological), 58(1):267–288, 1996.
[15] H. Wang, F. Nie, and H. Huang. Robust and discriminative self-taught learning. In ICML, pages 298–306, 2013.
[16] H. Wang, F. Nie, H. Huang, S. L. Risacher, C. Ding, A. J. Saykin, L. Shen, and ADNI. A new sparse multi-task regression and feature selection method to identify brain imaging predictors for memory performance. In ICCV, pages 557–562, 2011.
[17] M. Zhang and Z. Zhou. ML-KNN: A lazy learning approach to multi-label learning. Pattern Recognition, 40(7):2038–2048, 2007.
[18] Q. Zhang and B. Li. Discriminative K-SVD for dictionary learning in face recognition. In CVPR, pages 2691–2698, 2010.