
229 iccv-2013-Large-Scale Video Hashing via Structure Learning


Source: pdf

Author: Guangnan Ye, Dong Liu, Jun Wang, Shih-Fu Chang

Abstract: Recently, learning-based hashing methods have become popular for indexing large-scale media data. Hashing methods map high-dimensional features to compact binary codes that are efficient to match and robust in preserving original similarity. However, most of the existing hashing methods treat videos as a simple aggregation of independent frames and index each video through combining the indexes of frames. The structure information of videos, e.g., discriminative local visual commonality and temporal consistency, is often neglected in the design of hash functions. In this paper, we propose a supervised method that explores the structure learning techniques to design efficient hash functions. The proposed video hashing method formulates a minimization problem over a structure-regularized empirical loss. In particular, the structure regularization exploits the common local visual patterns occurring in video frames that are associated with the same semantic class, and simultaneously preserves the temporal consistency over successive frames from the same video. We show that the minimization objective can be efficiently solved by an Accelerated Proximal Gradient (APG) method. Extensive experiments on two large video benchmark datasets (up to around 150K video clips with over 12 million frames) show that the proposed method significantly outperforms the state-of-the-art hashing methods.
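To make the setting concrete, here is a minimal sketch of the frame-aggregation baseline that the abstract contrasts against. It is not the paper's structure-learning method. It assumes random Gaussian projections as hash functions, a sign threshold to produce bits, and a per-bit majority vote to pool frame codes into a video code. All function names (`make_projections`, `hash_frame`, `hash_video`, `hamming`) are hypothetical and chosen for illustration only.

```python
import random


def make_projections(num_bits, dim, seed=0):
    """Draw one random Gaussian direction per hash bit (an assumed, unlearned hash)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_frame(feature, projections):
    """Map one frame feature to a binary code: bit k is 1 if w_k . x >= 0, else 0."""
    return [
        1 if sum(w_i * x_i for w_i, x_i in zip(w, feature)) >= 0 else 0
        for w in projections
    ]


def hash_video(frames, projections):
    """Pool frame codes by per-bit majority vote, treating frames as independent."""
    codes = [hash_frame(f, projections) for f in frames]
    n = len(codes)
    return [1 if 2 * sum(bits) >= n else 0 for bits in zip(*codes)]


def hamming(a, b):
    """Number of differing bits between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    projections = make_projections(num_bits=16, dim=4)
    video_a = [[0.9, 0.1, 0.3, 0.5], [1.0, 0.2, 0.2, 0.4]]
    video_b = [[-0.8, 0.4, -0.1, 0.2], [-0.7, 0.5, 0.0, 0.1]]
    code_a = hash_video(video_a, projections)
    code_b = hash_video(video_b, projections)
    print(code_a, code_b, hamming(code_a, code_b))
```

Matching then reduces to comparing short bit vectors with the Hamming distance. This baseline ignores both the temporal order of frames and the class labels, which is the gap the proposed supervised, structure-regularized formulation is meant to address.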


reference text


[1] http://www.nist.gov/itl/iad/mig/med12.cfm/

[2] H. Bondell and B. Reich. Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR. Biometrics, 2008.

[3] L. Cao, Z. Li, Y. Mu, and S.-F. Chang. Submodular video hashing: A unified framework towards video pooling and indexing. In ACM MM, 2012.

[4] M. Douze, H. Jegou, and C. Schmid. An image-based approach to video copy detection with spatio-temporal post-filtering. TMM, 2010.

[5] J. Duchi, S. Shalev-Shwartz, Y. Singer, and T. Chandra. Efficient projections onto the ℓ1-ball for learning in high dimensions. In ICML, 2008.

[6] A. Gionis, P. Indyk, and R. Motwani. Similarity search in high dimensions via hashing. In VLDB, 1999.

[7] Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In CVPR, 2011.

[8] J. Heo, Y. Lee, J. He, S.-F. Chang, and S.-E. Yoon. Spherical hashing. In CVPR, 2012.

[9] Y.-G. Jiang, G. Ye, S.-F. Chang, D. Ellis, and A. C. Loui. Consumer video understanding: A benchmark database and an evaluation of human and machine performance. In ICMR, 2011.

[10] Y.-G. Jiang, X. Zeng, G. Ye, D. Ellis, S.-F. Chang, S. Bhattacharya, and M. Shah. Columbia-UCF TRECVID 2010 multimedia event detection: Combining multiple modalities, contextual concepts, and temporal matching. In TRECVID workshop, 2010.

[11] B. Kulis and T. Darrell. Learning to hash with binary reconstructive embeddings. In NIPS, 2009.

[12] P. Li, A. Shrivastava, J. Moore, and C. Konig. Hashing algorithms for large-scale learning. In NIPS, 2011.

[13] W. Liu, J. Wang, S. Kumar, and S.-F. Chang. Hashing with graphs. In ICML, 2011.

[14] W. Liu, J. Wang, R. Ji, Y.-G. Jiang, and S.-F. Chang. Supervised hashing with kernels. In CVPR, 2012.

[15] D. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 2004.

[16] K. Mikolajczyk and C. Schmid. Scale and affine invariant interest point detectors. IJCV, 2004.

[17] Y. Nesterov. Smooth minimization of non-smooth functions. Math. Program., 2005.

[18] G. Shakhnarovich. Learning task-specific similarity. PhD thesis, Massachusetts Institute of Technology, 2005.

[19] J. Song, Y. Yang, Z. Huang, H. Shen, and R. Hong. Multiple feature hashing for real-time large scale near-duplicate video retrieval. In ACM MM, 2011.

[20] A. Torralba, R. Fergus, and Y. Weiss. Small codes and large image databases for recognition. In CVPR, 2008.

[21] J. Wang, S. Kumar, and S.-F. Chang. Semi-supervised hashing for large scale search. PAMI, 2012.

[22] Y. Weiss, A. Torralba, and R. Fergus. Spectral hashing. In NIPS, 2008.

[23] H. Xu, J. Wang, Z. Li, G. Zeng, S. Li, and N. Yu. Complementary hashing for approximate nearest neighbor search. In ICCV, 2011.