
116 iccv-2013-Directed Acyclic Graph Kernels for Action Recognition


Source: pdf

Author: Ling Wang, Hichem Sahbi

Abstract: One of the trends of action recognition consists in extracting and comparing mid-level features which encode visual and motion aspects of objects in scenes. However, when scenes contain high-level semantic actions with many interacting parts, these mid-level features are not sufficient to capture high-level structures or high-order causal relationships between moving objects, resulting in a clear drop in performance. In this paper, we address this issue and propose an alternative action recognition method based on a novel graph kernel. As the main contributions of this work, we first describe actions in videos using directed acyclic graphs (DAGs), which naturally encode pairwise interactions between moving object parts, and then we compare these DAGs by analyzing the spectrum of their sub-patterns that capture complex higher-order interactions. This extraction and comparison process is computationally tractable, as a result of the acyclic property of DAGs, and it also defines a positive semi-definite kernel. When plugging the latter into support vector machines, we obtain an action recognition algorithm that outperforms related work, including graph-based methods, on a standard evaluation dataset.


reference text

[1] J. Aggarwal and M. Ryoo. Human Activity Analysis: A Review. ACM Computing Surveys, 43(3):16:1–16:43, 2011.

[2] F. R. Bach. Graph Kernels between Point Clouds. In ICML, 2008.

[3] W. Brendel and S. Todorovic. Learning Spatiotemporal Graphs of Human Activities. In ICCV, 2011.

[4] A. Gaidon, Z. Harchaoui, and C. Schmid. A time series kernel for action recognition. In BMVC, 2011.

[5] A. Gaidon, Z. Harchaoui, and C. Schmid. Recognizing activities with cluster-trees of tracklets. In BMVC, 2012.

[6] T. Gärtner, P. A. Flach, and S. Wrobel. On Graph Kernels: Hardness Results and Efficient Alternatives. In Proceedings of the 16th Annual Conference on Computational Learning Theory and the 7th Kernel Workshop, 2003.

[7] A. Gilbert, J. Illingworth, and R. Bowden. Action Recognition Using Mined Hierarchical Compound Features. IEEE Trans. Pattern Anal. Mach. Intell., 33(5):883–897, 2011.

[8] Z. Harchaoui and F. Bach. Image Classification with Segmentation Graph Kernels. In CVPR, 2007.

[9] Y. Jung, H. Park, D.-Z. Du, and B. L. Drake. A Decision Criterion for the Optimal Number of Clusters in Hierarchical Clustering. J. of Global Optimization, 25(1):91–111, 2003.

[10] U. Kang, H. Tong, and J. Sun. Fast Random Walk Graph Kernel. In SDM, 2012.

[11] A. Kovashka and K. Grauman. Learning a Hierarchy of Discriminative Space-Time Neighborhood Features for Human Action Recognition. In CVPR, 2010.

[12] N. Kriege and P. Mutzel. Subgraph Matching Kernels for Attributed Graphs. In ICML, 2012.

[13] T. Lan, Y. Wang, and G. Mori. Discriminative Figure-Centric Models for Joint Action Localization and Recognition. In ICCV, 2011.

[14] I. Laptev, M. Marszałek, C. Schmid, and B. Rozenfeld. Learning Realistic Human Actions from Movies. In CVPR, 2008.

[15] Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng. Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis. In CVPR, 2011.

[16] J. Liu, Y. Yang, I. Saleemi, and M. Shah. Learning semantic features for action recognition via diffusion maps. Computer Vision and Image Understanding, 116(3):361–377, 2012.

[17] J. Liu, Y. Yang, and M. Shah. Learning Semantic Visual Vocabularies Using Diffusion Distance. In CVPR, 2009.

[18] R. Poppe. A survey on vision-based human action recognition. Image and Vision Computing, 28(6):976–990, 2010.

[19] M. Raptis, I. Kokkinos, and S. Soatto. Discovering discriminative action parts from mid-level video representations. In CVPR, 2012.

[20] M. D. Rodriguez, J. Ahmed, and M. Shah. Action MACH: A Spatio-temporal Maximum Average Correlation Height Filter for Action Recognition. In CVPR, 2008.

[21] C. Schuldt, I. Laptev, and B. Caputo. Recognizing Human Actions: A Local SVM Approach. In ICPR, 2004.

[22] N. Shervashidze, S. Vishwanathan, T. Petri, K. Mehlhorn, and K. Borgwardt. Efficient graphlet kernels for large graph comparison. In AISTATS, 2009.

[23] S. Todorovic. Human Activities as Stochastic Kronecker Graphs. In ECCV, 2012.

[24] S. V. N. Vishwanathan, N. N. Schraudolph, R. I. Kondor, and K. M. Borgwardt. Graph Kernels. J. Mach. Learn. Res., 11:1201–1242, 2010.

[25] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Action Recognition by Dense Trajectories. In CVPR, 2011.

[26] H. Wang, M. M. Ullah, A. Kläser, I. Laptev, and C. Schmid. Evaluation of local spatio-temporal features for action recognition. In BMVC, 2009.

[27] F. Yuan, G.-S. Xia, H. Sahbi, and V. Prinet. Mid-level features and spatio-temporal context for activity recognition. Pattern Recognition, 45(12):4182–4191, 2012.

[28] W. Zhang, X. Wang, D. Zhao, and X. Tang. Graph Degree Linkage: Agglomerative Clustering on a Directed Graph. In ECCV, 2012.