
231 iccv-2013-Latent Multitask Learning for View-Invariant Action Recognition


Source: pdf

Author: Behrooz Mahasseni, Sinisa Todorovic

Abstract: This paper presents an approach to view-invariant action recognition, where human poses and motions exhibit large variations across different camera viewpoints. When each viewpoint of a given set of action classes is specified as a learning task, multitask learning appears suitable for achieving view invariance in recognition. We extend standard multitask learning to allow identifying: (1) latent groupings of action views (i.e., tasks), and (2) discriminative action parts, along with joint learning of all tasks. This is because it seems reasonable to expect that certain distinct views are more correlated than others, and thus identifying correlated views could improve recognition. Also, part-based modeling is expected to improve robustness against self-occlusion when actors are imaged from different views. Results on the benchmark datasets show that we outperform standard multitask learning by 21.9%, and the state-of-the-art alternatives by 4.5–6%.


reference text

[1] A. Argyriou, T. Evgeniou, and M. Pontil. Convex multi-task feature learning. Machine Learning, 73:243–272, 2008. 3, 5

[2] R. Caruana. Multitask learning. Machine Learning, 28(1):41–75, 1997. 1

[3] A. Farhadi and M. K. Tabrizi. Learning to recognize activities from the wrong view point. In ECCV, 2008. 1

[4] A. Farhadi, M. K. Tabrizi, I. Endres, and D. A. Forsyth. A latent model of discriminative aspect. In ICCV, 2009. 1

[5] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. PAMI, 32(9):1627–45, 2010. 2, 4

[6] N. Gkalelis, H. Kim, A. Hilton, N. Nikolaidis, and I. Pitas. The i3DPost multi-view and 3D human action/interaction database. CVMP, 2009. 4

[7] D. Gong and G. Medioni. Dynamic manifold warping for view invariant action recognition. ICCV, 2011. 1

[8] A. Iosifidis, N. Nikolaidis, and I. Pitas. Movement recognition exploiting multi-view information. In MMSP, 2010. 7

[9] I. N. Junejo, E. Dexter, I. Laptev, and P. Pérez. View-independent action recognition from temporal self-similarities. PAMI, 33(1):172–85, 2011. 1, 7

[10] Z. Kang and K. Grauman. Learning with whom to share in multi-task feature learning. ICML, 2011. 2, 3, 4, 5

[11] R. Li and T. Zickler. Discriminative virtual views for cross-view action recognition. CVPR, 2012. 1, 6, 7

[12] J. Liu, M. Shah, B. Kuipers, and S. Savarese. Cross-view action recognition via view knowledge transfer. CVPR, 2011. 1, 7

[13] N. Loeff and A. Farhadi. Scene discovery by matrix factorization. In ECCV, 2008. 1

[14] F. Lv and R. Nevatia. Single view human action recognition using key pose matching and viterbi path searching. In CVPR, 2007. 1

[15] V. Parameswaran and R. Chellappa. View invariance for human action recognition. IJCV, 66(1):83–101, 2006. 1

[16] A. Quattoni, M. Collins, and T. Darrell. Transfer learning for image classification with sparse prototype representations. In CVPR, 2008. 1

[17] C. Rao, A. Yilmaz, and M. Shah. View-invariant representation and recognition of actions. IJCV, 50(2):203–226, 2002. 1

[18] Y. Shen and H. Foroosh. View-invariant action recognition from point triplets. PAMI, 31(10):1898–905, 2009. 1

[19] T. F. Syeda-Mahmood, M. A. O. Vasilescu, and S. Sethi. Recognizing action events from multiple viewpoints. In IEEE Workshop on Detection and Recognition of Events in Video, pages 64–, 2001. 1

[20] A. Torralba, K. P. Murphy, and W. T. Freeman. Sharing visual features for multiclass and multiview object detection. PAMI, 29(5):854–869, 2007. 1

[21] H. Wang. Dense Trajectories. http://lear.inrialpes.fr/people/wang/dense_trajectories/, 2011. 4

[22] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Action Recognition by Dense Trajectories. In IEEE Conference on Computer Vision & Pattern Recognition, pages 3169–3176, Colorado Springs, United States, June 2011. 4

[23] D. Weinland, E. Boyer, and R. Ronfard. Action recognition from arbitrary views using 3D Exemplars. ICCV, 2007. 1

[24] D. Weinland, M. Özuysal, and P. Fua. Making action recognition robust to occlusions and viewpoint changes. ECCV, 2010. 4, 7

[25] D. Weinland, R. Ronfard, and E. Boyer. Free viewpoint action recognition using motion history volumes. CVIU, 104(2):249–257, 2006. 1, 4

[26] X. Wu and Y. Jia. View-invariant action recognition using latent kernelized structural SVM. In ECCV, 2012. 1, 4, 7

[27] Y. Wu. Mining actionlet ensemble for action recognition with depth cameras. In CVPR, 2012. 7

[28] P. Yan, S. M. Khan, and M. Shah. Learning 4D action feature models for arbitrary view action recognition. ICCV, 2008. 1

[29] A. Yilmaz and M. Shah. Actions sketch: A novel action representation. CVPR, 2005. 1

[30] C.-N. J. Yu and T. Joachims. Learning structural SVMs with latent variables. ICML, 2009. 2, 4

[31] T. Zhang, B. Ghanem, S. Liu, and N. Ahuja. Robust visual tracking via structured multi-task sparse learning. IJCV, 101(2):367–383, 2012. 1