
172 cvpr-2013-Finding Group Interactions in Social Clutter


Source: pdf

Author: Ruonan Li, Parker Porfilio, Todd Zickler

Abstract: We consider the problem of finding distinctive social interactions involving groups of agents embedded in larger social gatherings. Given a pre-defined gallery of short exemplar interaction videos, and a long input video of a large gathering (with approximately-tracked agents), we identify within the gathering small sub-groups of agents exhibiting social interactions that resemble those in the exemplars. The participants of each detected group interaction are localized in space; the extent of their interaction is localized in time; and when the gallery of exemplars is annotated with group-interaction categories, each detected interaction is classified into one of the pre-defined categories. Our approach represents group behaviors by dichotomous collections of descriptors for (a) individual actions, and (b) pairwise interactions; and it includes efficient algorithms for optimally distinguishing participants from by-standers in every temporal unit and for temporally localizing the extent of the group interaction. Most importantly, the method is generic and can be applied whenever numerous interacting agents can be approximately tracked over time. We evaluate the approach using three different video collections, two that involve humans and one that involves mice.


reference text

[1] M. Amer and S. Todorovic. A chains model for localizing participants of group activities in videos. In ICCV, 2011.

[2] X. Burgos-Artizzu, P. Dollar, D. Lin, D. Anderson, and P. Perona. Social behavior recognition in continuous videos. In CVPR, 2012.

[3] W. Choi and S. Savarese. A unified framework for multi-target tracking and collective activity recognition. In ECCV, 2012.

[4] M. Cristani, G. Paggetti, A. Fossati, L. Bazzani, D. Tosato, A. D. Bue, G. Menegaz, and V. Murino. Social interaction discovery by statistical analysis of f-formations. In BMVC, 2011.

[5] C. Crouch and E. Mazur. Peer instruction: Ten years of experience and results. American Journal of Physics, 69:970–977, 2001.

[6] P. Dollar, V. Rabaud, G. Cottrell, and S. Belongie. Behavior recognition via sparse spatio-temporal features. In VS-PETS, 2005.

[7] O. Duchenne, I. Laptev, J. Sivic, F. Bach, and J. Ponce. Automatic annotation of human actions in video. In ICCV, 2009.

[8] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. PAMI, 32(9):1627–1645, 2010.

[9] M. Grant and S. Boyd. CVX: Matlab software for disciplined convex programming, version 1.21. http://cvxr.com/cvx, Apr. 2011.

[10] A. Hakeem and M. Shah. Learning, detection and representation of multi-agent events in videos. Artificial Intelligence, 171:586–605, 2007.

[11] S. Hongeng and R. Nevatia. Multi-agent event recognition. In ICCV, 2001.

[12] S. Intille and A. Bobick. Recognizing planned, multiperson action. CVIU, 81:414–445, 2001.

[13] Y. Ke, R. Sukthankar, and M. Hebert. Volumetric features for video event detection. IJCV, 88(3):339–362, 2010.

[14] C. Lampert, M. Blaschko, and T. Hofmann. Efficient subwindow search: A branch and bound framework for object localization. PAMI, 31(12):2129–2142, 2009.

[15] T. Lan, Y. Wang, W. Yang, S. Robinovitch, and G. Mori. Discriminative latent models for recognizing contextual group activities. PAMI, 34(8):1549–1562, 2012.

[16] I. Laptev and P. Perez. Retrieving actions in movies. In ICCV, 2007.

[17] R. Li and R. Chellappa. Group motion segmentation using a spatiotemporal driving force model. In CVPR, 2010.

[18] R. Li, P. Porfilio, and T. Zickler. Finding group interactions in social clutter. Technical Report TR-01-13, Harvard School of Engineering and Applied Sciences, ftp://ftp.deas.harvard.edu/techreports/tr01-13.pdf, 2013.

[19] V. Morariu and L. Davis. Multi-agent event recognition in structured scenarios. In CVPR, 2011.

[20] B. Ni, S. Yan, and A. Kassim. Recognizing human group activities by localized causalities. In CVPR, 2009.

[21] M. S. Ryoo and J. K. Aggarwal. Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities. In ICCV, 2009.

[22] E. Shechtman and M. Irani. Space-time behavioral correlation. In CVPR, 2005.

[23] K. Weinberger, J. Blitzer, and L. Saul. Distance metric learning for large margin nearest neighbor classification. In NIPS, 2005.

[24] J. Yuan, Z. Liu, and Y. Wu. Discriminative video pattern search for efficient action detection. PAMI, 33(9):1728–1743, 2011.