
103 cvpr-2013-Decoding Children's Social Behavior


Source: pdf

Author: James M. Rehg, Gregory D. Abowd, Agata Rozga, Mario Romero, Mark A. Clements, Stan Sclaroff, Irfan Essa, Opal Y. Ousley, Yin Li, Chanho Kim, Hrishikesh Rao, Jonathan C. Kim, Liliana Lo Presti, Jianming Zhang, Denis Lantsman, Jonathan Bidwell, Zhefan Ye

Abstract: We introduce a new problem domain for activity recognition: the analysis of children's social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1–2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly available dataset containing over 160 sessions of a 3–5 minute child-adult interaction. In each session, the adult examiner followed a semi-structured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.


reference text

[1] W. Choi, K. Shahid, and S. Savarese. Learning context for collective activity recognition. In CVPR, 2011. 2

[2] F. Eyben, M. Wöllmer, and B. Schuller. openSMILE: The Munich versatile and fast open-source audio feature extractor. In Proc. ACM Multimedia, pages 1459–1462, 2010. 7

[3] A. Fathi, A. Farhadi, and J. M. Rehg. Understanding egocentric activities. In ICCV, 2011. 2

[4] A. Fathi, J. K. Hodgins, and J. M. Rehg. Social interactions: a first-person perspective. In CVPR, 2012. 2

[5] L. Gorelick, M. Blank, E. Shechtman, M. Irani, and R. Basri. Actions as Space-Time Shapes. IEEE Trans. PAMI, 29(12):2247–53, Dec. 2007. 2

[6] Y. Ke, R. Sukthankar, and M. Hebert. Volumetric features for video event detection. IJCV, 2010. 2

[7] J. Kim, H. Rao, and M. Clements. Investigating the use of formant based features for detection of affective dimensions in speech. In Proc. 4th Intl. Conf. on Affective Computing and Intelligent Interaction, pages 369–377, 2011. 7

[8] A. Kylliäinen and J. K. Hietanen. Skin conductance responses to another person’s gaze in children with autism. Journal of Autism and Developmental Disorders, 36(4):517–525, May 2006. 2

[9] T. Lan, Y. Wang, W. Yang, and G. Mori. Beyond actions: discriminative models for contextual group activities. In NIPS, 2010. 2

[10] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. Learning Realistic Human Actions from Movies. In CVPR, 2008. 2

[11] M. J. Marin-Jimenez, A. Zisserman, and V. Ferrari. “Here’s looking at you, kid.” Detecting people looking at each other in videos. In BMVC, 2011. 5

[12] R. Messing, C. Pal, and H. Kautz. Activity Recognition Using the Velocity Histories of Tracked Keypoints. In ICCV, 2009. 2

[13] V. I. Morariu and L. S. Davis. Multi-agent event recognition in structured scenarios. In CVPR, 2011. 2

[14] O. Y. Ousley, R. Arriaga, G. D. Abowd, and M. Morrier. Rapid assessment of social-communicative abilities in infants at risk for autism. Technical Report CBI-100, Center for Behavior Imaging, Georgia Tech, Jan 2012. Available at www.cbi.gatech.edu/techreports. 1, 3

[15] K. Prabhakar and J. M. Rehg. Categorizing turn-taking interactions. In ECCV, Florence, Italy, 2012. 2

[16] M. S. Ryoo and J. K. Aggarwal. Spatio-temporal relationship match: Video structure comparison for recognition of complex human activities. In ICCV, Kyoto, Japan, 2009. 2

[17] B. Schuller, M. Valstar, F. Eyben, G. McKeown, R. Cowie, and M. Pantic. The first international audio/visual emotion challenge. In Proc. 4th Intl. Conf. on Affective Computing and Intelligent Interaction, 2011. 7

[18] D. Tran and A. Sorokin. Human Activity Recognition with Metric Learning. ECCV, pages 548–561, 2008. 2

[19] O. Tuzel, F. Porikli, and P. Meer. Region covariance: A fast descriptor for detection and classification. ECCV, 2006. 6

[20] A. Wetherby, J. Woods, L. Allen, J. Cleary, H. Dickinson, and C. Lord. Early indicators of autism spectrum disorders in the second year of life. Journal of Autism and Developmental Disorders, 34:473–493, 2004. 1

[21] Z. Ye, Y. Li, A. Fathi, Y. Han, A. Rozga, G. D. Abowd, and J. M. Rehg. Detecting eye contact using wearable eyetracking glasses. In 2nd Workshop on Pervasive Eye Tracking and Mobile Eye-based Interaction (PETMEI), 2012. 5

[22] J. Zhang, L. Lo Presti, and S. Sclaroff. Online multi-person tracking by tracker hierarchy. In Proc. IEEE Conf. on Advanced Video and Signal Based Surveillance, 2012. 6