
nips2012-2: 3D Social Saliency from Head-mounted Cameras


Source: pdf

Authors: Hyun S. Park, Eakta Jain, Yaser Sheikh

Abstract: A gaze concurrence is a point in 3D where the gaze directions of two or more people intersect. It is a strong indicator of social saliency because the attention of the participating group is focused on that point. In scenes occupied by large groups of people, multiple concurrences may occur and transition over time. In this paper, we present a method to construct a 3D social saliency field and locate multiple gaze concurrences that occur in a social scene from videos taken by head-mounted cameras. We model the gaze as a cone-shaped distribution emanating from the center of the eyes, capturing the variation of eye-in-head motion. We calibrate the parameters of this distribution by exploiting the fixed relationship between the primary gaze ray and the head-mounted camera pose. The resulting gaze model enables us to build a social saliency field in 3D. We estimate the number and 3D locations of the gaze concurrences via provably convergent mode-seeking in the social saliency field. Our algorithm is applied to reconstruct multiple gaze concurrences in several real world scenes and evaluated quantitatively against motion-captured ground truth.
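
As a rough illustration of the pipeline described in the abstract, the sketch below models each gaze as a cone-shaped density emanating from an eye center, sums the densities of all viewers into a 3D saliency field, and hill-climbs that field to a mode, i.e., a candidate gaze concurrence. This is not the authors' implementation: the Gaussian angular and range kernels, all parameter values, and the finite-difference gradient ascent (standing in for the paper's provably convergent mode-seeking) are assumptions made only to keep the example self-contained and runnable.

```python
# Illustrative sketch only (not the paper's code): gaze rays -> cone-shaped densities
# -> summed 3D social saliency field -> gradient-ascent mode seeking for a concurrence.
# Kernel shapes, parameter values, and the ascent scheme are assumptions.
import numpy as np

def cone_density(points, origin, direction, sigma_angle=0.15, sigma_range=3.0):
    """Un-normalized cone-shaped gaze density at 3D points (N x 3), peaked along
    the gaze ray and decaying with angular deviation and distance from the eye."""
    v = points - origin
    dist = np.linalg.norm(v, axis=1) + 1e-9
    angle = np.arccos(np.clip(v @ direction / dist, -1.0, 1.0))
    return np.exp(-0.5 * (angle / sigma_angle) ** 2) * np.exp(-0.5 * (dist / sigma_range) ** 2)

def saliency(points, gazes):
    """Social saliency field: sum of the cone densities of all gaze rays."""
    return sum(cone_density(points, o, d) for o, d in gazes)

def seek_mode(x0, gazes, step=0.02, max_iter=500, eps=1e-4):
    """Hill-climb the saliency field from x0 with finite-difference gradient ascent
    (a simple stand-in for the provably convergent mode seeking in the paper)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        grad = np.zeros(3)
        for k in range(3):
            e = np.zeros(3)
            e[k] = eps
            grad[k] = (saliency((x + e)[None, :], gazes)[0]
                       - saliency((x - e)[None, :], gazes)[0]) / (2 * eps)
        n = np.linalg.norm(grad)
        if n < 1e-6:
            break
        x = x + step * grad / n
    return x

if __name__ == "__main__":
    # Two viewers whose gaze rays intersect near (1, 1, 1): the overlap of their cones
    # creates a saliency mode there, which mode seeking recovers approximately.
    gazes = [
        (np.array([0.0, 0.0, 0.0]), np.array([1.0, 1.0, 1.0]) / np.sqrt(3)),
        (np.array([2.0, 0.0, 0.0]), np.array([-1.0, 1.0, 1.0]) / np.sqrt(3)),
    ]
    print("estimated concurrence:", np.round(seek_mode([0.8, 0.9, 1.1], gazes), 2))
```

In the method the abstract describes, mode seeking would be started from many points and the resulting modes grouped, so that both the number and the 3D locations of the concurrences are estimated; the sketch above follows only a single starting point.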


References

[1] D. Marr. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W. H. Freeman, 1982.

[2] N. Snavely, S. M. Seitz, and R. Szeliski. Photo tourism: Exploring photo collections in 3D. TOG, 2006.

[3] R. Fergus, P. Perona, and A. Zisserman. Object class recognition by unsupervised scale-invariant learning. In CVPR, 2003.

[4] A. Gupta, S. Satkin, A. A. Efros, and M. Hebert. From scene geometry to human workspace. In CVPR, 2011.

[5] A. Vinciarelli, M. Pantic, and H. Bourlard. Social signal processing: Survey of an emerging domain. Image and Vision Computing, 2009.

[6] E. Murphy-Chutorian and M. M. Trivedi. Head pose estimation in computer vision: A survey. TPAMI, 2009.

[7] R. S. Jampel and D. X. Shi. The primary position of the eyes, the resetting saccade, and the transverse visual head plane. Head movements around the cervical joints. Investigative Ophthalmology and Visual Science, 1992.

[8] R. R. Murphy. Human-robot interaction in rescue robotics. IEEE Trans. on Systems, Man and Cybernetics, 2004.

[9] S. Marks, B. Wünsche, and J. Windsor. Enhancing virtual environment-based surgical teamwork training with non-verbal communication. In GRAPP, 2009.

[10] N. Bilton. A rose-colored view may come standard: Google Glass. The New York Times, April 2012.

[11] J.-G. Wang and E. Sung. Study on eye gaze estimation. IEEE Trans. on Systems, Man and Cybernetics, 2002.

[12] E. D. Guestrin and M. Eizenman. General theory of remote gaze estimation using the pupil center and corneal reflection. IEEE Trans. on Biomedical Engineering, 2006.

[13] C. Hennessey and P. Lawrence. 3D point-of-gaze estimation on a volumetric display. In ETRA, 2008.

[14] D. Li, J. Babcock, and D. J. Parkhurst. openEyes: a low-cost head-mounted eye-tracking solution. In ETRA, 2006.

[15] K. Takemura, Y. Kohashi, T. Suenaga, J. Takamatsu, and T. Ogasawara. Estimating 3D point-of-regard and visualizing gaze trajectories under natural head movements. In ETRA, 2010.

[16] N. J. Emery. The eyes have it: the neuroethology, function and evolution of social gaze. Neuroscience and Biobehavioral Reviews, 2000.

[17] A. H. Gee and R. Cipolla. Determining the gaze of faces in images. Image and Vision Computing, 1994.

[18] P. Ballard and G. C. Stockman. Controlling a computer via facial aspect. IEEE Trans. on Systems, Man and Cybernetics, 1995.

[19] R. Rae and H. J. Ritter. Recognition of human head orientation based on artificial neural networks. IEEE Trans. on Neural Networks, 1998.

[20] N. M. Robertson and I. D. Reid. Estimating gaze direction from low-resolution faces in video. In ECCV, 2006.

[21] B. Noris, K. Benmachiche, and A. G. Billard. Calibration-free eye gaze direction detection with Gaussian processes. In GRAPP, 2006.

[22] S. M. Munn and J. B. Pelz. 3D point-of-regard, position and head orientation from a portable monocular video-based eye tracker. In ETRA, 2008.

[23] G. Welch and E. Foxlin. Motion tracking: no silver bullet, but a respectable arsenal. IEEE Computer Graphics and Applications, 2002.

[24] F. Pirri, M. Pizzoli, and A. Rudi. A general method for the point of regard estimation in 3D space. In CVPR, 2011.

[25] R. Stiefelhagen, M. Finke, J. Yang, and A. Waibel. From gaze to focus of attention. In VISUAL, 1999.

[26] K. Smith, S. O. Ba, J.-M. Odobez, and D. Gatica-Perez. Tracking the visual focus of attention for a varying number of wandering people. TPAMI, 2008.

[27] L. Bazzani, D. Tosato, M. Cristani, M. Farenzena, G. Paggetti, G. Menegaz, and V. Murino. Social interactions by visual focus of attention in a three-dimensional environment. Expert Systems, 2011.

[28] M. Cristani, L. Bazzani, G. Paggetti, A. Fossati, D. Tosato, A. Del Bue, G. Menegaz, and V. Murino. Social interaction discovery by statistical analysis of F-formations. In BMVC, 2011.

[29] A. Fathi, J. K. Hodgins, and J. M. Rehg. Social interaction: A first-person perspective. In CVPR, 2012.

[30] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2004.

[31] T. Shiratori, H. S. Park, L. Sigal, Y. Sheikh, and J. K. Hodgins. Motion capture from body-mounted cameras. TOG, 2011.

[32] M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 1981.

[33] V. Lepetit, F. Moreno-Noguer, and P. Fua. EPnP: An accurate O(n) solution to the PnP problem. IJCV, 2009.

[34] H. Misslisch, D. Tweed, and T. Vilis. Neural constraints on eye motion in human eye-head saccades. Journal of Neurophysiology, 1998.

[35] E. M. Klier, H. Wang, A. G. Constantin, and J. D. Crawford. Midbrain control of three-dimensional head orientation. Science, 2002.

[36] D. E. Angelaki and B. J. M. Hess. Control of eye orientation: where does the brain’s role end and the muscle’s begin? European Journal of Neuroscience, 2004.

[37] K. Fukunaga and L. D. Hostetler. The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans. on Information Theory, 1975.