iccv iccv2013 iccv2013-155 iccv2013-155-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xiaoyu Ding, Wen-Sheng Chu, Fernando De la Torre, Jeffrey F. Cohn, Qiao Wang
Abstract: Automatic facial Action Unit (AU) detection from video is a long-standing problem in facial expression analysis. AU detection is typically posed as a classification problem between frames or segments of positive and negative examples, where existing work emphasizes the use of different features or classifiers. In this paper, we propose a method called Cascade of Tasks (CoT) that combines the use of different tasks (i.e., frame, segment and transition) for AU event detection. We train CoT in a sequential manner that embraces diversity, which ensures robustness and generalization to unseen data. In addition to conventional frame-based metrics that evaluate frames independently, we propose a new event-based metric to evaluate detection performance at the event level. We show that the CoT method consistently outperforms state-of-the-art approaches in both frame-based and event-based metrics, across three public datasets that differ in complexity: CK+, FERA and RU-FACS.
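The abstract does not reproduce the paper's exact event-based metric, but the idea of scoring detections at the event level rather than per frame can be illustrated with a minimal sketch. The example below assumes AU events are frame intervals and counts a ground-truth event as detected when some predicted event overlaps it by at least an intersection-over-union threshold; the names (`event_f1`, `min_overlap`) and the overlap criterion are illustrative assumptions, not the paper's definition.

```python
def interval_iou(a, b):
    """Intersection-over-union of two frame intervals given as (start, end)."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def event_f1(gt_events, pred_events, min_overlap=0.5):
    """Event-level F1: recall over ground-truth events, precision over predicted ones.

    An event counts as a hit if it overlaps an event from the other set by
    at least `min_overlap` IoU (an assumed matching rule).
    """
    tp_gt = sum(1 for g in gt_events
                if any(interval_iou(g, p) >= min_overlap for p in pred_events))
    tp_pred = sum(1 for p in pred_events
                  if any(interval_iou(p, g) >= min_overlap for g in gt_events))
    recall = tp_gt / len(gt_events) if gt_events else 0.0
    precision = tp_pred / len(pred_events) if pred_events else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: two ground-truth AU events; one is detected well, one is missed,
# and one spurious detection is produced.
gt = [(10, 40), (100, 130)]
pred = [(12, 38), (200, 220)]
print(event_f1(gt, pred))  # 0.5 (precision 0.5, recall 0.5)
```

Unlike frame-based metrics, this score is unchanged whether an event spans 10 frames or 100, which is why the two kinds of metrics can rank detectors differently.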
[1] Z. Ambadar, J. F. Cohn, and L. I. Reed. All smiles are not created equal: Morphology and timing of smiles perceived as amused, polite, and embarrassed/nervous. Journal of Nonverbal Behavior, 33(1):17–34, 2009.
[2] M. S. Bartlett, G. C. Littlewort, M. G. Frank, C. Lainscsek, I. R. Fasel, and J. R. Movellan. Automatic recognition of facial actions in spontaneous expressions. Journal of Multimedia, 1(6):22–35, 2006.
[3] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2(3):27:1–27:27, 2011.
[4] K. Y. Chang, T. L. Liu, and S. H. Lai. Learning partially-observed hidden conditional random fields for facial expression recognition. In CVPR, 2009.
[5] S. Chew, P. Lucey, S. Lucey, J. Saragih, J. F. Cohn, S. Sridharan, et al. Person-independent facial expression detection using constrained local models. In AFGR, 2011.
[6] W.-S. Chu, F. De la Torre, and J. F. Cohn. Selective transfer machine for personalized facial action unit detection. In CVPR, 2013.
[7] W.-S. Chu, F. Zhou, and F. De la Torre. Unsupervised temporal commonality discovery. In ECCV, 2012.
[8] J. F. Cohn, Z. Ambadar, and P. Ekman. Observer-based measurement of facial expression with the Facial Action Coding System. Oxford University Press Series in Affective Science, New York, NY: Oxford University Press, 2007.
[9] J. F. Cohn and F. De la Torre. Automated face analysis for affective computing. In Handbook of Affective Computing. Oxford University Press, New York, NY, in press.
[10] F. De la Torre and J. F. Cohn. Facial expression analysis. Visual Analysis of Humans: Looking at People, page 377, 2011.
[11] F. De la Torre, T. Simon, Z. Ambadar, and J. F. Cohn. FASTFACS: A computer-assisted system to increase speed and reliability of manual FACS coding. In Affective Computing and Intelligent Interaction (ACII), 2011.
[12] P. Ekman, W. V. Friesen, and J. C. Hager. Facial action coding system: Research Nexus. Network Research Information, Salt Lake City, UT, 2002.
[13] P. Ekman and E. Rosenberg. What the face reveals. Oxford University Press, New York, NY, 2nd edition, 2005.
[14] C. E. Fairbairn, M. A. Sayette, J. M. Levine, J. F. Cohn, and K. G. Creswell. The effects of alcohol on the emotional displays of Whites in interracial groups. Emotion, 13(3):468– 477, 2013.
[15] M. Hoai, Z.-Z. Lan, and F. De la Torre. Joint segmentation and classification of human actions in video. In CVPR, 2011.
[16] B. Jiang, M. F. Valstar, and M. Pantic. Action unit detection using sparse appearance descriptors in space-time video volumes. In AFGR, 2011.
[17] P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews. The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression. In CVPR Workshops, 2010.
[18] S. Lucey, A. B. Ashraf, and J. F. Cohn. Investigating spontaneous facial action recognition through AAM representations of the face. Face Recognition, pages 275–286, 2007.
[19] A. Martinez and S. Du. A model of the perception of facial expressions of emotion by humans: Research overview and perspectives. Journal of Machine Learning Research, 13:1589–1608, 2012.
[20] I. Matthews and S. Baker. Active appearance models revisited. IJCV, 60(2):135–164, 2004.
[21] M. Pantic and M. S. Bartlett. Machine analysis of facial expressions. Face Recognition, 2(8):377–416, 2007.
[22] S. Park, G. Mohammadi, R. Artstein, and L.-P. Morency. Crowdsourcing micro-level multimedia annotations: The challenges of evaluation and interface. In Proceedings of the ACM Multimedia Workshops, 2012.
[23] A. Rakotomamonjy, F. Bach, S. Canu, and Y. Grandvalet. SimpleMKL. Journal of Machine Learning Research, 9:2491–2521, 2008.
[24] O. Rudovic, V. Pavlovic, and M. Pantic. Kernel Conditional Ordinal Random Fields for Temporal Segmentation of Facial Action Units. In ECCV Workshops, 2012.
[25] T. Senechal, V. Rapp, H. Salam, R. Seguier, K. Bailly, and L. Prevost. Facial action recognition combining heterogeneous features via multikernel learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 42(4):993–1005, May 2012.
[26] L. Shang. Nonparametric discriminant HMM and application to facial expression recognition. In CVPR, 2009.
[27] T. Simon, M. H. Nguyen, F. De la Torre, and J. F. Cohn. Action unit detection with segment-based SVMs. In CVPR, 2010.
[28] U. Tariq and T. Huang. Features and fusion for expression recognition: A comparative analysis. In CVPR, 2012.
[29] U. Tariq, K.-H. Lin, Z. Li, X. Zhou, Z. Wang, V. Le, T. S. Huang, X. Lv, and T. X. Han. Emotion recognition from an ensemble of features. In AFGR, Mar. 2011.
[30] Y. Tong, J. Chen, and Q. Ji. A unified probabilistic framework for spontaneous facial action modeling and understanding. PAMI, 32(2):258–273, 2010.
[31] M. F. Valstar, M. Mehu, B. Jiang, M. Pantic, and K. Scherer. Meta-analysis of the first facial expression recognition challenge. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 42(4):966–979, 2012.
[32] M. F. Valstar and M. Pantic. Fully automatic recognition of the temporal phases of facial actions. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 42(1):28–43, 2012.
[33] T. Wu, N. J. Butko, P. Ruvolo, J. Whitehill, M. S. Bartlett, and J. R. Movellan. Multilayer architectures for facial action unit recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 42(4):1027–1038, 2012.
[34] X. Wu and R. Srihari. Incorporating prior knowledge with weighted margin support vector machines. In SIGKDD. ACM Press, 2004.
[35] X. Xiong and F. De la Torre. Supervised descent method and its applications to face alignment. In CVPR, 2013.
[36] G. Zhao and M. Pietikäinen. Boosted multi-resolution spatiotemporal descriptors for facial expression recognition. Pattern Recognition Letters, 30(12):1117–1127, Sept. 2009.
[37] L. Zhong, Q. Liu, P. Yang, B. Liu, J. Huang, and D. N. Metaxas. Learning active facial patches for expression analysis. In CVPR, 2012.
[38] F. Zhou, F. De la Torre, and J. F. Cohn. Unsupervised discovery of facial events. In CVPR, June 2010.
[39] Y. Zhu, F. De la Torre, J. F. Cohn, and Y.-J. Zhan. Dynamic cascades with bidirectional bootstrapping for action unit detection in spontaneous facial behavior. IEEE Transactions on Affective Computing, pages 1–14, 2011.