cvpr cvpr2013 cvpr2013-420 cvpr2013-420-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xuehan Xiong, Fernando De la Torre
Abstract: Many computer vision problems (e.g., camera calibration, image alignment, structure from motion) are solved through a nonlinear optimization method. It is generally accepted that 2nd order descent methods are the most robust, fast and reliable approaches for nonlinear optimization of a general smooth function. However, in the context of computer vision, 2nd order descent methods have two main drawbacks: (1) The function might not be analytically differentiable and numerical approximations are impractical. (2) The Hessian might be large and not positive definite. To address these issues, this paper proposes a Supervised Descent Method (SDM) for minimizing a Non-linear Least Squares (NLS) function. During training, the SDM learns a sequence of descent directions that minimizes the mean of NLS functions sampled at different points. In testing, SDM minimizes the NLS objective using the learned descent directions without computing the Jacobian or the Hessian. We illustrate the benefits of our approach in synthetic and real examples, and show how SDM achieves state-of-the-art performance in the problem of facial feature detection. The code is available at www.humansensing.cs.cmu.edu/intraface.
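The training and testing procedure summarized in the abstract can be sketched on a one-dimensional toy problem. This is a minimal illustration under stated assumptions, not the authors' released code: the function h(x) = sin(x), the sampling range [-1, 1], the fixed start x0 = 0, the small ridge term, and the names `train_sdm` and `apply_sdm` are all introduced here for the example. Each learned step is a linear map from the current residual y - h(x) to an update of x, fitted by least squares over the training samples, so no Jacobian or Hessian of h is evaluated at test time.

```python
import numpy as np


def h(x):
    # Hypothetical smooth nonlinear function chosen for this demo.
    return np.sin(x)


def train_sdm(x_star, x0, n_steps=5, ridge=1e-8):
    """Learn a sequence of (r_k, b_k) pairs, one per descent step.

    x_star: ground-truth minimizers for the training samples.
    x0:     initial estimates, same shape as x_star.
    ridge:  small regularizer (an assumption, added for numerical safety).
    """
    y = h(x_star)                      # observed targets, y = h(x*)
    x = np.array(x0, dtype=float)      # current estimates
    steps = []
    for _ in range(n_steps):
        residual = y - h(x)
        # Design matrix: residual plus a constant column for the bias b_k.
        A = np.column_stack([residual, np.ones_like(residual)])
        # Regress the ideal update (x* - x) on the residual.
        w = np.linalg.solve(A.T @ A + ridge * np.eye(2), A.T @ (x_star - x))
        steps.append((w[0], w[1]))
        x = x + A @ w                  # apply the learned step to the training set
    return steps


def apply_sdm(y, x0, steps):
    """Run the learned descent steps on new targets y, starting from x0."""
    x = np.array(x0, dtype=float)
    for r, b in steps:
        x = x + r * (y - h(x)) + b
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_train = rng.uniform(-1.0, 1.0, size=1000)
    steps = train_sdm(x_train, np.zeros_like(x_train))

    x_test = rng.uniform(-1.0, 1.0, size=200)
    x_hat = apply_sdm(h(x_test), np.zeros_like(x_test), steps)
    print("mean abs error before:", np.mean(np.abs(x_test)))
    print("mean abs error after: ", np.mean(np.abs(x_hat - x_test)))
```

In a vision setting the scalar residual would be replaced by a feature vector extracted around the current estimate, and each (r_k, b_k) pair by a matrix and bias vector; the structure of the loop stays the same.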
[1] K. T. Abou-Moustafa, F. De la Torre, and F. P. Ferrie. Pareto discriminant analysis. In CVPR, 2010. 1
[2] S. Baker and I. Matthews. Lucas-Kanade 20 years on: A unifying framework. IJCV, 56(3):221–255, March 2004. 2
[3] M. Bartlett, G. Littlewort, M. Frank, C. Lainscsek, I. Fasel, and J. Movellan. Automatic recognition of facial actions in spontaneous expressions. Journal of Multimedia, 1(6):22–35, 2006. 6
[4] P. N. Belhumeur, D. W. Jacobs, D. J. Kriegman, and N. Kumar. Localizing parts of faces using a consensus of exemplars. In CVPR, 2011. 2, 3, 5, 6
[5] M. J. Black and A. D. Jepson. Eigentracking: Robust matching and tracking of objects using view-based representation. IJCV, 26(1):63–84, 1998. 1, 2
[6] V. Blanz and T. Vetter. A morphable model for the synthesis of 3D faces. In SIGGRAPH, 1999. 2
[7] G. Bradski. The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 2000. 5
[8] A. Buchanan and A. W. Fitzgibbon. Damped Newton algorithms for matrix factorization with missing data. In CVPR, 2005. 1
[9] R. H. Byrd, P. Lu, and J. Nocedal. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific and Statistical Computing, 16(5):1190–1208, 1995. 2
[10] X. Cao, Y. Wei, F. Wen, and J. Sun. Face alignment by explicit shape regression. In CVPR, 2012. 4, 6
[11] T. Cootes, G. Edwards, and C. Taylor. Active appearance models. TPAMI, 23(6):681–685, 2001. 2, 4, 6
[12] T. F. Cootes, M. C. Ionita, C. Lindner, and P. Sauer. Robust and accurate shape model fitting using random forest regression voting. In ECCV, 2012. 3
[13] D. Cristinacce and T. Cootes. Automatic feature localisation with constrained local models. Journal of Pattern Recognition, 41(10):3054–3067, 2008. 3, 6
[14] F. De la Torre and M. H. Nguyen. Parameterized kernel principal component analysis: Theory and applications to supervised and unsupervised image alignment. In CVPR, 2008. 1, 2
[15] P. Dollár, P. Welinder, and P. Perona. Cascaded pose regression. In CVPR, 2010. 2, 4
[16] J. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5):1189–1232, 2001. 2, 4
[17] R. Gross, I. Matthews, J. Cohn, T. Kanade, and S. Baker. Multi-pie. In AFGR, 2007. 6
[18] Y. Huang, Q. Liu, and D. N. Metaxas. A component-based framework for generalized face alignment. IEEE Transactions on Systems, Man, and Cybernetics, 41(1):287–298, 2011. 3
[19] M. J. Jones and T. Poggio. Multidimensional morphable models. In ICCV, 1998. 2
[20] M. Kim, S. Kumar, V. Pavlovic, and H. Rowley. Face tracking and recognition with visual constraints in real-world videos. In CVPR, 2008. 2, 6
[21] D. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004. 2
[22] B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In Proceedings of Imaging Understanding Workshop, 1981. 1, 2
[23] I. Matthews and S. Baker. Active appearance models revisited. IJCV, 60(2):135–164, 2004. 6
[24] S. Rivera and A. M. Martinez. Learning deformable shape manifolds. Pattern Recognition, 45(4):1792–1801, 2012. 3
[25] E. Sanchez, F. De la Torre, and D. Gonzalez. Continuous regression for nonrigid image alignment. In ECCV, 2012. 3
[26] J. Saragih. Principal regression analysis. In CVPR, 2011. 2, 3, 5, 6
[27] J. Saragih and R. Goecke. A nonlinear discriminative approach to AAM fitting. In ICCV, 2007. 2, 4
[28] J. Saragih, S. Lucey, and J. Cohn. Face alignment through subspace constrained mean-shifts. In ICCV, 2009. 3
[29] P. Tresadern, P. Sauer, and T. F. Cootes. Additive update predictors in active appearance models. In BMVC, 2010. 2, 4
[30] G. Tzimiropoulos, S. Zafeiriou, and M. Pantic. Robust and efficient parametric face alignment. In ICCV, 2011. 2
[31] X. Zhu and D. Ramanan. Face detection, pose estimation, and landmark localization in the wild. In CVPR, 2012. 3
[32] K. Zimmermann, J. Matas, and T. Svoboda. Tracking by an optimal sequence of linear predictors. TPAMI, 31(4):677–692, 2009. 3
Figure 6: Example results from our method on LFPW dataset. The first two rows show faces with strong changes in pose and illumination, and faces partially occluded. The last row shows the 10 worst images measured by normalized mean error.
Figure 7: Example results on LFW-A&C dataset.
Figure 8: Comparison between the tracking results from SDM (top row) and person-specific tracker (bottom row).
Figure 9: Example results on the Youtube Celebrity dataset.