
130 iccv-2013-Dynamic Structured Model Selection

Source: pdf

Author: David Weiss, Benjamin Sapp, Ben Taskar

Abstract: In many cases, the predictive power of structured models for complex vision tasks is limited by a trade-off between the expressiveness and the computational tractability of the model. However, choosing this trade-off statically a priori is suboptimal, as images and videos in different settings vary tremendously in complexity. On the other hand, choosing the trade-off dynamically requires knowledge about the accuracy of different structured models on any given example. In this work, we propose a novel two-tier architecture that provides dynamic speed/accuracy trade-offs through a simple type of introspection. Our approach, which we call dynamic structured model selection (DMS), leverages typically intractable features in structured learning problems in order to automatically determine which of several models should be used at test-time in order to maximize accuracy under a fixed budgetary constraint. We demonstrate DMS on two sequential modeling vision tasks, and we establish a new state-of-the-art in human pose estimation in video with an implementation that is roughly 23× faster than the previous standard implementation.
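
The abstract describes choosing, per example, which of several models to run so that accuracy is maximized under a fixed budget. The following is only an illustrative sketch of that general idea, not the authors' DMS algorithm: it assumes a hypothetical selector has already produced an accuracy estimate for every (example, model) pair, and it uses a simple greedy rule (upgrade whichever example offers the best estimated gain per unit of extra cost). The names `select_models`, `predicted_accuracy`, `model_cost`, and `budget` are made up for this example.

```python
from typing import List, Sequence


def select_models(
    predicted_accuracy: Sequence[Sequence[float]],
    model_cost: Sequence[float],
    budget: float,
) -> List[int]:
    """Greedily pick one model index per example under a total cost budget.

    predicted_accuracy[i][m]: hypothetical estimate of model m's accuracy on
    example i (assumed to come from some learned selector).
    model_cost[m]: cost of running model m; assumed strictly increasing in m.
    """
    n = len(predicted_accuracy)
    choice = [0] * n                  # every example starts on the cheapest model
    spent = model_cost[0] * n
    if spent > budget:
        raise ValueError("budget cannot cover the cheapest model on every example")

    while True:
        best = None                   # (estimated gain per unit cost, example index)
        for i in range(n):
            m = choice[i]
            if m + 1 >= len(model_cost):
                continue              # already on the most expensive model
            extra = model_cost[m + 1] - model_cost[m]
            gain = predicted_accuracy[i][m + 1] - predicted_accuracy[i][m]
            if gain <= 0 or spent + extra > budget:
                continue              # upgrade is useless or unaffordable
            ratio = gain / extra
            if best is None or ratio > best[0]:
                best = (ratio, i)
        if best is None:
            return choice             # no affordable, beneficial upgrade remains
        i = best[1]
        spent += model_cost[choice[i] + 1] - model_cost[choice[i]]
        choice[i] += 1
```

For instance, with two models costing 1.0 and 3.0, three examples, and a budget of 5.0, only one example can be upgraded, and the sketch spends that upgrade on the example whose estimated accuracy improves the most. A greedy rule like this is a design simplification chosen for brevity; it is not guaranteed to be optimal for arbitrary costs.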

reference text

[1] A. Bedagkar-Gala and S. Shah. Joint modeling of algorithm behavior and image quality for algorithm performance prediction. In BMVC, 2010.

[2] P. Buehler, M. Everingham, D. Huttenlocher, and A. Zisserman. Upper body detection and tracking in extended signing sequences. IJCV, 95: 180–197, 2011.

[3] D. Chen, M. Bilgic, L. Getoor, and D. Jacobs. Dynamic processing allocation in video. PAMI, 33(11), 2011.

[4] M. Chen, Z. Xu, K. Weinberger, O. Chapelle, and D. Kedem. Classifier cascade for minimizing feature evaluation cost. In AISTATS, 2012.

[5] P. Felzenszwalb and D. Huttenlocher. Efficient graph-based image segmentation. IJCV, 59(2), 2004.

[6] T. Gao and D. Koller. Active classification based on value of classifier. In NIPS, 2011.

[7] A. Grubb and D. Bagnell. Speedboost: Anytime prediction with uniform near-optimality. In AISTATS, 2012.

[8] H. He, H. Daumé III, and J. Eisner. Imitation learning by coaching. In NIPS, 2012.

[9] N. Jammalamadaka, A. Zisserman, M. Eichner, V. Ferrari, and C. V. Jawahar. Has my algorithm succeeded? An evaluator for human pose estimators. In ECCV, 2012.

[10] J. Jiang, A. Teichert, H. Daumé III, and J. Eisner. Learned prioritization for trading off accuracy and speed. In NIPS, 2012.

[11] S. Karayev, T. Baumgartner, M. Fritz, and T. Darrell. Timely object recognition. In NIPS, 2012.

[12] S. Lacoste-Julien, M. Jaggi, M. Schmidt, and P. Pletscher. Block-coordinate Frank-Wolfe optimization for structural SVMs. In ICML, 2013.

[13] L. Ladický, P. Torr, and A. Zisserman. Human pose estimation using a joint pixel-wise and part-wise formulation. In CVPR, 2013.

[14] P. Lanchantin and X. Rodet. Dynamic model selection for spectral voice conversion. In Interspeech, 2010.

[15] C. Liu. Beyond Pixels: Exploring New Representations and Applications for Motion Analysis. PhD thesis, MIT, 2009.

[16] D. Park and D. Ramanan. N-best maximal decoders for part models. In ICCV, 2011.

[17] V. Raykar, B. Krishnapuram, and S. Yu. Designing efficient cascaded classifiers: tradeoff between accuracy and cost. In SIGKDD, 2010.

[18] B. Sapp and B. Taskar. MODEC: Multimodal decomposable models for human pose estimation. In CVPR, 2013.

[19] B. Sapp, D. Weiss, and B. Taskar. Parsing human motion with stretchable models. In CVPR, 2011.

[20] B. Taskar, C. Guestrin, and D. Koller. Max-margin Markov networks. In NIPS, 2003.

[21] K. Trapeznikov and V. Saligrama. Supervised sequential classification under budget constraints. In AISTATS, 2013.

[22] P. Viola and M. Jones. Robust real-time object detection. IJCV, 57(2): 137–154, 2002.

[23] D. Weiss and B. Taskar. Structured prediction cascades. In AISTATS, 2010.

[24] Y. Yang and D. Ramanan. Articulated pose estimation with flexible mixtures-of-parts. In CVPR, 2011.