294 cvpr-2013-Multi-class Video Co-segmentation with a Generative Multi-video Model

Source: pdf

Author: Wei-Chen Chiu, Mario Fritz

Abstract: Video data provides a rich source of information that is available to us today in large quantities, e.g. from online resources. Tasks like segmentation benefit greatly from the analysis of spatio-temporal motion patterns in videos, and recent advances in video segmentation have shown great progress in exploiting these additional cues. However, observing a single video is often not enough to predict meaningful segmentations, and inference across videos becomes necessary in order to predict segmentations that are consistent with object classes. Therefore the task of video co-segmentation has been proposed, which aims at inferring segmentations from multiple videos. But current approaches are limited to binary foreground/background segmentation and multiple videos of the same object. This is a clear mismatch to the challenges that we face with videos from online resources or consumer videos. We propose to study multi-class video co-segmentation where the number of object classes is unknown, as is the number of instances in each frame and video. We achieve this by formulating a non-parametric Bayesian model across video sequences that is based on a new video segmentation prior as well as a global appearance model that links segments of the same class. We present the first multi-class video co-segmentation evaluation. We show that our method is applicable to real video data from online resources and outperforms state-of-the-art video segmentation and image co-segmentation baselines.

reference text

[1] D. Batra, A. Kowdle, D. Parikh, J. Luo, and T. Chen. iCoseg: Interactive co-segmentation with intelligent scribble guidance. In CVPR, 2010. 5

[2] D. Blei and P. Frazier. Distance dependent Chinese restaurant processes. In ICML, 2010. 2, 4

[3] T. Brox and J. Malik. Object segmentation by long term analysis of point trajectories. In ECCV, 2010. 2, 5

[4] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011. 3

[5] D.-J. Chen, H.-T. Chen, and L.-W. Chang. Video object cosegmentation. In ACM Multimedia, 2012. 1, 2

[6] T. Darrell and A. Pentland. Robust estimation of a multilayered motion representation. In IEEE Workshop on Visual Motion, 1991. 2

[7] F. Galasso, M. Iwasaki, K. Nobori, and R. Cipolla. Spatio-temporal clustering of probabilistic region trajectories. In ICCV, 2011. 2

[8] S. Ghosh, A. Ungureanu, E. Sudderth, and D. Blei. Spatial distance dependent Chinese restaurant processes for image segmentation. In NIPS, 2011. 2, 3

[9] H. Greenspan, J. Goldberger, and A. Mayer. A probabilistic framework for spatio-temporal video representation & indexing. In ECCV, 2002. 2

[10] A. Joulin, F. Bach, and J. Ponce. Multi-class cosegmentation. In CVPR, 2012. 2, 5, 6

[11] G. Kim and E. P. Xing. On multiple foreground cosegmentation. In CVPR, 2012. 5

[12] D. Kuettel, M. Breitenstein, L. Van Gool, and V. Ferrari. What's going on? Discovering spatio-temporal dependencies in dynamic scenes. In CVPR, 2010. 2

[13] B. F. N. Jojic and A. Kannan. Learning appearance and transparency manifolds of occluded objects in layers. In CVPR, 2003. 2

[14] P. Ochs and T. Brox. Object segmentation in video: a hierarchical variational approach for turning point trajectories into dense regions. In ICCV, 2011. 2, 6

[15] J. Pitman. Combinatorial stochastic processes, volume 1875. Springer-Verlag, 2006. 2

[16] J. C. Rubio, J. Serrat, and A. M. López. Video cosegmentation. In ACCV, 2012. 1, 2, 5

[17] J. Sivic, B. C. Russell, A. A. Efros, A. Zisserman, and W. T. Freeman. Discovering objects and their location in images. In ICCV, 2005. 2

[18] E. Sudderth, A. Torralba, W. Freeman, and A. Willsky. Describing visual scenes using transformed objects and parts. IJCV, 2008. 2

[19] D. Sun, E. Sudderth, and M. Black. Layered image motion with explicit occlusions, temporal consistency, and depth ordering. In NIPS, 2010. 2

[20] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 2006. 3

[21] S. Vicente, C. Rother, and V. Kolmogorov. Object cosegmentation. In CVPR, 2011. 2

[22] J. Y. Wang and E. H. Adelson. Spatio-temporal segmentation of video data. In SPIE, 1994. 2

[23] J. Y. A. Wang and E. H. Adelson. Layered representation for motion analysis. In CVPR, 1993. 2

[24] X. Wang and E. Grimson. Spatial latent Dirichlet allocation. In NIPS, 2007. 2