75 iccv-2013-CoDeL: A Human Co-detection and Labeling Framework

Source: pdf

Author: Jianping Shi, Renjie Liao, Jiaya Jia

Abstract: We propose a co-detection and labeling (CoDeL) framework to identify persons with self-consistent appearance across multiple images. Our CoDeL model builds upon the deformable part-based model to detect human hypotheses and exploits cross-image correspondence via a matching classifier. Relying on a Gaussian process, this matching classifier models the similarity of two hypotheses and efficiently captures the relative importance of various visual features, reducing the adverse effect of scattered occlusion. Further, the detector and matching classifier together make our model fit into a semi-supervised co-training framework, which can produce improved results with a small amount of labeled training data. Our CoDeL model achieves decent performance on existing and new benchmark datasets.
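The abstract gives no formulas, so the following is only a minimal sketch of how a Gaussian-process matching score with per-feature importance could look, not the authors' implementation. It assumes a squared-exponential kernel with one length scale per feature (a large length scale makes a feature nearly irrelevant), uses GP regression on ±1 labels as a stand-in for the matching classifier, and runs on made-up toy data. All names (`ard_kernel`, `fit_gp`, `match_score`) and all numbers are hypothetical.

```python
import math

def ard_kernel(x, y, length_scales):
    """Squared-exponential kernel with one length scale per feature."""
    sq = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales))
    return math.exp(-0.5 * sq)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_gp(X, y, length_scales, noise=1e-2):
    """Return the weights alpha = (K + noise * I)^-1 y of the GP posterior mean."""
    K = [[ard_kernel(a, b, length_scales) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    return solve(K, y)

def match_score(x_new, X, alpha, length_scales):
    """GP posterior mean at x_new; positive suggests 'same person'."""
    return sum(a * ard_kernel(x_new, xi, length_scales)
               for a, xi in zip(alpha, X))

# Toy data (assumed): each row holds two feature distances between a pair of
# hypotheses. Feature 0 is informative; feature 1 is unreliable, for example
# because of occlusion. Labels: +1 = same person, -1 = different person.
X = [[0.1, 0.9], [0.2, 0.1], [0.9, 0.2], [0.8, 0.8]]
y = [1.0, 1.0, -1.0, -1.0]
length_scales = [0.3, 10.0]  # large scale down-weights the unreliable feature

alpha = fit_gp(X, y, length_scales)
same = match_score([0.15, 0.5], X, alpha, length_scales)
diff = match_score([0.85, 0.5], X, alpha, length_scales)
print(same > 0, diff < 0)
```

In this sketch the length scales are fixed by hand; in practice they would typically be learned from labeled pairs, for instance by maximizing the GP marginal likelihood.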

reference text

[1] D. Anguelov, K. chih Lee, S. B. Gokturk, and B. Sumengen. Contextual identity recognition in personal photo albums. In CVPR, pages 1–7, 2007.

[2] S. Y. Bao, Y. Xiang, and S. Savarese. Object co-detection. In ECCV, 2012.

[3] P. L. Bartlett and M. H. Wegkamp. Classification with a reject option using a hinge loss. The Journal of Machine Learning Research, 9:1823–1840, 2008.

[4] A. Blum and T. Mitchell. Combining labeled and unlabeled data with co-training. In Annual Conference on Computational Learning Theory, pages 92–100, 1998.

[5] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, pages 886–893, 2005.

[6] P. Dollár, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: An evaluation of the state of the art. PAMI, 34(4):743–761, 2012.

[7] A. Ess, B. Leibe, and L. V. Gool. Depth and appearance for mobile scene analysis. In ICCV, 2007.

[8] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. PAMI, 32(9):1627–1645, 2010.

[9] R. Garg, D. Ramanan, S. M. Seitz, and N. Snavely. Where’s Waldo: matching people in images of crowds. In CVPR, pages 1793–1800, 2011.

[10] R. B. Girshick, P. F. Felzenszwalb, and D. McAllester. Discriminatively trained deformable part models, release 5. http://people.cs.uchicago.edu/~rbg/latent-release5/.

[11] G. Kim and E. P. Xing. On multiple foreground cosegmentation. In CVPR, pages 837–844, 2012.

[12] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91–110, 2004.

[13] T. Ojala, M. Pietikainen, and T. Maenpaa. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. PAMI, 24(7):971–987, 2002.

[14] P. Ott and M. Everingham. Implicit color segmentation features for pedestrian and object detection. In ICCV, pages 723–730, 2009.

[15] C. E. Rasmussen. Gaussian processes for machine learning. Citeseer, 2006.

[16] N. Razavi, J. Gall, P. Kohli, and L. V. Gool. Latent hough transform for object detection. In ECCV, pages 312–325, 2012.

[17] W. R. Schwartz, A. Kembhavi, D. Harwood, and L. S. Davis. Human detection using partial least squares analysis. In ICCV, pages 24–31, 2009.

[18] J. Sivic, M. Everingham, and A. Zisserman. Who are you? Learning person specific classifiers from video. In CVPR, pages 1145–1152, 2009.

[19] J. Sivic, C. L. Zitnick, and R. Szeliski. Finding people in repeated shots of the same scene. In BMVC, 2006.

[20] Y. Song and T. Leung. Context-aided human recognition – clustering. In ECCV, pages 382–395, 2006.

[21] M. Tapaswi, M. Bauml, and R. Stiefelhagen. Knock! Knock! Who is it? probabilistic person identification in TV-series. In CVPR, pages 2658–2665, 2012.

[22] T. Hastie, R. Tibshirani, and J. H. Friedman. The elements of statistical learning. Springer New York, 2001.

[23] O. Tuzel, F. Porikli, and P. Meer. Pedestrian detection via classification on riemannian manifolds. PAMI, 30(10):1713–1727, 2008.

[24] P. Viola and M. J. Jones. Robust real-time face detection. IJCV, 57(2):137–154, 2004.

[25] X. Wang, T. X. Han, and S. Yan. An HOG-LBP human detector with partial occlusion handling. In ICCV, pages 32–39, 2009.

[26] P. Wohlhart, M. Donoser, P. M. Roth, and H. Bischof. Detecting partially occluded objects with an implicit shape model random field. In ACCV, 2012.

[27] L. Zhang, L. Chen, M. Li, and H. Zhang. Automated annotation of human faces in family albums. In ACM international conference on Multimedia, pages 355–358, 2003.

[28] T. Zhang. Statistical behavior and consistency of classification methods based on convex risk minimization. Annals of Statistics, pages 56–85, 2004.

[29] Q. Zhu, M.-C. Yeh, K.-T. Cheng, and S. Avidan. Fast human detection using a cascade of histograms of oriented gradients. In CVPR, pages 1491–1498, 2006.

[30] X. Zhu and A. B. Goldberg. Introduction to semi-supervised learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 3(1):1–130, 2009.