84 nips-2013-Deep Neural Networks for Object Detection


Source: pdf

Author: Christian Szegedy, Alexander Toshev, Dumitru Erhan

Abstract: Deep Neural Networks (DNNs) have recently shown outstanding performance on image classification tasks [14]. In this paper we go one step further and address the problem of object detection using DNNs, that is not only classifying but also precisely localizing objects of various classes. We present a simple and yet powerful formulation of object detection as a regression problem to object bounding box masks. We define a multi-scale inference procedure which is able to produce high-resolution object detections at a low cost by a few network applications. State-of-the-art performance of the approach is shown on Pascal VOC.
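To make the phrase "regression to object bounding box masks" concrete, the following is a minimal sketch of how a bounding box could be turned into a coarse mask target and compared against a predicted mask. The abstract does not give the mask resolution, the exact target definition, or the loss, so the grid size, the fractional-coverage target, the plain squared-error loss, and the names `box_to_mask` and `l2_loss` are all illustrative assumptions, not the authors' implementation.

```python
def box_to_mask(box, grid=4):
    """Rasterize a normalized box (x0, y0, x1, y1), coordinates in [0, 1],
    onto a grid x grid mask. Each cell holds the fraction of its area
    covered by the box (an assumed target definition)."""
    x0, y0, x1, y1 = box
    cell = 1.0 / grid
    mask = []
    for row in range(grid):
        cy0, cy1 = row * cell, (row + 1) * cell
        overlap_y = max(0.0, min(y1, cy1) - max(y0, cy0))
        mask_row = []
        for col in range(grid):
            cx0, cx1 = col * cell, (col + 1) * cell
            overlap_x = max(0.0, min(x1, cx1) - max(x0, cx0))
            mask_row.append(overlap_x * overlap_y / (cell * cell))
        mask.append(mask_row)
    return mask


def l2_loss(predicted, target):
    """Sum of squared differences between two masks of equal shape."""
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(predicted, target)
        for p, t in zip(pred_row, target_row)
    )


# Hypothetical usage: a box covering the top-left quarter of the image.
target = box_to_mask((0.0, 0.0, 0.5, 0.5), grid=4)
empty_prediction = [[0.0] * 4 for _ in range(4)]
loss = l2_loss(empty_prediction, target)
```

In this sketch a network would be trained to output the mask directly, and a box would then be recovered from the predicted mask. That recovery step and the multi-scale procedure are not shown here.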


reference text

[1] Narendra Ahuja and Sinisa Todorovic. Learning the taxonomy and models of categories present in arbitrary images. In International Conference on Computer Vision, 2007.

[2] Yoshua Bengio. Learning deep architectures for AI. Foundations and Trends® in Machine Learning, 2(1):1–127, 2009.

[3] Dan Ciresan, Alessandro Giusti, Juergen Schmidhuber, et al. Deep neural networks segment neuronal membranes in electron microscopy images. In Advances in Neural Information Processing Systems 25, 2012.

[4] Navneet Dalal and Bill Triggs. Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition, 2005.

[5] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A Large-Scale Hierarchical Image Database. In Computer Vision and Pattern Recognition, 2009.

[6] John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. In Conference on Learning Theory. ACL, 2010.

[7] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The pascal visual object classes (voc) challenge. International Journal of Computer Vision, 88(2):303–338, 2010.

[8] Clément Farabet, Camille Couprie, Laurent Najman, and Yann LeCun. Learning hierarchical features for scene labeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1915–1929, 2013.

[9] Pedro F Felzenszwalb, Ross B Girshick, David McAllester, and Deva Ramanan. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9):1627–1645, 2010.

[10] Sanja Fidler and Aleš Leonardis. Towards scalable representations of object categories: Learning a hierarchy of parts. In Computer Vision and Pattern Recognition, 2007.

[11] R. B. Girshick, P. F. Felzenszwalb, and D. McAllester. Discriminatively trained deformable part models, release 5. http://people.cs.uchicago.edu/~rbg/latent-release5/.

[12] Geoffrey E Hinton and Ruslan R Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.

[13] Iasonas Kokkinos and Alan Yuille. Inference and learning with hierarchical shape models. International Journal of Computer Vision, 93(2):201–225, 2011.

[14] Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25, 2012.

[15] Quoc V Le, Marc’Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S Corrado, Jeff Dean, and Andrew Y Ng. Building high-level features using large scale unsupervised learning. In International Conference on Machine Learning, 2012.

[16] Yann LeCun and Yoshua Bengio. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 1995.

[17] Jorge Sánchez and Florent Perronnin. High-dimensional signature compression for large-scale image classification. In Computer Vision and Pattern Recognition, 2011.

[18] Hannes Schulz and Sven Behnke. Object-class segmentation using deep convolutional neural networks. In Proceedings of the DAGM Workshop on New Challenges in Neural Computation, 2011.

[19] Long Zhu, Yuanhao Chen, Alan Yuille, and William Freeman. Latent hierarchical structural learning for object detection. In Computer Vision and Pattern Recognition, 2010.

[20] Song Chun Zhu and David Mumford. A stochastic grammar of images. Computer Graphics and Vision, 2(4):259–362, 2007.