iccv iccv2013 iccv2013-376 iccv2013-376-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Lukáš Neumann, Jiri Matas
Abstract: An unconstrained end-to-end text localization and recognition method is presented. The method introduces a novel approach for character detection and recognition which combines the advantages of sliding-window and connected component methods. Characters are detected and recognized as image regions which contain strokes of specific orientations in a specific relative position, where the strokes are efficiently detected by convolving the image gradient field with a set of oriented bar filters. Additionally, a novel character representation efficiently calculated from the values obtained in the stroke detection phase is introduced. The representation is robust to shift at the stroke level, which makes it less sensitive to intra-class variations and the noise induced by normalizing character size and positioning. The effectiveness of the representation is demonstrated by the results achieved in the classification of real-world characters using a Euclidean nearest-neighbor classifier trained on synthetic data in a plain form. The method was evaluated on a standard dataset, where it achieves state-of-the-art results in both text localization and recognition.
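The classification step named in the abstract, a nearest-neighbor classifier under Euclidean distance, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 4-dimensional toy vectors, and the labels are all hypothetical stand-ins for the paper's stroke-based character representation, and the search here is exhaustive, whereas a real system may rely on an approximate nearest-neighbor search.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(query, train_vectors, train_labels):
    """Return the label of the training vector closest to `query`.

    Exhaustive search over all training vectors (an assumption made for
    clarity; it is not claimed to match the search used in the paper).
    """
    best_index = min(
        range(len(train_vectors)),
        key=lambda i: euclidean(query, train_vectors[i]),
    )
    return train_labels[best_index]


# Hypothetical descriptors standing in for character representations
# computed from synthetic training data (the values are made up).
train_vectors = [
    [1.0, 0.0, 0.0, 1.0],  # toy sample of class "A"
    [0.0, 1.0, 1.0, 0.0],  # toy sample of class "B"
]
train_labels = ["A", "B"]

# A query descriptor, e.g. computed from a real-world character image.
predicted = nearest_neighbor([0.9, 0.1, 0.2, 0.8], train_vectors, train_labels)
```

With a single nearest neighbor the training set itself acts as the model, which is one reason such a classifier can be trained purely on synthetic samples as long as the representation tolerates the gap between synthetic and real characters.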
[1] J. Canny. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8:679–698, 1986. 2
[2] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR 2005, volume 1, pages 886–893. IEEE, 2005. 1
[3] T. E. de Campos, B. R. Babu, and M. Varma. Character recognition in natural images. VISAPP, 05-08 February 2009, 2009. 2
[4] B. Epshtein, E. Ofek, and Y. Wexler. Detecting text in natural scenes with stroke width transform. In CVPR 2010, pages 2963–2970. 1, 2, 7
[5] S. M. Hanif and L. Prevost. Text detection and localization in complex scene images using constrained adaboost algorithm. In ICDAR 2009, pages 1–5. IEEE, 2009. 7
[6] L. Jung-Jin, P.-H. Lee, S.-W. Lee, A. Yuille, and C. Koch. Adaboost for text detection in natural scene. In ICDAR 2011, pages 429–434, 2011. 1, 2
[7] S. M. Lucas. Text locating competition results. ICDAR 2005, 0:80–85, 2005. 2
[8] S. M. Lucas, A. Panaretos, L. Sosa, A. Tang, S. Wong, and R. Young. ICDAR 2003 robust reading competitions. In ICDAR 2003, page 682, 2003. 2
[9] A. Mishra, K. Alahari, and C. V. Jawahar. Top-down and bottom-up cues for scene text recognition. In CVPR 2012, pages 2687–2694, June 2012. 2
[10] M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In VISAPP 2009, pages 331–340, 2009. 5
[11] L. Neumann and J. Matas. A method for text localization and recognition in real-world images. In ACCV 2010, volume IV of LNCS 6495, pages 2067–2078, November 2010. 1, 2
[12] L. Neumann and J. Matas. Real-time scene text localization and recognition. In CVPR 2012, pages 3538–3545, June 2012. 1, 2, 7
[13] Y.-F. Pan, X. Hou, and C.-L. Liu. Text localization in natural scene images based on conditional random field. In ICDAR 2009, pages 6–10. IEEE Computer Society, 2009. 1, 2
[14] A. Shahab, F. Shafait, and A. Dengel. ICDAR 2011 robust reading competition challenge 2: Reading text in scene images. In ICDAR 2011, pages 1491–1496, 2011. 1, 2, 6, 7
[15] C. Shi, C. Wang, B. Xiao, Y. Zhang, and S. Gao. Scene text detection using graph model built upon maximally stable extremal regions. PR, 34(2):107–116, 2013. 1, 7
[16] P. Viola and M. J. Jones. Robust real-time face detection. International Journal of Computer Vision, 57(2):137–154, 2004. 1
[17] K. Wang, B. Babenko, and S. Belongie. End-to-end scene text recognition. In ICCV 2011, 2011. 1, 2
[18] C. Wolf and J.-M. Jolion. Object count/area graphs for the evaluation of object detection and segmentation algorithms. Int. J. Doc. Anal. Recognit., 8:280–296, August 2006. 7
[19] C. Yao, X. Bai, W. Liu, Y. Ma, and Z. Tu. Detecting texts of arbitrary orientations in natural images. In CVPR 2012, pages 1083–1090, June 2012. 1, 2
[20] C. Yi and Y. Tian. Text string detection from natural scenes by structure-based partition and grouping. Image Processing, IEEE Transactions on, 20(9):2594–2605, Sept. 2011. 7