
124 acl-2011-Exploiting Morphology in Turkish Named Entity Recognition System


Source: pdf

Author: Reyyan Yeniterzi

Abstract: Turkish is an agglutinative language with complex morphological structures, so word forms alone are not sufficient for many computational tasks. In this paper, we analyze the effect of morphology in a Named Entity Recognition system for Turkish. We start with the standard word-level representation and incrementally explore the effect of capturing syntactic and contextual properties of tokens. We also explore a new representation in which roots and morphological features are represented as separate tokens, rather than only whole words. Using syntactic and contextual properties with the new representation provides a 7.6% relative improvement over the baseline.
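
The representation in which roots and morphological features become separate tokens can be illustrated with a minimal sketch. It is not the paper's implementation: the `+`-delimited analysis strings, the tag names, the function names `split_analysis` and `expand_sentence`, and the choice to copy each word's entity label onto all of its sub-tokens are assumptions made for illustration only.

```python
from typing import List, Tuple


def split_analysis(analysis: str) -> List[str]:
    """Split a '+'-delimited analysis into a root token plus feature tokens.

    The input format is an assumed, illustrative one (root first, then
    morphological tags); it is not taken from the paper.
    """
    root, *features = analysis.split("+")
    # Keep the '+' prefix on features so they stay distinguishable from roots.
    return [root] + [f"+{feat}" for feat in features if feat]


def expand_sentence(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn (analysis, label) pairs into one (token, label) pair per sub-token.

    Assumption: every sub-token inherits the entity label of its word.
    """
    expanded = []
    for analysis, label in tagged:
        for token in split_analysis(analysis):
            expanded.append((token, label))
    return expanded


# Hypothetical two-word input: a place name in the locative case and a verb.
sentence = [
    ("Ankara+Noun+Prop+A3sg+Pnon+Loc", "LOCATION"),
    ("otur+Verb+Pos+Prog1+A1sg", "O"),
]

for token, label in expand_sentence(sentence):
    print(token, label)
```

In the word-level representation the sequence model sees one token per word; in this sketch it sees a longer sequence in which the root and each morphological feature can be modeled separately.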


reference text

Silviu Cucerzan and David Yarowsky. 1999. Language independent named entity recognition combining morphological and contextual evidence. In Proceedings of the Joint SIGDAT Conference on EMNLP and VLC, pages 90–99.

Dilek Z. Hakkani-Tür. 2000. Statistical Language Modelling for Turkish. Ph.D. thesis, Department of Computer Engineering, Bilkent University.

Dan Klein, Joseph Smarr, Huy Nguyen, and Christopher D. Manning. 2003. Named entity recognition with character-level models. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4, pages 180–183.

Dilek Küçük and Adnan Yazici. 2009. Named entity recognition experiments on Turkish texts. In Proceedings of the 8th International Conference on Flexible Query Answering Systems, FQAS '09, pages 524–535, Berlin, Heidelberg. Springer-Verlag.

Kemal Oflazer. 1994. Two-level description of Turkish morphology. Literary and Linguistic Computing, 9(2):137–148.

Haşim Sak, Tunga Güngör, and Murat Saraçlar. 2008. Turkish language resources: Morphological parser, morphological disambiguator and web corpus. In Advances in Natural Language Processing, volume 5221 of Lecture Notes in Computer Science, pages 417–427.

Gökhan Tur, Dilek Z. Hakkani-Tür, and Kemal Oflazer. 2003. A statistical information extraction system for Turkish. In Natural Language Engineering, pages 181–210.