
327 acl-2013-Sorani Kurdish versus Kurmanji Kurdish: An Empirical Comparison



Authors: Kyumars Sheykh Esmaili; Shahin Salavati

Abstract: Resource scarcity and diversity, both in dialect and in script, are the two primary challenges in Kurdish language processing. In this paper we aim to address these two problems by (i) building a text corpus for Sorani and Kurmanji, the two main dialects of Kurdish, and (ii) highlighting some of the orthographic, phonological, and morphological differences between these two dialects from statistical and rule-based perspectives.


reference text

Abolfazl AleAhmad, Hadi Amiri, Ehsan Darrudi, Masoud Rahgozar, and Farhad Oroumchian. 2009. Hamshahri: A Standard Persian Text Collection. Knowledge-Based Systems, 22(5):382–387.
Wafa Barkhoda, Bahram ZahirAzami, Anvar Bahrampour, and Om-Kolsoom Shahryari. 2009. A Comparison between Allophone, Syllable, and Diphone based TTS Systems for Kurdish Language. In Signal Processing and Information Technology (ISSPIT), 2009 IEEE International Symposium on, pages 557–562.
Robert MW Dixon. 1994. Ergativity. Cambridge University Press.
Margreet Dorleijn. 1996. The Decay of Ergativity in Kurdish.
Kyumars Sheykh Esmaili, Shahin Salavati, Somayeh Yosefi, Donya Eliassi, Purya Aliabadi, Shownm Hakimi, and Asrin Mohammadi. 2013. Building a Test Collection for Sorani Kurdish. In (to appear) Proceedings of the 10th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA ’13).
Kyumars Sheykh Esmaili. 2012. Challenges in Kurdish Text Processing. CoRR, abs/1212.0074.
Gérard Gautier. 1996. A Lexicographic Environment for Kurdish Language using 4th Dimension. In Proceedings of ICEMCO.
Gérard Gautier. 1998. Building a Kurdish Language Corpus: An Overview of the Technical Problems. In Proceedings of ICEMCO.
Alexander Gelbukh and Grigori Sidorov. 2001. Zipf and Heaps Laws’ Coefficients Depend on Language. In Computational Linguistics and Intelligent Text Processing, pages 332–335. Springer.
Guardian. 2013. The Guardian. www.guardian.co.uk/.
Geoffrey Haig and Yaron Matras. 2002. Kurdish Linguistics: A Brief Overview. Sprachtypologie und Universalienforschung / Language Typology and Universals, 55(1).
Amir Hassanpour, Jaffer Sheyholislami, and Tove Skutnabb-Kangas. 2012. Introduction. Kurdish: Linguicide, Resistance and Hope. International Journal of the Sociology of Language, 2012(217):1–18.
Harold Stanley Heaps. 1978. Information Retrieval: Computational and Theoretical Aspects. Academic Press, Inc. Orlando, FL, USA.
Linyuan Lü, Zi-Ke Zhang, and Tao Zhou. 2013. Deviation of Zipf’s and Heaps’ Laws in Human Languages with Limited Dictionary Sizes. Scientific Reports, 3.
David N. MacKenzie. 1961. Kurdish Dialect Studies. Oxford University Press.
Laura Mahalingappa. 2010. The Acquisition of Split-Ergativity in Kurmanji Kurdish. In The Proceedings of the Workshop on the Acquisition of Ergativity.
Yaron Matras and Salih Akin. 2012. A Survey of the Kurdish Dialect Continuum. In Proceedings of the 2nd International Conference on Kurdish Studies.
Yaron Matras and Gertrud Reershemius. 1991. Standardization Beyond the State: the Cases of Yiddish, Kurdish and Romani. Von Gleich and Wolff, 1991:103–123.
Pewan. 2013. Pewan’s Download Link. https://dl.dropbox.com/u/10883132/Pewan.zip.
Peyamner. 2013. Peyamner News Agency. http://www.peyamner.com/.
Pollet Samvelian. 2006. When Morphology Does Better Than Syntax: The Ezafe Construction in Persian. Ms., Université de Paris.
Pollet Samvelian. 2007. A Lexical Account of Sorani Kurdish Prepositions. In The Proceedings of the 14th International Conference on Head-Driven Phrase Structure Grammar, pages 235–249, Stanford. CSLI Publications.
Jacques Savoy. 1999. A Stemming Procedure and Stopword List for General French Corpora. JASIS, 50(10):944–952.
Faramarz Shahsavari. 2010. Laki and Kurdish. Iran and the Caucasus, 14(1):79–82.
Mehrnoush Shamsfard. 2011. Challenges and Open Problems in Persian Text Processing. In Proceedings of LTC’11.
Wheeler M. Thackston. 2006a. Kurmanji Kurdish: A Reference Grammar with Selected Readings. Harvard University.
Wheeler M. Thackston. 2006b. Sorani Kurdish: A Reference Grammar with Selected Readings. Harvard University.
TREC. 2013. Text REtrieval Conference. http://trec.nist.gov/.
VOA. 2013a. Voice of America - Kurdish (Kurmanji). http://www.dengeamerika.com/.
VOA. 2013b. Voice of America - Kurdish (Sorani). http://www.dengiamerika.com/.
Géraldine Walther and Benoît Sagot. 2010. Developing a Large-scale Lexicon for a Less-Resourced Language. In SaLTMiL’s Workshop on Less-resourced Languages (LREC).
Géraldine Walther. 2011. Fitting into Morphological Structure: Accounting for Sorani Kurdish Endoclitics. In Stefan Müller, editor, The Proceedings of the Eighth Mediterranean Morphology Meeting (MMM8), pages 299–322, Cagliari, Italy.
George Kingsley Zipf. 1949. Human Behaviour and the Principle of Least-Effort. Addison-Wesley.