
71 acl-2013-Bootstrapping Entity Translation on Weakly Comparable Corpora

Source: pdf

Author: Taesung Lee ; Seung-won Hwang

Abstract: This paper studies the problem of mining named entity translations from comparable corpora with some “asymmetry”. Unlike the previous approaches relying on the “symmetry” found in parallel corpora, the proposed method is tolerant to asymmetry often found in comparable corpora, by distinguishing different semantics of relations of entity pairs to selectively propagate seed entity translations on weakly comparable corpora. Our experimental results on English-Chinese corpora show that our selective propagation approach outperforms the previous approaches in named entity translation in terms of the mean reciprocal rank by up to 0.16 for organization names, and 0.14 in a low comparability case.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 This paper studies the problem of mining named entity translations from comparable corpora with some “asymmetry”. [sent-2, score-0.481]

2 Our experimental results on English-Chinese corpora show that our selective propagation approach outperforms the previous approaches in named entity translation in terms of the mean reciprocal rank by up to 0. [sent-4, score-0.533]

3 Early research on NE translation used phonetic similarities, for example, to mine the translation ‘Mandelson’ → ‘曼德尔森’ [ManDeErSen] with similar sounds. [sent-9, score-0.388]

4 However, not all NE translations are based on transliterations, as shown in Table 1: some translations, especially the names of most organizations, are based on semantic equivalences. [sent-10, score-0.212]

5 Corpus-based approaches (Kupiec, 1993; Feng, 2004), by mining external signals from a large corpus, such as parenthetical translation “成龙 (Jackie Chan)”, complement the problem of transliteration-based approaches, but the coverage of this approach is limited to popular entities with such evidence. [sent-26, score-0.374]

6 The most effective known approach to NE translation has been a holistic framework (You et al. [sent-27, score-0.23]

7 In these approaches, both 1) arbitrary translations and 2) lesser-known entities can be handled, by propagating the translation scores of known entities to lesser-known entities if they co-occur frequently in both corpora. [sent-31, score-0.668]

8 For example, a lesser-known entity Tom Watson can be translated if Mandelson and Tom Watson co-occur frequently in an English corpus, and their Chinese translations also co-occur frequently in a Chinese corpus, i. [sent-32, score-0.302]

9 A research question we ask in this paper is: What if comparable corpora are not comparable enough to support this symmetry assumption? [sent-35, score-0.301]

10 For example, co-occurrence of Mandelson and Tom Watson may not appear in a Chinese corpus, which may lead to the translation of Tom Watson into another Chinese entity Gordon Brown which happens to co-occur with the Chinese translation of Mandelson. [sent-41, score-0.483]

11 For example, relations between Mandelson and Tom Watson should be semantically different from Chinese relations between ‘戈登·布朗’ (Gordon Brown) and ‘曼德尔森’ (Mandelson). [sent-43, score-0.308]

12 In clear contrast, we selectively propagate seed translations, only when the relations in the two corpora share the same semantics. [sent-46, score-0.43]

13 This selective propagation can be especially effective for translating challenging types of entities such as organizations including the WTO used with and without abbreviation in both languages. [sent-47, score-0.28]

14 A naive approach to increase the precision would be to consider multitype co-occurrences, hoping that highly precise translations of some type, e. [sent-51, score-0.236]

15 When translating ‘WTO’ using the co-occurrence with ‘Mandelson’, other co-occurrences such as (London, Mandelson) and (EU, Mandelson) produce a lot of noise because the right translation of WTO does not share much phonetic/semantic similarity. [sent-59, score-0.218]

16 25 higher than the existing ap- 1The MRR for organization names achieved by a topic model-based approach was 0. [sent-61, score-0.201]

17 More formally, we enable selective propagation of seed translations on weakly comparable corpora, by 1) clarifying the detailed meaning of relational information of co-occurring entities, and 2) identifying the contexts of the relational information using statement-level context comparison. [sent-64, score-0.66]

18 In other words, we propagate the translation score of a known translation pair to a neighbor pair if the semantics of their relations in English and Chinese corpora are equivalent to accurately propagate the scores. [sent-65, score-0.672]

19 For example, if we know from a pair of statements “Russia(1) joins(2) the WTO(3)” and “俄罗斯(1) 加入(2) 世贸组织(3)”, we can propagate the translation score of (Russia, 俄罗斯)(1) to (WTO, 世贸组织)(3). [sent-67, score-0.24]

20 For this goal, we first extract relations among entities in documents, such as visit and join, and mine semantically equivalent relations across the languages, e. [sent-70, score-0.563]

21 Once these relation translations are mined, similar document pairs can be identified by comparing each constituent relationship among entities using their relations. [sent-73, score-0.472]

22 Knowing document similarity improves NE translation, and improved NE translation can boost the accuracy of document and relationship similarity. [sent-74, score-0.321]

23 14 higher MRR than seed translations, on less comparable corpora. [sent-81, score-0.237]

24 2 Related Work This work is related to two research streams: NE translation and semantically equivalent relation mining. [sent-82, score-0.319]

25 Entity translation Existing approaches on NE translation can be categorized into 1) transliteration-based, 2) corpus-based, and 3) hybrid approaches. [sent-83, score-0.342]

26 , 2009) rely on meanings of constituent letters or words to handle organization name translation such as ‘Bank of China (中国银行)’, whose translation is derived from ‘China (中国)’ and ‘a bank (银行)’. [sent-89, score-0.492]

27 (2011) build such graphs using the context similarity, measured with a bag of words approach, of entities in news corpora to translate NEs. [sent-108, score-0.233]

28 That is, two entities are considered to be similar if the two entities in different languages have similar occurrence distributions over time. [sent-115, score-0.224]

29 Semantically similar relation mining Recently, similar relation mining in one language has been studied actively as a key part of automatic knowledge base construction. [sent-119, score-0.268]

30 , 2012) finds similar relations with almost the same support sets–the sets of NE pairs that co-occur with the relations. [sent-122, score-0.279]

31 However, because of the regional locality of information, bilingual corpora contain many NE pairs that appear in only one of the support sets of the semantically identical relations. [sent-123, score-0.255]

32 , 2011) finds related relations using seed pairs of one given relation; then, using K-means clustering, it finds relations that are semantically similar to the given relation. [sent-125, score-0.649]

33 (2011) try to find relation patterns in multiple languages for given seed pairs of a relation. [sent-130, score-0.303]

34 Because this approach finds seed pairs in Wikipedia infoboxes, the number of retrievable relations is restricted to five. [sent-131, score-0.402]

35 Because such corpora contain asymmetric parts, the goal of our framework is to overcome asymmetry by distinguishing the semantics of relations, and leveraging document context defined by the relations of entities. [sent-135, score-0.361]

36 For this purpose, we build a mutual bootstrapping framework (Figure 2), between entity translation and relation translation using extracted relationships of entities (Figure 2 (a), Section 4. [sent-137, score-0.685]

37 2): Initializing a seed entity translation score, where eE is an English entity, and eC is a Chinese entity. [sent-141, score-0.481]

38 can be initialized by phonetic similarity or other NE translation methods. [sent-142, score-0.266]

39 1) Using TNt, we obtain a set of relation translations with a semantic similarity score, TRt(rE, rC), for an English relation rE and a Chinese relation rC (Figure 2 (b), Section 4. [sent-145, score-0.48]

40 2) Using TNt and TRt, we identify a set of semantically similar document pairs that describe the same event with a similarity score TDt (dE, dC) where dE is an English document and dC is a Chinese document (Figure 2 (c), Section 4. [sent-149, score-0.346]

41 3) Using TNt, TRt and TDt, we compute an improved entity translation score (Figure 2 (d), Section 4. [sent-151, score-0.312]
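The three steps above (relation translation, document similarity, improved entity translation) form one pass of a mutual bootstrapping loop. A minimal Python sketch of that control flow follows. The function name `bootstrap` and the three callables are hypothetical placeholders introduced here for illustration; the actual scoring functions are defined in the paper's Section 4 and are not reproduced in this summary.

```python
from typing import Callable, Dict, Tuple

# A score table maps a (source item, target item) pair to a similarity score.
Score = Dict[Tuple[str, str], float]


def bootstrap(
    seed_tn: Score,
    update_tr: Callable[[Score], Score],
    update_td: Callable[[Score, Score], Score],
    update_tn: Callable[[Score, Score, Score], Score],
    iterations: int = 1,
) -> Score:
    """Run the TN -> TR -> TD -> TN cycle for a fixed number of passes.

    The three update callables are assumptions of this sketch; they stand in
    for the relation, document, and entity scoring steps.
    """
    tn = dict(seed_tn)  # copy so the seed table is not mutated
    for _ in range(iterations):
        tr = update_tr(tn)          # step 1: relation translation scores
        td = update_td(tn, tr)      # step 2: document pair similarity
        tn = update_tn(tn, tr, td)  # step 3: improved entity translation
    return tn
```

With `iterations=1` this corresponds to a single pass of the whole framework over the seed translation function.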

42 1, and how we obtain seed name translations in Section 4. [sent-157, score-0.33]

43 Next, we present our method for discovering relation translations across languages in Section 4. [sent-159, score-0.251]

44 4, we use the name translations and the relation translations to compare document contexts which can boost the precision of NE translation. [sent-162, score-0.513]

45 1 Statement Extraction We extract relational statements, which we exploit to propagate translation scores, from an English news corpus and a Chinese news corpus. [sent-166, score-0.402]

46 2 Seed Entity Translation We need a few seed translation pairs to initiate the framework. [sent-173, score-0.384]

47 We build a seed translation score indicating the similarity of an English entity eE and a Chinese entity eC using an existing method. [sent-174, score-0.671]

48 , 2007) as a base translation matrix to build the seed translation function. [sent-178, score-0.511]

49 We use an English-Chinese general word dictionary containing approximately 80,000 English-Chinese translation word pairs that was also used by Kim et al. [sent-180, score-0.215]

50 Formally, our goal is to measure a pairwise relation translation score TR(rE, rC) for an English relation rE ∈ RE and a Chinese relation rC ∈ RC where RE is a set of all English relations and RC is a set of all Chinese relations. [sent-188, score-0.566]

51 A basic clue is that relations of the same meaning are likely to be mentioned with the same entity pairs. [sent-190, score-0.266]

52 We formally define this clue for relations in the same language, and then describe that in the bilingual setting. [sent-193, score-0.215]

53 Likewise, we can define a support intersection for relations in the different languages using the translation score TN(eE, eC). [sent-196, score-0.403]
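As a rough illustration of the support-intersection idea, the sketch below treats a relation's support set as the set of entity pairs it co-occurs with. The monolingual intersection is a plain set intersection. The bilingual version is an assumption of this sketch, not the paper's exact definition: it pairs an English entity pair with a Chinese entity pair when both entity translation scores reach a hypothetical `threshold`.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]


def support_intersection(h1: Set[Pair], h2: Set[Pair]) -> Set[Pair]:
    """Entity pairs that co-occur with both relations (same language)."""
    return h1 & h2


def bilingual_support_intersection(
    h_en: Set[Pair],
    h_zh: Set[Pair],
    tn: Dict[Pair, float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Set[Tuple[Pair, Pair]]:
    """Match English and Chinese support pairs through translation scores.

    An English pair (x_e, y_e) matches a Chinese pair (x_c, y_c) when both
    TN(x_e, x_c) and TN(y_e, y_c) are at least `threshold`.
    """
    matched = set()
    for x_e, y_e in h_en:
        for x_c, y_c in h_zh:
            if (tn.get((x_e, x_c), 0.0) >= threshold
                    and tn.get((y_e, y_c), 0.0) >= threshold):
                matched.add(((x_e, y_e), (x_c, y_c)))
    return matched
```

The nested loop is quadratic in the support-set sizes, which is acceptable for a toy example but would need indexing by translated entity on a real corpus.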

54 However, as shown in Table 2, we cannot use this value directly to measure the similarity because the support intersection of semantically similar bilingual relations (e. [sent-200, score-0.379]

55 Edges indicate that the relations have a non-empty support intersection, and edge labels show the size of the intersection. [sent-205, score-0.206]

56 Although visit is connected to criticize, visit is not connected to other criticize-relations such as denounce and blame, whereas criticize, denounce, and blame are interconnected. [sent-208, score-0.325]

57 For such relations, rather than assigning zero similarity to visit-relations, we compute a cluster membership function based on support pairs of the cluster members and the target relation, and then formulate a pairwise relation translation score. [sent-225, score-0.486]

58 Formally, we learn the membership function of a relation r to a cluster c using support vector regression (Joachims, 1999) with the following features based on the support set of cluster c, H(c) = ∪_{r∈c} H(r), and the support intersection of r and c, H(r, c) = ∪_{r*∈c} H(r, r*). [sent-226, score-0.375]

59 The use of Hwithin and Hshared is based on the observation that a noun phrase pair that appears in only one relation tends to be an incorrectly chunked entity such as ‘World Trade’ from the ‘World Trade Organization’. [sent-233, score-0.231]

60 We consider two relations similar if there is at least one cluster to which both relations belong, which can be measured with S(r, c). [sent-235, score-0.293]

61 More formally, pairwise similarity of relations ri and rj is defined as TR(ri, rj) = max_{c∈C} S(ri, c) · S(rj, c) (4) where C is the set of all clusters. [sent-236, score-0.38]
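Equation (4) can be sketched in Python as a maximum over clusters of the product of the two membership scores. The nested-dictionary layout of `membership` and the toy values below are assumptions for illustration. The sketch also assumes S(r, c) is non-negative and that a missing entry means zero membership, so restricting the maximum to shared clusters does not change the result.

```python
from typing import Dict


def relation_similarity(
    r_i: str,
    r_j: str,
    membership: Dict[str, Dict[str, float]],
) -> float:
    """TR(r_i, r_j) = max over clusters c of S(r_i, c) * S(r_j, c).

    `membership[r][c]` holds S(r, c); absent entries count as 0.
    """
    s_i = membership.get(r_i, {})
    s_j = membership.get(r_j, {})
    shared = s_i.keys() & s_j.keys()  # only shared clusters give a nonzero product
    return max((s_i[c] * s_j[c] for c in shared), default=0.0)


# Toy membership scores (illustrative values only).
membership = {
    "criticize": {"c1": 0.9, "c2": 0.2},
    "批评": {"c1": 0.8},
    "visit": {"c2": 0.7},
}
print(relation_similarity("criticize", "批评", membership))
print(relation_similarity("批评", "visit", membership))
```

In this toy setup, visit shares only a weak cluster with criticize and no cluster with 批评, so it receives a low or zero similarity while the two criticize-relations score high.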

62 Therefore, we detect similar document pairs to boost the statement matching process. [sent-239, score-0.224]

63 Formally, we compute the similarity of two statements sE = (xE, rE, yE) and sC = (xC, rC, yC) in different languages as follows: TS(sE, sC) = TN(xE, xC)TR(rE, rC)TN(yE, yC) (5) With this definition, we can find similar statements described with different vocabularies in different languages. [sent-243, score-0.285]
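Equation (5) multiplies the two entity translation scores by the relation translation score. A minimal sketch follows, assuming score tables stored as dictionaries keyed by (English, Chinese) pairs, with missing pairs scored as 0; the numeric values are made up for illustration.

```python
from typing import Dict, Tuple

Statement = Tuple[str, str, str]  # (subject entity, relation, object entity)
Score = Dict[Tuple[str, str], float]


def statement_similarity(s_e: Statement, s_c: Statement, tn: Score, tr: Score) -> float:
    """TS(sE, sC) = TN(xE, xC) * TR(rE, rC) * TN(yE, yC)."""
    x_e, r_e, y_e = s_e
    x_c, r_c, y_c = s_c
    return (tn.get((x_e, x_c), 0.0)
            * tr.get((r_e, r_c), 0.0)
            * tn.get((y_e, y_c), 0.0))


# Toy scores (illustrative values only).
tn = {("Russia", "俄罗斯"): 0.9, ("WTO", "世贸组织"): 0.5}
tr = {("join", "加入"): 0.8}
ts = statement_similarity(("Russia", "join", "WTO"), ("俄罗斯", "加入", "世贸组织"), tn, tr)
print(ts)
```

Because the score is a product, a statement pair scores zero as soon as any one of its three components has no known translation evidence.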

64 Iteration on TN In this section, we describe how we use the statement similarity function TS, and the document similarity function TD to improve and derive the next generation entity translation function TN(t+1). [sent-247, score-0.554]

65 We consider that a pair of an English entity eE and a Chinese entity eC are likely to indicate the same real world entity if they have 1) semantically similar relations to the same entity 2) under the same context. [sent-248, score-0.747]

66 In other words, B∗ is a set of matching statement pairs mentioning the translation target eE and eC in the document pair. [sent-251, score-0.359]

67 Then, we use the following equation to improve the original entity translation function. [sent-252, score-0.312]

68 With this update, we obtain the improved NE translations considering the relations that an entity has to other entities under the same context to achieve higher precision. [sent-256, score-0.539]

69 5 Experiments In this section, we present experimental settings and results of translating entity names using our methods compared with several baselines. [sent-257, score-0.239]

70 The news corpora are not parallel but comparable corpora, with asymmetry in entities and relationships, as the asymmetry in the number of documents also suggests. [sent-262, score-0.509]

71 To measure performance, we use mean reciprocal rank (MRR) to evaluate a translation function T: MRR(T) = (1/|Q|) ∑_{(u,v)∈Q} 1/rank_T(u, v) (9) where Q is the set of gold English-Chinese translation pairs (u, v) and rank_T(u, v) is the rank of T(u, v) in {T(u, w) | w is a Chinese entity}. [sent-269, score-0.386]
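The MRR definition can be sketched as follows. The data layout (`scores[u][w]` holding T(u, w)) and the candidate names are assumptions for illustration. Two further choices are made by this sketch and not stated in the source: ties are ranked optimistically (rank is 1 plus the number of strictly higher-scored candidates), and a gold pair with no score contributes 0.

```python
from typing import Dict, List, Tuple


def mean_reciprocal_rank(
    gold: List[Tuple[str, str]],
    scores: Dict[str, Dict[str, float]],
) -> float:
    """Average of 1 / rank_T(u, v) over the gold pairs (u, v)."""
    if not gold:
        raise ValueError("gold set must not be empty")
    total = 0.0
    for u, v in gold:
        candidates = scores.get(u, {})
        if v not in candidates:
            continue  # unscored gold translation contributes 0
        target = candidates[v]
        # Rank = 1 + number of candidates scored strictly higher than the gold one.
        rank = 1 + sum(1 for s in candidates.values() if s > target)
        total += 1.0 / rank
    return total / len(gold)


# Toy example: the first gold pair is ranked 1st, the second 3rd.
gold = [("WTO", "世贸组织"), ("Microsoft", "微软")]
scores = {
    "WTO": {"世贸组织": 0.9, "cand_a": 0.4},
    "Microsoft": {"微软": 0.2, "cand_b": 0.6, "cand_c": 0.5},
}
mrr = mean_reciprocal_rank(gold, scores)
print(mrr)
```

For the toy data the reciprocal ranks are 1 and 1/3, so the mean is 2/3.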

72 We only use entities with translations that appear in the Chinese corpus. [sent-275, score-0.273]

73 In total, we identified 490 English entities in the English news with Chinese translations in the Chinese news. [sent-277, score-0.327]

74 This method uses only a single type of entities to propagate the translation scores. [sent-283, score-0.352]

75 TPMH+P is the holistic method revised to use naive multi-type propagation that uses multiple types of entities to reinforce the translation scores. [sent-284, score-0.433]

76 THB is a linear combination of transliteration and semantic translation methods (Lam et al. [sent-285, score-0.211]

77 For our TN(t) method, we use the results with t = 1, the seed translations, and t = 2, which means that only one pass of the whole framework is performed to improve the seed translation function. [sent-291, score-0.509]

78 In addition, we use translation pairs with scores above 0. [sent-292, score-0.215]

79 With only one iteration of selective propagation, the seed translation is improved to achieve the 0. [sent-302, score-0.403]

80 However, not all translations have phonetic similarity, especially organization names, as the low F1-score of TPSH+P, 0. [sent-305, score-0.357]

81 The naive multitype propagation TPMH+P shows decreased MRR for both persons and organizations compared to the single-type propagation TPSH+P, which shows a negative influence of diverse relation semantics of entities of different types. [sent-307, score-0.604]

82 THB achieves a better MRR than TPH+P due to the semantic translation of organization names. [sent-308, score-0.321]

83 Moreover, the use of the corpora by TN(1) could not fix this problem, and it finds another organization related to trade, ‘上合组织’ (Shanghai Cooperation Organization). [sent-389, score-0.281]

84 In contrast, our selective propagation method TN(2), which uses the wrong seed translation by TN(1), ‘上合组织’ (Shanghai Cooperation Organization), successfully translates the WTO using statements such as (Russia, join, WTO) and its corresponding Chinese statement (俄罗斯, 加入, 世贸组织). [sent-390, score-0.691]

85 Similarly, both the baseline THB and the seed translation TN(1) matched Microsoft to incorrect Chinese entities that are phonetically similar, as indicated by the underlined text. [sent-391, score-0.452]

86 In contrast, TN(2) finds the correct translation despite the phonetic dissimilarity. [sent-392, score-0.281]

87 However, we can see that the MRR of the seed translation method drops significantly on D1 and D2, whereas our method shows 0. [sent-399, score-0.34]

88 The extracted statements are the exact translations of each corresponding part as indicated by the arrows. [sent-404, score-0.279]

89 The seed translation score TN1(WTO, 世贸组织) is not enough to match the entities. [sent-407, score-0.34]

90 This match helps translation of ‘WTO’ by inspecting the organization that Russia considers to join in both documents. [sent-409, score-0.394]

91 6 Conclusions This paper proposed a bootstrapping approach for entity translation using multilingual relational clustering. [sent-410, score-0.366]

92 Further, the proposed method could find similar document pairs by comparing statements to enable us to focus on comparable parts of evidence. [sent-411, score-0.359]

93 Mining entity translations from comparable corpora: a holistic graph mapping approach. [sent-438, score-0.463]

94 Entity translation mining from comparable corpora: Combining graph mapping with corpus latent features. [sent-442, score-0.317]

95 Named entity transliteration and discovery from multilingual comparable corpora. [sent-452, score-0.249]

96 Named entity translation matching and learning: With application for mining unseen translations. [sent-467, score-0.356]

97 Mining parenthetical translations from the web by word alignment. [sent-474, score-0.208]

98 Bootstrapping multilingual relation discovery using English Wikipedia and Wikimedia-induced entity extraction. [sent-488, score-0.269]

99 A Chinese-English organization name translation system using heuristic web mining and asymmetric alignment. [sent-511, score-0.365]

100 Efficient entity translation mining: A parallelized graph alignment approach. [sent-519, score-0.346]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('mandelson', 0.388), ('wto', 0.269), ('tn', 0.197), ('chinese', 0.184), ('translation', 0.171), ('seed', 0.169), ('translations', 0.161), ('rc', 0.158), ('mrr', 0.155), ('organization', 0.15), ('visit', 0.143), ('entity', 0.141), ('ne', 0.138), ('thb', 0.136), ('rj', 0.126), ('relations', 0.125), ('statements', 0.118), ('hwithin', 0.116), ('entities', 0.112), ('moscow', 0.111), ('russia', 0.108), ('asymmetry', 0.104), ('propagation', 0.091), ('relation', 0.09), ('lam', 0.086), ('ri', 0.08), ('statement', 0.079), ('organizations', 0.079), ('sei', 0.079), ('scj', 0.078), ('ee', 0.074), ('join', 0.073), ('kim', 0.071), ('tnt', 0.071), ('propagate', 0.069), ('comparable', 0.068), ('hwang', 0.068), ('corpora', 0.067), ('persons', 0.066), ('re', 0.065), ('document', 0.065), ('finds', 0.064), ('yc', 0.063), ('selective', 0.063), ('xe', 0.061), ('intersection', 0.061), ('ec', 0.06), ('jackie', 0.059), ('holistic', 0.059), ('xc', 0.059), ('hshared', 0.058), ('tdict', 0.058), ('tpsh', 0.058), ('trt', 0.058), ('semantically', 0.058), ('watson', 0.058), ('chan', 0.058), ('nes', 0.057), ('relational', 0.054), ('news', 0.054), ('symmetry', 0.052), ('tom', 0.052), ('names', 0.051), ('formally', 0.05), ('similarity', 0.049), ('stroudsburg', 0.048), ('jiang', 0.048), ('parenthetical', 0.047), ('translating', 0.047), ('support', 0.046), ('phonetic', 0.046), ('trade', 0.045), ('mining', 0.044), ('pairs', 0.044), ('cluster', 0.043), ('nakashole', 0.042), ('kupiec', 0.042), ('bipartite', 0.041), ('bilingual', 0.04), ('transliteration', 0.04), ('ye', 0.039), ('chenglong', 0.039), ('denounce', 0.039), ('fvar', 0.039), ('mandeersen', 0.039), ('multitype', 0.039), ('tpmh', 0.039), ('pa', 0.038), ('english', 0.038), ('comparability', 0.038), ('song', 0.037), ('sc', 0.037), ('boost', 0.036), ('naive', 0.036), ('die', 0.035), ('edge', 0.035), ('graph', 0.034), ('ehrs', 0.034), ('jinhan', 0.034), ('pohang', 0.034)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 71 acl-2013-Bootstrapping Entity Translation on Weakly Comparable Corpora

Author: Taesung Lee ; Seung-won Hwang

2 0.19174285 138 acl-2013-Enriching Entity Translation Discovery using Selective Temporality

Author: Gae-won You ; Young-rok Cha ; Jinhan Kim ; Seung-won Hwang

Abstract: This paper studies named entity translation and proposes “selective temporality” as a new feature, as using temporal features may be harmful for translating “atemporal” entities. Our key contribution is building an automatic classifier to distinguish temporal and atemporal entities then align them in separate procedures to boost translation accuracy by 6.1%.

3 0.14750211 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation

Author: Majid Razmara ; Maryam Siahbani ; Reza Haffari ; Anoop Sarkar

Abstract: Out-of-vocabulary (oov) words or phrases still remain a challenge in statistical machine translation especially when a limited amount of parallel text is available for training or when there is a domain shift from training data to test data. In this paper, we propose a novel approach to finding translations for oov words. We induce a lexicon by constructing a graph on source language monolingual text and employ a graph propagation technique in order to find translations for all the source language phrases. Our method differs from previous approaches by adopting a graph propagation approach that takes into account not only one-step (from oov directly to a source language phrase that has a translation) but multi-step paraphrases from oov source language words to other source language phrases and eventually to target language translations. Experimental results show that our graph propagation method significantly improves performance over two strong baselines under intrinsic and extrinsic evaluation metrics.

4 0.14634195 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing

Author: Xipeng Qiu ; Qi Zhang ; Xuanjing Huang

Abstract: The growing need for Chinese natural language processing (NLP) is largely in a range of research and commercial applications. However, most current Chinese NLP tools or components still have a wide range of issues that need to be further improved and developed. FudanNLP is an open source toolkit for Chinese natural language processing (NLP), which uses statistics-based and rule-based methods to deal with Chinese NLP tasks, such as word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, time phrase recognition, anaphora resolution and so on.

5 0.13827994 255 acl-2013-Name-aware Machine Translation

Author: Haibo Li ; Jing Zheng ; Heng Ji ; Qi Li ; Wen Wang

Abstract: We propose a Name-aware Machine Translation (MT) approach which can tightly integrate name processing into MT model, by jointly annotating parallel corpora, extracting name-aware translation grammar and rules, adding name phrase table and name translation driven decoding. Additionally, we also propose a new MT metric to appropriately evaluate the translation quality of informative words, by assigning different weights to different words according to their importance values in a document. Experiments on Chinese-English translation demonstrated the effectiveness of our approach on enhancing the quality of overall translation, name translation and word alignment over a high-quality MT baseline.

6 0.13058431 169 acl-2013-Generating Synthetic Comparable Questions for News Articles

7 0.12017729 160 acl-2013-Fine-grained Semantic Typing of Emerging Entities

8 0.11983398 74 acl-2013-Building Comparable Corpora Based on Bilingual LDA Model

9 0.1180684 93 acl-2013-Context Vector Disambiguation for Bilingual Lexicon Extraction from Comparable Corpora

10 0.11689939 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation

11 0.11116827 352 acl-2013-Towards Accurate Distant Supervision for Relational Facts Extraction

12 0.10296135 154 acl-2013-Extracting bilingual terminologies from comparable corpora

13 0.10190815 92 acl-2013-Context-Dependent Multilingual Lexical Lookup for Under-Resourced Languages

14 0.09458673 44 acl-2013-An Empirical Examination of Challenges in Chinese Parsing

15 0.093929455 16 acl-2013-A Novel Translation Framework Based on Rhetorical Structure Theory

16 0.093830302 10 acl-2013-A Markov Model of Machine Translation using Non-parametric Bayesian Inference

17 0.093743801 193 acl-2013-Improving Chinese Word Segmentation on Micro-blog Using Rich Punctuations

18 0.09222997 374 acl-2013-Using Context Vectors in Improving a Machine Translation System with Bridge Language

19 0.089153312 210 acl-2013-Joint Word Alignment and Bilingual Named Entity Recognition Using Dual Decomposition

20 0.088425063 80 acl-2013-Chinese Parsing Exploiting Characters

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.247), (1, -0.028), (2, 0.058), (3, -0.004), (4, 0.135), (5, 0.073), (6, -0.121), (7, 0.031), (8, 0.05), (9, -0.002), (10, 0.048), (11, -0.053), (12, 0.004), (13, 0.042), (14, 0.054), (15, -0.002), (16, -0.004), (17, -0.035), (18, -0.148), (19, 0.053), (20, -0.02), (21, -0.052), (22, -0.009), (23, 0.123), (24, -0.064), (25, 0.016), (26, -0.001), (27, 0.052), (28, -0.034), (29, 0.075), (30, 0.016), (31, -0.129), (32, 0.106), (33, -0.062), (34, -0.009), (35, 0.035), (36, -0.11), (37, -0.059), (38, 0.001), (39, -0.083), (40, -0.026), (41, -0.012), (42, 0.047), (43, 0.042), (44, -0.042), (45, -0.061), (46, -0.066), (47, 0.023), (48, 0.039), (49, -0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96169901 71 acl-2013-Bootstrapping Entity Translation on Weakly Comparable Corpora

Author: Taesung Lee ; Seung-won Hwang

2 0.90508682 138 acl-2013-Enriching Entity Translation Discovery using Selective Temporality

Author: Gae-won You ; Young-rok Cha ; Jinhan Kim ; Seung-won Hwang

3 0.66844881 255 acl-2013-Name-aware Machine Translation

Author: Haibo Li ; Jing Zheng ; Heng Ji ; Qi Li ; Wen Wang

4 0.63895011 160 acl-2013-Fine-grained Semantic Typing of Emerging Entities

Author: Ndapandula Nakashole ; Tomasz Tylenda ; Gerhard Weikum

Abstract: Methods for information extraction (IE) and knowledge base (KB) construction have been intensively studied. However, a largely under-explored case is tapping into highly dynamic sources like news streams and social media, where new entities are continuously emerging. In this paper, we present a method for discovering and semantically typing newly emerging out-of-KB entities, thus improving the freshness and recall of ontology-based IE and improving the precision and semantic rigor of open IE. Our method is based on a probabilistic model that feeds weights into integer linear programs that leverage type signatures of relational phrases and type correlation or disjointness constraints. Our experimental evaluation, based on crowdsourced user studies, shows our method performing significantly better than prior work.

5 0.63457161 93 acl-2013-Context Vector Disambiguation for Bilingual Lexicon Extraction from Comparable Corpora

Author: Dhouha Bouamor ; Nasredine Semmar ; Pierre Zweigenbaum

Abstract: This paper presents an approach that extends the standard approach used for bilingual lexicon extraction from comparable corpora. We focus on the unresolved problem of polysemous words revealed by the bilingual dictionary and introduce a use of a Word Sense Disambiguation process that aims at improving the adequacy of context vectors. On two specialized French-English comparable corpora, empirical experimental results show that our method improves the results obtained by two state-of-the-art approaches.

6 0.63331211 139 acl-2013-Entity Linking for Tweets

7 0.62662518 92 acl-2013-Context-Dependent Multilingual Lexical Lookup for Under-Resourced Languages

8 0.62573367 154 acl-2013-Extracting bilingual terminologies from comparable corpora

9 0.61046976 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation

10 0.60189146 352 acl-2013-Towards Accurate Distant Supervision for Relational Facts Extraction

11 0.5951004 236 acl-2013-Mapping Source to Target Strings without Alignment by Analogical Learning: A Case Study with Transliteration

12 0.57940483 16 acl-2013-A Novel Translation Framework Based on Rhetorical Structure Theory

13 0.55468804 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing

14 0.5516426 68 acl-2013-Bilingual Data Cleaning for SMT using Graph-based Random Walk

15 0.55101186 169 acl-2013-Generating Synthetic Comparable Questions for News Articles

16 0.54689658 301 acl-2013-Resolving Entity Morphs in Censored Data

17 0.54657739 219 acl-2013-Learning Entity Representation for Entity Disambiguation

18 0.54539287 179 acl-2013-HYENA-live: Fine-Grained Online Entity Type Classification from Natural-language Text

19 0.53998792 223 acl-2013-Learning a Phrase-based Translation Model from Monolingual Data with Application to Domain Adaptation

20 0.5357219 256 acl-2013-Named Entity Recognition using Cross-lingual Resources: Arabic as an Example

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.03), (6, 0.019), (11, 0.481), (24, 0.054), (26, 0.032), (35, 0.074), (42, 0.032), (48, 0.031), (67, 0.018), (70, 0.053), (88, 0.019), (90, 0.013), (95, 0.075)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.98790991 170 acl-2013-GlossBoot: Bootstrapping Multilingual Domain Glossaries from the Web

Author: Flavio De Benedictis ; Stefano Faralli ; Roberto Navigli

Abstract: We present GlossBoot, an effective minimally-supervised approach to acquiring wide-coverage domain glossaries for many languages. For each language of interest, given a small number of hypernymy relation seeds concerning a target domain, we bootstrap a glossary from the Web for that domain by means of iteratively acquired term/gloss extraction patterns. Our experiments show high performance in the acquisition of domain terminologies and glossaries for three different languages.

2 0.98673308 75 acl-2013-Building Japanese Textual Entailment Specialized Data Sets for Inference of Basic Sentence Relations

Author: Kimi Kaneko ; Yusuke Miyao ; Daisuke Bekki

Abstract: This paper proposes a methodology for generating specialized Japanese data sets for textual entailment, which consists of pairs decomposed into basic sentence relations. We experimented with our methodology over a number of pairs taken from the RITE-2 data set. We compared our methodology with existing studies in terms of agreement, frequencies and times, and we evaluated its validity by investigating recognition accuracy.

3 0.9673562 50 acl-2013-An improved MDL-based compression algorithm for unsupervised word segmentation

Author: Ruey-Cheng Chen

Abstract: We study the mathematical properties of a recently proposed MDL-based unsupervised word segmentation algorithm, called regularized compression. Our analysis shows that its objective function can be efficiently approximated using the negative empirical pointwise mutual information. The proposed extension improves the baseline performance in both efficiency and accuracy on a standard benchmark.

4 0.96383071 26 acl-2013-A Transition-Based Dependency Parser Using a Dynamic Parsing Strategy

Author: Francesco Sartorio ; Giorgio Satta ; Joakim Nivre

Abstract: We present a novel transition-based, greedy dependency parser which implements a flexible mix of bottom-up and top-down strategies. The new strategy allows the parser to postpone difficult decisions until the relevant information becomes available. The novel parser has a ∼12% error reduction in unlabeled attachment score over an arc-eager parser, with a slow-down factor of 2.8.

5 0.96251571 286 acl-2013-Psycholinguistically Motivated Computational Models on the Organization and Processing of Morphologically Complex Words

Author: Tirthankar Dasgupta

Abstract: In this work we present psycholinguistically motivated computational models for the organization and processing of Bangla morphologically complex words in the mental lexicon. Our goal is to identify whether morphologically complex words are stored as a whole or are organized along morphological lines. To this end, we conducted a series of psycholinguistic experiments to build hypotheses about the possible organizational structure of the mental lexicon. Next, we developed computational models based on the collected dataset. We observed that derivationally suffixed Bangla words are in general decomposed during processing, and that compositionality between the stem and the suffix plays an important role in the decomposition process. We observed the same phenomenon for Bangla verb sequences: experiments showed that non-compositional verb sequences are in general stored as a whole in the mental lexicon (ML), whereas only weak traces of compositional verbs are found there.

1 Introduction

The mental lexicon is the representation of words in the human mind, together with the associations among them that support fast retrieval and comprehension (Aitchison, 1987). Words are known to be associated with each other in terms of orthography, phonology, morphology and semantics. However, the precise nature of these relations is unknown. An important issue that has long been a subject of study is to identify the fundamental units in terms of which the mental lexicon is organized, that is, whether lexical representations in the mental lexicon are word based or organized along morphological lines. For example, is a word such as “unimaginable” stored in the mental lexicon as a whole word, or do we break it up into “un-”, “imagine” and “-able”, understand the meaning of each of these constituents, and then recombine the units to comprehend the whole word?
Such questions are typically answered by designing appropriate priming experiments (Marslen-Wilson et al., 1994) or other lexical decision tasks. The reaction time of the subjects for recognizing various lexical items under appropriate conditions reveals important facts about their organization in the brain (see Sec. 2 for models of morphological organization and access and related experiments).

A clear understanding of the structure and the processing mechanism of the mental lexicon will further our knowledge of how the human brain processes language. Further, these linguistically important and interesting questions are also highly significant for computational linguistics (CL) and natural language processing (NLP) applications. Their computational significance arises from the issue of storage in lexical resources like WordNet (Fellbaum, 1998), and raises questions such as how to store morphologically complex words in such a resource with storage and access efficiency in mind.

There is a rich literature on the organization and lexical access of morphologically complex words, where experiments have been conducted mainly for derivationally suffixed words of English, Hebrew, Italian, French, Dutch, and a few other languages (Marslen-Wilson et al., 2008; Frost et al., 1997; Grainger et al., 1991; Drews and Zwitserlood, 1995). However, we do not know of any such investigations for Indian languages, which are morphologically richer than many of their Indo-European cousins. Moreover, Indian languages show some distinct phenomena, such as compound and composite verbs, for which no such investigations have been conducted yet. On the other hand, experiments indicate that the mental representation and processing of morphologically complex words are not quite language independent (Taft, 2004).

Proceedings of the ACL Student Research Workshop, pages 123–129, Sofia, Bulgaria, August 4-9 2013. ©2013 Association for Computational Linguistics
Therefore, the findings from experiments in one language cannot be generalized to all languages, which makes it important to conduct similar experiments in other languages. This work aims to design cognitively motivated computational models that can explain the organization and processing of Bangla morphologically complex words in the mental lexicon. Presently we concentrate on the following two aspects:

Organization and processing of Bangla polymorphemic words: our objective here is to determine whether the mental lexicon decomposes morphologically complex words into their constituent morphemes or represents the unanalyzed surface form of a word.

Organization and processing of Bangla compound verbs (CV): compound verbs are the subject of much debate in linguistic theory. No consensus has been reached yet on whether to consider them unitary lexical units or syntactically assembled combinations of two independent lexical units. As linguistic arguments have so far not led to a consensus, we here use cognitive experiments to probe the brain signatures of verb-verb combinations and propose cognitive as well as computational models regarding the possible organization and processing of Bangla CVs in the mental lexicon (ML).

With respect to this, we apply the different priming and other lexical decision experiments described in the literature (Marslen-Wilson et al., 1994; Bentin and Feldman, 1990), specifically to derivationally suffixed polymorphemic words and compound verbs of Bangla. Our cross-modal and masked priming experiments on Bangla derivationally suffixed words show that morphological relatedness between lexical items triggers a significant priming effect, even when the forms are phonologically/orthographically unrelated.
These observations are similar to those reported for English and indicate that derivationally suffixed words in Bangla are in general accessed through decomposition of the word into its constituent morphemes. Further, based on the experimental data we have developed a series of computational models that can be used to predict the decomposition of Bangla polymorphemic words. Our evaluation results show that decomposition of a polymorphemic word depends on several factors, such as frequency, productivity of the suffix, and the compositionality between the stem and the suffix.

The organization of the paper is as follows: Sec. 2 presents related work; Sec. 3 describes experiment design and procedure; Sec. 4 presents the processing of CVs; and finally, Sec. 5 concludes the paper by presenting the future direction of the work.

2 Related Works

2.1 Representation of polymorphemic words

Over the last few decades many studies have attempted to understand the representation and processing of morphologically complex words in the brain for various languages. Most of the studies are designed to support one of two mutually exclusive paradigms: the full-listing model and the morphemic model. The full-listing model claims that polymorphemic words are represented as a whole in the human mental lexicon (Bradley, 1980; Butterworth, 1983). The morphemic model, on the other hand, argues that morphologically complex words are decomposed and represented in terms of smaller morphemic units. The affixes are stripped away from the root form, which in turn is used to access the mental lexicon (Taft and Forster, 1975; Taft, 1981; MacKay, 1978). Intermediate to these two paradigms is the partial decomposition model, which argues that different types of morphological forms are processed separately.
For instance, the derived morphological forms are believed to be represented as a whole, whereas the representation of the inflected forms follows the morphemic model (Caramazza et al., 1988).

Traditionally, priming experiments have been used to study the effects of morphology in language processing. Priming is a process that results in an increase in the speed or accuracy of response to a stimulus, called the target, based on prior exposure to another stimulus, called the prime (Tulving et al., 1982). Here, subjects are exposed to a prime word for a short duration and are subsequently shown a target word. The prime and target words may be morphologically, phonologically or semantically related. An analysis of the effect on the reaction time of subjects reveals the actual organization and representation of the lexicon at the relevant level. See Pulvermüller (2002) for a detailed account of such phenomena.

It has been argued that the frequency of a word influences the speed of lexical processing and thus can serve as a diagnostic tool for observing the nature and organization of lexical representations. Taft (1975), in an experiment on English inflected words, argued that lexical decision responses to polymorphemic words depend upon the base word frequency. A similar effect of surface word frequency was also observed by (Bertram et al., 2000; Bradley, 1980; Burani et al., 1987; Burani et al., 1984; Schreuder et al., 1997; Taft, 1975; Taft, 2004), where it has been claimed that words having low surface frequency tend to decompose. Later, Baayen (2000) proposed the dual processing race model, in which a morphologically complex form is accessed either as a whole or via its parts: if the frequency of the word is above a certain threshold, the direct route wins and the word is accessed as a whole; if it is below that threshold, the parsing route wins and the word is accessed via its parts.

2.2 Representation of Compound Verbs

A compound verb (CV) consists of a sequence of two verbs (V1 and V2) acting as a single verb and expresses a single expression of meaning. For example, in the sentence রুটিগুল ো খেল খেল ো (/ruTigulo kheYe phela/) “bread-plural-the eat and drop-pres. Imp” “Eat the breads”, the verb sequence “খেল খেল ো (eat drop)” is an example of a CV. Compound verbs are a special phenomenon found abundantly in Indo-European languages such as the Indian languages. A plethora of work has been done to provide linguistic explanations of the formation of such words, yet none so far has led to any consensus. Hook (1981) considers the second verb V2 as an aspectual complex comparable to the auxiliaries. Butt (1993) argues that CV formations in Hindi and Urdu are either morphological or syntactical and that their formation takes place at the argument structure. Bashir (1993) tried to construct a semantic analysis based on “prepared” and “unprepared mind”. Similar findings have been proposed by Pandharipande (1993), who points out that V1 and V2 are paired on the basis of their semantic compatibility, which is subject to syntactic constraints. Paul (2004) tried to represent Bangla CVs in terms of the HPSG formalism. She proposes that the selection of a V2 by a V1 is determined at the semantic level, because the two verbs will unify if and only if they are semantically compatible. Since none of the linguistic formalisms could satisfactorily explain the unique phenomenon of CV formation, we here for the first time turn to psycholinguistic and neurolinguistic studies to model the processing of verb-verb combinations in the ML and compare these responses with those of the existing models.

3 The Proposed Approaches

3.1 The psycholinguistic experiments

We apply two different priming experiments, namely the cross-modal priming and masked priming experiments discussed in (Forster and Davis, 1984; Rastle et al., 2000; Marslen-Wilson et al., 1994; Marslen-Wilson et al., 2008), to Bangla morphologically complex words. Here, the prime is a morphologically derived form of the target, presented auditorily (for cross-modal priming) or visually (for masked priming). The subjects were asked to make a lexical decision on whether the given target is a valid word in that language. The same target word is then probed again, but with a different audio or visual probe called the control word. The control shows no relationship with the target. For example, baYaska (aged) and baYasa (age) is a prime-target pair, for which the corresponding control-target pair could be naYana (eye) and baYasa (age).

Similar to (Marslen-Wilson et al., 2008), the masked priming has been conducted for three different SOAs (Stimulus Onset Asynchrony): 48ms, 72ms and 120ms. The SOA is measured as the amount of time between the start of the first stimulus and the start of the next stimulus.

Table 1: [table body not recoverable from the source; the classes are labelled by morphological (M), semantic (S) and orthographic (O) relatedness] + implies related, and - implies unrelated.

There were 500 prime-target and control-target pairs classified into five classes. Depending on the class, the prime is related to the target in terms of morphology, semantics, orthography and/or phonology (see Table 1). The experiments were conducted on 24 highly educated native Bangla speakers. Nineteen of them have a graduate degree and five hold a postgraduate degree. The age of the subjects varies between 22 and 35 years.

Results: The RTs with extreme values and incorrect decisions were excluded from the data.
The data has been analyzed using two-way ANOVA with three factors: priming (prime and control), conditions (five classes) and prime durations (three different SOAs). We observe strong priming effects (p<0.05) when the target word is morphologically derived with a recognizable suffix and is semantically and orthographically related to the prime; no priming effects are observed when the prime and target words are orthographically related but share no morphological or semantic relationship; and weak priming, although not statistically significant (p>0.07), is observed for prime-target pairs that are only semantically related. We see no significant difference between the prime and control RTs for the other classes.

We also looked at the RTs for each of the 500 target words. We observe that maximum priming occurs for words in [M+S+O+] (69%), some priming is evident in [M+S+O-] (51%) and [M'+S-O+] (48%), but for most of the words in [M-S+O-] (86%) and [M-S-O+] (92%) no priming effect was observed.

3.2 Frequency Distribution Models of Morphological Processing

From the above results we saw that not all polymorphemic words tend to decompose during processing; thus we need to further investigate the processing of Bangla derived words. One notable approach is to identify whether the stem or suffix frequency is involved in the processing of a word. For this, we apply different frequency-based models to the Bangla polymorphemic words and evaluate their performance by comparing their predictions with the results obtained through the priming experiment.

Model-1: Base and surface word frequency effect. It states that the probability of decomposition of a Bangla polymorphemic word depends upon the frequency of its base word.
Thus, if the stem frequency of a polymorphemic word crosses a given threshold value, the word will be decomposed into its constituent morphemes. A similar claim has been made for the surface word frequency model, where decomposition depends upon the frequency of the surface word itself. We have evaluated both models on the 500 words used in the priming experiments discussed above. We achieved an accuracy of 62% and 49% for the base and surface word frequency models, respectively.

Model-2: Combining the base and surface word frequency. In pursuit of an extended model, we combine the base and surface word frequency models. We took the log frequencies of both the base and the derived words and plotted the best-fit regression curve over the given dataset. The evaluation of this model over the same set of 500 target words returns an accuracy of 68%, which is better than the base and surface word frequency models. However, the proposed model still fails to predict the processing of around 32% of the words. This led us to further enhance the model. For this, we analyze the role of suffixes in morphological processing.

Model-3: Degree of Affixation and Suffix Productivity. We examine whether the regression between the base and derived frequencies of Bangla words varies between suffixes, and how these variations affect morphological decomposition. With respect to this, we try to compute the degree of affixation between the suffix and the base word. For this, we perform regression analysis on sixteen different Bangla suffixes with varying degrees of type and token frequencies. For each suffix, we choose 100 different derived words. We observe that suffixes with a high intercept value form derived words whose base frequencies are substantially higher than those of their derived forms.
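The per-suffix regression described above can be sketched as an ordinary least-squares fit on log frequencies. This is a minimal illustration, not the authors' implementation. The text does not state which variable is regressed on which; regressing log base frequency on log derived frequency is an assumption made here so that a high intercept corresponds to base frequencies that are high relative to derived frequencies. The function names and input format are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def suffix_regression(freq_pairs):
    """Fit log(base frequency) against log(derived frequency) for one suffix.

    freq_pairs: list of (base_freq, derived_freq) corpus counts for words
    sharing the same suffix (hypothetical input format; counts must be > 0).
    Returns (slope, intercept).
    """
    xs = [math.log(derived) for _, derived in freq_pairs]
    ys = [math.log(base) for base, _ in freq_pairs]
    return fit_line(xs, ys)
```

Under this assumed setup, one would call `suffix_regression` once per suffix and compare the resulting intercepts across suffixes.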
Moreover, we also observe that a high intercept value for a given suffix indicates a higher inclination towards decomposition. Next, we analyze the role of the suffix type/token ratio and compare it with the base/derived frequency ratio model. This has been done by regression analysis between the suffix type-token ratios and the base-surface frequency ratios.

We further tried to observe the role of suffix productivity in morphological processing. For this, we computed the three components of productivity P, P* and V as discussed in (Hay and Plag, 2004). P is the “conditioned degree of productivity”: the probability that, on encountering a word with the affix, the word represents a new type. P* is the “hapax-conditioned degree of productivity”: the probability that, when an entirely new word is encountered, it will contain the suffix. V is the “type frequency”. Finally, we computed the productivity of a suffix through its P, P* and V values. We found that decomposition of a Bangla polymorphemic word is directly proportional to the productivity of the suffix. Therefore, words that are composed of productive suffixes (P value between 0.6 and 0.9) like “-oYAlA”, “-giri”, “-tba” and “-panA” are more decomposable than words with low-productivity suffixes like “-Ani”, “-lA”, “-k”, and “-tama”. The evaluation of the proposed model returns an accuracy of 76%, which is 8% better than the preceding models.

Combining Model-2 and Model-3: One important observation that can be made from the above results is that Model-3 performs best in determining the true negative values. It also has a high recall (85%) but a low precision (50%). In other words, the model can predict those words for which decomposition will not take place. On the other hand, Model-2 has a high precision of 70%.
Thus, we argue that combining the above two models can better predict the decomposition of Bangla polymorphemic words. Hence, we combined the two models and achieved an overall accuracy of 80% with a precision of 87% and a recall of 78%. This surpasses the performance of the other models discussed earlier. However, around 22% of the test words were wrongly classified, which the model fails to explain. Thus, a more rigorous set of experiments and data analysis is required to predict the access mechanisms of such Bangla polymorphemic words.

3.3 Stem-Suffix Compositionality

Compositionality refers to the fact that the meaning of a complex expression is inferred from the meaning of its constituents. Therefore, the cost of retrieving a word from the secondary memory is directly proportional to the cost of retrieving the individual parts (i.e. the stem and the suffix). Thus, following the work of (Milin et al., 2009), we define the compositionality of a morphologically complex word (We) as:

$C(We) = \alpha_1 H(We) + \alpha_2 H(e) + \alpha_3 H(W|e) + \alpha_4 H(e|W)$

where H(x) is the entropy of an expression x, H(W|e) is the conditional entropy between the stem W and the suffix e, and the $\alpha_i$ are proportionality factors whose values are computed through regression analysis.

Next, we tried to compute the compositionality of the stem and suffixes in terms of relative entropy D(W||e) and pointwise mutual information (PMI). The relative entropy is a measure of the distance between the probability distributions of the stem W and the suffix e. The PMI measures the amount of information that one random variable (the stem) contains about the other (the suffix). We have compared the above three techniques with the actual reaction time data collected through the priming and lexical decision experiments.
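The entropy and PMI quantities named above can be illustrated with a minimal sketch. This is not the authors' implementation: the input format (a list of (stem, suffix) tokens, one per occurrence of a derived word) and the function names are assumptions made for illustration, and estimating probabilities by relative frequency is only one possible choice.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of the distribution given by a Counter of counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def pmi(pairs, stem, suffix):
    """Pointwise mutual information (in bits) between a stem and a suffix.

    pairs: list of (stem, suffix) tokens (hypothetical input format).
    Probabilities are relative frequencies over the list.
    Returns -inf when the pair never occurs together.
    """
    n = len(pairs)
    joint = Counter(pairs)
    stems = Counter(s for s, _ in pairs)
    suffixes = Counter(e for _, e in pairs)
    p_joint = joint[(stem, suffix)] / n
    if p_joint == 0:
        return float("-inf")
    p_stem = stems[stem] / n
    p_suffix = suffixes[suffix] / n
    return math.log2(p_joint / (p_stem * p_suffix))
```

A PMI near zero means the stem and suffix co-occur about as often as independence would predict; a large positive value means the stem strongly predicts the suffix.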
We observed that all three information-theoretic models perform much better than the frequency-based models discussed in the earlier section at predicting the decomposability of Bangla polymorphemic words. However, we think it is still premature to claim anything concrete at this stage of our work. We believe much more rigorous experiments need to be performed in order to validate our proposed models. Further, the present paper does not consider factors related to age of acquisition and word familiarity effects, which play an important role in the processing of morphologically complex words. Moreover, it would also be very interesting to see how stacking of multiple suffixes in a word is processed by the human brain.

4 Organization and Processing of Compound Verbs in the Mental Lexicon

Compound verbs, as discussed above, are a special type of verb sequence consisting of two or more verbs acting as a single verb and expressing a single expression of meaning. The verb V1 is known as the pole and V2 is called the vector. For example, “ওঠে পড়া ” (getting up) is a compound verb where the individual words do not entirely reflect the meaning of the whole expression. However, not all V1+V2 combinations are CVs. For example, expressions like “নিঠে য়াও ” (take and then go) and “ নিঠে আঠ ়া” (return back) are examples of verb sequences where the meaning of the whole expression can be derived from the meaning of the individual components, and thus these verb sequences are not considered CVs. The key question that linguists have long been debating is whether to consider CVs as single lexical units or as two separate units. Since linguistic rules fail to explain the process, we for the first time tried to perform cognitive experiments to understand the organization and processing of such verb sequences in the human mind.
A clear understanding of these phenomena may help us to classify or extract actual CVs from other verb sequences. In order to do so, we have so far applied three different techniques to collect user data.

In the first technique, we annotated 4500 V1+V2 sequences, along with their example sentences, using a group of three linguists (the expert subjects). We asked the experts to classify the verb sequences into three classes, namely CV, not a CV, and not sure. Each linguist received 2000 verb pairs along with their respective example sentences. Out of these, 1500 verb sequences are unique to each of them and the remaining 500 are overlapping. We measure the inter-annotator agreement using Fleiss' kappa (Fleiss et al., 1981), κ, where the agreement lies around 0.79.

Next, out of the 500 common verb sequences that were annotated by all three linguists, we randomly chose 300 V1+V2 pairs and presented them to 36 native Bangla speakers. We asked each subject to give a compositionality score to each verb sequence on a 1-10 point scale, 10 being highly compositional and 1 being non-compositional. We found an agreement of κ=0.69 among the subjects. We also observe a continuum of compositionality scores among the verb sequences. This reflects that it is difficult to classify Bangla verb sequences discretely into the classes of CV and not a CV.

We then compare the compositionality scores with the expert annotations. We found a significant correlation between the expert annotation and the compositionality score. We observe that verb sequences annotated as CVs (such as খেঠে খিল, কঠে খি and ওঠে পড) received low compositionality scores (average score between 1 and 4), whereas sequences with high compositionality values are in general tagged as not a CV (নিঠে য়া (come and get), নিঠে আে (return back), তুঠল খেঠেনি (kept), গনিঠে পিল (roll on floor)). This reflects that verb sequences which are not CVs show a high degree of compositionality.
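For reference, Fleiss' kappa, the agreement statistic reported above, can be computed from a table of per-item category counts. The sketch below follows the standard textbook definition and is not taken from the paper; the function name and table layout are assumptions, and the example table in the usage note is invented.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a fixed number of raters per item.

    table[i][j] is the number of raters who assigned item i to category j.
    Every row must sum to the same number of raters (at least 2).
    Raises ZeroDivisionError if all ratings fall into a single category,
    because chance agreement is then 1 and kappa is undefined.
    """
    n_items = len(table)
    n_raters = sum(table[0])
    n_cats = len(table[0])

    # Share of all assignments that went to each category.
    p_cat = [sum(row[j] for row in table) / (n_items * n_raters)
             for j in range(n_cats)]

    # Observed agreement for each item, then averaged over items.
    per_item = [(sum(c * c for c in row) - n_raters)
                / (n_raters * (n_raters - 1)) for row in table]
    p_observed = sum(per_item) / n_items

    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)

    return (p_observed - p_expected) / (1 - p_expected)
```

With three raters and the classes CV, not a CV, and not sure, a row such as `[2, 1, 0]` would mean that two raters chose CV and one chose not a CV for that verb sequence.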
In other words, non-CV verb sequences can be interpreted directly from their constituent verbs. This leads us to the possibility that compositional verb sequences require the individual verbs to be recognized separately, and thus the time to recognize such expressions must be greater than for non-compositional verbs, which map to a single expression of meaning. In order to validate this claim we performed a lexical decision experiment using 32 native Bangla speakers with 92 different verb sequences. We followed the same experimental procedure as discussed in (Taft, 2004) for English polymorphemic words. However, rather than derived words, the subjects were shown a verb sequence and asked whether they recognized it as a valid combination. The reaction time (RT) of each subject was recorded. Our preliminary observation from the RT analysis shows that, as per our claim, the RT of verb sequences having a high compositionality value is significantly higher than the RTs for low or non-compositional verbs. This supports our hypothesis that Bangla compound verbs that show less compositionality are stored as a whole in the mental lexicon and thus follow the full-listing model, whereas compositional verb phrases are individually parsed. However, we do believe that our experiment is based on a very small set of data and that it is premature to conclude anything concrete based only on the current experimental results.

5 Future Directions

In the next phase of our work we will focus on the following aspects of Bangla morphologically complex words:

The Word Familiarity Effect: Here, our aim is to study the role of the familiarity of a word during its processing. We define the familiarity of a word in terms of corpus frequency, age of acquisition, the level of language exposure of a person, the RT of the word, etc.
Role of suffix types in morphological decomposition: We will examine which morphological suffixes native Bangla speakers have internalized and which are merely learnt in school but never internalized. We can compare the representation of native, Sanskrit-derived and foreign suffixes in Bangla words.

Computational models of organization and processing of Bangla compound verbs: So far we have performed a small set of experiments to study the processing of compound verbs in the mental lexicon. In the next phase of our work we will extend the existing experiments and also apply further techniques, such as crowdsourcing and language games, to collect more relevant RT and compositionality data. Finally, based on the collected data we will develop computational models that can explain the possible organizational structure and processing mechanism of morphologically complex Bangla words in the mental lexicon.

References

Aitchison, J. (1987). Words in the mind: An introduction to the mental lexicon. Wiley-Blackwell.
Baayen, R. H. (2000). On frequency, transparency and productivity. In G. Booij and J. van Marle (eds), Yearbook of Morphology, pp. 181-208.
Baayen, R. H. (2003). Probabilistic approaches to morphology. Probabilistic linguistics, pp. 229-287.
Baayen, R. H., Dijkstra, T., and Schreuder, R. (1997). Singulars and plurals in Dutch: Evidence for a parallel dual-route model. Journal of Memory and Language, 37(1), pp. 94-117.
Bashir, E. (1993). Causal Chains and Compound Verbs. In M. K. Verma ed. (1993).
Bentin, S. and Feldman, L.B. (1990). The contribution of morphological and semantic relatedness to repetition priming at short and long lags: Evidence from Hebrew. Quarterly Journal of Experimental Psychology, 42, pp. 693–711.
Bradley, D. (1980). Lexical representation of derivational relation, Juncture, Saratoga, CA: Anma Libri, pp. 37-55.
Butt, M. (1993). Conscious choice and some light verbs in Urdu. In M. K. Verma ed. (1993).
Butterworth, B. (1983). Lexical Representation, Language Production, Vol. 2, pp. 257-294, San Diego, CA: Academic Press.
Caramazza, A., Laudanna, A. and Romani, C. (1988). Lexical access and inflectional morphology. Cognition, 28, pp. 297-332.
Drews, E., and Zwitserlood, P. (1995). Morphological and orthographic similarity in visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 21, pp. 1098–1116.
Fellbaum, C. (ed.). (1998). WordNet: An Electronic Lexical Database, MIT Press.
Fleiss, J. L., Levin, B., and Paik, M. C. (1981). The measurement of interrater agreement. Statistical methods for rates and proportions, 2, pp. 212–236.
Forster, K.I., and Davis, C. (1984). Repetition priming and frequency attenuation in lexical access. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, pp. 680–698.
Frost, R., Forster, K.I., and Deutsch, A. (1997). What can we learn from the morphology of Hebrew? A masked-priming investigation of morphological representation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, pp. 829–856.
Grainger, J., Cole, P., and Segui, J. (1991). Masked morphological priming in visual word recognition. Journal of Memory and Language, 30, pp. 370–384.
Hook, P. E. (1981). Hindi Structures: Intermediate Level. Michigan Papers on South and Southeast Asia, The University of Michigan Center for South and Southeast Studies, Ann Arbor, Michigan.
MacKay, D.G. (1978). Derivational rules and the internal lexicon. Journal of Verbal Learning and Verbal Behavior, 17, pp. 61-71.
Marslen-Wilson, W.D., and Tyler, L.K. (1997). Dissociating types of mental computation. Nature, 387, pp. 592–594.
Marslen-Wilson, W.D., and Tyler, L.K. (1998). Rules, representations, and the English past tense. Trends in Cognitive Sciences, 2, pp. 428–435.
Marslen-Wilson, W.D., Tyler, L.K., Waksler, R., and Older, L. (1994). Morphology and meaning in the English mental lexicon. Psychological Review, 101, pp. 3–33.
Marslen-Wilson, W.D. and Zhou, X. (1999). Abstractness, allomorphy, and lexical architecture. Language and Cognitive Processes, 14, pp. 321–352.
Milin, P., Kuperman, V., Kostić, A. and Baayen, R. H. (2009). Paradigms bit by bit: an information-theoretic approach to the processing of paradigmatic structure in inflection and derivation. Analogy in grammar: Form and acquisition, pp. 214-252.
Pandharipande, R. (1993). Serial verb construction in Marathi. In M. K. Verma ed. (1993).
Paul, S. (2004). An HPSG Account of Bangla Compound Verbs with LKB Implementation, Ph.D. Dissertation. CALT, University of Hyderabad.
Pulvermüller, F. (2002). The Neuroscience of Language. Cambridge University Press.
Stolz, J.A., and Feldman, L.B. (1995). The role of orthographic and semantic transparency of the base morpheme in morphological processing. In L.B. Feldman (Ed.) Morphological aspects of language processing. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Taft, M., and Forster, K.I. (1975). Lexical storage and retrieval of prefix words. Journal of Verbal Learning and Verbal Behavior, Vol. 14, pp. 638-647.
Taft, M. (1988). A morphological decomposition model of lexical access. Linguistics, 26, pp. 657-667.
Taft, M. (2004). Morphological decomposition and the reverse base frequency effect. Quarterly Journal of Experimental Psychology, 57A, pp. 745-765.
Tulving, E., Schacter, D. L., and Stark, H. A. (1982). Priming Effects in Word Fragment Completion are independent of Recognition Memory. Journal of Experimental Psychology: Learning, Memory and Cognition, vol. 8(4).

6 0.95267111 376 acl-2013-Using Lexical Expansion to Learn Inference Rules from Sparse Data

same-paper 7 0.94693607 71 acl-2013-Bootstrapping Entity Translation on Weakly Comparable Corpora

8 0.91050071 61 acl-2013-Automatic Interpretation of the English Possessive

9 0.85260701 245 acl-2013-Modeling Human Inference Process for Textual Entailment Recognition

10 0.76642984 156 acl-2013-Fast and Adaptive Online Training of Feature-Rich Translation Models

11 0.74919081 242 acl-2013-Mining Equivalent Relations from Linked Data

12 0.74243957 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching

13 0.71819937 27 acl-2013-A Two Level Model for Context Sensitive Inference Rules

14 0.70414573 154 acl-2013-Extracting bilingual terminologies from comparable corpora

15 0.68400198 169 acl-2013-Generating Synthetic Comparable Questions for News Articles

16 0.67257321 202 acl-2013-Is a 204 cm Man Tall or Small ? Acquisition of Numerical Common Sense from the Web

17 0.67147863 208 acl-2013-Joint Inference for Heterogeneous Dependency Parsing

18 0.66687882 387 acl-2013-Why-Question Answering using Intra- and Inter-Sentential Causal Relations

19 0.6573382 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search

20 0.65409207 138 acl-2013-Enriching Entity Translation Discovery using Selective Temporality